Translational research thread
Phenotype fingerprinting
A line of work using clinical sequence models to identify rare, under-recognised, or hard-to-characterise disease patterns.
The aim is to use latent clinical trajectories to surface patients or cohorts whose records contain early, partial, or indirect evidence of a phenotype before conventional coding makes it obvious. This work is intended to support earlier recognition, better cohort definition, and more testable translational hypotheses.
Clinical sequence modelling · Rare-case and under-recognised phenotype discovery
Research programme
ORCA
A programme built around the idea that electronic health records can be treated as a language, with implications for representation, translation, and generation.
The aim is to move beyond treating EHRs as flat tables of variables and instead study the statistical structure, translation behaviour, and generative possibilities of clinical event sequences. ORCA asks whether coded records can be read, aligned across systems, translated into clinical language, and used to reason about plausible futures.
Working paper in progress · Language-of-EHR framing
Scientific-discovery system
NEXUS
A system for predicting cross-domain scientific method transfer using historical discoveries and validated transfers.
The aim is to build a system that learns from previous scientific discoveries to suggest where methods, representations, or experimental ideas may transfer next. NEXUS is intended as a grounded discovery assistant: less about generic brainstorming, more about structured evidence for why a method from one domain may work in another.
Methods whitepaper in draft · Discovery-system thread
Programme-level thread
ASCEND research arc
A broader research arc spanning ASCENDgpt, FlatASCEND, and ORCA, focused on modelling structured clinical trajectories.
The aim is to develop a coherent sequence of EHR foundation-model work: from phenotype-aware representation learning, through autoregressive clinical trajectory generation, to interpretation and language-like translation. The arc is designed to make clinical sequences modellable, inspectable, and ultimately more useful for real clinical research questions.
ASCENDgpt · FlatASCEND · ORCA
Public-facing project
glucose.ai
A literature intelligence and discovery surface for diabetes and AI / machine learning work.
The aim is to make the diabetes and AI literature easier to monitor, search, and turn into useful research leads. glucose.ai acts as a public-facing discovery surface for papers, themes, and emerging methods in diabetes, metabolic medicine, and clinical AI.
Public website · Diabetes + AI / ML discovery
Research infrastructure
pi / cog workflows
Research and coding infrastructure for keeping sources, tasks, synthesis, and project state organised in one usable system.
The aim is to make complex research work easier to run: keeping sources, tasks, synthesis, code, and project state connected enough that ideas can be developed without being lost. This infrastructure thread is partly about personal productivity, but also about reusable patterns for agent-supported research workflows.
Infrastructure and workflow notes · Agent workflow systems