Research

Papers

Still: Amortized KV Cache Compaction in a Single Forward Pass

Charles O'Neill, Alex Sandomirsky, Harry Partridge, Mudith Jayasekara, and Max Kirkby

arXiv preprint, 2026

From superposition to sparse codes: interpretable representations in neural networks

David Klindt, Charles O'Neill, Patrik Reizinger, Harald Maurer, and Nina Miolane

arXiv preprint, 2025

Sparse Autoencoders for Disentangling Dense Embeddings of Scientific Concepts

Charles O'Neill

NeurIPS 2024, Foundation Models for Science (Oral)

Sparse autoencoders for dense text embeddings reveal hierarchical feature sub-structure

Charles O'Neill

NeurIPS 2024, Scientific Methods for Understanding Deep Learning

Steering semantic search with interpretable features from sparse autoencoders

Charles O'Neill

NeurIPS 2024, Foundation Model Interventions

Steering Language Generation: Harnessing Contrastive Expert Guidance and Negative Prompting for Coherent and Diverse Synthetic Data Generation

Charles O'Neill, Yuan-Sen Ting, Ioana Ciuca, Roberta Raileanu, Jack Miller, and Thang Bui

arXiv preprint

AstroLLaMA: Towards specialised foundation models in astronomy

Tuan Dung Nguyen, Yuan-Sen Ting, Ioana Ciuca, Charles O'Neill, and others

IJCNLP-AACL 2023

AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets

Ernest Perkowski, Rui Pan, Tuan Dung Nguyen, Yuan-Sen Ting, Sandor Kruk, Tong Zhang, Charles O'Neill, and others

Research Notes of the American Astronomical Society

Measuring Sharpness in Grokking

Jack Miller, Patrick Gleeson, Charles O'Neill, Thang Bui, and Noam Levi

ICLR 2024, Bridging the Gap Between Practice and Theory Workshop

Baseten research

Post-training frontier legal agents with Baseten Research

April 2026, with Mudith Jayasekara, Matthew Blau, Aaron Ellis-Bloor, Niko Grupen, and Gabe Pereyra

Towards infinite context windows: neural KV cache compaction

March 2026, with Alex Sandomirsky and Harry Partridge

Dense, on-policy, or both?

March 2026, with Max Kirkby

Distillation without the dark

February 2026, with Max Kirkby and Mudith Jayasekara

Lumina: building self-improving evaluation through customer-in-the-loop refinement

October 2025, with Harry Partridge, Max Kirkby, Jonathon Liu, Paras Stefanopoulos, and Mudith Jayasekara

Attention-based attribution: what your model is actually looking at

October 2025, with Jonathon Liu, Kimbrian Canavan, Max Kirkby, and Mudith Jayasekara

Robust, sample-efficient SFT with prompt mutations

October 2025, with Harry Partridge

Iterative SFT: dense reward learning

October 2025, with Jonathon Liu, Harry Partridge, Max Kirkby, and Mudith Jayasekara

Write small, learn forever: rank-1 LoRA for continual learning

October 2025, with Max Kirkby, Harry Partridge, and Jonathon Liu

Practical LoRA research

September 2025, with Max Kirkby

Do transformers notice their own mistakes? Finding a linear hallucination detector inside LLMs

February 2025, with Mudith Jayasekara, Max Kirkby, Sviatoslav Chalnev, and Rune Chi Zhao

Resurrecting the salmon: seeing clearer inside LLMs with domain-specific SAEs

January 2025, with Mudith Jayasekara and Max Kirkby

Why mechanistic interpretability needs a paradigm inversion

January 2025, with Mudith Jayasekara and Max Kirkby