Research

Still: Amortized KV Cache Compaction in a Single Forward Pass

Charles O'Neill, Alex Sandomirsky, Harry Partridge, Mudith Jayasekara, and Max Kirkby

arXiv preprint, 2026

Compute Optimal Inference and Provable Amortisation Gap in Sparse Autoencoders

Charles O'Neill

ICML 2025

From superposition to sparse codes: interpretable representations in neural networks

David Klindt, Charles O'Neill, Patrik Reizinger, Harald Maurer, and Nina Miolane

arXiv preprint, 2025

Sparse Autoencoders for Disentangling Dense Embeddings of Scientific Concepts

Charles O'Neill

NeurIPS 2024, Foundation Models for Science (Oral)

Sparse autoencoders for dense text embeddings reveal hierarchical feature sub-structure

Charles O'Neill

NeurIPS 2024, Scientific Methods for Understanding Deep Learning

Steering semantic search with interpretable features from sparse autoencoders

Charles O'Neill

NeurIPS 2024, Foundation Model Interventions

Disentangling Dense Embeddings with Sparse Autoencoders

Charles O'Neill

arXiv preprint

Sparse Autoencoders Enable Scalable and Reliable Circuit Identification in Language Models

Charles O'Neill and Thang Bui

arXiv preprint

Steering Language Generation: Harnessing Contrastive Expert Guidance and Negative Prompting for Coherent and Diverse Synthetic Data Generation

Charles O'Neill, Yuan-Sen Ting, Ioana Ciuca, Roberta Raileanu, Jack Miller, and Thang Bui

arXiv preprint

AstroLLaMA: Towards specialised foundation models in astronomy

Tuan Dung Nguyen, Yuan-Sen Ting, Ioana Ciuca, Charles O'Neill, and others

IJCNLP-AACL 2023

Grokking Beyond Neural Networks: An Empirical Exploration with Model Complexity

Jack Miller, Charles O'Neill, and Thang Bui

TMLR

AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets

Ernest Perkowski, Rui Pan, Tuan Dung Nguyen, Yuan-Sen Ting, Sandor Kruk, Tong Zhang, Charles O'Neill, and others

Research Notes of the American Astronomical Society

Measuring Sharpness in Grokking

Jack Miller, Patrick Gleeson, Charles O'Neill, Thang Bui, and Noam Levi

ICLR 2024, Bridging the Gap Between Practice and Theory Workshop

Post-training frontier legal agents with Baseten Research

April 2026, with Mudith Jayasekara, Matthew Blau, Aaron Ellis-Bloor, Niko Grupen, and Gabe Pereyra

Towards infinite context windows: neural KV cache compaction

March 2026, with Alex Sandomirsky and Harry Partridge

Dense, on-policy, or both?

March 2026, with Max Kirkby

Repeated KV cache for long-running agents

February 2026

Distillation without the dark

February 2026, with Max Kirkby and Mudith Jayasekara

If we can't design neat latent structures, then maybe we can Bitter Lesson it through self-study

January 2026

BYO SWE-grep: automatically train blazing fast search sub-agents on your knowledge base

October 2025, with Jonathon Liu

Lumina: building self-improving evaluation through customer-in-the-loop refinement

October 2025, with Harry Partridge, Max Kirkby, Jonathon Liu, Paras Stefanopoulos, and Mudith Jayasekara

Upweight the strategy, not the tokens: faster training with explicit reasoning through RGT

October 2025, with Harry Partridge and Mudith Jayasekara

Attention-based attribution: what your model is actually looking at

October 2025, with Jonathon Liu, Kimbrian Canavan, Max Kirkby, and Mudith Jayasekara

Training loss predicts evaluation performance, even for non-verifiable tasks

October 2025, with Max Kirkby

Robust, sample-efficient SFT with prompt mutations

October 2025, with Harry Partridge

Iterative SFT: dense reward learning

October 2025, with Jonathon Liu, Harry Partridge, Max Kirkby, and Mudith Jayasekara

Write small, learn forever: rank-1 LoRA for continual learning

October 2025, with Max Kirkby, Harry Partridge, and Jonathon Liu

Practical LoRA research

September 2025, with Max Kirkby

Do transformers notice their own mistakes? Finding a linear hallucination detector inside LLMs

February 2025, with Mudith Jayasekara, Max Kirkby, Sviatoslav Chalnev, and Rune Chi Zhao

Resurrecting the salmon: seeing clearer inside LLMs with domain-specific SAEs

January 2025, with Mudith Jayasekara and Max Kirkby

Why mechanistic interpretability needs a paradigm inversion

January 2025, with Mudith Jayasekara and Max Kirkby
