Research

Charles O'Neill

Compute Optimal Inference and Provable Amortisation Gap in Sparse Autoencoders
Under submission to ICML 2025

Charles O'Neill

Sparse Autoencoders for Disentangling Dense Embeddings of Scientific Concepts
Published as an oral at NeurIPS 2024 - Foundation Models for Science

Charles O'Neill

Sparse autoencoders for dense text embeddings reveal hierarchical feature sub-structure
Published at NeurIPS 2024 - Scientific Methods for Understanding Deep Learning

Charles O'Neill

Steering semantic search with interpretable features from sparse autoencoders
Published at NeurIPS 2024 - Foundation Model Interventions

Charles O'Neill

Disentangling Dense Embeddings with Sparse Autoencoders
Under review. Released an accompanying web application on Hugging Face

Charles O'Neill and Thang Bui

Sparse Autoencoders Enable Scalable and Reliable Circuit Identification in Language Models
Preprint on arXiv

Charles O'Neill, Yuan-Sen Ting, Ioana Ciuca, Roberta Raileanu, Jack Miller, and Thang Bui

Steering Language Generation: Harnessing Contrastive Expert Guidance and Negative Prompting for Coherent and Diverse Synthetic Data Generation
Preprint on arXiv

Tuan Dung Nguyen, Yuan-Sen Ting, Ioana Ciuca, Charles O'Neill, and others

AstroLLaMA: Towards specialised foundation models in astronomy
Published at IJCNLP-AACL 2023

Jack Miller, Charles O'Neill, and Thang Bui

Grokking Beyond Neural Networks: An Empirical Exploration with Model Complexity
Published in TMLR

Ernest Perkowski, Rui Pan, Tuan Dung Nguyen, Yuan-Sen Ting, Sandor Kruk, Tong Zhang, Charles O'Neill, and others

AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets
Published in Research Notes of the American Astronomical Society

Jack Miller, Patrick Gleeson, Charles O'Neill, Thang Bui, and Noam Levi

Measuring Sharpness in Grokking
Published at the ICLR 2024 BGPT Workshop