Research

Charles O'Neill

Compute Optimal Inference and Provable Amortisation Gap in Sparse Autoencoders
Under submission to ICML 2025

Charles O'Neill

Sparse Autoencoders for Disentangling Dense Embeddings of Scientific Concepts
Published as an oral at NeurIPS 2024 - Foundation Models for Science

Charles O'Neill

Sparse autoencoders for dense text embeddings reveal hierarchical feature sub-structure
Published at NeurIPS 2024 - Scientific Methods for Understanding Deep Learning

Charles O'Neill

Steering semantic search with interpretable features from sparse autoencoders
Published at NeurIPS 2024 - Foundation Model Interventions

Charles O'Neill

Disentangling Dense Embeddings with Sparse Autoencoders
Under review. Released an accompanying web application on Hugging Face

Charles O'Neill and Thang Bui

Sparse Autoencoders Enable Scalable and Reliable Circuit Identification in Language Models
Preprint on arXiv

Charles O'Neill, Yuan-Sen Ting, Ioana Ciuca, Roberta Raileanu, Jack Miller, and Thang Bui

Steering Language Generation: Harnessing Contrastive Expert Guidance and Negative Prompting for Coherent and Diverse Synthetic Data Generation
Preprint on arXiv

Tuan Dung Nguyen, Yuan-Sen Ting, Ioana Ciuca, Charles O'Neill, and others

AstroLLaMA: Towards specialised foundation models in astronomy
Published at IJCNLP-AACL 2023

Jack Miller, Charles O'Neill, and Thang Bui

Grokking Beyond Neural Networks: An Empirical Exploration with Model Complexity
Published in TMLR

Ernest Perkowski, Rui Pan, Tuan Dung Nguyen, Yuan-Sen Ting, Sandor Kruk, Tong Zhang, Charles O'Neill, and others

AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets
Published in Research Notes of the American Astronomical Society

Jack Miller, Patrick Gleeson, Charles O'Neill, Thang Bui, and Noam Levi

Measuring Sharpness in Grokking
Published at the ICLR 2024 BGPT Workshop