Jan Leike

18 publications

9 venues

H Index 9

Affiliations

Anthropic PBC, San Francisco, CA, USA
OpenAI, San Francisco, CA, USA
Australian National University, Canberra, ACT, Australia
University of Freiburg, Germany

Name Venue Year Citations
Scaling and evaluating sparse autoencoders. ICLR 2025 0
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision. ICML 2024 0
Let's Verify Step by Step. ICLR 2024 0
Training language models to follow instructions with human feedback. NIPS/NeurIPS 2022 18854
Quantifying Differences in Reward Functions. ICLR 2021 0
Learning Human Objectives by Evaluating Hypothetical Behavior. ICML 2020 1
Pitfalls of Learning a Reward Function Online. IJCAI 2020 0
Reward learning from human preferences and demonstrations in Atari. NIPS/NeurIPS 2018 454
On Thompson Sampling and Asymptotic Optimality. IJCAI 2017 54
Deep Reinforcement Learning from Human Preferences. NIPS/NeurIPS 2017 4738
Generalised Discount Functions applied to a Monte-Carlo AIμ Implementation. AAMAS 2017 4
Universal Reinforcement Learning Algorithms: Survey and Experiments. IJCAI 2017 19
Thompson Sampling is Asymptotically Optimal in General Environments. UAI 2016 40
A Formal Solution to the Grain of Truth Problem. UAI 2016 18
Loss Bounds and Time Complexity for Speed Priors. AISTATS 2016 9
Bad Universal Priors and Notions of Optimality. COLT 2015 45
Sequential Extensions of Causal and Evidential Decision Theory. ADT 2015 15
On the Computability of AIXI. UAI 2015 9