| Position: Video as the New Language for Real-World Decision Making | ICML | 2024 | 0 |
| A Distributional Analogue to the Successor Representation | ICML | 2024 | 0 |
| A Definition of Continual Reinforcement Learning | NIPS/NeurIPS | 2023 | 0 |
| Deep Reinforcement Learning with Plasticity Injection | NIPS/NeurIPS | 2023 | 0 |
| Temporal Abstraction in Reinforcement Learning with the Successor Representation | JMLR | 2023 | 0 |
| Generalised Policy Improvement with Geometric Policy Composition | ICML | 2022 | 1 |
| Model-Value Inconsistency as a Signal for Epistemic Uncertainty | ICML | 2022 | 0 |
| The Phenomenon of Policy Churn | NIPS/NeurIPS | 2022 | 0 |
| Approximate Value Equivalence | NIPS/NeurIPS | 2022 | 0 |
| Risk-Aware Transfer in Reinforcement Learning using Successor Features | NIPS/NeurIPS | 2021 | 3 |
| Discovering a set of policies for the worst case reward | ICLR | 2021 | 14 |
| Proper Value Equivalence | NIPS/NeurIPS | 2021 | 14 |
| The Value-Improvement Path: Towards Better Representations for Reinforcement Learning | AAAI | 2021 | 0 |
| Expected Eligibility Traces | AAAI | 2021 | 0 |
| Temporally-Extended ε-Greedy Exploration | ICLR | 2021 | 0 |
| On Efficiency in Hierarchical Reinforcement Learning | NIPS/NeurIPS | 2020 | 14 |
| The Value Equivalence Principle for Model-Based Reinforcement Learning | NIPS/NeurIPS | 2020 | 40 |
| Fast Task Inference with Variational Intrinsic Successor Features | ICLR | 2020 | 0 |
| Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates | NIPS/NeurIPS | 2019 | 7 |
| Composing Entropic Policies using Divergence Correction | ICML | 2019 | 0 |
| The Option Keyboard: Combining Skills in Reinforcement Learning | NIPS/NeurIPS | 2019 | 0 |
| Fast deep reinforcement learning using online adjustments from the past | NIPS/NeurIPS | 2018 | 31 |
| Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement | ICML | 2018 | 109 |
| Value-Aware Loss Function for Model-based Reinforcement Learning | AISTATS | 2017 | 66 |
| Natural Value Approximators: Learning when to Trust Past Estimates | NIPS/NeurIPS | 2017 | 8 |
| The Predictron: End-To-End Learning and Planning | ICML | 2017 | 0 |
| Successor Features for Transfer in Reinforcement Learning | NIPS/NeurIPS | 2017 | 0 |
| Incremental Stochastic Factorization for Online Reinforcement Learning | AAAI | 2016 | 6 |
| Practical Kernel-Based Reinforcement Learning | JMLR | 2016 | 0 |
| An Expectation-Maximization Algorithm to Compute a Stochastic Factorization From Data | IJCAI | 2015 | 2 |
| Policy Iteration Based on Stochastic Factorization | JAIR | 2014 | 15 |
| Tree-Based On-Line Reinforcement Learning | AAAI | 2014 | 4 |
| On-line Reinforcement Learning Using Incremental Kernel-Based Stochastic Factorization | NIPS/NeurIPS | 2012 | 19 |
| Reinforcement Learning using Kernel-Based Stochastic Factorization | NIPS/NeurIPS | 2011 | 41 |
| Restricted gradient-descent algorithm for value-function approximation in reinforcement learning | Artificial Intelligence | 2008 | 54 |