Suvrit Sra

97 publications

18 venues

H Index 39

Affiliation

Massachusetts Institute of Technology (MIT), Laboratory for Information and Decision Systems, Cambridge, MA, USA
Max Planck Institute for Biological Cybernetics, Tübingen, Germany
University of Texas at Austin, Department of Computer Sciences, Austin, TX, USA

Name Venue Year Citations
Graph Transformers Dream of Electric Flow. ICLR 2025 0
First-Order Methods for Linearly Constrained Bilevel Optimization. NIPS/NeurIPS 2024 13
How to Escape Sharp Minima with Random Perturbations. ICML 2024 0
Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context. ICML 2024 0
Linear attention is (maybe) all you need (to understand Transformer optimization). ICLR 2024 0
Transformers learn to implement preconditioned gradient descent for in-context learning. NIPS/NeurIPS 2023 260
On the Training Instability of Shuffling SGD with Batch Normalization. ICML 2023 6
The Crucial Role of Normalization in Sharpness-Aware Minimization. NIPS/NeurIPS 2023 30
Global optimality for Euclidean CCCP under Riemannian convexity. ICML 2023 8
Sign and Basis Invariant Networks for Spectral Graph Representation Learning. ICLR 2023 0
Efficient Sampling on Riemannian Manifolds via Langevin MCMC. NIPS/NeurIPS 2022 0
Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond. ICLR 2022 0
Understanding the unstable convergence of gradient descent. ICML 2022 81
CCCP is Frank-Wolfe in disguise. NIPS/NeurIPS 2022 19
Max-Margin Contrastive Learning. AAAI 2022 0
Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity. ICML 2022 0
Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective. ICML 2022 0
Understanding Riemannian Acceleration via a Proximal Extragradient Framework. COLT 2022 0
Three Operator Splitting with a Nonconvex Loss Function. ICML 2021 13
Three Operator Splitting with Subgradients, Stochastic Gradients, and Adaptive Learning Rates. NIPS/NeurIPS 2021 12
Open Problem: Can Single-Shuffle SGD be Better than Reshuffling SGD and GD? COLT 2021 13
Provably Efficient Algorithms for Multi-Objective Competitive RL. ICML 2021 24
Can contrastive learning avoid shortcut solutions? NIPS/NeurIPS 2021 164
Online Learning in Unknown Markov Games. ICML 2021 46
Coping with Label Shift via Distributionally Robust Optimisation. ICLR 2021 0
Contrastive Learning with Hard Negative Samples. ICLR 2021 0
Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes. NIPS/NeurIPS 2020 27
Complexity of Finding Stationary Points of Nonconvex Nonsmooth Functions. ICML 2020 90
Strength from Weakness: Fast Learning Using Weak Supervision. ICML 2020 35
From Nesterov's Estimate Sequence to Riemannian Acceleration. COLT 2020 85
Geodesically-convex optimization for averaging partially observed covariance matrices. ACML 2020 3
Why are Adaptive Methods Good for Attention Models? NIPS/NeurIPS 2020 345
SGD with shuffling: optimal rates without component convexity and large epoch requirements. NIPS/NeurIPS 2020 70
Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition. ICML 2020 0
Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity. ICLR 2020 0
Are deep ResNets provably better than linear predictors? NIPS/NeurIPS 2019 14
Escaping Saddle Points with Adaptive Gradient Methods. ICML 2019 78
Conditional Gradient Methods via Stochastic Path-Integrated Differential Estimator. ICML 2019 51
Flexible Modeling of Diversity with Strongly Log-Concave Distributions. NIPS/NeurIPS 2019 12
Random Shuffling Beats SGD after Finite Epochs. ICML 2019 0
Learning Determinantal Point Processes by Corrective Negative Sampling. AISTATS 2019 0
Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity. NIPS/NeurIPS 2019 0
Non-Linear Temporal Subspace Representations for Activity Recognition. CVPR 2018 45
An Estimate Sequence for Geodesically Convex Optimization. COLT 2018 63
Direct Runge-Kutta Discretization Achieves Acceleration. NIPS/NeurIPS 2018 112
Exponentiated Strongly Rayleigh Distributions. NIPS/NeurIPS 2018 14
A Generic Approach for Escaping Saddle points. AISTATS 2018 0
Modular Proximal Optimization for Multidimensional Total-Variation Regularization. JMLR 2018 0
Elementary Symmetric Polynomials for Optimal Experimental Design. NIPS/NeurIPS 2017 20
Polynomial time algorithms for dual volume sampling. NIPS/NeurIPS 2017 31
Combinatorial Topic Models using Small-Variance Asymptotics. AISTATS 2017 0
Fast DPP Sampling for Nyström with Application to Kernel Methods. ICML 2016 76
Kronecker Determinantal Point Processes. NIPS/NeurIPS 2016 32
Proximal Stochastic Methods for Nonsmooth Nonconvex Finite-Sum Optimization. NIPS/NeurIPS 2016 202
First-order Methods for Geodesically Convex Optimization. COLT 2016 320
AdaDelay: Delay Adaptive Distributed Stochastic Optimization. AISTATS 2016 45
Geometric Mean Metric Learning. ICML 2016 178
Stochastic Variance Reduction for Nonconvex Optimization. ICML 2016 642
Fast Mixing Markov Chains for Strongly Rayleigh Measures, DPPs, and Constrained Sampling. NIPS/NeurIPS 2016 39
Gaussian quadrature for matrix inverse forms with applications. ICML 2016 0
Parallel and Distributed Block-Coordinate Frank-Wolfe Algorithms. ICML 2016 0
Efficient Sampling for k-Determinantal Point Processes. AISTATS 2016 0
Riemannian SVRG: Fast Stochastic Optimization on Riemannian Manifolds. NIPS/NeurIPS 2016 0
Fixed-point algorithms for learning determinantal point processes. ICML 2015 55
On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants. NIPS/NeurIPS 2015 199
Matrix Manifold Optimization for Gaussian Mixtures. NIPS/NeurIPS 2015 97
Data modeling with the elliptical gamma distribution. AISTATS 2015 6
Large-scale randomized-coordinate descent methods with non-separable linear constraints. UAI 2015 0
Fast Newton methods for the group fused lasso. UAI 2014 17
Efficient Structured Matrix Rank Minimization. NIPS/NeurIPS 2014 20
Towards an optimal stochastic alternating direction method of multipliers. ICML 2014 59
Riemannian Sparse Coding for Positive Definite Matrices. ECCV 2014 55
Randomized Nonlinear Component Analysis. ICML 2014 183
Geometric optimisation on positive definite matrices for elliptically contoured distributions. NIPS/NeurIPS 2013 30
Jensen-Bregman LogDet Divergence with Application to Efficient Similarity Search for Covariance Matrices. TPAMI 2013 180
Reflection methods for user-friendly submodular optimization. NIPS/NeurIPS 2013 80
Fast projections onto mixed-norm balls with applications. DMKD 2012 29
A new metric on the manifold of kernel matrices with application to matrix geometric means. NIPS/NeurIPS 2012 155
Scalable nonconvex inexact proximal splitting. NIPS/NeurIPS 2012 69
Generalized Dictionary Learning for Symmetric Positive Definite Matrices with Application to Nearest Neighbor Retrieval. ECML/PKDD 2011 50
Fast Newton-type Methods for Total Variation Regularization. ICML 2011 94
Efficient similarity search for covariance matrices via the Jensen-Bregman LogDet Divergence. ICCV 2011 85
Fast Projections onto ℓ1,q-Norm Balls for Grouped Feature Selection. ECML/PKDD 2011 39
Efficient filter flow for space-variant multiframe blind deconvolution. CVPR 2010 265
A scalable trust-region algorithm with application to mixed-norm regression. ICML 2010 40
Workshop summary: Numerical mathematics in machine learning. ICML 2009 0
Convex Perturbations for Scalable Semidefinite Programming. AISTATS 2009 9
Block-Iterative Algorithms for Non-negative Matrix Approximation. ICDM 2008 5
Fast Newton-type Methods for the Least Squares Nonnegative Matrix Approximation Problem. SDM 2007 145
Information-theoretic metric learning. ICML 2007 0
Efficient Large Scale Linear Programming Support Vector Machines. ECML/PKDD 2006 20
Incremental Aspect Models for Mining Document Streams. ECML/PKDD 2006 19
Generalized Nonnegative Matrix Approximations with Bregman Divergences. NIPS/NeurIPS 2005 522
Clustering on the Unit Hypersphere using von Mises-Fisher Distributions. JMLR 2005 1034
Triangle Fixing Algorithms for the Metric Nearness Problem. NIPS/NeurIPS 2004 23
Minimum Sum-Squared Residue Co-Clustering of Gene Expression Data. SDM 2004 329
Generative model-based clustering of directional data. KDD 2003 123