Tor Lattimore
DeepMind
Verified email at google.com
Title | Cited by | Year
Bandit algorithms
T Lattimore, C Szepesvári
preprint, 2018
Cited by: 102; Year: 2018
Optimal cluster recovery in the labeled stochastic block model
SY Yun, A Proutiere
Advances in Neural Information Processing Systems, 965-973, 2016
Cited by: 65*; Year: 2016
PAC bounds for discounted MDPs
T Lattimore, M Hutter
International Conference on Algorithmic Learning Theory, 320-334, 2012
Cited by: 47; Year: 2012
Optimal cluster recovery in the labeled stochastic block model
SY Yun, A Proutiere
Advances in Neural Information Processing Systems, 965-973, 2016
Cited by: 35; Year: 2016
The end of optimism? An asymptotic analysis of finite-armed linear bandits
T Lattimore, C Szepesvári
arXiv preprint arXiv:1610.04491, 2016
Cited by: 30; Year: 2016
Optimally confident UCB: Improved regret for finite-armed bandits
T Lattimore
arXiv preprint arXiv:1507.07880, 2015
Cited by: 29; Year: 2015
Unifying PAC and regret: Uniform PAC bounds for episodic reinforcement learning
C Dann, T Lattimore, E Brunskill
Advances in Neural Information Processing Systems, 5713-5723, 2017
Cited by: 28; Year: 2017
Universal knowledge-seeking agents for stochastic environments
L Orseau, T Lattimore, M Hutter
International Conference on Algorithmic Learning Theory, 158-172, 2013
Cited by: 27; Year: 2013
The sample-complexity of general reinforcement learning
T Lattimore, M Hutter, P Sunehag
Proceedings of the 30th International Conference on Machine Learning, 2013
Cited by: 26; Year: 2013
On explore-then-commit strategies
A Garivier, T Lattimore, E Kaufmann
Advances in Neural Information Processing Systems, 784-792, 2016
Cited by: 25; Year: 2016
Asymptotically optimal agents
T Lattimore, M Hutter
International Conference on Algorithmic Learning Theory, 368-382, 2011
Cited by: 25; Year: 2011
No free lunch versus Occam’s razor in supervised learning
T Lattimore, M Hutter
Algorithmic Probability and Friends. Bayesian Prediction and Artificial …, 2013
Cited by: 24; Year: 2013
Conservative bandits
Y Wu, R Shariff, T Lattimore, C Szepesvári
International Conference on Machine Learning, 1254-1262, 2016
Cited by: 23; Year: 2016
Thompson sampling is asymptotically optimal in general environments
J Leike, T Lattimore, L Orseau, M Hutter
arXiv preprint arXiv:1602.07905, 2016
Cited by: 21; Year: 2016
Near-optimal PAC bounds for discounted MDPs
T Lattimore, M Hutter
Theoretical Computer Science 558, 125-143, 2014
Cited by: 21; Year: 2014
Refined lower bounds for adversarial bandits
S Gerchinovitz, T Lattimore
Advances in Neural Information Processing Systems, 1198-1206, 2016
Cited by: 20; Year: 2016
General time consistent discounting
T Lattimore, M Hutter
Theoretical Computer Science 519, 140-154, 2014
Cited by: 19; Year: 2014
Regret analysis of the finite-horizon Gittins index strategy for multi-armed bandits
T Lattimore
Conference on Learning Theory, 1214-1245, 2016
Cited by: 18; Year: 2016
Optimal resource allocation with semi-bandit feedback
T Lattimore, K Crammer, C Szepesvári
arXiv preprint arXiv:1406.3840, 2014
Cited by: 16; Year: 2014
Bounded Regret for Finite-Armed Structured Bandits
T Lattimore, R Munos
Cited by: 16; Year: 2014