Jonathan Baxter
Independent Researcher
Verified email at panscient.com
Title | Cited by | Year
Experiments with infinite-horizon, policy-gradient estimation
J Baxter, PL Bartlett, L Weaver
J. Artif. Intell. Res. (JAIR) 15, 351-381, 2001
Cited by: 734*; Year: 2001
Infinite-Horizon Policy-Gradient Estimation
J Baxter, P Bartlett
Journal Of Artificial Intelligence Research 15, 319-350, 2001
Cited by: 705; Year: 2001
A model of inductive bias learning
J Baxter
Journal of Artificial Intelligence Research 12, 149-198, 2000
Cited by: 611; Year: 2000
Boosting algorithms as gradient descent
L Mason, J Baxter, PL Bartlett, MR Frean
Advances in neural information processing systems, 512-518, 2000
Cited by: 611; Year: 2000
Functional gradient techniques for combining hypotheses
L Mason, J Baxter, PL Bartlett, M Frean
Advances in Neural Information Processing Systems, 221-246, 1999
Cited by: 330; Year: 1999
Learning internal representations
J Baxter
Proceedings of the eighth annual conference on Computational learning theory …, 1995
Cited by: 226; Year: 1995
A Bayesian/information theoretic model of learning to learn via multiple task sampling
J Baxter
Machine learning 28 (1), 7-39, 1997
Cited by: 219; Year: 1997
Variance reduction techniques for gradient estimates in reinforcement learning
E Greensmith, PL Bartlett, J Baxter
Journal of Machine Learning Research 5 (Nov), 1471-1530, 2004
Cited by: 200; Year: 2004
Direct gradient-based reinforcement learning
J Baxter, PL Bartlett
Circuits and Systems, 2000. Proceedings. ISCAS 2000 Geneva. The 2000 IEEE …, 2000
Cited by: 183; Year: 2000
KnightCap: a chess program that learns by combining TD(λ) with game-tree search
J Baxter, A Tridgell, L Weaver
arXiv preprint cs/9901002, 1999
Cited by: 143; Year: 1999
Improved generalization through explicit optimization of margins
L Mason, PL Bartlett, J Baxter
Machine Learning 38 (3), 243-255, 2000
Cited by: 142; Year: 2000
Learning to play chess using temporal differences
J Baxter, A Tridgell, L Weaver
Machine Learning 40 (3), 243-263, 2000
Cited by: 140; Year: 2000
Reinforcement learning in POMDP's via direct gradient ascent
J Baxter, PL Bartlett
ICML, 41-48, 2000
Cited by: 122; Year: 2000
Scaling internal-state policy-gradient methods for POMDPs
D Aberdeen, J Baxter
Machine Learning: International Workshop then Conference, 3-10, 2002
Cited by: 96; Year: 2002
The evolution of learning algorithms for artificial neural networks
J Baxter
Complex systems, 313-326, 1993
Cited by: 82; Year: 1993
A multi-agent, policy-gradient approach to network routing
N Tao, J Baxter, L Weaver
In: Proc. of the 18th Int. Conf. on Machine Learning, 2001
Cited by: 74; Year: 2001
Theoretical models of learning to learn
J Baxter
Learning to learn, 71-94, 1998
Cited by: 73; Year: 1998