Reading List

Machine Learning
2008-I


Bayesian decision theory

[Tenenbaum06]
Tenenbaum, J. B.; Griffiths, T. L. & Kemp, C.
Theory-based Bayesian models of inductive learning and reasoning
Trends in Cognitive Sciences,
2006, 10, 309-318
[Dietterich02]
Dietterich, T. G.
Machine Learning for Sequential Data: A Review
Structural, Syntactic, and Statistical Pattern Recognition: Joint IAPR International Workshops SSPR 2002 and SPR 2002, Windsor, Ontario, Canada, August 6-9, 2002: Proceedings,
2002
[Friedman99]
Friedman, N.; Getoor, L.; Koller, D. & Pfeffer, A.
Learning probabilistic relational models
Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence,
1999, 1300-1309
[Goldenberg05]
Goldenberg, A. & Moore, A.
Bayes Net Graphs to Understand Coauthorship Networks
KDD Workshop on Link Discovery: Issues, Approaches and Applications,
2005

Kernel methods

[Grauman05]
Grauman, K., and T. Darrell.
Pyramid match kernel: Discriminative classification with sets of image features.
MIT Computer Science and Artificial Intelligence Laboratory Technical Report, MIT-CSAIL-TR-2005-017
2005.
[Leslie02]
Leslie, C., E. Eskin, and W. S. Noble. 
The spectrum kernel: A string kernel for SVM protein classification.
In Proceedings of the 2002 Pacific Symposium on Biocomputing
2002
[Lodhi02]
Lodhi, H., C. Saunders, J. Shawe-Taylor, N. Cristianini, and C. Watkins. 
Text classification using string kernels. 
The Journal of Machine Learning Research 2:419-444.
2002

Regularization and model complexity

[Lawrence96]
Lawrence, Steve, C. Lee Giles, and A.C. Tsoi. 
What Size Neural Network Gives Optimal Generalization? Convergence Properties of Backpropagation. 
UM Computer Science Department. University of Maryland, UMIACS-TR-96-22
1996
[Mehta95]
Mehta, M., J. Rissanen, and R. Agrawal. 
MDL-based decision tree pruning.
In Proceedings of KDD95
1995
[Roberts00]
Roberts, S., and H. Pashler. 
How persuasive is a good fit? A comment on theory testing.
Psychological Review 107, no. 2:358-367
2000

Performance evaluation

[Domingos99]
Domingos, P. 
MetaCost: a general method for making classifiers cost-sensitive. 
In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 155-164
1999
[Hand01]
Hand, David J., and Robert J. Till. 
A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems. 
Machine Learning 45, no. 2:171-186.
2001
[Japkowicz02]
Japkowicz, N. 
The class imbalance problem: A systematic study. 
Intelligent Data Analysis, 6(5), pp. 429-449.
2002

Combining multiple classifiers

[Dietterich00] 
Dietterich, T. G. 
An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization.
Machine Learning 40, no. 2:139-157.
2000
[Mason00] 
Mason, L., J. Baxter, P. Bartlett, and M. Frean. 
Boosting algorithms as gradient descent. 
In Advances in Neural Information Processing Systems, 12:512-518. 
2000
[Oza05]
Oza, N.  
Online bagging and boosting. 
In 2005 IEEE International Conference on Systems, Man and Cybernetics  
2005

Clustering and density estimation