
Mathematical Modelling of Ensemble Classifier Systems via Optimization of Diversity-Accuracy Trade-off

Final Report Summary - OPT-DIVA (Mathematical Modelling of Ensemble Classifier Systems via Optimization of Diversity-Accuracy Trade-off)

The overall aim of this project was to develop novel and effective ensemble classifier systems via optimization of the diversity-accuracy trade-off. In this study, the following key scientific objectives were addressed:

1) Improvement of ensemble classifier systems by addressing the accuracy-diversity trade-off via novel optimization schemes in Task 1
2) Reduction of the time complexity of model selection in Tasks 2 and 3
3) Generalisation of the overall model to heterogeneous data, and experimental comparison on the classification of facial expressions, in Tasks 4 and 5.

Task 1 addressed the modelling of weighted ensemble classifier systems, tackling the important problem of finding an optimization model for ensemble classifier systems that manages the diversity-accuracy trade-off. A continuous unconstrained optimization model was developed and evaluated on the well-known UCI machine learning repository [1] and on the facial action unit data provided by the University of Surrey CVSSP lab. The objective function involves a quadratic term in a variable x and a matrix G: the binary variable x indicates whether a classifier is included in the ensemble, and G is a square error matrix whose diagonal entries give the total error of each classifier and whose off-diagonal entries give the common errors between pairs of classifiers, thereby providing a measure of diversity.
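
For illustration, the following minimal sketch (not the project's code) shows one way to build such an error matrix G from per-sample error indicators of the base classifiers and to evaluate the quadratic objective; it follows the co-error counting described above.

import numpy as np

def error_matrix(predictions, y_true):
    # predictions: array of shape (n_classifiers, n_samples) with predicted labels.
    # G[i, i] counts the errors of classifier i; G[i, j] (i != j) counts the
    # samples misclassified by both i and j, i.e. their common errors.
    errors = (predictions != y_true).astype(float)
    return errors @ errors.T

def ensemble_objective(x, G):
    # Quadratic objective x^T G x, where x is a 0/1 (or relaxed) inclusion vector.
    return x @ G @ x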

Task 2 addressed the model selection of base classifiers and integrated the weighted ensemble classifier obtained from Task 1 with a model selection strategy. A novel optimization model was created and applied to Error Correcting Output Codes (ECOC), pruning the base classifiers of the ECOC matrix. Although techniques are known for creating efficient ECOC matrices, pruning the base classifiers of an ECOC matrix had not previously been studied. A regularization term, controlled by a parameter, was added to the objective function given in Task 1. This additional term allows us to find a subset of classifiers that constitutes the most accurate and diverse ensemble. We approximated the zero norm, and the resulting model is a nonconvex unconstrained optimization problem, since the matrix G is symmetric but not positive definite. We handled the nonconvexity by rewriting the problem as a difference of convex functions (DC programming) and solving it with the nonlinear optimization method Sequential Quadratic Programming (SQP). More detail can be found in the attached paper.
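
As a rough illustration of this machinery (not the attached paper's exact algorithm), the sketch below splits the indefinite quadratic into a difference of convex functions, G = (G + tau*I) - tau*I, and solves each convex subproblem with SciPy's SLSQP routine, an SQP method. A simple quadratic penalty lam*||x||^2 stands in for the regularization term; its exact form is given in the attached paper.

import numpy as np
from scipy.optimize import minimize

def dc_ecoc_prune(G, lam=0.1, n_iter=20):
    n = G.shape[0]
    tau = max(0.0, -np.linalg.eigvalsh(G).min()) + 1e-6  # shift so that G + tau*I is positive definite
    G_plus = G + tau * np.eye(n)
    x = np.full(n, 0.5)  # relaxed 0/1 inclusion variables
    for _ in range(n_iter):
        grad_h = 2.0 * tau * x  # gradient of the subtracted convex part h(x) = tau*||x||^2
        # convex subproblem: minimize g(x) - grad_h(x_k)^T x over the unit box
        sub = lambda z: z @ G_plus @ z + lam * (z @ z) - grad_h @ z
        x = minimize(sub, x, method="SLSQP", bounds=[(0.0, 1.0)] * n).x
    return x  # small entries indicate base classifiers that can be pruned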

Task 3 addressed regularization of model complexity, aiming to add a regularization term that penalizes complexity. As the number of classifiers increases, the size of the weight vector, and hence the model complexity, increases accordingly. Therefore, incorporating the ensemble size constraint given in [2] into the objective function, weighted by a parameter, regularizes the model complexity directly in our new formulation. In this way, taking the zero-norm approximation of the model complexity gives a better approximation than the Tikhonov regularization proposed in Task 2. The proposed model was compared with other pruning methods, including Reduced Error Pruning, Kappa Pruning and Random Guessing. We reported the results in a paper (attached) submitted to the Machine Learning journal.
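
In LaTeX notation, one plausible way to write this step is the following (the exponential surrogate is a standard illustrative choice of zero-norm approximation; the exact surrogate used in the project is given in the attached paper; L denotes the number of base classifiers):

\min_{x \in \{0,1\}^L} x^\top G x \quad \text{s.t.} \quad \sum_{i=1}^{L} x_i = k
\qquad \longrightarrow \qquad
\min_{x \in [0,1]^L} \; x^\top G x + \lambda \|x\|_0 ,
\qquad \|x\|_0 \approx \sum_{i=1}^{L} \bigl( 1 - e^{-\alpha x_i} \bigr),

compared with the Tikhonov alternative x^\top G x + \lambda \|x\|_2^2.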

Task 4 addressed the generalization of the overall model to heterogeneous multiclass data. A genetic algorithm was proposed to learn the kernel weights for each subproblem in ECOC, but at increased computational cost. To improve the speed of the algorithm we implemented Generalized MKL [3], adapted to our ECOC methodology, but obtained results similar to the single-kernel approach. The reason may lie in the nature of Error Correcting Output Coding, since it corrects the errors regardless of whether a single kernel or multiple kernels are used. In other words, learning with MKL may be redundant and more costly than single-kernel learning. ECOC already corrects the outputs of the single-kernel binary classifiers (i.e. the base classifiers), so we conclude that a single kernel may be sufficient for heterogeneous data in ECOC.
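
For reference, a brief sketch of the single-kernel ECOC baseline that this conclusion refers to, using scikit-learn's OutputCodeClassifier with an RBF-kernel SVM as the base learner (the project's genetic algorithm and Generalized MKL code are not reproduced here; the wine data stands in for the datasets of Task 5):

from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OutputCodeClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
# Each column of the random ECOC matrix trains one single-kernel SVM base classifier.
base = make_pipeline(StandardScaler(), SVC(kernel="rbf", gamma="scale"))
ecoc = OutputCodeClassifier(base, code_size=2, random_state=0)
print("5-fold CV accuracy:", cross_val_score(ecoc, X, y, cv=5).mean())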

Task 5 addressed applications. We applied all the above tasks to both UCI datasets and the Cohn-Kanade facial expression dataset. Image processing, feature extraction and dimensionality reduction were performed as in [4]. The method was evaluated on the ecoli, glass, dermatology, yeast and wine datasets from UCI, as well as on facial expression classification. We compared the accuracy and running time of our algorithm with those of different pruning methods such as Reduced Error Pruning, Kappa Pruning and Random Guessing. The results showed that eliminating the ensemble size parameter from the constraints and instead including it in the objective function via the proposed approximation algorithm reduced the time complexity and improved the accuracy for most datasets. More detail can be found in the attached paper.
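
A hedged sketch of such an evaluation protocol is shown below; the pruning functions passed in are hypothetical stand-ins for the compared methods, not real library calls. Each method is timed and the pruned ensemble is scored on a held-out split.

import time

def compare_pruning_methods(methods, fit_ensemble, X_train, y_train, X_test, y_test):
    # methods: dict mapping a name to a pruning function, e.g.
    # {"proposed": prune_proposed, "reduced_error": prune_reduced_error, ...}
    for name, prune in methods.items():
        start = time.perf_counter()
        ensemble = prune(fit_ensemble(X_train, y_train))  # select a subset of base classifiers
        elapsed = time.perf_counter() - start
        accuracy = (ensemble.predict(X_test) == y_test).mean()
        print(f"{name}: accuracy={accuracy:.3f}, pruning time={elapsed:.2f}s")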

In summary, in this project we have proposed a novel algorithm that prunes the Error Correcting Output Codes (ECOC) method within an optimization framework. The proposed model enables us to eliminate the ensemble size parameter by relaxing the binary optimization model to a continuous one. The application of the project to facial expression classification has motivated a forthcoming collaborative project on recognising the heart attack risk of people exercising on a treadmill from facial expressions in online video images. From a personal standpoint, the Marie Curie researcher Dr. Akyuz has benefited greatly from this project. With the help of the host university, she has developed her career in theoretical machine learning as well as in engineering applications in the computer vision field.

REFERENCES
[1] C.L. Blake and C.J. Merz. UCI repository of machine learning databases. University of California, Irvine, Dept. of Information and Computer Sciences, 1998. http://www.ics.uci.edu/_mlearn/MLRepository.html.
[2] Y. Zhang, S. Burer, and W. N. Street. Ensemble pruning via semi-definite programming. Journal of Machine Learning Research, 7:1315-1338, 2006.
[3] M. Varma and B. R. Babu. More generality in efficient multiple kernel learning. In Proceedings of the International Conference on Machine Learning, Montreal, Canada, 2009.
[4] R. Smith and T. Windeatt. Facial action unit recognition using filtered local binary pattern features with bootstrapped and weighted ECOC classifiers. In O. Okun, G. Valentini, and M. Re, editors, Ensembles in Machine Learning Applications, volume 373 of Studies in Computational Intelligence, pages 1-20. Springer, 2011.