The final prediction error criterion for MLP models
By the nature of multilayer perceptron (MLP) models, the number p of free adjustable parameters they involve is high compared to other identification devices, and the number n of learning examples needed to ensure good generalisation of the trained network grows correspondingly. Since little is known about the upper bound on the p/n ratio below which damaging overfitting is avoided, an extra test set is normally required to validate the model, with a consequent increase in the required data sample size. To account for a possibly limited availability of examples, this paper addresses the problem of estimating the expected generalisation rate without setting aside an additional sample for cross-validation purposes. Previously proposed approaches do not hold when the MLPs are seen as nonlinear least-squares regression models, because the relevant learning and generalisation rates are then mean-squared-error quantities. By putting learning in artificial neural networks into a statistical perspective, it is possible to derive the theoretical conditions under which Akaike's Final Prediction Error (FPE) criterion may be used to estimate the expected generalisation rate of a trained MLP. By varying the architectural size of the fitting MLP, underparametrised, overparametrised or perfectly suited classes of MLP models are generated; the resulting large empirical database suggests that the range of applicability of the FPE criterion goes far beyond the conditions needed for its derivation.
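As a minimal sketch of the criterion discussed above: Akaike's classical FPE for a least-squares model inflates the training mean squared error by the factor (n + p)/(n − p), where n is the number of learning examples and p the number of free adjustable parameters. The formula below is the standard textbook FPE, not code from the paper, and the example numbers are purely illustrative.

```python
def fpe(training_mse, n, p):
    """Estimate the expected generalisation MSE from the training MSE
    using Akaike's Final Prediction Error criterion.

    training_mse : mean squared error on the n learning examples
    n            : number of learning examples
    p            : number of free adjustable parameters
                   (for an MLP, all weights and biases)
    """
    if n <= p:
        # FPE is only meaningful when there are more examples than parameters
        raise ValueError("FPE requires n > p")
    return training_mse * (n + p) / (n - p)

# Hypothetical example: a small MLP with p = 25 weights trained on
# n = 200 examples reaching a training MSE of 0.040.
estimate = fpe(0.040, n=200, p=25)
print(round(estimate, 4))  # inflation factor (n+p)/(n-p) = 225/175
```

Note that the inflation factor can equivalently be written (1 + p/n)/(1 − p/n), which makes explicit the role of the p/n ratio the abstract refers to: as p/n approaches 1, the estimated generalisation error blows up, signalling overparametrisation.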
Bibliographic Reference: Paper presented at the International Conference on Artificial Neural Networks, Brighton (GB), 4-7 Sept. 1992
Availability: Available from (1) as Paper EN 36775 ORA
Record Number: 199210615 / Last updated on: 1994-12-02
Original language: en
Available languages: en