Correlation structure of training data and the fitting ability of back propagation networks: Some experimental results
White [6-8] has shown theoretically that the learning procedures used in network training are inherently statistical in nature. This paper takes a small but pioneering experimental step towards understanding this statistical behaviour by showing that the results obtained are in line with White's theory. We show that, given two random vectors X (input) and Y (target) which follow a two-dimensional standard normal distribution, and a fixed network complexity, the network's fitting ability improves as the correlation coefficient r(XY) between X and Y increases (0 ≤ r(XY) ≤ 1). We also provide numerical examples showing that both increasing the network complexity and training for much longer improve the network's performance. However, as we demonstrate, these improvements are far from dramatic, except in the case r(XY) = +1. This is mainly due to the existence of a theoretical lower bound on the inherent conditional variance of Y given X, as we show both analytically and numerically. Finally, the fitting ability of the network on a test set is illustrated with an example.
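The lower bound mentioned above can be illustrated with a standard property of the bivariate standard normal distribution: the conditional variance of Y given X is 1 - r(XY)^2, so no predictor of Y from X, however complex, can reach a mean squared error below that value. The following is a minimal sketch of this fact, not the experimental setup of the paper. The function name `best_possible_mse`, the sample size, and the seed are illustrative assumptions, and the optimal predictor r(XY)·x stands in for a trained network.

```python
import numpy as np


def best_possible_mse(r, n_samples=200_000, seed=0):
    """Empirical MSE of the optimal predictor E[Y | X] = r * X.

    Illustrative helper (not from the paper). X and Y are drawn from a
    bivariate standard normal distribution with correlation r.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_samples)
    z = rng.standard_normal(n_samples)  # noise independent of x
    # Standard construction: y has unit variance and corr(x, y) = r.
    y = r * x + np.sqrt(1.0 - r**2) * z
    prediction = r * x  # conditional mean of Y given X = x
    return float(np.mean((y - prediction) ** 2))


# Compare the empirical error with the theoretical bound 1 - r^2.
for r in (0.0, 0.5, 0.9, 1.0):
    mse = best_possible_mse(r)
    print(f"r = {r:.1f}  empirical MSE = {mse:.4f}  bound = {1 - r**2:.4f}")
```

Under these assumptions the error floor vanishes only at r(XY) = +1, which is consistent with the abstract's observation that added complexity or longer training helps dramatically only in that case.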