|Title:||Selection of input variables for data driven models: An average shifted histogram partial mutual information estimator approach|
|Citation:||Journal of Hydrology, 2009; 367(3-4):165-176|
|Publisher:||Elsevier Science BV|
|Abstract:||The use of artificial neural networks (ANNs) for the modelling of water resources variables has increased rapidly in recent years. This paper addresses one of the important issues associated with artificial neural network model development: input variable selection. In this study, the partial mutual information (PMI) input selection algorithm is modified to increase its computational efficiency, while maintaining its accuracy. As part of the modification, use of average shifted histograms (ASHs) is introduced as an alternative to kernel-based methods for the estimation of mutual information (MI). Empirical guidelines are developed to estimate the key ASH parameters as a function of sample size. The stopping criterion used with the original PMI algorithm is replaced with a more computationally efficient outlier detection technique based on the Hampel distance. The performance of the proposed PMI algorithm, in terms of computational efficiency and input selection accuracy, is first investigated by using it to identify significant variables for data series where dependencies of attributes are known a priori. The proposed ASH PMI input variable selection algorithm with the Hampel distance stopping criterion consistently selects the correct inputs, while being computationally efficient. The modified PMI algorithm is then applied to identify suitable inputs to forecast salinity in the River Murray at Murray Bridge, South Australia, with a lead time of 14 days using an ANN approach. The ANN models developed with the inputs selected with the modified PMI algorithm perform very well when compared with results obtained using ANN models with different input sets developed in previous studies. Furthermore, the proposed input variable selection algorithm results in more parsimonious ANN models. © 2008 Elsevier B.V. All rights reserved.|
|Keywords:||Artificial neural networks; Average shifted histograms|
|Appears in Collections:||Aurora harvest; Civil and Environmental Engineering publications; Environment Institute publications|