This is a short post for all of you out there who use information criteria to inform model decision making. The usual suspects are the Akaike (AIC) and the Schwarz Bayesian (BIC) criteria.
Especially if you’re working with big data, try adding the Hannan-Quinn (1979) criterion into the mix. It’s not often used in practice for some reason; possibly this is a leftover from our small-sample days. Its penalty term grows like log(log n), slower than the BIC’s log(n) but faster than the AIC’s constant penalty — in fact it’s the slowest-growing penalty that still gives a consistent model selector. As a result, it’s often more conservative than the AIC in the number of parameters or size of the model it suggests, i.e. it can be a good foil against overfitting.
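To make the comparison concrete, here is a minimal sketch of all three criteria using their standard textbook forms, where `log_lik` is the maximised log-likelihood, `k` is the number of estimated parameters, and `n` is the sample size (the function and argument names are mine, not from any particular library):

```python
import math

def aic(log_lik, k):
    # Akaike: penalty is 2k, constant in the sample size
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    # Schwarz Bayesian: penalty k*log(n) grows with the sample size
    return k * math.log(n) - 2 * log_lik

def hqc(log_lik, k, n):
    # Hannan-Quinn: penalty 2k*log(log(n)) grows, but very slowly
    return 2 * k * math.log(math.log(n)) - 2 * log_lik

# For any reasonably large n, the per-parameter penalties order as
# AIC < HQC < BIC, which is why HQC sits between the two usual suspects.
```

Lower is better for all three, and since each is just arithmetic on quantities you’ve already computed when fitting the model, adding the HQC really does cost nothing at run time.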
It’s not the whole answer, and for your purposes it may offer no different insight. But it adds nothing to your run time, is fabulously practical, and is one of my all-time favourites that no one has ever heard of.