The need to set aside part of the training set as verification data is straightforward, but I am not clear on how, and at what stage of the training, it should be incorporated.
Is it at the end of the training (after reaching a good minimum for the training data)? If so, what should be done if the verification data yields a large error?
Or is it throughout the training (keep searching for a minimum as long as the error on either the training or the verification data is unsatisfactory)?
No matter what I try, the network seems to have trouble learning both the training and the verification data once the verification set reaches a certain size, while it has no problem learning the same data when all of it is used for training. I recall reading somewhere that 70% training / 30% verification is a common ratio, but I get stuck at a much smaller verification share.
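To make the second option concrete, here is a minimal sketch of what I understand by checking the verification error throughout training. It is only an illustration under stated assumptions, not my actual network: the model is a toy one-weight linear fit, the data is synthetic, and the names `split_holdout`, `train_with_verification` and `patience` are hypothetical. In this arrangement the verification samples are only measured each epoch and never enter the weight update.

```python
import random


def split_holdout(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and split it into (training, verification)."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def mean_squared_error(w, b, samples):
    """Mean squared error of the line y = w*x + b on the given samples."""
    return sum((w * x + b - y) ** 2 for x, y in samples) / len(samples)


def train_with_verification(samples, epochs=200, lr=0.05, patience=10):
    """Gradient descent on the training part only; the verification part is
    evaluated after every epoch and used to decide when to stop."""
    train, verify = split_holdout(samples)
    w, b = 0.0, 0.0
    best = (float("inf"), w, b)  # (verification error, w, b)
    stale = 0                    # epochs since the verification error improved
    history = []                 # (training error, verification error) per epoch

    for _ in range(epochs):
        # The weight update uses training samples only.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in train) / len(train)
        grad_b = sum(2 * (w * x + b - y) for x, y in train) / len(train)
        w -= lr * grad_w
        b -= lr * grad_b

        train_err = mean_squared_error(w, b, train)
        verify_err = mean_squared_error(w, b, verify)
        history.append((train_err, verify_err))

        if verify_err < best[0]:
            best = (verify_err, w, b)
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break  # verification error stopped improving

    return best, history, len(train), len(verify)


# Synthetic data (an assumption for illustration): y = 2x + 1 plus small noise.
data_rng = random.Random(1)
samples = [(x, 2 * x + 1 + data_rng.gauss(0, 0.1))
           for x in (data_rng.random() for _ in range(100))]

best, history, n_train, n_verify = train_with_verification(samples)
best_err, best_w, best_b = best
print(n_train, n_verify, len(history))
```

With 100 samples this gives a 70/30 split. The weights kept at the end are the ones with the lowest verification error seen so far, rather than the ones with the lowest training error.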