I am attempting to train a neural network for a system that can be thought of as a macro-level postal network. My inputs are two locations (one of the 50 US states) along with 1 to 3 other variables, and I want a numeric result out.

My first inclination was to represent the states as a numeric value from 0-49 and then have a network with only 3 or so inputs. What I've found, however, is that my training never converges on a useful value. I am assuming that this is because the values for the states are wholly arbitrary - a value of 39 for MA has no relation to a value of 38 for CA, especially when 37 represents a jump back to CT.

Is there a better way for me to do this? Should I be creating a network with over 100 inputs, representing boolean values for origin and destination states?

A:

I think that your intuition about the difficulty of representing different states as consecutive integers is correct -- that representation compresses a lot of information into each input. That means that your network might have to learn a lot about how to decode that information into facts that are actually useful in solving your problem.

One state per input, with boolean inputs, could help. It would make it easier for the network to figure out which two states you're talking about. Of course, that approach doesn't necessarily make it easy for the network to learn useful facts like which states are adjacent to each other.
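
To make the one-input-per-state idea concrete, here is a minimal sketch in plain Python. The names `STATES`, `one_hot`, and `encode_example` are made up for illustration, and the ordering of the states is arbitrary; any fixed ordering should work just as well.

```python
# Two-letter codes for the 50 states. The order is arbitrary (an assumption
# for this sketch), but it must stay the same for training and prediction.
STATES = [
    "AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA",
    "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD",
    "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ",
    "NM", "NY", "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC",
    "SD", "TN", "TX", "UT", "VT", "VA", "WA", "WV", "WI", "WY",
]


def one_hot(state):
    """Return a 50-element list with 1.0 at the state's position, 0.0 elsewhere."""
    vec = [0.0] * len(STATES)
    vec[STATES.index(state)] = 1.0
    return vec


def encode_example(origin, destination, extras):
    """Concatenate origin inputs, destination inputs, and the other numeric variables."""
    return one_hot(origin) + one_hot(destination) + [float(x) for x in extras]


# 50 origin inputs + 50 destination inputs + 1 extra variable
x = encode_example("MA", "CA", [2.5])
print(len(x))  # 101
```

With this layout the network no longer has to decode an arbitrary integer: each state simply switches on its own input, at the cost of a much wider input layer.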

It might be useful to try to determine if there are any kinds of information out there that are both easy for you to provide and that also might make learning easier. For example, if the physical layout of the states is important to solving your problem (e.g. CT is adjacent to NY, which is adjacent to PA) then perhaps you could break the country into regions (e.g. northwest, southeast, midwest) and provide boolean inputs for each region.

Feeding a few input schemes like that into a single network could allow you to specify a single state using a (potentially) more useful representation: instead of saying "it's state #39", you could say (for example) "it's the northernmost state that touches more than five neighboring states in the eastern region".

If the network finds it useful to determine if two states are near each other, this kind of representation might make learning go a bit faster -- the network could get a rough idea if two states are close by simply comparing the two "region" inputs for the states. Checking whether two region inputs are equal is a lot easier than memorizing the fact that state #39 is near states #38, #21, #7, and #42.
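
As a rough sketch of the region idea, the snippet below adds boolean region inputs for each state. The region names, the `REGION_OF` table (which covers only a handful of states), and the function names are all assumptions made up for illustration; you would pick whatever grouping makes sense for your problem.

```python
# Hypothetical region list and assignments, for illustration only.
# A real table would cover all 50 states.
REGIONS = ["northeast", "midwest", "south", "west"]
REGION_OF = {
    "CT": "northeast", "NY": "northeast", "PA": "northeast", "MA": "northeast",
    "IL": "midwest", "OH": "midwest",
    "TX": "south", "FL": "south",
    "CA": "west", "OR": "west", "WA": "west",
}


def region_inputs(state):
    """Return one boolean-style input per region, with 1.0 for the state's region."""
    vec = [0.0] * len(REGIONS)
    vec[REGIONS.index(REGION_OF[state])] = 1.0
    return vec


def pair_region_inputs(origin, destination):
    """Region inputs for the origin followed by region inputs for the destination."""
    return region_inputs(origin) + region_inputs(destination)


def same_region(origin, destination):
    """The comparison the network could learn: are the two region inputs equal?"""
    return region_inputs(origin) == region_inputs(destination)


print(same_region("CT", "NY"))  # True
print(same_region("CT", "CA"))  # False
```

These extra inputs can be appended to the per-state boolean inputs rather than replacing them, so the network can still tell individual states apart while getting a cheap hint about proximity.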

Nate Kohl