views:

144

answers:

4

All the examples I have seen of neural networks are for a fixed set of inputs, which works well for images and fixed-length data. How do you deal with variable-length data such as sentences, queries, or source code? Is there a way to encode variable-length data into fixed-length inputs and still get the generalization properties of neural networks?

Thanks

A: 

I'm not entirely sure, but I'd say: use the maximum number of inputs. For example, for words, let's say no word will be longer than 45 characters (the longest word found in a dictionary, according to Wikipedia), and if a shorter word is encountered, set the remaining inputs to a whitespace character.

Or with binary data, set them to 0. The only problem with this approach is if an input padded with whitespace characters/zeros/whatever collides with a valid full-length input (not so much a problem with words as it is with numbers).
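A minimal sketch of the padding idea described above; the function names and the bit-vector length are illustrative assumptions, not from any library:

```python
MAX_LEN = 45  # longest dictionary word, per the answer above

def encode_word(word, max_len=MAX_LEN, pad=" "):
    """Pad a word with whitespace up to max_len; truncate if longer."""
    return word[:max_len].ljust(max_len, pad)

def encode_bits(bits, max_len=8, pad=0):
    """Pad a binary sequence with zeros up to a fixed length."""
    return (list(bits) + [pad] * max_len)[:max_len]
```

Every word or bit sequence then maps to the same fixed input size, at the cost of the collision risk mentioned above (a genuinely short input versus a padded one can look identical).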

Zenon
+5  A: 

You would usually extract features from the data and feed those to the network. It is not advisable to take just some raw data and feed it to the net. In practice, pre-processing and choosing the right features will decide your success and the performance of the neural net. Unfortunately, IMHO it takes experience to develop a sense for that, and it's nothing one can learn from a book.

Summing up: "Garbage in, garbage out"
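One hedged sketch of the "extract features first" advice: map variable-length text to a fixed-length feature vector. The vocabulary and the particular features chosen here are illustrative assumptions:

```python
def extract_features(text, vocab=("good", "bad", "not")):
    """Turn a sentence of any length into a fixed-size feature vector."""
    tokens = text.lower().split()
    counts = [tokens.count(w) for w in vocab]  # bag-of-words counts
    return counts + [len(tokens)]              # plus a simple length feature
```

Any sentence now yields exactly `len(vocab) + 1` numbers, regardless of how long it was, so a fixed-input network can consume it.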

f3lix
What about the case where you want the neural network to extract the features and feed them to another network for classification/interpretation? Or where you want the network to learn a grammar from a set of examples? In both of these cases the network would need to process variable-length data sets.
Jeremy E
There are times when you want a bidirectional associative memory and the sizes of the items to associate are different (name of a person, picture of a person).
Jeremy E
+1  A: 

Some problems could be solved by a recurrent neural network. For example, it is good for calculating parity over a sequence of inputs.

The recurrent neural network for calculating parity would have just one input feature. The bits could be fed into it over time. Its output is also fed back to the hidden layer. That allows it to learn the parity with just two hidden units.

A normal feed-forward two-layer neural network would require 2**sequence_length hidden units to represent the parity. This limitation holds for any architecture with just two layers (e.g., an SVM).
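To make the two-hidden-unit claim concrete, here is a hand-wired recurrent net with one input feature and two hidden threshold units. The weights are set by hand for illustration rather than learned; a trained RNN would find an equivalent solution:

```python
def step(z):
    """Threshold activation: fires when its input is non-negative."""
    return 1 if z >= 0 else 0

def parity_rnn(bits):
    s = 0  # recurrent state (parity so far), fed back each time step
    for x in bits:
        h1 = step(x + s - 0.5)   # fires when x OR s
        h2 = step(x + s - 1.5)   # fires when x AND s
        s = step(h1 - h2 - 0.5)  # h1 AND NOT h2, i.e. x XOR s
    return s
```

Each time step updates the state to the XOR of the state and the new bit, so the final state is the parity of the whole sequence, however long it is.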

Ivo Danihelka
Is this similar to a hidden Markov model, only using neural networks?
Jeremy E
It is more similar to a neural network with some output fed back into the next input. Unimportant history will be forgotten over time.
Ivo Danihelka
A: 

I guess one way to do it is to add a temporal component to the input (a recurrent neural net) and stream the input to the net a chunk at a time (basically creating the neural network equivalent of a lexer and parser). This would allow the input to be quite large, but it would have the disadvantage that there would not necessarily be a stop symbol to separate different sequences of input from each other (the equivalent of a period in sentences).
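A rough sketch of the chunked-streaming idea, with an explicit end-of-sequence marker standing in for the missing stop symbol; the chunk size and the `<eos>`/`<pad>` tokens are assumptions for illustration:

```python
END = "<eos>"

def chunks(tokens, size=4, pad="<pad>"):
    """Yield fixed-size chunks of a token stream, terminated by END."""
    stream = list(tokens) + [END]
    for i in range(0, len(stream), size):
        chunk = stream[i:i + size]
        yield chunk + [pad] * (size - len(chunk))
```

Each chunk is a fixed-size input the net can consume per time step, and the `<eos>` token marks where one sequence ends and the next begins.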

Jeremy E