You answered your own question when you said you need to have your learning rate change as the network learns. There are a lot of different ways you can do it.
The simplest way is to reduce the learning rate linearly with the number of iterations. Every 25 iterations (or some other arbitrary interval), subtract a fixed amount from the rate until it reaches a good minimum.
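A minimal sketch of that step-wise linear decay (the interval, decay amount, and floor are just example values, not from any particular library):

```python
def linear_decay(initial_rate, decay_amount, interval, min_rate, iteration):
    """Subtract decay_amount from the rate once per `interval` iterations,
    never dropping below min_rate."""
    steps_taken = iteration // interval
    return max(min_rate, initial_rate - steps_taken * decay_amount)

# Example: start at 0.1, drop by 0.01 every 25 iterations, floor at 0.001.
rate = linear_decay(0.1, 0.01, interval=25, min_rate=0.001, iteration=100)
```

At iteration 100 that's four decay steps, so the rate has dropped from 0.1 to 0.06.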
You can also do it nonlinearly with the number of iterations. For example, multiply the learning rate by 0.99 every iteration, again until it reaches a good minimum.
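The multiplicative version is just exponential decay; a sketch with the same illustrative floor:

```python
def exponential_decay(initial_rate, factor, min_rate, iteration):
    """Multiply the rate by `factor` once per iteration, with a floor."""
    return max(min_rate, initial_rate * factor ** iteration)

# Example: start at 0.1 and shrink by 1% each iteration.
rate = exponential_decay(0.1, factor=0.99, min_rate=0.001, iteration=100)
```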
Or you can get more crafty. Use the results of the network to determine its next learning rate. The better it's doing by its fitness metric, the smaller you make its learning rate. That way it converges quickly early on, when it's far from a minimum, then slowly as it closes in. This is probably the best way, but it's more costly than the simple number-of-iterations approaches.
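One hypothetical version of this, assuming a loss (fitness) value normalized to roughly the range 0 to 1; the function name and constants are illustrative:

```python
def adaptive_rate(loss, base_rate=0.1, min_rate=1e-4):
    """Scale the learning rate by the current loss so the rate shrinks
    as the network improves (assumes loss roughly in [0, 1])."""
    return max(min_rate, min(base_rate, base_rate * loss))

# High loss early in training keeps the rate near base_rate;
# as the loss falls, the rate falls with it.
early = adaptive_rate(1.0)   # near the start of training
late = adaptive_rate(0.05)   # once the network is doing well
```

Libraries often ship a related scheme that cuts the rate when the loss stops improving (e.g. reduce-on-plateau), which avoids needing a normalized loss.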