There are a number of reasons for the many dialects of Lisp, some historical, some technical, and some mostly psychological.
Historical: By classical standards, Lisp was fairly slow and used lots of memory. Quite a few people have devised various techniques (or corruptions, if you don't like them) to try to make it more practical. This was especially true when Lisp machines were being built -- the hardware was devised specifically to run Lisp, and at the same time, the Lisp they ran was devised (revised?) specifically to run on that hardware and to take full advantage of its capabilities.
Technical: Some decisions that have been made at times in Lisp were questionable (to put it nicely). For example, nearly all modern Lisps use lexical scoping by default, but quite a few early ones used dynamic scoping. Some Scheme users don't think much of the non-hygienic macros in most other Lisp dialects.
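To make the scoping difference concrete, here is a rough sketch in Python rather than any particular Lisp. Python is itself lexically scoped, so the dynamic case is simulated with an explicit stack of bindings; every name in it (make_lexical, dynamic_stack, lookup, and so on) is made up for illustration and doesn't come from any real Lisp implementation.

```python
# Toy illustration of lexical vs. dynamic scoping.
# All names here are hypothetical, chosen only for this example.

# Lexical scoping: a function sees the bindings from where it was DEFINED.
def make_lexical():
    x = "defined-here"

    def show():
        return x  # resolved in the enclosing definition scope

    return show


def call_lexical():
    x = "caller's x"  # unrelated local; show() never sees it
    return make_lexical()()


# Dynamic scoping, simulated: a function sees the most recent binding
# that is live at CALL time, found by walking a stack of frames.
dynamic_stack = [{"x": "global"}]


def lookup(name):
    # Search the innermost (most recently pushed) frame first.
    for frame in reversed(dynamic_stack):
        if name in frame:
            return frame[name]
    raise NameError(name)


def show_dynamic():
    return lookup("x")


def call_dynamic():
    # Rebind x for the duration of this call only.
    dynamic_stack.append({"x": "caller's x"})
    try:
        return show_dynamic()
    finally:
        dynamic_stack.pop()
```

Here call_lexical() returns the value from the defining scope, while call_dynamic() returns the caller's rebinding, and the outer binding is visible again once that call returns. The same function body giving different answers depending on who called it is roughly why dynamic scoping as the default fell out of favor.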
Psychological: Lisp is so simple that many people have felt qualified to write their own implementations. Many Lisp programmers are also fond of experimentation and of pursuing perfection, so many of those implementations included the implementor's idea of improvements of various kinds. Nobody was coordinating these efforts, so many of the extensions and changes were incompatible with each other, and each became a (more or less) distinct dialect. Some of this was probably avoidable, but some of it wasn't -- just for example, two people might see a particular feature as flawed. One would work at improving it into something he found more acceptable, while the other removed it completely, and either considered that an improvement in itself or devised something completely different to replace it.
Poor communication also often played a role. Somebody at (say) MIT might go somewhere on sabbatical, and take along a tape of some Lisp implementation, which would start to be used wherever they went. That would often (quite unintentionally) fork the implementation, as the two schools did work independently in parallel.