Prefix/suffix tree are generally the standard, best and most cautious solution for this sort of thing in my opinion. You can't go wrong with them.
But here is a different idea, which resorts to Bloom filters. You probably know what these are, but just in case (and for other people reading this answer): Bloom filters are very small, very compact bit-vectors which approximate set inclusion. If you have a set S and a Bloom filter for that set B(S), then
x ∈ S ⇒ x ∈ B(S)
but the reciprocate is false. This is what is probabilistic about the structure: there is a (quantifiable) probability that the Bloom filter will return a false positive. But approximating inclusion with the Bloom filter is wildly faster than testing it exactly on the set.
(A simple case use: in a lot of applications, the Bloom filter is used, well, as a filter. Checking cache is expensive, because you have to do a hard drive access, so programs like Squid will first check a small Bloom filter in memory, and if the Bloom filter returns a positive result, then Squid will go check the cache. If it was false positive, then that's OK, because Squid will find that out when actually visiting the cache---but the advantage is that the Bloom filter will have spared Squid having to check the cache for a lot of requests where it would have been useless.)
Bloom filters were used with some success in string search. Here is a sketch (I may remember some of the details wrong) of this application. A text file is a sequence of N lines. You are looking for a word composed of M letters (and no word can be spread accross two lines). A preprocessing phase will build ONE Bloom filter for each line, by adding every subsequence of the line to the Bloom filter; for instance, for this line
Touching this dreaded sight, twice seene of vs,
And the corresponding Bloom filter will be created with "T", "To", "Tou" ... "o", "ou", ... "vs,", "s", "s,", ",". (I may have this part wrong. Or you might want to optimize.)
Then when searching for the subword of size M, simply do one very fast check on each of the Bloom filters, and when there is a hit, examine the line closely with the KMP algorithm, for instance. In practice, if you tune your Bloom filters well, the trade-off is remarkable. Searching is incredibly fast because you eliminate all useless lines.
I believe from this concept you could derive a useful scheme for your situation. Right now, I see two evident adaptation:
either cut your data set in many blocks of size K (each with its Bloom filter, like the lines in the previous example);
or use a sort-of dichotomy where you split the set into two subset, each with a Bloom filter, then each subset split into two sub-subsets with their own Bloom filter, etc. (if you are going to add all substrings as suggested with the method I described, this second idea would be a bit prohibitive---except you don't have to add all substrings, only substrings ranging size 1 to 10).
Both ideas can be combined in inventive ways to create multi-layered schemes.