I read your question like this:
"Given input of n numbers from domain D, what is the fastest way to write down sorted input of those n numbers, provided that you can store only k numbers (k < n) in memory? Provide algorithm for n = 10000, k = 1000."
Note: in your question you say that domain D is the range from 1 to 10000. I believe that is an oversimplification. With n = 10000 and the input being a range (no repetition), the problem becomes trivial, because you know exactly where each number should be written in the sorted file. In fact, you know exactly what the contents of that file are, so you don't have to write it at all, and you don't even have to read the input. :D
Now, if N(D) is not equal to n, or if you allow repetition, the problem becomes a bit more interesting.
If the memory is limited, I think the intuitive approach is to do this:
1st approach
While reading the input you will be able to sort at most k1 elements at a time before writing them down, where k1 is the number of elements that can be sorted within the k elements of memory available.
You will end up with f = ceil(n / k1) files which are internally sorted.
Then you will need to read from those f files and merge the partially sorted data, writing it into a final file.
Different sorts have different memory requirements and will produce a different number of partially sorted files that will have to be merged.
Merging more files requires more memory, because you have to keep at least one number from each file in memory to know which file the next number comes from.
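For illustration, here is a minimal sketch of this approach in Python (essentially an external merge sort). It assumes the input is a text file with one integer per line and that k = 1000 numbers fit in memory at once; the file names (`input.txt`, `run_0.txt`, ...) and function names are made up for the example.

```python
import heapq

K = 1000  # how many numbers we allow ourselves to keep in memory at once

def sort_into_runs(input_path):
    """Read the input in chunks of at most K numbers, sort each chunk, write it as a run file."""
    run_paths = []
    with open(input_path) as f:
        while True:
            chunk = [int(line) for _, line in zip(range(K), f)]  # next K lines (or fewer at EOF)
            if not chunk:
                break
            chunk.sort()
            path = f"run_{len(run_paths)}.txt"
            with open(path, "w") as run:
                run.write("\n".join(map(str, chunk)) + "\n")
            run_paths.append(path)
    return run_paths

def merge_runs(run_paths, output_path):
    """f-way merge of the sorted runs into the final file."""
    files = [open(p) for p in run_paths]
    try:
        streams = [(int(line) for line in f) for f in files]
        with open(output_path, "w") as out:
            for value in heapq.merge(*streams):  # keeps only one value per run in memory
                out.write(f"{value}\n")
    finally:
        for f in files:
            f.close()

# merge_runs(sort_into_runs("input.txt"), "sorted.txt")
```

Note that the merge step needs roughly one number per run in memory (plus whatever the file buffers use), which is exactly the cost mentioned above.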
2nd approach
Another approach is, as you suggest, to know in which file you can find the next number. This is like putting the numbers into buckets based on their value (distributing the sort by classifying), but the problem is that unless you know how your data is distributed, it will not be easy to determine the range of each bucket.
The size of each bucket should again be at most k1 to get the least number of files.
Assuming that you know something about your data's distribution, this can be done; otherwise you will need another pass over your data to establish the cut points.
For general data, where the bucket ranges are not known and you cannot make a first pass over all of the data (for example, if you have to keep some sort of sorted structure as the input comes in and you don't know what will come next), you would basically have to maintain an index such as a B+ tree, but this is not optimal. Indexes are optimized for fast retrieval and (some of them) for insertion of small numbers of new elements.
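Here is a minimal sketch of the bucket approach in Python, assuming the cut points are already known (below they are simply 1000, 2000, ..., 9000, which is only reasonable if the data is roughly uniform over 1..10000) and that each bucket ends up small enough to sort in memory; file and function names are again illustrative.

```python
BOUNDARIES = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000]  # assumed upper cut points

def bucket_index(value):
    """Index of the bucket whose range contains the value."""
    for i, cut in enumerate(BOUNDARIES):
        if value <= cut:
            return i
    return len(BOUNDARIES)

def partition_into_buckets(input_path):
    """Single pass over the input, appending each number to its bucket's file."""
    buckets = [open(f"bucket_{i}.txt", "w") for i in range(len(BOUNDARIES) + 1)]
    try:
        with open(input_path) as f:
            for line in f:
                value = int(line)
                buckets[bucket_index(value)].write(f"{value}\n")
    finally:
        for b in buckets:
            b.close()

def sort_buckets(output_path):
    """Sort each bucket in memory and concatenate them in order into the final file."""
    with open(output_path, "w") as out:
        for i in range(len(BOUNDARIES) + 1):
            with open(f"bucket_{i}.txt") as b:
                values = sorted(int(line) for line in b)
            out.writelines(f"{v}\n" for v in values)

# partition_into_buckets("input.txt")
# sort_buckets("sorted.txt")
```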
3rd approach
Having such a small domain allows you to simply count the numbers and write their frequencies down. If you have random access to the output file, the file system's buffering can take care of the efficiency (buffering does efficient disk writes limited by memory usage; the only problem is if the buffer holds fewer than k numbers, and whether the chosen bitmap-like structure is the most efficient one).
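A minimal sketch of the counting approach in Python, assuming one integer per line and that the array of 10001 counters fits in memory (strictly this stretches the k = 1000 numbers limit a bit, which is the kind of caveat mentioned above); file names are illustrative.

```python
DOMAIN_MAX = 10000  # the domain is 1..10000

def counting_sort(input_path, output_path):
    counts = [0] * (DOMAIN_MAX + 1)  # counts[v] = how many times v occurs in the input
    with open(input_path) as f:
        for line in f:
            counts[int(line)] += 1
    with open(output_path, "w") as out:
        for value, count in enumerate(counts):
            out.write(f"{value}\n" * count)  # write each value as many times as it occurred

# counting_sort("input.txt", "sorted.txt")
```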
Intuitively, I would say that the best bet is to first calculate the distribution and work out the size and limits of each bucket, then divide the file into buckets, and finally sort each bucket. I guess some performance could be squeezed out by at least partially sorting the data while writing it into the buckets.
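A minimal sketch of the first step of that plan in Python: one pass over the data to compute the distribution and derive cut points such that each bucket holds at most k numbers. It exploits the small 1..10000 domain for the histogram; names and file layout are illustrative assumptions.

```python
K = 1000            # at most K numbers per bucket, so a bucket can be sorted in memory
DOMAIN_MAX = 10000  # the domain is 1..10000

def compute_cut_points(input_path):
    """One pass to build a histogram over the domain, then derive bucket upper limits."""
    histogram = [0] * (DOMAIN_MAX + 1)
    with open(input_path) as f:
        for line in f:
            histogram[int(line)] += 1
    cut_points, running = [], 0
    for value, count in enumerate(histogram):
        if running and running + count > K:
            cut_points.append(value - 1)  # close the current bucket just before this value
            running = 0
        running += count
    # (a value occurring more than K times cannot be split; its bucket will simply be larger)
    return cut_points  # upper limits of all buckets except the last

# These cut points can then replace the guessed BOUNDARIES in the 2nd approach's sketch.
```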