tags:

views:

127

answers:

4

I have a text file that has a very long list of items. So I want to sort them alphabetically but I do not want to load all the file into the memory (RAM).

I tried loading all the contents of the file to an array and sort them just like I do normally. But the system complains that there are no much memory!!

Thanks, Mohammad

+7  A: 

You'll need to read up on external sorting. The basic approach is to use some sort of divide-and-conquer routine like merge sort, where you read and sort a portion of the file, then read and sort another portion of the file, etc. and when you get to the end you merge the sorted portions together.

Jason S
You can view here a nice course about external sorting. http://video.google.com/videoplay?docid=-978892635109400080#
Daniel Băluţă
On unix, use the command line: `sort`.
Matthieu M.
+3  A: 

Maybe the STXXL (Standard Template Library for Extra Large Data Sets) helps.

STXXL offers external sorting amongst others.

stephan
interesting....
Jason S
A: 

You don't have to hold the whole file in memory. If this is a task you don't have to do very often, you can write an application that sorts it very slow. Something like this (pseudo):

vector<int> linesProcessed;
for (int i = 0; i < lineCount; i++)
{
   if (linesProcessed contains i) continue;
   string alphabeticalFirstLine;
   int lineIndex;
   foreach line in oldFile
   {
       if (line is before alphabeticalFirstLine)
       {
            alphabeticalFirstLine = line;
            lineIndex = i;
       }
   }
   write alphabeticalFirstLine to newFile;
   vector.add(lineIndex);
}
clear vector;
delete oldFile;
rename newFile to oldFile;
Martijn Courteaux
That's called selection sort (almost), not the best idea.
unbeli
@unbeli: I know it is almost selection sort. Selection Sort searches also for the biggest value. But I wrote, "if this is a task you don't have to do very often, ..."
Martijn Courteaux
Even if one doesn't do it very often, why should she implement weak algorithm when there is a simple stronger one?
unbeli
@unbeli: I didn't know about the External Sorting. So I posted a suggestion. That is the reason why I'm a user on Stack Overflow: help people and learn programming, get insight. I just wanted to help him...
Martijn Courteaux
A: 

If you are using some unix-like OS you can use sort command. It will take care about memory consumption. For an example something like "cat large_file | sort" will do the job.

Or you can write your own / use external sorting from the library. Tell us what language are you using and maybe someone will tell you exact library to use.

Klark