What is the best way to remove duplicate lines from large .txt files (1 GB and larger)?
Since removing adjacent (one-after-another) duplicates is simple, the problem reduces to just sorting the file.
Assume that we can't load the whole file into RAM because of its size.
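To illustrate the sort-based idea, here is a minimal sketch (Python used only as illustration; the chunk size and file names are placeholders): sort chunks that fit in RAM, spill each sorted chunk to a temporary file, then k-way merge the runs while writing each line only once.

```python
import heapq
import itertools
import os
import tempfile

CHUNK_LINES = 1_000_000  # placeholder: tune to how many lines fit in RAM


def external_sort_unique(src_path, dst_path):
    runs = []
    # Phase 1: read chunks that fit in RAM, sort each, spill to a temp file.
    with open(src_path, "r", encoding="utf-8", errors="replace") as src:
        while True:
            chunk = [l if l.endswith("\n") else l + "\n"
                     for l in itertools.islice(src, CHUNK_LINES)]
            if not chunk:
                break
            chunk.sort()
            run = tempfile.NamedTemporaryFile("w+", encoding="utf-8", delete=False)
            run.writelines(chunk)
            run.flush()
            run.seek(0)
            runs.append(run)
    # Phase 2: k-way merge the sorted runs, skipping adjacent duplicates.
    with open(dst_path, "w", encoding="utf-8") as dst:
        prev = None
        for line in heapq.merge(*runs):
            if line != prev:
                dst.write(line)
                prev = line
    for run in runs:
        run.close()
        os.unlink(run.name)


external_sort_unique("input.txt", "output.txt")  # placeholder file names
```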
Right now I'm waiting to retrieve all records from a SQL table with one unique index field (I loaded the file's lines into the table earlier), and I'm wondering whether there is a way to speed this up.
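The database idea, sketched here with SQLite just for concreteness (my real setup may differ; the table and column names are placeholders): the unique index rejects duplicates at insert time, and the slow part is reading everything back out.

```python
import sqlite3


def dedupe_via_sqlite(src_path, dst_path, db_path="dedupe.db"):
    con = sqlite3.connect(db_path)
    # The PRIMARY KEY acts as the unique index; INSERT OR IGNORE drops duplicates.
    con.execute("CREATE TABLE IF NOT EXISTS lines (line TEXT PRIMARY KEY)")
    with open(src_path, "r", encoding="utf-8", errors="replace") as src:
        con.executemany(
            "INSERT OR IGNORE INTO lines (line) VALUES (?)",
            ((line.rstrip("\n"),) for line in src),
        )
    con.commit()
    # Reading the result back is the step I'm currently waiting on.
    with open(dst_path, "w", encoding="utf-8") as dst:
        for (line,) in con.execute("SELECT line FROM lines"):
            dst.write(line + "\n")
    con.close()


dedupe_via_sqlite("input.txt", "output.txt")  # placeholder file names
```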