I don't see why the hash table approach is inefficient, at least in algorithm analysis terms - admittedly it can be quite bad in memory locality terms. Anyway, scan the array twice...
First scan - insert all the array elements into the hash table - expected amortized O(1) per insert, so O(n) total.
Second scan - for each element, check for (sum - current) in the hash table - expected O(1) per lookup, so O(n) total.
This beats the O(n log n) sort-and-search methods, at least in theory.
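To make that concrete, here's a minimal two-scan sketch in Python (the function name is illustrative; a Counter is used so duplicates and the 2 * x == sum case are handled):

    from collections import Counter

    def find_pairs_two_scan(array, total):
        # First scan: tally every element (the Counter copes with duplicates).
        counts = Counter(array)
        # Second scan: look up each element's complement.
        for x in array:
            diff = total - x
            # An element may only pair with itself if it occurs at least twice.
            if diff in counts and (diff != x or counts[x] >= 2):
                print(x, diff)

Note that this reports each pair twice, once from each end - more on that below.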
Then, note that you can combine the two scans into one. You can spot a pair as soon as you encounter the second of that pair during the scan. In Python...
    def find_pairs(array, total):
        seen = set()
        for x in array:
            diff = total - x
            # Check before inserting so an element cannot pair with itself
            # when 2 * x == total.
            if diff in seen:
                print(diff, x)
            seen.add(x)
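For example, assuming the sketch above, find_pairs([1, 4, 3, 5], 8) prints 3 5 - each pair is reported exactly once, at the point where its second member is reached.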
If you need positions of the items, use a hashmap and store item positions in it. If you need to cope with duplicates, you might need to store counts in a hashmap. For positions and duplicates, you might need a hashmap of start pointers for linked lists of positions.
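As a rough sketch of the positions-plus-duplicates variant in Python, using a dict from value to a list of indices in place of the linked lists (names are illustrative):

    from collections import defaultdict

    def find_pairs_with_positions(array, total):
        # Map each value to the indices where it has occurred so far.
        positions = defaultdict(list)
        for i, x in enumerate(array):
            diff = total - x
            # Pair the current element with every earlier complement;
            # checking before inserting avoids self-pairing.
            for j in positions.get(diff, ()):
                print((diff, j), (x, i))
            positions[x].append(i)

Duplicate counts come for free here as the lengths of the index lists.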
This makes assumptions about the hash table implementation - essentially expected O(1) insert and lookup - but fairly safe ones given the usual implementations in most current languages and libraries.
BTW - combining the scans shouldn't be seen as an optimisation. The iteration overhead should be insignificant. Memory locality issues could make a single pass slightly more efficient for very large arrays, but the real memory locality issues will be in the hash table lookups anyway.
IMO the only real reason to combine the scans is that you want each pair reported once - handling that in a two-scan approach would be a bit more hassle.