I have a table with contents that look similar to this:
id | title
------------
1  | 5. foo
2  | 5.foo
3  | 5. foo*
4  | bar
5  | bar*
6  | baz
7  | BAZ
…and so on. I would like to group by the titles and ignore the extra bits. I know Postgres can do this:
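SELECT * FROM (
  -- strip any trailing asterisks and periods from the title
  SELECT regexp_replace(title, '[*.]+$', '') AS title
  FROM "table"  -- quoted because TABLE is a reserved word in Postgres
) AS a
GROUP BY title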
However, that pattern is quite simple, and it would get very unwieldy if I tried to anticipate every possible variation. So the question is: is there a more general way to do fuzzy grouping than using regular expressions? Is it even possible, at least without breaking one's back doing it?
Edit: To clarify, there is no preference for any of the variations, and this is what the table should look like after grouping:
title
------
5. foo
bar
baz
I.e., the variations are titles that differ by just a few characters or in capitalization, and it doesn't matter which variant is kept as long as they end up in the same group.
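To make the goal concrete, here is a minimal sketch of a slightly broader normalization than my regexp above: it groups on a key built by removing every non-alphanumeric character and lowercasing the rest. The CTE name `titles` and its inline sample rows are assumptions made only so the query is self-contained; they are not my real table. It covers the punctuation, spacing, and capitalization differences shown here, but it would not catch actual misspellings, which is why I'm asking for something more general.

SELECT min(title) AS title  -- pick one arbitrary representative per group
FROM (
  -- hypothetical sample data mirroring the table above
  VALUES (1, '5. foo'), (2, '5.foo'), (3, '5. foo*'),
         (4, 'bar'), (5, 'bar*'), (6, 'baz'), (7, 'BAZ')
) AS titles(id, title)
-- normalized key: drop everything that is not a letter or digit, then lowercase
GROUP BY lower(regexp_replace(title, '[^[:alnum:]]', '', 'g'));

With the sample rows this returns three rows, one per group. Which variant `min` picks within a group depends on the database collation, which is fine given that I have no preference.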