Well, there's no 100% guaranteed-correct way, no. But you can probably make some progress by transforming all "messy" columns into a more canonical form: capitalising everything, trimming leading and trailing whitespace, and collapsing runs of internal spaces to a single space. Also normalise variants, e.g. change names of the form "SMITH, JOHN" to "JOHN SMITH" (or vice versa; just pick one form and stick with it). And of course work on copies of the records; don't change the originals. You can experiment with discarding further information (e.g. "JOHN SMITH" -> "J SMITH"), though you'll find this shifts the balance between false positives and false negatives.
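A minimal sketch of that canonicalisation step might look like this (the "last, first" reordering assumes a single comma separates surname from given names):

```python
import re

def canonicalise(name: str) -> str:
    """Canonicalise a name: uppercase, trim, collapse runs of
    whitespace to one space, and rewrite "SMITH, JOHN" as "JOHN SMITH"."""
    s = re.sub(r"\s+", " ", name.strip().upper())
    if "," in s:
        # Reorder "LAST, FIRST" into "FIRST LAST" so both variants
        # canonicalise to the same string.
        last, _, first = s.partition(",")
        s = f"{first.strip()} {last.strip()}"
    return s

print(canonicalise("  smith,   john "))   # JOHN SMITH
print(canonicalise("John   Smith"))       # JOHN SMITH
```

Run this over copies of every messy column before comparing anything; exact matches alone will already catch a surprising number of duplicates once the trivial formatting noise is gone.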
I would probably take the approach of assigning a similarity score to each pair of records. E.g. if the canonicalised names, addresses and email addresses agree exactly, assign a score of 1000; otherwise, subtract (some multiple of) the Levenshtein distance from 1000 and use that. You'll need to come up with your own scoring scheme by playing around and deciding the relative importance of different types of differences (e.g. a single-digit difference in a phone number is probably more significant than a 1-character difference between two people's names). You can then experimentally establish a score above which you can confidently mark a pair of records as "duplicate", and a lower score above which manual checking is required; below that lower score, you can confidently say the two records are not duplicates.
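The scoring idea above could be sketched like this. The field names and per-field weights are purely illustrative; you'd tune them experimentally, as described, and a real implementation would likely use an optimised Levenshtein library rather than this plain DP version:

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (standard dynamic-programming version)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def similarity_score(rec_a: dict, rec_b: dict, weights=None) -> int:
    """Score a pair of canonicalised records: 1000 for an exact match,
    minus a weighted edit distance per field. Weights are hypothetical;
    a heavier weight makes differences in that field matter more."""
    weights = weights or {"name": 20, "address": 10, "phone": 50}
    score = 1000
    for field, weight in weights.items():
        score -= weight * levenshtein(rec_a.get(field, ""), rec_b.get(field, ""))
    return score

a = {"name": "JOHN SMITH", "address": "1 MAIN ST", "phone": "5551234"}
b = {"name": "JON SMITH",  "address": "1 MAIN ST", "phone": "5551234"}
print(similarity_score(a, a))  # 1000 (exact match)
print(similarity_score(a, b))  # lower: one-character name difference
```

You'd then bucket pairs by score: above some high threshold, auto-merge; between the thresholds, queue for manual review; below the low threshold, treat as distinct.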
The realistic goal here is to reduce the amount of manual duplicate-removal work you'll need to do. You are unlikely to be able to eliminate it entirely, unless all the duplicates were generated through some automatic copying process.