Ideally all of my data would look like this:
William Faulkner - 'Light In August'
William Faulkner - 'Sanctuary'
William Faulkner - 'The Sound and the Fury'
In that case, this regex would seem to work fine:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// group(1) = author, group(2) = title between single quotes
Pattern pattern = Pattern.compile("^\\s*(.*)\\s+-\\s+'(.*)'\\s*$");
Matcher matcher = pattern.matcher("William Faulkner - 'Light In August'");
if (matcher.matches()) {
    String author = matcher.group(1).trim();
    String bookTitle = matcher.group(2).trim();
    System.out.println(author + " / " + bookTitle);
} else {
    System.out.println("No match!");
}
But occasionally my data contains entries like these, most of which are not matched by the pattern above:
Saki - 'Esme'
Saki - 'The Unrest Cure' (Second Edition)
Saki (File Under: Hector Hugh Munro) - 'The Interlopers' (Anniversary Multi-pack)
William Faulkner - 'The Sound and the Fury' (Collector's Re-issue)
'The Sound and the Fury'
The Sound and the Fury
The Bible (St James Version)
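A quick way to see which of these entries the original pattern rejects is to run it over the whole list. This is only a sketch: the class name `OriginalPatternCheck` and the helper `matchesOriginal` are made up for illustration. Tracing the pattern by hand, the first entry (`Saki - 'Esme'`) should still match, while the others fail either because of the trailing parenthetical or because there is no hyphen separator.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate the problem.
final class OriginalPatternCheck {
    // The unchanged pattern from the snippet above.
    private static final Pattern ORIGINAL =
            Pattern.compile("^\\s*(.*)\\s+-\\s+'(.*)'\\s*$");

    // True if the whole line is matched by the original pattern.
    static boolean matchesOriginal(String line) {
        return ORIGINAL.matcher(line).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "Saki - 'Esme'",
            "Saki - 'The Unrest Cure' (Second Edition)",
            "Saki (File Under: Hector Hugh Munro) - 'The Interlopers' (Anniversary Multi-pack)",
            "William Faulkner - 'The Sound and the Fury' (Collector's Re-issue)",
            "'The Sound and the Fury'",
            "The Sound and the Fury",
            "The Bible (St James Version)"
        };
        for (String s : samples) {
            System.out.println((matchesOriginal(s) ? "match    : " : "no match : ") + s);
        }
    }
}
```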
In every case where there is no separating hyphen, the line is a book title without an author. I have not found any case of an author's name appearing without a book title.
How could I change my regex to handle this correctly?
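One possible direction, offered as a sketch rather than a definitive answer: make the author part optional, accept the title with or without single quotes, and allow an optional trailing parenthetical. The sketch rests on several assumptions that may not hold for the real data: the author/title separator is always a hyphen with whitespace on both sides, an unquoted title never contains `(`, and a trailing parenthetical is an edition note rather than part of the title. The class name `BookLineParser`, the method `parse`, and the group names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; class, method, and group names are illustrative only.
final class BookLineParser {
    private static final Pattern LINE = Pattern.compile(
            "^\\s*"
            + "(?:(?<author>.*?)\\s+-\\s+)?"   // optional author, ended by the first " - "
            + "(?:'(?<quoted>.*?)'"            // title in single quotes
            + "|(?<bare>[^'(][^(]*?))"         // or an unquoted title, up to any "("
            + "\\s*(?:\\((?<note>.*)\\))?"     // optional trailing parenthetical (assumed edition note)
            + "\\s*$");

    // Returns {author, title, note}; author and note are null when absent.
    // Returns null if the line does not fit the assumed layout.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String title = m.group("quoted") != null ? m.group("quoted") : m.group("bare");
        return new String[] { m.group("author"), title.trim(), m.group("note") };
    }

    public static void main(String[] args) {
        String[] samples = {
            "Saki - 'Esme'",
            "Saki (File Under: Hector Hugh Munro) - 'The Interlopers' (Anniversary Multi-pack)",
            "William Faulkner - 'The Sound and the Fury' (Collector's Re-issue)",
            "'The Sound and the Fury'",
            "The Sound and the Fury",
            "The Bible (St James Version)"
        };
        for (String s : samples) {
            String[] p = parse(s);
            System.out.println(p == null
                    ? "No match: " + s
                    : p[0] + " / " + p[1] + " / " + p[2]);
        }
    }
}
```

The reluctant quantifiers (`.*?`) matter here: the author stops at the first spaced hyphen, so `Multi-pack` and `Re-issue` are not mistaken for separators, and the quoted title stops at the earliest closing quote that lets the rest of the line match, so the apostrophe in `Collector's` does not extend the title. A title that itself contains a spaced hyphen would still be split incorrectly under these assumptions.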