No one gave a satisfactory answer, so we started poking around the Lucene documentation and discovered that we could accomplish this with a custom Analyzer and Tokenizer.
The answer is this: create a WhitespaceAndAtSymbolTokenizer and a WhitespaceAndAtSymbolAnalyzer, then rebuild your index using this analyzer. Once you do, a search for "@gmail.com" will return all Gmail addresses, because the tokenizer treats the @ symbol as a word boundary, so "gmail.com" is indexed as a separate word.
Here's the source code; it's actually very simple:
using System.IO;
using Lucene.Net.Analysis;

class WhitespaceAndAtSymbolTokenizer : CharTokenizer
{
    public WhitespaceAndAtSymbolTokenizer(TextReader input)
        : base(input)
    {
    }

    protected override bool IsTokenChar(char c)
    {
        // Whitespace characters and the @ symbol mark the boundary between words.
        return !(char.IsWhiteSpace(c) || c == '@');
    }
}

internal class WhitespaceAndAtSymbolAnalyzer : Analyzer
{
    // Every field is tokenized with the custom tokenizer; no further filters are applied.
    public override TokenStream TokenStream(string fieldName, TextReader reader)
    {
        return new WhitespaceAndAtSymbolTokenizer(reader);
    }
}
That's it! Now you just need to rebuild your index and do all searches using this new Analyzer. For example, to write documents to your index:
IndexWriter index = new IndexWriter(indexDirectory, new WhitespaceAndAtSymbolAnalyzer());
index.AddDocument(myDocument);
Searches should use the same analyzer:
IndexSearcher searcher = new IndexSearcher(indexDirectory);
Query query = new QueryParser("TheFieldNameToSearch", new WhitespaceAndAtSymbolAnalyzer()).Parse("@gmail.com");
Hits hits = searcher.Search(query);
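If you're curious why the "@gmail.com" search matches, here is a rough standalone sketch of the splitting rule. It does not use Lucene at all: AtSymbolSplitDemo and its Split helper are made-up names for illustration, and the loop only approximates what CharTokenizer does with IsTokenChar (it collects runs of token characters and ignores details such as maximum token length).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class AtSymbolSplitDemo
{
    // Same rule as IsTokenChar in the tokenizer: whitespace and '@' end the current word.
    static bool IsTokenChar(char c)
    {
        return !(char.IsWhiteSpace(c) || c == '@');
    }

    // Hypothetical helper (not a Lucene API): collects maximal runs of token characters.
    public static List<string> Split(string text)
    {
        var tokens = new List<string>();
        var current = new StringBuilder();
        foreach (char c in text)
        {
            if (IsTokenChar(c))
            {
                current.Append(c);
            }
            else if (current.Length > 0)
            {
                // Hit a delimiter: emit the word collected so far.
                tokens.Add(current.ToString());
                current.Clear();
            }
        }
        if (current.Length > 0)
        {
            tokens.Add(current.ToString());
        }
        return tokens;
    }

    static void Main()
    {
        // prints "Contact | alice | gmail.com | today"
        Console.WriteLine(string.Join(" | ", Split("Contact alice@gmail.com today")));
        // prints "gmail.com"
        Console.WriteLine(string.Join(" | ", Split("@gmail.com")));
    }
}
```

Both the indexed address and the query text end up containing the word "gmail.com", which is why the index and the QueryParser need to use the same analyzer.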