On top of Joe Stefanelli's answer, I'd add:
- How is the table being used?
- Is it merely a dump or log of activity? Is it used for OLTP purposes (looking up a few rows at a time), or for OLAP-like activity (reading many, many rows at a time)?
- Is performance critical (rows must be retrieved in microseconds) or secondary (say, for end-of-day reports)?
Since you only get one clustered index, I'd tailor the clustered index based on these answers so that it best supports system requirements. Some ideas:
If it's a daily log that is rarely if ever queried, a clustered index on only RevisionNumber would be adequate.
If you report on all files loaded on a given day, the clustered index on RevisionNumber would be ideal.
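A minimal sketch of that index, assuming the table is called FileTable (a placeholder name):

    -- Sketch: clustered index on RevisionNumber alone (FileTable is a placeholder name).
    -- The key is not unique, so SQL Server adds a hidden uniquifier to each row.
    CREATE CLUSTERED INDEX CIX_FileTable_RevisionNumber
        ON FileTable (RevisionNumber);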
If you need to look up individual files with any kind of frequency or expediency, that index would suck since, if I've got it right, there'd be 100,000,000 rows (files) for each RevisionNumber -- but a simple non-clustered index on FileName, or FileName + RevisionNumber, would cover that (but see the next idea).
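That lookup index might be as simple as (again, a sketch with assumed names):

    -- Sketch: non-clustered index to support single-file lookups within a revision.
    CREATE NONCLUSTERED INDEX IX_FileTable_FileName_Revision
        ON FileTable (FileName, RevisionNumber);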
For fast lookups, FileName, FilePath, or FilePath + FileName could be painfully long strings to index. Adding a column (or a persisted computed column) for CHECKSUM(FileWhatever) and indexing on that could save a lot of time. Queries would have to look something like:
    SELECT FullFileName, Plus, Other, Columns
    FROM FileTable
    WHERE RevisionNumber = @TargetRevision
      AND ChecksumColumn = CHECKSUM(@TargetFullFileName)
      AND FullFileName = @TargetFullFileName
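For reference, the supporting column and index might be created like this (a sketch; ChecksumColumn is the assumed name for the added column):

    -- Sketch: persisted computed checksum column, plus a narrow index on it.
    -- CHECKSUM is deterministic, so the column can be PERSISTED and indexed.
    ALTER TABLE FileTable
        ADD ChecksumColumn AS CHECKSUM(FullFileName) PERSISTED;

    CREATE NONCLUSTERED INDEX IX_FileTable_Checksum
        ON FileTable (ChecksumColumn);

Note that CHECKSUM can collide, which is why the query above still compares FullFileName itself; the checksum just turns a long-string seek into a cheap integer seek.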
Lastly, if you're really adding and dropping about 100,000,000 rows every day, I'd seriously look into table partitioning, with the partitions based on RevisionNumber.
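A rough outline of that setup (a sketch; the boundary values are placeholders, and I'm assuming RevisionNumber is an int):

    -- Sketch: range partitioning on RevisionNumber (boundary values are placeholders).
    CREATE PARTITION FUNCTION pfRevision (int)
        AS RANGE RIGHT FOR VALUES (1000, 1001, 1002);

    CREATE PARTITION SCHEME psRevision
        AS PARTITION pfRevision ALL TO ([PRIMARY]);

    -- Rebuild the clustered index on the partition scheme to partition the table:
    CREATE CLUSTERED INDEX CIX_FileTable_RevisionNumber
        ON FileTable (RevisionNumber)
        WITH (DROP_EXISTING = ON)
        ON psRevision (RevisionNumber);

With that in place, purging an old revision becomes a fast partition SWITCH (to a staging table you then truncate) rather than a 100,000,000-row DELETE.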