I'm trying to match SHA1s in generic text with a regular expression.
Ideally I want to avoid matching words.
Full SHA1s have a distinctive pattern (they're long and always 40 hex characters), so I can match those reliably. But what about abbreviated SHA1s?
Can I rely on the presence of numbers?
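To make the idea concrete, here is a minimal sketch of a "hex run that contains at least one digit" pattern. The 7 to 40 length range, the lowercase-only character class, and the names `SHA1_RE` and `find_sha1s` are assumptions for illustration, not something fixed by git.

```python
import re

# Assumption: an abbreviated SHA1 is 7 to 40 lowercase hex characters.
# The lookahead (?=[0-9a-f]*[0-9]) requires at least one digit somewhere
# in the hex run, so all-letter words such as "defaced" are skipped.
SHA1_RE = re.compile(r"\b(?=[0-9a-f]*[0-9])[0-9a-f]{7,40}\b")


def find_sha1s(text):
    """Return every candidate SHA1 (full or abbreviated) found in text."""
    return SHA1_RE.findall(text)


if __name__ == "__main__":
    sample = "fix in 3f2a9c1, see defaced and a94a8fe5ccb19ba61c4c0873d391e987982fbbd3"
    print(find_sha1s(sample))
```

Note that a pattern like this would also match purely numeric tokens of 7 or more digits (for example a long ticket number), so it trades one kind of false positive for another.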
Looking at the SHA1s in my commit log, a digit always appears within the first 3 characters. But is 3 too short? How many characters of a SHA1 do I need to consider before I can assume a digit will have appeared?
This does not have to be 100% accurate. I just need to match an abbreviated SHA1 99% of the time.
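The 99% target can be estimated with a short calculation. This is a rough sketch that assumes each hex character of a SHA1 is uniformly distributed and independent of the others, so a character is a letter (a-f) with probability 6/16; the function names are hypothetical.

```python
import math

# Assumption: each hex character is uniform and independent,
# so it is a letter (a-f) with probability 6/16 = 0.375.
P_LETTER = 6 / 16


def p_no_digit(n):
    """Probability that the first n hex characters contain no digit."""
    return P_LETTER ** n


def min_length(target=0.99):
    """Smallest n such that a digit appears within n characters
    with probability at least `target`."""
    return math.ceil(math.log(1 - target) / math.log(P_LETTER))


if __name__ == "__main__":
    for n in (3, 4, 5, 7):
        print(n, round(1 - p_no_digit(n), 4))
    print("minimum length for 99%:", min_length(0.99))
```

Under that assumption, about 5% of SHA1s have no digit in their first 3 characters, while 5 characters are enough to reach 99%, and a 7-character abbreviation contains a digit roughly 99.9% of the time.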