You might have to go for some heuristics here. Maybe you can estimate travel time from a few factors like geometric distance and some features of the start and end points (urban vs rural area, country, ...). You could retrieve a few real travel times, fit your parameters on a subset of them, and see how well you're able to predict the others. My prediction would be, for example, that in many cases travel time becomes approximately linear in distance as the distance grows.
I know it's messy, but hey, you're trying to estimate 12.5 million data points (or however many it is :)
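A minimal sketch of that fit-and-check idea, assuming the simplest possible model (travel time = intercept + slope * great-circle distance). The function names are made up for illustration, and extra features like urban/rural would need a richer model than this:

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def fit_linear(distances, times):
    """Ordinary least squares for time = intercept + slope * distance."""
    n = len(distances)
    mean_d = sum(distances) / n
    mean_t = sum(times) / n
    var_d = sum((d - mean_d) ** 2 for d in distances)
    cov = sum((d - mean_d) * (t - mean_t) for d, t in zip(distances, times))
    slope = cov / var_d
    return mean_t - slope * mean_d, slope

def mean_abs_error(model, distances, times):
    """Average absolute prediction error on pairs the model was not fitted on."""
    intercept, slope = model
    errors = [abs(intercept + slope * d - t) for d, t in zip(distances, times)]
    return sum(errors) / len(errors)
```

Fit on the travel times you have already retrieved, then call `mean_abs_error` on the pairs you held back. If that error is within what you can tolerate, you can skip the API for the rest.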
You might also be able to incrementally add knowledge from already-retrieved "real" travel times by finding close points to the ones you're looking for:
- get closest points StartApprox, EndApprox to starting and end position such that you have a travel time between StartApprox and EndApprox
- compute distances StartError, EndError between start and StartApprox, end and EndApprox
- if StartError + EndError > Distance(StartApprox, EndApprox) * 0.10 (or whatever your threshold is) -> query the travel time via the API (and store it), else use the known travel time plus an overhead based on StartError + EndError
(if you have 100 addresses in NY and 100 in SF, all the values are going to be more or less the same, i.e. the difference between them is probably lower than the uncertainty involved in these predictions, and such an approach would keep you from issuing 10,000 queries where 1 would do)
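Roughly, the lookup described above could look like this. It's only a sketch under a few assumptions: `fetch_travel_time` stands in for whatever API call you use, `distance` is plain planar distance for simplicity, the overhead is taken to be proportional to the combined error, and the linear scan over known pairs would need a spatial index at your scale:

```python
import math

def distance(a, b):
    """Planar distance between two (x, y) points; swap in a geographic one as needed."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

class TravelTimeCache:
    def __init__(self, fetch_travel_time, threshold=0.10, overhead_per_unit=0.0):
        self.fetch = fetch_travel_time              # hypothetical API wrapper
        self.threshold = threshold                  # allowed error relative to trip length
        self.overhead_per_unit = overhead_per_unit  # extra time per unit of error
        self.known = []                             # (start, end, travel_time) triples
        self.api_calls = 0

    def lookup(self, start, end):
        # Find the known pair with the smallest StartError + EndError.
        best = None
        for s, e, t in self.known:
            err = distance(start, s) + distance(end, e)
            if best is None or err < best[0]:
                best = (err, s, e, t)
        if best is not None:
            err, s, e, t = best
            # Close enough: reuse the known time plus an overhead for the error.
            if err <= distance(s, e) * self.threshold:
                return t + err * self.overhead_per_unit
        # Too far from anything known: ask the API and remember the answer.
        t = self.fetch(start, end)
        self.api_calls += 1
        self.known.append((start, end, t))
        return t
```

With two tight clusters of addresses far apart, only the first lookup between the clusters hits the API and the rest are served from the stored value.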