Is there a formula that will tell me the max/average # of concurrent users I would expect to have with a population of 1000 users a day using an app for 10 minutes?
1000 users X 10 minutes = 10,000 user minutes
10,000 user minutes / 1440 minutes in a day = 6.944 average # of concurrent users
If you are looking for better estimates of concurrent users, I would suggest putting google analytics on your site. It would give you an accurate reading of highs, lows, and averages for your site.
That depends quite a bit on their usage pattern and no generic formula can cover it.
As an extreme example, if this is a timecard application, you would have a large peak at the start and stop of each work day, with scattered access between start and stop as people work on different projects, and almost no access outside of working hours.
Can you describe the usage pattern you expect?
In the worst case scenario, all 1000 users use the app at the same time, so max # oc concurrent users is 1000.
1000 users * 10 minutes = 10000 total minutes
One day has 24 hours * 60 minutes = 1440 minutes.
Assume normal distribution, you would expect 10000 / 1440 = 6.9 users on average using your app concurrently. However, normal distribution is not a valid assumption since you probably won't expect a lot of users in the middle of a night.
Well, assuming there was a steady arrival pattern, people visited the site with an equal distribution, you would have:
( Users * Visit Length in Minutes) / (Minutes in a Day) or (1000 * 10 ) / 1440
Which would be about 7 concurrent users
Of course, you would never have an equal distribution like that. You can take a stab at the anticipated pattern, and distribute the load based on that pattern. Best bet would be to monitor the traffic for a bit, with a decent sampling of user traffic.
Not accurately.
Will the usage be spread evenly over the day or are there events that will cause everyone to use their 10 mintes at the same time?
For example lets compare a general purpose web site with a time card entry system.
General purpose web site will have ebbs and flows throughout the day with clusters of time when you have lots of users (during work hours...).
The time card entry system might have all 1000 people hitting the system within a 15 minute span of time twice a day.
Simple math can show you an average, but people don't generally behave "on average"...
As is universally said, the answer is dependent upon the distribution of when people "arrive." If it's equally likely for someone to arrive at 3:23 AM as at 9:01 AM, max_concurrent is low; if everyone checks in between 8:55 AM and 9:30 AM, max_concurrent is high (and if response time slows depending on current load, so that the 'average' time on the site goes up significantly when there are lots of people on the site...).
A model is only as good as its inputs, but having said that, if you have a good sense of usage patterns, a Monte Carlo model might be a good idea. (If you have access to a person with a background in statistics or probability, they can do the math just based on the distribution parameters, but a Monte Carlo model is easier for most people to create and manipulate.)
In comments, you say that it's a "self-serve reference app similar to Wikipedia," but your relatively low usage means that you cannot rely on the power of large numbers to "dampen out" your arrival curves over 24-hours.