I know I am not the only one who dislikes progress bars and time estimates in software that are unrealistic. The best examples are installers that jump from 0% to 90% in 10 seconds and then take an hour to complete the final 10%.
Most of the time, programmers just count the steps needed to complete a task and display currentstep/totalsteps as a percentage, ignoring the fact that each step might take a different amount of time. For example, when you insert rows into a database, the insertion time can increase with the number of rows already inserted (easy example), and the time to copy a file depends not only on its size but also on its location on the disk and how fragmented it is.
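To make the difference concrete, here is a minimal sketch comparing the step-count percentage with one weighted by per-step cost. The function names and the cost figures are made up for illustration; in practice the costs would have to come from measurements or profiling.

```python
def naive_percent(done_steps, total_steps):
    """Progress as currentstep/totalsteps, treating every step as equal."""
    return 100.0 * done_steps / total_steps


def weighted_percent(step_costs, done_steps):
    """Progress weighted by an (assumed known) cost for each step."""
    total_cost = sum(step_costs)
    done_cost = sum(step_costs[:done_steps])
    return 100.0 * done_cost / total_cost


# Hypothetical installer: three quick steps, then one slow step.
costs = [1, 1, 1, 27]  # illustrative costs in seconds, not measured data
print(naive_percent(3, len(costs)))   # step-count view of progress
print(weighted_percent(costs, 3))     # cost-weighted view of progress
```

With these made-up numbers the step-count bar reports three quarters done while only a tenth of the expected time has passed, which is exactly the installer behaviour described above.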
Today I asked myself whether anybody has already tried to model this and perhaps created a library with a configurable, robust estimator. I know that robust estimates are difficult to give because external factors (network connection, the user running other programs, etc.) play their part.
Maybe there is also a solution that uses profiling to set up a better estimator, or one could use machine learning approaches.
Does anybody know of advanced solutions for this problem?
In connection with this, I found the article Rethinking the progress bar very interesting. It shows how progress bars can change the perception of time and how you can use those insights to create progress bars that seem to be faster.
EDIT: I can think of ways to tune the time estimate manually, and even with an 'estimator library' I would have to fine-tune the algorithm. But I think this problem could be tackled with statistical tools. Of course, the estimator would collect data during the process to create better estimates for the remaining steps.
What I do now is take the average time the previous steps took (steps grouped by type and normalized by, e.g., file size or transaction size) and use this average as the estimate for the next steps (again, taking the different types and sizes into account).
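Roughly, that approach looks like the sketch below. The class and method names are hypothetical, and it computes a size-weighted average (total time divided by total size per step type), which is one possible reading of "normalized by size"; it is not taken from any existing library.

```python
from collections import defaultdict


class RunningAverageEstimator:
    """Tracks seconds per unit of work (e.g. per byte) for each step type."""

    def __init__(self):
        self._total_seconds = defaultdict(float)
        self._total_size = defaultdict(float)

    def record(self, step_type, size, seconds):
        """Store the measured duration of a finished step."""
        self._total_seconds[step_type] += seconds
        self._total_size[step_type] += size

    def rate(self, step_type):
        """Average seconds per unit of size, or None if nothing was measured."""
        size = self._total_size.get(step_type, 0.0)
        if size == 0:
            return None
        return self._total_seconds[step_type] / size

    def estimate_remaining(self, pending):
        """Estimate seconds for pending (step_type, size) pairs.

        Returns None if any pending type has no measurements yet.
        """
        total = 0.0
        for step_type, size in pending:
            seconds_per_unit = self.rate(step_type)
            if seconds_per_unit is None:
                return None
            total += seconds_per_unit * size
        return total


# Usage with made-up measurements: two file copies already finished.
estimator = RunningAverageEstimator()
estimator.record("copy", size=100, seconds=2.0)
estimator.record("copy", size=300, seconds=2.0)
remaining = estimator.estimate_remaining([("copy", 500)])
print(remaining)
```

The obvious weakness is that every past measurement counts equally, so the estimate reacts slowly when conditions change mid-run.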
Now, I know there are better statistical tools for building estimators, and I wonder whether anybody has applied them to this problem.