What was your most serious production bug? This could be any bug you helped create or fix in a live system.
[moved my response to the answers]
Mine was on my first project out of school, on a large sales compensation system for a software company. We had a bug in the final summation routine, which would attempt to subtract any owed money from the next paycheck. In certain situations, where a retroactive computation increased the amount of money owed from a previous month, the debit would be recorded and then never get deducted from the next paycheck. What might start out as a $3.23 debit the first month would increase to $6.46 the following month. You can see where this is going. Although we heard a couple of user complaints early on, we dismissed them as "user error" - the sales plans were complex and it was quite easy for anyone to misunderstand what the correct amount to be paid was. But after a few months, the missing money was too large to be ignored - over $2,000,000 in unpaid payroll checks. The code fix was easy; going over months of payroll computations for hundreds of employees, not so much.
I lost some user registration data for about 7 users during a live update to a system I built. That doesn't sound so bad, except that it was registrations for an $18 billion IPO. We were able to track the information down through the automated emails that got sent out, but there were a few beads of sweat shed over that little hiccup.
I worked on an e-commerce website where the client data was supplied as a CSV dump from a legacy back-end system. We only had a sample data set to work with (despite repeated requests for the full data set) so the first time we saw the full data was on the live site the morning it launched. All the strings were quoted in the CSV file but the numbers weren't. What we didn't realise is that the legacy system inserted a comma for the thousands in larger numbers - so where we expected, say, 1099.99, we got 1,099.99. Of course, the CSV parser saw the comma and took the value as 1. Imagine the client's surprise when orders started to come in for big ticket items which were apparently selling at the bargain price of £1 each. The code was fixed quickly and fortunately their terms allowed them to decline the orders. Lesson learned: never trust a sample data set and don't go live until you've tested with a full data load.
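For anyone who hasn't been bitten by this, here is a minimal sketch of the failure in Python. The answer doesn't say what language or parser the real system used, so the sample row and the `check_field_count` helper are hypothetical, made up purely for illustration.

```python
import csv
import io

# Hypothetical row in the style described above: strings are quoted,
# numbers are not, and the price carries a thousands separator.
raw = '"Widget Deluxe",1,099.99\n'

row = next(csv.reader(io.StringIO(raw)))
# The unquoted comma is treated as a field delimiter, so the price
# is split across two fields instead of arriving as one value.
naive_price = float(row[1])  # reads only the part before the comma


def check_field_count(row, expected_fields):
    """Reject rows whose field count differs from the expected schema.

    Hypothetical guard: an extra field is a cheap signal that an
    unquoted delimiter has split a value.
    """
    if len(row) != expected_fields:
        raise ValueError(
            f"expected {expected_fields} fields, got {len(row)}: {row!r}"
        )
    return row
```

With a two-column schema (name, price), `check_field_count(row, 2)` raises instead of silently accepting a price of 1. A check like this only catches the problem at import time, though; it would not have replaced testing against the full data set.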
We had an e-commerce system, and when it was moved to the production server (through our super awesome manual copy/paste/edit settings process), the senior developer - the only one with access to the server - forgot to connect the system to the payment gateway. $18,000 worth of sales later, the client noticed that their bank account wasn't any bigger than when we started.
Process improvements since that day:
Not one.
How we solved the problem:
Told the client to contact all the customers, using the order notification emails to identify them.