views:

552

answers:

12

I have a query that should always be returning a single int. I have now logged it returning a string entirely unrelated to what it should be.

We've been getting some random FormatExceptions that we've tracked down to several database queries. After some additional logging, I found that, this morning, the query below returned the string "gladiator". Website.PkID is an int column and works most of the time, but some times it fails miserably and returns either an int that's waaaay out there (bigger than any valid WebsiteID) or a random string.

This particular query is hit once per session start. It's not using a shared connection, so I'm having trouble understanding how it could get such a mixed-up result. Could there be some kind of corruption in the connection pools?

I don't think the problem is isolated to this query. I've seen similar FormatExceptions (because of an unexpected result) coming from LINQ queries as well. We've also spotted some of these errors around the same times:

A transport-level error has occurred when sending the request to the server. (provider: TCP Provider, error: 0 - An existing connection was forcibly closed by the remote host.

Could it be a connection issue? Or maybe we're getting result sets mixed up between the db server and the web server? This has really got me scratching my head.

Offending query:

public static int GetActiveWebSiteID(string storeID, string statusID)
{
    int retval;

    string sql = @"SELECT isnull(MAX(PkID),0) FROM WebSite 
                   WHERE StoreID = @StoreID 
                   AND WebSiteStatusID = @WebSiteStatusID";

    SqlConnection conn = new SqlConnection(Settings.ConnString);
    SqlCommand cmd = new SqlCommand(sql, conn);
    cmd.CommandType = CommandType.Text;
    cmd.Parameters.AddWithValue("@StoreID", (object)storeID ?? DBNull.Value);
    cmd.Parameters.AddWithValue("@WebSiteStatusID", (object)statusID ?? DBNull.Value);

    conn.Open();
    using(conn)
    {
        var scalar = cmd.ExecuteScalar(); // <-- This value returned here should only ever be an int, but randomly is a string

        retval = Convert.ToInt32(scalar);
    }
    return retval;
}

The above query has worked fine for years until recently. We now have a bunch of additional LINQ queries in the app (not sure if that makes a difference). We're running .Net 3.5.

+1  A: 

The ExecuteScalar() function's return type is object, and you declare your result variable with the var keyword. That's not really a good combination, because you're putting a lot of pressure on the system to get the type inference right.

Joel Coehoorn
The column is an int and the result was a string of "gladiator". I don't see how type inference has anything to do with this. An int column, or any aggregation thereof, should never return a string.
Chad
Type inference should never spawn a "gladiator" :)
VVS
Nor should the code he posted, unless PkID isn't really an int column.
Joel Coehoorn
The compiler either resolves the type exactly or gives an ambiguity error.
VVS
But how do you know in this case that it resolved to the type you expect? Convert.ToInt32() will accept almost anything, and that's the only other place it's used.
Joel Coehoorn
A: 

hi there, I think that you were thinking about sqlCommand.ExecuteNonQuery that returns the number of rows affected within an int value...

this is the definition of the ExecuteScalar method:

public override object ExecuteScalar() Member of System.Data.SqlClient.SqlCommand

Summary: Executes the query, and returns the first column of the first row in the result set returned by the query. Additional columns or rows are ignored.

Returns: The first column of the first row in the result set, or a null reference (Nothing in Visual Basic) if the result set is empty.

so I think that the common way of returning that column is as a string representation of the column value.

Hope this helps.

Alex Pacurar
ExecuteScalar() is exactly what I want. I don't care about the number of rows.
Chad
A: 

the Field "PkID" is a varchar/char in the "WebSite" Table.

If the "ISNULL" part of the query is true then it will return an integer (0), else it will return an string with the value of "PkID"

Burnsys
As mentioned in the question, Website.PkID is an int column.
Chad
A: 

There could be more than one WebSite table. Can you qualify the table with schema name:

SELECT isnull(MAX(PkID),0) FROM YourSchema.WebSite WHERE StoreID = @StoreID AND WebSiteStatusID = @WebSiteStatusID

AlexKuznetsov
There is only one Website table.
Chad
A: 

Is there any commonality to when it fails to return an int?

Since your query only ever returns a single column in a single row, what do you get if you use a more typesafe ExecuteReader and take the first column's value?

Is it always returning a row? If the WHERE clause causes it to return no rows (say your parameters are not what you think they are), your ISNULL doesn't come into effect - there are no rows at all, and ExecuteScalar is supposed to return a NULL.

Cade Roux
I stripped out some of the code for the sake of brevity in the above example. The real code has more logging and also checks the value of "scalar" for null and DbNull before trying to convert to an int. These checks handle the no-result case.
Chad
But as you recommend, I'll try reworking the query to see if that helps
Chad
A: 

I've recently seen a case where code was switching connection strings unexpectedly. For diagnostic purposes, please hard code the connection string and see if the problem goes away.

Also, for sanity's sake, please use nested using blocks like:

using(SqlConnection conn = new SqlConnection("hard-coded connection string"))
{
    using (SqlCommand cmd = new SqlCommand(sql, conn))
    {
        // more init
        object scalar = cmd.ExecuteScalar();

        // process result
    }
 }

It wouldn't surprise me to find there are two database instances, and in one, PkID is an int, in another it's varchar.


Take a look with SQL Profiler to see if you can catch the return of "gladiator". In the other case I was working with, SQL Profiler showed nothing at all, indicating that the actual query was going to a different database.

John Saunders
There's only one valid database with a Website table out there. Good idea on cleaning up the usings a little bit - that much couldn't hurt anything.
Chad
I presume you know this because you used the Profiler and saw the call that returned "gladiator"?
John Saunders
A: 

I think neither your posted query nor LINQ are the problem.

Are you really sure you're looking at the right source? How is the method called? How is logging done?

Select isn't broken.

VVS
Yes, it's the right source. I've removed the log4net logging in the example code above for brevity, and had injected very specific logging that told me the inputs and output.I have a hunch that there might be some kind of network thing going on between the sql server and web server. We've had weird issues before, though never of this severity.
Chad
You are not convincing me. When you've used the SQL Profiler and seen the code actually run on the server you expect it to run on, then we'll know what server it's running on. Until then, you're guessing.
John Saunders
A: 

Why can't you cast isnull(MAX(PkID),0) to int in your dynamic sql?

Gustavo Cavalcanti
A: 

Maybe you can try to track down the issue in reverse.

Is there any column or row that contains the string "gladiator" in the database? Can you determine where that data is? Maybe that would shed some light on the issue.

Do you have any enums with a member called "gladiator"? Could type inference somehow be hitting a glitch where it interprets the integer value as an enum?

Just some random thoughts.

Chris Dunaway
A: 

Hi,

I'm assuming that Settings.ConnString is read from Web.Config or the registry and is re-used other static routines. Could it be possible that there is a timing issue where a second method is executed prior to the call to cmd.ExecuteScalar() in your routine which modifies cmd.CommandText on the connection?

Hope this helps,

Bill

Bill Mueller
+2  A: 

After a few months of ignoring this issue, it started to reach a critical mass as traffic gradually increased. We increased logging and found numerous definitive instances where, under heavy load, completely different result sets would get returned to unrelated queries.

We watched the queries in Profiler and were able to see that the bad results were always associated with the same spid, and that each bad result was always one query behind the actual sql statement being queried. It was like a result set got missed and whatever result set was next in the spid (from another connection in the same pool) was returned. Crazy.

Through trial and error, we eventually tracked down a handful of SqlCommand or LINQ queries whose SqlConnection wasn't closed immediately after use. Instead, through some sloppy programming originating from a misunderstanding of LINQ connections, the DataContext objects were disposed (and connections closed) only at the end of a request rather than immediately.

Once we refactored these methods to immediately close the connection with a C# "using" block (freeing up that pool for the next request), we received no more errors. While we still don't know the underlying reason that a connection pool would get so mixed up, we were able to cease all errors of this type. This problem was resolved in conjunction with another similar error I posted, found here: What Causes "Internal Connection Fatal Errors"?

Chad
A: 

Hi

Facing exactly similar behavior where I am getting the return value of a different query at peak loads. We also have done heavy logging and has data which shows this mix up. As in your case the only common thing is the connection pool. Its again .Net 3.5 and I do not find any similar occurrences other than yours online.

I can definitely start implementing "using". But other than that if you have any further data on the actual cause it would be helpful.

Hemchand

Hemchand
Yea, that sounds exactly like what we had, and boy was it unnerving. I'd first try to find the heaviest hit queries and make sure they're closing the LINQ connection immediately after use (hence the using block).
Chad
Woops, hit enter too quick. It really helped us to be able to reproduce it, which wasn't easy. We had to really hammer the site to make it happen. In our case, we used several simultaneous instances of the crawler Xenu on several client machines with the max amount of threads all crawling the site. Watch your log and at some point, you'll get a trickle of those errors. Then just keep updating your heaviest used LINQ queries with the using block. SQL Profiler also helped. After we tweaked our app like this, we have never seen a recurrence, despite a significant growth in traffic. Best of luck!
Chad
Hi Chad, I should say going the "using" approach helped. And we took this as an opportunity in identifying the most frequent database calls and optimizing them.
Hemchand