tags:

views:

154

answers:

3

Let's say I have a table that has a varchar field. If I do an insert like this:

INSERT MyTable 
 SELECT N'the string goes here'

Is there any fundamental difference between that and:

INSERT MyTable 
 SELECT 'the string goes here'

My understanding was that you'd only have a problem if the string contained a Unicode character and the target column wasn't unicode. Other than that, SQL deals with it just fine and converts the string with the N'' into a varchar field (basically ignores the N).

I was under the impression that N in front of strings was a good practice, but I'm unable to find any discussion of it that I'd consider definitive.

Title may need improvement, feel free.

+5  A: 

You should prefix strings with N when they are destined for an nvarchar(...) column or parameter. If they are destined for a varchar(...) column or parameter, then omit it, otherwise you end up with an unnecessary conversion.

It's definitely not a "best practice" to stick N in front of every string regardless of what it's for.

Aaronaught
But if you put the N in front, you don't really have to know what the target column's datatype is. Worrying about the extra conversion feels like micro-optimization.
jcollum
@jcollum: The fact remains that the best practice is to use what's actually correct. Someone may look at your query and incorrectly conclude that the column supports Unicode characters when it actually doesn't.
Aaronaught
What I wonder then is why so many tools put N in front of strings, regardless of the destination type.
jcollum
@jcollum: That is a very good question. Might just be laziness - easier to use the same code for all character types than to write special-case code based on Unicode settings.
Aaronaught
+1  A: 

From INSERT (Transact-SQL)

When referencing the Unicode character data types nchar, nvarchar, and ntext, 'expression' should be prefixed with the capital letter 'N'.

Also have a read at Why do some SQL strings have an 'N' prefix?

And

Server-Side Programming with Unicode

Unicode string constants that appear in code executed on the server, such as in stored procedures and triggers, must be preceded by the capital letter N. This is true even if the column being referenced is already defined as Unicode. Without the N prefix, the string is converted to the default code page of the database. This may not recognize certain characters.

astander
Doesn't answer the question. I wasn't asking why it's there. I was wondering if there's a downside or if it's a best practice to have it there regardless.
jcollum
No, it would not be, if it is not required...
astander
+1  A: 

Short answer: fine for scripts, bad for production code.

It is not considered a best practice. There is a downside, it creates a minuscule performance hit as 2 byte characters are converted to 1 byte characters.

If one doesn't know where the insert is going, or doesn't know where the source text is coming from (say this is a general purpose data insertion utility that generates insert statements for an unknown target, say when exporting data), N'foo' might be the more defensive coding style.

So the downside is small and the upside is that your script code is much more adaptable to changes in database structure. Which is probably why you see it in bulk data-insert scripts.

However, if the code in question is something meant for re-use in an environment where you care about the quality of the code, you should not use N'the string' because you are adding a conversion where none is necessary.

MatthewMartin
Ah, defensive, I like that term here. +1
jcollum
I think this, combined with the one above is the answer. I think I'll edit this and call it the answer.
jcollum
@jcollum: It's not just about the performance hit of a conversion. You're indicating the wrong semantics to future maintainers. It's a bit like writing predicates as `WHERE SomeCol = '1'`, as if it's a `varchar` column, when it's really an `int` column. It may work, but that doesn't make it less wrong, and it may lead to subtle collation-related errors in the future as the script gets changed. Generated code is obviously not held to as high a quality standard as written code; the same is generally true of any code gen, try looking at the designer code-behind of a Windows Form.
Aaronaught
@aaron: there are any number of places in scripts where there are datetimes that are passed in as strings '2010-04-28 12:31AM'. I'm not seeing what you mean about the collation errors. I.e. I don't understand how asking sql to convert and nvarchar to a varchar (before inserting data for a column) could result in collation errors.
jcollum
@aaron: my edit of this answer also addressed not doing this in production code, which I think is appropriate. I don't see any downside to doing this in scripts. Please illustrate if you see an issue that I don't.
jcollum
@jcollum: My issue was with your last sentence, "because you are adding a conversion where none is necessary." The implication is that's the only reason, but a more important reason is that it's simply misleading. Code generators don't care about that because machine-generated code is not intended for human consumption.
Aaronaught
@aaron: I think you're reading an "only reason" in there when it doesn't say that. It's just the first reason that I picked and the most quantifiable one.
jcollum