Here's the code I want to speed up. It's getting a value from an ADO recordset and converting it to a char*. But this is slow. Can I skip the creation of the _bstr_t?

                _variant_t var = pRs->Fields->GetItem(i)->GetValue();

                if (V_VT(&var) == VT_BSTR)
                {
                    // Constructs a temporary _bstr_t, which converts the wide string to ANSI.
                    const char* p = (const char*) (_bstr_t) var;
                    // ...
                }
+2  A: 

This creates a temporary on the stack:

USES_CONVERSION;
char *p=W2A(var.bstrVal);

This uses a slightly newer syntax and is probably more robust. It has a configurable size, beyond which it will use the heap so it avoids putting massive strings onto the stack:

CW2AEX<> conv(var.bstrVal);  // named object, so the buffer outlives this statement
char *p=conv;
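
To illustrate the "fixed buffer first, heap beyond a configurable size" idea in portable terms, here is a minimal sketch. NarrowBuffer is a hypothetical class invented for this example, not ATL's implementation, and it uses the standard std::wcstombs (which depends on the current C locale) purely as a stand-in for the real conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cwchar>
#include <memory>
#include <string>

// Hypothetical illustration only: converts a wide string into a fixed-size
// member buffer when the result fits in N bytes, otherwise into a heap buffer.
template <std::size_t N = 128>
class NarrowBuffer {
public:
    explicit NarrowBuffer(const wchar_t* src) {
        assert(src != nullptr);
        // Worst case: MB_CUR_MAX bytes per wide character, plus the terminator.
        std::size_t cap = std::wcslen(src) * MB_CUR_MAX + 1;
        if (cap <= N) {
            ptr_ = fixed_;                 // small string: no heap allocation
        } else {
            heap_.reset(new char[cap]);    // large string: fall back to the heap
            ptr_ = heap_.get();
        }
        std::size_t n = std::wcstombs(ptr_, src, cap);
        if (n == static_cast<std::size_t>(-1)) n = 0;  // unconvertible input: yield ""
        ptr_[n] = '\0';
    }

    // ptr_ may point into this object, so copying is disabled.
    NarrowBuffer(const NarrowBuffer&) = delete;
    NarrowBuffer& operator=(const NarrowBuffer&) = delete;

    const char* c_str() const { return ptr_; }       // valid while the object lives
    std::string str() const { return std::string(ptr_); }
    bool on_heap() const { return heap_ != nullptr; }

private:
    char fixed_[N];
    std::unique_ptr<char[]> heap_;
    char* ptr_;
};
```

The pointer returned by c_str() is only valid while the NarrowBuffer object is alive, so either keep the object in a named variable or copy the result out with str() before it goes away.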
1800 INFORMATION
Well, I liked this approach in theory, except it's not really faster according to my timings.
Corey Trager
I'm surprised it isn't. Your original method requires the creation of an extra BSTR, which has to go through the COM allocator. Both of mine just convert into a buffer on the stack. If your original string is long enough, the overhead from the conversion itself is probably the most significant cost.
1800 INFORMATION
My strings are short, but I have, oh, 150,000 of them or so in a loop. That is, I'm looping through the rows and columns of an ADO recordset, converting the BSTRs to char*s, then pushing the char*s into a std::vector of std::strings. Takes about 3.2 seconds both ways.
Corey Trager
Well, you still have to perform the conversion from Unicode. Have you considered trying to avoid the conversion? Change your whole app to use Unicode? Are you sure you have the right bottleneck? Maybe it is the GetValue() calls?
1800 INFORMATION
1800 INFORMATION, thanks for the suggestions. There are other parts of my loop that contribute to the 3.2 seconds, but in this Stack Overflow question I was focusing on just this narrow point. I'm not free to do things over, but I think going with Unicode from end to end would have been better.
Corey Trager
Allocating memory on the heap with alloca(n) is not slow. In fact it can be optimized down to three instructions: SUB ESP, n; MOV EAX, ESP; RET 4;
jmucchiello
alloca does not allocate memory on the heap
1800 INFORMATION
+2  A: 

Your problem (other than the possibility of a memory copy inside _bstr_t) is that you're converting the UNICODE BSTR into an ANSI char*.

You can use the USES_CONVERSION macros which perform the conversion on the stack, so they might be faster. Alternatively, keep the BSTR value as unicode if possible.

To convert:

USES_CONVERSION;
char* p = strdup(OLE2A(var.bstrVal));

// ...

free(p);

Remember: the string returned by OLE2A (and its sister macros) is allocated on the stack. Return from the enclosing function and you have a garbage string unless you copy it (and eventually free the copy, obviously).

gbjbaanb
A: 

Ok, my C++ is getting a little rusty... but I don't think the conversion is your problem. That conversion doesn't really do anything except tell the compiler to consider _bstr_t a char*. Then you're just assigning the address of that pointer to p. Nothing's actually being "done."

Are you sure it's not just slow getting stuff from GetValue?

Or is my C++ rustier than I think...

Telos
Actually there is a bunch of magic going on behind the scenes by the _bstr_t class to convert the string from unicode to ANSI
1800 INFORMATION
That depends. I don't know the details of the _variant_t class, but if it defines an operator _bstr_t() then that's the function that will be called if a _variant_t to _bstr_t conversion is needed.
Ferruccio
Those aren't mere casts for the compiler. I know that from stepping through the code. Objects are being constructed.
Corey Trager
+2  A: 

The 4 bytes just before the address a BSTR points to contain its length in bytes. You can loop through and take every other byte if the data is Unicode, or every byte if it is multibyte. Some sort of memcpy or other method would work too. IIRC, this can be faster than W2A or casting (LPCSTR)(_bstr_t).
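
A minimal, portable sketch of the "take every other byte" idea, with a guard so it refuses input it cannot narrow safely. narrow_ascii is a hypothetical helper written for this example, and char16_t stands in for the 16-bit characters of a BSTR; on Windows the length would presumably come from SysStringLen rather than being passed by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: narrows 16-bit code units to char, but only when every
// unit is 7-bit ASCII. On any other character it clears `out` and returns
// false, so the caller can fall back to a real conversion.
bool narrow_ascii(const char16_t* src, std::size_t len, std::string& out) {
    assert(src != nullptr || len == 0);
    out.clear();
    out.reserve(len);                      // one allocation for the whole string
    for (std::size_t i = 0; i < len; ++i) {
        if (src[i] > 0x7F) {               // not plain ASCII: narrowing would corrupt it
            out.clear();
            return false;
        }
        out.push_back(static_cast<char>(src[i]));
    }
    return true;
}
```

The check costs one comparison per character, which keeps the fast path cheap while avoiding silent data corruption if a wide character ever shows up in the recordset.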

PiNoYBoY82
Worth a try. Thanks.
Corey Trager
This is a crazy suggestion. What happens if the BSTR contains wide characters? You'll just corrupt the data.
1800 INFORMATION
If you know for sure what your dataset is, then no issue. You sacrifice flexibility for speed.
PiNoYBoY82
Agree with PiNoYBoY82, it's a hack, but it works when planets align in a certain way.
Constantin
The characters are ansi. For me, this hack works, is safe, and it's very fast.
Corey Trager
You are storing ANSI chars in a BSTR? Urgh. On Win32, BSTRs are always wide char -- everything else is a dirty hack.
Johannes Passing
A: 

It seems to be working ok.

A: 
_variant_t var = pRs->Fields->GetItem(i)->GetValue(); 

You can also make this assignment quicker by avoiding the Fields collection altogether. You should only use the Fields collection when you need to retrieve the item by name. If you know the fields by index, you can instead use this:

_variant_t vara = pRs->Collect[i];

Note i cannot be an integer as ADO does not support VT_INTEGER, so you might as well use a long variable.

Stone Free