views:

175

answers:

3

I have a client who is incapable of formatting spreadsheets properly. I've asked them 10 times over to follow a sensible guideline (As in FORMAT THE COLUMNS WITH THE CORRECT DATATYPE) but in their lack of intelligence or professionalism or both, they just can't /won't do it.

I have an ODBC connection set up for the spreadsheets and pulls the data except for the columns that contain product counts or product prices. Unless I manually set the type for each of those columns to "Number" in Excel, the values appear null in the ASP page that displays the results.

Is it possible to write a SQL statement that will alter the datatype as it comes in? I don't care it if converts every column to a string as long as the ADODB.Recordset will display the values of all of the columns.

Code:

<%
Dim MM_SQLSource

MM_SQLSource = "SELECT * FROM [Inventory$]"
Set rsGetExcelInfo = Server.CreateObject("ADODB.Recordset")
rsGetExcelInfo.ActiveConnection = MM_Excel_Connect
rsGetExcelInfo.Source = MM_SQLSource
rsGetExcelInfo.CursorType = 0
rsGetExcelInfo.CursorLocation = 2
rsGetExcelInfo.LockType = 1

rsGetExcelInfo.Open()
%>
+1  A: 

It may not be entirely your client's fault. Excel sometimes has its own ideas about how a column should be formatted. For example, if you have a column containing zip codes, some with the Plus 4, others without, it is pretty much a crap shoot as to how that column will be formatted.

As for your original question, according to this site, CONVERT is a valid SQL scalar function, so maybe something like

SELECT CONVERT(BadField, SQL_CHAR) AS FixedField FROM [Inventory$]

might work?

My first inclination was to suggest using COM to read the data from the spreadsheet. I'm pretty sure you would be able to read each cell's format and deal with it accordingly, but I always found Excel via COM to be difficult and not terribly fast (I've only done it from C++).

Marc Bernier
+1  A: 

Unfortunately AFAIK all the ODBC/ADO/OLEDB/whatever gadgets guess the data type for each column after inspecting a few rows (default is 8, I believe).

There are packages/modules/libraries available for Python[1]/Perl/PHP/.Net etc that will let you read an Excel file cell by cell. The better ones will give you both a value and a finely-grained type (text/numeric/date/boolean/error/empty) for each cell -- armed with that info plus a knowledge of what should be in each column, you can do a repair job.

[1] E.g. http://pypi.python.org/pypi/xlrd of which I'm the author ... after being bitten by (a) the ODBC problem (b) the 'save-as-CSV' problem (c) the COM problem

John Machin
A: 

How about stop using Excel and put it into a real database with some forms for user data entry?

It shouldn't be too overly difficult to put the spreadsheet into a database. Access would do fine. Or use a SQL Server back end and an Access front-end. Or if you can handle more effort, make a web front end.

What you'll find is that the more difficult it is to do this, the worse your data was anyway, and the more errors, inconsistencies, and omissions you had.

Do everyone a favor and formalize this thing.

Emtucifor