Hi - I've got a problem I'm sure someone somewhere has encountered before. We had been FTPing a customer's .csv files down to our laptops and then SQLLoading them into our Oracle DBs, but the network made that a slow process. I set up a shell script to LFTP those files straight down to the Solaris DB box and sqlload them there, which is much faster. There were some character issues at first, but after altering NLS_LANG I now see the same characters in the DB as when we go the Windows route.

Two of the seven files still have issues. Of 500,000 records, a few thousand are written to a .bad file because their lines are split. Curiously, this doesn't happen in the Windows environment. I'm not sure whether this is an FTP vs. LFTP thing, or a charset conversion that occurs when coming into UNIX (MSWIN -> WE8ISO). I thought there might be a setting that makes LFTP behave more like FTP in this regard. Any ideas?

My band-aid alternative, if I can't figure out the real problem above, is to reload the two .bad files after moving each split line back onto the end of the previous line. Here's an example of a split record in the .bad file. The records always seem to split at this address field, often where there should have been a dot or a comma. See '215 St' below, where the line breaks:

"","","1-1000035","","","1-1000035","SIS STRATEGIC INFORMATION SYSTEMS","SIS STRATEGIC INFORMATION SYSTEMS","","RESELLER","Active","N","Y","","","","","","$"
,"","","","","","","","80","","","","","","","","","","","","","(403) 281-4252","(780) 701-4050","North America","","","11432 215 St
Summerbarn Rd","","","Edmonton","AB","T2S3Y5","Canada","","","","","","1-1000035","","","","","","","","","","","","",
"","","","","",,,,"",,0,"UPSERT",10,"Y","Inserted By Widget",2009-10-23 15:08:03.387000000,2009-10-23 15:08:03.387000000,"",,"",,"","","1-1000035"^M
A: 

Could it be the difference between Unix and Windows line endings (\n vs. \r\n)?
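
One way to check this, as a minimal sketch: the sample record above ends in ^M (a carriage return), which suggests that real record terminators are \r\n while the breaks inside the address field are bare \n. Assuming that holds for the whole file, the Python below counts each kind of line ending and then rebuilds records by splitting on \r\n only. The function names and the choice to replace an embedded break with a space are illustrative assumptions, not anything from the original setup.

```python
def count_line_endings(raw: bytes) -> dict:
    """Count CRLF pairs and any LF or CR that is not part of a CRLF pair."""
    crlf = raw.count(b"\r\n")
    return {
        "crlf": crlf,
        "bare_lf": raw.count(b"\n") - crlf,
        "bare_cr": raw.count(b"\r") - crlf,
    }


def rejoin_split_records(raw: bytes, filler: bytes = b" ") -> list:
    """Split on CRLF only and fold bare LFs inside a record into `filler`.

    Assumption: every genuine record ends in CRLF, so a bare LF can only be
    a line break embedded in a field (for example, a two-line address).
    """
    records = []
    for chunk in raw.split(b"\r\n"):
        if not chunk:
            continue  # skip the empty piece after the final terminator
        records.append(chunk.replace(b"\n", filler))
    return records


if __name__ == "__main__":
    # Hypothetical two-record file; the first record has an embedded LF.
    sample = b'"a","11432 215 St\nSummerbarn Rd","x"\r\n"b","c","d"\r\n'
    print(count_line_endings(sample))
    for record in rejoin_split_records(sample):
        print(record.decode("latin-1"))
```

If `bare_lf` comes back nonzero, the file has line breaks embedded in fields. Writing the rejoined records back out with one consistent terminator should then let the two problem files load without rows landing in the .bad file.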

Beaner
I'm trying to nip it in the bud, so to speak, and make the LFTP transfer behave like the FTP transfer, if the difference is between those protocols and not between character sets.
Ryan