bulk-load

Bulk load data into SQLite?

Does anybody have any tips on utilities that can be used to bulk load data stored in delimited text files into an SQLite database? Ideally something that can be called as a stand-alone program from a script, etc. A group I work with has an Oracle database that's going to dump a bunch of data out to a file and then load that data ...
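
A minimal sketch of one scripted approach, assuming Python's standard sqlite3 module, a tab-delimited input file, and a hypothetical table layout (people with three columns):

    import csv
    import sqlite3

    # Open (or create) the target database and make sure the table exists.
    conn = sqlite3.connect("target.db")
    conn.execute("CREATE TABLE IF NOT EXISTS people (id INTEGER, name TEXT, phone TEXT)")

    # Stream the delimited file and insert all rows in a single transaction.
    with open("dump.txt", "r") as f:
        reader = csv.reader(f, delimiter="\t")
        conn.executemany("INSERT INTO people VALUES (?, ?, ?)", reader)

    conn.commit()
    conn.close()

The sqlite3 command-line shell's .import command is another stand-alone option that can be driven from a shell script.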

What mysql settings affect the speed of LOAD DATA INFILE?

Let me set up the situation. We are trying to insert a modestly high number of rows (roughly 10-20M a day) into a MyISAM table that is modestly wide:

    +--------------+--------------+------+-----+---------+-------+
    | Field        | Type         | Null | Key | Default | Extra |
    +--------------+--------------+------+-----+---------+-------...
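
A hedged sketch of the usual knobs for MyISAM bulk loads (session buffers plus deferring index maintenance around LOAD DATA INFILE), shown here through the MySQLdb driver; connection details, table name, and file path are placeholders:

    import MySQLdb

    conn = MySQLdb.connect(host="localhost", user="loader", passwd="secret", db="stats")
    cur = conn.cursor()

    # Session-level buffers that influence MyISAM bulk insert / index rebuild speed.
    cur.execute("SET SESSION bulk_insert_buffer_size = 256 * 1024 * 1024")
    cur.execute("SET SESSION myisam_sort_buffer_size = 256 * 1024 * 1024")

    # Defer non-unique index maintenance until the whole file is loaded.
    cur.execute("ALTER TABLE wide_table DISABLE KEYS")
    cur.execute("""
        LOAD DATA INFILE '/tmp/day.tsv'
        INTO TABLE wide_table
        FIELDS TERMINATED BY '\\t'
        LINES TERMINATED BY '\\n'
    """)
    cur.execute("ALTER TABLE wide_table ENABLE KEYS")
    conn.commit()

The global key_buffer_size also matters for the index rebuild triggered by ENABLE KEYS.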

Need help designing a phone book application in Python running on Google App Engine

Hi, I want some help building a phone book application in Python and putting it on Google App Engine. I am running a huge DB of 2 million users and their phone book contacts. I want to upload all that data from my servers directly onto the Google servers and then use a UI to retrieve the phone book contacts of each user based on his...
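
As a starting point, a minimal sketch of one possible datastore layout, assuming the old db API and hypothetical kind names (User, Contact); parenting each contact under its owner lets a user's phone book be fetched with an ancestor query, and the 2M rows themselves would go in through the bulkloader:

    from google.appengine.ext import db

    class User(db.Model):
        name = db.StringProperty(required=True)

    class Contact(db.Model):
        contact_name = db.StringProperty(required=True)
        phone_number = db.PhoneNumberProperty()

    def add_contact(user, contact_name, phone_number):
        # Parent each Contact under its owning User so one user's phone book
        # forms a single entity group.
        return Contact(parent=user, contact_name=contact_name,
                       phone_number=phone_number).put()

    def contacts_for(user):
        # Ancestor query: fetches only the contacts in this user's phone book.
        return Contact.all().ancestor(user).fetch(1000)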

DB load CSV into multiple tables

UPDATE: added an example to clarify the format of the data. Consider a CSV with each line formatted like this:

    tbl1.col1,tbl1.col2,tbl1.col3,tbl1.col4,tbl1.col5,[tbl2.col1:tbl2.col2]+

where [tbl2.col1:tbl2.col2]+ means that any number of these pairs may be repeated, e.g.:

    tbl1.col1,tbl1.col2,tbl1.col3,tbl1.col4,tbl1.col5,tb...
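
One way to read such rows is to split off the fixed leading columns and treat the remainder as repeated pairs. A small sketch under the assumption that each pair arrives as two consecutive comma-separated fields (reading the colon in the pattern as notation); file and column counts come from the example above:

    import csv

    def parse_line(row):
        # First five fields belong to tbl1; the rest are repeated tbl2 pairs.
        # If each pair is instead a single "a:b" field, split on ":" here.
        tbl1_values = row[:5]
        rest = row[5:]
        tbl2_pairs = [tuple(rest[i:i + 2]) for i in range(0, len(rest), 2)]
        return tbl1_values, tbl2_pairs

    with open("data.csv", "r") as f:
        for row in csv.reader(f):
            tbl1_values, tbl2_pairs = parse_line(row)
            # Insert tbl1_values into tbl1, then one tbl2 row per pair,
            # carrying the tbl1 key as the foreign key.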

AS3 Loading Workflow: XML First, then Multiple Assets

I am working on my first big ActionScript 3 website and trying to decide on the best loading sequence. I am currently using BulkLoader, since file size wasn't much of an issue for a larger website, but I am definitely open to other approaches. I am trying to figure out which external assets to measure progress for (1 swf, 1 css file, multip...

Date fields and Django's loaddata

Can a date be loaded into a DateField using Django's loaddata management command? I have a JSON file that I'm using to bulk load data into my app. When you dumpdata, date fields are output in the format yyyy-mm-dd. However, if you try loading data back in with the same format, the field is treated as a string and the load fails. For ex...
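
For reference, dumpdata writes DateField values as ISO yyyy-mm-dd strings inside the fixture, so a hand-built fixture should do the same; a small sketch, with the app/model name (contacts.entry) and field names purely illustrative:

    import datetime
    import json

    # Build one fixture record by hand, formatting the date the way dumpdata does.
    record = {
        "model": "contacts.entry",   # hypothetical app.model
        "pk": 1,
        "fields": {
            "name": "Alice",
            "birthday": datetime.date(1980, 5, 1).isoformat(),  # "1980-05-01"
        },
    }

    with open("entries.json", "w") as f:
        json.dump([record], f, indent=2)

The key point is that the value must already be the plain string yyyy-mm-dd, not a datetime repr.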

How to set the parent for a datastore entity when bulkloading data with appcfg.py on Google App Engine?

I'm trying to bulkload data using appcfg.py as described here. I got it working except for setting the parent entity; I can't seem to find info on how to set a parent entity for the entities being created by the import. Can you point me in the right direction or provide a code snippet for my bulkloader.Loader implementation? ...
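
As a rough sketch only: the SDK-era bulkloader.Loader passes key_name and parent through its create_entity hook, so one option (assuming that hook behaves this way, and using a hypothetical Contact kind parented under a User key derived from a CSV column) is to override it and supply the parent key yourself:

    from google.appengine.ext import db
    from google.appengine.tools import bulkloader

    class ContactLoader(bulkloader.Loader):
        def __init__(self):
            bulkloader.Loader.__init__(self, 'Contact',
                                       [('owner_id', str),
                                        ('name', str),
                                        ('phone', str)])

        def create_entity(self, values, key_name=None, parent=None):
            # Build the parent key from a column of the row (owner_id here),
            # then let the base class create the entity under that parent.
            parent_key = db.Key.from_path('User', values[0])
            return bulkloader.Loader.create_entity(self, values,
                                                   key_name=key_name,
                                                   parent=parent_key)

    loaders = [ContactLoader()]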

STA threads with SQLXMLBULKLOAD

If I have N STA .NET threads, each performing an independent bulk load operation on a different database using the SQLXMLBulkLoad DLL (which requires calling threads to be STA), is it possible for all bulk loads to happen at the same time, or are they implicitly serialized due to the STA COM configuration? Thanks! ...

MySQL to AppEngine

Hi Nick! How are you? I'm from Brazil and study at FATEC (a college located in Brazil). I'm trying to learn about AppEngine. Right now I'm trying to load a large database from MySQL into AppEngine to perform some queries, but I don't know how I can do it. I did some testing with CSV files, but is there any way to perform a direct import from My...
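
There is no built-in MySQL-to-datastore import; the usual route is MySQL -> CSV -> App Engine bulkloader. A small sketch of the dump step, assuming the MySQLdb driver and placeholder table/column names:

    import csv
    import MySQLdb

    conn = MySQLdb.connect(host="localhost", user="root", passwd="secret", db="mydb")
    cur = conn.cursor()
    cur.execute("SELECT id, name, email FROM customers")   # placeholder query

    # Write an intermediate CSV that bulkloader.yaml / appcfg.py upload_data can consume.
    with open("customers.csv", "wb") as f:
        writer = csv.writer(f)
        for row in cur.fetchall():
            writer.writerow(row)

    conn.close()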

Bulkloading schema-less entities on Google App Engine

The new bulkloader added in SDK 1.3.4 works great for models that have a schema. For models inheriting from db.Expando (or with loosely defined schemas) I would like to understand what I would have to do to bulk upload them. I defined a custom connector that implemented the ConnectorInterface and converted my data to the required Python dict. ...
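
One hedged idea, if the YAML route is acceptable: the transformer options include a post_import_function hook that receives the input dict and the entity being built, so columns outside the fixed schema could be attached there as dynamic Expando properties. The module name, fixed-column set, and the exact behavior of the hook are assumptions in this sketch:

    # Referenced from bulkloader.yaml as, e.g.:
    #   post_import_function: expando_helpers.add_dynamic_properties
    # with expando_helpers made importable via python_preamble.

    FIXED_COLUMNS = set(['key_name', 'name'])

    def add_dynamic_properties(input_dict, instance, bulkload_state):
        # 'instance' is the low-level, dict-like entity produced by the
        # property_map; extra columns assigned here come back as dynamic
        # properties on the Expando model.
        for column, value in input_dict.items():
            if column not in FIXED_COLUMNS and value:
                instance[column] = value
        return instance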

Bulk Upload XML data to Google App Engine Using the YAML config method

Hi, I would like to bulk load the WURFL database (http://wurfl.sourceforge.net/) and I am not quite sure how to structure my data entities, nor how to write the transforms for the bulkloader config file. Here is a sample of a node from the WURFL file:

    <device id="generic" user_agent="" fall_back="root">
      <group id="product...
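
Since the YAML bulkloader wants flat records, one hedged approach is to preprocess the WURFL XML into CSV rows (device id, user_agent, fall_back, plus whatever capability columns you need) and load those. A small parsing sketch with ElementTree, assuming the file is named wurfl.xml:

    import csv
    import xml.etree.ElementTree as ET

    tree = ET.parse("wurfl.xml")

    with open("devices.csv", "wb") as f:
        writer = csv.writer(f)
        writer.writerow(["device_id", "user_agent", "fall_back"])
        # Each <device> element carries id / user_agent / fall_back attributes;
        # nested <group>/<capability> nodes could be flattened into extra columns.
        for device in tree.findall(".//device"):
            writer.writerow([device.get("id"),
                             device.get("user_agent"),
                             device.get("fall_back")])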

Problem during SQL Bulk Load

Hi there, we've got a really confusing problem. We're trying to test an SQL Bulk Load using a little app we've written that passes in the XML data file, the schema, and the SQL database connection string. It's very simple (the program is five lines, more or less), but we're getting the following error from the library we're passing this stuf...

Unindexed property using bulk loader for App Engine

How do I specify that a property should not be indexed using the bulk loader yaml definition?

    transformers:
    - kind: SomeEntity
      connector: csv
      property_map:
        - property: prop
          external_name: prop
          export_transform: int
        - property: prop_unindexed
          external_name: prop_unindexed
          export_transform: int
          # ... what goe...

Bulk loading into PostgreSQL from a remote client

I need to bulk load a large file into PostgreSQL. I would normally use the COPY command, but this file needs to be loaded from a remote client machine. With MSSQL, I can install the local tools and use bcp.exe on the client to connect to the server. Is there an equivalent way for PostgreSQL? If not, what is the recommended way of loadin...
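
For completeness, besides psql's \copy (which runs COPY ... FROM STDIN from the client side), the same protocol is exposed by client libraries; a minimal psycopg2 sketch, with connection details, table name, and file name as placeholders:

    import psycopg2

    conn = psycopg2.connect(host="db.example.com", dbname="warehouse",
                            user="loader", password="secret")
    cur = conn.cursor()

    # COPY ... FROM STDIN streams the local file over the existing connection,
    # so the file never has to exist on the server.
    with open("bigfile.csv", "r") as f:
        cur.copy_expert("COPY events FROM STDIN WITH CSV", f)

    conn.commit()
    conn.close()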

AppEngine Bulk Upload List Property

Hi! I have a model with a list property, and a CSV in which each list value looks like this: [u'1234567']. Each list has only one item. My bulkloader.yaml is configured with import_transform: transform.none_if_empty(list). It uploads the above list property as [u'[', u'u', u"'", u'1', u'2', u'3', u'4', u'5', u'6', u'7', u"'", ...
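
Calling list() on the raw cell turns the string into its individual characters, which matches the symptom above. A hedged fix is to point import_transform at a helper that parses the Python-repr string back into a real list; the module and function names here are made up for the sketch:

    # bulkloader.yaml would reference this as, e.g.:
    #   import_transform: my_transforms.string_to_list
    # with my_transforms made importable via python_preamble.

    import ast

    def string_to_list(value):
        # "[u'1234567']" -> [u'1234567']; empty cells become an empty list.
        if not value:
            return []
        return ast.literal_eval(value)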

App Engine Bulk Loader Performance

I am using the App Engine bulk loader (Python runtime) to bulk upload entities to the datastore. The data that I am uploading is stored in a proprietary format, so I have implemented my own connector (registered in bulkload_config.py) to convert it to the intermediate Python dictionary:

    import google.appengine.ext.bulkload
    import con...
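
Throughput is usually governed less by the connector than by the upload throttle. As a sketch only (the flag names are the upload_data throttling options as I recall them from that SDK generation, and the values are illustrative), the limits can be raised when invoking appcfg.py:

    import subprocess

    # Illustrative throttle settings; tune against quota and error rates.
    subprocess.check_call([
        "appcfg.py", "upload_data",
        "--config_file=bulkload_config.py",
        "--filename=data.dat",
        "--kind=MyEntity",
        "--num_threads=15",
        "--batch_size=50",
        "--rps_limit=500",
        "--bandwidth_limit=2500000",
        "--url=http://myapp.appspot.com/_ah/remote_api",
        ".",   # application directory
    ])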

Most efficient way of bulk loading an unnormalized dataset into PostgreSQL?

I have loaded a huge CSV dataset -- Eclipse's Filtered Usage Data -- using PostgreSQL's COPY, and it's taking a huge amount of space because it's not normalized: three of the TEXT columns would be much more efficiently refactored into separate tables, referenced from the main table with foreign key columns. My question is: is it faster to ...
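
Whichever turns out faster, the in-database route is straightforward: COPY into a staging table, build each dimension table with INSERT ... SELECT DISTINCT, then join back to fill the foreign keys. A hedged sketch via psycopg2, with invented table and column names (staging, bundle, bundle_name):

    import psycopg2

    conn = psycopg2.connect(dbname="usage", user="loader")
    cur = conn.cursor()

    # 1. Build a dimension table from one of the repetitive TEXT columns.
    cur.execute("""
        CREATE TABLE bundle (
            id   serial PRIMARY KEY,
            name text UNIQUE NOT NULL
        )
    """)
    cur.execute("INSERT INTO bundle (name) SELECT DISTINCT bundle_name FROM staging")

    # 2. Add the foreign key column and fill it by joining on the text value.
    cur.execute("ALTER TABLE staging ADD COLUMN bundle_id integer REFERENCES bundle(id)")
    cur.execute("""
        UPDATE staging
        SET bundle_id = bundle.id
        FROM bundle
        WHERE staging.bundle_name = bundle.name
    """)

    # 3. The original TEXT column can then be dropped.
    cur.execute("ALTER TABLE staging DROP COLUMN bundle_name")
    conn.commit()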