I have written this program to connect to a database and fetch data into a file, but it is very slow at fetching. Is there any way to improve the performance, or a faster way to load the data into the file? I am targeting around 100,000 to a million records, which is why I am worried about performance. Also, can I use array fetch size and batch size, as we can in Java?

import java.sql as sql
import java.lang as lang

def main():
    driver, url, user, passwd = ('oracle.jdbc.driver.OracleDriver','jdbc:oracle:thin:@localhost:1521:xe','odi_temp','odi_temp')
    ##### Register Driver
    lang.Class.forName(driver)
    ##### Create a Connection Object
    myCon = sql.DriverManager.getConnection(url, user, passwd)
    f = open('c:/test_porgram.txt', 'w')
    try:
        ##### Create a Statement
        myStmt = myCon.createStatement()
        ##### Run a Select Query and get a Result Set
        myRs = myStmt.executeQuery("select emp_id ,first_name,last_name,date_of_join from src_sales_12")
        ##### Loop over the Result Set and print the result in a file
        while (myRs.next()):
            print >> f , "%s,%s,%s,%s" %(myRs.getString("EMP_ID"),myRs.getString("FIRST_NAME"),myRs.getString("LAST_NAME"),myRs.getString("DATE_OF_JOIN") )
    finally:
        myCon.close()
        f.close()

### Entry Point of the program
if __name__ == '__main__':
    main()
A: 

Can't you just use the Oracle command-line SQL client to directly export the results of that query into a CSV file?

Jonathan Feinberg
Well, I am trying to make a universal program, so that I can change the database connection and driver and it will still work for any database.
kdev
In other words, "write once, run everywhere, just very slowly".
APC
@APC: Well Jython *is* Python on top of Java.
David
A: 

You might use getString with hardcoded indices instead of the column name (in your print statement) so the program doesn't have to look up the names over and over. Also, I don't know enough about Jython/Python file output to say whether this is enabled by default or not, but you should try to make sure your output is buffered.

EDIT:

Code requested (I make no claims about the correctness of this code):

print >> f , "%s,%s,%s,%s" %(myRs.getString(1),myRs.getString(2),myRs.getString(3),myRs.getString(4) )

or

myRs = myStmt.executeQuery("select emp_id ,first_name,last_name,date_of_join from src_sales_12")
hasFirst = myRs.next()
if (hasFirst):
    empIdIdx = myRs.findColumn("EMP_ID")
    fNameIdx = myRs.findColumn("FIRST_NAME")
    lNameIdx = myRs.findColumn("LAST_NAME")
    dojIdx = myRs.findColumn("DATE_OF_JOIN")
    print >> f , "%s,%s,%s,%s" %(myRs.getString(empIdIdx),myRs.getString(fNameIdx),myRs.getString(lNameIdx),myRs.getString(dojIdx) )
    ##### Loop over the Result Set and print the result in a file
    while (myRs.next()):
        print >> f , "%s,%s,%s,%s" %(myRs.getString(empIdIdx),myRs.getString(fNameIdx),myRs.getString(lNameIdx),myRs.getString(dojIdx) )
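
The question also asks about fetch size, which the code above does not touch. The following is a minimal sketch, not a tested Jython program: it wraps the loop in a hypothetical helper, dump_query, that sets a fetch-size hint with the standard JDBC Statement.setFetchSize call and reads columns by 1-based index. The helper name, the default of 500 rows, and the ncols parameter are assumptions made for illustration; the driver is free to ignore the hint, and the best value has to be found by measurement.

```python
def dump_query(stmt, query, out, fetch_size=500, ncols=4):
    """Write every row of `query` to `out` as comma-separated text.

    `stmt` is expected to behave like a java.sql.Statement and `out`
    like an open file. Returns the number of rows written.
    """
    # Hint to the driver how many rows to pull per round trip.
    stmt.setFetchSize(fetch_size)
    rs = stmt.executeQuery(query)
    cols = range(1, ncols + 1)  # JDBC column indices start at 1, not 0
    count = 0
    try:
        while rs.next():
            # Same "%s" formatting as the original loop (NULL becomes "None").
            fields = ["%s" % rs.getString(i) for i in cols]
            out.write(",".join(fields) + "\n")
            count += 1
    finally:
        rs.close()
    return count
```

In the original program this would be called as dump_query(myCon.createStatement(), "select emp_id ,first_name,last_name,date_of_join from src_sales_12", f). For output buffering, the third argument of open is the buffer size, so open('c:/test_porgram.txt', 'w', 65536) requests a 64 KB buffer.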
Phil
Can you please show me an example of how to use getString with hardcoded indices?
kdev
Thanks to all of you for your help
kdev
A: 

If you just want to fetch data into files, you can try database tools (for example, "load" or "export").

xjn
A: 

Unless you're on the finest, finest gear for the DB and file server, or the worst gear running the script, this application is I/O bound. After the select has returned from the DB, the actual movement of the data will dominate more than any inefficiencies in Jython, Java, or this code.

Your CPU is basically unconscious during this process; you're simply not doing enough data transformation. You could write a process that is slower than the I/O, but this isn't one of them.

You could write this in C and I doubt you'd see a substantial difference.
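
One way to check this claim on a given setup is to time the two halves of the loop separately. The sketch below is only an illustration under assumptions: timed_dump is a hypothetical helper, rs is any object with the next() and getString(index) methods of a JDBC result set, and per-row calls to time.time() add overhead and have coarse resolution on some platforms, so the totals are rough.

```python
import time

def timed_dump(rs, out, ncols=4):
    """Copy rows from `rs` to `out`.

    Returns (rows, fetch_seconds, write_seconds). "Fetch" covers
    rs.next(), the getString calls and the string formatting;
    "write" covers only out.write().
    """
    cols = range(1, ncols + 1)  # JDBC column indices start at 1
    fetch_s = 0.0
    write_s = 0.0
    rows = 0
    while True:
        t0 = time.time()
        if not rs.next():
            fetch_s += time.time() - t0
            break
        line = ",".join(["%s" % rs.getString(i) for i in cols]) + "\n"
        t1 = time.time()
        out.write(line)
        t2 = time.time()
        fetch_s += t1 - t0
        write_s += t2 - t1
        rows += 1
    return rows, fetch_s, write_s
```

If nearly all of the time lands in the fetch half, the database round trips are the bottleneck and tuning the file output will not help much.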

Will Hartung