I had a nice and hacky Perl script to automatically scrape and download sales report files from iTunes Connect. As of today, Apple has overhauled the sales report site. It looks a lot nicer, but it relies heavily on JavaScript, so simple scraping isn't going to work any more.

So, does anybody know of a way to scrape this new site effectively?

Some previous questions point to various scripts and online services. I presume they are all broken now as well. If you know of one that is still functional, please let me know.

+1  A: 

Try the free iMacros Firefox add-on. It has extensive web scraping support, and since it runs inside the browser, it can handle JavaScript. You can also start it from the command line.

Edit: This does indeed work. Here's a macro for downloading the past 3 days of sales reports. I haven't yet tried integrating with command line tools, but it should work.

VERSION BUILD=6650406 RECORDER=FX
TAB T=1
URL GOTO=https://itunesconnect.apple.com/
TAG POS=1 TYPE=INPUT:IMAGE FORM=NAME:appleConnectForm ATTR=NAME:1.Continue&&SRC:https://itunesconnect.apple.com/AppleConnect/US-EN/labelconnect/btn_signin.png
TAG POS=1 TYPE=B ATTR=TXT:Sales<SP>and<SP>Trends
TAG POS=1 TYPE=A ATTR=ID:theForm:saletestid
TAG POS=1 TYPE=SELECT FORM=NAME:theForm ATTR=ID:theForm:datePickerSourceSelectElementSales CONTENT=1
TAG POS=1 TYPE=A ATTR=ID:theForm:downloadLabel2
ONDOWNLOAD FOLDER=~/Downloads/iTCSales/ FILE=Daily-{{!NOW:yyyymmdd}}-1.txt.gz WAIT=YES
TAG POS=1 TYPE=SELECT FORM=NAME:theForm ATTR=ID:theForm:datePickerSourceSelectElementSales CONTENT=2
TAG POS=1 TYPE=A ATTR=ID:theForm:downloadLabel2
ONDOWNLOAD FOLDER=~/Downloads/iTCSales FILE=Daily-{{!NOW:yyyymmdd}}-2.txt.gz WAIT=YES
TAG POS=1 TYPE=SELECT FORM=NAME:theForm ATTR=ID:theForm:datePickerSourceSelectElementSales CONTENT=3
TAG POS=1 TYPE=A ATTR=ID:theForm:downloadLabel2
ONDOWNLOAD FOLDER=~/Downloads/iTCSales FILE=Daily-{{!NOW:yyyymmdd}}-3.txt.gz WAIT=YES
TAG POS=1 TYPE=DIV ATTR=TXT:Done
TAG POS=1 TYPE=DIV ATTR=TXT:Done
TAG POS=1 TYPE=INPUT:SUBMIT FORM=NAME:signOutForm ATTR=VALUE:Sign<SP>Out
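
The three download blocks in the macro above differ only in the CONTENT index and the file name suffix, so they could be generated rather than copied by hand. The following is a minimal sketch, not part of the recorded macro: the function names `daily_block` and `download_steps` are hypothetical, and it assumes that CONTENT=n keeps selecting the nth entry of the date picker exactly as it does in the recording. Whether the site offers more than the three entries used above is not something the recording confirms.

```python
# Element IDs copied verbatim from the recorded macro.
SELECT_ID = "theForm:datePickerSourceSelectElementSales"
DOWNLOAD_ID = "theForm:downloadLabel2"


def daily_block(index, folder="~/Downloads/iTCSales"):
    """Return the three macro lines for one date-picker entry.

    The line order mirrors the recorded macro: pick the date,
    click the download link, then the ONDOWNLOAD line.
    (Hypothetical helper, assumed for illustration.)
    """
    return [
        "TAG POS=1 TYPE=SELECT FORM=NAME:theForm ATTR=ID:"
        + SELECT_ID + " CONTENT=" + str(index),
        "TAG POS=1 TYPE=A ATTR=ID:" + DOWNLOAD_ID,
        "ONDOWNLOAD FOLDER=" + folder
        + " FILE=Daily-{{!NOW:yyyymmdd}}-" + str(index)
        + ".txt.gz WAIT=YES",
    ]


def download_steps(days=3, folder="~/Downloads/iTCSales"):
    """Concatenate the per-day blocks for entries 1..days."""
    steps = []
    for index in range(1, days + 1):
        steps.extend(daily_block(index, folder))
    return steps


if __name__ == "__main__":
    # Print the generated lines so they can be pasted between the
    # "Sales and Trends" navigation steps and the sign-out steps.
    print("\n".join(download_steps(3)))
```

The output only covers the repeated middle section; the sign-in, navigation, and sign-out lines still have to come from the recorded macro.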
SamMeiers
I don't normally use Windows so I'd hate to keep a virtual machine with Windows and Firefox running just for this. I see there's a Chrome version so I'll see if that is cross-platform.
Daniel Dickison
Never mind my previous comment -- iMacros does work on Mac OS X. I think I must've been looking at the IE plugin's system requirements page.
Daniel Dickison
A: 

Let us know if it actually works with the new format Apple has rolled out.

Noam
A: 

http://twitter.com/viva/status/24133713255

Just need to sift through the AJAX cruft, if that's even possible. Very annoyed with Apple at the moment.

fleas
+2  A: 

http://code.google.com/p/appdailysales/ was just updated to support the Sept 2010 iTunes Connect changes.

Hafthor
This works perfectly. Very nice work. Thanks!
Daniel Dickison
To be clear, this is not my work. It is http://stackoverflow.com/users/245020/kirby-t
Hafthor
Is it possible that Apple has changed things again in the last 24 hours?! I just tried the script v2.0.1 and it's not working (error on line 245: list index out of range).
Brian
Figured it out - there's an error on line 244 (change "value1" to "value").
Brian