Hi, all.

Has anyone used a REST-based approach for ETL / data warehousing operations? In other words, invoking ETL and OLAP / database refresh jobs through REST web service calls:

e.g. PUT http://company.com/cube/123523 (to refresh a specific OLAP cube with new data) or POST http://company.com/view/patients/123123 (to create a new database view for patients)
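To make the two example calls concrete, here is a minimal Python sketch that only builds the requests and does not send them. The host and paths come from the examples above; the function names and the empty request body are assumptions for illustration, since the question does not say what payload (if any) such a service would expect.

```python
from urllib.request import Request

# Host taken from the example URLs in the question.
BASE_URL = "http://company.com"


def refresh_cube_request(cube_id: int) -> Request:
    """Build (but do not send) a PUT asking for an OLAP cube refresh."""
    # Empty body is an assumption; a real service might expect parameters.
    return Request(f"{BASE_URL}/cube/{cube_id}", data=b"", method="PUT")


def create_patient_view_request(view_id: int) -> Request:
    """Build (but do not send) a POST asking for a new patients view."""
    return Request(f"{BASE_URL}/view/patients/{view_id}", data=b"", method="POST")


if __name__ == "__main__":
    for req in (refresh_cube_request(123523), create_patient_view_request(123123)):
        print(req.get_method(), req.full_url)
```

Passing either object to `urllib.request.urlopen` would actually issue the call.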

It seems to me that REST is a very suitable and clean architectural style for modeling such monthly tasks.

+1  A: 

ETL is all about inserting rows into a database very, very fast (or sometimes, very, very flexibly when the data is a bit dicey and requires automated cleanup).

REST means using all of HTTP, so using all the verbs and generally a text-based, Unicode way of life.

HTTP as a protocol isn't very fast. It isn't binary (although I suppose you can have a binary payload).

ETL problems are really looking for solutions that depend on the data source. Does your data source have a native, binary protocol? Use that; it is usually the fastest.

All that said, there are data sources that are locked behind port 80. Things like Microsoft's ADO.NET Data Services (Astoria) are already working out the details of a REST-based data access API. I'd be surprised if it is high performance, but it certainly seems like it would be highly flexible.

MatthewMartin
Thanks for the response. I'm actually looking at REST in terms of invoking those ETL tasks, not implementing them. The ETL processes are PL/SQL scripts and packages; my approach is to implement the workflow of executing the sequence of ETL scripts via REST calls. Are there products that already do this?
wsb3383
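The workflow idea in the comment above (run a sequence of ETL steps via REST calls) could be sketched roughly as follows. This is only an illustration under stated assumptions: the step URLs, the empty request bodies, and the "stop at the first non-2xx response" rule are all hypothetical, and nothing here reflects an existing product.

```python
from typing import Callable, List, Tuple
from urllib.error import HTTPError
from urllib.request import Request, urlopen

Step = Tuple[str, str]  # (HTTP method, URL)


def send_http(req: Request) -> int:
    """Send a request with urllib and return the HTTP status code."""
    try:
        with urlopen(req) as resp:
            return resp.status
    except HTTPError as err:
        # urlopen raises on 4xx/5xx; report the code instead.
        return err.code


def run_workflow(
    steps: List[Step], send: Callable[[Request], int] = send_http
) -> List[Tuple[str, int]]:
    """Invoke each step in order; stop after the first non-2xx status."""
    results = []
    for method, url in steps:
        # Empty body is an assumption for this sketch.
        status = send(Request(url, data=b"", method=method))
        results.append((url, status))
        if not 200 <= status < 300:
            break  # later steps presumably depend on this one
    return results


if __name__ == "__main__":
    # Hypothetical step list reusing the URLs from the question.
    steps = [
        ("PUT", "http://company.com/cube/123523"),
        ("POST", "http://company.com/view/patients/123123"),
    ]
    # Stand-in sender so the demo runs without a real server.
    print(run_workflow(steps, send=lambda req: 200))
```

Passing the sender in as a parameter keeps the ordering logic separate from the transport, so the sequence can be exercised without a live endpoint.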
In that case, the fact that you are kicking off an ETL task is irrelevant. If you are buying a third-party UI for ETL tools, then I'm not sure why it would matter how it was implemented. REST is usually interesting because it is a fairly easy API compared to COM, COM+, CORBA, or even Web Services, so programming against a REST API is a much smaller project.
MatthewMartin