Thursday, June 6, 2013

Run ETL with Pentaho Data Integration using package files

Kettle has a small but handy feature for running ETL: it uses Apache VFS, which lets you access a set of files inside an archive and use them directly in your processes. You can also use this to execute ETL stored somewhere on the web.
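To see what that archive access amounts to, here is a minimal sketch in Python (illustrative only, not Kettle or Apache VFS code; all names are made up for the example): it reads one entry straight out of a zip without unpacking it first, which is essentially what a VFS `zip:` URI asks the engine to do.

```python
# Read an entry directly out of a zip archive without extracting it,
# analogous to resolving a VFS "zip:<archive>!/<entry>" URI.
import io
import zipfile

def read_zip_entry(archive_bytes: bytes, entry: str) -> bytes:
    """Return the contents of one entry from a zip held in memory."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return zf.read(entry)

# Build a tiny archive in memory, then pull a single entry back out.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("job.kjb", "<job/>")
print(read_zip_entry(buf.getvalue(), "job.kjb"))  # b'<job/>'
```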


Run from the file system

I created this little sample (a job that executes a transformation) and compressed the two files into a zip archive.
So I have the zip file at this path on my own computer: C:\Users\latinojoel\Desktop\sample.zip
The command line to execute the ETL looks like this:
~\data-integration>Kitchen.bat -file="zip:C:\Users\latinojoel\Desktop\sample.zip!/job.kjb" -level=Detailed -param:MSG="Wow, it works. Very funny! :-)"
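If you want to build a package like this yourself, any zip tool works; here is a quick scripted way (illustrative Python; the entry names match the sample described in this post, but the XML bodies are placeholders, not real Kettle files):

```python
# Package a job and its transformation into one archive, like the
# sample.zip used above. In practice you would add the real .kjb/.ktr
# files saved from Spoon instead of these placeholder strings.
import zipfile

with zipfile.ZipFile("sample.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("job.kjb", "<job>...</job>")
    zf.writestr("transf.ktr", "<transformation>...</transformation>")

print(zipfile.ZipFile("sample.zip").namelist())  # ['job.kjb', 'transf.ktr']
```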


Run from a web resource

With Apache VFS, you can also run a zip file straight from the web.
For example, my sample zip file is reachable at the URI used below.
The command line you need is:
~\data-integration>Kitchen.bat -file="zip:https://dl.dropboxusercontent.com/u/54031846/sample.zip!/job.kjb" -level=Detailed -param:MSG="Wow, it works. Very funny! :-)"
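The shape of these URIs is the same in both cases: everything between `zip:` and the `!` is where the archive lives (a local path or an HTTP URL), and everything after `!/` is the path inside it. A tiny parser makes the split explicit (illustrative Python only; real Apache VFS also handles nested archives and escaping):

```python
# Split a VFS-style "zip:<archive>!/<inner-path>" URI into its two parts.
def split_vfs_zip_uri(uri: str):
    if not uri.startswith("zip:"):
        raise ValueError("not a zip: URI")
    archive, _, inner = uri[len("zip:"):].partition("!")
    return archive, inner.lstrip("/")

print(split_vfs_zip_uri(
    "zip:https://dl.dropboxusercontent.com/u/54031846/sample.zip!/job.kjb"))
# ('https://dl.dropboxusercontent.com/u/54031846/sample.zip', 'job.kjb')
```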


The same works with Pan. A sample:
~\data-integration>Pan.bat -file="zip:https://dl.dropboxusercontent.com/u/54031846/sample.zip!/transf.ktr" -level=Detailed -param:MSG="Wow, it works. Very funny! :-)"

2 comments:

  1. What do you use for Data Integration? I have been working with a few different tools.

    1. Hi. I'm using Pentaho Data Integration, aka Kettle. The link is here: http://sourceforge.net/projects/pentaho/files/Data%20Integration/