Thursday, March 20, 2014

Read Microsoft Excel Document As Relational Table Using Teiid

You may be wondering who, in this age of big data with Hadoop, MongoDB, et al., still bothers to fiddle around with Excel. In reality, there are still a lot of corporate users out there who do their analytics and reporting in Excel, so it remains a very important source of data.

From its very early versions, Teiid has supported consuming data from Microsoft Excel documents. However, it always relied on the "Excel" ODBC driver that comes built into Windows platforms; Teiid then used the "jdbc-odbc" bridge driver provided by Oracle's JDK to read the data in. Although this solution worked most of the time, it was tedious, prone to issues, and limited access to the Windows platform.

In the latest version, Teiid 8.7, a new translator has been introduced that is based on the Apache POI framework, which provides a platform-independent way to read Excel documents. This turned out to be much simpler, and it also provides a way to define the metadata for the data in Excel documents.

If you are interested, I wrote a step-by-step example for this here: https://community.jboss.org/wiki/MicrosoftExcelDocumentIntoRelationalTable

Ramesh..

1 comment:

  1. great! this will help a lot. great also for replacing odbc method... sometimes it's necessary (sometimes temporarily... for months) use xls spreadsheets to speed up things, while a proper app is developed. At least for us. Thanks for caring :)
