Accessing data from HDP using the ODBC driver on Microsoft Excel

9:55 PM


I am a newbie to Hadoop programming and in order to play around with it, I set up Hortonworks, and what better way than trying out the examples given in the Hortonworks website? Easy as it seems…

But,

To be honest, the (few) tutorials on the examples are a huge pain in the ***. Somehow trying high and low, I was able to get data into HCatalog (Don’t ask me how! I may write another article on it sometime soon though, before I forget!). But getting the data to visualize in Excel, as the tutorial stated seemed to be a problematic task.

So here it is guys, how I fixed problems with getting the Hortonworks ODBC driver to work on Windows. This is based on top of this tutorial titled “How To Refine and Visualize Sentiment Data”.

My specs are :


  • Hortonworks 2.1 
  • Windows 8 64 bit

Steps:
 
1. Select the ODBC driver from your HDP platform version addons page. Click this link for HDP 2.1.
Download the 64 bit driver as shows below




2. Run the installer once it is downloaded and as asked for, restart your machine (without a restart functions may not work properly).

3. Go to control panel > Administrative tools > ODBC Data Sources (64 Bit) 





5. On System DN, click on the “Sample Hortonworks Hive DSN” and click on Configure and provide the values as I have.   




Note : The host may change depending on your settings. But the easiest way to find the host is by ssh-ing or logging into your vm and running an ifgconfig as shown below. 




5.Test it. You should be able to get a success message if you have correctly specified your host and port. 

6. Now open Excel and click on Data sources Microsoft Query


7. Choose 'Sample Hortonworks DSN' and click OK



8. Choose your table and the necessary columns. Click next for the next few screens and then click on Finish



9. At this point, chances are that you will end up with an error like this:

ERROR [HY000] [Microsoft][HiveODBC] (35) Error from Hive: error code: '40000' error message: 'Error while compiling statement: FAILED: HiveAccessControlException Permission denied. Principal [name=hue, type=USER] does not have following privileges on Object [type=TABLE_OR_VIEW, name=default.tweetsbi] : [SELECT]'. (Microsoft Hive ODBC Driver)



This is because user "hue" does not have the necessary grants on the tables. We will now give permission to the user.

10. On the sandbox (or you can ssh into the vm), execute the following

hive
grant SELECT on table tweetsbi to user hue;



11. Now repeat step 9 and successfully import your data to excel. 


I could go further and show you how to populate this data using the Power View, but its straightforward and you can find it in the existing tutorials online. 

Hope, this helps at least some of you to get your way through this. 

Until later, cheers!

You Might Also Like

2 comments