Let’s say somehow you already built out your big data platform–it could be Hortonwroks sandbox, Hortonworks on windows , claudera sandbox or HDinsights. Now, what is next?
Usually, you can challenge yourself to push data into this big data platform(HDFS) and fetch them out. This should be good start for you to understand this ECO system.
Say data in first. The easiest way is to use the Hadoop Shell: “CopyFromLocal”, this command can help you to copy the raw files from your local to your HDFS (Big Data File System). Or the other option you have is to use “Sqoop”. Sqoop can help Put the data from Relational Databases into HDFS.
Sqoop tutorial: https://sqoop.apache.org/docs/1.4.1-incubating/SqoopUserGuide.html
Now, you got data in your Big Data system esp. in HDFS. How to present these data out? well, in this market, the only tool I know is Excel Power Query, which is a free add-on for excel. Once you install this add-on, you have the option to connect HDFS to fetch the data. More details, please check out this post. http://joeydantoni.com/2013/08/15/power-query-and-hadoop-2/
Keep tune, I am back for more big data discussions.