Tuesday, December 2, 2014

Pre-requisites for getting started with Big Data Technologies

Many of my friends who have heard about the Big Data world, or who are interested in getting into it, ask the same question: what are the prerequisites for learning, or starting to dig into, Big Data technologies? And which technologies come under Big Data?

Well, this is quite a difficult question to answer, because there is no clear line around what falls under the Big Data umbrella. But one thing is for sure: Big Data is not only about Hadoop, a misconception many of us have.

Hadoop is just one framework used in Big Data. Yes, it is used quite a lot, and I would say it is an integral part of Big Data. But besides Hadoop, there are tons of tools and technologies that come under the same umbrella. To name a few, we have:

  • Cassandra
  • HBase
  • MongoDB
  • CouchDB
  • Accumulo
  • HCatalog
  • Hive
  • Sqoop
  • Flume

and many more!

OK, now if we look at the NoSQL databases (non-relational databases, often used for unstructured or semi-structured data in Big Data) and the other tools mentioned above, most of them, including Hadoop, are written in Java, with a few exceptions such as MongoDB (C++) and CouchDB (Erlang). So as a programmer, if you want to dig into the architecture and APIs in depth, Core Java is the recommended programming language; it will help you grasp the technology better and more efficiently.

Now, when I say that Core Java is recommended, that doesn't imply that people who don't know Java have no scope here, because Big Data is all about managing data more efficiently and more intelligently.

So people who have knowledge of data warehousing have an advantage here. Managing large amounts of data and working with its volume, velocity, variety, and complexity is the work of a Big Data Scientist.

Apart from those with a data warehousing background, people with experience in Machine Learning, Theory of Computation, and Sentiment Analysis are contributing a lot to this world.

So it would be unfair to say who can and who cannot work in Big Data technology. It's an emerging field where most of us can try our hand and contribute to its growth and development. And yes, the most important thing, and what I like most about being in Big Data, is that most of the tools are open source. So I can play around with the source code :)

