Tuesday, February 17, 2015

Things to know before Big Data & Machine Learning

When I started with Big Data, I started with Hadoop.
When I started with Machine Learning, I started with Linear Regression.

But over time I realized that this wasn't the best way to start.
Why?

Because I missed the core of the technologies. I managed to finish the job, but that doesn't mean I did it right. I missed the gap between learning a technology and really knowing it; I missed the fundamentals behind the specifics.

So I personally recommend covering the following before Big Data and Machine Learning.

Don't get confused: Big Data and Machine Learning are two different things, but they need each other. You will see that once you work with them :-)

Things to do before Map-Reduce -

  • First, understand the data structures behind Map-Reduce.
  • Write your own implementation of Map-Reduce without using any framework like Hadoop or Spark. It is very easy.
  • Refresh graph traversal techniques like DFS and BFS.
  • Explore some basic algorithms, including dynamic programming and greedy ones, such as Knapsack, LCS, Floyd-Warshall, KMP, etc.
  • SQL and RDBMS, or more precisely data modeling and relational algebra.
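
To give an idea of how small a framework-free Map-Reduce can be, here is a minimal sketch in plain Python. It is only an illustration, not how Hadoop or Spark work internally: everything runs in one process, and the names map_phase, shuffle_phase, reduce_phase and the word-count mapper and reducer are made up for this example.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Run the mapper on every input record and collect all (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle_phase(pairs):
    """Group values by key, which is what happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Run the reducer once per key on the list of values for that key."""
    return {key: reducer(key, values) for key, values in groups.items()}


def map_reduce(records, mapper, reducer):
    """Chain the three phases: map, then shuffle, then reduce."""
    return reduce_phase(shuffle_phase(map_phase(records, mapper)), reducer)


# Hypothetical word-count job, the usual first Map-Reduce example.
def word_count_mapper(line):
    # Emit (word, 1) for every whitespace-separated word in the line.
    return [(word.lower(), 1) for word in line.split()]


def word_count_reducer(word, counts):
    # Sum all the 1s emitted for this word.
    return sum(counts)


if __name__ == "__main__":
    lines = ["big data needs basics", "machine learning needs basics"]
    print(map_reduce(lines, word_count_mapper, word_count_reducer))
```

The point to notice is the data structure: the mapper emits key-value pairs, the shuffle step turns them into a key mapped to a list of values, and the reducer collapses each list into one result. A real framework distributes these same three steps across machines.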

Things to do before Machine Learning -

  • Mathematics
  • Vectors and scalars
  • Matrix multiplication, addition and other basic operations
  • Linear formulation
  • Probability: conditional probability and independence
  • Probability distributions
  • Basics of permutations and combinations
  • Hypothesis
  • Very basic statistics like mean, median, standard deviation and variance
  • Regression
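
To show how a few of these basics fit together, here is a small sketch in plain Python. It assumes the simplest case, a straight line fitted to one input variable by ordinary least squares; the function name fit_line and the sample numbers are invented for illustration only.

```python
from statistics import mean, median, pstdev


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares.

    slope = sum((x - x_bar) * (y - y_bar)) / sum((x - x_bar) ** 2)
    intercept = y_bar - slope * x_bar
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


if __name__ == "__main__":
    # Made-up sample data, only to exercise the functions.
    values = [2, 4, 4, 4, 5, 5, 7, 9]
    print("mean:", mean(values))
    print("median:", median(values))
    print("population standard deviation:", pstdev(values))

    xs = [1, 2, 3, 4]
    ys = [3, 5, 7, 9]  # exactly y = 2x + 1
    slope, intercept = fit_line(xs, ys)
    print("slope:", slope, "intercept:", intercept)
```

Notice that the regression formula uses nothing beyond means and sums of products, which is why refreshing basic statistics first makes Linear Regression much less of a black box.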

OK, this seems like a lot to do before you even start... Well, in practice it's not.
Everything is either from high school or college, so ideally you just need to refresh your memory, and that will actually make it exciting to get started.

Warning: This list is going to grow further :-)