A common mistake among data science trainees is that they focus too much on the tools rather than on the concepts, which are evergreen. They usually ask, "Which tool should I learn first for data science, R or Python?"
You will have seen in a few of my posts that I strongly advocate that trainees build a project portfolio. Indeed, you will need to be familiar with at least one tool to bring your project to fruition.
But what I have noticed is that tools come and go. For instance, from 2014 to 2018, two of the biggest buzzwords in Data Science tools (or shall I call them Big Data tools) were Hadoop and Spark. Prominent companies offering tools in this space included Hortonworks, MapR, and Cloudera. Fast forward to now: Cloudera and Hortonworks have merged, and MapR has been acquired by Hewlett Packard Enterprise. Not many people in the community are talking about them anymore. The buzzword now is cloud computing, with Google Cloud Platform, Microsoft Azure and Amazon Web Services currently dominating the market.
Think about it: as a field develops, its community of professionals grows. That growth leads to increasing demand for better tools, and increased demand leads to more innovation in tools. Economics 101 in action. So tools come and go, and especially in technology, the lifecycle of a tool can be very short, less than a decade, as seen with Hadoop and Spark. We are in a state of flux. If our learning is too focused on the tools, we might become obsolete.
I am of the opinion that, at the start of your learning journey, you should focus on the concepts. What are they? Mathematics, statistics, and business applications. Below are a few posts you can read to build up the necessary concepts.
Concepts seldom change; they are very stable unless proven otherwise. Take, for instance, having a good understanding of gradient descent and how it works. Personally, I aim to strengthen my mathematics background so that I can digest the formulas and equations, understand the corollaries and lemmas, and see what they represent in business applications.
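To make the gradient descent example concrete, here is a minimal sketch in plain Python, with no particular library in mind. The objective f(x) = (x - 3)^2, the function names, the learning rate and the number of steps are all my own assumptions for illustration. The idea itself, stepping repeatedly against the gradient, is the part that stays the same whichever tool you use.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Move x against the gradient a fixed number of times, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


# Hypothetical objective for illustration: f(x) = (x - 3)^2.
# Its gradient is 2 * (x - 3), and its minimum is at x = 3.
def grad_f(x):
    return 2 * (x - 3)


minimum = gradient_descent(grad_f, x0=0.0)
print(minimum)  # very close to 3
```

If you understand why each step shrinks the distance to the minimum, you can carry that understanding to any library that implements an optimizer for you.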
Implementation follows once you have acquired the concepts and knowledge required. Implementation is important because it brings another level of understanding of the concepts learned. Moreover, knowing the concepts helps you ask more specific questions about the tools, rather than learning many features of a tool without seeing their application. In this way, your learning of the tool becomes more focused. Below is a simple diagram to illustrate your learning flow.
Another suggestion is not to focus on a single tool. If you can, try to implement the concepts in different tools; this will make you more versatile and give you knowledge of the pros and cons of each tool.
At the end of the day, remember that tools can become obsolete but concepts are evergreen. Data scientists have limited time for work and learning, so we need to use our time efficiently to create the biggest impact on our careers.
Thanks for your kind support in reading this far. Do consider subscribing to my newsletter. I wish you all the best in your Data Science journey! If the article is useful, do share it with your friends and consider giving me a shoutout on LinkedIn or Twitter. :)