In my previous blog posts, I wrote about the knowledge and skills that a Data Scientist or AI Scientist should have. Here is the post if you missed it.
I was recently asked a related question by members of the interest group Data Science Bangkok: "What is the learning path for a high school student if he or she wants to become a Data Scientist/AI Scientist?" It is an interesting question, as I usually get the "How to Start" question from university undergraduates or fresh graduates. I am going to make the answer more generic here so that mid-career professionals can use it as a reference as well. Compared to my previous posts, I am presenting it in the format of a journey.
Starting out! First Step!
I found that my mathematics background was extremely helpful in understanding the workings of many machine learning models. I loved mathematics, and the reason I did not choose it as my career was that, in my time, the only career choice was mathematics teacher (no offence), and I saw myself analysing rather than teaching. :)
Coming back: by this time, as a high school student, you will probably have learnt many different branches of mathematics. To kick things off, the first step is to learn more about the following branches of Mathematics:
1) Linear Algebra
I have written a blog post on what to study specifically, so do refer to it to find out more. If you are looking for free resources to learn this Mathematics, go to Khan Academy.
There is another field that you can start studying, and that is Computer Architecture. The reason is that once you start on your "practising" journey, a lot of IT jargon will be thrown at you: caches, buses, etc. A strong foundation in Computer Architecture can help you put all of these terms together and paint a better picture of the engineering work that you are likely to undertake later on. Khan Academy has resources on it and, if not, Wikipedia to the rescue!
Step 1 Done! What is Step 2?
Time to move up a notch and start learning about Machine Learning, in general Supervised and Unsupervised Learning models. Given your strong foundation in Mathematics, you can attempt to understand Machine Learning, but you will likely start to feel inadequate and want to refer back to what you have learnt. Keep your study notes handy! :)
While learning about Machine Learning, you can move on and learn a suitable programming language. As of 2020, I recommend you start off with Python (prepare for a journey of satisfying frustration). A lot of companies these days are using Python because it is a scripting language that plays a huge part in automation. Having said that, if you want to make yourself versatile, I suggest that you pick up R as well. While learning these languages, your knowledge of computer architecture can give you some assistance in deciphering error messages. Happy debugging! (Pssst... have you heard of StackOverflow?)
Resources? There are tons of tutorials out there, although I learnt from Coursera. Here is the link. You can audit the course, so you do not have to pay. If you want to install Python on your notebook/laptop, I suggest you go for Anaconda. And if installation is a pain in the ass, you can go to Azure Notebooks, but from what I observed the packages there are not kept up to date (Big Sigh!).
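To make the difference between the two families a little more concrete, here is a minimal sketch in plain Python with no libraries. The numbers and the function names `fit_line` and `two_means` are made up for illustration, and a real project would normally use a library instead. The first function is a supervised example (it learns from inputs paired with known answers), and the second is an unsupervised one (it groups numbers without being given any answers).

```python
def fit_line(xs, ys):
    """Supervised: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def two_means(values, iterations=10):
    """Unsupervised: split numbers into two groups (1-D k-means with k=2)."""
    low, high = min(values), max(values)  # initial guesses for the two centres
    for _ in range(iterations):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        # Move each centre to the mean of its group (skip an empty group).
        if group_low:
            low = sum(group_low) / len(group_low)
        if group_high:
            high = sum(group_high) / len(group_high)
    return low, high


# Made-up data: the answers (ys) are known, so the model can learn from them.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)

# Made-up data: no answers given, the algorithm finds the two clusters itself.
print(two_means([1, 2, 3, 10, 11, 12]))
```

Notice that both functions are nothing more than means, differences and sums, which is why the mathematics you studied in the first step pays off here.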
2 Steps in! What's Next?
Great! You made it this far! Time to put what you have learnt into a project, a project to learn from. Find a dataset you are interested in exploring. There are many open datasets out there: Kaggle, the UCI Machine Learning Repository, and open data from major US cities like Boston, San Francisco and New York.
The aim is to start a project portfolio and build it up with multiple projects. At this stage, it can be tough to scope a project that you would like to do. Suggestion? Go to your local data science community and seek a mentor. Have the mentor guide you on scoping a suitable project. Be proactive in searching for answers, understand that the mentor has a busy schedule, and approach the mentor only when your own searches do not give you any answers.
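A first look at a dataset can be as simple as the sketch below, which uses only the Python standard library. The CSV text, its column names and the `summarise` helper are all made up for illustration; with a real download you would read the file from disk instead of from a string.

```python
import csv
import io
import statistics

# Made-up sample standing in for a downloaded CSV file (row 3 has a blank cell).
SAMPLE_CSV = """id,area_sqm,price
1,50,100
2,70,150
3,,120
4,90,200
"""


def summarise(csv_text, column):
    """Return count, mean, min and max of one numeric column, skipping blanks."""
    rows = csv.DictReader(io.StringIO(csv_text))
    values = [
        float(row[column])
        for row in rows
        if (row[column] or "").strip()  # ignore missing or empty cells
    ]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }


print(summarise(SAMPLE_CSV, "area_sqm"))
print(summarise(SAMPLE_CSV, "price"))
```

Even a tiny summary like this surfaces questions worth chasing in a project, such as why a value is missing and whether the two columns move together.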
How to build it into a project portfolio? I have a blog post you can refer to. Just remember, the portfolio is to showcase to your future employer how good you are.
Once you start on your first project, the subsequent ones should be easier. If not, seek help from the local data science community. Just don't be a bugger ok? :)
From here on, your journey has just started, and I sincerely hope that by the end of the first project you have found your passion for Data Science and Artificial Intelligence! Good luck, have fun and all the best! :)
If you are interested in finding out what other knowledge you need to equip yourself with, do have a read of the following posts:
- Mathematics & Statistics
- Data & IT Management
- Domain Knowledge
- Soft Skills - Part 1 & Part 2
Do keep in touch on LinkedIn or Twitter, or subscribe to my newsletter to find out what I am thinking, doing or learning. :)