If you are looking to break into Data Science and Artificial Intelligence and are considering training options such as bootcamps, this blog post is for you.

Based on my observation, most courses that I have come across are heavily focused on Machine Learning. They do not teach much Data Cleaning, and they do not teach Strategic Management at all. I feel both are essential for a Data Scientist to know; otherwise the value added to the organization is limited.

The funny thing is that most of them claim the curriculum is designed by "experienced professionals". I hope these are not the same professionals who tell you data cleaning takes up 80% of your project time, because it does not make sense that data cleaning makes up a substantial part of the project but is barely taught.

If you have worked on a Data Science project before, you will realize that Machine Learning forms a small part (a VERY SMALL part) of the entire project. It is a crucial part nonetheless, but the success of the project does not depend solely on it. The obsession with ONLY Machine Learning is very unhealthy, although I do understand why most people have the notion that Data Scientists are maths geeks obsessed with mathematics ONLY.

Below is a diagram taken from a paper that talks about technical debt. (You can read the paper here). The diagram lays out the different components of a data science project.

Machine Learning is Only a Small Component!

You can see that Machine Learning forms a very small part of the project. There are tons of other things the Data Scientist needs to do, especially on Data Management, Data Cleaning and Feature Engineering/Extraction.

And I will be brutally honest here: you can (and should) learn ALL the Machine Learning models out there, but at the end of the day you will only use a few of them. What about the rest, you might ask? Well... not to burst your bubble here, but if the management or other stakeholders do not understand them (which is likely, given that most of them are not machine learning experts and are very new to its use in business), chances are slim that the models will be implemented, unless you are a skilled communicator and have "mesmerized" them. There are many other reasons machine learning models can be rejected outright, for instance model explainability or regulations.

At this point you may be wondering whether you should still go for the bootcamp or courses. My answer to that is not an immediate "No". We all have different learning modes. If a bootcamp is a good learning mode for you, providing structure and curation of materials, then please sign up for one. For more details on how to select the 'best' one, check out my other blog post here.

You need to understand that joining a bootcamp is not enough on its own for you to transition into Data Science. You can learn a lot about Machine Learning and apply it to a "designed" project, but you must know two things:

1) The environment of the bootcamp project is controlled for better learning. In reality, the project environment is more chaotic.

2) Machine Learning is just a small component. Going through the bootcamp says only that you "know" how to apply Machine Learning models to "solving" business challenges. (Notice the quotation marks?)

To increase your chances of getting into the profession, you will now need to use what you have learned to create a project portfolio. For more details, please refer to my other blog post.

If you have made it this far into the article, congratulations, you have taken the first step, and now is the time to plan how to build a project portfolio. My suggestion for the next step? Find a suitable mentor, someone you want to learn from, and be ready for the hardship that follows. And one more thing: plan for the projects you would like to do, perhaps in an industry you would like to join. Good luck!

If the blog post has been useful to you, do consider sharing it. Do check out my LinkedIn profile and Twitter (@PSkoo) if you want to stay connected. Any feedback or comment is welcome too!