Recently, I was asked the question “How can I go about picking up domain knowledge?”
To be fair to a lot of boot camps and degree programs, domain knowledge is very difficult to teach, especially when the trainers have little hands-on or project experience.
Domain knowledge is very important. First, it helps the data scientist frame a business problem that can be solved with data and that matters to stakeholders. Good domain knowledge allows data scientists to deliver more value to stakeholders, and perhaps in turn makes them more indispensable to their employer.
Second, data collection and the training of machine learning models can be affected by the domain in which they are applied. A thorough understanding of the domain reduces the effort wasted when important pieces of domain knowledge are overlooked.
Types of Domain Knowledge
I split domain knowledge into three types.
- Business Generic
- Industry-Specific
- Company-Specific
Business Generic covers business standards that are the same for anyone working in any business environment. This usually includes accounting standards (such as GAAP and IFRS) and tax codes. It can also include knowledge of change management, economics, team management, etc. It is business knowledge that applies to any industry.
Industry-Specific is domain knowledge specific to the industry you are in. For instance, if you are in the banking industry, you will be subject to regulatory standards such as Basel III; if you are in the healthcare industry, there are regulatory standards for medical devices and pharmaceutical products.
Company-Specific is domain knowledge specific to the business the data scientist works in. This is something that boot camps and degree programs cannot teach; it has to be figured out by the data scientist once he/she joins the organization. Company-specific domain knowledge includes the revenue model, target market, business processes, etc.
From the above, you can see that Business Generic and Industry-Specific domain knowledge can be gained fairly easily, whereas Company-Specific knowledge may involve trade secrets and so is not readily available.
I am going to share some resources you can use to brush up on your domain knowledge, especially the Business Generic and Industry-Specific types.
Business Journals & Newspapers
I am a fan of The Economist and Bloomberg Businessweek. They contain in-depth analysis of the industries and companies they report on. They set out what a company is concerned with and how external events, such as a financial crisis, political elections, or industry regulations, affect businesses and their outlook. Reading them helps with a data scientist's analysis work and with understanding how external events might affect data collection and machine learning model training. There are other good business journals and newspapers. Browse a few different publications, have a read, and decide which one(s) you want to go with.
Podcasts
In recent months, I revisited podcasts as another learning tool, and I was overwhelmed by the amount of content. There are many, many genres of podcast channels. If you learn best by listening, podcasts will be the right place for you. Podcasts are free, and most of the major business newspapers have podcast channels. They serve as another resource for picking up domain knowledge.
YouTube
YouTube now contains tons of videos you can learn from, and domain knowledge is one of the topics covered. If you are looking for short videos of around 10 minutes, take a look at the CNBC channel. It covers topics such as why certain businesses succeeded in certain countries, and the videos offer some in-depth analysis despite being short. If you watch at 2x speed, you can cut down the viewing time, assuming you can still follow the narration easily.
How about Company-Specific?
I find that to understand more about company-specific domain knowledge, you should work with internal data as much as possible, be it through analysis or data cleaning. The more time you spend with internal data, the better you will understand that company-specific domain knowledge. In short, get intimate and spend time with internal data. :)
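If you are not sure where to start with internal data, a quick column-by-column profile is one way in. The snippet below is only a minimal sketch using the Python standard library. The `profile_columns` helper, the `orders` records and their column names are all made up for illustration, so swap in whatever your own internal tables look like.

```python
from collections import Counter


def profile_columns(rows, top_n=3):
    """Summarise each column: missing count, distinct values, most common values."""
    # Collect every column name that appears in at least one record.
    columns = sorted({key for row in rows for key in row})
    profile = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat None, empty strings and absent keys as missing.
        present = [v for v in values if v not in (None, "")]
        counts = Counter(present)
        profile[col] = {
            "missing": len(values) - len(present),
            "distinct": len(counts),
            "top": counts.most_common(top_n),
        }
    return profile


# Hypothetical internal records; the column names are assumptions for illustration.
orders = [
    {"customer_segment": "SME", "channel": "online", "discount_code": ""},
    {"customer_segment": "SME", "channel": "branch", "discount_code": "STAFF"},
    {"customer_segment": "Enterprise", "channel": "online", "discount_code": None},
    {"customer_segment": "SME", "channel": "online"},
]

for column, summary in profile_columns(orders).items():
    print(column, summary)
```

The numbers themselves matter less than the questions they raise. Why do most orders have no discount code? Who is allowed to use a staff code? Taking questions like these to colleagues is often where company-specific knowledge starts to surface.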
Data science is never limited to machine learning alone. Many areas are involved in creating value from data. An obsession with machine learning is unhealthy if knowledge in other areas is not picked up as well. Different projects call for different degrees of domain knowledge, but at the end of the day it is something data scientists cannot do without. If a data scientist wants to start contributing value, he/she cannot avoid domain knowledge, so it is best to start early.
I hope this post is useful to you! If you have any feedback, please reach out to me on LinkedIn or Twitter. Consider signing up for my newsletter to stay updated on what I am working on. Have fun on your learning journey!