Becoming a Data Scientist - Part 4: Domain Expertise
Following the earlier posts on the knowledge needed by someone interested in pursuing a data science career (overall, mathematics, data & IT management), I shall now discuss the business knowledge (domain knowledge) that a data scientist should have.
1 — Business Process
These days, every business or organization has processes. It is critical that these processes are kept efficient and effective, especially those that are customer-facing.
It is important that the data scientist understands these processes in general and how they work. Why is this important? The reason is that data collection and model implementation are usually added into the business process as a company matures in data science. The data scientist needs a good understanding of how these business processes work for two reasons: first, to recommend the point at which a certain data element should be captured for better quality, and second, to recommend where in the business process the model should be placed, so that the data elements the model needs are available and the model generates its “decisions” at the right stage.
For instance, most credit scorecards require credit bureau data. Thus, in the credit application process, the model has to be placed at a stage where the applicant’s data that it needs, especially the applicant’s credit bureau data, is available. The model should also generate the credit score before the decision stage, since the score is a critical decision factor (i.e. in deciding whether to grant credit).
Since this involves the implementation of models, it is critical that the data scientist has a good understanding of the business process so as to make credible recommendations for data collection and model implementation.
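To make the credit application example concrete, here is a minimal sketch of how one might check whether a model sits at the right stage of a process. Everything in it is an assumption made for illustration: the stage names, the data fields, and the helper `first_unmet_stage` are hypothetical and do not come from any real system.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    """One step in a business process (hypothetical structure)."""
    name: str
    requires: set = field(default_factory=set)  # data the stage needs
    provides: set = field(default_factory=set)  # data the stage produces


def first_unmet_stage(stages):
    """Walk the stages in order and return (stage name, missing data)
    for the first stage whose inputs are not yet available, else None."""
    available = set()
    for stage in stages:
        missing = stage.requires - available
        if missing:
            return stage.name, missing
        available |= stage.provides
    return None


# Hypothetical credit application process, in the order it runs.
process = [
    Stage("application_form", provides={"applicant_id", "income"}),
    Stage("bureau_pull", requires={"applicant_id"}, provides={"bureau_data"}),
    Stage("scorecard", requires={"income", "bureau_data"},
          provides={"credit_score"}),
    Stage("decision", requires={"credit_score"}, provides={"approved"}),
]

# The same stages, with the scorecard wrongly placed before the bureau pull.
misordered = [process[0], process[2], process[1], process[3]]

print(first_unmet_stage(process))
print(first_unmet_stage(misordered))
```

With the stages in the first order, every stage finds its inputs. In the second order, the check flags the scorecard because the bureau data has not been collected yet, which is the kind of placement problem described above.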
2 — Strategic Management
Oftentimes, the insights from the data scientist need to be turned into business strategies. For instance, insights from a marketing campaign response model can be used to determine which customer characteristics are associated with responding to a marketing campaign, and from there to devise a reasonable campaign that reaches out to these groups of customers.
Thus, being able to provide “actionable” insights is essential for a data scientist who wants to provide value to their organization. A good understanding of strategic management helps the data scientist to understand what kinds of insights will be highly valued by the company, what kinds of insights are actionable (given the resources available in the organization), and what the next steps are after presenting those insights. Being able to think strategically helps the data scientist to continuously provide value through insights that can be acted upon. This in turn builds credibility, since people are more inclined to listen to a data scientist who gives useful insights than to one whose insights cannot be acted upon (i.e. feasible actions rather than hot air).
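As a rough illustration of turning the campaign example into something actionable, the sketch below ranks customer segments by how often they responded in a past campaign and keeps those above a cut-off. It is not a response model: raw historical response rates stand in for model output, and the segment names, records, and threshold are all made up for the example.

```python
from collections import defaultdict


def response_rate_by_segment(records):
    """records: iterable of (segment, responded) pairs.
    Returns a dict mapping each segment to its response rate."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [responders, total]
    for segment, responded in records:
        counts[segment][0] += int(responded)
        counts[segment][1] += 1
    return {seg: r / n for seg, (r, n) in counts.items()}


def segments_to_target(records, min_rate):
    """Return, in alphabetical order, the segments whose historical
    response rate is at least min_rate."""
    rates = response_rate_by_segment(records)
    return sorted(seg for seg, rate in rates.items() if rate >= min_rate)


# Made-up results from a hypothetical past campaign.
past_campaign = [
    ("young_professional", True),
    ("young_professional", True),
    ("young_professional", False),
    ("retiree", False),
    ("retiree", False),
    ("retiree", True),
    ("retiree", False),
]

print(segments_to_target(past_campaign, min_rate=0.5))
```

The point of the sketch is the last step: the output is a short list of segments that a marketing team can act on, not a table of statistics.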
Business and Revenue Model
A business model describes how an organization serves a chosen market and where its competitive advantage lies over similar competitors. The revenue model states how the organization continues to extract value/profits from the business model.
For the data scientist to add value to their organization, it is essential that they know the business and revenue model of their organization, both present and future. This understanding allows the data scientist to prioritize the business objectives that matter and to provide insights that support them. This ties back to being able to provide relevant insights for strategy formulation and execution, so that the organization can continue to operate in, serve and extract profit from the chosen market.
With an understanding of strategic management, the business model and the revenue model, the data scientist can gauge the amount of value each project provides and thus provide insights that can be acted upon. Because these insights are grounded in the business and revenue model, they allow the company to continue extracting value from its data, creating sustainable momentum for more data science or analytics in the organization.
3 — Change Management
A lot of people who are starting out in data science do not realize that data scientists are change agents as well. The insights that we provide make changes necessary and, let’s face it, humans do not like change. But change is necessary if the business is to survive in a dynamic environment, one that becomes more dynamic as we go along.
Data scientists, being change agents, need to understand how to create sustainable change (i.e. change that does not revert to old habits) through the process of providing insights. Data scientists cannot just create a tremendous amount of information/insights and then dump it on the organization. Sometimes there needs to be a measured approach to releasing the insights and information so that changes can be made and be effective.
For those who are interested in change management, I find the process designed by John Kotter to be one of the best out there. You can read Wikipedia’s article on Change Management here.
4 — Domain specific
Lastly, the business knowledge that the data scientist needs is related to the domain that the project/analysis is in. For instance, a data scientist working in a risk management department will need to understand the specific business definitions, regulations (especially in banking, healthcare, pharmaceuticals and aviation), accounting policies & international standards (GAAP or IFRS), processes etc. This is the part that is more specific to the organization the data scientist is deployed in.
One thing that I have noticed in hiring practices is the huge preference for employees with domain-relevant knowledge. This may severely limit the supply of data science talent the organization has access to. Looking at the landscape and labor force, employers would have a better chance of getting more value from data science by looking for those who are mathematically strong and able to convert business objectives into mathematical models. Based on my observation, this is a much more difficult skill to find or train than programming or domain knowledge.
Conclusion
With this I conclude what, in my opinion and from my observations, are the key skills and knowledge that newcomers to data science should know, learn and understand.
As technology changes, the data scientist’s job will evolve, and the knowledge and skills will have to be updated accordingly, so keep learning!
(Note: This post was originally written for Medium; this is an edited and updated version of it. The original post can be found here.)