Becoming a Data Scientist - Part 4: Domain Expertise

Following the previous posts on the knowledge needed by anyone interested in pursuing a data science career (overview, mathematics, data & IT management), I shall now discuss the business knowledge (domain knowledge) that a data scientist should have.

1 — Business Process

These days, every business or organization has processes. It is critical that these processes are kept efficient and effective, especially those that are customer-facing.

It is important that the data scientist understands these processes in general and how they work. Why is this important? The reason is that data collection and model implementation are usually added into the business process as a company matures in data science. The data scientist needs a good understanding of how these business processes work in order to recommend, first, when a certain data element should be captured for better quality and, second, where in the business process the model should be placed so that the data elements it needs are available and it generates “decisions” at the right stage.

For instance, most credit scorecards require credit bureau data. Thus, in the credit application process, the model has to be placed at a stage where the applicant’s data that it needs, especially the applicant’s credit bureau data, is available. The model should also generate the credit score before the decision stage, since the score is a critical decision factor (i.e. in deciding whether to grant credit).
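The placement question in the credit example can be sketched in code. This is only a minimal illustration, assuming a simplified application process; the stage names and data fields (`application_form`, `bureau_pull`, `scorecard`, `decision`, `bureau_data` and so on) are hypothetical and not taken from any real system. Each stage declares the data it requires and the data it provides, and a small check reports any stage that is placed before its inputs are available.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Stage:
    """One step of a (hypothetical) business process."""
    name: str
    requires: FrozenSet[str]  # data elements the stage needs
    provides: FrozenSet[str]  # data elements the stage produces


def check_process(stages: Iterable[Stage]) -> List[str]:
    """Return a list of problems: stages whose required data is not yet available."""
    available = set()
    problems = []
    for stage in stages:
        missing = stage.requires - available
        if missing:
            # The stage cannot run, so its outputs are not made available either.
            problems.append(f"{stage.name}: missing {sorted(missing)}")
        else:
            available |= stage.provides
    return problems


# Hypothetical stages of a credit application process.
application_form = Stage("application_form", frozenset(), frozenset({"applicant_id", "income"}))
bureau_pull = Stage("bureau_pull", frozenset({"applicant_id"}), frozenset({"bureau_data"}))
scorecard = Stage("scorecard", frozenset({"income", "bureau_data"}), frozenset({"credit_score"}))
decision = Stage("decision", frozenset({"credit_score"}), frozenset({"decision"}))

good_order = [application_form, bureau_pull, scorecard, decision]
bad_order = [application_form, scorecard, bureau_pull, decision]

print(check_process(good_order))  # no problems reported
print(check_process(bad_order))   # scorecard placed before the bureau data exists
```

In the second ordering the scorecard is flagged because the bureau data has not been collected yet, and the decision stage is flagged in turn because no credit score was produced. A real process map would be far richer, but the same question applies: is every input the model needs available at the stage where the model sits?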

Since this involves the implementation of models, it is critical that the data scientist has a good understanding of the business process so as to make credible recommendations for data collection and model implementation.

2 — Strategic Management

Oftentimes, the insights from the data scientist need to be turned into business strategies. For instance, insights from a marketing campaign response model can be used to determine which customer characteristics are associated with a higher likelihood of responding to a marketing campaign, and from there to devise a reasonable campaign that reaches these groups of customers.
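As a rough illustration of turning such an insight into an action, the sketch below uses plain response rates per customer segment as a simple stand-in for a response model (it is not a fitted model). The campaign history and segment names are made up for the example. It picks the segments whose past response rate is above the overall rate, which could then be the groups the next campaign is designed around.

```python
from collections import defaultdict

# Hypothetical past campaign results: (customer segment, responded?)
history = [
    ("young_professional", True),
    ("young_professional", True),
    ("young_professional", False),
    ("retiree", False),
    ("retiree", False),
    ("retiree", True),
    ("student", False),
    ("student", False),
]


def response_rates(records):
    """Return the share of contacted customers who responded, per segment."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [responses, contacts]
    for segment, responded in records:
        counts[segment][0] += int(responded)
        counts[segment][1] += 1
    return {seg: resp / total for seg, (resp, total) in counts.items()}


def segments_to_target(records):
    """Return segments whose response rate is above the overall response rate."""
    overall = sum(int(responded) for _, responded in records) / len(records)
    rates = response_rates(records)
    return sorted(seg for seg, rate in rates.items() if rate > overall)


print(response_rates(history))
print(segments_to_target(history))
```

With this made-up history, only the young professional segment responds at a higher rate than the customer base overall, so that is the group a follow-up campaign would focus on first.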

Thus, being able to provide “actionable” insights is essential for a data scientist who wants to provide value to their organization. Having a good understanding of strategic management helps the data scientist to understand what kinds of insights will be highly valued by the company and what kinds of insights are actionable (perhaps given the resources available in the organization), and to think of the next steps after presenting the actionable insights. Being able to think strategically helps the data scientist to continuously provide value through insights that can be acted upon. Continuously providing actionable and valuable insights also helps to build credibility, since people are more inclined to listen to a data scientist who gives useful insights than to one whose insights cannot be acted upon (i.e. feasible actions rather than hot air).

Business and Revenue Model

A business model describes how an organization serves a chosen market and where its competitive advantage lies over similar competitors, while the revenue model states how the organization continues to extract value/profits from the business model.

For the data scientist to add value to their organization, it is essential that they know the business and revenue model of their organization, both present and future. Having some understanding allows the data scientist to prioritize which business objectives are important and to provide insights that support those objectives. This ties back to being able to provide relevant insights for strategy formulation and execution, so that the organization can continue to operate in, serve and extract profit from the chosen market.

With an understanding of strategic management, the business model and the revenue model, the data scientist can gauge the amount of value each project provides and thus provide insights that can be acted upon. Because these insights are aligned with the business and revenue model, the company can continue extracting value from its data, creating sustainable momentum in pushing for more data science or analytics in the organization.

3 — Change Management

A lot of people who are starting out in data science do not realize that data scientists are change agents as well. The insights that we provide make changes necessary and, let’s face it, humans do not like change. But change is necessary if the business is to survive in a dynamic environment, one that becomes more dynamic as we go along.

Data scientists, being change agents, need to understand how to create sustainable change (i.e. change that does not revert to old habits) through the process of providing insights. Data scientists cannot just create a tremendous amount of information/insights and then dump it on the organization. Sometimes there needs to be a measured approach to releasing the insights and information so that changes can be made and be effective.

For those who are interested in change management, I find the process designed by John Kotter to be one of the best out there. You can read Wikipedia’s article on Change Management here.

4 — Domain specific

Finally, the remaining business knowledge that the data scientist needs is related to the domain that the project/analysis is in. For instance, if the data scientist is working in a risk management department, they will need to understand the specific business definitions, regulations (especially in banking, healthcare, pharmaceuticals and aviation), accounting policies & international standards (GAAP or IFRS), processes etc. This is the part that is more specific to the organization the data scientist is deployed in.

One thing that I have noticed in hiring practices is the huge preference for candidates with domain-relevant knowledge. This may severely limit the supply of data science talent the organization has access to. Looking at the landscape and labor force, employers would have a better chance of getting more value from data science by looking for those who are mathematically strong and able to convert business objectives into mathematical models. Based on my observation, this is a much more difficult skill to find or train, as compared to programming and domain knowledge.

Conclusion

With this, I conclude what, in my opinion and from my observations, are the key skills and knowledge that newcomers to data science should know, learn and understand.

As technology changes, the data scientist’s job will evolve and the knowledge and skills will have to be updated accordingly, so keep learning!

(Note: This post was originally written for Medium; this is an edited and updated version of it. The original post can be found here.)