Continuing from my previous post, today I am going to discuss two more reasons for failure in data science from the article "Data Science: Reality Doesn't Meet Expectations".
(3) Data science can’t always be built to specs.
Business stakeholders have many questions they want answered. One of the biggest challenges is converting these questions into something that data can answer. For instance, a simple request like "We want to understand the characteristics of clients who default on credit card loans" will generate numerous related questions, such as "What is the definition of default?", "Is a customer who does not pay in full but pays down partially considered in default?", and "If paying down partially is considered a default, then at what level of payment? 50%, 70%, 80%?".
Even once we have the questions ready and have settled the definitions the business stakeholders are interested in, we still need to validate the assumption that the data collected can answer those questions. This is what I call the data-question fit.
Data science is all about experimentation and iteration too. The reason is that a data scientist is always looking for ways to "fit" the data to the questions of interest. This takes some back and forth between data munging, feature engineering, and the training of machine learning models before the fit is good.
Businesses have to realize that data science cannot be built exactly to specs, but we can get it close to specs. There are two ways to achieve "close to specs": either get the data to fit the questions of interest, or get the questions of interest to fit the data available. A third possibility is a mixture of both approaches.
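To make the point concrete, here is a minimal Python sketch of how the label "default" shifts with the definition. The function name is_default, the min_paid_ratio parameter, and the sample amounts are all hypothetical, chosen only for illustration; they do not come from the article or from any real credit policy.

```python
def is_default(amount_due: float, amount_paid: float, min_paid_ratio: float) -> bool:
    """Label a statement as a default when the paid share falls below the threshold.

    min_paid_ratio is a hypothetical business rule: 0.5 means that paying
    less than 50% of the amount due counts as a default.
    """
    if amount_due <= 0:
        return False  # nothing owed, so nothing to default on
    return (amount_paid / amount_due) < min_paid_ratio


# Illustrative customer: owes 1000 and paid 600 (60% of the amount due).
amount_due, amount_paid = 1000.0, 600.0

# Apply three candidate definitions to the same customer.
labels = {
    threshold: is_default(amount_due, amount_paid, threshold)
    for threshold in (0.5, 0.7, 0.8)
}
print(labels)  # {0.5: False, 0.7: True, 0.8: True}
```

The same customer is not a defaulter under a 50% rule but is one under a 70% or 80% rule, which is why the definition has to be agreed with the stakeholders before any modelling starts.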
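One rough way to test the data-question fit early is to check whether the fields a question depends on are actually populated. The sketch below assumes records stored as plain dictionaries; the field names, the toy records, and the 20% missing-value tolerance are made up for illustration, and a real check would also need to look at accuracy and coverage over time.

```python
def missing_rate(records, field):
    """Share of records where the field is absent or None."""
    if not records:
        return 1.0  # no data at all: treat the field as fully missing
    missing = sum(1 for r in records if r.get(field) is None)
    return missing / len(records)


def data_question_fit(records, required_fields, max_missing=0.2):
    """Return the required fields that are too sparse to support the question."""
    return [f for f in required_fields if missing_rate(records, f) > max_missing]


# Toy client records (hypothetical fields and values).
clients = [
    {"income": 5200, "amount_due": 1000, "amount_paid": 600},
    {"income": None, "amount_due": 800, "amount_paid": 800},
    {"income": None, "amount_due": 1500},
    {"income": 4100, "amount_due": 700, "amount_paid": 0},
]

# Fields assumed to be needed for the "who defaults?" question.
gaps = data_question_fit(clients, ["income", "amount_due", "amount_paid"])
print(gaps)  # ['income', 'amount_paid']
```

An empty result would suggest the data can at least be used to attempt the question; any field listed is a gap to close through better collection or by adjusting the question.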
Getting Data to Fit Questions
Plan your data collection, where possible. Data needs to be collected over time, so getting it right at the start is important. What is "right", you might ask? It is getting the data at a quality level that is sufficient to answer your business questions. In this case, how well the questions get answered is constrained by the availability and quality of the data. This approach takes time, since the data has to be collected first, but at least the interest and buy-in from stakeholders are unlikely to diminish.
Getting Questions to Fit Data
In this case, the questions may need to be modified to accommodate the available data. Personally, I am not that keen on this approach, because the value of the data science initiative is bound to degrade. The questions may be changed to the point that they are no longer of interest to the stakeholders. The one advantage I can think of is that the data is readily available. Having said that, there is no guarantee of data quality.
Getting data and questions to fit together, so that value is provided to the business stakeholders, again requires someone with the knowledge and experience to ensure a good fit. This is not something that is taught in most programs, bootcamps, or degrees. The solution is to work with someone who has experience scoping data science projects.
(4) You’re likely the only “data person.”
The data science journey is lonely, as it is unlikely that you will have the privilege of working in a team of data scientists. And given that you are the only proficient data professional in your organization, you will be inundated with requests for data. You will feel frustrated. In your mind, you will keep saying to yourself, "Do you guys know that one request from you means hours of work for me?" (Sound familiar?)
In such cases, there are a few things you can do, and I break them down into steps for you here.
Step 1 - Be indispensable to a few "influential" business teams. Provide a lot of value to them through data. Help them hit their KPIs if possible. This will give you a good basis for the next steps.
Step 2 - Show the business teams the effort and considerations that go into satisfying their requests. Get them to prioritize their requests before sending them to you. At the end of the day, you are educating them so they can help themselves by prioritizing.
Step 3 - It is time to reject requests that take too much time and provide little value. Start saying "NO".
Step 4 - If possible and the budget allows, request interns from related programs. You can see it as training your future team members so you can move on to projects of interest.
The start will be a tough journey: you have to do the work you are paid for, analyze the data, and educate the stakeholders. But these efforts will pay off eventually, earning you plenty of political capital with your colleagues. :)
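Steps 2 and 3 can be made less personal with a simple, shared scoring rule. The following Python sketch is only one possible approach: the Request class, the 1 to 5 value score, the value-per-hour ratio, and the 0.5 cut-off are all assumptions made for illustration, not a method from the article.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    hours: float  # estimated effort, supplied by you (assumed to be > 0)
    value: int    # 1 (low) to 5 (high), scored by the requesting team


def triage(requests, min_value_per_hour=0.5):
    """Split requests into those to accept (best first) and those to decline."""
    ranked = sorted(requests, key=lambda r: r.value / r.hours, reverse=True)
    accept = [r.name for r in ranked if r.value / r.hours >= min_value_per_hour]
    decline = [r.name for r in ranked if r.value / r.hours < min_value_per_hour]
    return accept, decline


# Hypothetical backlog for a lone data scientist.
backlog = [
    Request("Weekly sales dashboard refresh", hours=2, value=4),
    Request("One-off export of all raw logs", hours=16, value=1),
    Request("Churn drivers for retention team", hours=8, value=5),
]

accept, decline = triage(backlog)
print("Accept:", accept)
print("Say no:", decline)
```

Asking the business teams to supply the value score themselves is one way of getting them to prioritize before the request reaches you.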
Data science by itself is quite tough, but it cannot exist by itself in a business. So we have to learn how to work and communicate with business stakeholders. By showing that you can provide value through data and help your business stakeholders achieve their goals, you will put yourself on a solid career foundation. Building from scratch may seem like a lot of work as well, but if you look at it from another perspective, you have a clean slate to build something you will be proud of later.
This is part 2 of the series. The article I mentioned has seven reasons altogether. I covered two reasons in the previous post and two more here. I will cover the rest in another post later.
If what I have discussed here or any of my articles is of value to you, consider subscribing to my newsletter. Have fun in your Data Science and Artificial Intelligence journey!