In my discussions with people about data science and artificial intelligence, I often hear something like, “Start-ups do not need data science. Let’s focus on capturing users by building the features that our users want.” Data science seldom makes it onto the list of priorities for most founders when they are working on their start-ups.

Most of the discussion revolves around the following reasons for not adopting data science: it is portrayed as expensive (mega-infrastructure!), it takes up too much time, it is very challenging (it needs experts, and expertise is very rare), and it can only work when there are HUGE amounts of data.

I hold a different opinion. YES, start-ups do not need “sophisticated” machine learning initially, but the early stage is the right time to consider and prepare for data science capabilities in the organization.

Data Collection and Management Processes

Start-ups work with a blank slate and, unlike large enterprises, do not have legacy issues. This gives them a VERY GOOD opportunity to discuss what data to collect, the quality of the data to collect, at which stage of the business process the data should be collected, etc. By having such discussions early on, data collection can be worked into the business processes easily, before those processes expand, get more complex, and become a “big bowl of spaghetti”. It is easier to fix a car while it is moving slowly than to fix a giant car (a big enterprise) moving at high speed.

A/B Testing!

For example, most start-ups want to ramp up popular features quickly, and they need a data-driven approach, such as A/B testing, to determine which features are popular. Resources such as data (at which point in the business process the data should be collected, and at what granularity) and infrastructure (which database to use) can be discussed upfront. This allows start-ups to analyze the captured data quickly and either confirm that a feature is popular or decisively move on to other worthy pursuits when the analysis shows otherwise.
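To make the A/B testing idea concrete, here is a minimal sketch of how a start-up might check whether a new feature variant converts better than the existing one. It uses a two-proportion z-test, which is one common choice and not the only one. The function name and the visitor and conversion counts are hypothetical, made up purely for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare the conversion rates of variants A and B.

    conv_a, conv_b: number of users who converted in each variant.
    n_a, n_b: number of users exposed to each variant.
    Returns (rate_a, rate_b, z, two_sided_p_value).
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("each variant needs at least one user")

    rate_a = conv_a / n_a
    rate_b = conv_b / n_b

    # Pooled rate under the assumption that A and B perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))

    if std_err == 0:
        # Nobody (or everybody) converted in both variants: no evidence either way.
        return rate_a, rate_b, 0.0, 1.0

    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return rate_a, rate_b, z, p_value


# Hypothetical counts: 2,400 users saw each variant.
rate_a, rate_b, z, p_value = two_proportion_z_test(120, 2400, 165, 2400)
print(f"A: {rate_a:.2%}  B: {rate_b:.2%}  z = {z:.2f}  p = {p_value:.4f}")
```

A small p-value suggests the difference is unlikely to be due to chance alone. What threshold to use, and how long to run the test, are decisions the team should make before looking at the data.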

Data Collection Needs TIME!

Secondly, it takes time to collect data. Good quality data does not magically appear. It requires planning, from data collection and data quality to data storage and retrieval. Collecting data at the right quality level can cut down a lot of the data preparation work that is required before any analysis. Time is an essential ingredient in collecting enough data.
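As an illustration of building quality into data collection, the sketch below checks each record at the point of capture instead of cleaning it up later. The field names (user_id, event, timestamp) and the rules are assumptions chosen for the example; a real start-up would define its own based on its business processes.

```python
from datetime import datetime

# Hypothetical schema: the fields every captured event must carry.
REQUIRED_FIELDS = {"user_id": str, "event": str, "timestamp": str}


def validate_event(record):
    """Return a list of problems found in one record (empty list means OK)."""
    problems = []

    for field, expected_type in REQUIRED_FIELDS.items():
        value = record.get(field)
        if value is None or value == "":
            problems.append(f"missing {field}")
        elif not isinstance(value, expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")

    # Timestamps that cannot be parsed are hard to repair after the fact.
    timestamp = record.get("timestamp")
    if isinstance(timestamp, str) and timestamp:
        try:
            datetime.fromisoformat(timestamp)
        except ValueError:
            problems.append("timestamp is not a parseable ISO-format date")

    return problems


good = {"user_id": "u-001", "event": "signup", "timestamp": "2024-01-15T09:30:00"}
bad = {"user_id": "", "event": "signup", "timestamp": "yesterday"}

print(validate_event(good))
print(validate_event(bad))
```

Records that fail the check can be rejected or logged for follow-up, so that problems are noticed while they are still cheap to fix.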

With data collected very early on, start-ups can learn about the impact of their strategy and conserve resources (resources are precious in start-ups, right?) if the impact is not going to be positive or significant.

Thirdly, by starting data collection early on, the start-up would be storing one of the critical resources needed to build artificial intelligence capabilities later, as the start-up moves through several rounds of funding. This might change, though, if we see further development of approaches like AlphaGo Zero, which learned from self-play rather than from human game data.

Data science needs a HUGE amount of data?

This misconception was likely brought about by the term “Big Data”, which was used extensively to create urgency among companies to adopt data science.

If start-ups want to tap into their data for value immediately, the first thing to do is to set up a reporting process, or perhaps establish an operations dashboard and a strategy dashboard. Decide on the relevant metrics for each dashboard, based on the current business strategy.

For the operations dashboard, the start-up can have the metrics refreshed more frequently than for the strategic dashboard. The key here is for everyone in the start-up to understand how operations are currently doing: are we at the stipulated service level for our customers, is there a drop in user experience in critical areas, etc. The start-up can then move its limited resources to the right areas to sustain operations at the right service level.

The strategic dashboard is more for the start-up to understand whether its current business model and business strategy (such as capturing users, extending the usage of existing users, etc.) are working.

These two dashboards do not need huge amounts of data, since the data captured can be processed immediately for insights. They can help start-ups manage their operations and strategy quickly and effectively, ensuring that limited resources are used in the areas that have the largest positive impact.
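To show how little data such dashboards need, here is a rough sketch of one operational metric (the share of customer requests answered within a target time) and one strategic metric (distinct active users per week). The metric choices, function names, and sample values are assumptions for illustration only; each start-up should pick metrics that follow from its own strategy.

```python
from collections import defaultdict


def service_level(response_minutes, target_minutes):
    """Operational metric: share of requests answered within the target time."""
    if not response_minutes:
        return None  # no requests yet, so nothing to report
    within_target = sum(1 for m in response_minutes if m <= target_minutes)
    return within_target / len(response_minutes)


def weekly_active_users(events):
    """Strategic metric: number of distinct users seen in each week.

    events: iterable of (week_label, user_id) pairs.
    """
    users_by_week = defaultdict(set)
    for week, user_id in events:
        users_by_week[week].add(user_id)
    return {week: len(ids) for week, ids in sorted(users_by_week.items())}


# Hypothetical sample data: a handful of records is already enough.
responses = [5, 12, 30, 8]  # minutes taken to answer each request
events = [
    ("2024-W01", "alice"),
    ("2024-W01", "bob"),
    ("2024-W01", "alice"),
    ("2024-W02", "alice"),
]

print(service_level(responses, target_minutes=15))
print(weekly_active_users(events))
```

The operational metric could be recomputed every few minutes, while the weekly figure only needs refreshing occasionally, matching the different refresh rates of the two dashboards.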

Data Science is expensive?

Data science need not be expensive. A start-up should not commit a lot of funds to tools without having a good long-term usage plan. My suggestion is to plan out the data science use cases that the start-up wants to work on and research the tools that are available, then see if it makes sense to go for open source or enterprise tools. Only commit to purchasing tools when it makes absolute business sense, that is, when the value produced by these tools exceeds their cost. I strongly believe that infrastructure should grow together with the value produced by the use of data science in the start-up. Purchasing enterprise tools immediately, without a good plan for them, is likely to result in a huge waste of resources that are scarce in the start-up environment.

As previously mentioned, the analysis or machine learning done at the initial stages of a start-up need not be complicated, so start-ups could offer this work to interns, giving them relevant experience that can greatly benefit their careers in the long run. This creates a win-win situation: the start-up gets workable use cases and understands the value of data science at a low cost, and may even get the nice surprise of discovering data science talent along the way. The interns get to practice what they have learned in their undergraduate studies and see the strengths and weaknesses of their current skill set. To ensure that this win-win situation creates the largest impact, it helps to have a mentor guide the intern(s). More importantly, the mentor needs to have practical experience from working on data science projects.

Access to Data Science Resources

Most of the data scientists I have met are always on the lookout for interesting challenges to work on, assuming they are adequately paid. In other words, what attracts data scientists is never salary alone but also the kind of challenges on offer. So if a start-up can provide good challenges and an environment that supports working on them, it can attract its fair share of data scientists.

VCs and angel investors may want to hire a data scientist (in a permanent role or on a consulting basis) to work on the data science projects provided or identified by the start-ups in their portfolios.


Start-ups should start thinking about building data science capabilities as early as possible. The greatest benefit of starting early is the amount of data collected, since data does not appear with a snap of the fingers. Planning early allows the start-up to collect good quality data, iterate quickly, and move up the data science learning curve much earlier than its competition.

Infrastructure should grow together with the amount of value derived from the use cases. More importantly, the cost of implementing use cases should move in tandem with business value. This creates sustainable momentum for adopting data science in start-ups.

Start-ups do not need huge amounts of data at the start. They can start gaining insights from whatever data they have captured and use these insights to conserve resources and focus on more critical areas.

I hope the blog has been useful! Have fun in your data science learning journey and do visit my other blog posts and LinkedIn profile. Consider signing up for my newsletter too. :)

(Note: This post was originally written for Medium; this is an edited and updated version. The original post can be found here.)