“What is commonly neglected in data science?”

Well, there are many things, but something “uncommon” came to mind during the discussion. What is it?

Data Collection!

OK, so what about Data Collection?

Think about it: for any data project, we know that collecting as much data as possible is great! But we have to ask ourselves, why are we collecting data in the first place?

Because we want to build as clear a picture as possible of what has happened, and study it through the lens of mathematics and statistics!

So the guideline should be to collect data that paints a CLEAR picture! The more dimensions we collect, the better, of course, but the dimensions of data that we are collecting should also be well-integrated.

It's like each dimension of data you collect is a jigsaw puzzle piece. The more pieces you have, the better, but only if they are well-integrated can they form a clearer picture. Think about four separate jigsaw puzzle pieces vs. four connected puzzle pieces.
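
To make the jigsaw analogy a bit more concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the three datasets, the field names (such as `customer_id`) and the `integrate` helper are assumptions, not taken from a real project. The point is only that dimensions sharing a common key can be joined into one picture, while a dimension collected without that key stays a separate piece.

```python
# Hypothetical data: three "dimensions" collected about the same customers.
purchases = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 2, "amount": 35.5},
]
web_visits = [
    {"customer_id": 1, "pages_viewed": 14},
    {"customer_id": 2, "pages_viewed": 3},
]
# Collected without a customer identifier, so it cannot be linked to the others.
survey_scores = [
    {"respondent": "anon-a", "satisfaction": 4},
    {"respondent": "anon-b", "satisfaction": 2},
]


def integrate(key, *datasets):
    """Merge records from several datasets that share the same key field.

    Datasets whose records lack the key are returned separately as "siloed".
    """
    picture = {}
    siloed = []
    for dataset in datasets:
        if not all(key in record for record in dataset):
            siloed.append(dataset)
            continue
        for record in dataset:
            # Records with the same key value are combined into one profile.
            picture.setdefault(record[key], {}).update(record)
    return picture, siloed


picture, siloed = integrate("customer_id", purchases, web_visits, survey_scores)

for customer_id, profile in sorted(picture.items()):
    print(customer_id, profile)
print("Datasets left as separate pieces:", len(siloed))
```

In this toy case the purchases and web visits fit together into one profile per customer, while the survey scores remain a loose puzzle piece, no matter how many rows they contain.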

That is one reason why it is always good to plan your data collection strategy. Tools such as business process mapping and user journey mapping help a lot in thinking about which dimensions of data to collect and crunch to get a clear picture of what is happening.

That is why I was never a big fan of the term “Big Data”, because it's pretty difficult to integrate siloed data. Rather, I am a big fan of “Relevant Data”. :)

I wish you all the best in your data capabilities journey! :)

Please feel free to connect on LinkedIn or Twitter (@PSkoo). Do consider signing up for my newsletter too. I have also just started my YouTube channel, so do consider subscribing to show your support! :)
