“What is commonly neglected in data science?”
Well, there are many things, but something “uncommon” came to mind during the discussion. What is it?
Ok, so what about Data Collection?
Think about it: for any data project, we know that collecting as much data as possible is great! But we have to ask ourselves, why are we collecting data in the first place?
Because we want to build as clear a picture as possible of what has happened, and study it through the lens of mathematics and statistics!
So the guideline should be to collect data that paints a CLEAR picture! The more dimensions we collect, the better, of course, but the dimensions of data that we are collecting should also be well-integrated.
It's like each dimension of data you collect is a jigsaw puzzle piece. The more pieces you have, the better, but if they are well-integrated, they can form a clearer picture. Think about four separate jigsaw puzzle pieces vs. four pieces that fit together.
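To make the jigsaw idea a bit more concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the customer IDs, the session IDs, the numbers, and the `integrate` helper are all hypothetical. It only shows that dimensions sharing a common key can be combined into one picture, while a dimension keyed differently stays a loose piece.

```python
# Hypothetical data: three "dimensions" collected about the same customers,
# all keyed by the same customer ID.
signups = {"c1": "2023-01-05", "c2": "2023-02-11"}
purchases = {"c1": 120.0, "c2": 45.5}
support_tickets = {"c1": 3, "c2": 0}


def integrate(**dimensions):
    """Join dimensions on their shared key, keeping only keys found in all of them."""
    shared_keys = set.intersection(*(set(d) for d in dimensions.values()))
    return {
        key: {name: values[key] for name, values in dimensions.items()}
        for key in sorted(shared_keys)
    }


# Integrated pieces: every dimension uses the customer ID, so they fit together.
picture = integrate(signup_date=signups, spend=purchases, tickets=support_tickets)

# Siloed piece: web logs keyed by session ID have nothing to join on.
web_sessions = {"s-901": 14, "s-902": 2}
siloed = integrate(spend=purchases, pages_viewed=web_sessions)

print(picture)  # one combined record per customer
print(siloed)   # empty: the pieces do not connect
```

The point is not the code itself but the decision it depends on: agreeing on a shared key (here, a customer ID) at collection time is what lets the pieces connect later.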
That is one reason why it is always good to plan your data collection strategy. Tools such as business process mapping and user journey mapping help a lot in thinking about which dimensions of data to collect and crunch to get a clear picture of what is happening.
That is why I was never a big fan of the term “Big Data”, because it's pretty difficult to integrate siloed data. Rather, I am a big fan of “Relevant Data”. :)
I wish you all the best in your data capabilities journey! :)