After I wrote the blog post on how to be a GREAT data scientist, I continued to ask the question, “How can one be a great data scientist who continuously improves and adds value to the organization?” After much research and thought, here are the skills, in addition to those I covered in the previous blog post.

Research Skill

As data scientists, we are solution providers of sorts, finding nuggets of information and insight that can help the organization overcome its current challenges or move along the path of continuous improvement.

There is no way a data scientist can know everything there is to know about the project they are working on, so there is a strong need to research solutions. It could be researching techniques to tackle class imbalance, ways to integrate different technologies, new machine learning techniques, or the simple and common task of looking for suitable functions in R or Python.
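
As a small illustration of that last task, here is a minimal sketch of searching a Python module for functions by keyword. It assumes the module is already installed and importable, and `find_functions` is a hypothetical helper name of my own, not part of any library. It only uses the standard `inspect` module.

```python
import inspect
import statistics


def find_functions(module, keyword):
    """Return sorted names of public functions in `module` whose
    name or docstring mentions `keyword` (case-insensitive)."""
    keyword = keyword.lower()
    matches = []
    for name, obj in inspect.getmembers(module, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private helpers
        doc = inspect.getdoc(obj) or ""
        if keyword in name.lower() or keyword in doc.lower():
            matches.append(name)
    return sorted(matches)


if __name__ == "__main__":
    # Example: which functions in the standard `statistics` module
    # have something to do with the median?
    for name in find_functions(statistics, "median"):
        print(name)
```

Once a candidate turns up, calling `help()` on it shows the full documentation, which is often quicker than a web search for simple questions.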

So what makes for good research skills? In my opinion, there are two dimensions.

(1) Suitability: finding the right information. This means being able to search for information that adequately answers the questions you have.

(2) Speed: finding the information quickly.

Having good research skills (i.e. finding the right information quickly) cuts down research time, which the data scientist can then use for other important tasks such as data exploration, data management, feature engineering and improving code readability.

The good news is that finding the right information quickly largely comes down to working on the keywords one uses in search engines. Whenever I research a particular topic, I think about the keywords I need for the search engine to “throw out” a list of results suitable for my current needs.

“Sharpening” the keywords helps the search engine “understand” better what you are researching and generate the required results. Besides the keywords themselves, I also take note of their order so as to improve the search results. Search engines provide auto-complete these days, so take advantage of it to get the results you want quickly.

I also tend to use a single search engine for my research. This is to “train” the search engine to understand what I usually look for so that it can provide the information I need.

Efficient Learning

Given the many fields and areas that a data scientist is involved in, being able to learn efficiently is important. The more a data scientist can learn, the more value he/she can provide to the organization.

Being able to pick up knowledge quickly is perhaps one dimension of efficient learning. I believe a second dimension is being able to relate different parts of each field, in other words, being able to make many relevant connections between different fields. Making these connections helps with retaining the knowledge and also strengthens the understanding of a recently learned concept.

Each of us has our own efficient learning mode, so it is important to discover it quickly and apply it, especially if we are moving up the data science learning curve.

Another point to add: the environment that a data scientist works in is very dynamic, and large changes abound. Sometimes we need to “un-learn” in order to quickly learn new things, meaning we have to give up certain concepts that run deep in many things we have learnt before. We need to continuously question our deep-seated assumptions and concepts, and their relevance.

Mentoring & Teaching

How do mentoring & teaching help in moving a data scientist from good to great? There are two ways I can think of.

Firstly, mentoring and teaching help strengthen the concepts that a data scientist currently holds. Having a group of trainees/team members to mentor or teach helps in validating those concepts: whether they are useful, and under what circumstances they are not. I strongly believe that data scientists thrive in a collaborative environment, and thus even a seasoned data scientist can still learn a thing or two from others, since they can bring in fresh/newer perspectives.

Secondly, mentoring & teaching help in brushing up the communication skills of a data scientist. The mentoring sessions that I hold help me practice my communication skills, as I have to find various ways to explain the same concept so that my mentee understands it well. Making a concept easy to understand by finding different ways to communicate it is very similar to communicating insights to business users and helping them understand those insights well.

The skills shared here are meant to help data scientists increase their value to the organization that hires them. These are soft skills (similar to those in the previous post) that are not easy to pick up from books/online but can be built up through experience, so start working on them NOW. :)

I hope this blog post has been useful to you. If it has, do share it. Have fun on your data science learning journey and do visit my other blog posts.

Do keep in touch on LinkedIn or Twitter, or subscribe to my newsletter to find out what I am thinking, doing or learning. :)