After I wrote the blog post on how to be a GREAT data scientist, I kept asking myself, “How can one be a great data scientist who continuously improves and adds value to the organization?” After much research and thought, here are the skills I came up with, in addition to those I covered in the previous blog post.
Research Skills
As data scientists, we are solution providers of sorts, finding nuggets of information and insight that help the organization overcome its current challenges or move along the path of continuous improvement.
No data scientist can know everything there is to know about the project they are working on, so there is a strong need to research possible solutions. That could mean researching techniques to tackle class imbalance, ways to integrate different technologies, new machine learning techniques, or the simplest and most common task of all: looking for a suitable function in R or Python.
So what makes for good research skills? In my opinion, there are two dimensions.
(1) Suitability: finding the right information, that is, information that adequately answers the questions you have.
(2) Speed: finding the information quickly.
Having good research skills (i.e. finding the right information quickly) cuts down research time, which the data scientist can then spend on other important tasks such as data exploration, data management, feature engineering, and improving code readability.
The good news is that finding the right information quickly largely comes down to working on the keywords you use in search engines. Whenever I research a particular topic, I think about the keywords I need for the search engine to “throw out” a list of results suited to my current needs.
“Sharpening” the keywords helps the search engine “understand” what you are looking for and return the results you need. Besides the keywords themselves, I also pay attention to their order, since it can improve the search results. Search engines offer auto-complete these days, so take advantage of it to get to the results you want quickly.
I also tend to use a single search engine for my research. This is to “train” the search engine to understand what I usually look for so that it can provide the necessary information I need.
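The same keyword habit also works for the everyday task mentioned earlier, looking for a suitable function in Python, without leaving the interpreter. Below is a minimal sketch, not a definitive tool: the helper name `find_functions` is my own invention for illustration, and the standard-library `statistics` module is used only as a convenient example target.

```python
import inspect
import statistics


def find_functions(module, keyword):
    """Hypothetical helper: list public callables in `module` whose
    name or docstring mentions `keyword` (case-insensitive)."""
    keyword = keyword.lower()
    matches = []
    for name, obj in inspect.getmembers(module, callable):
        if name.startswith("_"):
            continue  # skip private helpers
        doc = inspect.getdoc(obj) or ""
        if keyword in name.lower() or keyword in doc.lower():
            matches.append(name)
    return sorted(matches)


# Example: which callables in `statistics` have something to do with medians?
print(find_functions(statistics, "median"))
```

Once a candidate name turns up, `help(statistics.median)` (or the library's online documentation) gives the details. As with a search engine, a sharper keyword returns a shorter and more relevant list.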
Efficient Learning
Given the many fields and areas a data scientist is involved in, being able to learn efficiently is important. The more a data scientist can learn, the more value he/she can provide to the organization.
Being able to pick up knowledge quickly is one dimension of efficient learning. I believe a second dimension is being able to relate the different parts of each field, in other words, to make many relevant connections between different fields. Making these connections helps with retaining the knowledge and also strengthens the understanding of the recently learned concept.
Each of us has our own most efficient way of learning, so it is important to discover it quickly and apply it, especially when we are moving up the data science learning curve.
One more point: the environment a data scientist works in is very dynamic, with big changes happening all the time. Sometimes we need to “un-learn” in order to learn new things quickly, meaning we have to give up certain concepts that run deep in many of the things we have learnt before. We need to continuously question our deep-seated assumptions and concepts, and question their relevance.
Mentoring & Teaching
How do mentoring & teaching help in moving a data scientist from good to great? There are two ways I can think of.
Firstly, mentoring and teaching help strengthen the concepts a data scientist currently holds. Having a group of trainees or team members to mentor or teach helps in validating those concepts: whether they are useful, and under what circumstances they are not. I strongly believe that data scientists thrive in a collaborative environment, so even a seasoned data scientist can still learn a thing or two from others, since they bring in fresh perspectives.
Secondly, mentoring & teaching help in brushing up the communication skills of a data scientist. The mentoring sessions I hold let me practice my communication skills, as I have to find various ways to explain the same concept so that my mentee can understand it well. Making a concept easy to understand by finding different ways to communicate it is very similar to communicating insights to business users and helping them understand the insights provided.
The skills shared here are meant to help data scientists increase their value to the organization that hires them. These are soft skills (similar to those in the previous post) that are not easy to pick up from books or online but can be built up through experience, so start working on them NOW. :)
I hope this post has been useful to you. If it has, do share it. Have fun in your data science learning journey and do visit my other blog posts.