At the start of last year, I wrote an opinion piece on what I thought the trends for Artificial Intelligence could be. Then the Covid-19 Pandemic hit the whole world and consumed its attention, so I wrote another piece on how the Pandemic might impact the Data Industry. When those posts were shared, I received a lot of feedback and opinions from others that sharpened and validated what I understand about the industry, and thus I am writing down what I see as the trends moving forward.
In previous years, many data scientists talked about how to integrate DevOps into machine learning projects. There was no clear distinction between laying data pipelines and implementing machine learning in projects; these two areas were just part of the bigger "DevOps".
In recent months, I have commonly heard the terms "DataOps" and "MLOps" mentioned among the community and like-minded friends. According to Wikipedia, here are the definitions of the two.
"DataOps is an automated, process-oriented methodology, used by analytic and data teams, to improve the quality and reduce the cycle time of data analytics. While DataOps began as a set of best practices, it has now matured to become a new and independent approach to data analytics. DataOps applies to the entire data lifecycle from data preparation to reporting and recognizes the interconnected nature of the data analytics team and information technology operations." ~ Wikipedia
"MLOps, a subset of ModelOps is a practice for collaboration and communication between data scientists and operations professionals to help manage production ML (or deep learning) lifecycle" ~Wikipedia
If you check out the edit history of these two terms, it is no surprise that DataOps was "coined" first, in early 2017, whereas MLOps was "coined" in late 2018. This suggests that the emphasis fell on Data first, followed by Machine Learning.
However, what I found to be more food for thought is that even though each term has more than two years of history, they only gained prominence and mindshare in recent months. Their frequent appearance shows that the industry has matured further: mature companies are looking at getting or creating better tools to engineer and implement these two "Ops", else there would be no need for the distinction and mention. It also shows that the knowledge and skills gap between the mature companies and the beginners, companies that have just started on their journey, is wider now, and that the data industry is still developing.
If you are a company that has just started on the data journey, just so you know, there is more to catch up on, depending on your use cases. :)
Use Cases: Simulation & Computation
Last year, we saw something very interesting: Reinforcement Learning was used to simulate an economy and learn to set the "optimal" tax policies. Check out the video from "Two Minute Papers".
This would not have happened without the rise in computation power through Cloud Computing, and further development in Reinforcement Learning to simulate the rationality of human beings in an economy. It also reminded me of a manufacturing conference that I gave a talk at two years ago. At that conference, there was mention of the term "digital twin", where a factory has a physical floor and a copy of it in a simulation. It was still a new term back then, and the simulation was used to determine the optimal arrangement of the machines to manufacture products at the lowest cost, in terms of time and money.
I believe that simulation will be the next frontier of usage for data and machine learning, and we will see more use cases of it.
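To make the "digital twin" idea a little more concrete, here is a minimal sketch in Python of searching for a machine arrangement in a simulated floor. Everything in it is my own assumption for illustration: the slot coordinates, the machine names, the hourly part flows, and the cost measure (parts moved multiplied by walking distance) are all hypothetical, and a real digital twin would model far more, such as queues, breakdowns, and staffing.

```python
from itertools import permutations

# Hypothetical floor slots as (x, y) coordinates in metres.
SLOTS = [(0, 0), (5, 0), (10, 0), (15, 0)]

# Hypothetical machines and parts moved per hour between machine pairs.
MACHINES = ["cutter", "press", "welder", "painter"]
FLOW = {
    ("cutter", "press"): 30,
    ("press", "welder"): 30,
    ("welder", "painter"): 20,
    ("cutter", "welder"): 5,
}


def transport_cost(layout):
    """Return total part-metres per hour for a layout (machine -> slot)."""
    total = 0
    for (a, b), parts in FLOW.items():
        (xa, ya), (xb, yb) = layout[a], layout[b]
        # Manhattan distance between the two machines' slots.
        total += parts * (abs(xa - xb) + abs(ya - yb))
    return total


def best_layout():
    """Try every assignment of machines to slots and keep the cheapest."""
    candidates = (dict(zip(MACHINES, p)) for p in permutations(SLOTS))
    return min(candidates, key=transport_cost)


if __name__ == "__main__":
    layout = best_layout()
    print(layout, transport_cost(layout))
```

Trying every arrangement only works here because four machines give just 24 layouts. The number of layouts grows factorially with the number of machines, which is why more computation power and smarter search methods matter for realistic floors.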
Talking about higher computation power, we have to talk about AlphaFold, which, as you can tell from the name, is a product of Google's DeepMind. Using Machine Learning and high computation power, we can proceed to solve, or at least better understand, protein folding. Why is it important? Check out the video by Prof Sabine. Her YouTube channel is my new favorite for understanding the World of Science.
As we continue on this journey of Machine Learning combined with higher computation power, what seemed impossible previously because of the large solution space may become feasible. Together with Simulation, we should see more similar use cases as the years go by.
Business: More Automation & Movement to Cloud
Currently, we have had a lot of success in creating AI applications that fit the definition of "Weak AI" or "Artificial Narrow Intelligence". They can only replace tasks, not jobs, at the moment, so I would say the label "Smarter Automation" fits the current AI technology better. We should see more applications of automation as we go along. With the Pandemic around for a while more, it should accelerate the development of automation in workplaces, both to increase productivity and to reduce human-to-human physical interaction.
The movement towards cloud infrastructure or "X"-as-a-Service should continue and has gained a lot more momentum with the Pandemic. Most businesses understand that it is a form of outsourcing and that their IT infrastructure will be heavily reliant on the service level of the Cloud provider, but the benefits outweigh the costs and risks by a lot. So no surprises there; it is just a matter of the adoption rate.
With the larger push for Cloud services, I believe talents who know how to architect on Cloud Platforms will be in higher demand. The more Cloud Platforms they can architect on, the more in demand they will be. Yes, there will still be a need for a team to maintain the Cloud services, but it will be for the advanced features, because the common features will mature to the point that one does not need in-depth technical knowledge to maintain them.
Data Literacy & AI Explainability
I shared this in my previous two opinion posts, and I believe the trend is again further accelerated by the Pandemic: the general population will have a better understanding of what data can or cannot do, its potential, and its limitations.
As the population's data literacy increases, the onus is on businesses to explain how decisions are made in order to gain trust from their existing and potential customers, which is essential given the rise in fake news and false information. The need for transparency, and for an action plan if something goes wrong, is more important than ever. Look no further than how the Singapore government reacted with regard to contact tracing data.
In case you are wondering, I think the government recovered well. The episode did expose the need for better coordination, but as long as we learn from the lessons, as part of the electorate, I am fine with it. :)
My opinion is that we will not be out of the Pandemic anytime soon, even with vaccines developed, because there are still many factors to consider, like the effectiveness of the vaccines (do we need booster shots?), the vaccination rate, and the proportion of the world population being vaccinated. Thus we can count on the Pandemic to push the above-mentioned trends further for a while more.
These are based on my current observations and readings, and of course, I extrapolated from them to see what may happen in 2021 and beyond. Readers might have different opinions and feedback, and I would love to hear your thoughts. :)
Thanks for taking the time to read this far. Please feel free to connect with me on LinkedIn or Twitter (@PSkoo). Do consider signing up for my newsletter too. Let us all emerge stronger from the Covid-19 Pandemic!