Data and Trending Technologies: Artificial Intelligence Is All About Data

Artificial Intelligence Is All About Data! It’s Real (And Not Fake)

NewVantage Partners’ annual Big Data Executive Survey reveals that 88.5% of top executives surveyed rank Artificial Intelligence (AI) first among all new capabilities that will have a disruptive impact on their companies, calling it the “most disruptive” new capability over the next decade.

Additionally, Google’s CEO Sundar Pichai recently said AI is “one of the most important things that humanity is working on. It’s more profound than electricity or fire.”  It is fairly well documented that large tech companies like Alphabet (Google), Apple, Facebook, Microsoft, Amazon, Alibaba, etc. are investing heavily in Artificial Intelligence and many analysts argue that they have a distinct advantage in AI over other companies.

Why do they have this advantage? AI needs data to recognize patterns; these companies deal with large sets of data and they have learned to organize and manage data to feed AI. In essence, AI is about algorithms and data.



For this column of my ongoing series on ‘Data and Trending Technologies’, my focus will be on Artificial Intelligence and how data plays an outsized role in AI’s impact.

In a recent article, the chain of thought on AI was described succinctly as follows:

  1. Graphics and tensor processors are eating linear algebra.
  2. Linear algebra is eating deep learning.
  3. Deep learning is eating machine learning.
  4. Machine learning is eating artificial intelligence.
  5. Artificial intelligence is eating software.
  6. Software is eating the world.

Even though this is somewhat tongue in cheek, the chain of thought explains the AI stack very clearly. In essence, AI is about consuming and learning from data in a systematic way.

First, let us look at some of the early successes of AI to understand the role of data. Apple (with its intelligent Siri), Amazon (with its ever-improving Alexa), and Facebook and Google (with their image recognition algorithms) all deal with voluminous amounts of speech, voice, and image data. It is no accident that applications like speech, voice, and image recognition, and companies like Google, Amazon, and Apple, are leading in AI. These companies have access to large sets of data, and these applications rely on that data to make quicker, better-informed decisions.

It is said that data is the ‘new oil’ or ‘new coal’ that will drive our digital economies going forward in the same way that coal and oil drove industrial economies. Whether that is true or not, we know that AI needs data to do the following things: (1) discover from data and automate repetitive learning; (2) dig deep into multiple layers of data using neural network algorithms; (3) turn algorithms into progressively learning algorithms by discovering hidden structures, patterns, and irregularities in data; and (4) add intelligence to existing applications (like Siri or Alexa) with the patterns learned from data.

I have read many articles about how AI can help with data management. Here are a few if you are interested. But let’s flip that and ask the question “How can data management help AI?” As I mentioned above, data (and algorithms) are the keys to the success of AI in applications. If we agree with that premise, the next logical question should be “How can managing the data help AI even further?” Here are my thoughts, based on my experience so far in related consulting engagements.

Data management as a discipline involves accessing data from various sources, ensuring adequate levels of data quality, integrating with various applications and systems as necessary, and storing and processing data to serve business needs. Using a few examples from our consulting engagements, I’d like to show how various components of data management influenced AI applications.

In one of our engagements involving AI, the team initially discarded one of the archived systems (and the data stored with it) and failed to recognize it as a source for the AI application. When we re-ran the application after including the data from the archived system, the results were fairly dramatic: the AI application now identified seasonal patterns in the historical data that had been missing from the original run. The ‘data access/data sources’ component of data management made a meaningful difference to the AI application.
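To make the point about archived data concrete, here is a minimal sketch in Python. It is not the application from the engagement; the monthly sales figures are synthetic, and the names `autocorrelation` and `synthetic_monthly_sales` are hypothetical helpers invented for illustration. It assumes a simple 12-month cycle and shows that a yearly pattern cannot even be measured from a single year of live data, but shows up clearly once older history is included.

```python
import math


def autocorrelation(series, lag):
    """Return the autocorrelation at `lag`, or None if the series is too short."""
    n = len(series)
    if n <= lag + 1:
        return None  # not enough history to compare points `lag` steps apart
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        return None
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


def synthetic_monthly_sales(months):
    """Hypothetical sales: a flat baseline plus a yearly (12-month) cycle."""
    return [100 + 20 * math.sin(2 * math.pi * m / 12) for m in range(months)]


# Assumption: 36 months sit in the archived system, 12 months in the live one.
archived_plus_current = synthetic_monthly_sales(48)
current_only = archived_plus_current[-12:]

seasonal_current = autocorrelation(current_only, 12)        # None: too short
seasonal_full = autocorrelation(archived_plus_current, 12)  # strongly positive

print("live data only:", seasonal_current)
print("with archive:  ", seasonal_full)
```

With only the live system's twelve months, there is no pair of observations a year apart, so the check returns `None`. Adding the archived months gives the check something to compare, and the yearly cycle becomes visible.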

In this second example, related to data quality, I might be stepping on some sacred tenets of data management. In this instance, we initially fed another AI application with ‘clean’ data, as a true data practitioner would do. Out of curiosity, we made another run, this time feeding the AI application the original (unclean or dirty) data. Lo and behold, the ‘unclean’ data held a lot of insights that the AI application could learn from, and it predicted more meaningful and realistic outcomes than the run that was fed ‘clean’ data. What we learned is that the business unit was storing meaningful information in many free-form fields, which we had ‘cleaned’ as per data quality rules. I am sure I am going to get flak for suggesting that ‘unclean’ data is somehow superior to clean data in this case. I am not sure whether the ‘data quality’ exercise itself was improperly done, but my point is that the ‘data quality’ component of data management has a material impact on AI applications.
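The mechanism is easy to reproduce on a toy scale. The sketch below is an illustration only, not the rule set or data from the engagement: the records, the `ALLOWED_NOTES` rule, and the helpers `clean` and `late_rate_by_flag` are all hypothetical. It assumes a data quality rule that blanks out any free-form note it does not recognise, and then checks whether a keyword in the note still separates late deliveries from on-time ones.

```python
# Hypothetical data quality rule: only these note values survive cleaning.
ALLOWED_NOTES = {"", "n/a"}


def clean(record):
    """Blank out any free-form note that the rule does not recognise."""
    note = record["note"].strip().lower()
    return {**record, "note": note if note in ALLOWED_NOTES else ""}


def late_rate(rows):
    """Share of late deliveries in `rows`, or None when there are no rows."""
    if not rows:
        return None
    return sum(r["late"] for r in rows) / len(rows)


def late_rate_by_flag(records, keyword):
    """Late rate for records whose note mentions `keyword`, and for the rest."""
    flagged = [r for r in records if keyword in r["note"].lower()]
    others = [r for r in records if keyword not in r["note"].lower()]
    return late_rate(flagged), late_rate(others)


# Synthetic records: `late` is 1 when the delivery was late.
raw = [
    {"note": "Customer asked for RUSH shipping", "late": 1},
    {"note": "rush order, partial stock", "late": 1},
    {"note": "N/A", "late": 0},
    {"note": "", "late": 0},
    {"note": "gift wrap", "late": 0},
    {"note": "rush", "late": 0},
]
cleaned = [clean(r) for r in raw]

raw_rates = late_rate_by_flag(raw, "rush")          # keyword separates outcomes
cleaned_rates = late_rate_by_flag(cleaned, "rush")  # keyword no longer present

print("raw data:    ", raw_rates)
print("cleaned data:", cleaned_rates)
```

In the raw records, notes mentioning "rush" carry a much higher late rate than the rest. After cleaning, no note mentions the keyword at all, so a model trained on the cleaned records has nothing to learn from in that field.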

In an excellent article, Paramita Ghosh identified some use cases for AI in healthcare, construction, human resources, and industrial operations. In all these industries, data management is crucial and should be vital to their success (if it’s not already), and I believe that the success of AI in these industries is also reliant on data management. Now and then, I come across data practitioners concerned about the impact of AI on their practice, and I disagree with their premise that AI might somehow negatively impact their area of expertise. Far from it: I am excited about the rise of AI and how I can contribute to its success with my focus on data management and data analytics. Why? Whichever way you spin it, data is at the center of many of the latest technologies, including AI. All I can say is “Bring it on!” I hope you will too.




About Ramesh Dontha

Ramesh Dontha is Managing Partner at Digital Transformation Pro (www.DigitalTransformationPro.Com), a management consulting company focusing on Data Strategy, Data Governance, Data Quality, and related Data Management practices. For more than 15 years, Ramesh has put together successful strategies and implementation plans to meet or exceed business objectives and deliver business value. His personal passion is to demystify the intricacies of data governance and data management and make them applicable to business strategies and objectives. Ramesh can be reached either on LinkedIn or via email: rkdontha AT