This post was originally published by Viv Penninti at Medium [AI]
Have you ever wondered how some people are just plain brilliant with numbers? How they can spot data issues in a jiffy, or glean an insight that's not apparent to you? In this new era of Artificial Intelligence, every Business Intelligence (BI) vendor, and for that matter most software vendors, claims to be using AI. In most cases this is just window dressing. But window dressing aside, given the complexity of BI insight discovery, can we automate this process with current AI technology?
Business Intelligence is a data analytics function that has more recently become synonymous with dashboarding. However, the more complex part of BI, the "why" and exploratory/discovery part, has its own language and associated "grammar" (semantic rules) that can vary across companies and industries, which complicates the development of a truly intelligent cross-industry BI system.
Limitations of Current Artificial Intelligence Technologies
The success and value created by AI over the last decade is undeniable. McKinsey estimates that AI will add over $13 trillion to the global economy by 2030 and that AI will eliminate 30+% of repetitive jobs over the next few decades. However, the fact that only 23% of companies are using AI in a "production" setting, with much of that related to repetitive task automation using Robotic Process Automation (RPA), is an early indicator that we're hitting the limits of what current AI technology can accomplish in the corporate world. The two main issues impeding AI progress are: (a) machine learning models, deep learning in particular, require a significant amount of labeled data (the ability of the human mind to infer from a limited set of observations would appear antithetical to this need), and (b) the deluge of weights that results from deep learning models produces a black box with no easy way to ascribe causality. Combining "pattern"-based AI with symbolic AI that can handle knowledge artifacts, rules, and causal reasoning may be a necessary requirement to emulate true human intelligence. At present no such approach has been published, so rest assured that under the current "pattern"-based AI paradigm we are quite far away from the destruction-of-humanity concerns (à la "Muskism").
BI Automation Complexity Examples
A smart BI tool, for example, would need to know who my competition is, right? Does that need to be trained too, or is that an inferred knowledge artifact? How does one train a system to learn about additive vs. non-additive measures? For example, we know that Revenue can be added across any attribute: time, geography, etc. (fully additive), but the same can't be said for Price/Unit, which is not additive across any dimension (non-additive). Is this something that needs to be trained? When is an index calculation useful, where an index is defined as a value divided by some base (e.g., a 2021 Price Index using the year 2000 price as the base)? Analysts understand that indexes are useful for measurements across non-additive dimensions, except the time dimension! Where do we get the data to train for this type of implicit knowledge? Could these be represented symbolically, as rules, instead?
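To make the symbolic-rules alternative concrete, here is a minimal sketch (all names and the metadata schema are hypothetical, not from any actual BI product) of how additivity and indexing rules could be encoded as declarative metadata rather than learned from data:

```python
# Hypothetical sketch: measure "grammar" captured as symbolic metadata.
# A rule engine can consult this instead of learning it from examples.

MEASURES = {
    "revenue":        {"additivity": "fully_additive"},  # can be summed across any dimension
    "price_per_unit": {"additivity": "non_additive"},    # cannot be summed across any dimension
}

def can_aggregate(measure: str, agg: str) -> bool:
    """Return True if applying this aggregation to the measure is semantically valid."""
    rule = MEASURES[measure]["additivity"]
    # Summation is only valid for fully additive measures; other
    # aggregations (avg, min, max) are allowed for either kind here.
    return agg != "sum" or rule == "fully_additive"

def price_index(price: float, base_price: float) -> float:
    """Index a non-additive measure against a base period (e.g., year 2000)."""
    return price / base_price

print(can_aggregate("revenue", "sum"))          # True
print(can_aggregate("price_per_unit", "sum"))   # False
print(price_index(120.0, 100.0))                # 1.2
```

A handful of such declarations can replace what would otherwise demand large volumes of labeled training examples, which is the core argument for a symbolic component.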
Continuing this thought, when we see two measures as shown in the table, is there an insight to be gleaned from this simple table? In this case, yes, there is a critical insight! A good analyst will note that the average monthly revenue for the current three months is 10MM (30/3) versus an average of 20MM (60/3) for the previous three months (since the previous three months' sales = 90 − 30 = 60MM). This indicates a significant (50%) reduction in revenues over the last three months, and assuming no seasonality and no other explanatory factors, a good analyst will "flag" this as a significant change. But how would a system learn to do this without understanding the relationship between the measures? How do we represent such relationships? Training again?
Today there is a massive proliferation of KPI ETL pipelines in companies to support dashboarding. Lost in this plumbing of SQL code used to compute the KPIs is the semantic context (knowledge) and other critical artifacts needed for intelligent BI automation, unless we develop machines that learn and interpret the SQL language too! The thought that hundreds of these types of rules and artifacts can be "learned" by an AI system seems like a fool's paradise, at least based on today's pattern-driven approach to AI learning.
A Neuro-Symbolic Approach to BI Automation
Our journey to automate BI started eight years ago, and based on our 30+ years of data analytics experience, we realized early on that modern pattern-based machine learning (ML) alone will not be able to address the challenge of BI automation. A hybrid symbolic AI and ML approach is shown in the diagram below:
The main components include: (a) Semantic Knowledge Repository, (b) Data/SQL Generation Engine, (c) Inference/Assertion Engine, and (d) a Text Realization Engine. The most complex engineering accomplishment, oddly, was the development of a robust and extensible Semantic Knowledge Repository (or "analytical semantic layer"), which serves as the knowledge and rule hub for all subsequent analytics and inferencing.
The need for a symbolic AI approach is supported by various new research efforts, which recognize the limitations of modern AI/deep learning in terms of large labeled-data needs, the inability to ascribe causality, and the lack of inferencing and reasoning. As stated by Lamb and Garcez: "many have identified the need for well-founded knowledge representation and reasoning to be integrated with deep learning and for sound explainability." Most current vendors provide BI analytics with predefined report-level templates and associated custom analytics code that is not scalable and severely limiting. Building a truly "no-code/low-code" intelligent BI platform requires an analytical semantic layer and an ML/assertion engine coupled with natural language generation (NLG).
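To illustrate how an assertion engine and NLG might fit together, here is a toy sketch (every name and template here is hypothetical; the article does not describe the actual implementation) in which a symbolic rule fires on a computed metric and a template "realizes" it as text:

```python
# Hypothetical assertion -> text-realization pipeline. A rule fires on a
# computed metric; a template renders the fired assertion as prose,
# instead of hand-coding a custom narrative per report.

def assert_significant_change(metric, pct_change, threshold=0.10):
    """Fire an assertion when a metric moves by more than the threshold."""
    if abs(pct_change) >= threshold:
        return {
            "metric": metric,
            "change": pct_change,
            "direction": "decline" if pct_change < 0 else "growth",
        }
    return None  # no assertion fired

def realize(assertion):
    """Toy NLG realization of a fired assertion."""
    return (f"{assertion['metric']} shows a significant "
            f"{abs(assertion['change']):.0%} {assertion['direction']}.")

fired = assert_significant_change("Monthly revenue", -0.5)
print(realize(fired))  # Monthly revenue shows a significant 50% decline.
```

The point of the separation is that the rule library and the realization templates can grow independently, which is what makes the approach "no-code/low-code" for the end user.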
Closing Thoughts & Conclusion
Current AI solutions are for the most part "weak" (i.e., they address a single task), and even the "strong" or more generalized AI solutions do not utilize institutional knowledge artifacts, defined rules-based inferences (aka expert systems), or a method to learn such rules and causality. Researchers continue to investigate causal neural networks, neuro-symbolic AI, Bayesian networks, and other approaches in search of technologies that better represent human intelligence. With the best and brightest working on these problems, we're sure to overcome these limitations in a few decades or less. For now, though, it is more likely that current AI technology could eliminate (literally and figuratively) the more structured data science function, and the data scientists who staff it, than the business intelligence function!
Reshaping Business with Artificial Intelligence, BCG and MIT Sloan Management Review
Artur d'Avila Garcez and Luís C. Lamb, "Neuro-Symbolic AI: The Third Wave", December 2020