This post was originally published by Daniel Gutierrez at Inside Big Data
Welcome to insideBIGDATA’s “Heard on the Street” round-up column! In this brand new regular feature, we highlight thought-leadership commentaries from members of the big data ecosystem. Each edition covers the trends of the day with compelling perspectives that can provide important insights to give you a competitive advantage in the marketplace. We invite submissions with a focus on our favored technology topic areas: big data, data science, machine learning, AI and deep learning. Enjoy!
Fair Decisions Require Data Literacy. Commentary by: Scott Zoldi, Chief Analytics Officer, FICO.
“As the number of businesses that incorporate machine learning into their decision making continues to grow, so too does the importance of ensuring that everyone involved understands data literacy – the language behind digital decision making. Too often, this does not appear to be the case: when pursuing higher-quality insights, many businesses devote resources solely to improving artificial intelligence (AI) algorithms or analytic models rather than the data they feed those algorithms. As the data-literate among their leaders realize, no matter how innovative the algorithm or model, its results will only be as accurate as the data it receives. Instead of focusing on the model, the majority of a company’s time should be spent ensuring the data it feeds an algorithm is accurate, admissible and free of bias. Bias, after all, is rooted in data – until proven otherwise, all data should be treated as suspect. Business leaders need a strong AI and data governance policy in place, one that guarantees they know whether customers have, for example, invoked a right to be forgotten, or withdrawn their data from future models. Assuming the data on hand has been collected fairly, the company’s leaders must then be able to prove why their use of certain data fields and the algorithms processing them is acceptable if they want the insights generated to be accurate and bias-free. While analytic models often render binary decisions, the issues governing their proper use are anything but: they are nuanced, complex and must not be rushed. As an increasing number of companies recognize that digital decisioning is the gateway to digital transformation, their business leaders and data scientists also need to ensure they’re speaking the same language – data literacy.”
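The governance step described above can be made concrete as a pre-training data filter: records from customers who have invoked a right to be forgotten never reach the model. A minimal sketch, assuming hypothetical record fields and a hypothetical `forgotten_ids` set (this is illustrative, not FICO's actual tooling):

```python
# Hypothetical pre-training filter: drop records for customers who have
# invoked a right to be forgotten before the data ever reaches a model.
records = [
    {"customer_id": "c1", "income": 52000},
    {"customer_id": "c2", "income": 61000},
    {"customer_id": "c3", "income": 48000},
]
forgotten_ids = {"c2"}  # customers who have withdrawn their data

# Only admissible records are passed on to model training.
admissible = [r for r in records if r["customer_id"] not in forgotten_ids]
print([r["customer_id"] for r in admissible])  # ['c1', 'c3']
```

In a real pipeline this check would run as an auditable step in the data-governance layer, so the exclusion can be proven later.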
On Tesla’s AI Humanoid Robot. Commentary by: Dr. Ben Goertzel, who has spent 25 years working on AI, former Chief Scientist at Hanson Robotics and CEO of SingularityNET.
“Tesla’s announcement of an ambitious humanoid robot project — with aggressive timelines but apparently absolutely no actual technology behind it — has been greeted by actual robotics researchers with a combination of amusement, bafflement and scorn. One reason for this: Humanoid robotics — much like full Level 5 self-driving but even more so — is an application that turns out to be much harder than it seems at first. So many aspects of everyday human movement and perception and interaction that seem obvious and intuitive to us as humans are actually quite difficult problems from a computer science and robotics perspective. Another reason: It is well known in the robotics world that making amazing research robots that give cool demos only brings you a small portion of the way toward making robots that can robustly do useful things in the real world. But Musk didn’t even bother to make a smoke-and-mirrors robot demo like an actual robotics company, he just showed off a dancer in a robot suit! Those of us who have been working on robotics for years and decades understand that the humanoid-robotic functionalities Musk has casually promised are actually “holy grail” problems that are the subject of numerous global R&D projects, and whose achievement probably requires both serious hardware advances and significant progress toward AGI. No doubt Tesla could make serious progress in the humanoid robotics space, but the progress isn’t going to be nearly as fast nor the near-term deliverables as pretty as Musk has led his audience to believe. There is real work being done in the direction of humanoid service robots for home, office and hospital — for instance, in my own world, the Grace robot created by Awakening Health, a JV between Hanson Robotics and SingularityNET. It would be better to direct the world’s attention to these real humanoid robots and the challenges associated with them than to human robot impersonators.”
What Data Scientists Don’t Know About Mathematical Optimization. Commentary by Dr. Greg Glockner, VP and technical fellow at prescriptive analytics leader Gurobi.
“There’s a folklore out there that some business decisions are too complex for mathematical optimization. Instead, data scientists often rely on machine learning technologies alone. While these technologies can produce repeatable and reliable decisions based on previous data, they offer only a partial solution as data changes. Machine learning is a great tool, but it should not be a substitute for mathematical optimization. As the data relevant to an industry changes, data scientists must make quick and efficient changes to account for it. With mathematical optimization, data scientists don’t have to rewrite the entire model; instead, they can update the data values in the equations while preserving reliability and performance. For planning or any kind of recommended decision, optimization technology is the best tool available to a data scientist: it adapts to changing business needs and is proven for even the hardest business decisions.”
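The point about updating data rather than rewriting the model can be sketched in miniature. The toy below brute-forces a two-product production plan — it is not a real optimization solver, and every coefficient is invented — but it shows the separation the commentary describes: when business conditions shift, only the data dictionary changes, while the model (objective plus constraints) stays put:

```python
# Toy production-planning "model": objective and constraints are written once;
# changing business conditions means changing only the data, not the code.
def solve(data):
    """Brute-force search over integer production plans (toy scale only)."""
    best_plan, best_profit = None, float("-inf")
    for x in range(data["capacity_a"] + 1):
        for y in range(data["capacity_b"] + 1):
            if x + y > data["shared_hours"]:  # shared-resource constraint
                continue
            profit = data["profit_a"] * x + data["profit_b"] * y
            if profit > best_profit:
                best_plan, best_profit = (x, y), profit
    return best_plan, best_profit

data = {"profit_a": 3, "profit_b": 5,
        "capacity_a": 4, "capacity_b": 6, "shared_hours": 8}
print(solve(data))       # optimal plan under today's data

data["profit_b"] = 2     # market shifts: only the data changes
print(solve(data))       # same model, new optimal plan
```

A production system would hand the same model/data separation to a real solver; the design point is that the encoded business logic survives data changes untouched.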
Child Trafficking, AI and the Financial Sector. Commentary by: Mark Gazit, CEO, ThetaRay.
“Twenty-seven percent of human trafficking victims are children. Forced labor generates $150 billion in illegal profits annually. If it weren’t profitable, it wouldn’t exist. But how can we make it unprofitable? Believe it or not, banks can help put a dent in the child trafficking trade. Advanced AI technology now exists that can detect sophisticated money laundering schemes hidden within millions of innocent-looking transactions. Recently, a bank used such technology to detect a scheme of numerous cross-border banking transfers related to ‘medical tourism’ with no clear business activity. An investigation revealed that it was tied to child trafficking. If banks can proactively and continually stop illicit profits from getting through and alert authorities to suspicious accounts, some human traffickers will be apprehended, and others will decide that this activity is not worth the risk. It is crucial that banks keep the pressure on these criminals.”
Object Storage. Commentary by Giorgio Regni, CTO, Scality.
“The massive growth of the streaming services market has changed the game. The uptick in demand has had an impact on how streaming services approach storage and backup. It’s time to rethink what the right storage solution is and why it’s key to your backup and disaster recovery strategy. It makes sense, in light of a variety of business factors and the effect of the pandemic, that the hybrid cloud storage approach would become popular. It works well for media companies that want to maintain their primary data storage on-premises for reasons including performance, security and compliance. At the same time, the public cloud offers compelling services that can enhance this on-premises data. The public cloud can also be used for collaboration; for example, new assets can easily be uploaded by content producers to a public cloud. By using hybrid cloud services, companies retain greater control over their private data. A company can store sensitive data on a private cloud or local data center and at the same time make use of the robust computational resources of a managed public cloud. Object storage can give organizations more fine-grained authentication and security than a typical file storage solution. At the same time, it provides your employees with the kind of self-service backup and recovery they need in a remote world where speed is of the essence and efficiency is front and center. Companies have more opportunities than before when it comes to managing backup and recovery. More consumers are streaming, and more employees are working from home; companies need the ability to help employees protect their own data as well as the corporate data. With both speed and efficiency in its favor, object storage makes this possible.”
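The fine-grained access control point can be illustrated with a toy policy check. Object stores typically authorize each request per user, action, and object key (real systems express this with bucket and object policies); the users, prefixes, and helper function below are purely hypothetical:

```python
# Toy illustration of object-level access control: permissions are granted
# per user, per key prefix, and per action, rather than per file share.
policies = [
    {"user": "backup-agent", "prefix": "backups/", "actions": {"GET", "PUT"}},
    {"user": "analyst",      "prefix": "reports/", "actions": {"GET"}},
]

def is_allowed(user, action, key):
    """Return True if any policy grants this user this action on this key."""
    return any(
        p["user"] == user and key.startswith(p["prefix"]) and action in p["actions"]
        for p in policies
    )

print(is_allowed("backup-agent", "PUT", "backups/db-weekly.tar"))  # True
print(is_allowed("analyst", "PUT", "reports/q3.pdf"))              # False
```

Because every object operation passes through a check like this, a backup agent can be granted self-service write access to its own prefix without gaining access to anything else.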
Explainable Machine Learning Models. Commentary by: Alex Keller, Data Science Lead at Tillful.
“Explainable machine learning is designed for model designers, model consumers, and regulators as a way to approximate an underlying model’s internal behavior and understand how and why it arrived at its predictions. Preferably, these explanations are accessible, not overly technical, and integrated as a priority throughout the entire lifecycle of model creation — from inception, through deployment, and into the final reports and assessments that ultimately empower decision making. Such models are essential for creating more inclusive and robust methodologies that minimize biases in decisions with significant social or economic impact. In finance, for instance, they have the power to eliminate the biases that keep financial institutions from extending credit to minority business owners, who have historically been denied equal opportunity in the space. In addition, financial institutions can harness deeper patterns and higher-capacity models by establishing a principled approach to data collection, validation, model construction, and explainability. Historically, data scientists and statisticians designed low-capacity statistical models that fit linear relations; this suite of techniques — generalized additive models, decision trees, or linear regressions — comprises the glass-box methods. By construction, glass-box methods are fully interpretable and thus explainable for human comprehension. However, with the rise of larger, higher-dimensional data sets, higher-capacity non-linear models capture complex patterns and return state-of-the-art performance. As ML/AI becomes more common in everyday decision-making processes, it becomes paramount that companies deploy an ensemble of techniques to understand the underlying models’ behavior. To achieve explainability, we require techniques of interpretability as the foundational pillar. Thus, continued innovation and research in interpretability will be an integral component, along with research and innovation in robustness, fairness, and causal learning. Moreover, explainability is a collaborative effort that requires the intersection of many areas of study and policy development.”
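The glass-box idea can be seen in a self-contained example: a one-variable linear regression fit in closed form, where the learned slope and intercept are themselves the explanation, with no extra tooling required. The data is invented purely for illustration:

```python
# Glass-box model in miniature: one-variable least-squares linear regression.
# The fitted coefficients directly explain the prediction rule.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # illustrative feature values
ys = [3.0, 5.0, 7.0, 9.0, 11.0]  # illustrative targets (exactly y = 2x + 1)

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# The entire model is human-readable: each unit of x adds `slope` to the output.
print(f"prediction = {slope:.1f} * x + {intercept:.1f}")  # prediction = 2.0 * x + 1.0
```

A high-capacity non-linear model offers no such direct reading of its parameters, which is exactly why post-hoc interpretability techniques become necessary as capacity grows.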
“MacGyvering Data” and How Businesses Can Make the Most of the Data They Have, Regardless of Budget or Constraints. Commentary by Arun Kumar, EVP Data & Insights at Hero Digital.
“Throughout my career, I have seen data and insights inform the right actions that drive business growth. However, I have also worked with companies at many different stages in their data journeys. Not all companies have a robust set of data to use, so I have learned to get creative by ‘MacGyvering Data.’ In an ideal world, we would all like to have unfettered access to customer data that is accurate, timely and available in the desired formats ready for analysis and consumption. But the real world is far from that. Data is messy, incomplete and oftentimes missing, which is where we need to get creative. Our approach is to work with data our clients already have, and to build a test-and-learn approach around it: rapidly deploy insights from sparse data, understand early signals, and deploy the next set of actions. All this requires a nimble approach – the ability to listen quickly to signals from the data and the courage to act on those early signals. While we would all like to wait for perfect data, this approach leads to iterative improvements that, over time, produce significant business gains. Work with the data you have. Build on it. Learn from it. Act on it. And in that process, gather more of it.”
AI/ML in Streaming Analytics: How Businesses Can Make Better Decisions, Faster. Commentary by: Conor Twomey, Managing Director, North America, KX & FD Technologies.
“Modern companies are handling more data points than ever before. This newfound access to fast, big and varied data has transformed business; but to be truly data driven, organizations (and their leaders) must learn how to harness the power of that information to make meaningful in-the-moment decisions. This begins with a mindset and culture change. While the pandemic has certainly been an accelerator for data-driven enlightenment, recent research showed that 66% of companies say culture is a blocker to reducing time to value from their data and almost half (49%) struggle with a lack of people and skills. Investment in the right talent and tools is key, especially as 90% of respondents indicate they need to increase investment in real-time data analytics solutions in the near term. With real-time streaming analytics, businesses can bridge the gap between historic data – sitting either in legacy, on-premises systems or, more commonly, in the cloud – and real-time data. Recently, the introduction of artificial intelligence (AI), machine learning (ML) and data science has generated new momentum and opportunities in streaming analytics. A report by Gartner explained that ‘by the end of 2024, 75% of enterprises will shift from piloting to operationalizing AI, driving a 5X increase in streaming data and analytics infrastructures.’ Technologies that enhance continuous intelligence with AI/ML are already ahead of the curve – creating new possibilities to improve resiliency, reduce data silos and provide more meaningful insights that automatically react to new market conditions. And agile, data-first companies that are leveraging those tools have the much-needed ability to sort through the data, gain intelligent insights and solve more problems – allowing them to make faster, event-driven decisions.”
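The in-the-moment decisioning described above ultimately rests on incremental computation over a window of recent events rather than batch queries over history. A minimal stdlib sketch of one such streaming analytic — a sliding-window average — where the window size and stream values are made up for illustration:

```python
from collections import deque

# Toy streaming analytic: a rolling average over the last 5 events,
# recomputed incrementally as each new data point arrives.
window = deque(maxlen=5)  # deque(maxlen=N) discards the oldest item automatically

def ingest(value):
    """Add one event to the window and return the current rolling average."""
    window.append(value)
    return sum(window) / len(window)

stream = [10, 12, 11, 50, 13, 12, 11]  # note the spike at 50
for v in stream:
    avg = ingest(v)

print(round(avg, 2))  # rolling average over the last 5 events
```

Real streaming platforms generalize exactly this pattern — windowed aggregates maintained per event — which is what lets a decision fire the moment an anomaly like the spike enters the window.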
The Complexity of Moving Legacy Data Pipelines to Cloud. Commentary by: Mark Cusack, CTO of Yellowbrick.
“As cloud adoption matures, we’re starting to see new patterns of deployment. Enterprises are looking beyond single cloud targets and are planning and implementing multi-cloud strategies. Locking into one particular cloud vendor presents risk from an availability and cost perspective, and also means the enterprise can’t take advantage of the best-of-breed cloud services that each CSP offers. The true innovators are going one step further and starting to deploy applications and services at the network edge. We’ll see an explosion of growth in data at the edge over the next few years, with Gartner predicting that 50% of all data will be generated outside of public cloud data centers by 2023. The drivers of this growth will be connected IoT devices, the rollout of 5G networks, and the rise of location-specific services. To help address this growth at the edge, multi-clouds will expand in scope to become distributed clouds, in which the same public cloud infrastructure, services and APIs are deployed everywhere. Preserving a common infrastructure will simplify integrations from the edge to the center and will enable a uniform security and data governance approach. The analytical ecosystem will adapt to fit the distributed cloud model, deploying analytics at the point of need, based on data gravity, latency and sovereignty requirements. Companies need to start thinking about the challenges and opportunities that widely distributed data will bring in the future, and vendors will need to develop products that integrate and manage disparate data, analytics and services across distributed clouds.”
Life at the Edge: The Key to Successful AI. Commentary by: Bill Scudder, SVP and General Manager of AIoT Solutions, AspenTech.
“One of the biggest challenges when it comes to real-world applications of AI is inflexibility. Edge AI has unlimited potential use cases – solutions and applications vary from smartwatches to production lines, and from logistics to smart buildings and cities. In the case of manufacturing, bringing AI to the edge to monitor manufacturing processes enables operators to preemptively mitigate serious production issues and optimize performance. In situations where organizations are deploying hundreds or thousands of sensors, or could face millions of dollars in production losses within minutes, edge AI is critical.”
Business Builds Data-Literate Leaders. Commentary by: Merav Yuravlivker, CEO and Co-founder of Data Society.
“The first step to building data-literate leaders is to empower them to think about data strategically. Many executives and managers shy away from using data because they don’t feel comfortable with it and might not be familiar with the vocabulary and potential. By giving them the opportunity to learn how to leverage data tools, techniques, and strategies, they will apply them to develop a data-informed culture across their organization. We’ve seen our training solutions educate and equip leaders with the skills they need to succeed. And once leaders become data-informed, they create a shift in thinking that leads to more streamlined operations and agile teams.”