If you think AI and analytics are better apart, you may be missing out on some valuable opportunities. I was shocked recently while gathering background information for a short e-book I co-authored with Ellen Friedman, AI and Analytics at Scale: Lessons from Real-World Production Systems, to find that some people still think large-scale analytics projects and AI projects should be siloed. In particular, these people think of AI systems as very expensive, very specialized systems that must be completely isolated from analytics systems. My take is just the opposite.
For years we’ve observed real-world enterprises across many sectors that benefit by co-locating analytics – even legacy analytics – together with modern AI and machine learning projects. Furthermore, if you can’t put analytics and AI together, it’s a signal that you don’t have a scale-efficient system, and that you have a substantial amount of avoidable technical debt.
Why AI and analytics should be together
Putting AI and analytics together on a shared system brings many advantages, which include:
Shared resource optimization: Shared systems are typically more cost-effective because they allow higher utilization. Sharing also makes system administration more efficient and provides a uniform security framework, which in turn reduces the burden on IT teams and improves compliance with data security standards.
Data sharing: Shared systems also minimize siloed data. This is important because AI is much more effective if you have training data that shows all sides of an issue.
Second-project advantage: AI and machine learning projects have high potential value, but they can be highly speculative. Leveraging existing data sets and resources means that these projects can be tested more quickly and at lower cost for failed ideas. This approach makes you more likely to find the big winners, resulting in a substantial second-project advantage.
Improved collaboration: To be successful in practical business settings, AI requires expert domain knowledge, access to the right data, and a way to put results returned by models into action to address valuable business goals. A shared system built on a unifying data infrastructure together with an existing framework for taking action can give you an end-to-end pipeline for data engineers, analysts, AI experts, and business leaders that encourages valuable collaboration and makes it easier to bring AI projects into production.
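The utilization argument behind shared resource optimization can be sketched with a little arithmetic. The hourly demand figures below are entirely hypothetical and chosen only so that the two workloads peak at different times of day. They are not measurements from any customer system.

```python
# Hypothetical hourly demand (in nodes) for two workloads over one day.
# Assumption: analytics peaks in business hours, AI training peaks overnight.
analytics_demand = [2] * 8 + [10] * 10 + [2] * 6   # 24 hourly values
ai_demand = [12] * 6 + [3] * 12 + [12] * 6          # 24 hourly values


def utilization(demand, capacity):
    """Average fraction of provisioned capacity that is actually used."""
    return sum(demand) / (capacity * len(demand))


# Siloed: each cluster must be sized for its own peak.
siloed_capacity = max(analytics_demand) + max(ai_demand)

# Shared: one cluster sized for the peak of the combined load.
combined_demand = [a + b for a, b in zip(analytics_demand, ai_demand)]
shared_capacity = max(combined_demand)

siloed_util = utilization(combined_demand, siloed_capacity)
shared_util = utilization(combined_demand, shared_capacity)

print(f"siloed: {siloed_capacity} nodes, utilization {siloed_util:.0%}")
print(f"shared: {shared_capacity} nodes, utilization {shared_util:.0%}")
```

The same work gets done in both cases. The shared system simply needs less provisioned capacity whenever the peaks do not coincide, which is where the higher utilization comes from.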
What you need to run AI and analytics together
The advantages listed above are appealing, but what’s necessary to have a system that can support AI and analytics together? You need to develop a comprehensive data strategy, and you need scalable data infrastructure designed specifically to support scale-efficient systems.
To do this, the data infrastructure should have these characteristics:
Lessons from real-world use cases: AI and analytics together
Following are three use cases where combining AI and analytics paid off. Each of these customers used HPE Ezmeral Data Fabric as a software-defined and hardware-agnostic unifying data layer.
• Second-project advantage pays off for a major media company
We have a customer who, as a large media company, had two problems. First, they needed to refresh their system for producing business analytics. Second, they needed better viewing audience predictions to improve their advertising business. They felt that machine learning might be the key to improving these audience predictions.
The easy start was to augment and ultimately replace their data warehouse with the ability to work on larger and more granular data. To do this, they made use of the data fabric, thus improving scalability and substantially reducing the cost of producing business reports.
The next step was to use the data from these analytics systems to build initial versions of AI-based audience prediction, again on the data fabric. These initial systems were very basic, but still had better accuracy relative to older systems, largely because they could use more data and thus could account for more of the factors that affect viewership, such as weather, short- and long-term seasonality, and competing events.
The success of these systems and the rapid ROI convinced management that it was worth investing in a data science team to build more sophisticated AI prediction systems. At this point, a second round of coattail effects from running AI and analytics together began: the data science team could field many candidate predictors, ranging from incremental updates to radical new approaches. The result has been substantial further improvements over the first viewership models, and the team is looking into other opportunities to improve the business through AI.
• Lunchroom collaboration made millions for large retailer
A data engineer and a web product manager walked into a lunchroom one day. This isn’t the beginning of a Silicon Valley joke; it is how a new product feature got started.
The product manager was lamenting that he could get his team to build a price-matching feature if only he could get web crawl data. But he couldn’t possibly get the budget to scrape the web for that data without a solid business case. The data engineer spoke up, described a web crawl that had already been done, and pointed out that the resulting data was already on their shared data infrastructure. As a result, they prototyped the feature that afternoon and deployed it to production not long after.
The moral is that sharing a single data infrastructure makes collaboration as easy as eating lunch.
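To give a sense of why an afternoon prototype is plausible once both data sets are in one place, here is a minimal sketch of a price-matching lookup. The SKUs, competitor names, prices, and the `price_match_candidates` helper are all hypothetical. The retailer's actual implementation is not described in the story.

```python
# Hypothetical data: our own catalog prices and rows from an existing web crawl.
catalog = {"sku-1": 19.99, "sku-2": 5.49, "sku-3": 102.00}
crawl = [
    ("sku-1", "competitor-a", 18.49),
    ("sku-1", "competitor-b", 21.00),
    ("sku-2", "competitor-b", 5.99),
    ("sku-3", "competitor-a", 99.00),
]


def price_match_candidates(catalog, crawl):
    """Return {sku: lowest competitor price} where a competitor undercuts us."""
    lowest = {}
    for sku, _site, price in crawl:
        # Ignore crawled products we do not sell; keep the cheapest offer seen.
        if sku in catalog and price < lowest.get(sku, float("inf")):
            lowest[sku] = price
    return {sku: price for sku, price in lowest.items() if price < catalog[sku]}


candidates = price_match_candidates(catalog, crawl)
```

The logic itself is trivial. The hard part in practice is having the crawl data and the catalog reachable from the same system, which is the point of the story.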
• Containerization sped up development for an AI services company
Another customer used containers and shared data infrastructure to simplify delivery of AI systems together with analytics applications.
There were several interesting consequences. First, because data could be shared between legacy applications and their containerized replacements, containerization could proceed one application at a time and could be substantially less invasive to the code. Second, because containers can be rebuilt in a repeatable way with explicit dependencies, the security team could automatically scan containers to find risky dependencies, and the QA team could rebuild all containers in a safe environment, thus controlling exactly which bits got into production code.
The net effect was higher levels of DevOps (or even what is now called MLOps) automation. That automation means that new applications can be deployed much more quickly, which is a big win in dynamic situations.
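The automated dependency scan mentioned above can be illustrated with a small sketch. It assumes each container build lists its dependencies as explicit `name==version` pins and that the security team maintains a deny list of known-risky versions. The package names, the manifest format, and both helper functions are hypothetical, not a description of this customer's tooling.

```python
# Hypothetical deny list of (package, version) pairs flagged by a security team.
DENYLIST = {("examplelib", "1.0.2"), ("oldcrypto", "0.9")}


def parse_pins(manifest_text):
    """Parse 'name==version' lines, rejecting anything that is not pinned."""
    pins = []
    for line in manifest_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if "==" not in line:
            raise ValueError(f"dependency is not pinned: {line!r}")
        name, version = line.split("==", 1)
        pins.append((name.strip().lower(), version.strip()))
    return pins


def risky_dependencies(pins, denylist=DENYLIST):
    """Return the pinned dependencies that appear on the deny list."""
    return [pin for pin in pins if pin in denylist]


manifest = """
examplelib==1.0.2   # flagged version
safelib==2.3.1
"""
findings = risky_dependencies(parse_pins(manifest))
```

Rejecting unpinned dependencies is a deliberate choice in this sketch: a scan is only meaningful if the rebuild is repeatable, and that depends on the dependencies being explicit.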
Unifying data infrastructure: HPE Ezmeral Data Fabric
The key advantages of having AI and analytics projects together, as demonstrated by these three real-world use cases, depend on using data infrastructure specifically engineered for building a scale-efficient system.
Download the free e-book in PDF to read over a dozen additional real-world use cases that show the competitive advantage of scale efficiency: AI and Analytics at Scale: Lessons from Real-World Production Systems.
About Ted Dunning
Ted Dunning is chief technology officer for Data Fabric at Hewlett Packard Enterprise. He has a Ph.D. in computer science and is an author of over 10 books focused on data sciences. He has over 25 patents in advanced computing and plays the mandolin and guitar, both poorly.
Copyright © 2021 IDG Communications, Inc.