The creation and management of AI-related services introduce a whole new set of headaches for CIOs. The biggest change reflects a shift in the kinds of infrastructure required to build AI applications, different development workflows and new data requirements. Today, CIOs should consider extending IT service management (ITSM) tools to support these new capabilities.
"Delivering AI-related infrastructure and processes is similar in many regards to delivering enterprise-wide services or IT for the IT function," said Craig Wright, a managing director with business transformation advisory firm Pace Harmon. Both AI and IT infrastructure rely on correlating data to either prevent or minimize the risk of widespread adverse impact to the broader, non-IT user community. Traditional ITSM functions include service definition, introduction, delivery and continuous improvement. AI service management also needs to consider new workflows, different kinds of infrastructure and better processes for versioning data and AI models.
Same goal, new tech
"The end goal of good AI infrastructure is similar to traditional IT," said Ken Zamkow, general manager for North America at Run:AI, an AI virtualization tools provider. This includes enabling users in the organization, such as data scientists and engineers, to work faster and more efficiently while keeping costs in check and maintaining control, visibility and security.
However, the underlying technologies are different because AI is based on long and repetitive experiments rather than pure coding; the data sets involved are significantly larger; and the compute needs are much greater and involve much more expensive hardware. "Getting all of this right is the key and the main challenge," Zamkow said.
Part of the problem lies in navigating the huge variety of workflows, tools, frameworks and platforms that are available for AI training and deployment. While there are some IT choices that are quickly gaining popularity, there aren't any well-developed, end-to-end industry standards to rely on. This means that the IT and infrastructure teams in each enterprise need to essentially stitch together a collection of offerings that meet their particular use cases.
This often also leads to growing complexity, costs and lack of visibility into how resources are utilized. Zamkow has found that one good strategy is to use AI tools that are easy to integrate with existing ITSM tools to help provide a smoother and more flexible architecture.
What's your AI quotient?
Wright believes that IT executives should consider new metrics, which he calls an AI quotient, to help evaluate the maturity of AI-related processes. This could be similar to how the capability maturity model integration is used to improve traditional software development processes.
Aspects of this could include measuring analytics capabilities, the ability to improve feedback on AI results and the ability to make use of emotional data from users of AI tools. For example, new metrics like emotional awareness could be quantified by interpreting customer and management satisfaction surveys and analyzing the emotional tone used in voice and chat messages. Ranges could be established, and an emotional index could be calculated to correlate with different roles impacted by the AI. Analytics capabilities could be quantified in terms of the time to release and the ability to provide value to different business groups. A feedback metric could reflect the ability to measure degradations in AI model accuracy and retrain the models with fresh data.
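The feedback metric described above can be illustrated with a minimal Python sketch. It is not Wright's method; the function name, the 0.05 tolerance and the sample accuracy figures are all hypothetical, chosen only to show how a drop in model accuracy might be turned into a retraining signal.

```python
from statistics import mean


def needs_retraining(baseline_accuracy, recent_accuracies, tolerance=0.05):
    """Return True when the mean of recent accuracy measurements has
    fallen more than `tolerance` below the baseline.

    All names and the default tolerance are illustrative assumptions.
    """
    if not recent_accuracies:
        raise ValueError("recent_accuracies must not be empty")
    return baseline_accuracy - mean(recent_accuracies) > tolerance


# Hypothetical figures: accuracy at release versus three later checks.
print(needs_retraining(0.92, [0.90, 0.85, 0.84]))
```

In practice, a flag like this might open a change request in the ITSM tool rather than trigger retraining directly, so that governance stays in the loop.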
Mike Leone, senior analyst at Enterprise Strategy Group (ESG), an IT market research firm, sees three main categories of challenges for managing AI services: infrastructure, people and data.
Enterprises struggle with new types of AI application development environments, data science tools and integrated data or analytics platforms, in addition to the operational expenses that go into managing and maintaining everything. On the people side, ESG research found that over 20% of organizations cite a lack of experienced or trained staff as their top challenge related to AI.
Every enterprise needs to come up with the appropriate infrastructure to meet the specific AI use cases within the organization, Leone said. For example, hardware matters less if you're utilizing public cloud services. If you're a data scientist being tasked with creating a model, you'll likely focus purely on the data science tools.
On the data side, enterprises struggle with insufficient data quality for AI. "Data quality and data workflow management is interesting in that it is often overlooked but ends up being the most essential component in all of this," Leone said. Without high-quality data, much of the work, cost and time could be wasted. And without efficient workflows across the different stages of the AI lifecycle and the different tools and components, the time to value will explode.
Gap in AI stacks
Leone recommends that IT executives find ways to simply extend the ITSM tools used today to manage new types of workloads. For example, infrastructure metrics are the same in terms of latency, IOPS, throughput and utilization. With AI, it's about marrying those metrics across infrastructure and software components to ensure they're in lockstep.
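One way to picture "marrying" those metrics is to collect the same four measurements for every component and look for parts of the stack that are out of step with each other. The Python sketch below is only an illustration under stated assumptions: the `ComponentMetrics` record, the `find_imbalances` function, the 0.4 utilization gap and the sample numbers are all hypothetical and do not come from Leone or from any particular ITSM product.

```python
from dataclasses import dataclass


@dataclass
class ComponentMetrics:
    """The four metrics named in the article, for one component (hypothetical schema)."""
    name: str
    latency_ms: float
    iops: float
    throughput_mbps: float
    utilization: float  # fraction between 0.0 and 1.0


def find_imbalances(components, gap=0.4):
    """Return (busy, idle) name pairs whose utilization differs by more than `gap`."""
    pairs = []
    for busy in components:
        for idle in components:
            if busy.utilization - idle.utilization > gap:
                pairs.append((busy.name, idle.name))
    return pairs


# Hypothetical snapshot: saturated storage while the GPUs sit mostly idle.
snapshot = [
    ComponentMetrics("storage", latency_ms=12.0, iops=80000, throughput_mbps=900, utilization=0.95),
    ComponentMetrics("gpu", latency_ms=1.5, iops=0, throughput_mbps=0, utilization=0.35),
    ComponentMetrics("network", latency_ms=0.8, iops=0, throughput_mbps=400, utilization=0.60),
]
print(find_imbalances(snapshot))
```

A pair such as storage and GPU suggests that expensive compute is waiting on data, which is the kind of cross-component signal a single-layer dashboard would miss.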
"You're unlikely to find a complete AI solution with comprehensive management today because it's composed of multiple components from different vendors," Leone said. As a result, enterprises are almost forced to pick the interface that can manage as much of the AI stack as possible.
Leone sees a gap right now in the market to properly manage and orchestrate the AI stack in a comprehensive way. This is in part because AI developers are rapidly discovering new workflows, which has led to an explosion in new components and vendors across the AI stack and data pipeline. ESG research found that the average enterprise was working with 37 different hardware and software vendors across the entire AI and machine learning data pipeline.
"The need to consolidate management silos across all those vendors is a massive opportunity for vendors and a massive challenge for businesses with AI initiatives underway," Leone said.
Extending existing ITSM tools
ITSM self-service and fulfillment processes are important ways for data science teams to obtain the AI infrastructure their projects need, said Vesna Soraic, global product marketing team manager of service management and fulfillment at Micro Focus, an IT operations management platform.
Change management is another ITSM discipline that matters for AI. For example, newly trained AI models should be deployed into production with the needed speed and agility but also with proper governance processes.
Asset management is a traditional ITSM function in which hardware and software assets, such as licenses and subscriptions, are managed. Ramprakash Ramamoorthy, product manager at Zoho Labs, a division of Zoho Corporation, parent company of ITSM platform ManageEngine, said that similar techniques could also be applied to data for AI. In this case, the ITSM tools would need capabilities to version data sets and manage access controls for them.
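As a rough illustration of treating a data set as a managed asset, the Python sketch below versions data by content hash and checks access by role. It is an assumption-laden sketch, not a description of ManageEngine or any other product: the `DatasetAsset` class, its fields and the role names are hypothetical, and only `hashlib.sha256` is a real standard library call.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class DatasetAsset:
    """A data set tracked like any other ITSM asset (hypothetical model)."""
    name: str
    allowed_roles: set = field(default_factory=set)
    versions: list = field(default_factory=list)  # SHA-256 digests, oldest first

    def register(self, content: bytes) -> str:
        """Record a version identified by the SHA-256 digest of its content."""
        digest = hashlib.sha256(content).hexdigest()
        if digest not in self.versions:
            self.versions.append(digest)
        return digest

    def can_read(self, role: str) -> bool:
        """Simple role-based access check."""
        return role in self.allowed_roles


# Hypothetical usage: two distinct snapshots of a training set.
asset = DatasetAsset("churn-training-data", allowed_roles={"data-scientist"})
first = asset.register(b"customer_id,churned\n1,0\n")
second = asset.register(b"customer_id,churned\n1,0\n2,1\n")
print(len(asset.versions), asset.can_read("data-scientist"), asset.can_read("intern"))
```

Hashing the content means an unchanged data set never produces a new version, while any change, however small, does. A model can then be tied to the exact digest it was trained on.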
Faster pace of change
Both the hardware and software are evolving quickly, which could lead to more complex ITSM change processes. Arvind Ganga, AI lead at TOPdesk, an ITSM software and services provider, recommends CIOs reevaluate their existing change process or create a different change process specifically for AI infrastructure and processes.
In general, managing AI-related infrastructure and processes is like traditional ITSM, in the sense that traditional ITSM has already dealt with the same issues. "However, everything will become more complex because more tools, infrastructure components, parties, data sets, integrations and regulations are in place, and because of the frequent updates," Ganga said. As a result, agile processes will become more important in ITSM because they make it possible to deal with quick changes.
Traditional ITSM was, and is, good for traditional infrastructure, but AI processes aim to be agile, like modern software development techniques that use continuous integration and continuous deployment, said Micro Focus' Soraic. She expects to see more enterprises adopting container-based AI infrastructure and processes.