Several research surveys report that there is great interest in using AI for mainframe application modernization, but we are far from AI taking autonomous actions and executing tasks associated with modernizing decades-old code and associated dependencies.
For all the excitement around AI’s role in mainframe modernization, most Fortune 500 organizations remain solidly in the experimental stage. Surveys across large corporations show an eagerness to leverage AI to improve productivity and extend the life of existing systems—but real, autonomous application modernization on IBM® z/OS® is far from reality. The constraints come in both human and technical form, and the trust gap between developers and AI-assisted code generation in mainframe environments remains wide.
Much of today’s AI use on the mainframe is exploratory. The idea of an artificial intelligence code assistant, or AICA, autonomously navigating workflows in complex mainframe environments is still more theoretical than applied. In the paper “Examining the Use and Impact of an AI Code Assistant on Developer Productivity and Experience in the Enterprise”[1], IBM’s Justin D. Weisz found that developers are using AICAs mostly to understand code, not to write more of it. A good percentage of the code being modernized is decades old, and younger mainframe programmers who are less familiar with COBOL, assembler, PL/I, and JCL lean on AICAs to work out what that code does. Limited experience with writing effective AI queries (a.k.a. prompt engineering) for the mainframe is another barrier: it keeps younger programmers from using AICAs for more than a surface understanding of the code and its dependencies across decades-old environments. Developers value help explaining legacy logic and diagnosing issues over speed or automation. And despite advances in large‑language models (LLMs), output remains inconsistent.
Human Programmer Intervention Required
Developers often modify nearly all of the code that AI generates: only about two to four percent of it is used unaltered. Even when the generated code is acceptable, code review and validation are non‑negotiable for seasoned programmers. Many participants in Weisz’s study noted that AI either worked too slowly or produced inaccurate results for complex Db2® routines and for testing beyond simple unit tests. For seasoned mainframers, the quality of AICA-generated mainframe code is generally equivalent to that of a competent junior developer. It still demands a lot of human intervention and manual refinement.
Weisz’s findings align with broader sentiments across the industry. According to the 20th Annual BMC Mainframe Survey[2], 72% of respondents have adopted AIOps in some capacity, and DevOps adoption now stands at 67%. AI is clearly being woven into mainframe workflows—but mainly as a supporting tool, not a replacement for human workflows that support high-quality application modernization. BMC’s report also highlights a growing “spectrum of trust” for agentic AI, which can recommend or initiate actions but rarely acts autonomously due to persistent reliability and compliance concerns.
The 2025 Broadcom Mainframe Developer Survey[3] reinforces this picture of cautious adoption. Nearly all respondents (93%) view the mainframe as strategically essential, and most are dealing with applications exceeding half a million lines of code. In that context, AI’s limitations become clear. Users reported that when asked to analyze a single specified code segment, some tools attempt to re‑engineer entire programs, an untenable approach for applications of that size. Even on highly productive teams, mainframe programmers spend on average only about 16% of their time coding, with the remainder devoted to maintenance, validation, and integration, all areas where current AICAs struggle to contribute meaningfully.
The Intellectual Property Issue with AI-Assisted Code Generation
Beyond the technical limitations, trust and authorship loom large. Nearly half of the respondents in Weisz’s survey expressed concerns over potential copyright infringement by LLMs trained on global code repositories, and more than 80% of IBM watsonx Code Assistant users questioned whether the tool might generate code that is proprietary IP. A legal and compliance layer must therefore be built into the use of AICAs, yet most large organizations are still developing these operating procedures. Without clear internal legal and organizational frameworks for reviewing AI‑generated outputs, companies are reluctant to let AI operate beyond controlled assistance.
One organization helping educate legal and executive teams in large enterprises is the Linux Foundation. The LF AI & Data project has released a “Model Openness Framework” for evaluating and classifying machine learning models, aiming to establish completeness and openness as core tenets of responsible AI research and development. More info on the LF AI & Data project can be found at https://lfaidata.foundation.
Mainframe developers are also aware of the cultural tension AI introduces. Some worry that using AICAs could raise management’s expectations for code throughput or accelerate “de‑skilling,” while others bristle at the perception that AI‑assisted code is merely “boilerplate.” Prompt engineering, the ability to ask the right question in the right way, is an emerging skill, but few programmers feel confident wielding it effectively yet. Older mainframe programmers aren’t yet comfortable phrasing an AI query to get the information they seek, while younger mainframe programmers may not understand legacy code well enough to know what to ask for in a prompt.
Taken together, these findings describe a landscape of measured optimism, but one that still carries a lot of hesitation. AI is proving useful for documentation, explanation, and routine code analysis, essentially the heavy lifting that frees developers to focus on higher‑order design tasks. But it isn’t close to autonomously modernizing decades of mainframe applications with hundreds of thousands of lines of code and deep legacy dependencies.
The trajectory is promising, yet the current state of AI on the mainframe is best summarized as augmentation, not automation. For now, AI is doing some of the work, but we are years away from AI doing it all.
Infotel Solutions Augmenting the Human Intervention
Infotel has multiple tools for speeding up mainframe programmers’ manual reviews of AICA-delivered code. DB/IQ QA+ is a code quality assurance product that analyzes SQL code in Db2 and provides a line-by-line overview of the code’s quality along with a bad-code cost factor. Additionally, Infotel’s Scribe – a utility from the Infotel Augmented Solutions or IAS product line – provides an AI Agent that can review code and provide a code structure diagram with documentation navigation for quick reference. Scribe supports multiple languages including Java, PHP, C#, JavaScript, COBOL, C, and C++. For more information on DB/IQ QA+ and Scribe, please visit https://infotel-software.com.
Research Links
IBM Research: AI Code Assistant (WCA) Developer Productivity Study (2025) – https://arxiv.org/pdf/2412.06603
BMC 20th Annual Mainframe Survey (2025) – https://www.bmc.com/newsroom/releases/20th-annual-bmc-mainframe-survey.html
Kyndryl State of Mainframe Modernization Surveys (2024 & 2025) – https://www.cioandleader.com/kyndryl-survey-reveals-86-of-enterprises-are-moving-fast-to-adopt-ai-to-accelerate-mainframe-modernization/
Broadcom Mainframe Developer Perspectives Survey (Published Feb 1, 2026) – https://engage.broadcom.com/mainframe-developer-survey
[1] https://arxiv.org/pdf/2412.06603
[2] https://www.bmc.com/newsroom/releases/20th-annual-bmc-mainframe-survey.html
[3] https://engage.broadcom.com/mainframe-developer-survey