Agentic AI and Security: A Wake-Up Call for CIOs
Bridging the Gap Between AI Potential and Enterprise Reality While Tackling Security
Dear CIO,
What if AI could autonomously fix bugs in your enterprise code? It’s an idea that’s driving innovation across the tech world, but as the hype builds, a critical question remains: Are today’s AI models equipped to handle the proprietary, complex codebases that underpin enterprise systems? For CIOs navigating this rapidly evolving landscape, this question is central to addressing security risks, technical debt, and the very structure of enterprise IT. Let’s explore where agentic AI stands today, the challenges it faces, and why CIOs must take a proactive role in shaping its future.
Best Regards,
John, Your Enterprise AI Advisor
Brought to You By
The AIE Network is a community of over 250,000 business professionals learning and thriving with generative AI. Beyond the AI CIO, the network includes The Artificially Intelligent Enterprise for AI and business strategy, AI Tangle for a twice-weekly update on AI news, The AI Marketing Advantage, and The AIOS for busy professionals who are looking to learn.
Recently, I’ve been keeping a close eye on the hype around agentic AI—specifically models that can independently process, diagnose, and correct bugs in code. We’re seeing many solutions from academia and the private sector pop up, yet they’re all chasing one underlying idea: Can AI fix known bugs in software autonomously?
However, even with some breakthroughs, there remains a gap between potential and reliable implementation. While many companies are competing to be the first to master code-fixing AI, the reality is that most big players like IBM, Amazon, Google, and Microsoft seem to be focused on AI solutions grounded in a limited list of programming languages. The current benchmark for bug fixing comes from the October 2023 paper “SWE-bench: Can Language Models Resolve Real-World GitHub Issues?” That paper evolved into an official benchmark whose original code base draws on 12 popular Python repositories; it has since been expanded to 17 libraries and now includes JavaScript.
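If you want to see for yourself how Python-centric the benchmark is, the task instances are published openly. Here's a minimal sketch, assuming the Hugging Face `datasets` library and the public `princeton-nlp/SWE-bench` dataset identifier, that tallies which repositories the benchmark draws from:

```python
# A minimal sketch: inspecting which repositories dominate SWE-bench.
# Assumes the `datasets` library and the "princeton-nlp/SWE-bench"
# dataset identifier on the Hugging Face Hub; adjust if either differs.
from collections import Counter

from datasets import load_dataset

# Load the test split of the benchmark (real GitHub issues plus fixes).
swe_bench = load_dataset("princeton-nlp/SWE-bench", split="test")

# Count how many task instances come from each source repository.
repo_counts = Counter(instance["repo"] for instance in swe_bench)

for repo, count in repo_counts.most_common():
    print(f"{repo}: {count} issues")
```

Run that and the skew is obvious: a handful of open-source Python projects account for the entire original benchmark.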
This emphasis on Python is a double-edged sword—it’s popular in AI and widely available on open platforms like GitHub, so models trained on public GitHub and similar repositories might have decent accuracy there. But here’s the kicker: enterprise-level code—primarily commercial and critical infrastructure code in sectors like finance, retail, and government—is often built in languages like Java. And most of that code sits on private servers, not public GitHub repositories. In other words, the models can’t learn what they’ve never seen.
Can these agentic processes work with an enterprise's unique, proprietary coding standards? It's an intriguing area, and if we're serious about expanding AI's reach into enterprise-level code solutions, we need to find ways for models to work without the advantage of public datasets.
Where Do CIOs Fit Into This Picture?
For CIOs, this represents a much larger security and technical debt issue. Many CEOs are bringing in Chief AI Officers who, in turn, hire recent graduates from Stanford and CMU, folks with minimal enterprise experience but lots of enthusiasm for tech. I’m all for fresh ideas in an organization, but I’ve heard too many stories of these bright-eyed AI hires declaring, “I deployed Kubernetes in 15 minutes!” Sure, deploying Kubernetes in a lab environment is one thing, but maintaining it within a corporate network with complex security, compliance, and performance requirements is quite another. The disconnect is that these folks don’t know what they don’t know. And worse, many Chief AI Officers are being advised to bypass the CIO and CISO—creating isolated AI silos with minimal oversight from the company’s core tech leadership.
This is where shadow AI becomes a real threat. Imagine AI applications in critical business areas the CIO isn’t even aware of, much less monitoring. AI solutions often require extensive infrastructure, whether for chatbots, predictive analytics, or bug fixing. By my estimate, around 85% of these applications rely heavily on cloud-native components like Kubernetes, Kafka, and Redis. And if the people responsible for security and infrastructure are left out, we’re only setting ourselves up for disaster. CIOs must be involved to prevent mounting technical debt and secure the scaffolding we’re building for future services.
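One practical starting point is simply inventorying what's already running. The sketch below assumes the official `kubernetes` Python client and a reachable kubeconfig; the keyword list is illustrative, not a vetted signature set. It flags deployments whose container images look AI-adjacent:

```python
# A minimal sketch of a shadow-AI inventory pass over a Kubernetes cluster,
# assuming the official `kubernetes` Python client and a local kubeconfig.
# The keyword list below is illustrative only, not a vetted signature set.
from kubernetes import client, config

AI_IMAGE_KEYWORDS = ("vllm", "triton", "ollama", "langchain", "kafka", "redis")

config.load_kube_config()  # use load_incluster_config() inside the cluster
apps = client.AppsV1Api()

# Walk every deployment in every namespace and flag AI-adjacent images.
for deployment in apps.list_deployment_for_all_namespaces().items:
    for container in deployment.spec.template.spec.containers:
        if any(kw in container.image.lower() for kw in AI_IMAGE_KEYWORDS):
            print(
                f"{deployment.metadata.namespace}/{deployment.metadata.name}: "
                f"{container.image}"
            )
```

A report like this won't catch everything, but it gives the CIO's office a first map of AI workloads that never passed through an architecture review.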
A Call for Action on Security Standards
In my view, there’s another pressing need here: standards and guidelines crafted explicitly for the AI era. I think NIST’s approach to AI security is well-intentioned, but it’s not hitting the mark. Most NIST AI security guidelines could practically apply to something as simple as a Google search engine. They’re missing critical, nuanced aspects like model-level security, observability, error rates, and bias evaluations.
OWASP, on the other hand, has been proactive. They're rolling up their sleeves and addressing foundational AI security issues with the OWASP Top 10 for LLM Applications, which provides a great starting point for understanding the risks of large language models. They recently published a Center of Excellence (CoE) guide outlining best practices, roles, and responsibilities for teams working with AI.
However, even OWASP’s guide lacks attention to observability, which is pivotal in the AI context. AI observability should go beyond standard logging and monitoring—it needs to address hallucination detection, model bias, and correctness verification. For CIOs, observability tools should be as essential as firewalls in the current IT landscape. Understanding the accuracy and reliability of model outputs isn’t optional; it’s critical to securing the integrity of business operations.
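To make that concrete, here's a minimal sketch of what model-level observability might look like. `call_model` is a hypothetical stand-in for whatever inference API you actually use, and repeated sampling is a crude, imperfect proxy for hallucination risk:

```python
# A minimal sketch of AI observability beyond standard logging: record every
# model call and flag responses that fail a simple self-consistency check.
# `call_model` is a hypothetical stand-in for your real inference API.
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("llm.observability")


def observed_call(call_model, prompt: str, n_samples: int = 3) -> str:
    """Call the model several times and flag disagreement as a risk signal."""
    start = time.monotonic()
    samples = [call_model(prompt) for _ in range(n_samples)]
    latency = time.monotonic() - start

    # Crude consistency check: identical answers suggest stable output;
    # disagreement is a cheap (imperfect) proxy for hallucination risk.
    consistent = len(set(samples)) == 1
    log.info(
        "prompt=%r latency=%.2fs consistent=%s", prompt, latency, consistent
    )
    if not consistent:
        log.warning("inconsistent outputs, route to human review: %r", samples)
    return samples[0]
```

Identical samples don't prove correctness, but divergent ones are a cheap signal that an output needs human review before it touches a business process.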
Final Thoughts
As CIOs look ahead, the stakes are high. AI is rapidly becoming a core business function, and with it, the onus is on leadership to ensure it’s secure, efficient, and sustainable. It’s not just about deploying the latest models but about building the proper foundation that anticipates and mitigates risks rather than adding to the mountain of technical debt. The lesson here is clear: we’re at an inflection point in AI’s evolution. Let’s make sure we’re building a future that we can sustain.
Deep Learning
Garima Bajpai explores how Site Reliability Engineering (SRE) practices are evolving to meet the demands of modern software environments.
Elizabeth Montalbano writes about how Google has fixed two critical vulnerabilities in its Vertex AI platform that could have allowed attackers to escalate privileges and exfiltrate proprietary enterprise models.
Kevin Poireault covers Google Cloud's Cybersecurity Forecast 2025 predictions of an increase in AI-driven cyber threats.
Alexandra Kelley writes on the House AI Task Force emphasizing a human-centered, light-touch regulatory approach for AI legislation.
Mustafa Kapadia writes on how project managers must master foundational skills, like choosing the right LLM, writing effective prompts, and measuring results, to get reliable outputs when leveraging AI.
Guido Appenzeller explores how the rapid decline in LLM inference costs, decreasing by 10x annually since GPT-3's launch, is opening new commercial possibilities for AI applications.
Gartner analysts highlighted four key challenges CIOs face in delivering AI value—cost management, decentralized data governance, balanced AI-driven productivity, and employee well-being.
In this podcast, Nick Lippis speaks with Petek Ergul, Global Head of CTO Service Management at HSBC, on AI’s transformative impact on the financial industry.
Armand Ruiz announces that IBM has been recognized as a leader in the 2024 IDC MarketScape for Worldwide Machine Learning Operations.
James Coker covers the significant investment in AI by the cybersecurity industry that is expected to give defenders an advantage.
I write on the "AI productivity paradox," drawing parallels with the early computer age to highlight the delayed impact of AI on productivity.
I join a Techstrong Gang episode covering agentic AI and AI security.
Regards,
John Willis
Your Enterprise IT Whisperer
Follow me on X | Follow me on LinkedIn