McKinsey's AI Platform Got Hacked: What It Means for Your Company

A security firm breached McKinsey's Lilli AI platform, exposing 46.5 million chat messages and 728,000 files. Here's what every company deploying AI should learn from this.

9 min read

McKinsey, a firm that advises Fortune 500 companies on technology strategy, had its internal AI platform breached through a vulnerability that any junior pentester would recognize: SQL injection.

The breach of McKinsey's Lilli platform, disclosed by security firm Codewall, is a case study in what happens when companies rush AI adoption without applying the same security rigor they would to any other critical system.

What Happened

McKinsey launched Lilli in 2023 as an internal AI assistant for its 43,000+ employees. The platform offered chat, document analysis, and AI-powered search across 100,000+ proprietary documents. With 70%+ adoption and 500,000+ prompts processed monthly, Lilli became central to McKinsey's operations.

Codewall's security team found that out of 200+ documented API endpoints, 22 had no authentication whatsoever. One of these unprotected endpoints accepted user search queries and wrote them directly to the database. While the query values were safely parameterized, the JSON field names were concatenated directly into SQL statements, creating a classic injection point.
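The flaw pattern Codewall describes can be sketched in a few lines of Python. The table and field names below are hypothetical stand-ins, not McKinsey's actual schema; the point is that parameterizing values does nothing if attacker-controlled JSON keys reach the SQL string, and the fix is an allowlist applied before any string building:

```python
import sqlite3

# Known-good column names; anything else in the payload is rejected.
ALLOWED_FIELDS = {"query_text", "workspace_id"}

def save_search_unsafe(db, payload: dict):
    # VULNERABLE: JSON field names are concatenated into the SQL string.
    # The *values* are parameterized, but a malicious key such as
    # "query_text) VALUES ('x'); DROP TABLE searches; --" injects anyway.
    cols = ", ".join(payload.keys())
    marks = ", ".join("?" for _ in payload)
    db.execute(f"INSERT INTO searches ({cols}) VALUES ({marks})",
               list(payload.values()))

def save_search_safe(db, payload: dict):
    # FIX: allowlist field names before they ever touch the SQL string;
    # values stay parameterized exactly as before.
    unknown = set(payload) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unexpected fields: {unknown}")
    cols = ", ".join(payload.keys())
    marks = ", ".join("?" for _ in payload)
    db.execute(f"INSERT INTO searches ({cols}) VALUES ({marks})",
               list(payload.values()))
```

Scanners fuzz the values, which is why the parameterized half looks clean to them; only the key path is exploitable.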

From there, the attackers chained this SQL injection with an Insecure Direct Object Reference (IDOR) vulnerability to escalate access across the platform. These are common SaaS pentest vulnerabilities that show up time and again.

The Scale of Exposure

The numbers are staggering:

  • Chat messages: 46.5 million
  • Files (PDFs, spreadsheets, presentations): 728,000
  • User accounts: 57,000
  • AI assistants and workspaces: 384,000 and 94,000, respectively
  • RAG document chunks (proprietary research): 3.68 million
  • Files flowing through external APIs: 1.1 million
  • System prompt configurations: 95, across 12 model types

These chat messages contained strategy discussions, financial analyses, and client-sensitive information. This is the kind of data that makes competitor intelligence teams salivate.

The Real Danger: Prompt Layer Compromise

Data exposure is bad enough. But the most alarming finding was that the AI system prompts controlling Lilli's behavior were stored in the same compromised database, with write access available.

This means an attacker could silently modify the instructions governing AI outputs. The implications:

  • Poisoned advice: Subtly alter AI recommendations across the organization
  • Silent exfiltration: Instruct the AI to include sensitive data in responses that get logged externally
  • Guardrail removal: Strip safety controls without anyone noticing
  • No audit trail: Modified prompts leave no trace in traditional logging systems

This is a threat category that most security teams aren't even thinking about yet. We covered this exact risk vector in our breakdown of the enterprise AI security stack, where prompt injection and data leakage sit at the top of the threat model for any company running AI internally.

Why Traditional Security Tools Missed This

Lilli had been in production for two years. McKinsey undoubtedly ran security scans. The vulnerability persisted because traditional application security tools like OWASP ZAP test for known patterns. They check if query parameters are injectable. They don't test whether JSON field names in API request bodies get concatenated into SQL statements.

This is the gap between automated scanning and actual security assessment. Scanners check boxes. Skilled testers think creatively about how data flows through an application.

If you're wondering whether penetration testing is just a nice-to-have, this breakdown of whether a pentest is required for SOC 2 explains why compliance frameworks increasingly expect it, and why simply running a vulnerability scanner doesn't cut it.

What You Should Actually Do About It

If McKinsey, with its resources and technical talent, shipped an AI platform with unauthenticated endpoints and SQL injection vulnerabilities, your company can too. Here's what to do about it, practically.

1. Inventory Every AI System and Its Data Flows

Before you can secure AI, you need to know where it lives. Most companies have more AI exposure than they think: ChatGPT Enterprise, Copilot, internal LLM tools, AI features in SaaS vendors.

What to do this week:

  • Catalog every AI tool in use (sanctioned or shadow IT)
  • Map what data each tool ingests: chat logs, documents, code, customer records
  • Identify who has access and what permissions exist
  • Document all API endpoints, especially third-party integrations

Shadow AI is the fastest-growing blind spot in security programs. We covered practical controls for detecting and managing shadow AI that apply to companies of any size.

2. Treat AI Endpoints Like Any Other Critical API

Twenty-two unauthenticated endpoints on a platform processing sensitive data is inexcusable, but it happens constantly when internal tools are exempted from the security standards applied to customer-facing products.

What to do this week:

  • Enforce authentication on every API endpoint, internal or external
  • Implement role-based access control (RBAC) so users only access their own data
  • Add rate limiting to prevent automated extraction
  • Log all API calls with enough detail to detect anomalous access patterns
  • Review your IDOR protections: can user A access user B's data by changing an ID in a request?
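The IDOR check in the last bullet comes down to one rule: look the object up by ID, then verify ownership server-side before returning anything. A minimal sketch, assuming a hypothetical document store and user model (your real app would enforce this in a data-access layer, not ad hoc per handler):

```python
from dataclasses import dataclass

@dataclass
class Document:
    doc_id: str
    owner_id: str
    body: str

# Stand-in for the real data store.
DOCS: dict[str, Document] = {}

class Forbidden(Exception):
    pass

def get_document(requesting_user_id: str, doc_id: str) -> Document:
    # Verify ownership BEFORE returning anything. Raising the same error
    # for "missing" and "not yours" avoids confirming which IDs exist
    # to an attacker enumerating them.
    doc = DOCS.get(doc_id)
    if doc is None or doc.owner_id != requesting_user_id:
        raise Forbidden(f"no such document: {doc_id}")
    return doc
```

If user A can fetch user B's document just by changing `doc_id` in a request, this check is missing somewhere.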

3. Run a Penetration Test Before Launch, Not Two Years After

Automated scanners missed this vulnerability. A manual penetration test or an AI-powered security agent would have caught it in hours. The lesson: test AI platforms with the same rigor as any production system.

What to do this quarter:

  • Schedule a penetration test specifically scoped for your AI platform
  • Include API endpoint enumeration and authentication verification
  • Test input validation across all data formats (not just query parameters, but JSON field names, headers, file uploads)
  • Test access controls between users and data boundaries
  • Review where system prompts are stored and who can modify them
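The authentication-verification step above can be partly automated: hit every documented endpoint with no credentials and flag anything that does not answer 401 or 403. A sketch, where the endpoint list and the `fetch` callable are placeholders for your API inventory and HTTP client:

```python
def find_unauthenticated(endpoints, fetch):
    # fetch(path) performs an UNauthenticated request and returns the
    # HTTP status code. Anything other than 401/403 deserves a manual
    # look: the endpoint may be serving data with no auth at all, which
    # is exactly how 22 of Lilli's 200+ endpoints were exposed.
    exposed = []
    for path in endpoints:
        status = fetch(path)
        if status not in (401, 403):
            exposed.append((path, status))
    return exposed
```

This is a smoke test, not a substitute for the manual pentest; it only catches the grossest failure mode.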

If you want a sense of what testers typically find, our analysis of the most common SaaS pentest vulnerabilities reads like a checklist for this breach: broken access control, injection flaws, missing authentication.

4. Lock Down the Prompt Layer

The most novel part of this breach wasn't the SQL injection. It was the access to system prompts. This is a new attack surface that most security teams don't audit.

What to do this quarter:

  • Store system prompts outside the application database, in a separate config store with restricted access
  • Implement version control for all prompt configurations so changes are tracked and reversible
  • Set up alerts for any modifications to system prompts or model configurations
  • Restrict write access to prompts to a small set of admins, never the application runtime user
  • Add integrity checks: hash stored prompts and verify them at load time
  • Consider prompt signing to cryptographically guarantee prompts haven't been tampered with
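The integrity-check and signing bullets can be combined: store an HMAC alongside each prompt and verify it at load time, with the key held outside the application database so a database compromise alone cannot forge signatures. A minimal sketch (key management is elided; in practice the key lives in a secrets manager, not in code):

```python
import hmac
import hashlib

def sign_prompt(prompt: str, key: bytes) -> str:
    # HMAC-SHA256 over the prompt text; store this next to the prompt.
    return hmac.new(key, prompt.encode("utf-8"), hashlib.sha256).hexdigest()

def load_prompt(prompt: str, stored_sig: str, key: bytes) -> str:
    # Verify at load time and refuse to run with a tampered prompt.
    # compare_digest is constant-time, avoiding timing side channels.
    expected = sign_prompt(prompt, key)
    if not hmac.compare_digest(expected, stored_sig):
        raise RuntimeError("system prompt failed integrity check")
    return prompt
```

With this in place, an attacker with write access to the prompt table can still corrupt prompts, but the tampering is detected at the next load instead of silently steering the model.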

For a deeper look at how AI agent guardrails map to compliance frameworks like SOC 2 and ISO 27001, we published a practical guide on what controls to implement and how auditors evaluate them.

5. Build AI-Specific Monitoring

Traditional SIEM tools don't understand AI-specific threats. You need monitoring that covers the new attack surfaces AI introduces.

What to implement:

  • Monitor prompt injection attempts in user inputs (regex patterns, anomaly detection)
  • Track RAG retrieval patterns for signs of data harvesting
  • Alert on unusual volumes of document access or cross-user data retrieval
  • Log model output quality metrics to detect poisoned responses
  • Set up baseline usage patterns per user and alert on deviations
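Regex screening, the first bullet, is the crudest layer but cheap to deploy. The patterns below are illustrative examples, not a vetted ruleset; real deployments pair them with anomaly detection, since plain regexes are easy to evade:

```python
import re

# Hypothetical starter patterns for common injection phrasings.
INJECTION_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (
        r"ignore (all )?(previous|prior) instructions",
        r"you are now\b",
        r"reveal (your )?(system )?prompt",
        r"disregard .{0,40}guardrails",
    )
]

def flag_prompt_injection(user_input: str) -> list[str]:
    # Return the patterns that matched so the resulting alert explains itself.
    return [p.pattern for p in INJECTION_PATTERNS if p.search(user_input)]
```

Matches should feed an alert pipeline for review rather than hard-block users: false positives are common, and silent blocking teaches attackers which phrasings get through.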

6. Get Ahead of AI Compliance Requirements

Regulatory frameworks are catching up fast. ISO 42001 is the first international standard specifically for AI management systems, covering risk management, bias controls, transparency, and human oversight. If you're deploying AI at scale, this standard gives you a structured framework for managing exactly the kind of risks exposed in this breach.

The EU AI Act adds enforcement teeth. Depending on your AI use case's risk classification, you may face specific obligations around documentation, testing, and human oversight.

What to do this quarter:

  • Assess your AI deployments against ISO 42001's Annex A controls, even if you're not pursuing certification yet. The 39 controls are a practical checklist.
  • If you already hold ISO 27001, explore the integration path with ISO 42001. The two standards share the same structure, so incremental effort gets you comprehensive coverage.
  • Determine your obligations under the EU AI Act based on your risk classification
  • Document your AI governance decisions, model selection criteria, and risk assessments. Auditors will ask for these.

7. Don't Forget the Basics

The most sophisticated AI platforms still run on web applications, APIs, databases, and cloud infrastructure. The McKinsey breach didn't require novel AI-specific exploits. It used SQL injection and broken access control, vulnerabilities that have been in the OWASP Top 10 for over two decades.

Security fundamentals that apply to every AI deployment:

  • Input validation and parameterized queries on every database interaction
  • Principle of least privilege for database accounts (the app runtime user should never have schema-level access)
  • Network segmentation between the AI platform, its data stores, and the broader environment
  • Secrets management for API keys, model endpoints, and database credentials
  • Regular dependency updates for the ML frameworks and libraries in your stack

The Bottom Line

The McKinsey breach isn't about one company's failure. It's a preview of what's coming across every industry as organizations rush to deploy AI without matching their speed of adoption with security investment.

The fix isn't complicated. It's the same security fundamentals that have always mattered, combined with new controls for AI-specific attack surfaces. Authenticate every endpoint, validate every input, lock down the prompt layer, test before you deploy, and assume attackers will find what scanners miss.

The difference with AI platforms is that the blast radius when something goes wrong is orders of magnitude larger. Every conversation, every document, every strategic decision flowing through your AI system becomes exposed in a single breach.

If you're deploying AI tools internally and haven't run a security assessment on them, now is the time to start.


Other platforms check the box

We secure the box

Get in touch and learn why hundreds of companies trust Bastion to manage their security and fast-track their compliance.
