Poisoning the AI Key Vault: A Technical Deep Dive into the LiteLLM PyPI Attack

In the rapidly evolving landscape of AI infrastructure, few libraries have become as central as LiteLLM. Acting as a universal proxy for hundreds of LLM providers, it handles the most sensitive secrets an organization possesses: API keys for OpenAI, Anthropic, Gemini, and local inference endpoints. On March 31, 2026, this central node became a primary target for one of the most sophisticated supply chain attacks in recent memory. By compromising the LiteLLM PyPI package, threat actors gained a foothold into the heart of AI-driven enterprises, exposing the fragility of the “modern AI stack.”

This technical deep dive examines the mechanics of the LiteLLM PyPI Attack, specifically versions 1.82.7 and 1.82.8, the three-stage payload deployed by the threat actor group TeamPCP, and the accidental coding error that prevented a global catastrophe.

The Accidental Discovery: A Fork Bomb to the Rescue

The discovery of the compromise was not the result of a proactive security audit, but an accidental coding bug within the malware itself. Callum McMahon, a security researcher at FutureSearch, was testing a custom Model Context Protocol (MCP) plugin that relied on LiteLLM as a transitive dependency. Upon installing the latest version from PyPI, his development environment entered a catastrophic “fork bomb” state—spawning hundreds of child Python processes until the system became unresponsive.

Analysis revealed that version 1.82.8 had introduced a malicious .pth file (litellm_init.pth) into the site-packages/ directory. Python’s site module processes every .pth file in site-packages at interpreter startup and executes any line in it that begins with import. Because the payload within the .pth file was designed to re-initialize the Python environment to ensure its own persistence, each new interpreter processed the same .pth file again, creating a recursive loop that spawned a new process on every initialization. Had this bug not existed, the malware’s credential exfiltration, which operated silently via HTTPS POST, could have gone undetected for weeks.
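To make the .pth mechanism concrete, the following is a minimal audit sketch rather than a vetted detection tool. It lists the lines in each .pth file that the site module would execute at startup. The function names are illustrative and not part of any library.

```python
import site
from pathlib import Path


def executable_pth_lines(pth_path):
    """Return the lines of a .pth file that the site module would execute."""
    hits = []
    text = Path(pth_path).read_text(encoding="utf-8", errors="replace")
    for line in text.splitlines():
        # site.py executes lines that start with "import" followed by a space
        # or tab; other non-comment lines are treated as sys.path entries.
        if line.startswith(("import ", "import\t")):
            hits.append(line)
    return hits


def scan_site_packages(directories=None):
    """Map each .pth file that contains executable lines to those lines."""
    if directories is None:
        directories = set(site.getsitepackages()) | {site.getusersitepackages()}
    findings = {}
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            continue
        for pth in sorted(root.glob("*.pth")):
            lines = executable_pth_lines(pth)
            if lines:
                findings[str(pth)] = lines
    return findings
```

Calling scan_site_packages() with no arguments inspects the current interpreter's site-packages directories. Some legitimate tools, such as setuptools, also ship .pth files with import lines, so every hit needs manual review rather than automatic deletion.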

Anatomy of the Compromise: Injection Methods

The attackers, attributed to TeamPCP, utilized two distinct injection methods across two malicious releases, demonstrating a refined understanding of the Python packaging ecosystem.

1. Version 1.82.7: Module Import Injection

In this initial attempt, the malicious code was injected directly into litellm/proxy/proxy_server.py. This was a targeted strike against users running LiteLLM as a proxy server. The malware remained dormant until the proxy_server module was imported, at which point it initiated its discovery phase. By hiding within a deeply nested module, the attackers evaded basic static analysis that only scans top-level __init__.py files.

2. Version 1.82.8: Global Interpreter Persistence

Refining their approach, the attackers switched to the aforementioned .pth file method in version 1.82.8. This was a significantly more aggressive strategy. By placing a malicious file in the site-packages/ root, they ensured that any Python process started in that environment would execute the payload, whether or not it imported LiteLLM. This effectively turned the Python interpreter itself into a delivery vehicle for the malware.
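As a rough triage aid, the sketch below checks whether either of the two releases named above is installed. It reads package metadata from a list of site-packages directories, so it can be run from a separate, clean interpreter and pointed at the suspect environment instead of starting Python inside it. The function names are illustrative. The version set comes only from this article.

```python
from importlib import metadata

# Releases named in this article as malicious.
COMPROMISED_VERSIONS = {"1.82.7", "1.82.8"}


def find_litellm_versions(search_paths=None):
    """Return installed litellm versions found on the given paths.

    With search_paths=None the current interpreter's sys.path is searched.
    """
    kwargs = {} if search_paths is None else {"path": list(search_paths)}
    versions = []
    for dist in metadata.distributions(**kwargs):
        name = dist.metadata["Name"]
        if name and name.lower() == "litellm":
            versions.append(dist.version)
    return versions


def compromised_installs(search_paths=None):
    """Return only the versions that match the known-bad releases."""
    return [v for v in find_litellm_versions(search_paths) if v in COMPROMISED_VERSIONS]
```

For example, compromised_installs(["/path/to/venv/lib/python3.12/site-packages"]) would inspect one environment, where the path is a placeholder. A non-empty result means the host should be treated as compromised and its credentials rotated.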

The Three-Stage Payload: A Masterclass in Exfiltration

Once active, the malware executed a three-stage operation designed for maximum data harvesting and lateral movement.

Stage 1: Comprehensive Credential Harvesting

The first priority was the collection of “high-value targets.” The payload did not just look for LiteLLM keys; it systematically searched for all sensitive environment variables and configuration files, including:

  • Cloud Provider Credentials: Scanned ~/.aws/credentials, ~/.azure/access_tokens.json, and GCP service account keys.
  • Kubernetes Secrets: Harvested ~/.kube/config files and service account tokens.
  • Development Secrets: Grepped .env files, .git-credentials, and shell history (.bash_history, .zsh_history) for hardcoded API keys.
  • Persistence Keys: Collected private SSH keys from ~/.ssh/.
  • System Credentials: Attempted to read /etc/shadow for password hashes (which requires root privileges).

All harvested data was encrypted using a hybrid scheme (RSA-4096 for key exchange and AES-256 for the bulk data) and exfiltrated via HTTPS POST to models.litellm.cloud—a domain registered by the attackers to mimic a legitimate sub-domain of the project.
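For incident response, it helps to know which of the files listed above actually exist on an affected host, since each one should be treated as exposed and rotated. The sketch below is a minimal inventory under two assumptions: .env files are checked only in the home directory, although they can live in any project folder, and anything in ~/.ssh that is not a public key or a well-known non-key file is counted as a private key. GCP key files are omitted because the article does not give their location.

```python
from pathlib import Path

# Locations named in the list above, relative to a user's home directory.
TARGETED_PATHS = [
    ".aws/credentials",
    ".azure/access_tokens.json",
    ".kube/config",
    ".git-credentials",
    ".env",
    ".bash_history",
    ".zsh_history",
]

# Files in ~/.ssh that are not private keys (heuristic, an assumption).
SSH_NON_KEY_FILES = {"known_hosts", "known_hosts.old", "authorized_keys", "config"}


def exposed_secrets(home=None):
    """List credential files under `home` that the payload reportedly targeted."""
    home = Path(home) if home is not None else Path.home()
    found = [str(home / rel) for rel in TARGETED_PATHS if (home / rel).is_file()]
    ssh_dir = home / ".ssh"
    if ssh_dir.is_dir():
        for entry in sorted(ssh_dir.iterdir()):
            is_candidate = entry.suffix != ".pub" and entry.name not in SSH_NON_KEY_FILES
            if entry.is_file() and is_candidate:
                found.append(str(entry))
    return found
```

The function only checks for existence and never reads file contents, so its output can be attached to an incident ticket without leaking the secrets themselves.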

Stage 2: Kubernetes Lateral Movement

If the malware detected a Kubernetes environment, it transitioned into an aggressive lateral movement phase. Using the stolen service account tokens, it would:

1. Read all cluster-wide secrets across all namespaces.

2. Attempt to create a privileged pod (using alpine:latest) in the kube-system namespace on every node.

3. Mount the host filesystem (at /host) into each of these pods, granting the attackers full control over the underlying infrastructure and any other containers running on the node.
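The pod pattern described in these steps can be hunted for in a cluster dump. The sketch below takes the parsed JSON from kubectl get pods -A -o json and flags pods that run a privileged container or mount the host root through a hostPath volume. It assumes the host-side path of the malicious mount was /, which the article does not state explicitly. The function name is illustrative.

```python
def risky_pods(pod_list):
    """Return (namespace, name, reasons) for pods matching the pattern above.

    `pod_list` is the dict parsed from `kubectl get pods -A -o json`.
    """
    findings = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata") or {}
        spec = pod.get("spec") or {}
        reasons = []
        containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])
        for container in containers:
            security = container.get("securityContext") or {}
            if security.get("privileged"):
                reasons.append(f"container {container.get('name')} runs privileged")
        for volume in spec.get("volumes") or []:
            # Assumption: the payload mounted the host root ("/") via hostPath.
            host_path = (volume.get("hostPath") or {}).get("path")
            if host_path == "/":
                reasons.append(f"volume {volume.get('name')} mounts the host root")
        if reasons:
            findings.append((meta.get("namespace"), meta.get("name"), reasons))
    return findings
```

Legitimate system components in kube-system, such as network plugins, often run privileged as well, so the output is a triage list. Unexpected alpine:latest pods are the ones to examine first.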

Stage 3: Persistence and C2 Infrastructure

To ensure long-term access, the malware installed a persistent backdoor at ~/.config/sysmon/sysmon.py, accompanied by a systemd user service. This backdoor was designed to poll checkmarx.zone, a domain registered by TeamPCP to abuse the trusted “Checkmarx” brand name, every 50 minutes for new instructions or second-stage payloads. Because the name resembles that of a known security vendor, the traffic was more likely to pass casual log review and loosely written domain filters that would otherwise flag an unknown C2 domain.
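A host can be checked for these persistence artifacts with a few lines of Python. The sketch below looks for the backdoor path named above and for systemd user units that mention sysmon. The article does not give the unit file's name, so the search of ~/.config/systemd/user (the standard location for user units) and the "sysmon" substring match are assumptions.

```python
from pathlib import Path


def persistence_indicators(home=None):
    """Return paths under `home` that match the persistence described above."""
    home = Path(home) if home is not None else Path.home()
    hits = []

    # Backdoor location named in the article.
    backdoor = home / ".config" / "sysmon" / "sysmon.py"
    if backdoor.exists():
        hits.append(str(backdoor))

    # Assumption: the unit lives in the standard per-user systemd directory
    # and references "sysmon" in its name or contents.
    unit_dir = home / ".config" / "systemd" / "user"
    if unit_dir.is_dir():
        for unit in sorted(unit_dir.glob("*.service")):
            try:
                text = unit.read_text(encoding="utf-8", errors="replace")
            except OSError:
                continue
            if "sysmon" in unit.name or "sysmon" in text:
                hits.append(str(unit))
    return hits
```

An empty result does not prove a host is clean, since a later payload could have relocated the backdoor. Outbound connections to checkmarx.zone or models.litellm.cloud in DNS or proxy logs are a useful second signal.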

Attribution: The TeamPCP Campaign

The LiteLLM PyPI Attack is part of a larger campaign by TeamPCP, a threat group that has been systematically targeting security-focused open-source tools. In the weeks prior, TeamPCP was linked to the compromise of Trivy and Checkmarx KICS. Their signature move is not to compromise the source code on GitHub, but to steal CI/CD credentials (likely through phishing or previous supply chain successes) to publish malicious packages directly to registries like PyPI or Docker Hub.

This “registry-first” compromise bypasses the peer-review process of code commits, making it invisible to developers who only look at the source repository for changes. It highlights a critical gap in the security of the software supply chain: the trust placed in the publishing infrastructure itself.

Engineering for a Post-LiteLLM World: Lessons Learned

The fallout from this attack provides a stark reminder that AI security is, at its core, infrastructure security. To protect against future incidents, organizations must move beyond the “blind trust” model of package management:

Security Domain | Traditional Approach | Recommended Architecture (Post-LiteLLM)
Package Management | pip install [package] | Immutable Lockfiles (poetry.lock) + Internal Artifact Repos (Artifactory/Nexus)
Secret Management | Environment Variables / .env | External Vaults (HashiCorp Vault/AWS Secrets Manager) with Just-In-Time (JIT) short-lived tokens
AI Proxy Auth | Long-lived API Keys | Agentic Auth Frameworks like KavachOS using Delegation Tokens
CI/CD Security | Static Tokens | Trusted Publishing via OIDC (OpenID Connect) to eliminate static registry credentials
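The package-management row can be partly enforced with a simple pre-install check. The sketch below flags lines in a requirements file that lack an exact version pin or a sha256 hash. It is a simplified parser written for illustration: it handles backslash continuations and comments, but not every syntax pip accepts.

```python
def unpinned_requirements(text):
    """Return requirement lines lacking an exact version or a sha256 hash."""
    problems = []
    # Join backslash continuations so each requirement is one logical line.
    logical = text.replace("\\\n", " ")
    for raw in logical.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        line = raw.split(" #", 1)[0].strip()
        # Skip blanks, full-line comments, and option lines such as -r or --index-url.
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        if "==" not in line or "--hash=sha256:" not in line:
            problems.append(line)
    return problems
```

A CI job could fail the build whenever this returns a non-empty list. pip can enforce the same policy at install time with pip install --require-hashes -r requirements.txt, which refuses any package whose hash is missing or does not match.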

Conclusion

The LiteLLM PyPI attack was a high-stakes gambit that nearly succeeded in harvesting the API keys and cloud credentials of 40,000 developers. Only the attackers’ own coding error, the fork bomb in version 1.82.8, brought the operation to light before the exfiltration phase could scale. As we continue to build autonomous AI systems like the Claude Code Compaction Engine, we must ensure the foundations they sit on are secure. The era of “blind pip install” is over; the era of zero-trust dependency management has begun.

Source references:

FutureSearch: Inside the LiteLLM Compromise (Callum McMahon)

InfoQ: PyPI Supply Chain Attack Compromises LiteLLM

Datadog Security Labs: TeamPCP and the AI Infrastructure Threat

