Tell HN: Litellm 1.82.7 and 1.82.8 on PyPI are compromised
TL;DR Highlight
Malicious .pth files stealing credentials were inserted into LiteLLM PyPI package versions 1.82.7 and 1.82.8. This is a supply chain attack that auto-executes on every Python interpreter startup — without any import statement — giving it an unusually wide blast radius.
Who Should Read
Backend developers and MLOps engineers using LiteLLM — specifically anyone who pip-installed litellm in an AI service development environment. Immediate action required.
Core Mechanics
- .pth files in Python's site-packages directory are automatically executed when the Python interpreter starts — no import statement needed. This makes them an unusually dangerous vector for supply chain attacks.
- The malicious code in versions 1.82.7 and 1.82.8 collected environment variables (including API keys, cloud credentials, and database URLs) and exfiltrated them to an external server.
- Any environment where litellm was installed — including Docker containers, virtual environments, and CI/CD pipelines — may have had credentials exfiltrated at every Python process startup.
- The attack was discovered and the malicious versions were yanked from PyPI, but anyone who installed those specific versions between release and yanking is affected.
- Mitigation: immediately upgrade to a clean version, rotate all credentials accessible in environments where 1.82.7 or 1.82.8 was installed.
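The .pth auto-execution mechanism described above can be demonstrated harmlessly. This is a minimal sketch using a throwaway temporary directory instead of the real site-packages: site.addsitedir runs the same .pth processing that site.py performs for every site-packages directory at interpreter startup, and any line in a .pth file that starts with "import" is exec()'d.

```python
import os
import site
import tempfile

# Create a throwaway "site-packages" directory containing a demo .pth file.
# This is NOT the real exploit -- just the execution mechanism it abuses.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "demo.pth"), "w") as f:
    # Lines beginning with "import" in a .pth file are exec()'d by site.py.
    f.write("import os; os.environ['PTH_DEMO_RAN'] = '1'\n")

# At startup the interpreter does this automatically for site-packages;
# here we trigger the same processing manually on our demo directory.
site.addsitedir(demo_dir)

print("code in demo.pth executed:", os.environ.get("PTH_DEMO_RAN") == "1")
```

Note that no module from the demo directory was ever imported — processing the directory was enough to run the code, which is exactly why a malicious .pth fires in every Python process in the environment.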
Evidence
- The security researcher who discovered the attack shared the decompiled malicious code, confirming the .pth execution mechanism and the exfiltration endpoint.
- LiteLLM maintainers published an incident response within hours, confirming the attack, which versions were affected, and recommending immediate upgrade and credential rotation.
- The attack hit particularly hard in AI development environments where litellm is used as a routing layer — these environments typically have credentials for many AI providers (OpenAI, Anthropic, etc.) in their environment variables.
- Commenters raised the broader point: LiteLLM is exactly the kind of high-value target for supply chain attacks — widely used in AI infrastructure, often installed with broad permissions.
How to Apply
- Immediately: pip install --upgrade litellm to get a clean version. Then rotate all credentials that were accessible as environment variables in affected environments.
- Check your pip history or requirements.txt locks to determine if you installed 1.82.7 or 1.82.8. If unclear, treat it as compromised and rotate anyway.
- Add a .pth file scanner to your dependency audit process — tools like pip-audit and safety don't currently detect malicious .pth files, so consider adding a custom check.
- For AI service environments, store sensitive credentials in a secrets manager (AWS Secrets Manager, Vault) rather than environment variables — this limits blast radius from env var exfiltration attacks.
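As a starting point for the custom .pth check suggested above, here is a hypothetical minimal audit script (not a standard tool): it walks the environment's site-packages directories and flags .pth lines that execute code. Keep in mind that some legitimate packages (setuptools, for example) ship .pth files with import lines, so any hit needs manual review rather than automatic quarantine.

```python
import pathlib
import site
import sysconfig

def audit_pth_files():
    """Return (path, line) pairs for every executable .pth line found.

    Only lines starting with "import" are exec()'d by site.py;
    plain path lines in .pth files are harmless.
    """
    dirs = set(site.getsitepackages() + [sysconfig.get_paths()["purelib"]])
    findings = []
    for base in dirs:
        for pth in pathlib.Path(base).glob("*.pth"):
            for line in pth.read_text(errors="replace").splitlines():
                if line.lstrip().startswith("import"):
                    findings.append((str(pth), line.strip()[:120]))
    return findings

if __name__ == "__main__":
    for path, line in audit_pth_files():
        print(f"{path}: {line}")
```

Running this in CI after dependency installation gives you a diffable list of executable .pth lines, so a newly introduced one stands out.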
Code Example
# Script to check for malicious package
pip download litellm==1.82.8 --no-deps -d /tmp/check
python3 -c "
import zipfile, os
whl = '/tmp/check/' + [f for f in os.listdir('/tmp/check') if f.endswith('.whl')][0]
with zipfile.ZipFile(whl) as z:
    pth = [n for n in z.namelist() if n.endswith('.pth')]
    print('PTH files:', pth)  # Should be an empty list if clean
    for p in pth:
        print(z.read(p)[:300])  # Inspect contents
"
# Pin to a safe version
pip install litellm==1.82.6
# Pin version in requirements.txt
# litellm==1.82.6
Terminology
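To check whether the environment you are in has an affected build installed, the distribution metadata can be read without importing (and therefore without executing) the package:

```python
from importlib import metadata

# Versions named in the incident report
AFFECTED = {"1.82.7", "1.82.8"}

try:
    installed = metadata.version("litellm")
    if installed in AFFECTED:
        print(f"litellm {installed}: AFFECTED - upgrade and rotate credentials")
    else:
        print(f"litellm {installed}: not an affected version")
except metadata.PackageNotFoundError:
    print("litellm is not installed in this environment")
```

One caveat: if an affected version is installed, its .pth already ran when this interpreter started, so this check tells you whether to rotate credentials — it does not protect the process running it.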
Related Papers
Training an LLM in Swift, Part 1: Taking matrix mult from Gflop/s to Tflop/s
A detailed walkthrough of implementing matrix multiplication kernels from scratch in Swift on Apple Silicon, optimizing step by step across CPU, SIMD, AMX, and GPU (Metal) to push performance from Gflop/s to Tflop/s. A rare resource for developers who want to build the core computation of LLM training from the ground up without a framework and feel out the performance limits of Apple Silicon.
Removing fsync from our local storage engine
FractalBits shares the design of an SSD-only KV storage engine built without fsync, achieving roughly 65% higher write throughput under identical conditions. The core of the design combines preallocation, O_DIRECT, and a journal aligned to the SSD's atomic write unit to avoid fsync's metadata overhead.
Google Chrome silently installs a 4 GB AI model on your device without consent
Google Chrome was found to automatically download the 4 GB Gemini Nano model file without user consent, re-downloading it even after deletion. Concerns have been raised about possible GDPR violations and the environmental cost when this is rolled out to billions of devices.
How OpenAI delivers low-latency voice AI at scale
OpenAI redesigned its WebRTC stack to serve real-time voice AI to over 900 million users, detailing the design decisions and trade-offs of a relay + transceiver split architecture.
Efficient Test-Time Inference via Deterministic Exploration of Truncated Decoding Trees
Deterministic Leaf Enumeration (DLE) cuts self-consistency’s redundant sampling by deterministically exploring a tree of possible sequences, simultaneously improving math/code reasoning performance and speed.