€54k spike in 13h from unrestricted Firebase browser key accessing Gemini APIs
TL;DR Highlight
This is a real-world case in which an unrestricted API key, in a project where Firebase AI Logic (Gemini API) had been enabled, was exploited by automated attacks. It ran up €54,000 in charges within 13 hours, and Google refused a refund. The case is a warning about the danger of exposing API keys on the client side.
Who Should Read
Frontend and full-stack developers who use Firebase or Google Cloud to build or launch features on the Gemini API, especially small teams and individual developers who ship API keys in client code or have not set up budget controls.
Core Mechanics
- The project had been created a year earlier for Firebase Authentication. When the developer enabled Firebase AI Logic to add a Gemini-based feature (generating web snippets from text prompts), automated traffic surged immediately.
- The project's browser API key had originally been created with 'No API restrictions,' so once the Gemini API was added, Gemini requests could flood in through that key without limit. The requests came from automated bots and were unrelated to actual user traffic.
- A budget alert of €80 and a cost anomaly notification were set up, but both arrived several hours late. By the time the developer could respond, approximately €28,000 had already been charged, and because cost reporting itself lags, the final amount exceeded €54,000.
- The Google Cloud support team rejected the request for a charge adjustment, stating that the usage 'occurred from that project' and was therefore considered valid.
- Gemini team lead Logan Kilpatrick commented directly, explaining the current countermeasures: Tier 1 users have a default monthly spending limit of $250, and project spending caps can also be set. However, both features have a maximum delay of 10 minutes.
- Google stated that it is moving towards preventing the use of Gemini API with unrestricted API keys. It has also changed the default for new users to generate more secure Auth keys.
- Prepaid billing is also being introduced, starting with new users in the US and expanding globally, allowing developers to pre-determine and control their spending.
- For more than 10 years Google maintained the principle that 'Google API keys are not secrets (they can be public).' That principle stopped holding with the release of the Gemini API, but many developers are unaware of the change and continue to handle keys as before.
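To put the figures above in perspective, the following sketch works out the average burn rate and how much could still accrue during the stated 10-minute spend-cap delay. It assumes a constant spend rate across the 13 hours, which is a simplification (the report does not describe the actual traffic shape), and the function names are illustrative only.

```python
# Figures taken from the incident summary above.
TOTAL_EUR = 54_000       # final charge
DURATION_HOURS = 13      # length of the spike
CAP_DELAY_MINUTES = 10   # stated maximum delay of the spend caps


def burn_rate_per_minute(total: float, hours: float) -> float:
    """Average spend per minute, assuming a constant rate (a simplification)."""
    return total / (hours * 60)


def overshoot(rate_per_minute: float, delay_minutes: float) -> float:
    """Spend that accrues between crossing a cap and the cap taking effect."""
    return rate_per_minute * delay_minutes


rate = burn_rate_per_minute(TOTAL_EUR, DURATION_HOURS)
print(f"average burn: EUR {rate:.2f} per minute")
print(f"possible overshoot in {CAP_DELAY_MINUTES} minutes: EUR {overshoot(rate, CAP_DELAY_MINUTES):.0f}")
```

Under that assumption the average is roughly €69 per minute, so a cap that lags by 10 minutes could still let about €690 through at this rate. That is far less than €54,000, but well above an €80 budget.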
Evidence
- GCP budget notifications were heavily criticized in the comments for a structural flaw. Because billing events pass through a queue/log for aggregation, the notifications themselves can be delayed by hours. Even a hard limit acts on the last known aggregated value, leaving it helpless against sudden spikes. Many agreed with the criticism that 'this structure is designed to protect the company, not the customer.'
- Another developer who suffered similar damage appeared in the comments. After incurring $26,000 in charges, they requested a refund from the Google Cloud support team; it was initially denied and is currently under review. Their advice was to 'escalate as much as possible.' Another comment described a Gemini API key that was created quickly during a public Google live training session and then leaked, resulting in approximately $6,909 in charges.
- Commenters pointed out that many Gemini API keys are already hardcoded on GitHub. A GitHub search for 'gemini "AIza"' yields numerous results. Their reading is that developers treat Gemini keys the same way as Google Maps/Firebase keys, which have long been considered safe to make public.
- Commenters shared an 'emergency brake' in GCP that stops billing immediately. Unlinking the billing account from the project stops API usage at once, but it risks deleting associated resources, which makes it unsuitable for production apps.
- A developer using Backblaze B2 cited it as an example of a 'working spending limit.' In their experience, with a $0 limit set, B2 blocks API requests as soon as usage exceeds the free tier. They argued that cloud billing design needs a fundamental change.
- Questions were raised about why scammers target AI API keys. Asked what the clear financial gain is, comparable to mining Bitcoin with stolen EC2 credentials, commenters speculated that the goal may be to use or resell LLM services on someone else's account.
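Since the comments note that keys starting with "AIza" are easy to find in public repositories, a local check before pushing code can help. The sketch below assumes the commonly cited key shape ("AIza" followed by 35 URL-safe characters); that pattern is an assumption rather than an official specification, and the file suffix list and function names are illustrative.

```python
import re
from pathlib import Path

# Assumed shape of a Google API key: "AIza" plus 35 URL-safe characters.
# This is a widely used heuristic, not an official format definition.
GOOGLE_KEY_PATTERN = re.compile(r"AIza[0-9A-Za-z_\-]{35}")


def find_google_keys(text: str) -> list[str]:
    """Return every substring of text that looks like a Google API key."""
    return GOOGLE_KEY_PATTERN.findall(text)


def scan_tree(root: str, suffixes=(".js", ".ts", ".html", ".json")) -> dict[str, list[str]]:
    """Scan files under root and map each file path to the key-like strings it contains."""
    hits: dict[str, list[str]] = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            found = find_google_keys(path.read_text(errors="ignore"))
            if found:
                hits[str(path)] = found
    return hits
```

A hit is not automatically a problem, because Firebase browser keys are meant to ship in client code. It is a prompt to check that the key's restrictions exclude the Gemini API.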
How to Apply
- If you plan to use the Gemini API from a Firebase project, set both 'HTTP referrer restrictions' and 'API restrictions (allow only generativelanguage.googleapis.com)' in the API key settings. Without an API restriction, an exposed key can call every API enabled in the project; without a referrer restriction, it can be used from any site.
- Design your architecture so that Gemini API calls are made only on the server side, not from the client (browser or app). If client-side calls are unavoidable, apply Firebase App Check to restrict API calls to your verified apps.
- Set a project spend cap in Google AI Studio or the GCP Console, and build a Pub/Sub + Cloud Function pipeline that automatically unlinks the billing account when the limit is exceeded. Email notifications alone are not a practical defense because they can arrive several hours late.
- If you add the Gemini API to a project originally created for another purpose, such as Firebase Authentication, every existing API key in that project that has no API restrictions gains Gemini access. Review the existing key list thoroughly and remove unnecessary API permissions.
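The server-side recommendation above can be sketched as a small helper that runs only on a backend, so the key never reaches the browser. The endpoint and the `x-goog-api-key` header follow the public Gemini REST API. The environment variable name `GEMINI_API_KEY`, the model name, and the prompt length limit are assumptions chosen for illustration.

```python
import json
import os
import urllib.request

GEMINI_ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent"
)
MAX_PROMPT_CHARS = 2_000  # hypothetical limit to bound the cost of one request


def build_gemini_request(prompt: str, api_key: str,
                         model: str = "gemini-2.0-flash") -> urllib.request.Request:
    """Build (but do not send) a generateContent request for the given prompt."""
    if not prompt or len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("prompt is empty or too long")
    body = json.dumps({"contents": [{"parts": [{"text": prompt}]}]}).encode("utf-8")
    return urllib.request.Request(
        GEMINI_ENDPOINT.format(model=model),
        data=body,
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def generate(prompt: str) -> dict:
    """Send the prompt from the server; the key is read from the environment."""
    api_key = os.environ["GEMINI_API_KEY"]  # assumed variable name, set on the server only
    request = build_gemini_request(prompt, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Your own backend endpoint would call `generate` after authenticating the user and applying rate limits, which also gives you a place to cut off abusive traffic before it is billed.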
Code Example
# Unlinking a GCP billing account programmatically (based on the official documentation)
# Caution: Do not use for production apps. Associated resources may be deleted.
# Option A: unlink the billing account using the gcloud CLI (shell command)
gcloud billing projects unlink PROJECT_ID
# Option B: use Pub/Sub + a Cloud Function to unlink automatically when the budget is exceeded
# 1. GCP Console > Billing > Budgets & alerts > Create budget
# 2. Set an alert threshold (e.g., 80%)
# 3. Select 'Connect a Pub/Sub topic'
# 4. Deploy the following Cloud Function (Python)
import base64
import json

from googleapiclient import discovery


def stop_billing(data, context):
    """Pub/Sub-triggered function: unlink billing once cost reaches the budget."""
    # Budget notifications arrive as base64-encoded JSON in data['data'].
    pubsub_data = base64.b64decode(data['data']).decode('utf-8')
    pubsub_json = json.loads(pubsub_data)
    cost_amount = pubsub_json['costAmount']
    budget_amount = pubsub_json['budgetAmount']
    # Assumption: the budget's display name is set to the project ID.
    project_id = pubsub_json['budgetDisplayName']
    if cost_amount < budget_amount:
        return  # still under budget, nothing to do
    billing = discovery.build('cloudbilling', 'v1')
    # An empty billingAccountName unlinks the billing account from the project.
    billing.projects().updateBillingInfo(
        name=f'projects/{project_id}',
        body={'billingAccountName': ''},
    ).execute()
    print(f'Billing disabled for project {project_id}')
# API key restriction settings (Google Cloud Console > APIs & Services > Credentials)
# - Application restrictions: HTTP referrers (web sites) -> Allow only your domain
# - API restrictions: Restrict key -> Select only generativelanguage.googleapis.com
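The key restriction settings listed above can also be checked in bulk rather than by clicking through the console. The sketch below audits key metadata held as plain dictionaries. It assumes the field names of the API Keys API v2 `Key` resource (`restrictions.apiTargets[].service`, `restrictions.browserKeyRestrictions`, and so on), for example as exported to JSON from your project; the function name and the wording of the findings are illustrative.

```python
GEMINI_SERVICE = "generativelanguage.googleapis.com"

# Application restriction fields, assumed from the API Keys API v2 Key resource.
APP_RESTRICTION_FIELDS = (
    "browserKeyRestrictions",
    "serverKeyRestrictions",
    "androidKeyRestrictions",
    "iosKeyRestrictions",
)


def audit_key(key: dict) -> list[str]:
    """Return a list of findings for one API key resource (empty if none)."""
    findings = []
    restrictions = key.get("restrictions") or {}
    services = [t.get("service") for t in restrictions.get("apiTargets") or []]
    if not services:
        findings.append("no API restrictions: usable with every API enabled in the project")
    elif GEMINI_SERVICE in services:
        findings.append("allowed to call the Gemini API: confirm this is intended")
    if not any(restrictions.get(field) for field in APP_RESTRICTION_FIELDS):
        findings.append("no application restriction (referrer, IP, Android or iOS)")
    return findings


# Example: a key created with 'No API restrictions', as in the incident.
print(audit_key({"displayName": "Browser key"}))
```

Running this over every key in a project before enabling the Gemini API would surface old unrestricted keys like the year-old browser key in this case.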
Related Papers
1-Bit Bonsai Image 4B Image Generation for Local Devices
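A model that compresses the weights of a 4B-parameter image generation model to an extreme 1-bit/ternary representation so that it can run even on an iPhone. It shrinks a 7.75GB diffusion transformer down to 0.93GB.
Show HN: Tiny-vLLM – high performance LLM inference engine in C++ and CUDA
An educational LLM inference engine project for learning by implementing the core features of vLLM directly in C++ and CUDA. Source code and step-by-step lectures are provided together.
Real-time LLM Inference on Standard GPUs: 3k tokens/s per request
Kog AI released an LLM inference engine that reaches 3,000 tokens/s per request on 8× AMD MI300X, resolving bottlenecks in the existing software stack by maximizing GPU memory bandwidth utilization.
A sleep-like consolidation mechanism for LLMs
A paper proposing a new mechanism that, like human sleep, periodically compresses and stores context into fast weights, in order to address the attention cost incurred when LLMs process long contexts.
CODA: Rewriting Transformer Blocks as GEMM-Epilogue Programs
Introduces CODA, a kernel abstraction that addresses the memory bottleneck in Transformer training on GPUs by running small operations such as normalization and activation while the GEMM output is still on chip. It is drawing particular attention because LLMs can use this abstraction to generate high-performance kernels automatically.
KV-Fold: One-Step KV-Cache Recurrence for Long-Context Inference
Using the KV cache as an accumulator across chunks, without modifying the model, allows information to be retrieved with 100% accuracy at up to 128K tokens.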
Related Resources
- https://discuss.ai.google.dev/t/unexpected-54k-billing-spike-in-13-hours-firebase-browser-key-without-api-restrictions-used-for-gemini-requests/140262
- https://trufflesecurity.com/blog/google-api-keys-werent-secrets-but-then-gemini-changed-the-rules
- https://ai.google.dev/gemini-api/docs/billing#tier-spend-caps
- https://ai.google.dev/gemini-api/docs/billing#project-spend-caps
- https://cloud.google.com/billing/docs/how-to/budgets-programmatic-notifications
- https://news.ycombinator.com/item?id=47156925