Show HN: GoModel – an open-source AI gateway in Go | AI Paper Digest

TL;DR Highlight

GoModel unifies access to OpenAI, Anthropic, Gemini, and other AI providers through a single, OpenAI-compatible API, offering a compiled-language alternative to LiteLLM.

Who Should Read

Backend developers simultaneously using multiple LLM providers, or those interested in the performance, supply chain security, and Go ecosystem integration benefits over LiteLLM.

Core Mechanics

GoModel is a Go-written AI gateway that integrates various providers—OpenAI, Anthropic, Gemini, xAI, Groq, OpenRouter, Z.ai, Azure OpenAI, Oracle, and Ollama—into a single OpenAI-compatible API.
It can be launched with a single Docker command, requiring only the API keys for the desired providers as environment variables; at least one provider key is needed for operation.
Positioned as an alternative to LiteLLM, it natively supports observability (monitoring), guardrails (safety filters), and streaming (streaming responses).
Its use of the Go compiled language is highlighted as a strength, offering greater security against runtime supply chain attacks compared to Python-based LiteLLM due to fixed dependencies at compile time.
It supports Prometheus metric integration and includes separate configuration files (prometheus.yml) and a docker-compose.yaml for easy monitoring environment setup.
A semantic caching layer appears to be present, with the gateway embedding requests and using vector similarity search to determine cache hits.
A Helm chart is included, enabling deployment in Kubernetes environments.
Currently, it has 319 stars and 20 forks on GitHub and is actively being committed, indicating an early-stage project.

Evidence

"In response to a question about the importance of being written in Go, a comment pointed out that Go compiled binaries have a significantly smaller runtime supply chain attack surface than Python-based tools, a point also made by the developer of a similar Go gateway (sbproxy.dev). An experienced AI proxy maintainer noted that the most challenging aspect is adapting to changing input/output structures with each model/provider release, emphasizing that integration within 24 hours of a new model launch is crucial for a well-managed project. Concerns were raised about the maintenance burden of keeping up with provider updates due to the lack of a robust Go SDK compared to JavaScript and Python, a challenge the author acknowledges. A vllm user inquired about Ollama integration, and requests were made for cost tracking per model/route, particularly for mixed free/paid model usage. Questions were also raised about potential open-source rug pulls, and the need for the unified API to abstract provider-specific parameters like temperature, reasoning effort, and tool choice mode."

How to Apply

"If you're using multiple LLM providers and want to avoid modifying client code with each model switch, deploy GoModel as an intermediary gateway and route all requests to its OpenAI-compatible endpoint at `http://localhost:8080`. Provider switching is then managed through environment variables. If you're running LiteLLM and concerned about Python runtime supply chain security or memory/performance overhead, consider switching to GoModel. Its compiled binary has no runtime dependencies and the Docker image is lightweight. For centralized management of AI traffic in Kubernetes, leverage the included Helm chart to deploy GoModel to your cluster and integrate it with Prometheus to monitor model response times and error rates. If your team manages AI provider keys individually, use GoModel as an internal gateway, directing team members to its endpoint to centralize key management."

Code Example

snippet

# Minimal execution (using only OpenAI)
docker run --rm -p 8080:8080 \
  -e OPENAI_API_KEY="your-openai-key" \
  enterpilot/gomodel

# Using multiple providers simultaneously
docker run --rm -p 8080:8080 \
  -e OPENAI_API_KEY="your-openai-key" \
  -e ANTHROPIC_API_KEY="your-anthropic-key" \
  -e GEMINI_API_KEY="your-gemini-key" \
  -e GROQ_API_KEY="your-groq-key" \
  -e OPENROUTER_API_KEY="your-openrouter-key" \
  -e XAI_API_KEY="your-xai-key" \
  -e AZURE_API_KEY="your-azure-key" \
  -e AZURE_BASE_URL="https://your-resource.openai.azure.com/openai/deployments/your-deployment" \
  -e AZURE_API_VERSION="2024-10-21" \
  enterpilot/gomodel

# Then, in the client, only change the base_url
# openai.OpenAI(base_url="http://localhost:8080", api_key="any-value")

Terminology

AI GatewayA proxy server that intercepts requests to various AI providers (OpenAI, Anthropic, etc.), routing, logging, and caching them. Clients only need to interact with the gateway.

OpenAI-compatible APIAn API that processes requests and responses in the same format as OpenAI's REST API. Existing OpenAI SDKs can be used to connect to other models.

guardrailsAutomated safety mechanisms that detect and block harmful or policy-violating content in LLM inputs and outputs. Examples include profanity filters and PII masking.

supply chain attackAn attack that compromises software by injecting malicious code into its dependencies or packages. Languages that load packages at runtime, like Python, are vulnerable, while Go compiled binaries are relatively safer due to fixed dependencies after compilation.

semantic cachingA technique that reuses existing cache entries for requests with similar meanings, even if the text is not identical. For example, treating 'What's the weather like today?' and 'How is the weather now?' as the same question.

observabilityThe ability to measure and track the internal workings of a system from the outside. Logs, metrics, and traces are key components.

Show HN: GoModel – an open-source AI gateway in Go