EuroLLM: An LLM made in Europe, built to support all 24 official EU languages
TL;DR Highlight
An open-source 1.7B-parameter LLM, jointly developed by eight European universities and research institutions, that supports all 24 official EU languages.
Who Should Read
European developers, researchers, and public-sector organizations that need multilingual LLM capabilities while meeting EU data-sovereignty requirements.
Core Mechanics
- Covers all 24 official EU languages at a useful quality level; the first model to do so comprehensively
- 1.7B parameters, small enough to deploy on consumer hardware and edge devices (see the footprint sketch after this list)
- Open source under a permissive license, trained entirely on EU-sourced data for compliance
- Developed by a consortium of eight European academic and research institutions
- Benchmarked across multilingual NLP tasks; competitive with larger models on low-resource EU languages
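
To make the consumer-hardware claim concrete, here is a rough weight-memory estimate for a 1.7B-parameter model. The precision options are standard choices and illustrative assumptions, not figures from the EuroLLM release.

```python
# Rough memory footprint for inference with a 1.7B-parameter model.
# Bytes-per-parameter values are standard for each precision; the 1.7B
# figure comes from the release, everything else is illustrative.
PARAMS = 1.7e9

for precision, bytes_per_param in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1), ("int4", 0.5)]:
    gib = PARAMS * bytes_per_param / 2**30
    print(f"{precision:>9}: ~{gib:.1f} GiB for weights alone")

# fp16 comes out at ~3.2 GiB, which fits common consumer GPUs and laptops;
# activations and the KV cache add overhead on top of this.
```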
Evidence
- Benchmark results across 24 languages on standard multilingual NLP tasks
- Open weights released on Hugging Face
- Model card documents training data sources (all EU-origin)
How to Apply
- Use the model for multilingual text classification, summarization, or QA across EU languages where data privacy or sovereignty is a requirement (a minimal loading sketch follows this list).
- Fine-tune on your own domain data to improve performance on specialized use cases (see the LoRA sketch below).
- Deploy on-premise for GDPR compliance, without sending data to US-based API providers.
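
A minimal local-inference sketch using the Hugging Face transformers API. The repo id is an assumption based on the open-weights release noted under Evidence; verify it against the model card.

```python
# Minimal on-premise inference sketch with Hugging Face transformers.
# The repo id "utter-project/EuroLLM-1.7B-Instruct" is an assumption;
# check the actual model card before using it.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "utter-project/EuroLLM-1.7B-Instruct"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,   # ~3.2 GiB of weights; fits consumer GPUs
    device_map="auto",            # falls back to CPU if no GPU is present
)

# Everything runs locally: no text leaves the machine.
# German prompt: "Summarize the following paragraph in German: ..."
prompt = "Fasse den folgenden Absatz auf Deutsch zusammen: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```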
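For the fine-tuning step, a parameter-efficient method such as LoRA keeps hardware requirements modest. A minimal sketch with the peft library follows; the repo id and target module names are assumptions that depend on the model's architecture, not details from the EuroLLM release.

```python
# Parameter-efficient fine-tuning sketch with LoRA via the peft library.
# Target module names vary by architecture; "q_proj"/"v_proj" are common
# for Llama-style models and are an assumption here.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("utter-project/EuroLLM-1.7B")  # assumed repo id

lora_config = LoraConfig(
    r=16,                                  # low-rank dimension (illustrative)
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],   # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters

# From here, train with transformers' Trainer or a custom loop on your
# domain-specific text; only the small LoRA adapters are updated.
```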
Related Papers
Shai-Hulud Themed Malware Found in the PyTorch Lightning AI Training Library
PyTorch Lightning versions 2.6.2 and 2.6.3 delivered credential-stealing malware via a supply-chain attack.
Alignment whack-a-mole: Finetuning activates recall of copyrighted books in LLMs
Fine-tuning even safety-aligned LLMs can bypass their safeguards and make them reproduce copyrighted text verbatim, showing that prompt filtering alone is not enough to prevent copyright infringement.
Show HN: MacMind – A transformer neural network in HyperCard on a 1989 Macintosh
An educational project implementing a single-layer Transformer with 1,216 parameters in the scripting language HyperTalk (1987) and training it on a real Macintosh SE/30. It demonstrates that the core mathematics of modern LLMs works the same on hardware from the late 1980s.
MegaTrain: Full Precision Training of 100B+ Parameter LLMs on a Single GPU
MegaTrain uses CPU memory as primary storage and treats the GPU purely as a compute engine, enabling full-precision training of 120B-parameter models on a single H200 GPU.
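
A quick back-of-the-envelope calculation shows why this offloading is necessary. The Adam-state accounting below is a standard assumption about full-precision training, not a detail taken from the MegaTrain announcement.

```python
# Why 120B-parameter full-precision training cannot fit in GPU memory alone.
# Standard accounting: fp32 weights (4 B/param), fp32 gradients (4 B/param),
# and Adam optimizer state (fp32 momentum + variance, 8 B/param).
PARAMS = 120e9
weights_gb = PARAMS * 4 / 1e9   # 480 GB
grads_gb   = PARAMS * 4 / 1e9   # 480 GB
optim_gb   = PARAMS * 8 / 1e9   # 960 GB
total_gb = weights_gb + grads_gb + optim_gb
print(f"total training state: ~{total_gb:.0f} GB vs 141 GB of H200 HBM")
# ~1,920 GB of state against 141 GB of HBM: hence CPU RAM as primary
# storage, streaming tensors to the GPU only while it computes on them.
```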
Show HN: I built a tiny LLM to demystify how language models work
This educational project lets you build an 8.7-million-parameter mini LLM from scratch, trained on a Guppy fish character, in about 5 minutes in a single Colab notebook, with the goal of demystifying the black-box nature of LLMs.