EuroLLM: An LLM made in Europe, built to support all 24 official EU languages
TL;DR Highlight
A 1.7B-parameter open-source LLM, jointly developed by 8 European universities and institutions, that supports all 24 official EU languages.
Who Should Read
European developers, researchers, and public sector organizations that need multilingual LLM capabilities and must meet EU data sovereignty requirements.
Core Mechanics
- Covers all 24 official EU languages at a useful quality level; presented as the first model to do so comprehensively
- 1.7B-parameter model, small enough to deploy on consumer hardware and edge devices
- Open-source under a permissive license; trained entirely on EU-sourced data to ease compliance
- Developed by a consortium of 8 European academic and research institutions
- Benchmarked across multilingual NLP tasks; competitive with larger models on low-resource EU languages
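To make the "consumer hardware" point concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone would occupy at different numeric precisions. The 1.7B figure comes from this summary; the bytes-per-parameter values are the standard storage sizes for each format. This counts weights only (activations, KV cache, and runtime overhead are extra), and whether quantized variants keep acceptable quality is an assumption, not something the source states.

```python
# Rough weight-memory estimate for a 1.7B-parameter model.
# Only weights are counted; activations, KV cache, and framework
# overhead come on top and are NOT included here.

PARAMS = 1.7e9  # parameter count stated in this summary

# Bytes per parameter for common storage formats (standard sizes).
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "fp16/bf16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Return weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for fmt, size in BYTES_PER_PARAM.items():
        print(f"{fmt:>10}: {weight_memory_gb(PARAMS, size):.2f} GB")
```

At 16-bit precision the weights come to about 3.4 GB, which is the arithmetic behind the claim that a model of this size can fit on a typical consumer GPU or a laptop with sufficient RAM.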
Evidence
- Benchmark results across 24 languages on standard multilingual NLP tasks
- Open weights released on HuggingFace
- Model card documents training data sources (all EU-origin)
How to Apply
- Use this model for multilingual text classification, summarization, or QA across EU languages where data privacy or sovereignty is a requirement.
- Fine-tune on your specific domain data to improve performance for specialized use cases.
- Deploy on-premises to support GDPR compliance without sending data to US-based API providers.
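The steps above can be sketched as a small local-inference helper. This is a minimal illustration, not the project's documented usage: it assumes the open weights have already been downloaded to a local directory (`MODEL_PATH` is a hypothetical placeholder), that the Hugging Face `transformers` library and PyTorch are installed, and that a simple zero-shot prompt is good enough for classification. The prompt template and the function names are illustrative choices of this sketch.

```python
from typing import List

# Hypothetical placeholder: point this at the directory that holds the
# downloaded model weights and tokenizer files.
MODEL_PATH = "path/to/local/eurollm-weights"


def build_classification_prompt(text: str, labels: List[str]) -> str:
    """Build a plain-text prompt asking the model to pick one label."""
    label_list = ", ".join(labels)
    return (
        f"Classify the following text into one of these categories: {label_list}.\n"
        f"Text: {text}\n"
        f"Category:"
    )


def classify_locally(
    text: str,
    labels: List[str],
    model_path: str = MODEL_PATH,
    max_new_tokens: int = 8,
) -> str:
    """Run the prompt through a locally stored causal LM and return its answer."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # local_files_only=True makes loading fail rather than fetch files
    # from the Hub, so inference stays on the local machine.
    tokenizer = AutoTokenizer.from_pretrained(model_path, local_files_only=True)
    model = AutoModelForCausalLM.from_pretrained(model_path, local_files_only=True)

    prompt = build_classification_prompt(text, labels)
    inputs = tokenizer(prompt, return_tensors="pt")
    # Greedy decoding keeps the output deterministic for a given prompt.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

A call such as `classify_locally("Das Paket kam beschädigt an.", ["complaint", "praise"])` would run entirely on the local machine. A base (non-instruction-tuned) model may need a few in-prompt examples or domain fine-tuning before its labels are reliable, so treat the raw output as a starting point to validate.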
Terminology
Low-resource language: A language with limited training data available, making it harder to build high-quality NLP models for.
Data Sovereignty: The principle that data is subject to the laws and governance of the country where it is collected and stored.
GDPR: General Data Protection Regulation. The EU's primary data privacy law, governing how personal data of EU residents is processed.
Multilingual LLM: An LLM trained on data from multiple languages, capable of understanding and generating text in all of them.