TL;DR
Portugal’s state-funded AMÁLIA large language model is now operational and outperforms earlier open models on European Portuguese benchmarks, but experts are raising three fundamental questions about its openness, its native-language data, and its goals. These questions bear on national AI strategies across Europe.
Portugal’s €5.5 million government-funded project, AMÁLIA, is now operational. It delivers a multilingual model focused on Portuguese that outperforms earlier open models on several benchmarks, but it faces significant scrutiny over three fundamental structural questions.
The AMÁLIA project involves approximately 60 researchers from Portugal’s top institutions, including NOVA, IST, and IT, and was announced in December 2024. Its base model, completed on September 30, 2025, is accessible to 450,000 academic users via the FCT’s IAedu platform and has a knowledge cutoff at the end of 2023. The model is a continuation of the EuroLLM multilingual foundation rather than a model trained from scratch, in contrast with Italy’s Minerva, which was trained from scratch on Italian and English data.
According to the technical report by Vieira et al. (2026), AMÁLIA outperforms previous open models on European Portuguese benchmarks and beats Qwen 3-8B on most Portuguese tasks, though it still trails Qwen on certain benchmarks like ALBA. The project aims for a final version by June 2026, with ongoing developments and potential improvements in native-language capabilities and multimodal features.
AMÁLIA: The three hard questions.
Portugal spent €5.5M to build a European Portuguese LLM. The base version is operational, and it beats Qwen 3-8B on most pt-PT tasks. So why are the most important questions still unanswered?
Last month, Duarte O.Carmo published the sharpest public analysis of AMÁLIA — Portugal’s state-funded European Portuguese large language model. He prefaces his critique with the necessary diplomatic apparatus before doing what almost nobody else in the European-sovereign-LLM discourse has been willing to do publicly: asking hard questions about whether the work, as released, actually does what it set out to do. This piece is a structural extension of his analysis. The AMÁLIA case study exposes three hard questions every national LLM effort needs to answer publicly — and the broader European sovereign-LLM movement has been operating without explicit answers to any of them.
Three questions every national LLM effort needs to answer publicly.
Duarte O.Carmo’s framing maps cleanly onto the structural argument, and each question applies specifically to AMÁLIA.
The three questions form a structural feedback loop. Q3 (the optimization target) determines Q2 (the data volume needed), which in turn conditions Q1 (whether openness is sufficient for community contribution). The European sovereign-LLM movement as a whole benefits when these questions become a standard part of methodology disclosure rather than an exceptional critique.

107 billion tokens. 5.8 billion clearly pt-PT.
The structurally tractable question with a structurally surprising answer. For a model whose entire stated purpose is European Portuguese prioritization, the native-language share of extended pre-training is 5.5%. The implications cascade into every other question.
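The share follows directly from the two headline figures. As a minimal sketch, assuming the rounded numbers quoted in this article (107 billion tokens in total, 5.8 billion clearly pt-PT), the arithmetic can be checked as below. The function and constant names are illustrative and do not come from the AMÁLIA technical report. With these rounded inputs the result comes out near 5.4%, in line with the 5.5% figure quoted above; the small gap presumably reflects rounding in the published token counts.

```python
# Headline figures as quoted in this article (rounded, in billions of tokens).
TOTAL_TOKENS_B = 107.0  # extended pre-training corpus
PT_PT_TOKENS_B = 5.8    # tokens identified as clearly European Portuguese


def native_share(native_tokens: float, total_tokens: float) -> float:
    """Return the native-language share of a corpus as a percentage."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return 100.0 * native_tokens / total_tokens


share = native_share(PT_PT_TOKENS_B, TOTAL_TOKENS_B)

# How many tokens of other data accompany each clearly pt-PT token.
other_per_native = (TOTAL_TOKENS_B - PT_PT_TOKENS_B) / PT_PT_TOKENS_B

print(f"pt-PT share of extended pre-training: {share:.1f}%")
print(f"other tokens per pt-PT token: {other_per_native:.1f}")
```

Put differently, under these figures each clearly pt-PT token is accompanied by roughly seventeen tokens of other data.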

The Olmo standard. AMÁLIA’s current state.
The Allen Institute for AI’s Olmo project defines what “fully open” requires in operational terms. Olmo doesn’t lead frontier benchmarks. That’s not the point. The point is to be the structural reference for openness. AMÁLIA’s “fully open source” claim should be measured against that operational standard.

Four strategic positions. AMÁLIA between two and three.
More than €100M in publicly disclosed European sovereign-LLM funding is spread across the major initiatives. The structural question every project faces: what is the actual competitive position you’re staking? There are four options. None are mutually exclusive, but each requires different commitments.

Three standards. For AMÁLIA and the movement.
The structural critique generalizes beyond AMÁLIA. Italy, France, Germany, Switzerland, the OpenEuroLLM consortium, and every subsequent national project benefit from public discourse holding national LLM efforts to operational standards on openness, data accounting, and strategic positioning.
The European sovereign-AI agenda is a serious strategic project that deserves serious public discourse. O.Carmo’s analysis is what serious public discourse looks like. Appropriately diplomatic. Structurally rigorous. Willing to ask the hard questions in public when the public investment justifies it. More of this is needed — across every European sovereign-LLM project, not just AMÁLIA.
Implications for European Sovereign LLM Strategies
The development and deployment of AMÁLIA exemplify a broader European effort to create national language models, highlighting the importance of transparency, native data utilization, and goal-setting. The questions raised about openness, native-language data sufficiency, and primary objectives are critical for shaping future policy, funding, and research directions across Europe. How these issues are addressed will influence the continent’s AI sovereignty and competitiveness.
European Sovereign LLM Efforts and Structural Challenges
Across Europe, several countries and initiatives, such as Italy’s Minerva, Germany’s Aleph Alpha, France’s Mistral, and the OpenEuroLLM consortium, are pursuing language models of their own with varying approaches. These efforts emerged amid concerns over dependency on US-based models and the desire for AI sovereignty. Three challenges recur: defining what ‘fully open’ means, determining how much native-language data is enough, and establishing clear objectives for these models. Portugal’s AMÁLIA is a key case study because of its significant public investment and national scope, which make its structural questions particularly salient.
“The three questions—openness, native data, and goals—are fundamental for understanding the true potential and limitations of national LLMs.”
— Duarte O.Carmo
Unanswered Questions About AMÁLIA’s Openness and Objectives
It remains unclear how open AMÁLIA truly is in practice, especially regarding access to underlying data and model weights. The extent to which native Portuguese data suffices for future improvements is also uncertain. Additionally, the primary goals—whether to maximize openness, native-language performance, or strategic sovereignty—are still under discussion within the community. The final version’s capabilities and limitations are yet to be fully revealed, and ongoing developments may address some of these gaps.
Next Steps for AMÁLIA Development and European LLM Policy
The final version of AMÁLIA is scheduled for release in June 2026, which will clarify some of these structural questions. Researchers and policymakers will closely monitor its performance, openness, and native-language capabilities. Concurrently, broader European initiatives are likely to refine their strategies around transparency, native data use, and objectives, driven by the insights gained from AMÁLIA and similar projects. Public discussions and evaluations will shape the continent’s AI sovereignty roadmap in the coming year.
Key Questions
What makes AMÁLIA different from other European language models?
AMÁLIA is built as a continuation of the EuroLLM multilingual foundation, not trained from scratch, and clearly European Portuguese data makes up only about 5.5% of its extended pre-training. It is publicly funded and aimed at serving Portugal’s academic and public sectors, making its structural questions particularly relevant.
Why are the questions about openness and native data important?
These questions determine how accessible and effective the model can be for native speakers, influence transparency, and impact strategic sovereignty. Addressing them is crucial for trustworthy and competitive AI development.
What are the potential risks of not answering these questions?
Without clarity, there is a risk of overestimating the model’s capabilities, limiting transparency, and missing opportunities for strategic control over AI resources, which could hinder Europe’s AI sovereignty efforts.
Will AMÁLIA’s final version resolve these questions?
It is uncertain. The final version will likely clarify some capabilities and limitations, but broader structural questions about openness and strategic goals may require ongoing policy and community debate.
Source: ThorstenMeyerAI.com