GPT-NL

Generative AI large language models (LLMs) have taken the world by storm. While the LLMs created by Big Tech get the most attention, they also face controversy over data sources and privacy, and they require Dutch companies to rely on technologies built elsewhere. TNO, SURF, and the Netherlands Forensic Institute (NFI) created a digitally sovereign Dutch LLM that is responsible, compliant, and secure. It offers Dutch organisations a trustworthy, viable alternative.

Beyond Big Tech

LLMs like ChatGPT and Google Gemini are currently leading the market. These language models are trained on very large datasets, the origins of which are not fully transparent. Copyright holders have filed legal actions against some major tech companies in various jurisdictions over the use of copyrighted material. Lastly, interacting with these LLMs in Dutch often amounts to little more than working with translated English data, which makes the results less applicable to Dutch contexts, culture, and values.

In 2024, TNO experts joined forces with IT cooperative SURF and the Netherlands Forensic Institute to develop a state-of-the-art AI research facility and the first lawful, sovereign Dutch LLM built from scratch. GPT-NL aims to be a trustworthy and transparent language model specifically for professional and enterprise use. It is optimised for real-world business applications where trust, security, and compliance are non-negotiable. Rather than mere novelty, GPT-NL focuses on delivering proven business value, and adheres to GDPR requirements and the principles of the EU AI Act.

Setting the standard

Since the project began, its partners have strived to remain transparent and open about data sources and their decision-making processes. The data used to train the model (mostly Dutch, English, and code sources) consists solely of opt-in data, data that may lawfully be used to train AI models, and synthetic data that does not infringe intellectual property (IP) rights. Any copyrighted material used in training was provided with the copyright holder’s permission or under licences that permit AI training. All personally identifiable data has been filtered out of the sources.

This approach aims to set the standard for responsible, compliant LLM development throughout the Netherlands and Europe. GPT-NL and its creators intend to create a strong ecosystem that strengthens the broader European AI innovation landscape and enables a viable and trusted alternative to less transparent models. To date, more than a dozen organisations have joined the GPT-NL partnership, and trusted sources like NDP Nieuwsmedia and Algemeen Nederlands Persbureau (ANP), the largest independent national news agency in the Netherlands, have contributed their archives to train the model.

A clean and inclusive data chain

GPT-NL also goes several steps further. The team is investigating ways to address sustainability challenges: the tremendous energy supercomputers need to process complex information, plus the water and energy required to cool the data centres. It is also involving experts and opening dialogues to investigate and mitigate model bias on the basis of gender, age, ethnicity, sexual orientation, disability, or socio-economic status. Through extensive documentation and clear communication, GPT-NL aims to remain transparent and understandable. The team has established a Content Board to safeguard the interests and rights of copyright holders and to give them a voice in the future of the LLM.

Invest in the future of AI

GPT-NL has been trained on more than 1 trillion tokens and is ready to be deployed in on-premise or private cloud environments. It is therefore also appropriate for classified and confidential documents. The GPT-NL team is now ready to welcome the first group of launch customers to test and refine the system on tasks like summarisation, simplification, and Retrieval-Augmented Generation (RAG) Q&A. These launch customers will play a critical role in further developing the model, while also working with a secure and legally compliant model for their own advancement.

Want to be part of strengthening the Dutch and European AI innovation ecosystem? Eager to contribute to compliant, responsible, and safe LLM development and European innovation? Get in touch today or visit www.gpt-nl.nl to learn how you can contribute to a sovereign AI future.