Translation of the Finnish blog post “Hankintatietovaranto ei synny sattumalta” by Juho Savonen, published on the Valtiokonttori (State Treasury of Finland) website.
The national procurement data repository project has now been running for around a year and a half. The work has been intensive, at times demanding, but above all necessary. The transition to production is beginning to take shape on the horizon at the same time as the Government Bill on the procurement data repository is being finalised and is on its way to Parliament for consideration.
The Bill contains several significant reforms and new tasks. Their common denominator is clear: to create the missing preconditions for data-driven management of public procurement, which those working in the field have been calling for for years, if not decades.
Because the assessment memorandum behind the project was written some time ago, it is worth pausing for a moment and returning to a basic question: why is the procurement data repository being built in the first place?
The current information base is insufficient for data-driven management
“Major structural decisions are made without any reliable way to demonstrate their economic impact.”
There is no shortage of data on public procurement. Invoices accumulate, notices are published and various compilations are made. A public notice tells us that a procurement procedure has been carried out. An invoice tells us that money has been spent. Both are useful as such, but on their own they do not answer the questions of why, how and with what effects procurement is carried out.
The total volume of public procurement is, depending on the calculation method, around EUR 40 billion per year. It is telling, for the purposes of the project rationale, that even defining the total volume is not particularly easy with the current information base. More important than determining the exact total volume would be to know what kinds of procurement procedures make up this sum. The research report published by the Finnish Competition and Consumer Authority (KKV) in 2023 shows that competition in public procurement in Finland is weak. The significance of the monetary volume of procurement can be illustrated with a back-of-the-envelope calculation:
- about one third of annual procurement procedures exceed the threshold value, meaning that under procurement legislation they should in principle be put out to tender;
- on the basis of KKV’s 2023 research report, it can be roughly estimated that around 30 per cent of competitive procurement procedures attract only one or two tenderers;
- the cost saving brought by additional competition in a comparable situation would be around 5 per cent;
- in other words, if competition improved in just every tenth of these procurement procedures, the annual cost saving would be approximately EUR 20 million.
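The chain of estimates in the list can be written out as a short calculation. This is only a sketch of the arithmetic, not an official model: it assumes that each share can be applied directly to monetary volume (the one-third figure in the text refers to procedures, not euros), and all variable names are illustrative.

```python
# Back-of-the-envelope estimate using the rough figures quoted in the text.
total_volume_eur = 40e9          # total annual public procurement volume
share_above_threshold = 1 / 3    # assumption: applied to value, not to procedure count
share_weak_competition = 0.30    # procedures attracting only one or two tenderers
saving_from_competition = 0.05   # saving brought by additional competition
share_improved = 0.10            # "every tenth procurement"


def estimated_annual_saving(total, above, weak, saving, improved):
    """Multiply the shares through to get the annual saving in euros."""
    return total * above * weak * saving * improved


annual_saving_eur = estimated_annual_saving(
    total_volume_eur,
    share_above_threshold,
    share_weak_competition,
    saving_from_competition,
    share_improved,
)
print(f"Estimated annual saving: EUR {annual_saving_eur / 1e6:.0f} million")
```

Changing any single input shows how sensitive the result is to each assumption.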
The calculation above is very rough. It does not take into account, for example, the share of direct awards within procurement procedures above the threshold, sector-specific variation, or the effect of procurement size on the level of competition. Then again, hardly any comprehensive data is available for making a better estimate. In the current situation, assessing the impact of public procurement is therefore simply difficult, not because of a lack of interest or expertise, but because the information is scattered and the data produced at different stages of the process is not linked. This observation is not merely an opinion; independent studies have repeatedly confirmed it:
- The Government’s investigation and research activity (TEAS) report completed in 2021 showed that assessing measures relating to procurement is difficult because procurement data is not available in one centralised location. When information is fragmented across different systems and datasets, the overall picture inevitably remains incomplete, and reliable conclusions about the volumes, structures or effects of procurement cannot be drawn.
- The Policy Brief published by the Finnish Competition and Consumer Authority in 2022 states that invoice data alone is not sufficient to assess the success of procurement. Procurement cannot be understood from the outcome alone if the process itself remains opaque. Information is needed on the different stages of the procurement process: on the call for tenders, the tenders themselves, the award criteria and the participants.
- The conclusion of the TEAS study on joint procurement completed in 2023 was that assessing the cost-effectiveness of joint procurement is not possible at all with the data currently available.
In other words, both ordinary procurement decisions and major structural decisions are made without any reliable way to demonstrate their economic impact.
Open data is not the same as research-grade data
Open data is an important principle, but openness alone does not make data usable. The message from researchers has been quite unambiguous for years now. The current datasets simply do not yield the information that a reliable assessment of the effects of public procurement, the functioning of markets or the development of regulation would require.
When data is not structured, comprehensive and comparable over time, research is inevitably incomplete. The same applies to strategic decision-making. Without a reliable information base, policy choices must be made on the basis of incomplete information, individual examples or, at best, limited studies.
This does not, however, mean that earlier decisions have been wrong or that changes should not have been made. The public sector must be able to renew itself even when information is scarce or imperfect. This is how things have been done so far — and how they will continue to be done.
The significance of the procurement data repository is not that it would remove all uncertainty from decision-making. Its value lies in the fact that, in the future, the direction of change will not have to be sought quite as blindly. When more and better information is available, we can make better decisions with somewhat less risk.
What the Government has wanted to address
The procurement data repository is intended to answer a long-standing problem that the research community, the business sector and public actors have all identified.
The objective is simple but ambitious: to gather the key data on public procurement into a single, coherent structure in such a way that it serves both the management of contracting authorities themselves and the needs of society as a whole. This is the foundation on which the effects of public procurement can finally be examined as a whole.

Figure 1: Vision of the procurement data space.
The interest of the contracting authority and that of society are not in conflict
A better information base helps the contracting authority directly: in identifying its own development priorities, benchmarking its operations against others, improving market dialogue and managing risks. At the same time, it serves the broader societal objective by providing the means to assess the effectiveness of procurement policy, the functioning of competition and the use of public funds. This is not a choice between the contracting authority and society — it is a question of a shared interest.
A good example of this is the preliminary research outline presented by Professor Janne Tukiainen at the KKV Day in November 2025. The study examines Italian procurement data and how an increase in competition — in practice, in the number of tenders — affects the final price of a procurement and the likelihood of subsequent renegotiation of the contract. Preliminary results suggest that the more tenders received in a tendering procedure, the lower the prices, but also the more likely the contract is to be renegotiated later. The result does not, however, mean that renegotiations would nullify the benefits of competitive tendering. The phenomenon arises in situations where competition is plentiful and, as a result, the price is already low.
In Finland, a phenomenon of this kind would be difficult to assess without the information base that the procurement data repository is intended to provide.
The procurement data repository does not come without investment
It is entirely clear that the procurement data repository will not come for free. Contracting authorities will have to make changes to their working methods, systems and skills. Producing data in a structured and uniform manner takes work — and that should not be played down. This has also been seen in concrete terms in the invoice pilots: even where the data already exists, making use of it requires shared definitions, technical work and, above all, deliberate investment. The final report on the invoice data pilots describes in more detail the changes required for data submission.
Even at the risk that this argument may not stand the test of time, it must be stated plainly: the shortcomings of the current situation cannot be brushed aside by appealing to artificial intelligence. AI does not, by itself, know what a contracting authority has bought. Invoice data is needed for that. Nor does AI know which tendering procedures and contracts the purchases are based on. For that, information on the procurement process and contracts is needed. Fortunately, some of this information is already structured today in Hilma, provided that the contract award notices required by law are submitted. There is still significant room for improvement here: in 2025, for example, contract award notices were left unsubmitted for around 40 per cent [9] of procurement procedures above the EU threshold.
“Despite artificial intelligence — or perhaps precisely because of it — data has to be created. There is no sense in avoiding structured data.”
The idea that AI will resolve the link between invoices and contracts by itself is optimistic. In practice, the link rests, for humans and AI alike, on guesswork, elimination and contextual interpretation. If a contracting authority has several contracts in the same field with the same supplier, which is usually the case, identifying the correct contract on the basis of an invoice is not straightforward. Nor does AI remove security, data protection or confidentiality obligations, and it cannot by itself produce reliable classifications for these without a structure defined by a human.
Despite artificial intelligence — or perhaps precisely because of it — data has to be created. There is no sense in avoiding structured data. On the contrary: even in the post-AI-hype world, high-quality, comparable and managed data is the foundation of all analysis and automation.
One often hears the view that “it’s worth waiting — there’s a reform on the horizon, an EU regulation, or AI will develop further”. One can always wait, but the journey of change is long even after the wait, and each year without investment further delays reaching the target state. At the same time, public procurement runs every day, in the billions, without a proper overall picture.
It is honest to say that without these investments the current situation will continue. Contracting authorities, public authorities, legislators and researchers do their best with incomplete information, and society loses the opportunity to understand what public money is actually spent on and what kind of results are achieved with it.
What is gained from the investment when security is taken into account?
The procurement data repository is not being built at the expense of security. Data protection and information security are not features to be added afterwards but starting points that guide the design of the whole. Not all data will be made openly available, nor should it be. What matters is that the right information is in the right place and put to the right use.
The data repository is being built in line with policies in force, and its solutions are also assessed through external information security assessments. In my view, it is justified to claim that security as a whole improves once procurement-related data begins to be processed systematically and according to common rules. Risks are then also better identified, and the public sector is at the same time prompted to consider the risks of information management and procurement more broadly — not just as individual cases, but as a structural question.
“In return for the investment, something is gained that is currently missing: a shared situational picture.”
In return for the investment required by the data repository, something is gained that is currently missing: a shared situational picture. Its core is formed by invoices carrying a procurement procedure identifier, which for the first time make it possible to examine the use of contracts across their whole life cycle. The reverse case — identifying purchases made outside any contract — also becomes possible. This is a phenomenon that has so far remained largely in the shadows.
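The idea of invoices carrying a procurement procedure identifier can be illustrated with a minimal sketch. The record layout, the `procedure_id` field name and the sample values below are hypothetical and are not taken from the repository's actual data model; the point is only that a shared identifier lets spend be summed per procedure and lets invoices without a known procedure be singled out.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Invoice:
    invoice_id: str
    amount_eur: float
    procedure_id: Optional[str]  # hypothetical field; None when no identifier is given


def summarise_spend(invoices, known_procedure_ids):
    """Return (spend per procedure, invoices not linked to a known procedure)."""
    spend_by_procedure = defaultdict(float)
    unlinked = []
    for invoice in invoices:
        if invoice.procedure_id in known_procedure_ids:
            spend_by_procedure[invoice.procedure_id] += invoice.amount_eur
        else:
            unlinked.append(invoice)
    return dict(spend_by_procedure), unlinked


# Made-up illustrative data, not real procurement figures.
known_procedures = {"PROC-001", "PROC-002"}
invoices = [
    Invoice("INV-1", 1000.0, "PROC-001"),
    Invoice("INV-2", 250.0, "PROC-001"),
    Invoice("INV-3", 400.0, "PROC-002"),
    Invoice("INV-4", 90.0, None),
]

spend, unlinked = summarise_spend(invoices, known_procedures)
print(spend)
print([inv.invoice_id for inv in unlinked])
```

Without the identifier, the same grouping would have to rely on supplier names and free-text fields, which is exactly the guesswork described earlier.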
Discussions held during the project have revealed that in many organisations, including large ones, the fulfilment of contracts is monitored through contract managers’ own spreadsheets, which is not necessarily administratively efficient and does not serve organisation-level management of procurement. The procurement data repository will not solve all the challenges of advanced contracting authorities, but it offers an entirely new starting point for those organisations where data-driven management is still fragmented.
The procurement data repository therefore offers benefits for the development of an organisation’s own procurement function, but the significance of the data space for strengthening procurement oversight cannot be passed over either. In future, the procurement procedure identifier will allow oversight to be targeted more efficiently and comprehensively, including at procurement procedures for which no contract notice or notice of direct award has been submitted in Hilma.
Strengthening oversight has a negative ring to it. I want to state in this context as well that the oversight of procurement is not, and should not become, the central element in the development of public procurement.
“The procurement data repository is not a quick fix. It is a foundation.”
When tender data, award criteria and invoice data are combined, the possibility also opens up of examining realised unit prices. We can see what was actually bought, at what price, and compare competitively tendered and non-tendered procurement against each other. This is a concrete step towards assessing the effectiveness of procurement.
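As a rough illustration of what comparing realised unit prices could look like, the sketch below computes a quantity-weighted average unit price for the same item in tendered and non-tendered purchases. It assumes that invoice lines are already comparable and carry a quantity, which will not hold everywhere; the `InvoiceLine` structure and all values are invented for the example and are not results.

```python
from dataclasses import dataclass


@dataclass
class InvoiceLine:
    item: str          # hypothetical harmonised item name
    quantity: float
    total_eur: float
    tendered: bool     # True if the purchase is linked to a tendered procedure


def mean_unit_price(lines, item, tendered):
    """Quantity-weighted average unit price, or None if there is no data."""
    selected = [ln for ln in lines if ln.item == item and ln.tendered == tendered]
    quantity = sum(ln.quantity for ln in selected)
    if quantity == 0:
        return None
    return sum(ln.total_eur for ln in selected) / quantity


# Made-up illustrative data, not real procurement figures.
lines = [
    InvoiceLine("copy paper", 100, 200.0, True),
    InvoiceLine("copy paper", 100, 220.0, True),
    InvoiceLine("copy paper", 50, 125.0, False),
]

tendered_price = mean_unit_price(lines, "copy paper", tendered=True)
non_tendered_price = mean_unit_price(lines, "copy paper", tendered=False)
print(tendered_price, non_tendered_price)
```

A real comparison would also need to control for factors such as order size and sector before any conclusion about the effect of tendering could be drawn.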
It is nonetheless important to be honest about the limitations. At this stage, for example, full standardisation of invoice line items is not being attempted, and the comparability of invoice data will also vary between sectors. At the same time, it is realistic to note that in many sectors comparable data can already be identified. This is a significant step forward compared with the current situation.
The procurement data repository is not a quick fix. It is a foundation. Without it, we will continue to speak of ambitious goals without the means to verify whether we are moving in the right direction. It is not a question of perfect information — it is a question of actively working to improve the data on which decisions are based.