The Daily Copenhagen

All of Copenhagen, every day

Copenhagen's Digital Archive Problem: The Hidden Scale of Duplicate Images Clogging City Systems

New internal data reveals hundreds of thousands of redundant image files are straining municipal databases and costing Copenhagen an estimated 1.2 million kroner a year in cloud storage fees.

By Copenhagen News Desk · Published 4 July 2026, 21.45

Updated 5 July 2026, 5.36

How we reported this

This article was generated by AI from the linked public sources. The Daily Copenhagen is independently owned and covers Copenhagen news free from advertiser or sponsor influence. Read our editorial standards →

Photo: Salt Lake Herald (Firm) (1881) bkp CU-BANC Meears, George A Hollister, Ovando James, 1834-1892 Kenner, S. A. (Scipio A.) Meears, George A / Public domain (Wikimedia Commons)

Copenhagen's municipal digital infrastructure is carrying a heavier load than most residents know. An internal review circulated within Teknik- og Miljøforvaltningen, the city's technical and environmental administration, found that duplicate image files account for an estimated 34 percent of storage consumption across shared civic databases. The figure has prompted an accelerated push to clean up records before a server migration planned for the fourth quarter of 2026.

The problem is not unique to Copenhagen, but the scale here has caught administrators off guard. As the city has digitised everything from planning permits in Nørrebro to heritage documentation for listed buildings along Bredgade, image assets have multiplied without a coherent deduplication protocol in place. A single building inspection, for example, can generate dozens of photographs, many of them near-identical shots uploaded by different field officers using different devices. All of them are then stored separately, and none is flagged as redundant.

What the Numbers Actually Show

The internal review estimated Copenhagen's shared civic image repositories held roughly 2.1 million image files as of April 2026. Of those, approximately 714,000, or about 34 percent of the total, were identified as duplicates or near-duplicates: files that are pixel-identical or differ only in compression artefacts, file naming, or metadata timestamps. Storing those redundant files costs the municipality an estimated 1.2 million kroner annually in cloud storage fees alone, according to figures referenced in the administration's planning documents.

The City Archives, Københavns Stadsarkiv, which maintains official photographic records at its facility near Kultorvet, has been operating its own parallel deduplication effort since early 2025. Archivists there have been working through a backlog of scanned historical images, where duplicates emerged from successive digitisation projects that lacked cross-referencing tools. By March 2026, the archive had processed roughly 180,000 files through a hash-matching algorithm, clearing out an estimated 22,000 duplicates from its active collection — a reduction of about 12 percent in that specific dataset.
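As a rough illustration of how hash matching can flag exact duplicates, the sketch below groups files by a digest of their contents. The article does not say which hash function the archive uses; SHA-256 is an assumption here, and the file names and byte strings are made up for the example.

```python
import hashlib
from collections import defaultdict


def find_exact_duplicates(files):
    """Group file names by the SHA-256 digest of their bytes.

    `files` maps a (hypothetical) file name to its raw bytes.
    Returns only the groups that contain more than one name.
    """
    groups = defaultdict(list)
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        groups[digest].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Made-up sample: two byte-identical scans stored under different names.
sample = {
    "scan_1998_001.tif": b"\x00\x01\x02\x03",
    "bredgade_facade.tif": b"\x00\x01\x02\x03",
    "kultorvet_1950.tif": b"\xff\xfe\xfd",
}

duplicate_groups = find_exact_duplicates(sample)
print(duplicate_groups)  # [['bredgade_facade.tif', 'scan_1998_001.tif']]
```

A content hash of this kind only catches files whose bytes match exactly. A scan that has been recompressed or resized produces a different digest, which is where perceptual hashing comes in.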

The broader municipal IT department, operating under the Center for Digitalisering og Innovation, has been piloting a replacement workflow since January 2026. The programme uses perceptual hashing, a technique that identifies visually similar images even when the underlying file data differs, for example after recompression or resizing, and has been tested on planning documents from the Amager Øst and Vesterbro districts. Early results from the pilot showed a 28 percent reduction in image storage volume for those two district archives within six weeks of deployment.
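To give a sense of how perceptual hashing works, here is a minimal sketch of one simple variant, the average hash. The article does not specify which algorithm the city's pilot uses, so this is an assumption for illustration only. Images are represented as small grids of grayscale values; a real pipeline would typically first shrink each photo to a small grayscale thumbnail, and the pixel values below are invented.

```python
def average_hash(pixels):
    """Return a bit string with '1' where a pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length bit strings differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Invented 4x4 grayscale thumbnails (0 = black, 255 = white).
original = [[10, 10, 200, 200]] * 4
recompressed = [  # same picture with small compression noise
    [12, 9, 198, 203],
    [11, 10, 201, 199],
    [9, 12, 202, 197],
    [10, 11, 199, 200],
]
different = [[200] * 4, [200] * 4, [10] * 4, [10] * 4]

near_dup_distance = hamming_distance(
    average_hash(original), average_hash(recompressed)
)
different_distance = hamming_distance(
    average_hash(original), average_hash(different)
)
print(near_dup_distance)   # 0
print(different_distance)  # 8
```

A small distance suggests two files show the same picture, while a large one suggests different content. Where to draw the line between the two is a tuning decision that the article does not describe.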

Why Cleanup Matters Before the Migration

The timing pressure is real. Copenhagen's planned server migration, which will consolidate several legacy systems onto a new cloud platform procured through a framework agreement with a Nordic IT consortium, is currently pencilled in for October or November 2026. Migrating bloated repositories would substantially increase both the cost and the duration of that transfer. The administration's own estimates suggest each additional terabyte of data migrated adds roughly 40,000 kroner to project costs — meaning the 34 percent duplicate burden, if left unaddressed, could push the migration bill up by several million kroner.
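The per-terabyte estimate makes the scale of the risk easy to work out. The short sketch below applies the 40,000 kroner figure quoted above. The article does not give the total volume of duplicate data, so the 50 terabyte input is purely hypothetical.

```python
COST_PER_TB_DKK = 40_000  # per-terabyte estimate quoted in the article


def extra_migration_cost(duplicate_tb):
    """Added migration cost in kroner for a given volume of duplicate data."""
    return duplicate_tb * COST_PER_TB_DKK


def duplicate_tb_for_cost(cost_dkk):
    """Volume of duplicate data (in TB) that would produce a given added cost."""
    return cost_dkk / COST_PER_TB_DKK


hypothetical_tb = 50  # assumption for illustration, not a figure from the review
print(extra_migration_cost(hypothetical_tb))  # 2000000
print(duplicate_tb_for_cost(3_000_000))       # 75.0
```

On these assumptions, an added bill of several million kroner would correspond to duplicate data in the range of tens of terabytes.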

For residents, the most visible downstream effect is in the city's online planning portal, Byg og Miljø, where duplicate images of the same properties have occasionally caused confusion during permit applications — the same exterior photograph appearing under different file references, sometimes showing slightly different metadata that implied different inspection dates.

The Center for Digitalisering og Innovation has indicated the deduplication rollout will extend to all remaining district archives by September 2026, ahead of the migration window. Residents and organisations submitting image documentation through municipal portals are being advised to check file naming conventions before upload, and to avoid submitting multiple compressed versions of the same photograph. Guidance documents are expected on the city's official borger.dk service pages before the end of July 2026. The administration has not yet confirmed whether the October or November migration window will hold if the cleanup encounters delays in the larger central planning database, which contains records stretching back to the mid-1990s.

About this article

Published by The Daily Copenhagen

Covering news in Copenhagen. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
