Copenhagen's public digital archives contain an estimated 40,000 duplicate image files spread across municipal databases, a figure that city data managers have been quietly working to reduce since a cross-departmental audit began in January 2026. The problem is not unique to the Danish capital, but the scale here — and the administrative cost attached to it — has pushed the issue onto the agenda at Rådhuset, the city's central town hall on Rådhuspladsen.
The timing matters. Copenhagen City Council approved a broader digital infrastructure review in late 2025, allocating 12 million kroner toward modernising records management across all borough offices by the end of 2027. Duplicate imagery — spanning planning documents, heritage photography, public event records and urban development surveys — has emerged as one of the most concrete, measurable inefficiencies that administrators can actually fix without large-scale procurement.
Where the Redundancy Lives
The problem clusters in a handful of institutions. The Copenhagen City Archives, Stadsarkivet, which operates out of its Ørestads Boulevard facility in Ørestad, holds digitised collections stretching back to the late 19th century. Scanning campaigns conducted between 2018 and 2023 created multiple versions of the same physical document, often at different resolutions, without a systematic deduplication step. Internal estimates suggest that roughly 18 percent of the archive's digitised photograph holdings — numbering over 200,000 images in total — are near-identical or exact duplicates.
The Copenhagen Museum, Københavns Museum, on Vesterbrogade in Vesterbro, faces a parallel issue. Collections management staff there have identified duplicate entries created when image batches were migrated between two different collections management systems over a four-year transition period ending in 2024. Storage costs for unneeded files may seem trivial per image, but at institutional scale — with server contracts running at roughly 3,500 kroner per terabyte annually for managed municipal cloud storage — redundancy adds up fast.
Beyond heritage institutions, the problem bites hardest in the planning and urban development departments. Teknik- og Miljøforvaltningen, the city's technical and environmental administration, processes thousands of site survey photographs each year for projects across districts from Nørrebro to Amager. Without automated deduplication workflows, the same drone survey images from a single site often appear under multiple project reference numbers, complicating searches and occasionally causing junior staff to work from outdated versions of the same file.
The Numbers Behind the Clean-Up
Deduplication software trials conducted at Stadsarkivet during the first quarter of 2026 returned measurable results. A pilot run on a subset of 25,000 images identified 4,300 exact or near-exact duplicates — a hit rate of just over 17 percent. Extrapolating that figure across the full municipal holdings suggests the city could free up meaningful storage capacity and reduce metadata maintenance work by a comparable margin.
The labour cost is the more significant figure. Archive professionals estimate that manual review of a duplicate image pair (verifying which version to retain, updating metadata, and logging the deletion) takes between four and eight minutes per file. At 40,000 estimated duplicates across the city system, that represents somewhere between roughly 2,700 and 5,300 person-hours of work if done without automated support. At average municipal technical staff hourly rates, the manual route carries a realistic price tag exceeding 2 million kroner.
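The person-hour range follows directly from the two figures quoted above, and the arithmetic can be checked with a minimal sketch. The function and constant names are illustrative only, and the 2 million kroner figure is taken from the estimate above rather than derived from any stated hourly rate.

```python
# Back-of-the-envelope check of the manual review estimate.
# All names here are illustrative; the inputs are the figures quoted in the article.

ESTIMATED_DUPLICATES = 40_000       # city-wide duplicate estimate
MINUTES_PER_FILE = (4, 8)           # low and high manual review time per file
QUOTED_COST_KR = 2_000_000          # "exceeding 2 million kroner"


def review_hours(duplicates: int, minutes_per_file: float) -> float:
    """Total person-hours needed to review every duplicate by hand."""
    return duplicates * minutes_per_file / 60


def implied_hourly_rate(total_cost_kr: float, hours: float) -> float:
    """Hourly rate at which the given number of hours reaches the given cost."""
    return total_cost_kr / hours


low_hours = review_hours(ESTIMATED_DUPLICATES, MINUTES_PER_FILE[0])
high_hours = review_hours(ESTIMATED_DUPLICATES, MINUTES_PER_FILE[1])

print(round(low_hours), round(high_hours))  # 2667 5333
```

The result is consistent with the rounded range quoted above. The article does not state the hourly rate used; working backwards, a 2 million kroner total would correspond to about 375 kroner per hour at the high end of the range and about 750 kroner per hour at the low end.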
Automated deduplication tools tested in the Stadsarkivet pilot completed the same 25,000-image scan in under three hours and flagged duplicates for human sign-off rather than replacing judgment entirely. The hybrid approach — automated detection, human confirmation for heritage materials — is now being written into a proposed city-wide image governance protocol expected to go before the relevant committee in September 2026.
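The software used in the pilot is not named, so the following is only a sketch of how the hybrid approach could work in principle, assuming exact duplicates are found by comparing SHA-256 content hashes. The function names and the review-queue record layout are hypothetical. Near-duplicates, such as the same document scanned at different resolutions, would not be caught this way and would need a perceptual comparison that is not shown here.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root that have byte-identical content."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    # Only groups with more than one file are duplicate candidates.
    return [paths for paths in groups.values() if len(paths) > 1]


def build_review_queue(groups: list[list[Path]]) -> list[dict]:
    """Turn duplicate groups into records awaiting human sign-off.

    Nothing is deleted here: each record starts unapproved, so a person
    decides which version to retain.
    """
    return [
        {"candidates": [str(p) for p in paths], "approved": False}
        for paths in groups
    ]
```

Keeping detection and deletion separate mirrors the protocol described above: the automated step only flags candidates, and a person confirms each decision for heritage material.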
For residents and researchers who use Copenhagen's public digital collections, accessible through the Copenhagen City Archives' online portal, the practical benefit will eventually be faster search results and fewer cases of encountering the same image under several different catalogue numbers. Institutions involved have been advised to freeze new large-scale digitisation intake until the deduplication framework is formally adopted. The September committee date is firm, administrators say, though the full rollout across all municipal departments is not expected to conclude before spring 2027.