Chicago's public record system is sitting on a quiet mess. Across multiple city departments and cultural institutions, including the Department of Planning and Development and the Chicago History Museum's digital collections on North Clark Street, archivists and records managers have identified thousands of duplicate image files that have accumulated over years of departmental mergers, digitization drives, and inconsistent data standards. The problem is not new. The urgency is.
A combination of factors has pushed the issue to the front of the queue in mid-2026. The city's five-year digital infrastructure plan, launched in 2023, set a July 2026 checkpoint for departments to certify their digital asset inventories. Several departments have missed that milestone, in part because duplicate images create false counts in inventory systems, making it nearly impossible to establish a clean baseline. Without that baseline, grant applications for preservation funding — including requests to the National Endowment for the Humanities, which offers competitive grants in the $75,000 to $350,000 range for archival work — cannot be properly documented.
Where the Problem Is Most Acute
The Chicago Public Library's Vivian G. Harsh Research Collection in Bronzeville holds one of the most significant concentrations of Black Chicago history in the country. Staff there report that a 2021 batch digitization project, intended to process roughly 14,000 photographic prints, inadvertently created multiple file versions of approximately 2,800 images because of a scanner software misconfiguration. Those duplicates now sit alongside originals on a shared network drive, and distinguishing archival masters from low-resolution working copies requires manual review. The Harsh Collection does not have the staffing to handle that volume alone.
Over on the North Side, the Newberry Library on West Walton Street faces a different variant of the same issue. Catalog records for its map and photograph collections were partially merged with deliverables from a third-party metadata contractor in 2022. Duplicate entries in the library's discovery catalog mean researchers sometimes retrieve the same image under two different accession numbers, with conflicting date attributions. That kind of error compounds over time: every downstream researcher who cites the wrong date embeds the mistake further into the record.
City Hall's own Bureau of Asset Information Management, which sits inside the Department of Assets, Information and Services, has been tasked since February 2026 with developing a deduplication protocol for municipal photography. The bureau is reportedly evaluating at least three software platforms capable of perceptual hashing, a technique that identifies visually identical or near-identical images even when their file names, formats, or resolutions differ. Licensing costs for enterprise-level tools of this kind typically run between $18,000 and $60,000 annually for a mid-sized government deployment, according to published vendor pricing sheets from companies including Cognex and ImageMatcher Pro.
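For readers curious how perceptual hashing works, the Python sketch below shows one simple variant, an average hash. It is an illustration only and does not reflect any of the platforms the bureau is evaluating. The function names are hypothetical, and images are represented as plain grids of grayscale values so the example stays self-contained. A real pipeline would first decode image files with an imaging library.

```python
from typing import List

Grid = List[List[int]]  # grayscale pixels (0-255), one inner list per row


def shrink(pixels: Grid, size: int = 8) -> List[List[float]]:
    """Downsample to size x size by averaging each block of pixels.

    Assumes the image is at least `size` pixels wide and tall.
    """
    h, w = len(pixels), len(pixels[0])
    out = []
    for by in range(size):
        y0, y1 = by * h // size, (by + 1) * h // size
        row = []
        for bx in range(size):
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: one bit per cell, set if above the mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


# Synthetic stand-ins for scans: a 64x64 "master" with a left-to-right gradient,
# a half-size, slightly brighter "working copy", and an unrelated image.
master = [[x * 4 for x in range(64)] for _ in range(64)]
working_copy = [
    [(master[2 * y][2 * x] + master[2 * y][2 * x + 1]) // 2 + 3 for x in range(32)]
    for y in range(32)
]
unrelated = [[y * 4 for _ in range(64)] for y in range(64)]

print(hamming(average_hash(master), average_hash(working_copy)))  # 0
print(hamming(average_hash(master), average_hash(unrelated)))     # 32
```

A small distance suggests two files show the same picture. The cutoff for "close enough" is a tuning decision, and a match says nothing about which file is the better copy.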
The Decisions That Will Define the Outcome
The most consequential choice ahead is not technical. It is about governance. Someone has to decide which version of a duplicate image becomes the canonical record, and that decision has real stakes. Designating the wrong file (a compressed copy over an archival TIFF, for example) and then purging the rest permanently degrades the quality of Chicago's official visual archive. The city needs a named, empowered authority to make those calls, and right now that authority is spread across at least four departments with overlapping jurisdictions.
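Once an authority settles on a rule, it can be written down and applied consistently. The Python sketch below shows what one such rule might look like. The policy is entirely hypothetical and is not a city standard: it prefers lossless formats, then higher pixel counts. The format ranking, class, and file names are invented for illustration.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical preference order (lower rank wins). Not an adopted policy.
FORMAT_RANK = {"tiff": 0, "png": 1, "jpeg": 2}


@dataclass(frozen=True)
class ImageFile:
    path: str
    fmt: str     # lowercase format name, e.g. "tiff"
    width: int   # pixels
    height: int  # pixels


def pick_canonical(group: List[ImageFile]) -> ImageFile:
    """Return the preferred file from one group of duplicates."""
    def sort_key(f: ImageFile):
        rank = FORMAT_RANK.get(f.fmt, len(FORMAT_RANK))  # unknown formats rank last
        # Best format first, then most pixels; path breaks any remaining tie.
        return (rank, -(f.width * f.height), f.path)
    return min(group, key=sort_key)


# Invented example group: one master scan and two derivative copies.
group = [
    ImageFile("print_0001_web.jpg", "jpeg", 1200, 900),
    ImageFile("print_0001_master.tif", "tiff", 6000, 4500),
    ImageFile("print_0001_proof.tif", "tiff", 3000, 2250),
]
print(pick_canonical(group).path)  # print_0001_master.tif
```

Encoding the rule does not settle who owns it. It only ensures that whatever the designated authority decides is applied the same way to every duplicate group, with borderline cases still routed to human review.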
Community organizations are watching closely. The Pilsen-based Resurrection Project, which maintains neighborhood documentation archives stretching back to the 1980s, has already begun its own internal audit in anticipation of a potential city-wide data-sharing agreement. Meanwhile, the Chicago Metropolitan Agency for Planning has quietly indicated it may condition future open-data partnerships on archive quality standards — a lever that could accelerate action at the departmental level.
The July 2026 checkpoint has now passed. Departments that missed their inventory certifications face a September 30 secondary deadline, according to the city's published digital infrastructure timeline. If that deadline slips as well, the bureau's ability to apply for federal matching funds in the next fiscal cycle — which opens in October — becomes severely constrained. The window for getting this right, without losing money and without losing history, is measured in weeks, not years.