
Storage Limitation Is the 2026 Enforcement Frontier for Analytics

Regulators stopped asking whether you collected data lawfully and started asking when you deleted it. After the CNIL's €42M Free ruling and the EDPB's erasure sweep, analytics retention is the audit target.

The expensive GDPR question used to be "do you have a lawful basis to collect this?" In 2026 it became "why is this still in your database?" Two enforcement events moved retention from a documentation footnote to the front line, and analytics systems are squarely in the blast radius.

Two rulings that changed the target

On 13 January 2026 the CNIL fined Free Mobile €27 million and Free €15 million. Most coverage focused on the 2024 breach that exposed 24 million subscriber contracts, but the decision also cited Article 5(1)(e), storage limitation, directly. Free Mobile had kept data on 2.8 million contracts that had been cancelled for more than ten years, with no justification. The data was not unlawfully collected. It was unlawfully kept.

A month later the EDPB published the results of its 2025 coordinated enforcement action. Thirty-two data protection authorities audited 764 controllers on erasure practices and found two systemic failures: no internal data classification, and no automated deletion in the IT systems themselves. A documented retention policy that nothing technically enforces was treated as an aggravating factor, not a mitigating one.

The signal is unambiguous. Regulators now assume collection happened; they audit deletion.

Why analytics is the obvious place to look

Storage limitation under Article 5(1)(e) says personal data must not be kept in identifiable form longer than the purpose requires. Most analytics stacks fail this quietly.

A typical event table stores a row per interaction tied to a stable identifier — a cookie ID, a device ID, a hashed email — with a timestamp and no expiry. The schema is designed to keep everything forever so that "we might want to query it later" stays possible. That is precisely the just-in-case retention the CNIL penalized.

The problem compounds because the identifier is what makes a row personal data. As long as user_id = a91f... can be linked back to a person across months of events, every one of those rows is in scope, every one is subject to access and erasure requests, and every one is a liability in a breach. Retention is not a storage cost. It is exposure measured in time.

Minimization at write time beats deletion at read time

The durable fix is not a better cron job that prunes old rows. It is a data model where the identifiable form never persists in the first place.

Monoid's identity model is built around this. A visitor is represented only by a daily hash:

SHA-256(IP | UA | SALT_SECRET | YYYY-MM-DD)

The raw IP and User-Agent exist only in Worker memory long enough to compute that hash; D1 stores the hash, never the raw values. Because the date is an input, the hash for the same visitor changes every midnight. There is no stable cross-day identifier to retain, so the cross-session profile that storage limitation targets cannot accumulate — not by policy, by construction.
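As a rough illustration of the daily hash above, here is a minimal Python sketch. It is not Monoid's code: the function name, the literal "|" separator, the UTF-8 encoding, and the choice to pass the calendar date in explicitly are all assumptions, and a real Worker would run JavaScript rather than Python. It shows only that changing the date input changes the stored value.

```python
import hashlib
from datetime import date


def daily_visitor_hash(ip: str, user_agent: str, salt_secret: str, day: date) -> str:
    """Hex SHA-256 over IP | UA | SALT_SECRET | YYYY-MM-DD (hypothetical helper)."""
    # Assumption: fields are joined with a literal "|" and encoded as UTF-8.
    material = "|".join([ip, user_agent, salt_secret, day.isoformat()])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Documentation-range IP and a made-up salt, for illustration only.
    args = ("203.0.113.7", "ExampleBrowser/1.0", "example-salt")
    day_one = daily_visitor_hash(*args, date(2026, 1, 12))
    day_two = daily_visitor_hash(*args, date(2026, 1, 13))
    print(day_one == daily_visitor_hash(*args, date(2026, 1, 12)))  # same day: stable
    print(day_one == day_two)  # next day: no longer linkable
```

Only the returned digest would be written to storage. The raw ip and user_agent arguments go out of scope as soon as the function returns.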

This is data minimization at write time. You do not delete the linkage later because you never wrote it. An erasure request becomes trivial when there is no durable key to erase against.

Retention you can actually prove

Storage limitation still needs an outer bound, and a defensible one is short and enforced, not long and aspirational. Monoid keeps pageview data for at most 730 days. Two years is a deliberate ceiling for trend analysis, and the purge is a real operation against the table, not a line in a policy PDF. The EDPB's finding was that policies without enforcement are the gap investigations exploit; retention that actually executes closes it.

The combination matters more than either half. Daily-rotating hashes mean the data is already aggregate-shaped before retention even applies. The 730-day bound means even that aggregate signal has a hard, demonstrable end.
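A purge that actually executes can be very small. The sketch below uses Python's built-in sqlite3 module as a stand-in for D1. The pageviews table, its day column (an ISO YYYY-MM-DD string), and the purge_expired function are hypothetical names chosen for illustration, not Monoid's schema. Only the 730-day bound comes from the text above.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 730  # the ceiling stated in the article


def purge_expired(conn: sqlite3.Connection, now: datetime) -> int:
    """Delete pageviews older than the retention bound; return rows removed."""
    cutoff = (now - timedelta(days=RETENTION_DAYS)).strftime("%Y-%m-%d")
    # ISO date strings sort chronologically, so a plain string comparison works.
    cur = conn.execute("DELETE FROM pageviews WHERE day < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per pageview, keyed by the daily hash.
    conn.execute("CREATE TABLE pageviews (visitor_hash TEXT, path TEXT, day TEXT)")
    conn.executemany(
        "INSERT INTO pageviews VALUES (?, ?, ?)",
        [("aaa", "/", "2023-01-01"), ("bbb", "/pricing", "2025-12-31")],
    )
    removed = purge_expired(conn, datetime(2026, 2, 1, tzinfo=timezone.utc))
    print("rows removed:", removed)
```

Logging the returned count on every scheduled run is one way to make the deletion demonstrable: the log is a record that the control ran, rather than a statement that it should.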

How to judge any analytics you rely on

Whatever analytics sits behind your site, three checks map directly to what regulators audited:

  • Find every field that links a record to a person across time. A stable user ID, a persistent cookie value, a raw IP, a fingerprint. Each one turns an event log into a retained profile — so ask whether the analytics you use stores any of them at all.
  • Ask when retention is applied — at ingestion or in a quarterly review. Minimization that happens as data is written beats a deletion job someone has to remember to run. The strongest answer is a model where the identifiable form was never stored in the first place.
  • Ask whether deletion is demonstrable. If a vendor cannot point to a hard retention bound that actually executes, it has a policy, not a control — and the EDPB sweep just told you which one gets fined.
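The first and third checks can be approximated against an export of your own event data. This is a sketch under assumptions: rows are dictionaries with a day field holding a date, and the field names and both helper functions are invented for the example.

```python
from collections import defaultdict
from datetime import date
from typing import Any, Dict, List, Set


def cross_day_values(rows: List[Dict[str, Any]], field: str) -> Set[Any]:
    """Values of `field` seen on more than one calendar day, i.e. stable identifiers."""
    days_seen = defaultdict(set)
    for row in rows:
        days_seen[row[field]].add(row["day"])
    return {value for value, days in days_seen.items() if len(days) > 1}


def rows_past_bound(rows: List[Dict[str, Any]], today: date, max_days: int) -> List[Dict[str, Any]]:
    """Rows older than the claimed retention bound; a working purge leaves none."""
    return [row for row in rows if (today - row["day"]).days > max_days]


if __name__ == "__main__":
    # Made-up events: u1 recurs across days, u2 is older than two years.
    events = [
        {"user_id": "u1", "day": date(2025, 11, 3)},
        {"user_id": "u1", "day": date(2026, 1, 20)},
        {"user_id": "u2", "day": date(2023, 5, 1)},
    ]
    print(cross_day_values(events, "user_id"))
    print(len(rows_past_bound(events, date(2026, 2, 1), 730)))
```

A non-empty result from the first function means the field links records across time. A non-empty result from the second means the stated bound is not being enforced.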

The cheapest data to defend in an audit is the data you never kept in identifiable form. Storage limitation is no longer a clause to paraphrase in a privacy policy; it is the thing the next investigation will measure, and the architecture that answers it is the one that never had a profile to delete.
