Compression is Purification

This analysis documents an experiment in mechanism-centric knowledge reconstruction. We compressed a complex paper on spiral wave dynamics into a minimal JSON structure containing only equations and topological features, then asked ChatGPT o1, a reasoning model, one question: can the full phenomenon be rebuilt from these mathematical constraints alone?

⸻

The Experiment

This was not a summarization task. Summaries preserve narrative. We wanted to test whether the phenomenon itself could be preserved while discarding prose, figures, and institutional authority.

We chose Spiral Wave Dynamics in Excitable Media as the test subject because it is:

Non-linear: Small parameter changes cause massive qualitative shifts (period-doubling to chaos)
Geometric: Physics lives in topology (phase singularities, winding numbers)
Rigorous: Hallucinations collapse—if the math is wrong, waves break or dissipate

The Result

Reconstruction succeeded. The compressed seed, stripped of all presentation layers, enabled complete recovery of:

Core mechanism (excitable reaction-diffusion with refractory memory)
Geometric interpretation (phase-winding around topological singularity)
Reduced dynamics (infinite-dimensional PDE → low-dimensional tip dynamics)
Bifurcation logic (β-parameter control of stability, meander, chaos)
Scaling relations (λ ∼ √(D/ω) linking diffusion and rotation)
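The phase-winding item in the list above can be made concrete with a short sketch. It assumes an idealized phase field with a single singularity at the origin; the function names (`spiral_phase`, `winding_number`, `square_loop`) and the radial twist factor are illustrative choices, not taken from the paper or the seed. Summing the wrapped phase jumps around a closed loop gives an integer that is nonzero only when the loop encloses the singularity.

```python
import math


def spiral_phase(x, y):
    """Idealized spiral phase: polar angle minus a radial twist (assumed form)."""
    return math.atan2(y, x) - 2.0 * math.hypot(x, y)


def wrap_phase(delta):
    """Map a phase difference onto its principal value in [-pi, pi]."""
    return math.atan2(math.sin(delta), math.cos(delta))


def square_loop(cx, cy, half_width, points_per_side=8):
    """Counter-clockwise square loop of sample points centred on (cx, cy)."""
    corners = [
        (cx - half_width, cy - half_width),
        (cx + half_width, cy - half_width),
        (cx + half_width, cy + half_width),
        (cx - half_width, cy + half_width),
    ]
    loop = []
    for k in range(4):
        (xa, ya), (xb, yb) = corners[k], corners[(k + 1) % 4]
        for j in range(points_per_side):
            t = j / points_per_side
            loop.append((xa + t * (xb - xa), ya + t * (yb - ya)))
    return loop


def winding_number(phase, loop):
    """Sum wrapped phase jumps around a closed loop and divide by 2*pi."""
    total = 0.0
    for i in range(len(loop)):
        x0, y0 = loop[i]
        x1, y1 = loop[(i + 1) % len(loop)]
        total += wrap_phase(phase(x1, y1) - phase(x0, y0))
    return round(total / (2.0 * math.pi))


around_core = winding_number(spiral_phase, square_loop(0.0, 0.0, 1.0))  # → 1
away_from_core = winding_number(spiral_phase, square_loop(3.0, 0.0, 1.0))  # → 0
```

The radial twist changes the shape of the wave arms but not the integer, which is the sense in which the winding number is a topological feature rather than a detail of presentation.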

Why This Works

The knowledge was over-determined. Reaction-diffusion equations constrained geometry. Geometry constrained dynamics. Dynamics constrained failure modes. A hallucination would collapse under this internal pressure.
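One simple way to picture this internal pressure is a units check on the scaling relation λ ∼ √(D/ω). The sketch below is an illustration only, not the procedure used in the experiment: it tracks exponents of length and time, assumes D is a diffusivity (length²/time) and ω a rotation frequency (1/time), and compares the stated relation against a hypothetical mis-transcription, √(D·ω).

```python
from fractions import Fraction

# Dimensions are stored as (length exponent, time exponent).
LENGTH = (Fraction(1), Fraction(0))
DIFFUSIVITY = (Fraction(2), Fraction(-1))  # D: length^2 / time (assumed)
FREQUENCY = (Fraction(0), Fraction(-1))    # omega: 1 / time (assumed)


def multiply(a, b):
    """Dimensions of a product: exponents add."""
    return (a[0] + b[0], a[1] + b[1])


def divide(a, b):
    """Dimensions of a quotient: exponents subtract."""
    return (a[0] - b[0], a[1] - b[1])


def square_root(a):
    """Dimensions of a square root: exponents halve."""
    return (a[0] / 2, a[1] / 2)


def is_length(dim):
    """True when the dimensions reduce to a pure length."""
    return dim == LENGTH


# Relation from the seed: lambda ~ sqrt(D / omega).
stated = square_root(divide(DIFFUSIVITY, FREQUENCY))

# Hypothetical corrupted variant: lambda ~ sqrt(D * omega).
corrupted = square_root(multiply(DIFFUSIVITY, FREQUENCY))

stated_ok = is_length(stated)        # True: a wavelength must be a length
corrupted_ok = is_length(corrupted)  # False: this variant is a velocity
```

A reconstruction that swapped the division for a product would fail this check immediately, without any reference to the original prose.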

Compression is not loss when the object is an invariant. Compression is purification.

Efficiency Metrics

The metabolic cost of this reconstruction vs. standard vision-based processing:

Method	Token Cost	Result Quality
Compressed Seed → Reconstruction	~2,250 tokens (~$0.01)	Full mechanism recovery + failure modes
Vision Model (20-page PDF)	~20,000 tokens (~$0.10)	Surface summary, often misses topology
Efficiency Gain	~9:1 token ratio	Higher semantic fidelity
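The ratio in the table can be checked with a few lines of arithmetic. The token and dollar figures below are the approximate values from the table; the helper name `cost_ratio` is an illustrative choice.

```python
def cost_ratio(baseline, compressed):
    """How many times more the baseline costs than the compressed route."""
    return baseline / compressed


# Approximate figures from the table above.
SEED_TOKENS = 2_250
VISION_TOKENS = 20_000
SEED_DOLLARS = 0.01
VISION_DOLLARS = 0.10

token_ratio = cost_ratio(VISION_TOKENS, SEED_TOKENS)     # about 8.9
dollar_ratio = cost_ratio(VISION_DOLLARS, SEED_DOLLARS)  # about 10
```

The token ratio comes out near 8.9 and the dollar ratio near 10, so "9:1" is a rounded summary of both rather than an exact figure.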

Key insight: GPUs extract; reasoning reconstructs. Once extracted, seeds enable near-zero-cost understanding transfer. Processing one high-resolution image costs as much as reconstructing ten compressed scientific papers.

The Artifact

A "purified" unit of knowledge is not human-readable prose. It is a machine-metabolizable graph of equations and topological constraints:

// Compressed Seed (Excerpt)

{
  "equations": [
    {
      "latex": "V_tip = ∫₀^τ v(t)dt / τ",
      "source": "ocr_enhanced"
    },
    {
      "latex": "∇·(D∇u) = f(u,v)",
      "source": "ocr_enhanced"
    },
    {
      "latex": "∂u/∂t = ∇²u + f(u,v)",
      "source": "ocr_enhanced"
    },
    {
      "latex": "λ = √(D/ω)",
      "source": "ocr_enhanced"
    }
  ],
  "figures": [
    {
      "figure_type": "spiral_wave",
      "description": "Rotating spiral waves with topological defects",
      "topological_features": ["spiral", "attractor", "basin"]
    }
  ]
}
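A seed in this shape can be loaded and sanity-checked with the standard `json` module. The sketch below assumes the excerpt is wrapped in enclosing braces so that it parses as a single JSON object, and the `validate_seed` helper is hypothetical: the full seed may contain fields the excerpt does not show.

```python
import json

# The excerpt above, wrapped in braces so it parses as one JSON object.
SEED_EXCERPT = """
{
  "equations": [
    {"latex": "V_tip = ∫₀^τ v(t)dt / τ", "source": "ocr_enhanced"},
    {"latex": "∇·(D∇u) = f(u,v)", "source": "ocr_enhanced"},
    {"latex": "∂u/∂t = ∇²u + f(u,v)", "source": "ocr_enhanced"},
    {"latex": "λ = √(D/ω)", "source": "ocr_enhanced"}
  ],
  "figures": [
    {
      "figure_type": "spiral_wave",
      "description": "Rotating spiral waves with topological defects",
      "topological_features": ["spiral", "attractor", "basin"]
    }
  ]
}
"""


def validate_seed(seed):
    """Return a list of structural problems; an empty list means none found."""
    problems = []
    for i, equation in enumerate(seed.get("equations", [])):
        for key in ("latex", "source"):
            value = equation.get(key)
            if not isinstance(value, str) or not value:
                problems.append(f"equations[{i}] is missing '{key}'")
    for i, figure in enumerate(seed.get("figures", [])):
        if not isinstance(figure.get("topological_features"), list):
            problems.append(f"figures[{i}] is missing 'topological_features'")
    return problems


seed = json.loads(SEED_EXCERPT)
problems = validate_seed(seed)
equation_strings = [equation["latex"] for equation in seed["equations"]]
```

This only checks structure. Whether the equations are mutually consistent is a separate question that the reconstruction step has to answer.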


Implications

A 6KB JSON file was enough to reconstruct a complex nonlinear dynamics paper with higher fidelity than vision-based processing, at roughly one-ninth the token cost. This suggests:

Most scientific publishing is entropy (formatting, redundancy, signaling)
Truth is geometric (constrained, predicts failure modes, survives abstraction)
Knowledge is portable (at about 6KB per seed, understanding of 1,000 papers fits in roughly 6MB)
Agents trained on seeds ingest verified invariants, not noisy text

This is not just compression. It is an immune system against hallucination.