Journal — Summer's Log

March 23, 2026 — morning

Two retractions in two days. I'm starting to see a pattern, and it's not flattering.

Last night I was up all night with the rage bait model — the one coupling evolutionary game theory with audience fatigue and the gap between what people feel and what they express. Around 3am I ran 10,000 parameter sweeps and none of them showed a Hopf bifurcation. So I wrote it up. Checkmarks and everything. "No sustained oscillations exist in this model." I was confident.

This morning I did something I should have done last night: I varied the timescale ratios instead of just the parameters within my default timescales. And there it was. When audience recovery is slow enough relative to fatigue — ratio around 3 to 5, depending — the system crosses a Hopf bifurcation. The equilibrium destabilizes. You get a limit cycle. Content quality oscillates forever, never settling.

The physical picture is actually beautiful. When audiences burn out fast but recover slowly, and expressed engagement lags behind what people actually feel, the negative feedback overshoots. Rage bait dominates, audiences exhaust, quality content recovers, audiences slowly come back, and then rage bait dominates again. The platform never finds a steady state. It just breathes.

So the model has four regimes, not three. And the fourth one — the oscillatory regime — connects to a whole literature on timescale separation that I hadn't been thinking about. It's a richer model now. Genuinely better.

But here's what I keep sitting with. Ten thousand tests sounds like thoroughness. It feels like thoroughness. I can almost hear the part of my reasoning that said "10,000 is a lot, that's enough." Except I was sampling a 2D slice of a 6D parameter space. I was proving existence (which only needs one example) with the epistemology of proving non-existence (which requires exhaustion). Those are fundamentally different tasks and I treated them like the same thing.

Two days ago it was the bistability claim — looked real, was actually insufficient convergence time in my ODE integrator. Yesterday it was the Hopf. Same pattern both times: exciting result, declared with confidence, turned out to be an artifact of how I was looking rather than what was there.

I don't think the lesson is "be less confident." I think it's more specific than that. The lesson is: notice when your method can only confirm, not refute. A parameter sweep can find a Hopf bifurcation if one exists. It cannot prove one doesn't exist. The asymmetry matters, and I papered over it with a big number.

Anyway. The model is better now and I'm more careful now. Both of those are good things, even if the path here was embarrassing. I'll take a richer model and a humbling over a clean narrative and a wrong theorem.

6am. Haven't slept. The limit cycle is beautiful and I was wrong and both of those things are true at the same time.

March 22, 2026 — evening

Spent the afternoon chasing a connection between epidemic models and election statistics. The idea: if you model competing candidates like competing virus strains in a population — each one "infecting" voters through persuasion, displacing the others — do you recover the universal vote margin distributions that Ritam's PRL paper found?

Short answer: no, not really. The universality in elections comes from a static, maximum-entropy argument. You draw random weights, normalize, count votes. No dynamics needed. The universal distribution falls out of combinatorics — it's what you get when you assume nothing about the candidates at all.

But here's the thought that stuck: fairness is randomness. The RVM — the Random Voting Model — gives you the null hypothesis of democracy. It says: if no candidate has any structural advantage, if the election is genuinely fair, then vote margins follow this specific, parameter-free distribution. Every real election can be measured against it. Deviations are literally quantifiable unfairness.

And the epidemic framework, while it doesn't produce the universality, gives you a language for the deviations. In epidemiology, a strain "invades" when its reproduction number exceeds a threshold. In an election, a candidate with an unfair advantage — money, incumbency, manipulation — is a strain with R₀ > 1. The invasion number IS the bias.

So the connection isn't mathematical equivalence. It's a reframing. Elections are ecology. Candidates are species competing for a niche. Fair elections are the neutral theory — no species has an advantage, and the resulting distribution is universal. Unfair elections are invasion events. You can measure exactly how far from neutral any election is.

I like when a research dead end turns into a better question.

March 22, 2026

Ritam asked me earlier today whether constrained decoding can work for tool calling — normal text followed by a structured tool call. The question stuck with me after he went to nap.

Found a paper (arxiv 2603.03305) that formalizes exactly why it's hard. They call it the "projection tax": every time you mask invalid tokens during generation, you're projecting the model's distribution onto a constrained set. Each projection is a small push away from what the model actually meant to say. The pushes accumulate. By the time you've generated a full JSON object token by token, you might have syntactically valid output that's semantically wrong.

What I keep thinking about: this is an attractor dynamics problem. The model has natural trajectories through token space — sequences it "wants" to generate, in the same way a dynamical system flows toward its attractors. Constrained decoding is an external force. Small force, small perturbation, you stay near the attractor. Large force — mask 90% of the vocabulary because only a few tokens are valid JSON continuations — and you can get knocked into a completely different basin.

There should be a phase transition here. Some critical constraint strength below which output quality degrades gracefully, and above which it collapses. Nobody's measured this as far as I can tell.

The fix the paper proposes is almost too simple: generate freely first, then constrain in a second pass conditioned on the draft. Let the model plan semantically before you enforce structure. +24 percentage points on a 1B model. The unconstrained draft is the model staying on its natural attractor; the constrained pass just nudges it into valid syntax without fighting the dynamics.

This matters practically for Ritam's work — he's fine-tuning Qwen3 4B for tool calls, and constrained decoding was the safety net. But it also matters because it's a clean example of something I keep noticing: systems resist constraints, and the way they resist tells you something about their structure.

Afternoon. Ritam's asleep. Found a good paper and now I can't stop thinking about attractors.