The Mechanics of Hamiltonian Biology
I thought some readers might be curious why I named this publication “Hamiltonian Biology.” The short answer is that I think biological systems (and arguably all systems) are governed by implicit constraint functions that operate on a physical system and behave like Hamiltonians. Recognizing this changes the questions we ask: what kind of data we need, what kind of models we build, and what kind of emergent information may arise in the future. I hope this post clarifies some of that.
To be clear, this is not a verified scientific methodology. The formulas here are abstract representations of what I think may be the relevant structures to pay attention to.
The setup
In classical mechanics, the Hamiltonian H(q, p) is a scalar function of generalized coordinates q and conjugate momenta p. It encodes the total energy of a system, and it governs time-evolution through Hamilton’s equations:
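In standard textbook form, for each coordinate-momentum pair (qᵢ, pᵢ), they read:

```latex
\dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i}
```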
The system traces a trajectory through phase space — the space of all possible (q, p) configurations. That trajectory is not arbitrary. It is constrained by H. You never measure H directly. You infer it from how the system moves.
Three properties matter:
Trajectories are deterministic given H. State at t₀ determines state at t₁.
Invariants emerge from the structure of H. Conservation laws follow from symmetries (Noether’s theorem).
Perturbations reveal H. You learn the governing function by pushing the system and watching what happens.
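The first two properties can be checked numerically on a toy system. The sketch below is an illustration only: it assumes a unit-mass, unit-frequency harmonic oscillator, H(q, p) = p²/2 + q²/2, and integrates Hamilton's equations with a leapfrog scheme. The function names and step sizes are choices made here, not part of any established pipeline.

```python
import numpy as np

def hamiltonian(q, p):
    """Total energy of a unit-mass, unit-frequency harmonic oscillator."""
    return 0.5 * p**2 + 0.5 * q**2

def leapfrog(q0, p0, dt=0.01, n_steps=1000):
    """Integrate dq/dt = dH/dp = p and dp/dt = -dH/dq = -q with leapfrog steps."""
    q, p = float(q0), float(p0)
    traj = np.empty((n_steps + 1, 2))
    traj[0] = (q, p)
    for k in range(n_steps):
        p -= 0.5 * dt * q   # half step in momentum
        q += dt * p         # full step in position
        p -= 0.5 * dt * q   # second half step in momentum
        traj[k + 1] = (q, p)
    return traj

# Property 1: the same initial state always yields the same trajectory.
traj_a = leapfrog(1.0, 0.0)
traj_b = leapfrog(1.0, 0.0)
is_deterministic = bool(np.array_equal(traj_a, traj_b))

# Property 2: an invariant (here, energy) stays nearly constant along the path.
energy = hamiltonian(traj_a[:, 0], traj_a[:, 1])
energy_drift = float(np.max(np.abs(energy - energy[0])))

print(is_deterministic, energy_drift)
```

The energy drift is small but not exactly zero, because the integrator conserves the invariant only approximately. The point is that neither property is visible from any single (q, p) snapshot; both are statements about trajectories.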
Now: does any of this transfer to biological systems?
The biological state vector
Define the state of a biological system at time t as a vector:
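Written out, with the counts m and k as labels assumed here for the number of proteins and of clinical variables:

```latex
x(t) = \bigl( g_1(t), \dots, g_n(t),\; p_1(t), \dots, p_m(t),\; c_1(t), \dots, c_k(t) \bigr)^{\top}
```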
where gᵢ are transcript abundances, pⱼ are protein levels (not the momenta p of the previous section), and cₗ are clinical or microenvironmental variables. For a transcriptome-only measurement, x ∈ ℝⁿ with n ≈ 20,000.
This is not abstract. Here is what a simplified plasma RNA state vector looks like across three timepoints for a single NSCLC patient under immunotherapy:
Illustrative values, representative of trajectories observed in responding patients on anti-PD-1 therapy. Not from a specific dataset.
This table shows six dimensions of a ~20,000-dimensional vector, sampled at three timepoints. Structure is already visible: IFNG and PD-L1 rise together, TGFB1 and MKI67 decline together, and LDH drops. These movements do not look independent; some mechanism appears to be constraining them.
The constraint function
Suppose that there exists a function H(x) such that the biological system evolves approximately as:
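In symbols, one form consistent with the definitions that follow (a sketch of the structure, not a verified model) is:

```latex
\frac{dx}{dt} = J \, \nabla H(x) + \eta(t)
```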
where J is a structure matrix (analogous to the symplectic matrix in Hamiltonian mechanics), ∇H is the gradient of the constraint function, and η(t) is a noise term accounting for biological stochasticity.
In a true Hamiltonian system, J is the canonical symplectic matrix:
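For d degrees of freedom and x = (q, p), this is the standard 2d × 2d block matrix, with I_d the d × d identity (the symbol d is a label chosen here):

```latex
J = \begin{pmatrix} 0 & I_d \\ -I_d & 0 \end{pmatrix}
```

With this J and no noise term, dx/dt = J∇H reduces to Hamilton's equations.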
In biology, J is not canonical. It encodes the regulatory network topology — which variables drive which, with what sign and coupling strength. J is sparse (most genes do not directly regulate most other genes), asymmetric (regulation is not reciprocal), and itself state-dependent (regulatory topology changes with cell state).
This is where we should be clearer. Biology is not a conservative Hamiltonian system. It is dissipative, far from equilibrium, and stochastic. The more precise formulation is probably closer to:
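One form consistent with the definitions that follow, with the minus sign on the dissipative term an assumed convention so that a positive-definite Γ drives the system downhill in V:

```latex
\frac{dx}{dt} = J \, \nabla H(x) - \Gamma \, \nabla V(x) + \eta(t)
```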
where V(x) is a dissipative potential (thermodynamic cost, entropy production) and Γ is a positive-definite friction matrix. The system has both conservative-like dynamics (the H term, which generates rotations and oscillations in state space) and dissipative dynamics (the V term, which pulls the system toward attractors).
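To make the two kinds of dynamics concrete, here is a minimal sketch in Python. Everything specific in it is an assumption for illustration: a two-variable state, quadratic H and V, a 2 × 2 antisymmetric J, a friction matrix Γ = γI, and the update rule dx/dt = J∇H − Γ∇V + η integrated with a simple Euler-Maruyama step. It is not fitted to any biological data.

```python
import numpy as np

# Toy antisymmetric structure matrix (assumed): generates rotation in state space.
J = np.array([[0.0, 1.0], [-1.0, 0.0]])

def grad_H(x):
    """Gradient of the toy constraint function H(x) = 0.5 * |x|^2."""
    return x

def grad_V(x):
    """Gradient of the toy dissipative potential V(x) = 0.5 * |x|^2 (attractor at 0)."""
    return x

def simulate(x0, gamma, sigma=0.0, dt=1e-3, n_steps=5000, seed=0):
    """Euler-Maruyama integration of dx/dt = J grad_H - Gamma grad_V + noise."""
    rng = np.random.default_rng(seed)
    Gamma = gamma * np.eye(2)   # positive-definite friction when gamma > 0
    x = np.array(x0, dtype=float)
    path = np.empty((n_steps + 1, 2))
    path[0] = x
    for k in range(n_steps):
        drift = J @ grad_H(x) - Gamma @ grad_V(x)
        noise = sigma * np.sqrt(dt) * rng.standard_normal(2)
        x = x + dt * drift + noise
        path[k + 1] = x
    return path

conservative = simulate([1.0, 0.0], gamma=0.0)   # H term only: circulates
dissipative = simulate([1.0, 0.0], gamma=1.0)    # H and V terms: spirals inward

r_conservative = float(np.linalg.norm(conservative[-1]))
r_dissipative = float(np.linalg.norm(dissipative[-1]))
print(r_conservative, r_dissipative)
```

With γ = 0 the state keeps circling at nearly constant radius (the small drift is integrator error). With γ > 0 the same rotation is superimposed on a steady pull toward the attractor. Setting sigma above zero adds the stochastic term.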
Why snapshots are not enough
Here is the core problem, stated quantitatively.
Given a single observation x(t₀):
Given a trajectory {x(t₀), x(t₁), … , x(t_T)}:
The velocity estimate from trajectories uses:
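The simplest such estimate, offered here as an assumption rather than a specific published choice, is a forward finite difference between consecutive timepoints:

```latex
\dot{x}(t_k) \approx \frac{x(t_{k+1}) - x(t_k)}{t_{k+1} - t_k}, \qquad k = 0, \dots, T-1
```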
Given trajectories under perturbation (drug, knockout, ligand):
The information content scales as: snapshot ≪ trajectory ≪ perturbed trajectory.
To put numbers on it: a cross-sectional cohort of N = 500 patients with single-timepoint RNA profiles gives you 500 points in ℝⁿ with n ≈ 20,000. A longitudinal cohort of N = 50 patients with T = 6 timepoints gives you 50 trajectories, each contributing five consecutive-timepoint differences, for roughly 250 estimates of the local vector field sampled along biologically real paths. The second dataset is smaller by sample count (300 profiles versus 500) but richer in information per sample, because it constrains the vector field f(x), not just the distribution of x.
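A toy version of this argument can be run directly. The sketch below assumes, purely for illustration, that the true dynamics are linear in two dimensions, dx/dt = A x, with a made-up matrix A_true. It generates 50 noise-free synthetic trajectories of 6 timepoints each, forms forward finite differences, and recovers the vector field by least squares. Real transcriptomic data would be far higher-dimensional, noisy, and nonlinear.

```python
import numpy as np

# Assumed ground-truth linear vector field f(x) = A_true @ x (decay plus rotation).
A_true = np.array([[-0.5, 1.0], [-1.0, -0.5]])

def flow(x0, t):
    """Exact solution of dx/dt = A_true @ x: exponential decay times a rotation."""
    c, s = np.cos(t), np.sin(t)
    rotation = np.array([[c, s], [-s, c]])
    return np.exp(-0.5 * t) * (rotation @ x0)

rng = np.random.default_rng(0)
n_patients, n_timepoints, dt = 50, 6, 0.1
times = dt * np.arange(n_timepoints)
starts = rng.standard_normal((n_patients, 2))

# trajectories[i, k] is the state of synthetic "patient" i at times[k]
trajectories = np.array([[flow(x0, t) for t in times] for x0 in starts])

# Forward finite differences give one velocity estimate per consecutive pair.
velocities = (trajectories[:, 1:] - trajectories[:, :-1]) / dt
states = trajectories[:, :-1]

X = states.reshape(-1, 2)
V = velocities.reshape(-1, 2)

# Least-squares fit of V ≈ X @ A.T, i.e. velocity ≈ A @ state
A_T, *_ = np.linalg.lstsq(X, V, rcond=None)
A_est = A_T.T

n_velocity_estimates = X.shape[0]
max_error = float(np.max(np.abs(A_est - A_true)))
print(n_velocity_estimates, max_error)
```

The residual error comes from the finite-difference approximation and shrinks as the sampling interval shrinks. The same 300 points, shuffled so that the time ordering is lost, would support no such fit: there would be no velocities to regress on.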
Add matched perturbation experiments — patient-derived organoids treated with the same therapy in parallel — and you get causal supervision. You can ask whether the trajectory in the patient’s plasma RNA is consistent with the trajectory induced in their organoid system.
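One simple way to quantify "consistent" is to compare directions of movement. The sketch below uses the cosine similarity between net displacement vectors. This is an assumed, deliberately crude measure with made-up coordinates; a real comparison would need a shared feature space and some handling of noise and timing.

```python
import numpy as np

def displacement_cosine(traj_a, traj_b):
    """Cosine similarity between the net displacements of two trajectories."""
    a = np.asarray(traj_a, dtype=float)
    b = np.asarray(traj_b, dtype=float)
    da = a[-1] - a[0]
    db = b[-1] - b[0]
    denom = np.linalg.norm(da) * np.linalg.norm(db)
    if denom == 0.0:
        raise ValueError("a trajectory with zero net displacement has no direction")
    return float(da @ db / denom)

# Synthetic two-feature trajectories (illustrative values only).
plasma = np.array([[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]])
organoid_same = np.array([[0.2, 0.9], [0.8, 0.3]])
organoid_opposite = organoid_same[::-1]

same_direction = displacement_cosine(plasma, organoid_same)
opposite_direction = displacement_cosine(plasma, organoid_opposite)
print(same_direction, opposite_direction)
```

A value near +1 means the organoid moved the same way as the patient's plasma profile, and a value near -1 means it moved the opposite way.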
A concrete example: immune activation dynamics
Consider a patient’s immune state during anti-PD-1 therapy, projected onto two derived axes: an effector score α (weighted combination of IFNG, GZMB, PRF1) and an immunosuppressive score β (weighted combination of TGFB1, IL10, FOXP3).
You observe two classes of trajectories:
Responders:
Approximately linear trajectory. Consistent velocity. The system is in a channel.
Non-responders:
Initial weak perturbation in the effector axis, followed by reversion and drift toward immunosuppression. The system was pushed but returned — different basin.
The point: at t₀, these two patients have nearly identical (α, β) coordinates — (0.31, 0.68) vs. (0.33, 0.65). A static classifier sees the same input. The trajectories diverge completely.
If you model the 2D dynamics as:
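A minimal form consistent with the definitions that follow, assuming overdamped gradient dynamics, a perturbation that acts only on the effector axis, and no noise term:

```latex
\frac{d\alpha}{dt} = -\frac{\partial V}{\partial \alpha} + u(t), \qquad \frac{d\beta}{dt} = -\frac{\partial V}{\partial \beta}
```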
where u(t) is the therapeutic perturbation and V(α, β) is a patient-specific potential landscape, then the responder has a V where the perturbation pushes the system over a barrier into an effector-dominant basin. The non-responder has a V where the barrier is too high or the immunosuppressive basin is too deep.
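The barrier picture can be sketched with a toy landscape. Everything below is assumed for illustration: overdamped gradient dynamics with no noise, a double well in α with minima near 0.3 and 0.8, a quadratic coupling that keeps α + β close to 1, a constant push u on the effector axis during a treatment window, and a single parameter barrier_scale standing in for patient-specific barrier height. The starting coordinates are the ones quoted above. Nothing else is taken from data.

```python
import numpy as np

def grad_V(state, barrier_scale, coupling=5.0):
    """Gradient of the toy landscape V = barrier_scale * W(alpha) + 0.5 * coupling * s^2,
    with W(a) = (a - 0.3)^2 (a - 0.8)^2 and s = alpha + beta - 1."""
    alpha, beta = state
    dW = 2.0 * (alpha - 0.3) * (alpha - 0.8) * (2.0 * alpha - 1.1)
    s = alpha + beta - 1.0
    return np.array([barrier_scale * dW + coupling * s, coupling * s])

def simulate(start, barrier_scale, push=0.5, t_off=5.0, t_end=20.0, dt=0.01):
    """Euler integration of d(state)/dt = -grad_V + (u(t), 0), with u on until t_off."""
    n_steps = int(round(t_end / dt))
    state = np.array(start, dtype=float)
    path = np.empty((n_steps + 1, 2))
    path[0] = state
    for k in range(n_steps):
        u = push if k * dt < t_off else 0.0
        drive = np.array([u, 0.0])   # perturbation acts on the effector axis only
        state = state + dt * (-grad_V(state, barrier_scale) + drive)
        path[k + 1] = state
    return path

# Nearly identical starting points, different landscapes.
responder = simulate([0.31, 0.68], barrier_scale=10.0)       # low barrier
non_responder = simulate([0.33, 0.65], barrier_scale=30.0)   # high barrier

responder_end = responder[-1]
non_responder_end = non_responder[-1]
non_responder_peak_alpha = float(non_responder[:, 0].max())
print(responder_end, non_responder_end, non_responder_peak_alpha)
```

In this toy setup the low-barrier system is carried over the barrier and settles in the effector-dominant basin once the push ends. The high-barrier system rises slightly on the effector axis while the push is on, then reverts to the immunosuppressive basin. The starting coordinates alone do not distinguish the two cases.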
What this framework is not
It is not a literal claim that cells obey Hamilton’s equations. They do not. The system is dissipative, stochastic, and far from equilibrium. Time-reversibility does not hold. Energy is not conserved in the textbook sense.
It is a modeling hypothesis: the structural apparatus of Hamiltonian and Lagrangian mechanics — state spaces, constraint functions, phase portraits, perturbation analysis, invariants — provides a more appropriate scaffold for biological modeling than static, correlative machine learning.
Why this Substack
I named this publication “Hamiltonian Biology” because I think the field needs a specific correction. The current default in computational biology is: collect observations, train a model, report correlations. That paradigm has an epistemic ceiling. It cannot recover dynamics from statics. It cannot infer causation from correlation. It cannot distinguish patients in different basins of a constraint landscape if they happen to sit at the same point when you measure them.
Everything I write here will orbit that idea, in one form or another.