The physical meaning of the holographic principle

We show in this pedagogical review that far from being "an apparent law of physics that stands by itself" (R. Bousso, Rev. Mod. Phys. 74 (2002), 825-874), the holographic principle (HP) is a straightforward consequence of the quantum information theory of separable systems. It provides a basis for the theories of measurement, time, and scattering. Principles equivalent to the HP appear in both computer science and the life sciences, suggesting that the HP is not just a fundamental principle of physics, but of all of science.


Introduction
The Holographic Principle (HP) was originally stated as a conjecture by 't Hooft [1]: Given any closed surface, we can represent all that happens inside it by degrees of freedom on this surface itself.
The HP generalizes Bekenstein's area law [2] for the entropy of a black hole (BH):

$$S = A/4, \qquad (1)$$

where S denotes the thermodynamic entropy of a BH and A its horizon area in Planck units. The number of "degrees of freedom on this surface itself" cannot, in particular, exceed $A/4$, with A the area of the surface in question [1]. Susskind [3] provided the first physical implementation of 't Hooft's conjecture by defining an explicit mapping from volume to surface degrees of freedom for a general closed system. This mapping assumes that all light rays that are normal to any element of surface within the bulk are also normal to the boundary. Bousso [4] then showed that requiring covariance induces a holographic limit on information transfer by light, reformulating Eq. (1) as a covariant entropy bound:

$$S(L(\Sigma)) \leq A(\Sigma)/4, \qquad (2)$$

where $A(\Sigma)$ denotes the area in Planck units of a (typically but not necessarily) closed surface $\Sigma$, and $L(\Sigma)$ any light-sheet of $\Sigma$, defined as any collection of converging light rays that propagate from $\Sigma$ toward some focal point away from $\Sigma$. Bekenstein's area law emerges as the special case in which the equality in (2) holds. Bousso also provided several counterexamples showing the failure of a straightforward interpretation of the HP as a spacelike entropy bound.
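As a numerical aside (ours, not drawn from the references), Eq. (1) can be made concrete at astrophysical scale. The sketch below evaluates the horizon entropy of a Schwarzschild black hole of one solar mass; `bekenstein_entropy_bits` is an illustrative helper, not a function from the cited works.

```python
import math

# Physical constants (SI units)
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8         # speed of light, m/s
hbar = 1.055e-34    # reduced Planck constant, J s

def bekenstein_entropy_bits(mass_kg: float) -> float:
    """Horizon entropy S = A / (4 l_P^2) of a Schwarzschild black hole,
    converted from nats to bits (divide by ln 2)."""
    r_s = 2 * G * mass_kg / c**2        # Schwarzschild radius, m
    area = 4 * math.pi * r_s**2         # horizon area, m^2
    l_p2 = hbar * G / c**3              # Planck area, m^2
    return area / (4 * l_p2) / math.log(2)

# One solar mass (~1.99e30 kg) gives roughly 1.5e77 bits of horizon entropy.
n_bits = bekenstein_entropy_bits(1.99e30)
```

The enormous magnitude of this bound, compared to the ordinary thermodynamic entropy of stellar-mass matter, is what makes the area law striking.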
When formalized by (1) or (2), the HP is semiclassical; indeed it is "quantum" only in its reliance on Planck units and hence a finite value of $\hbar$. The entropy S is a classical thermodynamic entropy. In the context of general relativity (GR), $\Sigma$ is a continuous classical manifold enclosing a continuous classical volume characterized by a real-valued metric. As 't Hooft [1] pointed out, the HP renders $S(L(\Sigma))$ independent of the metric inside $\Sigma$: "The inside metric could be so much curved that an entire universe could be squeezed inside our closed surface, regardless how small it is. Now we see that this possibility will not add to the number of allowed states at all."
Here the "allowed states" are the thermodynamic states of $\Sigma$, states that an external observer can count by measuring energy transfer between the system and its external environment. In the case of a BH, Rovelli [5,6] has shown explicitly that states that are effectively isolated from the external environment, and hence do not contribute to the system-environment interaction over relevant time scales, do not contribute to $S(L(\Sigma))$. A BH can, in principle, have arbitrarily many such isolated states, as in Wheeler's "bag of gold" model of a BH with a small horizon and large interior [7].
The HP was given broader theoretical relevance within quantum gravity (QG) research by Maldacena [8], who showed that a string QG on a "bulk" d-dimensional anti-de Sitter (AdS) spacetime and a conformal quantum field theory (CFT) on its (d-1)-dimensional boundary encode the same information. A more limited dS/CFT holographic duality has also been explored [9]. While such dualities have seen wide theoretical application, their physical motivation remains that of 't Hooft's conjecture and Bekenstein's area law. The existence of these holographic dualities suggests that the HP is both deep and fully general, but they do not explain why this should be the case. Bousso [4] summarized the situation by remarking that the HP remains ". . . an apparent law of physics that stands by itself, both uncontradicted and unexplained by existing theories, that may still prove incorrect or merely accidental, signifying no deeper origin."
Our purpose here is to respond to this remark of Bousso's by developing a clear, consistent, and, at bottom, very simple picture of the physical meaning of the HP. Building on previous results [10,11,12,13,14,16], we show that the HP can be generalized to describe the maximum classical information flow implemented by any physical interaction between mutually separable, i.e. unentangled, finite physical systems. Specifically, we can state:

Generalized Holographic Principle (GHP): If U = AB is a finite closed system and the joint state $|AB\rangle$ is separable, then the classical information exchange between A and B is limited to N bits, where N is the dimension of the interaction Hamiltonian $H_{AB}$.
This GHP is a fully quantum information-theoretic principle that is entirely independent of geometric considerations. The N exchanged bits can, however, without loss of generality be viewed as encoded at a density of no more than 1 bit per $4 l_P^2$ on an ancillary, spacelike boundary $\mathcal{B}$ separating A from B. Bousso's covariant formulation follows as a special case whenever $\mathcal{B}$ is considered a physical boundary traversed by a light sheet, i.e. whenever photons (or indeed any gauge bosons) are considered to be "carriers" of the exchanged information.
After setting up the formalism in §2, we show in subsequent sections how:

1. The GHP enables a provably general theory of quantum measurement [17] that is fully consistent with the free-energy principle (FEP) introduced by Friston and colleagues [18,19,20,21,22] as a description of active inference by Bayesian agents (§3).

2. The GHP provides a natural definition of system-relative entropic time applicable to any physical system A that is separable from its environment B (§4).

3. The GHP allows us to view $\mathcal{B}$ as a scattering center and its areal elements as encoding S-matrix elements (§5).
When generalized to the GHP, therefore, the HP does not "stand by itself" but is rather a fundamental principle from which much familiar physics follows. It has, moreover, a simple and intuitively obvious physical meaning:

GHP (informal): Any classical information exchanged between finite physical systems is encoded on the boundary between them.
Intersystem boundaries are, in other words, classical information channels. We review in §6 various statements of this same principle that have been derived in statistical physics, computer science, and the life sciences. We conclude that the HP is a foundational principle not just of physics, but of all of science.
2 Holographic screens are information encoding boundaries

Physical interaction is information exchange
Let U be an isolated, finite-dimensional quantum system and consider an arbitrarily-chosen bipartite decomposition U = AB corresponding to a Hilbert-space tensor product $\mathcal{H}_U = \mathcal{H}_A \otimes \mathcal{H}_B$. Provided A and B are separable, the interaction can be written in the weak-interaction limit as:

$$H_{AB} = \beta^k k_B T \sum_i^N \alpha_i^k M_i^k, \qquad (3)$$

where $k_B$ denotes Boltzmann's constant, T is the absolute temperature of the environment, k = A or B, the $M_i^k$ are N mutually-orthogonal Hermitian operators with eigenvalues in $\{-1, 1\}$, the $\alpha_i^k \in [0, 1]$ are such that $\sum_i^N \alpha_i^k = 1$, and $\beta^k \geq \ln 2$ is an inverse measure of k's thermodynamic efficiency that depends on the internal dynamics $H_k$; see [10,11,12,13,14] for further motivation and details of this construction. For fixed k, the operators $M_i^k$ mutually commute, i.e. $[M_i^k, M_j^k] = 0$ for all i, j; hence when expressed as Eq. (3), $H_{AB}$ is swap-symmetric under the permutation group $S_N$ for each k. We can, therefore, write $N = \dim(H_{AB})$, i.e. the eigenvalues of $H_{AB}$ can be encoded by $2^N$ distinct N-bit strings. The weak-interaction limit requires $N \ll \dim(\mathcal{H}_A), \dim(\mathcal{H}_B)$, although as discussed below, this condition is not sufficient to guarantee separability.
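A minimal numerical sketch of Eq. (3) (our illustration; the value N = 3 and the weights $\alpha_i$ are chosen arbitrarily): build $H_{AB}$ as a weighted sum of commuting z-spin operators on an N-qubit array and check that its $2^N$ eigenvalues are indexed by N-bit sign strings.

```python
import itertools

import numpy as np

N = 3
sz = np.diag([1.0, -1.0])     # one-qubit z-spin operator s_z
I2 = np.eye(2)

def M(i: int) -> np.ndarray:
    """M_i of Eq. (3): s_z acting on qubit i of the N-qubit array."""
    out = np.ones((1, 1))
    for j in range(N):
        out = np.kron(out, sz if j == i else I2)
    return out

alpha = np.array([0.6, 0.25, 0.15])   # alpha_i in [0, 1], summing to 1
beta_kT = 1.0                         # beta * k_B * T, arbitrary units

H_AB = beta_kT * sum(alpha[i] * M(i) for i in range(N))

# The M_i commute, so H_AB is diagonal in the joint z basis, and each of
# its 2^N eigenvalues is sum_i alpha_i * (+/-1) for one N-bit sign string.
eigs = np.sort(np.diag(H_AB))
expected = np.sort([sum(a * s for a, s in zip(alpha, signs))
                    for signs in itertools.product([1, -1], repeat=N)])
```

With generic (pairwise-distinct-sum) weights, the $2^N$ eigenvalues are distinct, so reading off the eigenvalue is equivalent to reading off an N-bit string.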
When expressed as Eq. (3), $H_{AB}$ can be realized physically as illustrated in Fig. 1. The operators $M_i^k$ are interpreted, as the notation suggests, as measurement operators, and dually as state-preparation operators [23]. As each of the $M_i^k$ has eigenvalues in $\{-1, 1\}$, they can be regarded as z-spin operators $s^k_{z(i)}$ acting on individual qubits $q_i$. The orthogonality of the $M_i^k$ requires the $q_i$ to be mutually independent, i.e. non-interacting. Each "cycle" of interaction between A and B then comprises four sequential steps: preparation of the $q_i$ by B, measurement of the $q_i$ by A, preparation of the $q_i$ by A, and measurement of the $q_i$ by B. The systems A and B thus exchange N bits of classical information on each cycle. Note that, in this picture, the operators $M_i^A$ and $M_i^B$ do not act directly on the Hilbert spaces $\mathcal{H}_B$ and $\mathcal{H}_A$ of B and A respectively, but on the N-dimensional effective Hilbert spaces $\mathcal{H}^A_{\langle q_i \rangle}$ and $\mathcal{H}^B_{\langle q_i \rangle}$ that specify, from the perspectives of A and B respectively, the states of the $q_i$. As the eigenvalues of the $M_i^k$ when considered together encode an eigenvalue of $H_{AB}$, the classical information exchanged is the current eigenvalue of $H_{AB}$, i.e. the energy transferred by the interaction. The symmetry of the interaction cycle then assures conservation of energy.

Figure 1: A holographic screen $\mathcal{B}$ separating finite systems A and B with an interaction $H_{AB}$ given by Eq. (3) can be realized by an ancillary array of noninteracting qubits that are alternately prepared by A (B) and then measured by B (A). Qubits are depicted as Bloch spheres [24]. There is no requirement that A and B share preparation and measurement bases, i.e. QRFs. Adapted from [13] Fig. 1, CC-BY license.
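The four-step cycle can be caricatured classically, under the simplifying (hypothetical) assumption that A and B share a preparation/measurement basis so that readout is deterministic; `cycle` and `eigenvalue` are illustrative names, and the weights are the same arbitrary ones used above.

```python
ALPHA = [0.6, 0.25, 0.15]   # weights alpha_i of Eq. (3), illustrative
BETA_KT = 1.0               # beta * k_B * T, arbitrary units

def eigenvalue(bits):
    """H_AB eigenvalue encoded by an N-bit string (bit 1 -> +1, 0 -> -1)."""
    return BETA_KT * sum(a * (1 if b else -1) for a, b in zip(ALPHA, bits))

def cycle(bits_from_B, bits_from_A):
    """One interaction cycle: B prepares / A measures, then A prepares /
    B measures. Returns (A's readout, B's readout, energy seen by A,
    energy seen by B)."""
    read_by_A = list(bits_from_B)   # A measures the qubits B prepared
    read_by_B = list(bits_from_A)   # B measures the qubits A prepared
    return read_by_A, read_by_B, eigenvalue(read_by_A), eigenvalue(read_by_B)

# Symmetric traffic: the same 3-bit string flows each way, so the energies
# seen by A and B per cycle balance, as conservation of energy requires.
a_read, b_read, e_to_A, e_to_B = cycle([1, 0, 1], [1, 0, 1])
```

The point of the sketch is only bookkeeping: N classical bits cross the screen per cycle, and the exchanged "message" is an $H_{AB}$ eigenvalue, i.e. an energy.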
The array $q_i$ of noninteracting qubits via which A and B exchange classical information clearly performs the functions of a holographic screen: • The $q_i$ separate A from B. The interpretation of the $M_i^k$ as preparation and measurement operators depends critically on the assumption of separability; if A and B are entangled, i.e. if $|AB\rangle$ fails to factor as $|AB\rangle = |A\rangle|B\rangle$, the idea that A "prepares" or "measures" $|B\rangle$ is physically meaningless.
• The q i encode all of the classical information about B accessible to A and vice versa.
Indeed if $H_U$ remains unspecified, $H_A$ and $H_B$ can vary arbitrarily without affecting $H_{AB}$, as required by 't Hooft's idea of squeezing "an entire universe" into B without affecting A.
We can, therefore, think of the $q_i$ as "points" or more accurately "sites" on a boundary $\mathcal{B}$ separating A from B and hence separating their respective Hilbert spaces $\mathcal{H}_A$ and $\mathcal{H}_B$, where clearly AB = U requires $\mathcal{H}_A \otimes \mathcal{H}_B = \mathcal{H}_U$. This boundary $\mathcal{B}$ is, however, entirely ancillary; its states $|q_i\rangle$ are not elements of $\mathcal{H}_U$. The independence of the $q_i$ gives $\mathcal{B}$ a discrete, indeed a Grothendieck, topology (see e.g. [15]). At this stage of the construction, $\mathcal{B}$ is just a discrete topological space that is characterized neither as a space-like nor as a time-like surface. The embedding of $\mathcal{B}$ in a (d+1)-dimensional spacetime manifold can be achieved through a tessellation of the latter into voxels. Voxels that neighbour each other and cubulate spacetime introduce a concept of distance between qubits. In turn, the qubits define the nodes of a one-complex that represents the discretization of $\mathcal{B}$. We can provide an illustrative example by embedding $\mathcal{B}$ in a 2+1 dimensional spacetime manifold. In this case, we can "geometrize" $\mathcal{B}$ as shown in Fig. 2 by embedding each of the $q_i$ in a (conventionally 3d) voxel of size $(2\Delta x)^2 \cdot 2c\Delta t$, where to preserve covariance and hence Eq. (2), $\Delta x \geq l_P$ and $\Delta t \geq t_P$, with $l_P$ and $t_P$ denoting the Planck length and time respectively, and c is the maximal speed of classical information transfer, i.e. the speed of light. As $\mathcal{B}$ itself is ancillary to $\mathcal{H}_U$, this geometry is ancillary to $\mathcal{H}_U$. The geometry on $\mathcal{B}$ has, therefore, no effect on the physics implemented by the joint system self-interaction $H_U$ or the A-B interaction $H_{AB}$.
Figure 2: One qubit degree of freedom (represented as a Bloch sphere), e.g. a spin, embedded in a 3d voxel at some minimal scale $\Delta x$, $\Delta t$. Here c is the maximal speed of (classical) information transfer.
The GHP as stated above clearly follows immediately, by construction, from the mutual separability of A and B via Eq. (3); Eq. (2) and hence the covariant HP follow when the boundary $\mathcal{B}$ is geometrized as above. The GHP can, therefore, be viewed as an alternative way of stating the fundamental idea that physical systems must have their own, mutually conditionally independent states if they are to be regarded as interacting. This idea of conditional independence is so deeply embedded in our language - the language of "things" that interact with each other - that it is seldom made explicit. The GHP formalizes an obvious logical consequence of this idea: finite things can only exchange finite information by interacting, and this information has to fit through the finite channel implemented by the boundary that separates them. We will see in what follows that this seemingly-simple fact has significant implications both in physics and in other disciplines. Indeed, it generates, with minimal further assumptions, much of what is considered foundational in physics, computer science, and the life sciences.

Information is strictly conserved
The representation of $H_{AB}$ as a Hermitian operator in Eq. (3) depends on the Axiom of Unitarity [25]; in particular, it requires the time evolution operator:

$$e^{-(i/\hbar) H_U t} \qquad (4)$$

to be unitary. The Axiom of Unitarity guarantees that time evolution is reversible, and hence that information is conserved, in any closed system.
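As a numerical sanity check on this claim (a generic example, not tied to any particular $H_U$): the propagator generated by any Hermitian H is unitary and therefore preserves state norms.

```python
import numpy as np
from numpy.linalg import norm

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (X + X.conj().T) / 2    # an arbitrary Hermitian "H_U"

# Matrix exponential e^{-iHt} via eigendecomposition (hbar = 1 units).
w, V = np.linalg.eigh(H)
t = 1.3
U = V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T

# A random normalized state; unitary evolution must keep its norm at 1.
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= norm(psi)
psi_t = U @ psi
```

Norm preservation is the operational content of "information is conserved": no probability leaks out of a closed system under Eq. (4).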
Because the GHP follows by construction from Eq. (3), it is clearly consistent with the Axiom of Unitarity, and hence with strict conservation of information. Indeed, the informational symmetry of any holographic screen $\mathcal{B}$ enforces the conservation of information by preventing any "build up" of information on one side or the other.
If we regard the conservation of information (and hence unitarity) as a fundamental principle analogous to the conservation of energy, then we can formulate it as the principle that the net information in any closed system remains constant. The joint system U is closed by definition; hence unitarity requires the net information content of U to be constant. We can, therefore, rescale the net information in U to zero. This applies, in particular, to net classical information:

Conservation of Classical Information (CCI): The net classical information in any closed system is zero.
If we consider irreversibly encoded classical information, to which Landauer's Principle [26,27,28] applies, CCI clearly follows from the conservation of energy: if the net energy of a system is zero, the net irreversibly encoded classical information in that system can only be zero. Compliance with CCI is guaranteed if we require:

Exclusive Holographic Encoding (EHE): Classical information is encoded only on holographic screens.
Indeed the informational symmetry of holographic encoding renders EHE and CCI equivalent. Both can be viewed as "no collapse" principles that render classical information strictly ancillary to the closed-system dynamics $H_U$. Classical information encoded on $\mathcal{B}$ is not, however, ancillary to either of the separated systems A or B; this encoded information is input to, or dually output from, A or B by Eq. (3). From the perspective of either A or B, $\mathcal{B}$ encodes N = 0 bits of classical information whenever $H_{AB} = 0$. This is true, for some positive value of N, for any tensor-product decomposition of any closed system U that meets the separability criteria that allow writing Eq. (3). Hence we can restate EHE as:

Relativity of Classical Information (RCI): All classical information is decomposition-relative.
We will see below that RCI renders both all observable "systems" and all classical memory observer-relative. It thus generalizes the idea - which appears in relative-state [29,30], relational [31], and QBist [32,33] approaches to quantum theory - that quantum states are relative to observers, themselves quantum systems, and, critically, to all classically-encoded records of previous observations. It thus renders all such systems and records observer-dependent and hence "non-objective." There is, however, from a theoretical perspective, nothing controversial about RCI; it merely restates the Axiom of Unitarity. We suggest, as Bohr [34] and Mermin [35] have before us, that "what quantum theory is trying to tell us" is precisely RCI.

Classical encoding requires free choice of basis
Writing Eq. (3) requires choosing the basis vectors $|i\rangle_k$, where again k = A or B. In the physical realization of $\mathcal{B}$ as a qubit array shown in Fig. 1, choosing the $|i\rangle_k$ is choosing, for each of A and B, the z axis used to measure $s_z$ for each of the $q_i$. This choice determines, for each of A and B, which of the $2^N$ eigenvalues of $H_{AB}$ is encoded on $\mathcal{B}$. Choosing the $|i\rangle_k$ is, therefore, effectively choosing the zero-point of energy for each of A and B. These zero points can clearly be different.
Free choice of the $|i\rangle_k$ by us as theorists is equivalent, operationally, to free choice of $|i\rangle_A$ and $|i\rangle_B$ by A and B, respectively. "Free choice" is standardly interpreted as freedom from local determinism, e.g. by events in the past light cone [42]. While unitarity of closed-system (e.g. U) evolution can be read as a form of superdeterminism [43], any superdeterministic correlation between the choices of $|i\rangle_A$ and $|i\rangle_B$ would render A and B non-separable, in which case Eq. (3) no longer holds. Free choice of basis is, therefore, required for separability and hence for the GHP. We will see below that this has significant consequences for the theory of measurement.
3 The GHP enables a fully-general quantum theory of measurement

Measurement produces finite-resolution, classical outcomes
Quantum theory is traditionally considered to have a "measurement problem"; indeed, since Schrödinger first introduced his cat [36], an enormous literature has been devoted to the question of how observers can obtain classical information from a quantum world (see [37] for a thorough review and [38] for a recent compendium of philosophical positions). We will, in this section, employ quantum theory and the GHP to construct a fully-general theory of measurement; this construction was developed in [10,11,12,13,14,16], to which readers are referred for further details. We show in [17] that this theory reduces, in its classical limit, to the well-established theory of active inference derived from the classical FEP [18,19,20,21,22]. We suggest that this holographic quantum theory of measurement obviates the traditional measurement problem, though aside from the brief comments made earlier, we defer discussion of its relations to the plethora of philosophical interpretations of quantum measurement to future work.
The goal of measurement is to obtain recordable, reportable observational outcomes that can be compared to the outcomes of measurements carried out at other times or by other observers. Such measurement outcomes must be encodable in a thermodynamically irreversible way as classical information on classical memory devices, e.g. pieces of paper, transistor arrays, or weight values on connections in a neural network. As thermodynamically irreversible classical encoding has a finite energy cost of at least $k_B T \ln 2$ per bit [26,27,28], any such observational outcome must be encodable as a finite bit string. This is in fact obvious (no physical apparatus has infinite resolution) but is often neglected when observational outcomes are represented by unrestricted real numbers.
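The $k_B T \ln 2$ cost can be made concrete with a standard Landauer estimate (our numerical aside, not from the text); `landauer_cost_joules` is an illustrative helper.

```python
import math

k_B = 1.380649e-23   # Boltzmann constant, J/K (exact, 2019 SI)

def landauer_cost_joules(n_bits: float, temp_kelvin: float) -> float:
    """Minimum free-energy cost k_B * T * ln(2) per irreversibly encoded bit."""
    return n_bits * k_B * temp_kelvin * math.log(2)

# At room temperature one bit costs ~2.9e-21 J; even a kilobyte-scale
# record costs only ~2e-17 J, which is why the bound is easy to overlook.
per_bit = landauer_cost_joules(1, 300.0)
per_kilobyte = landauer_cost_joules(8000, 300.0)
```

Tiny as these numbers are, they are nonzero, which is all the argument above requires: finite energy budgets imply finite bit strings.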
A theory of measurement must, therefore, address three questions:

• It must provide a formal mechanism for mapping physical interactions to finite classical encodings of observational outcomes.
• It must provide a formal mechanism -operationally, a semantics -that enables encoded observational outcomes to be meaningfully compared.
• It must provide a mechanism that supplies the free energy required to support irreversible classical encoding.
A theory of measurement must, in other words, enable saying what it means, both operationally and thermodynamically, to claim to have measured a length of 0.300 ± 0.001 m by interacting with a wooden board via a meter stick, and then to have written down the result on a piece of paper.
The GHP enables addressing these questions precisely by localizing available classical data, available recordable memory, and available free energy to the single boundary $\mathcal{B}$. This co-localization of informational and thermodynamic resources to $\mathcal{B}$ has three immediate consequences:

• A proper subset or sector F of the bits encoded on $\mathcal{B}$ is accessible to an "observer" A only as free energy that can be employed to fund processing other inputs or writing data to memory.
• Any observational outcomes recorded by A must be written on $\mathcal{B}$, and are therefore exposed to the "world" B.
• Obtaining new observational outcomes and recording previous ones compete for free-energy resources, with the measurement resolution, and hence the allocated number of bits, as the tradeoff parameter.
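A toy accounting of this resolution/memory tradeoff (all names and numbers are hypothetical; the only physics used is the Landauer cost per bit):

```python
import math

K_B, T = 1.380649e-23, 300.0          # Boltzmann constant (J/K), temperature (K)
LANDAUER = K_B * T * math.log(2)      # J per irreversibly encoded bit

def recordable_bits(free_energy_joules: float, resolution_bits: int) -> int:
    """Memory bits an observer can write per cycle after paying the Landauer
    cost of irreversibly encoding one outcome at the given resolution.
    Purely illustrative bookkeeping."""
    remaining = free_energy_joules - resolution_bits * LANDAUER
    return max(int(round(remaining / LANDAUER)), 0)

budget = 100 * LANDAUER   # a free-energy sector F worth 100 bits per cycle
low_res = recordable_bits(budget, resolution_bits=20)    # 80 bits of memory
high_res = recordable_bits(budget, resolution_bits=60)   # only 40 bits left
```

The qualitative point survives any choice of numbers: with a fixed free-energy sector, higher-resolution outcomes leave strictly fewer bits available for records.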
The GHP requires, in other words, that "observers" be treated as physical systems subject to resource constraints. The symmetry between state preparation and measurement required by Eq. (3), moreover, renders $\mathcal{B}$ informationally symmetric: A and B both have access to exactly the same N qubits. The labels "observer" and "world" are, therefore, for convenience only; the two parties to the interaction have exactly the same roles as physical systems. This has a further important consequence: the classical idea of "passive observation" is ruled out in principle. Obtaining information from B requires acting irreversibly on $\mathcal{B}$, expending free energy in the process. As Wheeler [39] put it, "No question? No answer!"

Measurements are given meaning by quantum reference frames
Observational outcomes are rendered comparable, and hence physically meaningful, through the use of reference frames (RFs).Outcomes of length measurements, for example, are rendered comparable and hence meaningful by being assigned units, meters, that refer to the standardized definition of a meter, and via this to the intrinsically-spatial concept of the speed of light.Operationally, any RF attaches units of measurement, and hence a semantics, to an observational outcome.Outside of metrology or the laboratory, the physical implementations of RFs are often neglected in classical physics.When considered in the context of quantum theory, any RF must be physically implemented by a quantum system and therefore must be considered a quantum RF (QRF) [40,41].Meter sticks, clocks, even the Earth's gravitational and magnetic fields are physically-implemented RFs and hence are QRFs, as are all items of laboratory apparatus.
Let Q be a QRF with an internal dynamics $H_Q$. As emphasized by [41], by virtue of being a quantum system Q encodes "nonfungible" information, i.e. information that cannot be written as a finite bit string. This nonfungible information can be thought of as the quantum phase information encoded by an instantaneous pure state $|Q\rangle$. The existence of this nonfungible information indeed follows from the GHP: the information about Q obtainable by any external observer A is limited to the eigenvalue of $H_{QA}$ finitely encoded on their mutual boundary. This finite encoding does not, in principle, fully specify $H_Q$.
Operationally, any QRF Q is a physical implementation of a quantum computation that reversibly maps between "raw" data representable as n-bit strings and meaningful observational outcomes representable as m-bit strings, with the forward mapping implementing measurement and the reverse mapping implementing preparation. As developed in previous work [12,13] and proven in the general case in [16], any such computation can be given a category-theoretic representation as a Cone-Cocone Diagram (CCCD, diagram (6)), in which the nodes $A_i$ are Barwise-Seligman [44] classifiers. The node C is a Barwise-Seligman classifier that is both the colimit of the incoming arrows $f_j$ and the limit of the outgoing arrows $h_j$, and all arrows are morphisms ("infomorphisms") between such classifiers [45]. A Barwise-Seligman classifier A implements a satisfaction relation $\models_A$ between "tokens" and "types" in some language. Letting the tokens be bit strings in $\{0,1\}^n$ and the types be bit strings in $\{0,1\}^m$, we can consider $\models_A$ to be given by an $n \times m$ real matrix $P_{ij}$, where each element $p_{ij}$ represents the probability that the i-th token belongs to the j-th type [13,16]; when all $p_{ij} \in \{0,1\}$, binary classifiers as originally defined in [44] are recovered. Letting A and B be classifiers with tokens in Tok(A) and Tok(B) and types in Typ(A) and Typ(B), respectively, an infomorphism is a pair of maps $\overrightarrow{f}: \mathrm{Typ}(A) \to \mathrm{Typ}(B)$ and $\overleftarrow{f}: \mathrm{Tok}(B) \to \mathrm{Tok}(A)$ that commute with the satisfaction relations, i.e. such that $\overleftarrow{f}(b) \models_A \alpha$ if and only if $b \models_B \overrightarrow{f}(\alpha)$. Infomorphisms thus provide informational 'semantic coherence' between classifiers, and remain amenable as such when the local logics of a (regular) theory are taken into account to create logic infomorphisms [44, §12] (reviewed in [45]). Commutativity of CCCDs, i.e.
diagrams of the form (6), is guaranteed by the definition of C as both a limit and a colimit of infomorphisms to and from the $A_i$. Such diagrams can be arbitrarily elaborated by the addition of intermediate "layers" of classifiers with appropriate incoming and outgoing infomorphisms, provided this commutativity condition is respected [17,51]. CCCDs are naturally interpreted as automorphisms $\{0,1\}^k \to \{0,1\}^k$ implemented by passage through a constraint network having the classifier C as its apex; they can be interpreted as implementing variational auto-encoders (VAEs) or arbitrary Bayesian networks as discussed in [17,51]. More generally, they can be taken to represent most types of functional (directed) graph networks along with their underlying quiver representations [50] as applied in [16].
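The infomorphism condition for binary classifiers can be checked mechanically. The sketch below invents two small classifiers (a parity classifier on 2-bit tokens, and one on 3-bit tokens that ignores the last bit) and verifies f<-(b) |=_A t iff b |=_B f->(t); everything here is our illustration, not a construction from [44].

```python
# Binary Barwise-Seligman classifiers as satisfaction predicates.
# Classifier A: 2-bit tokens, types "even"/"odd" (parity of the bits).
# Classifier B: 3-bit tokens, classified by the parity of their first two bits.

tok_A = ['00', '01', '10', '11']
tok_B = [a + b for a in tok_A for b in '01']
types = ['even', 'odd']

def sat_A(tok, typ):          # t |=_A "even" iff its 1-count is even
    return ('even' if tok.count('1') % 2 == 0 else 'odd') == typ

def sat_B(tok, typ):          # b |=_B typ iff its first two bits do in A
    return sat_A(tok[:2], typ)

f_up = lambda typ: typ        # f->: Typ(A) -> Typ(B), identity on types
f_down = lambda tok: tok[:2]  # f<-: Tok(B) -> Tok(A), drop the last bit

# Infomorphism condition: f<-(b) |=_A t  iff  b |=_B f->(t), for all b, t.
ok = all(sat_A(f_down(b), t) == sat_B(b, f_up(t))
         for b in tok_B for t in types)
```

Here B "refines" A's tokens while preserving A's typing, which is exactly the semantic-coherence role infomorphisms play between layers of a CCCD.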
Non-commutativity of CCCDs, which occurs, for instance, when C is undefinable for a hierarchical Bayesian network, is a separate and compelling issue: it affords criteria for intrinsic or quantum contextuality, as formulated by the results of [51, §7] and [16, §7.2] (cf. [52,53,54]). Essentially, such criteria involve the non-existence of any "globally" definable (conditional) probability distribution across all possible observations. There is much to be said about this issue (to be amplified elsewhere) in light of the non-commutativity results, which are closely tied to the development of the GHP and QRF formalism as presented here. For now, let us briefly comment on its relevance: non-locality in QFT is a special case of quantum contextuality (see e.g. [55,33,56] for summaries of the Bell-Kochen-Specker theorems). In the quest to design robust, fault-tolerant (FT), massive-scale quantum computers, quantum contextuality turns out to be an essential resource for quantum speed-up, encompassing such powerful techniques as magic state distillation (MSD) [56] and the related quantum computation by state injection (QCSI), as established for qubits relative to measurement-based quantum computation [57].
The "raw data" available to any QRF Q implemented by an observer A are the eigenvalues +1 or -1 returned by some subset of the operators $M_i^A$. We can, therefore, represent any such Q as a CCCD "attached" to the boundary $\mathcal{B}$ as shown in Fig. 3. The measured bits are prepared by the action of the "world" B's corresponding operators $M_i^B$; the CCCD acts back on the measured bits to prepare them for subsequent measurement by B, preserving the symmetric, cyclic interaction required by Eq. (3). The classifier $A_i$ accepting an input of +1 or -1 from the measurement operator $M_i^A$ can be defined to execute any function $\phi: \{1, -1\} \to [0, 1]$, i.e. to assign any probability value. Operationally, therefore, the classifier $A_i$ acts as the local z axis with respect to which the qubit $q_i$ on which $M_i^A$ acts is measured or prepared. This is in fact obvious: the local z axis must itself be a physically-implemented one-bit QRF. Choosing the local z axis is, as discussed earlier, equivalent to choosing the basis in which the $M_i^A$ are expressed. Free choice of basis for the $M_i^A$ implies, therefore, free choice of QRFs; a QRF acting on the outputs of a subset of operators $M_i^A \ldots M_k^A$ effectively sets the local basis for these operators. Free choice of QRFs enables observers to treat the data encoded on different components of their boundaries differently, and hence to distinguish "systems of interest" from their surrounding environments. Formally, implementing a QRF Q breaks the $S_N$ swap symmetry of $\mathcal{B}$ by assigning the qubits measured by the $M_i^A \ldots M_k^A$ the functional role of "inputs" to Q. Free choice moreover entails, consistent with the discussion in §2.3 above, that A and B can choose different QRFs, and hence both process and prepare states of $\mathcal{B}$ in different ways. Sharing QRFs across $\mathcal{B}$ induces entanglement [13,17], a point to which we will return in §4.2 below.
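The role of the local z axis can be illustrated with a textbook Born-rule calculation (not specific to this framework): if B prepares a qubit "up" along its own axis and A measures along an axis tilted by an angle theta, A reads +1 with probability cos^2(theta/2); only a shared basis (theta = 0) gives deterministic classical readout.

```python
import math

def p_plus(theta: float) -> float:
    """Born-rule probability that a measurement along an axis tilted by
    `theta` from the preparation axis returns the outcome +1."""
    return math.cos(theta / 2) ** 2

aligned = p_plus(0.0)            # shared basis: deterministic, p = 1
tilted = p_plus(math.pi / 2)     # axis tilted by pi/2: p = 1/2, pure noise
flipped = p_plus(math.pi)        # anti-aligned axis: p = 0
```

Mismatched bases thus degrade, and in the worst case destroy, the classical channel implemented by each qubit, which is why basis (QRF) choice is operationally significant.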

Observable systems and their pointer states are defined by QRFs
Consider a simple and canonical case of measurement: at some time $t_i$, a human observer A measures the time-dependent pointer state $|P\rangle$, or at lower resolution a pointer-state density $\rho_P$, of some system S of interest, e.g. an item of laboratory apparatus. For S to be measurable, it must be part of A's observable "environment" E. Prior to measuring $|P\rangle$, the observer must identify S, distinguishing it from the rest of E, including other items of apparatus that are not S as well as the rest of the laboratory and beyond. This identification process cannot depend on the state $|P\rangle$, which is of interest as a measurement target precisely because it is both unknown and time-dependent. Identification instead depends on some "reference" degrees of freedom of S, e.g. its size, shape, color, labeling, or location, that are time-invariant, i.e. that maintain some constant state $|R\rangle$ or state density $\rho_R$. Figure 4 illustrates this commonplace scenario. The data specifying the state $|E\rangle$ of the undifferentiated environment, and the state $|S\rangle = |R\rangle|P\rangle$ of the identified system of interest to A, can only be proper subsets or sectors of the data encoded on $\mathcal{B}$ and measured by the $M_i^A$. These sectors must, moreover, be disjoint from the thermodynamic sector F from which A extracts sufficient free energy to support any classical information processing. We can, therefore, identify proper subsets of operators $M_i^E$, $M_i^R$, and $M_i^P$, dropping the superscript A to simplify the notation. The outputs of these subsets of operators are processed by QRFs that can be labeled E, R, and P without ambiguity; the sectors specifying the states $|E\rangle$, $|R\rangle$, and $|P\rangle$ are simply the domains of the operators $M_i^E$, $M_i^R$, and $M_i^P$, respectively, and hence of these QRFs. The fact that S is part of E requires that $\{M_i^R\}, \{M_i^P\} \subset \{M_i^E\}$ and, therefore, that R and P are proper components of E.
Co-measurability of R and P requires that R and P are decoherent both from each other and from the remainder of E [11,12,13].
Developing a model of the behavior of P that enables predictions of future states requires, at minimum, recording measurements of $|P\rangle$ taken at multiple times to some classical memory. Typically at least some of the ambient background conditions encoded in $|E\rangle$ as a whole are also recorded. Like any physical system with which an observer interacts, a memory Y must be identified before being written to or read from. The considerations adduced above for any system S thus apply equally to any memory Y. The basic elements of a quantum theory of measurement can, therefore, be depicted as in Fig. 5. They include not only the QRFs discussed here and the thermodynamic flows that power them, but also a time QRF that provides a measure of duration between writes to memory and hence an effective time stamp. This timekeeping system will be discussed further in §4 below. The "picture" of measurement illustrated in Fig. 5 differs in significant ways from that introduced by von Neumann [25] and reproduced in most textbooks. Most obviously, it treats the observer as a physical system with a particular functional architecture, not as an abstraction. It enforces, via the GHP, the requirement of separability between the observer and the world being observed; without this, the observer lacks a conditionally-independent state and the idea of "measurement" becomes meaningless. The GHP also restricts the observer's access to the "bulk" degrees of freedom of the world: the operators $M_i^A$ act not on the Hilbert space $\mathcal{H}_B$ of B, but on the much smaller effective Hilbert space $\mathcal{H}^A_{\mathcal{B}}$ of the boundary $\mathcal{B}$. This renders all observed "systems" observer-relative and hence "personal" in the sense emphasized by QBists [32,33]. Observed "systems" here include memory devices and, significantly, other observers; hence the theory of measurement requires a physical theory of classical communication that has yet to be fully developed [13,17]. Perhaps most subtly, Fig. 5 involves no assumption of a background spatial embedding. It treats 3d space as a QRF that A may or may not be capable of deploying. Hence Fig. 5 is consistent with approaches to QG in which spacetime is fully emergent from underlying informational or other processes. The field theory that naturally follows from Fig. 5 is, therefore, a topological quantum field theory (TQFT), not a QFT on a background spacetime.

Sequential measurements induce TQFTs
We have shown in [16] that, given the GHP, sequential measurements of any sector S of a holographic screen B induce a TQFT on S. We also show how this TQFT can be realized as a topological quantum neural network (TQNN), a generalized representation of a standard deep-learning system [58]. Here we briefly summarize the main result and mention some of its consequences, referring readers to [16] for details.
A TQFT can be represented as a functor from the category of cobordisms to the category of Hilbert spaces [59,60]. We prove in [16] that any QRF can be represented as a CCCD, and then construct a category with CCCDs as objects and, as morphisms, morphisms of CCCDs, which must by definition respect the commutativity of CCCDs as diagrams. The category is, effectively, a category of QRFs, in which the morphisms represent sequential choices of QRF to be applied to the data encoded on some sector S. We show that all such choices can be represented by one of two diagrams. Using the compact notation of Eq. (8) to represent a QRF S, we can represent measurements of a physical situation in which one system divides into two, possibly entangled, systems; parametric down-conversion of a photon exemplifies this kind of process. The reverse process can be added to yield Diagram (10), which represents a relabeling of subsets of the base-level classifiers that act on the sector S. In the second type of sequential measurement process, the pointer-state QRF P is replaced with an alternative QRF Q with which it does not commute; sequences in which position and momentum, or s_z and s_x, are measured alternately are examples. These can be represented by Diagram (12). Again this can be written as a relabeling of classifiers, leaving implicit the pointer-state classifiers that are traced over when measuring only the reference component R for system identification; here the notation Ã_l indicates that A_l has been rewritten in a rotated measurement basis, e.g. s_z → s_x or x → p = m(∂x/∂t). As both P and Q must commute with R, the commutativity requirements for S are satisfied.
Measurement sequences of the form of Diagram (10) can be mapped to cobordisms of one form, while sequences of the form of Diagram (12) can be mapped to cobordisms of another; these mappings are depicted in Diagrams (14) and (15) respectively. In either case, F : CCCD → Cob is the required functor from the category CCCD of CCCDs to the category Cob of finite cobordisms. In general, we can state:

Theorem 1 ([16] Thm. 1). For any morphism F of CCCDs in CCCD, there is a cobordism S such that a diagram of the form of Diagram (14) or (15) commutes.

We refer readers to [16] for the proof.
Theorem 1 has a number of immediate consequences, chief among which is that any effective field theory (EFT) defined on S must be gauge invariant [14,16]. The GHP, therefore, not only generates a default physical theory, a TQFT, of any observable system, but strongly restricts any geometrization of that theory. Indeed the results obtained in [16] strongly suggest that observable spatial geometry, including the Minkowski metric and the Einstein equations of GR, is induced by symmetries of the QRFs employed to identify observable systems as such over time. If this proves to be the case, it will reconceptualize "space" as a quantum informational structure even at macroscopic scales.

4 The GHP provides local definitions of entropy and time

Operations on B implement Wick rotations
The GHP generalizes the Bekenstein area law [2] to a statement applicable to any boundary B implementing an interaction H_AB between finite, separable systems, i.e. an interaction that can be written as Eq. (3): a boundary B implementing an interaction H_AB between finite, separable systems has thermodynamic entropy S(B) = N, where N is the dimension of H_AB.
Geometrization of B then requires N ≤ A(B)/4 as discussed in §2.1 above. S(B) is thus conceptually an entropy as defined by Shannon [61]: the width of a classical communication channel.
Unlike the boundary entropy S(B), the thermodynamic entropy S(B) of the physical system B is neither specified nor restricted by H_AB. From A's perspective, however, B is a source of both usable free energy and classical information, as illustrated in Fig. 5. Relative to A, therefore, S(B) cannot decrease, i.e. B cannot become a free-energy or classical-information sink. The 2nd Law thus holds for A, independently of either H_B or of any details of A's predictive models, if any, of B's behavior. The informational symmetry of B guarantees that the same is true for B: relative to B, S(A) cannot decrease. This observer-relativity of thermodynamic entropy has previously been emphasized by Tegmark [62].
As discussed in §3.3 above, the idea of sequential measurement, and hence the idea of recordable time, is only physically meaningful for observers able to write data irreversibly to a classical memory. The action of writing to a memory sector Y defines an A-specific, local time QRF t_A, as illustrated in Fig. 5. The most natural unit of t_A is the minimal time needed to write one bit, to which time-energy complementarity gives a minimum value of h/[ln 2 (k_B T_A)], with h Planck's constant. The bit-counting process can be implemented by an operator G_ij that advances t_A by one unit, i → j; formally, G_ij is a groupoid element [12,13]. The rate at which A's "clock" G_ij "ticks" is determined by A's thermodynamic efficiency.
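As a numerical check of this unit, the minimal bit-write time h/[ln 2 (k_B T_A)] can be evaluated directly. The sketch below (with an illustrative temperature, not a value from the text) computes it at room temperature.

```python
# Minimal sketch: the minimal time to write one bit at temperature T_A,
# from time-energy complementarity: tau = h / (ln 2 * k_B * T_A).
import math

H = 6.62607015e-34   # Planck's constant, J s (exact SI value)
K_B = 1.380649e-23   # Boltzmann's constant, J / K (exact SI value)

def min_bit_write_time(temperature_kelvin: float) -> float:
    """Minimal duration (seconds) of one irreversible bit write at temperature T."""
    return H / (math.log(2) * K_B * temperature_kelvin)

tau = min_bit_write_time(300.0)  # room temperature: on the order of 1e-13 s
```

At T_A ≈ 300 K this gives a tick length of roughly 10⁻¹³ s, which sets the fastest rate at which the clock G_ij can advance for a thermodynamically ideal observer.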
The local time QRF t_A is clearly entropic: it counts recordable information received from B, and hence increments as the (A-relative) S(B) increments (see [63] for a similar account of entropic time). It is thus natural to interpret the measured time t_A as "flowing" with the passage of information from B to A. The informational symmetry of B allows us to represent t_B in the same way, as illustrated in Fig. 6. We can, therefore, see the GHP as giving a physical meaning to the Wick rotation [64], namely to the prescription that "inverse temperature is imaginary time": a measurement operation performed by A on a qubit received from B induces the "collapse" of the qubit into a certain eigenstate ε, namely |q(t)⟩ → e^{−iεt}|ε⟩ ∼ |ε⟩, where we can write |ε⟩ as a pure state because its time-phase dependence is not observable. The pure state hence recovered by the QRF of A represents the element of classical information that is processed thermodynamically on B, and hence subjected to a thermodynamic distribution e^{−ε/(k_B T)}. Encoding of information can therefore be seen as a Wick rotation iεt → ε/(k_B T). A further such process of "reading" performed by B can be understood as a backward evolution in time of the qubit before irreversible encoding happens, or as an evolution of the missing (virtual, because irreversible encoding has happened) qubit of energy −ε. This virtual evolution of a "hole"-like qubit would in turn correspond to a second Wick rotation, with the same axis of rotation, reverting the axis of time. In other words, each operation on a qubit of B rotates the local time vector by ı⃗, so a combined cross-B write-read operation in either the B-to-A or A-to-B direction implements ı⃗ twice and reverses the local time direction. Energy-momenta and angular momenta are not conserved during the encoding process: the energetic cost of the rotation is instead understood in terms of an irreversible bit encoding, i.e. at least ln 2 (k_B T_A).
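The correspondence just described can be summarized in a short derivation (a minimal sketch, with ℏ set to 1 and β = 1/(k_B T)):

```latex
\begin{align}
  |q(t)\rangle \;&\to\; e^{-i\varepsilon t}\,|\varepsilon\rangle ,\\
  i\varepsilon t \;\to\; \frac{\varepsilon}{k_B T}
  \quad &\Longrightarrow\quad
  e^{-i\varepsilon t} \;\to\; e^{-\varepsilon/(k_B T)} ,
\end{align}
```

so the unitary phase factor of the measured eigenstate becomes the Boltzmann weight of the thermodynamic distribution, which is the content of the prescription "inverse temperature is imaginary time."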
Figure 6 shows that any system A satisfying the separability conditions required by the GHP can be viewed as interacting with its own future. Hence any decomposition U = AB that respects separability can be viewed as decomposing H_U along a temporal boundary.
When the boundary B is given the geometry shown in Fig. 2 and hence required to respect covariance, this temporal decomposition has an obvious interpretation: any separable system interacts exclusively with its own future light cone. The implementation of Wick rotation by actions on B is thus intimately tied to the role of gauge bosons as information carriers, and hence to the Minkowski metric as a representation of the time dependence of information flow. Let ρ_A = tr_B |ψ⟩⟨ψ| be the reduced density matrix of A obtained by taking a partial trace over B of the total density matrix ρ = |ψ⟩⟨ψ| of the joint system AB. Recall that the (von Neumann) entanglement entropy S(A) of A in the bipartite decomposition AB is given by S(A) = −tr(ρ_A ln ρ_A) (see e.g. [24,65]).
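As a concrete illustration of this definition (a hedged sketch using NumPy, with entropy reported in bits rather than nats; the states are illustrative, not from the text):

```python
# Hedged illustration: the von Neumann entanglement entropy S(A) of a pure
# bipartite state |psi> of AB, computed from the reduced density matrix rho_A.
import numpy as np

def entanglement_entropy(psi: np.ndarray, dim_a: int, dim_b: int) -> float:
    """S(A) in bits for a normalized pure state |psi> of the joint system AB."""
    m = psi.reshape(dim_a, dim_b)   # coefficient matrix c_{ij} of |psi>
    rho_a = m @ m.conj().T          # rho_A = tr_B |psi><psi|
    evals = np.linalg.eigvalsh(rho_a)
    evals = evals[evals > 1e-12]    # drop zeros, using the convention 0 log 0 = 0
    return float(-np.sum(evals * np.log2(evals)))

bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)  # (|00> + |11>)/sqrt(2)
product = np.array([1.0, 0.0, 0.0, 0.0])            # |00>, separable
```

For the Bell state, ρ_A = I/2 and S(A) takes its maximal value of 1 bit; for the product state, S(A) = 0, matching the separability criterion used throughout.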

QRF sharing induces entanglement
If the joint state |AB⟩ is no longer separable, the entanglement entropy S(AB) is nonzero.
In the situation described above, we can localize this entanglement entropy to the particular sector dom(Y_A) = dom(Y_B) of the decompositional boundary B; in this case its maximum value is the dimension of this sector. Requiring that Y_A and Y_B compute the same function ψ, that dom(Y_A) = dom(Y_B), and that t_A and t_B have equal periods is, effectively, requiring that Y_A and Y_B are the same QRF. We have previously shown [13] that QRF replication across a boundary B is forbidden by the no-cloning theorem [74]. Briefly, the information on B does not determine |A⟩, so it is insufficient for B to replicate |A⟩, and hence insufficient for B to replicate any QRF state |Q⟩ that is a component of |A⟩ [10,13]. No cloning of unknown states is, in this and indeed in any case, a straightforward consequence of the nonfungibility of quantum information.
The above discussion exemplifies this: the assumptions that dom(Y A ) = dom(Y B ) and that t A and t B have equal periods can be made only as a priori stipulations, as neither of these conditions can be inferred from the data encoded on B. The GHP, therefore, provides a mechanism that enforces no-cloning by restricting the transfer of information between A and B to the information encoded on B.
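The linear-algebraic core of the no-cloning theorem invoked here can be stated in a few lines (a standard textbook derivation, included as a sketch):

```latex
\begin{align}
  &\text{Suppose a unitary } U \text{ cloned arbitrary states: }
  U\,|\psi\rangle|0\rangle = |\psi\rangle|\psi\rangle, \qquad
  U\,|\phi\rangle|0\rangle = |\phi\rangle|\phi\rangle .\\
  &\text{Unitarity preserves inner products, so }
  \langle\phi|\psi\rangle = \langle\phi|\psi\rangle^{2}
  \;\Longrightarrow\; \langle\phi|\psi\rangle \in \{0,1\} .
\end{align}
```

Only mutually orthogonal, i.e. fully known, classical-like, states can be copied; an unknown QRF state |Q⟩ cannot.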
These results, together with those of the previous sections, provide a novel characterization of some standard concepts, including AdS/CFT as developed by [8] and others, while working throughout with both the bulk distribution and the boundary degrees of freedom specified in terms of binary qubits. There are a number of related results. For instance, Ryu and Takayanagi [65] start from the AdS/CFT correspondence to develop an HP-motivated derivation of entanglement entropy in (d+1)-dimensional CFT as obtainable from the area of a d-dimensional minimal surface Σ in AdS_{d+2}, a result analogous to the Bekenstein-Hawking formula for BH entropy. Related is the proposal of Lee [68] that entanglement arises from the HP, insofar as all the information specifying the bulk physics can be described in terms of qubits on the holographic boundary; this result appears consonant with the ideas developed above. An interesting direction of research hinges on substituting the AdS bulk with either a TQFT or an extended version of one, for instance an extended BF theory. An attempt at an exact holographic mapping, with emergent space-time geometry recovered along the lines of [69], has been investigated in [70], where a relation between loop quantum gravity and tensor networks has been explored, accounting for bulk-boundary duality and holographic entanglement entropy. From a genuine TQFT perspective, it would be tempting to analyze the connection between AdS bulk and BF theories, with holographic boundaries provided by Chern-Simons theories with punctures.
In lower dimensions, interesting results have been reported in 2d spaces with holographic boundaries involving the SYK model [71,72]. More generally, Wootters [73] reviews entanglement of pure states (informationally reversible), entanglement of mixed states (irreversible, since in creating a mixed state from a pure state some information is forsaken), and entanglement of formation, which is intimately tied to the notion of concurrence, for which explicit formulas are available.

Areal elements are scattering centers
Replacing curved arrows with angled arrows and an explicit qubit array with B, we can re-draw Fig. 1 as Fig. 7. Viewing the arrows as depicting information flow as before, the areal elements of B function as scattering centers. Inspecting this scattering diagram, two things are immediately apparent:
• From either A's or B's perspective, transmitting information across B is indistinguishable from scattering information off B.
• Comparing Fig. 6 and Fig. 7, scattering information off B reverses its temporal direction with respect to the local time QRF t A or t B .
The first of these points enforces the informational symmetry of B and hence enforces unitarity. The second requires any "carrier" of information to be its own antiparticle. Following this idea of information as subject to scattering, and representing the total initial and final informational states of k = A or B as |out⟩_k and |in⟩_k respectively, we can write an S-matrix in which the reversal of the usual order reflects the fact that information always flows from a preparation device to a measurement device. This S_k is N × N; Figs. 1 and 7 depict it in bases in which it is diagonal. As S_k is simply a representation of Eq. (3) for system k, we have the expected conclusion: Every interaction between separable systems can be represented as scattering.
This conclusion rests solely on the GHP, requiring no assumptions about an embedding geometry.
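The structural claim here, that S_k is an N × N unitary that is diagonal in a suitable basis with pure-phase eigenvalues, can be checked numerically (a hedged sketch with a randomly generated unitary; the matrix is illustrative, not derived from Eq. (3)):

```python
# Hedged sketch: any N x N unitary S-matrix satisfies S^dagger S = I and is
# diagonal in its eigenbasis, with eigenvalues on the unit circle.
import numpy as np

rng = np.random.default_rng(0)
N = 4

# A random unitary from the QR decomposition of a complex Gaussian matrix.
z = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
s, _ = np.linalg.qr(z)

assert np.allclose(s.conj().T @ s, np.eye(N))   # unitarity
phases, basis = np.linalg.eig(s)
assert np.allclose(np.abs(phases), 1.0)          # eigenvalues are phases e^{i theta}
# In its eigenbasis, S acts diagonally:
assert np.allclose(basis.conj().T @ s @ basis, np.diag(phases), atol=1e-8)
```

Unitarity is exactly the informational symmetry of B noted above: no information is created or destroyed in crossing (or scattering off) the boundary, only redistributed among the N boundary degrees of freedom.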
Adding the geometry shown in Fig. 2 to B allows B to be viewed as a "physical" horizon, e.g. the (stretched) horizon of a BH, the motivating system of interest for both the Bekenstein area law and the HP. Figure 7 then represents traversal of or scattering from the horizon as observed by either the BH interior B or the external "rest of the universe" A. Coupled pair production events near the horizon yield symmetric diagrams of this kind [13]; hence Fig. 7 is consistent with the observation of Hawking radiation by an asymptotic observer "embedded" in A. That the formation and evaporation of a BH could be considered a scattering process was originally proposed by 't Hooft [75] and has since been given an explicit formulation [76].
It is important to emphasize that the GHP requires, and Fig. 7 represents, a fixed decomposition U = AB for which |AB⟩ = |A⟩|B⟩ and hence fixed Hilbert-space dimensions for A and B. It does not, therefore, represent net transfers of degrees of freedom across B. "Collapse" or "infall" processes and, dually, "evaporation" processes alter the interior geometry of B but not its Hilbert-space dimension. The information encoded on B can be altered by these processes, but the dimension N and the topology of B remain invariant. Such topologically-invariant models of BH evolution have been developed previously, based on the proposed implementation of entanglement by Einstein-Rosen bridges [77,78].

HP-like principles appear in multiple disciplines
We have thus far considered the physical meaning of the HP, as generalized to the GHP, from the perspective of quantum information theory. We now broaden this perspective to consider principles analogous to the HP that have been formulated independently in other disciplines. Such principles have received widespread application, suggesting that the HP is in fact a general principle not just of quantum theory, but of all of science.

Markov blankets and the FEP
The idea of a Markov blanket (MB) was formulated by Pearl [79] to capture the emergent conditional independence of disjoint components of finite causal networks with Markovian dynamics, achieved by judiciously defining the boundaries of the systems in question. The MB of any node or subnetwork X of such a causal network comprises all nodes that are parents of X (i.e. nodes with arrows to X), children of X (i.e. nodes with arrows from X), or other parents of X's children, as shown in Fig. 8. All information exchanged between the node or subnetwork X and the nodes exterior to its MB must traverse the MB; the MB thus functions as a finite classical information channel between X and its external "environment" E. The separation between X and E imposed by the MB renders them mutually conditionally independent.3
Figure 8: a) The MB of a node X in a causal network comprises the parents and children of X, together with any other parents of X's children. b) The MB is effectively an information channel separating X from its environment E. From [81], Fig. 2; used with permission.
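Pearl's definition can be made concrete in a few lines (a minimal sketch; the graph and node names are illustrative, not from the text):

```python
# Minimal sketch: the Markov blanket of X in a DAG is the union of parents(X),
# children(X), and the other parents of X's children.
# The graph is represented as a dict mapping each node to the set of its parents.
def markov_blanket(parents: dict, x: str) -> set:
    children = {n for n, ps in parents.items() if x in ps}
    blanket = set(parents.get(x, set())) | children
    for c in children:
        blanket |= parents[c] - {x}   # co-parents of X's children
    return blanket

# Toy graph a -> x -> c <- b:
g = {"x": {"a"}, "c": {"x", "b"}}
```

For this toy graph, the blanket of x is {a, c, b}: its parent, its child, and its child's other parent. Every path of influence between x and the rest of the network passes through these nodes, which is exactly the channel role the text assigns to the MB.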
Friston [21] has shown that any random dynamical system that has a non-equilibrium steady state (NESS) solution to its density dynamics i) has an internal dynamics that is conditionally independent of the dynamics of its environment, and hence has a MB, and ii) will continuously "self-evidence" by returning its state to (the vicinity of) its NESS. Satisfying these conditions is required of any system that is observable as such over time, i.e. any system for which sequential measurements, as considered in §3.4 above, are possible. Such systems can be described as minimizing a variational free energy (VFE) functional that effectively measures their uncertainty about their environment's future behavior. The free energy principle (FEP) is the statement that any random dynamical system meeting the above two conditions, i.e. any system for which sequential measurements are possible, will behave in a way that asymptotically minimizes its detected VFE. This way of characterizing random dynamical systems gives rise to a "Bayesian mechanics" [22] that reformulates classical physics in decision-theoretic language within a scale-free computational architecture that is applicable, in principle, from the molecular and cellular levels up to the cosmological. In such a model, aptly described as Bayesian selection, natural selection itself can be viewed as structure learning based upon the model evidence encoded by some phenotype [80].
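A discrete toy version of the VFE functional (a hedged sketch; the prior and likelihood numbers are illustrative) shows its key property: the exact Bayesian posterior minimizes F, and the minimum equals the negative log model evidence, i.e. the system's "surprisal" about its observation:

```python
# Hedged sketch: discrete variational free energy
#   F(q) = E_q[ln q(s) - ln p(o, s)] = -ln p(o) + KL(q || p(s|o)),
# minimized when the variational density q equals the posterior p(s|o).
import numpy as np

p_s = np.array([0.7, 0.3])           # prior over hidden states s (illustrative)
p_o_given_s = np.array([0.9, 0.2])   # likelihood of the observed o under each s

p_joint = p_s * p_o_given_s          # p(o, s)
evidence = p_joint.sum()             # p(o), the model evidence
posterior = p_joint / evidence       # p(s|o)

def vfe(q: np.ndarray) -> float:
    """Variational free energy of a candidate density q over hidden states."""
    return float(np.sum(q * (np.log(q) - np.log(p_joint))))

f_post = vfe(posterior)              # equals -ln p(o), the minimum
f_other = vfe(np.array([0.5, 0.5]))  # any other q gives a larger F
```

The decomposition F = −ln p(o) + KL(q ‖ p(s|o)) makes the "self-evidencing" reading explicit: driving F down is the same as maximizing model evidence while tightening the posterior approximation.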
Using the tools reviewed in §3 above, we have shown [17] that the FEP can be reformulated for generic, finite quantum systems meeting the separability required by Eq. (3). The MB is, in this case, implemented by a holographic screen B compliant with the GHP. Strict minimization of VFE drives systems to share QRFs across B, i.e. drives them to entanglement as discussed in §4.2 above. The FEP is, therefore, asymptotically a restatement of the Axiom of Unitarity.

Multiple realizability and virtual machines
The foundational principle of computer science is the Church-Turing (CT) thesis, which states that any computable function can be computed by the λ-calculus [82] or a Turing machine [83]. While the CT thesis is often regarded simply as establishing two universal models of computation, at a deeper level it states the multiple realizability of computation: any process that emulates, or can be emulated by, the λ-calculus or a Turing machine can be considered a computational process. The CT thesis thus underlies a definition of computation in terms of emulation: any process that can be (usefully) interpreted as a computation is a computation [84].
As emphasized in [84], interpreting a physical process as a computation relies on finite-resolution observations of some finite number of sequential states. A process useful as a computer must, moreover, allow manipulations that return it to some (quasi-) stable state from which it can be perturbed into a set of distinct "input" states. Whether an arbitrary such system will reach a (quasi-) stable state and hence "halt" after a finite number of computational steps from some finite input cannot be determined algorithmically [83]; this is the Halting problem [85]. Whether a finite, step-by-step description of the observed behavior of an arbitrary system following any one or more of some circumscribed set of input perturbations specifies a computation of some nontrivial function is similarly algorithmically undecidable (Rice's theorem [86]).
The concept of a "black box" (BB) formulated in classical cybernetics provides an alternative statement of multiple realizability that does not depend explicitly on the theory of computation [87].A BB is similarly a physical system that permits finite numbers of finite-resolution perturbations and observations.The interior of the BB is considered to contain a "machine table" that determines the next output given the history of inputs.Finite sequences of perturbations and observations are insufficient to determine the machine table of an arbitrary BB (Moore's theorem [88]).
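Moore's theorem can be illustrated directly (a minimal sketch; the machines are illustrative, not from the text): two black boxes with different machine tables produce identical outputs on every input sequence tested, so no finite experiment distinguishes them.

```python
# Hedged illustration of Moore's theorem: two black boxes with different
# internal machine tables can be observationally identical, so finite
# perturbation/observation sequences cannot determine the table.
from itertools import product

def run(table: dict, start, inputs) -> list:
    """Drive a machine whose table maps (state, input) -> (next_state, output)."""
    state, outputs = start, []
    for i in inputs:
        state, out = table[(state, i)]
        outputs.append(out)
    return outputs

# Box 1 has one internal state; Box 2 has two, yet emits the same outputs.
box1 = {("s", 0): ("s", "lo"), ("s", 1): ("s", "hi")}
box2 = {("p", 0): ("q", "lo"), ("p", 1): ("q", "hi"),
        ("q", 0): ("p", "lo"), ("q", 1): ("p", "hi")}

# Check all binary input sequences up to length 5:
same = all(run(box1, "s", seq) == run(box2, "p", seq)
           for n in range(1, 6) for seq in product([0, 1], repeat=n))
```

Here the two tables differ (box2 toggles an internal state that box1 lacks), yet every observable input-output record coincides; the interior "machine table" is underdetermined by behavior, exactly as the BB formalism asserts.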
In practice, multiple realizability allows multiple distinct hardware architectures to compute the same functions, and multiple programming languages with widely differing syntax and semantics to have the same computational power.It enables layered computing architectures in which each layer treats the layers both above and below as virtual machines (VMs) that have specified, finite application programming interfaces (APIs) but are otherwise unconstrained in implementation [89].The top-level VM is the user interface, which allows the user finite manipulations and observations of the behavior of the underlying architecture while "hiding" implementation details.While the implementation details of practical computers are accessible in principle, reverse-engineering them from behavioral observations becomes increasingly difficult as the depth of architectural layering increases, and rapidly becomes intractable if components are distributed across a network that supports asynchronous communication.
As quantum systems, quantum computers encode nonfungible information, rendering the reverse-engineering problem unsolvable in principle. Quantum artificial neural networks (QNNs) generalize classical artificial neural networks, which are Turing equivalent. Conventional QNNs can be further generalized to topological QNNs (TQNNs), which are structured as spin networks, are tensor-network representations of TQFTs, and hence are fully compliant with the GHP as discussed in §3.4 above [16,58]. It is in view of such tensor networks, and of the development of several sections here in relation to the boundary B (as in e.g. §4.2), that further connections open up with the AdS/CFT correspondence, to be pursued in future work. An enticing question is: "is spacetime a quantum error-correcting code (QECC)?" (reviewed in [90]). Different slants on and interpretations of this question are discussed in [90]. For instance, the hypotheses of [91,92] involve space in the bulk as emerging from boundary systems that can realize the structure of a QECC; [93] suggests that the connectivity of spacetime in the bulk is related to the entanglement structure of the codespace of a boundary CFT subject to its QECC.

Active inference and interface theories
The primary application of the FEP has been to biological systems, where it underpins the idea of Bayesian "active inference," in which living systems increase their predictive power not just through learning, but also through active manipulations of their environments [18,19,20]. This principle of active inference, or curiosity-driven learning, is both scale-independent and applicable not just to motions or other actions in 3d space but also to actions in more abstract state spaces, e.g. those of the genome or the metabolic system [94,95]. Indeed, we have recently shown that the FEP drives the high fan-in, high fan-out "neuromorphic" organization of sensory and effector systems that is ubiquitous across biology at all scales [96].
The goal of any system employing active inference is to reduce VFE over the long term by learning to predict how its environment will act on its MB. Crucially, enacting self-organization necessitates the emergence of boundaries defining the separation of internal from external states [80] (and §6.1 here). Prediction is accomplished by a computational system that is a generative model, in the sense of the Good Regulator Theorem [97], of its environment as represented on its MB. Such models may incorporate "metacognitive" components that represent the system itself, again via the system's actions on its MB, which include memory writes as discussed in §3.3 above [98]. Active inference systems have, by definition, no direct access to their environments beyond their MBs. The MB acts, in this case, as a system-environment interface, in the sense of an API.
Interface theories of perception and action have also been developed independently of the FEP, particularly by Hoffman and colleagues [99,100,101], who also show explicitly that natural selection processes do not favor "veridical" perception beyond the interface [102,103,104]. In this theoretical setting, both spacetime and perceived "objects" are explicitly emergent from computational processes implemented by the perceiving agent, effectively its generative model [105].

Conclusion
We have shown here that the HP, particularly when generalized to the GHP, is not merely "an apparent law of physics that stands by itself" but rather a deep, foundational principle. It is a principle of restricted access. It exacts a non-negotiable price for separability: if two systems are separated by a boundary B, their access to each other is limited to the information B itself can encode. When stated in this way, the HP seems shockingly obvious. When its implications for our ordinary concepts of "objects" and "spacetime" are pointed out, however, it can seem deeply mysterious. The idea that classical information is decomposition-relative, and hence observer-relative, strongly challenges our pretheoretical sense of an "objective reality" shared by all physical systems. As Wheeler [106] points out, this challenge lies at the very heart of quantum theory.
What is perhaps most significant about the HP, however, is its emergence over the past century as a foundational principle not just of physics, but of all disciplines that directly address information transfer between separated systems.Its ubiquity speaks simultaneously to the fundamental unity of science and to its fundamental limitations as an empirical enterprise.

Figure 3 :
Figure 3: Attaching a CCCD to a subset of measurement operators M_{A_k}, …, M_{A_n} by identifying the binary eigenvalues of the M_{A_i} with binary inputs to the A_i. Only the incoming arrows are shown for simplicity; adding equivalent but reversed outgoing arrows completes the CCCD. The CCCD specifies a function computed by the internal dynamics H_A, i.e. a QRF deployed by A. Adapted from [13] Fig. 3; CC-BY license.

Figure 4 :
Figure 4: Identifying a system S requires identifying some proper component R that maintains a constant state |R⟩ (or density of time-averaged samples ρ_R) as the "pointer" state |P⟩ (or density of time-averaged samples ρ_P) of interest varies. Adapted from [12] Fig. 2; CC-BY license.

Figure 5 :
Figure 5: Cartoon illustration of the QRFs required to observe and write a readable memory of an environmental state |E⟩. The QRFs E and Y read the state from E and write it to the memory Y, respectively. Any identified system S must be part of E. The clock G_ij is a time QRF that defines the time coordinate t_A. The dashed arrow indicates the observer's thermodynamic process that converts free energy obtained from the unobserved sector F of B to waste heat exhausted through F.

Figure 6 :
Figure 6: Local times t_A and t_B flow in opposite directions across B. Each "write" or "prepare" operation on B thus implements a Wick rotation ı⃗ of the local time, with a total energetic cost for a combined write-read of at least ln 2 (k_B T_A).

Figure 7 :
Figure 7: Fig. 1 re-drawn to represent areal elements as scattering centers.
Figures 5 and 6 enable a simple and intuitive understanding of the relation between free choice and separability, and of the approach to entanglement as these are violated. Suppose A and B implement QRFs E_A, E_B and Y_A, Y_B, respectively, such that E_A and E_B compute the same function ϕ and Y_A and Y_B compute the same function ψ. As arbitrarily many distinct physical systems can compute any given function, this is merely an assumption of shared classical information processing. Now assume that dom(E_A) = dom(E_B) and that dom(Y_A) = dom(Y_B), with dom denoting a function's domain, i.e. that each pair of operators acts on a shared subset of encoded bits. This is a quantum assumption, as it is an assumption about how E_A, E_B and Y_A, Y_B are implemented by the internal Hamiltonians H_A and H_B, respectively. It does not, however, determine the time dependence of the states |A⟩ or |B⟩; in particular, it does not force A's data writes to dom(Y_A) = dom(Y_B) to synchronize with B's data writes. Adding the assumption that t_A and t_B have equal periods, however, does force such synchrony. With this synchronization assumption, A and B update each other's memory sectors on each cycle. Components of |A⟩ and |B⟩ that are memory-dependent are, in this case, no longer conditionally independent. Hence the joint state |AB⟩ is no longer separable.