Quantum Harmonic Analysis of the Density Matrix: Basics

In this Review we study rigorously the notion of mixed states and their density matrices, mostly giving complete proofs. We also discuss the quantum-mechanical consequences of possible variations of Planck's constant h. This Review has been written with two readerships in mind: mathematical physicists and quantum physicists. The mathematical rigor is maximal, but the language and notation we use throughout should be familiar to physicists.


Introduction
Quantum harmonic analysis When writing a Review on the density matrix you have several options. You can use the standard approach found in physics graduate textbooks, but then you can't go very far, not only because the important mathematical tools are lacking, but also because the arguments are often flawed and incorrect: most of them strictly speaking only apply when the underlying Hilbert space is finite-dimensional (I am thinking here, among other things, about "trace taking" procedures for obtaining the averages of observables, which are most of the time mathematically indefensible in infinite dimension in the absence of extra conditions on these observables). Then, at the other extreme, you can use the C*-algebraic approach. This is certainly the most beautiful and elegant (and correct) way to introduce the mixed states of quantum mechanics and their density matrices; after having listed some definitions and properties from the theory of operator algebras, the GNS construction then allows the passage from this abstract theory to the usual Hilbert-space picture of quantum mechanics. Unfortunately, this approach requires a lot of preparatory work to become accessible, and it is often difficult for the uninitiated to grasp the physical meaning of the tools that are used. I have chosen here a third way, which has the advantage of being both mathematically rigorous and intuitive. It consists in using functional analysis together with tools from harmonic analysis on phase space, especially Weyl-Wigner-Moyal theory (which is one way to "quantize" classical observables). We will call this way of doing things the "quantum harmonic analysis" approach. One of its advantages is that it fully justifies from a mathematical viewpoint the introduction of the Wigner function of a density matrix, which otherwise appears as an ad hoc object pulled out of thin air.
There is another major advantage to this approach: it highlights the sensitivity of the theory of density matrices on the choice of the value of Planck's constant, and this is precisely one of the main themes we want to address. We will mostly deal with continuous-variable systems with an infinite-dimensional Hilbert space described by observables with continuous eigenspectra.
Variability of Planck's constant One topic we address in depth in this Review is the sensitivity of quantum states to possible variations of Planck's constant. Physically this is a very controversial question. Setting aside for a moment the debate on whether Planck's constant can vary or not, consider the following situation: we have an unknown quantum state, on which we perform a quorum of measurements in order to determine its density matrix $\hat\rho$. Now, one should be aware of the fact that this density matrix will not be determined directly by these experiments; what one does is to measure, by, say, homodyne quantum tomography, certain properties of that system. Quantum homodyne tomography originates from the observation by Vogel and Risken [92] that the probability distributions determined by homodyne detection are just the Radon transforms of the Wigner function of the density matrix, and that the latter allows us to infer the density matrix using the Weyl correspondence. The method works as follows: suppose that we have been able to determine a statistical function $\rho(x, p)$ of the position and momentum variables (called "quadratures" in this context), yielding the properties of the quantum system under investigation. If we now identify this function $\rho(x, p)$ with the Wigner function
$$
\rho(x,p) = W(x,p) = \left(\frac{1}{2\pi\hbar}\right)^{n} \int e^{-\frac{i}{\hbar}p\cdot y}\,\langle x + \tfrac{1}{2}y|\,\hat\rho\,|x - \tfrac{1}{2}y\rangle\, d^n y \tag{1}
$$
of the corresponding density matrix, this relation can be inverted and yields $\hat\rho$; mathematically speaking $\hat\rho$ is just, up to a factor, the Weyl operator corresponding to the "classical" observable $\rho(x, p)$, which can be written for instance as
$$
\hat\rho\,\psi(x) = \iint e^{\frac{i}{\hbar}p\cdot(x-y)}\,\rho\big(\tfrac{1}{2}(x+y), p\big)\,\psi(y)\, d^n y\, d^n p. \tag{2}
$$
Now, two essential observations. The first is that when using this procedure we assume quite explicitly that we are using the Weyl-Wigner-Moyal (WWM) formalism: we identify the function $\rho(x, p)$ with the Wigner function, and "quantize" it thereafter using the Weyl transform.
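As a quick numerical illustration of the Wigner integral (1), the following sketch (assuming, purely for concreteness, $n = 1$, $\hbar = 1$, and the Gaussian state $\psi(x) = \pi^{-1/4}e^{-x^2/2}$, whose Wigner function is known in closed form to be $W(x,p) = \pi^{-1}e^{-x^2-p^2}$) evaluates the integral on a grid:

```python
import numpy as np

# Numerical sketch of formula (1); assumptions: n = 1, hbar = 1, and the
# Gaussian pure state psi(x) = pi^{-1/4} exp(-x^2/2), for which the exact
# Wigner function is W(x, p) = (1/pi) exp(-x^2 - p^2).

def psi(x):
    return np.pi ** -0.25 * np.exp(-x ** 2 / 2)

def wigner(x, p):
    # W(x, p) = (1/2pi) * integral of exp(-i p y) psi(x + y/2) psi*(x - y/2) dy
    y = np.linspace(-12.0, 12.0, 4001)
    dy = y[1] - y[0]
    integrand = np.exp(-1j * p * y) * psi(x + y / 2) * np.conj(psi(x - y / 2))
    return float(np.sum(integrand).real * dy / (2 * np.pi))

print(wigner(0.0, 0.0))   # close to 1/pi ~ 0.3183
print(wigner(1.0, 0.5))   # close to exp(-1.25)/pi ~ 0.0912
```

The grid and its extent are arbitrary numerical choices; the point is only that the quadrature reproduces the closed-form Gaussian Wigner function.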
This is very good, of course, but one should keep in mind that there are other possible representations in quantum mechanics; a physically very interesting one is for instance the Born-Jordan quantization scheme. We will not investigate the implications of such a choice in this Review, but we want here to emphasize another problem. Even if we place ourselves in the WWM framework, we are tacitly assuming that $\hbar = h/2\pi$ has a fixed value in time and space. If it happens that Planck's constant h has another value, h′, at another location or at another time, formulas (1) and (2) would have to be replaced with the corresponding expressions in which $\hbar$ is replaced with $\hbar' = h'/2\pi$.

Notation Let $\sigma$ be the standard symplectic form on phase space $\mathbb{R}^{2n} \equiv \mathbb{R}^n_x \times \mathbb{R}^n_p$: by definition it is the mapping which to the pair of vectors $z = (x, p)$ and $z' = (x', p')$ associates the number
$$
\sigma(z, z') = p\cdot x' - p'\cdot x.
$$
The symplectic form can be conveniently written in matrix form as
$$
\sigma(z, z') = (z')^T J z, \qquad J = \begin{pmatrix} 0_{n\times n} & I_{n\times n} \\ -I_{n\times n} & 0_{n\times n} \end{pmatrix}
$$
($z$ and $z'$ being here viewed as column vectors; $J$ is the "standard symplectic matrix"). The inner product of two vectors $\psi, \varphi \in L^2(\mathbb{R}^n)$ is given by
$$
\langle\psi|\varphi\rangle = \int \psi^*(x)\varphi(x)\, d^n x
$$
with $d^n x = dx_1\cdots dx_n$. The associated norm is $||\psi|| = \sqrt{\langle\psi|\psi\rangle}$. In addition to the usual $\hbar$-dependent (unitary) Fourier transform
$$
F\psi(p) = \left(\frac{1}{2\pi\hbar}\right)^{n/2} \int e^{-\frac{i}{\hbar}p\cdot x}\psi(x)\, d^n x
$$
on $\mathbb{R}^n$, we will use the symplectic Fourier transform on $\mathbb{R}^{2n}$: it is the transformation $F_\sigma$ which takes a square integrable function (or more generally a tempered distribution) $a$ on phase space $\mathbb{R}^{2n}$ to the function (or tempered distribution)
$$
F_\sigma a(z) = \left(\frac{1}{2\pi\hbar}\right)^{n} \int e^{-\frac{i}{\hbar}\sigma(z,z')} a(z')\, d^{2n}z'.
$$
That $F_\sigma$ is just an elementary modification of the $\hbar$-Fourier transform on $\mathbb{R}^{2n}$ given by
$$
F a(z) = \left(\frac{1}{2\pi\hbar}\right)^{n} \int e^{-\frac{i}{\hbar}z\cdot z'} a(z')\, d^{2n}z'
$$
is easy to see: since $\sigma(z, z') = (z')^T J z = Jz\cdot z'$ ($J$ the standard symplectic matrix) we have $F_\sigma a(z) = F a(Jz)$. A remarkable property of the symplectic Fourier transform is that it is its own inverse: $F_\sigma^2 = F_\sigma F_\sigma$ is the identity (i.e. the symplectic Fourier transform is involutive), and it satisfies the modified Plancherel identity
$$
\int F_\sigma a(z)\,(F_\sigma b(z))^*\, d^{2n}z = \int a(z)\, b^*(z)\, d^{2n}z.
$$
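The algebra of the symplectic form can be checked mechanically; the following sketch (for $n = 2$, an arbitrary choice) verifies the matrix formula against the component formula, the antisymmetry of $\sigma$, and the identity $J^2 = -I$:

```python
import numpy as np

# Sketch of the notation above for n = 2 (phase space R^4, z = (x, p)):
# the standard symplectic matrix J and sigma(z, z') = (z')^T J z.

n = 2
J = np.block([[np.zeros((n, n)), np.eye(n)],
              [-np.eye(n), np.zeros((n, n))]])

def sigma(z, zp):
    return float(zp @ J @ z)

rng = np.random.default_rng(0)
z = rng.standard_normal(2 * n)
zp = rng.standard_normal(2 * n)
x, p = z[:n], z[n:]
xq, pq = zp[:n], zp[n:]

print(np.isclose(sigma(z, zp), p @ xq - pq @ x))  # component formula p.x' - p'.x
print(np.isclose(sigma(z, zp), -sigma(zp, z)))    # antisymmetry
print(np.allclose(J @ J, -np.eye(2 * n)))         # J^2 = -I
```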

Quantum States and the Density Matrix
We begin by recalling informally the basic definitions from the theory of quantum states; since the content of this section is common knowledge we do not provide any particular references (we are following here the terminology and the exposition in Peres [78], with some modifications). The formalism of density operators and matrices was introduced by John von Neumann [73] in 1927, and independently by Lev Landau and Felix Bloch in 1927 and 1946, respectively. Ugo Fano was one of the first to put the theory of the density matrix in a rigorous form in his review paper [26].

Pure and mixed states; maximal tests
A quantum system is said to be in a pure state if we have complete knowledge about that system, meaning we know exactly which state it is in. Pure states can be prepared using maximal tests ([78], §2-3): suppose we are dealing with a quantum system and let N be the maximum number of different outcomes that can be obtained in a test of that system. If such a test has exactly N different outcomes, it is called a maximal test. The quantum system under consideration is in a pure state if it is prepared in such a way that it certainly yields a predictable outcome in that maximal test, the outcomes in any other test having well-defined probabilities which do not depend on the procedure used for the preparation. A pure state can thus be identified by specifying the complete experiment that characterizes it uniquely (Fano [26]). One usually writes a pure state using Dirac's ket notation $|\psi\rangle$; for all practical purposes it is convenient to use the wavefunction $\psi$ defined by $\psi(x) = \langle x|\psi\rangle$; it is a normalized element of a certain Hilbert space H, which is usually identified in the case of continuous variables with $L^2(\mathbb{R}^n)$ (the space of square integrable functions). When doing this, the state is identified with the linear span of the function $\psi$, that is the ray $\mathbb{C}\psi = \{\lambda\psi : \lambda \in \mathbb{C}\}$. It is very important to note that the pure state $|\psi\rangle$ can be identified with the orthogonal projection $\hat\rho_\psi$ of H onto the subspace $\mathbb{C}\psi$. This projection, which is of rank one, is denoted by $|\psi\rangle\langle\psi|$ in quantum mechanics; it is analytically given by the formula
$$
\hat\rho_\psi \varphi = |\psi\rangle\langle\psi|\varphi\rangle \tag{7}
$$
where $\langle\psi|\varphi\rangle$ is identified with the inner product in H. Most tests are however not maximal, and most preparations do not produce pure states, so we only have partial knowledge of the quantum system under consideration.
The information on such a system is less than maximal, reflecting the lack of a complete experiment with a uniquely predetermined outcome. The state of the system is nevertheless fully identified by any data adequate to predict the statistical results of all conceivable observations on the system [26]. When this is the case we say that the system is in a mixed state. Mixed states are classical probabilistic mixtures of pure states; however, different distributions of pure states can generate physically indistinguishable mixed states (this possibility will be discussed later). A quantum mixed state can be viewed as the datum of a set of pairs $\{(\psi_j, \alpha_j)\}$ where $\psi_j$ is a (normalized square integrable) pure state and $\alpha_j$ a classical probability; these probabilities sum up to one: $\sum_j \alpha_j = 1$. A caveat: one should not confuse the mixed state $\{(\psi_j, \alpha_j)\}$ with a superposition $\psi = \sum_j c_j \psi_j$ of the states $\psi_j$, which is a pure state!
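The caveat above can be made concrete in a two-dimensional toy Hilbert space (an assumption made purely for illustration): the 50/50 mixture of two basis states and their equal superposition have different density matrices, with different purities:

```python
import numpy as np

# Toy 2-dimensional example: mixture of |0> and |1> versus their
# equal superposition (|0> + |1>)/sqrt(2).

e0 = np.array([1.0, 0.0])
e1 = np.array([0.0, 1.0])

# the 50/50 *mixture*: rho = (1/2)|0><0| + (1/2)|1><1| = diag(1/2, 1/2)
rho_mix = 0.5 * np.outer(e0, e0) + 0.5 * np.outer(e1, e1)

# the equal *superposition* is a pure state with off-diagonal coherences 1/2
psi = (e0 + e1) / np.sqrt(2)
rho_sup = np.outer(psi, psi)

print(rho_mix)
print(rho_sup)
print(np.trace(rho_mix @ rho_mix))   # purity 0.5: a genuinely mixed state
print(np.trace(rho_sup @ rho_sup))   # purity 1.0: a pure state
```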

The density matrix
From a mathematical point of view the quantum state can be advantageously described by a self-adjoint operator on a Hilbert space; this operator is called the density matrix of the quantum system. In the pure-state case this operator is just the orthogonal projection (7) on a subspace of the Hilbert space, while the density matrix of a mixed state $\{(\psi_j, \alpha_j)\}$ is
$$
\hat\rho = \sum_j \alpha_j \hat\rho_j \tag{8}
$$
where we have set $\hat\rho_j = \hat\rho_{\psi_j} = |\psi_j\rangle\langle\psi_j|$. It should be kept in mind that the density matrix describes a preparation procedure for an ensemble of quantum systems whose statistical properties correspond to the given preparation procedure (this important aspect will be discussed in more detail below).

Theorem 1 (i) The set Dens(H) of density matrices on H is convex. (ii) A pure-state density matrix cannot be written as a nontrivial mixture of distinct states; the pure-state density matrices are the extreme points of Dens(H).

Proof. (i) To say that the set of density matrices is convex means that if $\hat\rho_1$ and $\hat\rho_2$ are in Dens(H) then so is $\lambda\hat\rho_1 + (1-\lambda)\hat\rho_2$ for all real numbers $\lambda$ such that $0 \leq \lambda \leq 1$. Let $\{(\psi_j, \alpha_j) : j \in K\}$ and $\{(\varphi_j, \beta_j) : j \in L\}$ be two mixed states. Relabeling the indices if necessary, we may assume that the sets K and L are disjoint: $K \cap L = \emptyset$. The corresponding density matrices are
$$
\hat\rho_1 = \sum_{j\in K}\alpha_j |\psi_j\rangle\langle\psi_j|, \qquad \hat\rho_2 = \sum_{j\in L}\beta_j |\varphi_j\rangle\langle\varphi_j|
$$
and hence we have, by linearity,
$$
\lambda\hat\rho_1 + (1-\lambda)\hat\rho_2 = \sum_{j\in K}\lambda\alpha_j |\psi_j\rangle\langle\psi_j| + \sum_{j\in L}(1-\lambda)\beta_j |\varphi_j\rangle\langle\varphi_j|.
$$
That $\hat\rho = \lambda\hat\rho_1 + (1-\lambda)\hat\rho_2$ is a mixed state now follows from the equality
$$
\sum_{j\in K}\lambda\alpha_j + \sum_{j\in L}(1-\lambda)\beta_j = \lambda + (1-\lambda) = 1.
$$
(ii) That a pure state really is "pure", i.e. that it can never be represented as a nontrivial mixed state, is easily seen using the following algebraic argument: assume that $\hat\rho_\psi = |\psi\rangle\langle\psi|$ can be rewritten as a sum $\sum_j \alpha_j |\psi_j\rangle\langle\psi_j|$ with $\alpha_j \geq 0$ and $\sum_j \alpha_j = 1$. Discarding the terms with $\alpha_j = 0$, we may assume that $\alpha_j > 0$ for all indices j. Let now $(\mathbb{C}\psi)^\perp$ be the subspace of H orthogonal to the ray $\mathbb{C}\psi$: it consists of all vectors $\varphi$ in H such that $\langle\psi|\varphi\rangle = 0$. For every $\varphi \in (\mathbb{C}\psi)^\perp$ we have $\langle\varphi|\hat\rho_\psi|\varphi\rangle = |\langle\psi|\varphi\rangle|^2 = 0$ and hence also $\langle\varphi|\hat\rho_\psi|\varphi\rangle = \sum_j \alpha_j |\langle\varphi|\psi_j\rangle|^2 = 0$; since we are assuming that $\alpha_j > 0$, this equality implies that we must have $\langle\varphi|\psi_j\rangle = 0$ for all $\varphi \in (\mathbb{C}\psi)^\perp$ and every $\psi_j$, hence $\psi_j \in ((\mathbb{C}\psi)^\perp)^\perp$.
Since the ray $\mathbb{C}\psi$ is a closed subspace we have $((\mathbb{C}\psi)^\perp)^\perp = \mathbb{C}\psi$, which means that each $\psi_j$ belongs to the ray $\mathbb{C}\psi$, that is $\psi_j = \lambda\psi$ for some complex number $\lambda$ with $|\lambda| = 1$; the vectors $\psi_j$ thus all define the same state as $\psi$. It follows from this argument that the pure-state density matrices are the extreme points of Dens(H). This means that if $\hat\rho_\psi$ is a pure-state density matrix, then the relation $\hat\rho_\psi = \lambda\hat\rho_1 + (1-\lambda)\hat\rho_2$ with $0 < \lambda < 1$ implies $\hat\rho_\psi = \hat\rho_1 = \hat\rho_2$. That this is the case immediately follows from the argument above.

The C * -algebraic approach
The notions of quantum state and density matrix can be very concisely described using the language of C*-algebras; this approach is useful when one wants to give a rigorous axiomatic description of the theory of quantum systems. Its inception goes back to early work by John von Neumann on operator algebras around 1930; the modern theory of C*-algebras was developed one decade later by Gel'fand and Naimark. We recommend Rieffel's paper [82] for a survey of quantization using this approach; Landsman's book [56] is another excellent source, perhaps somewhat closer to physicists' background. Let A be a non-commutative unital C*-algebra (i.e. a unital complex Banach algebra equipped with an isometric involution $A \mapsto A^*$ compatible with complex conjugation, and such that $||A^*A|| = ||A||^2$). If $A = A^*$ we call A an observable; the set of observables is denoted by O. If in addition the spectrum of $A \in O$ consists of numbers $\lambda \geq 0$, we say that A is positive ($A \geq 0$). A linear map $\rho : O \to \mathbb{C}$ is positive ($\rho \geq 0$) if $A \geq 0$ implies $\rho(A) \geq 0$; if in addition $\rho(1) = 1$ then $\rho$ is called a quantum state. Quantum states form the state space S(A) of A; it is a convex set whose extreme points are the pure states, while the other elements of S(A) are mixed states. Now, the Gel'fand-Naimark theorem [16,31,56] implies that for every unital C*-algebra A there exists a faithful representation $\pi$ of A by bounded operators on a separable Hilbert space H such that two of the Dirac-von Neumann axioms ([89], pp. 66-67) are satisfied: $\rho \in S(A)$ if and only if there exists a positive $\hat\rho \in B(H)$ with $\mathrm{Tr}(\hat\rho) = 1$ and $\rho(A) = \mathrm{Tr}(\hat\rho\,\hat A)$ for every $A \in$ A (where $\hat A = \pi(A)$); moreover the space of observables O is identified with the self-adjoint elements of B(H), and the number $\mathrm{Tr}(\hat\rho\,\hat A)$ is then the expected value of the observable.
The C * -algebraic approach is very appealing because it helps in unifying classical and quantum states: when the system is classical, the algebra of observables becomes an abelian C * -algebra and the states become ordinary probability measures.

The relation giving the average value $\langle\hat A\rangle_\psi$ of an observable $\hat A$ in a pure state $\psi$ is usually rewritten using the notion of trace Tr of an operator, defined as the sum of the eigenvalues of that operator. In fact, one of the most used formulas in quantum mechanics is, no doubt,
$$
\langle\hat A\rangle_\psi = \mathrm{Tr}(\hat\rho_\psi \hat A). \tag{10}
$$
To see this, we observe that since $\hat\rho_\psi \hat A = |\psi\rangle\langle\psi|\hat A$ and $\hat\rho_\psi^2 = \hat\rho_\psi$, we have, by cyclicity of the trace, $\mathrm{Tr}(\hat\rho_\psi \hat A) = \mathrm{Tr}(\hat\rho_\psi^2 \hat A) = \mathrm{Tr}(\hat\rho_\psi \hat A \hat\rho_\psi)$, that is, in "bra-ket" notation, and using the homogeneity of the trace,
$$
\mathrm{Tr}(\hat\rho_\psi \hat A) = \mathrm{Tr}(|\psi\rangle\langle\psi|\hat A|\psi\rangle\langle\psi|) = \langle\psi|\hat A|\psi\rangle\,\mathrm{Tr}(|\psi\rangle\langle\psi|) = \langle\psi|\hat A|\psi\rangle
$$
since $\mathrm{Tr}(|\psi\rangle\langle\psi|) = \mathrm{Tr}(\hat\rho_\psi) = 1$ (the only eigenvalues of $\hat\rho_\psi$ are 1 and 0). Formula (10) is immediately generalized by linearity of the trace to arbitrary mixed states, and one finds that the average value of the observable $\hat A$ in a mixed state with density operator (8) is
$$
\langle\hat A\rangle_{\hat\rho} = \mathrm{Tr}(\hat\rho\hat A) = \sum_j \alpha_j \langle\psi_j|\hat A|\psi_j\rangle. \tag{11}
$$
It turns out that several mixtures can lead to the same density matrix. Since formula (11) only involves the density operator itself, such equivalent mixtures cannot be distinguished by measurement of observables alone, because all observable results can be predicted from the density matrix without needing to know the ensemble that was used to construct it.
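In finite dimension the trace formulas (10) and (11) can be verified mechanically; the sketch below (dimension 4 and the weights 0.3, 0.7 are arbitrary choices) checks them for a random Hermitian "observable":

```python
import numpy as np

# Finite-dimensional check of (10) and (11): Tr(rho_psi A) = <psi|A|psi>
# for a pure state, and Tr(rho A) = sum_j alpha_j <psi_j|A|psi_j> for a mixture.

rng = np.random.default_rng(1)
d = 4
M = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
A = (M + M.conj().T) / 2                       # a Hermitian "observable"

psi = rng.standard_normal(d) + 1j * rng.standard_normal(d)
psi /= np.linalg.norm(psi)
rho_psi = np.outer(psi, psi.conj())
print(np.isclose(np.trace(rho_psi @ A), psi.conj() @ A @ psi))   # formula (10)

# a mixture of two pure states with weights 0.3 and 0.7
phi = rng.standard_normal(d) + 1j * rng.standard_normal(d)
phi /= np.linalg.norm(phi)
rho = 0.3 * np.outer(psi, psi.conj()) + 0.7 * np.outer(phi, phi.conj())
avg = 0.3 * (psi.conj() @ A @ psi) + 0.7 * (phi.conj() @ A @ phi)
print(np.isclose(np.trace(rho @ A), avg))                        # formula (11)
```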

Precautions
So far, so good. However, the manipulations made above can, at best, be justified when the Hilbert space H is finite-dimensional, because in this case the notion of trace class operators (and their properties) involves elementary calculations with finite-dimensional matrices. This is almost always implicitly assumed in textbooks, where the reader is invited to study ad nauseam the same elementary examples in low dimensions. Unfortunately the definitions above, and the proofs of the formulas (10)-(11), are incorrect in the general case of an infinite-dimensional H, and formulas like (11) are not justifiable unless one makes very restrictive hypotheses on the observable $\hat A$, because of convergence problems for the integrals involved. These issues are seldom questioned and are treated sloppily in most of the physical literature (in this context it is very instructive to read the excellent paper [13] by Daubechies, which warns of such misconceptions). The situation is very much similar to what happens when physicists manipulate the Feynman path integral. It is a heuristic object which has, in the form used most of the time in physics, no mathematical justification, but which easily allows one to get new intuitions and insights (hence its usefulness). Of course, such formal manipulations are fine when one is only interested in qualitative statements, but they are certainly not fine when one wants to perform exact calculations (numerical or theoretical).
Here is another popular (and questionable!) way of calculating traces. Let $\hat\rho$ be a density matrix on $L^2(\mathbb{R}^n)$; assume that the kernel of $\hat\rho$ is a function $K(x, y)$:
$$
\hat\rho\psi(x) = \int K(x, y)\psi(y)\, d^n y.
$$
It is customary (especially in the physical literature) to calculate the trace of $\hat\rho$ by integrating the kernel along the diagonal, that is, by using the formula
$$
\mathrm{Tr}(\hat\rho) = \int K(x, x)\, d^n x, \tag{12}
$$
which is obviously an extension to the infinite-dimensional case of the usual definition of the trace of a matrix as the sum of its diagonal elements. Needless to say, this formula does not follow directly from the definition of a trace class operator! In fact, even when the integral in (12) is absolutely convergent, this formula has no reason to be true in general (the kernel of an operator is defined uniquely only up to a set of measure zero, and in (12) we are integrating along the diagonal, which precisely has measure zero!). Brislawn [6] discusses formulas of the type (12) in the context of traceable Hilbert-Schmidt operators and gives sufficient conditions for such formulas to hold using Mercer's theorem and its variants. As shown in Barry Simon's little monograph [86] on trace ideals, the right condition for formula (12) to hold in the case n = 1 is that K must be of positive type, which means that we have
$$
\sum_{j,k=1}^{N} \lambda_j^* \lambda_k K(x_j, x_k) \geq 0
$$
for all integers N, all $x_j \in \mathbb{R}$, and all $\lambda_j \in \mathbb{C}$ (in particular we must thus have $K(x, x) \geq 0$). On the positive side, Simon (ibid.) notes that if a density matrix has a kernel K satisfying $\int |K(x, x)|\, d^n x < \infty$ then we are "almost sure" that formula (12) holds. Of course this vague statement is not a charter allowing carefree calculations! Needless to say, the "derivations" above are formal, and one should be extremely cautious when using the "formulas" thus obtained. Shubin [85], §27, discusses a (not so easy to use explicitly) step-by-step procedure for checking such identities. We will come back to these essential points in Section 4.4.
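For a kernel where formula (12) is legitimate, the diagonal integral does reproduce the trace; the sketch below (assuming the rank-one density matrix of the Gaussian state $\psi(x) = \pi^{-1/4}e^{-x^2/2}$ on $L^2(\mathbb{R})$, whose kernel $K(x,y) = \psi(x)\psi(y)$ is smooth and of positive type) checks that $\int K(x,x)\,dx = 1$:

```python
import numpy as np

# Diagonal-integral trace (12) for the smooth positive-type kernel
# K(x, y) = psi(x) psi(y) of the Gaussian pure state psi(x) = pi^{-1/4} e^{-x^2/2}.

x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]
psi = np.pi ** -0.25 * np.exp(-x ** 2 / 2)
diag_K = psi * psi                # K(x, x) = |psi(x)|^2
trace_val = np.sum(diag_K) * dx
print(trace_val)                  # ~ 1.0 = Tr(rho)
```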

Quantum Tomography
Reconstructing a quantum state (or even a classical state) is an extremely important problem, and it is, generally speaking, a difficult one. We will discuss this topic very briefly here, mainly giving references to past works; such a list is of course incomplete due to the large and steadily growing number of contributions. We will mainly focus on quantum tomography, which is a technique for characterizing a state of a quantum system by subjecting it to a large number of quantum measurements, each time preparing the system anew.

Estimating the density matrix: generalities
Experiments performed in a laboratory on quantum systems do not lead directly and precisely to a determination of the density matrix. The problem can be formulated as a statistical one: can we estimate the density matrix using repeated measurements on quantum systems that are identically prepared? The following strategy is used: after having obtained measurements on these identical quantum systems we can make a statistical inference about the probability distribution of the measurements, and thus indirectly about the density matrix of the quantum system [93]. This procedure is called quantum state tomography, and is practically implemented using a set of measurements of a quorum of observables, i.e., of a minimal "complete" set of noncommuting observables. See D'Ariano and his collaborators [11,12] for various strategies, and also Vogel and Risken [92] or Leonhardt and Paul [57].
Bužek and his collaborators [7] show how to reconstruct the Wigner function using the Jaynes principle of maximum entropy. In that context, also see the recent paper of Thekkadath et al. [90], where a scheme that can be used to directly measure individual density matrix elements is given; Lvovsky and Raymer [61] give a very interesting review in the continuous-variable case. The Lecture Notes volume [74] edited by Paris and Reháček contains a selection of texts by leading experts presenting various aspects of quantum state estimation. Mancini et al. [62] apply symplectic geometry to describe the dynamics of a quantum system in terms of equations for a purely classical probability distribution; also see Man'ko and Man'ko [64]. Ibort et al. [51] introduce what they call the tomographic picture of quantum mechanics. Although the Wigner function cannot be measured directly as a probability density, all its marginal distributions can. Once we know all the marginal distributions associated with different quadratures, we can reconstruct the Wigner function. Historically, the problem can be traced back to Wolfgang Pauli's question in his 1958 book [79]: "The mathematical problem as to whether, for given probability densities W(p) and W(x), the wavefunction ψ (...) is always uniquely determined, has still not been investigated in all its generality". It turns out that it is not possible to determine a state by knowing only the configuration space and momentum space marginal distributions. The answer is negative; here is an elementary example (Esposito et al. [25]): consider the two normalized wavefunctions
$$
\psi_1(x) = \left(\frac{\mathrm{Re}\,\alpha}{\pi\hbar}\right)^{1/4} e^{-\alpha x^2/2\hbar}, \qquad \psi_2(x) = \psi_1(x)^* = \left(\frac{\mathrm{Re}\,\alpha}{\pi\hbar}\right)^{1/4} e^{-\alpha^* x^2/2\hbar}
$$
where $\alpha$ is a complex number whose real part is positive, $\mathrm{Re}\,\alpha > 0$, and whose imaginary part is nonzero, $\mathrm{Im}\,\alpha \neq 0$. In the momentum representation these functions become Gaussians proportional to $e^{-p^2/2\hbar\alpha}$ and $e^{-p^2/2\hbar\alpha^*}$ respectively, so that the associated probability distributions are the same:
$$
|\psi_1(x)|^2 = |\psi_2(x)|^2, \qquad |\varphi_1(p)|^2 = |\varphi_2(p)|^2.
$$
However the states $|\psi_1\rangle$ and $|\psi_2\rangle$ are different, because the ratio $\psi_1(x)/\psi_2(x) = e^{-i(\mathrm{Im}\,\alpha)x^2/\hbar}$ is not constant, so that we cannot have $\psi_1 = c\psi_2$ for some complex constant c such that $|c| = 1$.
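The non-uniqueness can be checked numerically; the sketch below (assumptions: $n = 1$, $\hbar = 1$, and the arbitrary choice $\alpha = 1 + i$) confirms that the two Gaussians above share their position and momentum distributions while having overlap of modulus strictly less than one:

```python
import numpy as np

# Numerical check of the marginal counterexample; assumptions: hbar = 1,
# alpha = 1 + i, psi_1(x) = C exp(-alpha x^2/2) and psi_2 = psi_1^*.

alpha = 1.0 + 1.0j
x = np.linspace(-12.0, 12.0, 4001)
dx = x[1] - x[0]
C = (alpha.real / np.pi) ** 0.25
psi1 = C * np.exp(-alpha * x ** 2 / 2)
psi2 = np.conj(psi1)

# momentum wavefunctions phi_j(p) = (2 pi)^{-1/2} * integral of e^{-ipx} psi_j(x) dx
p = np.linspace(-8.0, 8.0, 401)
phase = np.exp(-1j * np.outer(p, x))
phi1 = phase @ psi1 * dx / np.sqrt(2 * np.pi)
phi2 = phase @ psi2 * dx / np.sqrt(2 * np.pi)

print(np.allclose(np.abs(psi1) ** 2, np.abs(psi2) ** 2))   # same position density
print(np.allclose(np.abs(phi1) ** 2, np.abs(phi2) ** 2))   # same momentum density
overlap = abs(np.sum(np.conj(psi1) * psi2) * dx)
print(overlap)    # strictly < 1: psi1 and psi2 are not proportional
```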

The Radon transform
Let us begin with a short discussion of the classical case, followed by a motivation for the quantum case; we are following the very clear exposition in §12.5 of the recent book [25] by Esposito et al. Let $\rho = \rho(x, p)$ be some function defined on the phase space $\mathbb{R}^2$ (it could be a classical probability, or a quantum quasiprobability). The Radon transform of this function is defined formally in physics texts by
$$
R\rho(X, \mu, \nu) = \iint \rho(x, p)\,\delta(X - \mu x - \nu p)\, dx\, dp. \tag{14}
$$
The function $R\rho(X, \mu, \nu)$ is called a tomogram. The meaning of this integral is that one integrates the function $\rho$ along the line $\ell$ with equation $X - \mu x - \nu p = 0$, where the variable X has an arbitrary value. Now, the crucial point is that if we know the function $R\rho$ fully (it depends on the three real variables $X, \mu, \nu$), then one can reconstruct the function $\rho$ by using the so-called "inverse Radon transform":
$$
\rho(x, p) = \frac{1}{(2\pi)^2} \iiint R\rho(X, \mu, \nu)\, e^{i(X - \mu x - \nu p)}\, dX\, d\mu\, d\nu. \tag{15}
$$
Let us "verify" this formula, just for the fun of making formal calculations. Denoting by A the triple integral above, we have
$$
A = \iiint\left(\iint \rho(x', p')\,\delta(X - \mu x' - \nu p')\, dx'\, dp'\right) e^{i(X - \mu x - \nu p)}\, dX\, d\mu\, d\nu
$$
$$
= \iint \rho(x', p')\left(\iint e^{i\mu(x' - x)}\, e^{i\nu(p' - p)}\, d\mu\, d\nu\right) dx'\, dp' = (2\pi)^2 \iint \rho(x', p')\,\delta(x' - x)\,\delta(p' - p)\, dx'\, dp'
$$
and hence $A = (2\pi)^2 \rho(x, p)$, so we are done.
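For $\nu \neq 0$ the delta integral in (14) collapses to a one-dimensional integral, $R\rho(X,\mu,\nu) = |\nu|^{-1}\int \rho(x, (X-\mu x)/\nu)\,dx$, which is easy to evaluate numerically. The sketch below (assuming $\rho$ is the standard Gaussian on $\mathbb{R}^2$, for which the exact tomogram is the normal density with variance $\mu^2 + \nu^2$) checks this:

```python
import numpy as np

# Tomogram (14) of the standard Gaussian rho(x, p) = (1/2pi) exp(-(x^2+p^2)/2).
# For nu != 0: R rho(X, mu, nu) = (1/|nu|) * integral of rho(x, (X - mu x)/nu) dx,
# and the exact result is the N(0, mu^2 + nu^2) density evaluated at X.

def rho(x, p):
    return np.exp(-(x ** 2 + p ** 2) / 2) / (2 * np.pi)

def tomogram(X, mu, nu):
    x = np.linspace(-20.0, 20.0, 8001)
    dx = x[1] - x[0]
    p = (X - mu * x) / nu
    return float(np.sum(rho(x, p)) * dx / abs(nu))

X, mu, nu = 0.7, 0.6, 0.8                    # mu^2 + nu^2 = 1
exact = np.exp(-X ** 2 / 2) / np.sqrt(2 * np.pi)
print(tomogram(X, mu, nu), exact)            # the two values should agree closely
```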
In the quantum case one proceeds as follows: noting that $\delta(X - \mu x - \nu p)$ can be rewritten as the Fourier integral
$$
\delta(X - \mu x - \nu p) = \frac{1}{2\pi}\int e^{ik(X - \mu x - \nu p)}\, dk
$$
and hence (14) as
$$
R\rho(X, \mu, \nu) = \frac{1}{2\pi}\iiint \rho(x, p)\, e^{ik(X - \mu x - \nu p)}\, dk\, dx\, dp,
$$
one defines the quantum Radon transform by analogy with (14) by the formula
$$
R\hat\rho(X, \mu, \nu) = \frac{1}{2\pi}\int \mathrm{Tr}\big(\hat\rho\, e^{ik(X - \mu\hat x - \nu\hat p)}\big)\, dk.
$$
Doing this, one has of course to give a precise meaning to the exponential $e^{ik(X - \mu\hat x - \nu\hat p)}$, and one then "inverts" the formula above, which yields, in analogy with (15),
$$
\hat\rho = \frac{1}{2\pi}\iiint R\hat\rho(X, \mu, \nu)\, e^{i(X - \mu\hat x - \nu\hat p)}\, dX\, d\mu\, d\nu.
$$
Of course these manipulations are heuristic and completely formal and do not have any mathematical sense unless one defines rigorously the involved operators.

Mathematical Theory of the Density Matrix
To give a precise definition of the trace formulas above, we need to work a little and use some operator theory (the theory of Hilbert-Schmidt and trace class operators suffices at this point). We will use several times the generalized Bessel equality
$$
\sum_j \langle\psi|\psi_j\rangle\langle\varphi|\psi_j\rangle^* = \langle\psi|\varphi\rangle \tag{16}
$$
valid for every orthonormal basis $(\psi_j)$ of a Hilbert space H. It is an immediate consequence of the resolution of the identity $\sum_j |\psi_j\rangle\langle\psi_j| = I$.
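In finite dimension the Bessel equality (16) can be checked directly; the sketch below (dimension 5 is an arbitrary choice) uses the columns of a unitary QR factor as the orthonormal basis:

```python
import numpy as np

# Finite-dimensional check of the generalized Bessel equality (16):
# sum_j <psi|psi_j> <phi|psi_j>^* = <psi|phi> for any orthonormal basis (psi_j).

rng = np.random.default_rng(2)
d = 5
M = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
Q, _ = np.linalg.qr(M)            # the columns of Q form an orthonormal basis

psi = rng.standard_normal(d) + 1j * rng.standard_normal(d)
phi = rng.standard_normal(d) + 1j * rng.standard_normal(d)

def braket(u, v):
    return np.vdot(u, v)          # <u|v>, antilinear in the first argument

lhs = sum(braket(psi, Q[:, j]) * np.conj(braket(phi, Q[:, j])) for j in range(d))
print(np.isclose(lhs, braket(psi, phi)))   # True
```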

General algebraic definition
Let us now define the notion of density operator in terms of the very important notion of trace class operators. The starting point is to notice that formula (8) implies that a density operator has, to begin with, three basic properties:
(1) It is a bounded operator on H; in particular $\hat\rho\psi$ is well-defined for all $\psi \in$ H;
(2) It is a self-adjoint operator: $\hat\rho^\dagger = \hat\rho$;
(3) It is a positive operator: $\langle\psi|\hat\rho|\psi\rangle \geq 0$ for all $\psi \in$ H.
The boundedness of $\hat\rho$ follows from the observation that
$$
||\hat\rho\psi|| \leq \sum_j \alpha_j ||\hat\rho_j\psi|| \leq ||\psi||
$$
where $||\psi|| = \sqrt{\langle\psi|\psi\rangle}$; that $||\hat\rho_j\psi|| \leq ||\psi||$ follows from the Cauchy-Schwarz inequality since $\hat\rho_j\psi = |\psi_j\rangle\langle\psi_j|\psi\rangle$. The self-adjointness of $\hat\rho$ is clear, since it is a linear combination with real coefficients of the self-adjoint operators $\hat\rho_j = |\psi_j\rangle\langle\psi_j|$. Finally, the positivity of $\hat\rho$ is clear since we have
$$
\langle\psi|\hat\rho|\psi\rangle = \sum_j \alpha_j |\langle\psi_j|\psi\rangle|^2 \geq 0
$$
(in fact property (3) implies property (2) when H is a complex Hilbert space). There remains the trace issue. As we mentioned above, physicists define the trace of an operator $\hat\rho$ as the sum of its diagonal elements, and this definition only makes sense without further restrictions when dim H < ∞. The way to do things correctly consists in using the mathematical definition of trace class operators; the latter is very general (and hence very useful) because it involves arbitrary bases of H. The definition goes as follows: a bounded operator $\hat\rho$ on a Hilbert space H (we do not assume self-adjointness or positivity at this point) is of trace class if there exist two orthonormal bases $(\varphi_j)$ and $(\chi_j)$ of H (indexed by the same set) such that
$$
\sum_j |\langle\hat\rho\varphi_j|\chi_j\rangle| < \infty. \tag{17}
$$
Notice that this definition immediately implies that $\hat\rho$ is of trace class if and only if its adjoint $\hat\rho^\dagger$ is. Now, a crucial property is that if condition (17) holds for one pair of orthonormal bases, then it also holds for all pairs of orthonormal bases, and that if $(\psi_j)$ and $(\varphi_j)$ are two orthonormal bases then
$$
\sum_j \langle\hat\rho\psi_j|\psi_j\rangle = \sum_j \langle\hat\rho\varphi_j|\varphi_j\rangle, \tag{18}
$$
each series being absolutely convergent. The proof of this result is not difficult; it consists in expanding each basis using the vectors of the other; we refer to Chapter 12 in de Gosson [34] and to Appendix 3 in Shubin's book [85] for complete proofs. This being done, we define the trace of $\hat\rho$ by the formula
$$
\mathrm{Tr}(\hat\rho) = \sum_j \langle\psi_j|\hat\rho|\psi_j\rangle \tag{19}
$$
where $(\psi_j)$ is any orthonormal basis of H; that the result does not depend on the choice of such a basis follows from the identity (18).
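The basis independence expressed by (18)-(19) is easy to see at work in finite dimension; the sketch below (dimension 6 is an arbitrary choice) compares the trace of a random density matrix computed in the standard basis and in a random orthonormal basis:

```python
import numpy as np

# Finite-dimensional illustration of the basis independence of the trace (19):
# sum_j <psi_j|rho|psi_j> has the same value in any orthonormal basis.

rng = np.random.default_rng(3)
d = 6
M = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
rho = M @ M.conj().T
rho /= np.trace(rho).real          # a density matrix: positive, trace 1

# a random orthonormal basis: the columns of a unitary QR factor
Q, _ = np.linalg.qr(rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d)))
trace_std = sum(rho[j, j] for j in range(d))
trace_Q = sum(np.vdot(Q[:, j], rho @ Q[:, j]) for j in range(d))
print(np.isclose(trace_std, trace_Q), np.isclose(trace_Q, 1.0))   # True True
```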
We leave it to the reader to verify that, as a consequence of this definition, the sum of two trace class operators is again a trace class operator, and that the trace of their sum is the sum of the traces. Also, $\mathrm{Tr}(\lambda\hat\rho) = \lambda\,\mathrm{Tr}(\hat\rho)$ for every complex number $\lambda$; hence the trace class operators form a vector space on which the trace is a linear functional. Also notice that it immediately follows from definition (19) that
$$
\mathrm{Tr}(\hat\rho^\dagger) = (\mathrm{Tr}\,\hat\rho)^*, \tag{20}
$$
hence the trace is real if $\hat\rho$ is self-adjoint.

Invariance under unitary conjugation
The following invariance of the trace under unitary conjugation is well known:

Theorem 2 Let $\hat\rho$ be a density matrix on the Hilbert space H and $\hat U$ a unitary operator on H. Then $\hat U^\dagger\hat\rho\hat U$ is also a positive trace class operator and we have
$$
\mathrm{Tr}(\hat U^\dagger\hat\rho\hat U) = \mathrm{Tr}(\hat\rho). \tag{21}
$$

Proof. It is clear that $\hat U^\dagger\hat\rho\hat U$ is a positive, bounded, and self-adjoint operator. The operator $\hat\rho$ is of trace class if and only if $\sum_j \langle\psi_j|\hat\rho|\psi_j\rangle < \infty$ for one (and hence every) orthonormal basis $(\psi_j)$ of H. Since $\langle\psi_j|\hat U^\dagger\hat\rho\hat U|\psi_j\rangle = \langle\hat U\psi_j|\hat\rho|\hat U\psi_j\rangle$ and the basis $(\hat U\psi_j)$ is also orthonormal, the operator $\hat U^\dagger\hat\rho\hat U$ is of trace class. The trace formula (21) follows from the basis independence of the trace formula (19).
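Theorem 2 can be checked mechanically in finite dimension (dimension 5 is an arbitrary choice; the unitary comes from a QR factorization):

```python
import numpy as np

# Finite-dimensional check of Theorem 2: U^dagger rho U is positive with the
# same trace as rho.

rng = np.random.default_rng(4)
d = 5
M = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
rho = M @ M.conj().T
rho /= np.trace(rho).real           # a density matrix

U, _ = np.linalg.qr(rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d)))
rho_U = U.conj().T @ rho @ U
print(np.isclose(np.trace(rho_U), np.trace(rho)))     # trace is preserved
print(np.min(np.linalg.eigvalsh(rho_U)) >= -1e-12)    # positivity is preserved
```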
We have not yet related our definition of trace to the one used in the previous section. Let us show that these definitions coincide. Let $\hat\rho = \sum_j \alpha_j\hat\rho_j$ be a density matrix in the sense of (8); the vectors $\psi_j$ being normalized, we have $\mathrm{Tr}(\hat\rho_j) = ||\psi_j||^2 = 1$ and hence, using definition (19) of the trace,
$$
\mathrm{Tr}(\hat\rho) = \sum_j \alpha_j\,\mathrm{Tr}(\hat\rho_j) = \sum_j \alpha_j = 1.
$$

The spectral theorem
An especially useful expansion of a density operator is obtained using elementary functional analysis (the spectral resolution theorem). Recall that each eigenvalue of a self-adjoint compact operator (except possibly zero) has finite multiplicity (see any book on elementary functional analysis, e.g. Blanchard and Brüning [5] or Landsman [55]).

Theorem 3 A bounded linear self-adjoint operator $\hat\rho$ on a complex Hilbert space H is of trace class if and only if there exist a sequence $(\lambda_j)$ of real numbers with $\sum_j |\lambda_j| < \infty$ and an orthonormal basis $(\psi_j)$ of H such that
$$
\hat\rho = \sum_j \lambda_j |\psi_j\rangle\langle\psi_j|,
$$
that is $\hat\rho = \sum_j \lambda_j\hat\rho_j$ where $\hat\rho_j$ is the orthogonal projection on the ray $\mathbb{C}\psi_j$. In particular, $\hat\rho$ is a density matrix if and only if $\lambda_j \geq 0$ for every j and $\sum_j \lambda_j = 1$.

Proof. This is a classical result from the theory of compact self-adjoint operators on a Hilbert space; see any introductory book on functional analysis, for instance Blanchard and Brüning [5]. That $\psi_j$ is an eigenvector corresponding to $\lambda_j$ is clear: since $\langle\psi_k|\psi_j\rangle = \delta_{kj}$ we have
$$
\hat\rho\psi_k = \sum_j \lambda_j |\psi_j\rangle\langle\psi_j|\psi_k\rangle = \lambda_k\psi_k.
$$

An immediate consequence of Theorem 3 is that if $\hat\rho$ is a density matrix, then $\mathrm{Tr}(\hat\rho^2) \leq 1$, with equality if and only if $\hat\rho$ represents a pure state: we have
$$
\mathrm{Tr}(\hat\rho^2) = \sum_j \lambda_j^2 \leq \Big(\sum_j \lambda_j\Big)^2 = 1.
$$
The equality $\sum_j \lambda_j^2 = 1$ can only occur if all the coefficients $\lambda_j$ are equal to zero except one, which is equal to one. Thus $\mathrm{Tr}(\hat\rho^2) = 1$ if and only if $\hat\rho$ represents a pure state. The number $\mathrm{Tr}(\hat\rho^2)$ is therefore called the purity of the quantum state represented by the density matrix $\hat\rho$. Another way of measuring the purity of a state is to use the von Neumann entropy $S(\hat\rho)$. By definition:
$$
S(\hat\rho) = -\mathrm{Tr}(\hat\rho\ln\hat\rho)
$$
(with the convention 0 ln 0 = 0). One often uses the suggestive notation
$$
S(\hat\rho) = -\sum_j \lambda_j \ln\lambda_j.
$$
Notice that the von Neumann entropy $S(\hat\rho)$ is zero if and only if $\hat\rho$ is a pure state.
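The purity and entropy computations above are transparent in the eigenbasis; the sketch below (the spectrum 0.5, 0.3, 0.2 is an arbitrary choice) evaluates both quantities for a mixed and a pure 3-level state:

```python
import numpy as np

# Purity Tr(rho^2) = sum lambda_j^2 and von Neumann entropy
# S(rho) = -sum lambda_j ln(lambda_j), computed from the eigenvalues.

lam = np.array([0.5, 0.3, 0.2])          # eigenvalues of a mixed state
rho = np.diag(lam)                       # rho written in its eigenbasis
purity = np.trace(rho @ rho)
entropy = -np.sum(lam * np.log(lam))
print(purity)     # 0.25 + 0.09 + 0.04 = 0.38 < 1
print(entropy)    # ~ 1.0297 > 0

rho_pure = np.diag([1.0, 0.0, 0.0])      # a pure state
print(np.trace(rho_pure @ rho_pure))     # purity 1 (and S = 0 with 0 ln 0 = 0)
```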

Functional properties
Perhaps the most directly useful property of trace class operators (and hence of density matrices) is Theorem 4 below; it says that if we compose a trace class operator with any bounded operator we obtain again a trace class operator.
We denote by $L^1(H)$ the space of trace class operators on H.

Theorem 4 $L^1(H)$ is a vector space, and if $\hat\rho \in L^1(H)$ and $\hat A$ is a bounded operator on H, then $\hat A\hat\rho \in L^1(H)$ and $\hat\rho\hat A \in L^1(H)$, and we have
$$
\mathrm{Tr}(\hat\rho\hat A) = \mathrm{Tr}(\hat A\hat\rho). \tag{26}
$$
Formula (26) applies in particular when $\hat\rho$ is a density matrix and $\hat A$ a bounded quantum observable on H.
Proof. That $L^1(H)$ is a vector space is clear using formula (17). Let $\hat A$ be a bounded operator on H; recall that the boundedness of $\hat A$ is equivalent to the existence of a number $C \geq 0$ such that $||\hat A\psi|| \leq C||\psi||$ for all $\psi \in$ H. Let now $(\psi_j)$ and $(\varphi_j)$ be two orthonormal bases of H. Writing $\langle\hat A\hat\rho\psi_j|\varphi_j\rangle = \langle\hat\rho\psi_j|\hat A^\dagger\varphi_j\rangle$ and applying Bessel's equality (16), we get
$$
\langle\hat A\hat\rho\psi_j|\varphi_j\rangle = \sum_k \langle\hat\rho\psi_j|\varphi_k\rangle\langle\hat A^\dagger\varphi_j|\varphi_k\rangle^*; \tag{27}
$$
using the Cauchy-Schwarz inequality we have, since $||\hat A^\dagger\varphi_j|| \leq C$,
$$
|\langle\hat A\hat\rho\psi_j|\varphi_j\rangle| \leq \Big(\sum_k |\langle\hat\rho\psi_j|\varphi_k\rangle|^2\Big)^{1/2}\Big(\sum_k |\langle\hat A^\dagger\varphi_j|\varphi_k\rangle|^2\Big)^{1/2} \leq C\,||\hat\rho\psi_j||.
$$
Summing this inequality with respect to the index j yields, since $\hat\rho$ is of trace class, $\sum_j |\langle\hat A\hat\rho\psi_j|\varphi_j\rangle| < \infty$, hence $\hat A\hat\rho$ is of trace class as claimed. That $\hat\rho\hat A$ also is of trace class is immediate, noting that we can write $\hat\rho\hat A = (\hat A^\dagger\hat\rho^\dagger)^\dagger$. It remains to prove the trace equality (26). Choosing $(\psi_j) = (\varphi_j)$, the Bessel equality (27) becomes
$$
\langle\hat A\hat\rho\psi_j|\psi_j\rangle = \sum_k \langle\hat\rho\psi_j|\psi_k\rangle\langle\hat A^\dagger\psi_j|\psi_k\rangle^*
$$
and, similarly,
$$
\langle\hat\rho\hat A\psi_j|\psi_j\rangle = \sum_k \langle\hat A\psi_j|\psi_k\rangle\langle\hat\rho^\dagger\psi_j|\psi_k\rangle^*.
$$
Summing these equalities over j, we get $\mathrm{Tr}(\hat\rho\hat A) = \mathrm{Tr}(\hat A\hat\rho)$ since the sums of both right-hand sides are identical.

The trace norm
An operator $\hat A$ on a Hilbert space H is called a Hilbert-Schmidt operator if there exists an orthonormal basis $(\psi_j)$ of H such that
$$
\sum_j ||\hat A\psi_j||^2 < \infty. \tag{28}
$$
In particular such an operator is bounded on H. As in the case of trace class operators, one shows that if this property holds for one orthonormal basis then it holds for all, and that the number $\sum_j ||\hat A\psi_j||^2$ is independent of the choice of such a basis. Let in fact $(\varphi_j)$ be a second orthonormal basis, and write $\hat A\psi_j = \sum_k \langle\varphi_k|\hat A\psi_j\rangle\varphi_k$. Then, using the Bessel equality (16),
$$
\sum_j ||\hat A\psi_j||^2 = \sum_j\sum_k |\langle\varphi_k|\hat A\psi_j\rangle|^2 = \sum_k\sum_j |\langle\hat A^\dagger\varphi_k|\psi_j\rangle|^2,
$$
that is, using again (16),
$$
\sum_j ||\hat A\psi_j||^2 = \sum_k ||\hat A^\dagger\varphi_k||^2.
$$
Observe that this equality shows that if we take $(\psi_j) = (\varphi_j)$ then $\sum_k ||\hat A^\dagger\psi_k||^2 < \infty$, hence the adjoint $\hat A^\dagger$ of a Hilbert-Schmidt operator is also a Hilbert-Schmidt operator; we may thus replace $\hat A$ by $\hat A^\dagger$ in the equality above, which yields $\sum_k ||\hat A\varphi_k||^2 = \sum_j ||\hat A^\dagger\psi_j||^2 < \infty$, as claimed.
Hilbert-Schmidt operators form a vector space $L^2(H)$, and
$$
||\hat A||_{HS} = \Big(\sum_j ||\hat A\psi_j||^2\Big)^{1/2}
$$
defines a norm on this space; this norm is associated with the scalar product
$$
(\hat A|\hat B)_{HS} = \sum_j \langle\hat A\psi_j|\hat B\psi_j\rangle. \tag{30}
$$
If $\hat A$ and $\hat B$ are Hilbert-Schmidt operators then $\lambda\hat A$ is trivially a Hilbert-Schmidt operator with $||\lambda\hat A||_{HS} = |\lambda|\,||\hat A||_{HS}$ for every $\lambda \in \mathbb{C}$; on the other hand, by the triangle inequality, for every orthonormal basis $(\psi_j)$
$$
||(\hat A + \hat B)\psi_j|| \leq ||\hat A\psi_j|| + ||\hat B\psi_j||,
$$
hence $\hat A + \hat B$ is also a Hilbert-Schmidt operator and we have
$$
||\hat A + \hat B||_{HS} \leq ||\hat A||_{HS} + ||\hat B||_{HS}.
$$
Finally, $||\hat A||_{HS} = 0$ is equivalent to $\hat A\psi_j = 0$ for every index j, that is to $\hat A = 0$. The space $L^2(H)$ is complete for that norm, and hence a Banach space (it is actually even a Hilbert space when equipped with the scalar product (30)). In addition $L^2(H)$ is closed under multiplication (and hence an algebra).
It turns out that the space $L^2(H)$ is a two-sided ideal in the algebra $B(H)$ of bounded operators: if $A \in L^2(H)$ and $B \in B(H)$ then $AB \in L^2(H)$ and $BA \in L^2(H)$. Let us show that $BA \in L^2(H)$. Denoting by $\|B\|$ the operator norm of $B$ we have $\|BA\psi_j\| \leq \|B\|\,\|A\psi_j\|$ for every $j$, hence $\sum_j \|BA\psi_j\|^2 \leq \|B\|^2 \sum_j \|A\psi_j\|^2 < \infty$. Applying the same argument to $AB = (B^\dagger A^\dagger)^\dagger$ shows that $AB \in L^2(H)$ as well.
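In finite dimension these properties are easy to verify numerically. The following sketch (entirely our own illustration, with random matrices and NumPy) checks the basis independence of the Hilbert-Schmidt norm and the ideal inequality $\|BA\|_{HS} \leq \|B\|\,\|A\|_{HS}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

def hs_norm(M):
    # ||M||_HS^2 = sum_j ||M psi_j||^2 for any orthonormal basis (psi_j);
    # in the canonical basis this is just the Frobenius norm
    return np.sqrt(np.sum(np.abs(M) ** 2))

# Basis independence: changing the orthonormal basis amounts to conjugating
# by a unitary Q, which leaves the Hilbert-Schmidt norm unchanged
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
assert np.isclose(hs_norm(Q @ A @ Q.conj().T), hs_norm(A))

# Two-sided ideal property: ||BA||_HS <= ||B||_op ||A||_HS (and similarly for AB)
op_B = np.linalg.norm(B, 2)            # operator (spectral) norm of B
assert hs_norm(B @ A) <= op_B * hs_norm(A) + 1e-12
assert hs_norm(A @ B) <= op_B * hs_norm(A) + 1e-12
```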

Hilbert-Schmidt and trace class
An essential property is that every trace class operator is the product of two Hilbert-Schmidt operators (and hence itself a Hilbert-Schmidt operator). Let us glorify this important statement as a theorem: Theorem 5 (i) A bounded operator $A$ on $H$ is of trace class if and only if it is the product of two Hilbert-Schmidt operators: $A = BC$ with $B, C \in L^2(H)$; (ii) every trace class operator is a Hilbert-Schmidt operator. Proof. In what follows $(\psi_j)$ is an orthonormal basis in $H$. (i) Assume that $A = BC$ where $B$ and $C$ are both Hilbert-Schmidt operators. Using respectively the triangle and the Cauchy-Schwarz inequalities we get, since $B$ and $C$ are both Hilbert-Schmidt operators,
$$\sum_j |\langle A\psi_j|\psi_j\rangle| = \sum_j |\langle C\psi_j|B^\dagger\psi_j\rangle| \leq \|C\|_{HS}\,\|B\|_{HS} < \infty,$$
proving that $A$ is indeed of trace class. Assume, conversely, that $A \in L^1(H)$.
In view of the polar decomposition theorem there exists a unitary operator $U$ on $H$ such that $A = U|A|$, where $|A| = (A^\dagger A)^{1/2}$. Setting $C = |A|^{1/2}$ we have $C \in L^2(H)$, since $\sum_j \|C\psi_j\|^2 = \sum_j \langle|A|\psi_j|\psi_j\rangle = \operatorname{Tr}|A| < \infty$; it follows that $B = UC \in L^2(H)$ as well, and $A = BC$. (ii) We have seen that every trace class operator is a product $A = BC$ of two Hilbert-Schmidt operators. In view of the algebra property (i) of $L^2(H)$ the operator $A$ is itself a Hilbert-Schmidt operator.
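In finite dimension the factorization used in the proof can be carried out concretely with the singular value decomposition. The sketch below (our own illustration, not part of the text) builds $B = U|A|^{1/2}$ and $C = |A|^{1/2}$ and checks that the trace norm $\operatorname{Tr}|A|$ equals $\|B\|_{HS}\|C\|_{HS}$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Polar decomposition via SVD: A = U_p |A| with |A| = V S V^dagger, U_p = U V^dagger
U, s, Vh = np.linalg.svd(A)
absA = Vh.conj().T @ np.diag(s) @ Vh
U_p = U @ Vh

# Factor A as a product of two Hilbert-Schmidt operators B and C
sqrt_absA = Vh.conj().T @ np.diag(np.sqrt(s)) @ Vh
B = U_p @ sqrt_absA
C = sqrt_absA
assert np.allclose(B @ C, A)

# Trace norm = sum of singular values = ||C||_HS^2 = ||B||_HS ||C||_HS
hs = lambda M: np.sqrt(np.sum(np.abs(M) ** 2))
tr_norm = s.sum()
assert np.isclose(tr_norm, hs(C) ** 2)
assert np.isclose(tr_norm, hs(B) * hs(C))
```

Here the Cauchy-Schwarz estimate of the proof is saturated, since $B$ and $C$ differ only by a unitary factor.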
Let us now specialize to the case where $H = L^2(\mathbb R^n)$. In this case the Hilbert-Schmidt operators are exactly those operators that have a square integrable kernel; as a consequence a density matrix also has a square integrable kernel. More precisely: Theorem 6 (i) An operator $A$ on $L^2(\mathbb R^n)$ is a Hilbert-Schmidt operator if and only if there exists a function $K \in L^2(\mathbb R^n \times \mathbb R^n)$ (the kernel of $A$) such that $A\psi(x) = \int K(x,y)\psi(y)\,d^ny$ for all $\psi \in L^2(\mathbb R^n)$.
(ii) Every trace class operator (and hence every density matrix) on $L^2(\mathbb R^n)$ can be represented by (31). Proof. (i) The condition is necessary. Let $(\psi_j)$ be an orthonormal basis of $L^2(\mathbb R^n)$; expanding $A\psi$ in this basis and using the definition of the inner product we can rewrite $A\psi$ in the form (31) with $K(x,y) = \sum_j (A\psi_j)(x)\psi_j^*(y)$. Let us show that $K \in L^2(\mathbb R^n \times \mathbb R^n)$. Remarking that the tensor products $\psi_j \otimes \psi_k^*$ form an orthonormal system in $L^2(\mathbb R^n \times \mathbb R^n)$ and applying the Bessel equality (16), the series defining $K$ converges in $L^2(\mathbb R^n \times \mathbb R^n)$, with $\|K\|_{L^2}^2 = \sum_j \|A\psi_j\|^2 < \infty$. Conversely, define the operator $A$ by the equality (31) with $K \in L^2(\mathbb R^n \times \mathbb R^n)$; we have, since the basis $(\psi_j)$ is orthonormal, $\sum_j \|A\psi_j\|^2 = \|K\|_{L^2}^2 < \infty$, and $A$ is thus a Hilbert-Schmidt operator. (ii) In view of the algebra property (see (i) in Theorem 5) a trace class operator is a fortiori a Hilbert-Schmidt operator. The claim follows in view of the statement (i).

The Phase Space Picture
From now on we assume that the Hilbert space H is L 2 (R n ), the space of complex-valued square integrable functions on R n (we are thus dealing with quantum systems with n degrees of freedom).

Weyl operators and symbols
Let us first explain what we mean by a Weyl symbol. Recall that a function $K(x,y)$ defined on $\mathbb R^n \times \mathbb R^n$ is the kernel of an operator $A$ if we have $A\psi(x) = \int K(x,y)\psi(y)\,d^ny$ for all $\psi \in L^2(\mathbb R^n)$. A deep theorem from functional analysis ("Schwartz's kernel theorem") tells us that every continuous linear operator from the space of test functions to the tempered distributions 2 can be represented in this way, the integral in (32) being possibly replaced by a distributional bracket. By definition, the Weyl symbol of the operator $A$ is the function
$$a(x,p) = \int e^{-\frac{i}{\hbar}p\cdot y}\,K\big(x + \tfrac12 y,\, x - \tfrac12 y\big)\,d^ny;$$
this formula is easily inverted using an inverse Fourier transform in $p$, yielding the expression of the kernel in terms of the symbol:
$$K(x,y) = \Big(\frac{1}{2\pi\hbar}\Big)^n \int e^{\frac{i}{\hbar}p\cdot(x-y)}\,a\big(\tfrac12(x+y),\,p\big)\,d^np.$$
These two formulas uniquely define the kernel and the symbol in terms of each other, and imply the "Weyl correspondence" (or transform), which expresses unambiguously the operator $A$ in terms of the symbol $a$: one often writes $A = \mathrm{Op}_W(a)$, and this notation is unambiguous because the symbol of $A$ is uniquely determined by (33). Formula (35) can be rewritten in several different ways; one common expression is
$$A = \Big(\frac{1}{\pi\hbar}\Big)^n \int a(z_0)\,\Pi(z_0)\,d^{2n}z_0,$$
where $\Pi(x_0,p_0)$ is the reflection (or parity) operator [33,34,40]. The usefulness of the Weyl correspondence in quantum mechanics comes from the fact that it associates self-adjoint operators to real symbols; in fact, more generally, $A = \mathrm{Op}_W(a)$ is self-adjoint if and only if its symbol $a$ is real. We refer to de Gosson [33,34,40] for detailed discussions of the Weyl correspondence from the point of view outlined above; Littlejohn's well-cited paper [58] contains a very nice review of the topic with applications to semiclassical approximations.

Twisted products and convolutions
Composing Weyl operators leads in a natural way to the notion of twisted product, which is essential in the theory of deformation quantization: assume that the two operators $A = \mathrm{Op}_W(a)$ and $B = \mathrm{Op}_W(b)$ can be composed, and set $AB = C = \mathrm{Op}_W(c)$. Then the Weyl symbol $c$ is given by the expression (38), where $\partial\sigma(u,z,v)$ is the coboundary of $\sigma(u,v)$ viewed as a 1-chain. For detailed proofs of these properties see de Gosson [34], Chapter 10. The function $c$ is called the twisted product 3 of $a$ and $b$. To the twisted product is associated the twisted convolution $a \# b$ of two symbols; it is defined by means of the symplectic Fourier transform $F_\sigma$, and the equivalence of both definitions is due to the fact that the symplectic Fourier transform is its own inverse. Explicitly (de Gosson [40], Section 11.1), an equivalent statement is to say that the twisted symbol $c_\sigma$ of the product $C = AB$ is given by the twisted convolution of $a_\sigma$ and $b_\sigma$. 3 It was defined by J. von Neumann following the work of Weyl.

The Wigner function of a density matrix
The essential point to understand now is that the Wigner function $\rho(x,p) = W_\rho(x,p)$ we are going to define below is (up to an unimportant constant factor) the Weyl symbol of the operator $\rho$: the Wigner function is thus a dequantization of $\rho$, that is, a phase space function obtained from this operator 4. Also notice that this is the first time Planck's constant appears in a quite explicit way; we could a priori have replaced $\hbar$ with any other real parameter $\eta$: this change would have no consequence for the mathematics involved (but it would of course change the physics!). We will come back to this essential point later.

Definition of the Wigner function of a density matrix
To a density matrix $\rho$ on $L^2(\mathbb R^n)$ one associates in standard texts its Wigner function (also called Wigner distribution). It is the function $W_\rho$ of the variables $x = (x_1,\dots,x_n)$ and of the conjugate momenta $p = (p_1,\dots,p_n)$ usually defined in physics texts by formula (44), where $|x\rangle$ is an eigenstate of the operator $\hat x = (\hat x_1,\dots,\hat x_n)$ (where $\hat x_j$ = multiplication by $x_j$). Performing the change of variables $x' \longmapsto y = 2x'$ we can rewrite this definition in the equivalent form (45); this has some practical advantages when one uses the Wigner-Weyl-Moyal formalism. In spite of their formal elegance, formulas (44), (45) are at first sight somewhat obscure and need to be clarified, especially if one wants to work analytically with them. Assume first that $\rho$ represents a pure state: $\rho = \rho_\psi = |\psi\rangle\langle\psi|$ where $\psi \in L^2(\mathbb R^n)$ is normalized. We get, using the relations $\psi(x) = \langle x|\psi\rangle$ and $\psi^*(x) = \langle\psi|x\rangle$, $W_\rho = W\psi$, where
$$W\psi(x,p) = \Big(\frac{1}{2\pi\hbar}\Big)^n \int e^{-\frac{i}{\hbar}p\cdot y}\,\psi\big(x + \tfrac12 y\big)\,\psi^*\big(x - \tfrac12 y\big)\,d^ny$$
is the usual Wigner function (or Wigner distribution, or Wigner transform) of $\psi \in L^2(\mathbb R^n)$ (see Wigner [97], Hillery et al. [47]; for the mathematical theory [33,34,40]). In the general case, where $\rho = \sum_j \alpha_j |\psi_j\rangle\langle\psi_j|$ is a convex sum of operators of the type above, one immediately gets, by linearity, the expression $\rho(z) = W_\rho(z) = \sum_j \alpha_j W\psi_j(z)$. A very important result we will prove later on (Theorem 9), but use immediately, is the following: The Wigner function of a mixed state is square integrable: $W_\rho \in L^2(\mathbb R^{2n})$.
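To make definition (47) concrete, here is a small numerical sketch ($\hbar = 1$, one degree of freedom; entirely our own illustration) that evaluates the Wigner integral of a Gaussian state on a grid and checks the marginal property $\int W\psi\,dp = |\psi(x)|^2$:

```python
import numpy as np

hbar = 1.0
N = 256
x = np.linspace(-8, 8, N)
dx = x[1] - x[0]
psi = np.pi ** -0.25 * np.exp(-x ** 2 / 2)       # normalized Gaussian state

def psi_at(u):
    # linear interpolation of psi; psi is taken to vanish outside the grid
    return np.interp(u, x, psi, left=0.0, right=0.0)

p = np.linspace(-6, 6, 128)
dp = p[1] - p[0]
y = np.linspace(-16, 16, 2 * N)
dy = y[1] - y[0]

# W(x,p) = (1/2 pi hbar) * Integral exp(-i p y / hbar) psi(x+y/2) psi*(x-y/2) dy
W = np.empty((N, p.size))
for i, xi in enumerate(x):
    f = psi_at(xi + y / 2) * psi_at(xi - y / 2)
    W[i] = (np.exp(-1j * np.outer(p, y) / hbar) @ f).real * dy / (2 * np.pi * hbar)

# Marginal property: integrating W over p recovers |psi(x)|^2,
# and the total phase-space integral is 1
assert np.allclose(W.sum(axis=1) * dp, np.abs(psi) ** 2, atol=1e-3)
assert abs(W.sum() * dx * dp - 1.0) < 1e-3
```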
One also often uses the cross-Wigner transform of a pair of square integrable functions. It is given by
$$W(\psi,\phi)(x,p) = \Big(\frac{1}{2\pi\hbar}\Big)^n \int e^{-\frac{i}{\hbar}p\cdot y}\,\psi\big(x + \tfrac12 y\big)\,\phi^*\big(x - \tfrac12 y\big)\,d^ny.$$
It naturally appears as an interference term when calculating the Wigner function of a sum; in fact, using definition (47) one immediately checks that
$$W(\psi + \phi) = W\psi + W\phi + 2\,\mathrm{Re}\,W(\psi,\phi).$$
Notice that $W(\psi,\phi)$ is in general complex-valued and that $W(\phi,\psi) = W(\psi,\phi)^*$. The cross-Wigner function has many applications; in particular it allows one to reformulate the notion of weak value as an interference between the past and the future in the time-symmetric approach to quantum mechanics (see de Gosson and de Gosson [41]).

The Weyl symbol of a density matrix
In the case of density matrices we have: Theorem 7 Let $\rho$ be a density matrix on $L^2(\mathbb R^n)$. The Weyl symbol of $\rho$ is $a = (2\pi\hbar)^n\rho$, where $\rho(z) = W_\rho(z)$ is the Wigner function of $\rho$.
Proof. The action of the projection $\rho_j = |\psi_j\rangle\langle\psi_j|$ on a vector $\psi \in L^2(\mathbb R^n)$ is given by $\rho_j\psi = \langle\psi_j|\psi\rangle\psi_j$, hence the kernel of $\rho_j$ is the function $K_j(x,y) = \psi_j(x)\psi_j^*(y)$. It follows, using formula (33), that the Weyl symbol of $\rho_j$ is
$$a_j(x,p) = \int e^{-\frac{i}{\hbar}p\cdot y}\,\psi_j\big(x + \tfrac12 y\big)\,\psi_j^*\big(x - \tfrac12 y\big)\,d^ny = (2\pi\hbar)^n\,W\psi_j(x,p).$$
Formula (52) follows by linearity.

Statistical interpretation of the Wigner function
The importance of the Wigner function of a density matrix comes from the fact that we can use it as a substitute for an ordinary probability density for calculating averages (it is precisely for this purpose that Wigner introduced his eponymous transform in [97]). For all $\psi \in L^2(\mathbb R^n)$ such that 5 $\psi \in L^1(\mathbb R^n)$ and $F\psi \in L^1(\mathbb R^n)$ the marginal properties hold:
$$\int W\psi(x,p)\,d^np = |\psi(x)|^2, \qquad \int W\psi(x,p)\,d^nx = |F\psi(p)|^2,$$
and hence, in particular, $\iint W\psi(x,p)\,d^nx\,d^np = 1$ when $\psi$ is normalized. In the second equality (53), $F\psi$ is the $\hbar$-Fourier transform of $\psi$. One should be aware of the fact that while $W\psi$ is always real (and hence so is $\rho = W_\rho$), as can easily be checked by taking the complex conjugates of both sides of the equality (47), it takes negative values for all $\psi$ which are not Gaussian functions. (This is the celebrated "Hudson theorem" [48]; also see Janssen [49] for the multidimensional case.) A caveat: this result is only true for the Wigner function $W\psi$ of a single function $\psi$; the case of a general distribution $\rho = \sum_j \alpha_j W\psi_j$ is much subtler, and will be discussed later. Let us introduce the following terminology: we call an observable $A$ a "good observable" for the density matrix $\rho$ if its Weyl symbol $a$ (i.e. the corresponding classical observable) satisfies $a\rho \in L^1(\mathbb R^{2n})$, that is $\int |a(z)\rho(z)|\,d^{2n}z < \infty$ ($\rho$ being the Wigner function of $\rho$; we are using the shorthand notation $z = (x,p)$, $d^{2n}z = d^nx\,d^np$). We assume in addition that $a$ is real, so that $A$ is Hermitian. Notice that "goodness" is guaranteed if the symbol $a$ is square integrable, because the Cauchy-Schwarz inequality then implies that $a\rho \in L^1(\mathbb R^{2n})$, since $\rho$ is square integrable (as mentioned above, see Theorem 9).
Theorem 8 Let $\rho$ be a density matrix on $L^2(\mathbb R^n)$ and $\rho(z)$ its Wigner function. The average value of every good observable $A$ with respect to $\rho$ is then finite and given by the formula
$$\langle A\rangle_\rho = \operatorname{Tr}(\rho A) = \int a(z)\,\rho(z)\,d^{2n}z.$$
Proof. By linearity it suffices to consider the case where $\rho = |\psi\rangle\langle\psi|$, so that $\rho = W\psi$; this reduces the proof of formula (57) to that of the simpler equality
$$\langle A\psi|\psi\rangle = \int a(z)\,W\psi(z)\,d^{2n}z.$$
Replacing in the equality above $W\psi(x,p)$ by its integral expression (47) yields a triple integral. Since the "goodness" assumption (56) is satisfied, we can use Fubini's theorem and rewrite this equality as a double integral. Let us perform the change of variables $x' = x + \frac12 y$ and $y' = x - \frac12 y$; we have $x = \frac12(x' + y')$ and $y = x' - y'$ and hence, using definition (35) of the Weyl operator $A$, we obtain precisely $\langle A\psi|\psi\rangle$, which we set out to prove. We remark that the identity (58) can be extended to the cross-Wigner function (49); in fact, adapting the proof of (58) one sees that if $\psi$ and $\phi$ are square integrable, then
$$\langle A\psi|\phi\rangle = \int a(z)\,W(\psi,\phi)(z)\,d^{2n}z.$$
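As a sanity check of formula (57), one can compute the phase-space average of a simple good observable against a known Wigner function ($\hbar = 1$, $n = 1$; the closed form of the Gaussian Wigner function is assumed here, and the example is entirely ours):

```python
import numpy as np

# hbar = 1; the ground-state Gaussian psi(x) = pi^(-1/4) exp(-x^2/2) has Wigner
# function W(x,p) = (1/pi) exp(-(x^2 + p^2))  (standard closed form, assumed)
x = np.linspace(-6, 6, 400)
p = np.linspace(-6, 6, 400)
X, P = np.meshgrid(x, p, indexing="ij")
W = np.exp(-(X ** 2 + P ** 2)) / np.pi
dz = (x[1] - x[0]) * (p[1] - p[0])

# "Good observable" a(x,p) = x^2 + p^2 (Weyl symbol of x^2 + p^2 operators):
# its phase-space average against W
avg = np.sum((X ** 2 + P ** 2) * W) * dz

# Direct quantum average: <x^2> = <p^2> = 1/2 for the ground state, so the
# expected value is 1
assert abs(avg - 1.0) < 1e-4
```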

The displacement operator and the ambiguity function
In this subsection we review a few properties of the Wigner function which are perhaps not all so well-known in quantum mechanics; these properties are important because they give an insight into some of the subtleties of the Weyl-Wigner-Moyal transform. We also define a related transform, the ambiguity function.

Redefinition of the Wigner function
Recall that the reflection operator (37) is explicitly given by the formula
$$\Pi(x_0,p_0)\psi(x) = e^{\frac{2i}{\hbar}p_0\cdot(x - x_0)}\,\psi(2x_0 - x).$$
It can be used to define the Wigner function in a very concise way. In fact, for every $\psi \in L^2(\mathbb R^n)$ we have
$$W\psi(z_0) = \Big(\frac{1}{\pi\hbar}\Big)^n \langle\psi|\Pi(z_0)\psi\rangle.$$
This is easy to verify: writing out $\langle\psi|\Pi(x_0,p_0)\psi\rangle$ using definition (37) and performing an elementary change of variables proves (60), taking definition (47) of the Wigner function into account. Formula (60) shows quite explicitly that, up to the factor $(\pi\hbar)^{-n}$, the Wigner function is the probability amplitude for the state $|\psi\rangle$ to be in the state $|\Pi(x_0,p_0)\psi\rangle$; this was actually already observed by Grossmann [45] and Royer [83] in the mid-1970s.

The Moyal identity
An important equality satisfied by the Wigner function is Moyal's identity 6 (see [34,40] for a proof):
$$\int W\psi(z)\,W\phi(z)\,d^{2n}z = \Big(\frac{1}{2\pi\hbar}\Big)^n |\langle\psi|\phi\rangle|^2;$$
it is valid for all square-integrable functions $\psi$ and $\phi$ on $\mathbb R^n$. In particular:
$$\int \big(W\psi(z)\big)^2\,d^{2n}z = \Big(\frac{1}{2\pi\hbar}\Big)^n \|\psi\|^4.$$
This formula implies the following interesting fact, which is not immediately obvious: consider the spectral decomposition (23) of a density operator in Theorem 3: $\rho\psi = \sum_j \lambda_j\langle\psi_j|\psi\rangle\psi_j$; 6 It is sometimes also called the "orthogonality relation".
here the $\lambda_j$ are the eigenvalues of $\rho$ and the corresponding eigenvectors $\psi_j$ form an orthonormal basis of $H$. When $H = L^2(\mathbb R^n)$ the corresponding Wigner function is therefore $\rho(z) = \sum_j \lambda_j W\psi_j(z)$. It follows from Moyal's identity (61) that the functions $(2\pi\hbar)^{n/2}W\psi_j$ form an orthonormal system of vectors in the Hilbert space $L^2(\mathbb R^{2n})$ (but not a basis, as is easily seen by "dimension count"). As a consequence of the Moyal identity we now prove the fact, mentioned in Section 4.2.1, that the Wigner function of a density matrix is square integrable.
Proof. Since $L^2(\mathbb R^{2n})$ is a vector space it is sufficient to consider the pure case, that is, to prove that $W\psi \in L^2(\mathbb R^{2n})$ if $\psi \in L^2(\mathbb R^n)$. But this immediately follows from Moyal's identity (62).
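Moyal's identity itself is easy to test numerically on states whose Wigner functions are known in closed form. The sketch below ($\hbar = 1$, $n = 1$; the Gaussian and displaced-Gaussian Wigner functions and their overlap are assumed closed forms, and the example is ours) compares both sides of (61):

```python
import numpy as np

# hbar = 1. Assumed closed forms: the ground-state Gaussian has Wigner function
# (1/pi) exp(-(x^2+p^2)); displacing the state by (x0,p0) shifts it in phase space.
x = np.linspace(-8, 8, 400)
p = np.linspace(-8, 8, 400)
X, P = np.meshgrid(x, p, indexing="ij")
dz = (x[1] - x[0]) * (p[1] - p[0])

x0, p0 = 1.3, -0.7
W_psi = np.exp(-(X ** 2 + P ** 2)) / np.pi
W_phi = np.exp(-((X - x0) ** 2 + (P - p0) ** 2)) / np.pi

# Moyal identity: Integral W_psi W_phi d^2z = (1/2 pi hbar) |<psi|phi>|^2,
# with |<psi|phi>|^2 = exp(-(x0^2 + p0^2)/2) for these displaced Gaussians
lhs = np.sum(W_psi * W_phi) * dz
rhs = np.exp(-(x0 ** 2 + p0 ** 2) / 2) / (2 * np.pi)
assert abs(lhs - rhs) < 1e-6
```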
The Moyal identity can be extended to the cross-Wigner function (49); recall that for $\psi, \phi \in L^2(\mathbb R^n)$ it is defined by formula (63). In fact, for all $\psi, \psi', \phi, \phi' \in L^2(\mathbb R^n)$ we have
$$\langle W(\psi,\phi)|W(\psi',\phi')\rangle = \Big(\frac{1}{2\pi\hbar}\Big)^n \langle\psi|\psi'\rangle\,\langle\phi|\phi'\rangle^*$$
(see for instance de Gosson [34,40]); here $\langle\cdot|\cdot\rangle$ denotes the inner product on $L^2(\mathbb R^{2n})$. In particular
$$\|W(\psi,\phi)\|_{L^2(\mathbb R^{2n})}^2 = \Big(\frac{1}{2\pi\hbar}\Big)^n \|\psi\|^2\,\|\phi\|^2.$$
An important remark: one can prove [34,40], using this generalized Moyal identity, that if the vectors $\psi_j$ form an orthonormal basis of $L^2(\mathbb R^n)$ then the vectors $(2\pi\hbar)^{n/2}W(\psi_j,\psi_k)$ form an orthonormal basis of the space $L^2(\mathbb R^{2n})$ (that these vectors are orthonormal is clear from (65)).

The ambiguity function
A transform closely related to the Wigner function and well-known from signal analysis (especially radar theory) is the ambiguity function $\mathrm{Amb}\,\psi$ (it is also called the "auto-correlation function"). It can be introduced in several equivalent ways; we begin by defining it explicitly by a formula: for $\psi \in L^2(\mathbb R^n)$ we set
$$\mathrm{Amb}\,\psi(x,p) = \Big(\frac{1}{2\pi\hbar}\Big)^n \int e^{-\frac{i}{\hbar}p\cdot y}\,\psi\big(y + \tfrac12 x\big)\,\psi^*\big(y - \tfrac12 x\big)\,d^ny.$$
Comparing with the definition of the Wigner function one cannot help being surprised by the similarity of both definitions. In fact, it is easy to show by performing an elementary change of variables that if $\psi$ is an even function (that is, $\psi(-x) = \psi(x)$ for all $x \in \mathbb R^n$) then $W\psi$ and $\mathrm{Amb}\,\psi$ are related by $W\psi(z) = 2^n\,\mathrm{Amb}\,\psi(2z)$. There are two complementary "natural" ways to define the ambiguity function. The first is to observe that the Wigner function and the ambiguity function are symplectic Fourier transforms of each other: $\mathrm{Amb}\,\psi = F_\sigma W\psi$ and $W\psi = F_\sigma\,\mathrm{Amb}\,\psi$; these relations are of course equivalent since $F_\sigma^{-1} = F_\sigma$. For a proof, see de Gosson [34,40]. There is still another way to define the ambiguity function. Let $D(z_0) = D(x_0,p_0)$ be the Weyl displacement operator (it is also called the Glauber-Sudarshan displacement operator, or the Heisenberg operator, or the Heisenberg-Weyl operator). It is defined by
$$D(z_0)\psi(x) = e^{\frac{i}{\hbar}(p_0\cdot x - \frac12 p_0\cdot x_0)}\,\psi(x - x_0).$$
This operator is the time-one propagator for the Schrödinger equation associated with the classical translation Hamiltonian $\sigma(z,z_0) = x_0\cdot p - p_0\cdot x$ (see the discussions in de Gosson [33,34,39] and Littlejohn [58]); this observation motivates the notation often found in the literature. We are using here the coordinate expression of the displacement operator; we leave it to the reader as an exercise to check that $D(z_0)$ coincides with the operator $e^{\alpha a^\dagger - \alpha^* a}$ commonly used in quantum optics ($a$ and $a^\dagger$ are the annihilation and creation operators; see Potoček [80] for a discussion of these notational issues). The displacement operator is related to the reflection operator $\Pi(z_0)$ by the simple formula
$$\Pi(z_0) = D(z_0)\,\Pi\,D(z_0)^\dagger,$$
where $\Pi$ is the parity operator $\Pi\psi(x) = \psi(-x)$.
That the operators $D(z_0)$ correspond to translations in quantum phase space is illustrated by the following important relation satisfied by the Wigner transform:
$$W\big(D(z_0)\psi\big)(z) = W\psi(z - z_0)$$
(it is easily proven by a direct computation, see de Gosson [33,34,40], Littlejohn [58]). Using the displacement operator, the ambiguity function is given by
$$\mathrm{Amb}\,\psi(z_0) = \Big(\frac{1}{2\pi\hbar}\Big)^n \langle D(z_0)\psi|\psi\rangle;$$
one verifies by a direct calculation using (70) that one recovers the first analytical definition (67). The displacement operators play a very important role, not only in quantum mechanics, but also in related disciplines such as harmonic analysis, signal theory, and time-frequency analysis. They can be viewed as a representation of the canonical commutation relations (the Schrödinger representation of the Heisenberg group); this is related to the fact that these operators satisfy
$$D(z_0)D(z_1) = e^{\frac{i}{\hbar}\sigma(z_0,z_1)}\,D(z_1)D(z_0)$$
and also
$$D(z_0)D(z_1) = e^{\frac{i}{2\hbar}\sigma(z_0,z_1)}\,D(z_0 + z_1).$$
The second formula shows that the displacement operators form a projective representation of the phase space translation group. In addition to being used to define the ambiguity function, the displacement operators allow one to define Weyl operators in terms of their "twisted symbol" (sometimes also called "covariant symbol"), which is by definition the symplectic Fourier transform $a_\sigma(z) = F_\sigma a(z)$ of the ordinary symbol $a$. Let in fact $A = \mathrm{Op}_W(a)$, that is (formula (36)). Using the displacement operator $D(z_0)$ in place of the reflection operator $\Pi(z_0)$ we have
$$A = \Big(\frac{1}{2\pi\hbar}\Big)^n \int a_\sigma(z_0)\,D(z_0)\,d^{2n}z_0$$
(see [33,34,40,58]). This formula has many applications; it is essential in the study of the positivity properties of trace class operators, as we will see in a moment. Notice that formula (78) is Weyl's original definition [96] in disguise: making the change of variables $z_0 \longmapsto -Jz_0$ in this formula one gets, noting that $a_\sigma(-Jz_0) = Fa(z_0)$, the formula originally proposed by Weyl [96], in analogy with the Fourier inversion formula (see the discussion in de Gosson [39,40]).
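The projective composition law can be checked directly on a grid using the coordinate expression of $D(z_0)$ given above ($\hbar = 1$, one degree of freedom; shifts are taken as exact grid multiples so that only the phase algebra is tested, and the sign convention for $\sigma$ follows the one used in this section; the numerical sketch is ours):

```python
import numpy as np

hbar = 1.0
dx = 0.01
x = np.arange(-10, 10, dx)
psi = (np.pi ** -0.25 * np.exp(-x ** 2 / 2)).astype(complex)

def displace(x0_steps, p0, f):
    # Coordinate expression D(x0,p0)f(x) = exp(i(p0 x - p0 x0/2)/hbar) f(x - x0);
    # x0 = x0_steps*dx is a grid multiple, so the shift is exact (psi ~ 0 at the
    # edges, so the wraparound of np.roll is harmless)
    x0 = x0_steps * dx
    return np.exp(1j * (p0 * x - p0 * x0 / 2) / hbar) * np.roll(f, x0_steps)

# Projective (Weyl) relation: D(z0)D(z1) = exp(i sigma(z0,z1)/2 hbar) D(z0+z1),
# with sigma(z0,z1) = p0 x1 - p1 x0 in this convention
s0, p0 = 80, -0.5    # z0 = (0.80, -0.5)
s1, p1 = -30, 1.1    # z1 = (-0.30, 1.1)
lhs = displace(s0, p0, displace(s1, p1, psi))
phase = np.exp(1j * (p0 * s1 * dx - p1 * s0 * dx) / (2 * hbar))
rhs = phase * displace(s0 + s1, p0 + p1, psi)
assert np.allclose(lhs, rhs)
```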

Kernels and symbols
It is tempting to redefine the trace of a Weyl operator $A = \mathrm{Op}_W(a)$ by the formula
$$\operatorname{Tr}(A) = \Big(\frac{1}{2\pi\hbar}\Big)^n \int a(z)\,d^{2n}z.$$
But in doing this one should not forget that even if the operator $A$ is of trace class, formula (79) need not give the actual trace. First, the integral in the right-hand side might not be convergent; secondly, even if it is, we have to prove that it really yields the right result. We will discuss the validity of formula (79) and of other similar formulas below, but let us first prove some intermediary results. We begin by discussing the Weyl symbols of Hilbert-Schmidt and trace class operators.
Theorem 10 Let $A = \mathrm{Op}_W(a)$ be a Hilbert-Schmidt operator with kernel $K$. Then $a \in L^2(\mathbb R^{2n})$ and we have
$$\int |a(z)|^2\,d^{2n}z = (2\pi\hbar)^n \iint |K(x,y)|^2\,d^nx\,d^ny.$$
Conversely, every Weyl operator with symbol a ∈ L 2 (R 2n ) is a Hilbert-Schmidt operator.
Proof. In view of Theorem 6 we have $K \in L^2(\mathbb R^n \times \mathbb R^n)$. Let us prove formula (80) when $K \in \mathcal S(\mathbb R^n \times \mathbb R^n)$; it will then hold by continuity for arbitrary $K \in L^2(\mathbb R^n \times \mathbb R^n)$ in view of the density of $\mathcal S(\mathbb R^n \times \mathbb R^n)$ in $L^2(\mathbb R^n \times \mathbb R^n)$. In view of formula (33) relating the kernel and the symbol of a Weyl operator, the Plancherel theorem yields
$$\int |a(x,p)|^2\,d^np = (2\pi\hbar)^n \int \big|K\big(x + \tfrac12 y,\, x - \tfrac12 y\big)\big|^2\,d^ny.$$
Integrating this equality with respect to the $x$ variables we get, using Fubini's theorem, an integral over all of phase space. Set now $x' = x + \frac12 y$ and $y' = x - \frac12 y$; we have $d^nx'\,d^ny' = d^nx\,d^ny$ and hence
$$\int |a(z)|^2\,d^{2n}z = (2\pi\hbar)^n \iint |K(x',y')|^2\,d^nx'\,d^ny',$$
which we set out to prove. The converse is obvious since the condition $a \in L^2(\mathbb R^{2n})$ is equivalent to $K \in L^2(\mathbb R^n \times \mathbb R^n)$ in view of the equality above.
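Formula (80) is easy to test on a rank-one example: for the projector onto the Gaussian ground state, both the kernel and the Weyl symbol are known in closed form (assumed below; $\hbar = 1$, $n = 1$; the numerical check is ours):

```python
import numpy as np

# hbar = 1, n = 1. Rank-one kernel K(x,y) = psi(x) psi(y) with the ground-state
# Gaussian; its Weyl symbol is a(x,p) = 2 pi hbar W psi(x,p) = 2 exp(-(x^2+p^2))
# (closed forms assumed for this sanity check).
x = np.linspace(-6, 6, 300)
dx = x[1] - x[0]
psi = np.pi ** -0.25 * np.exp(-x ** 2 / 2)

K = np.outer(psi, psi)
int_K2 = np.sum(np.abs(K) ** 2) * dx * dx            # integral of |K(x,y)|^2

X, P = np.meshgrid(x, x, indexing="ij")
a = 2 * np.exp(-(X ** 2 + P ** 2))
int_a2 = np.sum(np.abs(a) ** 2) * dx * dx            # integral of |a(x,p)|^2

# Theorem 10: Integral |a|^2 d^2z = (2 pi hbar)^n * Integral |K|^2 dx dy
assert abs(int_a2 - 2 * np.pi * int_K2) < 1e-6
```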

Rigorous results
We now specialize our discussion to the case $H = L^2(\mathbb R^n)$. Recall that we showed in Section 3.2 that the formula $\langle A|B\rangle_{HS} = \operatorname{Tr}(AB^\dagger)$ defines an inner product on the ideal $L^2(H)$ of Hilbert-Schmidt operators on $H$, the associated Hilbert-Schmidt norm being defined by $\|A\|_{HS}^2 = \operatorname{Tr}(A^\dagger A)$. Let us state and prove the following result: (i) If $A = \mathrm{Op}_W(a)$ and $B = \mathrm{Op}_W(b)$ are Hilbert-Schmidt operators on $L^2(\mathbb R^n)$ then
$$\operatorname{Tr}(AB) = \Big(\frac{1}{2\pi\hbar}\Big)^n \int a(z)\,b(z)\,d^{2n}z.$$
(ii) The Hilbert-Schmidt inner product is given by the convergent integral
$$\langle A|B\rangle_{HS} = \Big(\frac{1}{2\pi\hbar}\Big)^n \int a(z)\,b(z)^*\,d^{2n}z,$$
and hence $\|A\|_{HS}^2 = \operatorname{Tr}(A^\dagger A)$ is given by
$$\|A\|_{HS}^2 = \Big(\frac{1}{2\pi\hbar}\Big)^n \int |a(z)|^2\,d^{2n}z.$$
Proof. (i) We first observe that in view of Theorem 10 we have $a \in L^2(\mathbb R^{2n})$ and $b \in L^2(\mathbb R^{2n})$, hence the integrals in (85) and (86) are absolutely convergent. Let $(\psi_j)$ be an orthonormal basis of $L^2(\mathbb R^n)$; by definition of the trace we have $\operatorname{Tr}(AB) = \sum_j \langle AB\psi_j|\psi_j\rangle$. Expanding $B\psi_j$ in the basis $(\psi_\ell)$ we get $B\psi_j = \sum_\ell \langle B\psi_j|\psi_\ell\rangle\psi_\ell$ and hence, using the Bessel equality (16),
$$\operatorname{Tr}(AB) = \sum_{j,\ell} \langle B\psi_j|\psi_\ell\rangle\,\langle A\psi_\ell|\psi_j\rangle.$$
In view of formula (59) each factor can be written as an integral of the symbol against a cross-Wigner transform $W(\psi_j,\psi_k)$ (49); denoting by $\langle\cdot|\cdot\rangle$ the inner product on $L^2(\mathbb R^{2n})$, it follows from the extended Moyal identity (64) that the double sum can be evaluated explicitly. Since $(\psi_j)$ is an orthonormal basis, the vectors $(2\pi\hbar)^{n/2}W(\psi_j,\psi_k)$ also form an orthonormal basis (see the remark following formula (66)), hence the Bessel identity (16) allows us to write the equality above as
$$\operatorname{Tr}(AB) = \Big(\frac{1}{2\pi\hbar}\Big)^n \int a(z)\,b(z)\,d^{2n}z,$$
which is formula (85). (ii) This immediately follows from formula (85) using (83) and (84), recalling that if $A = \mathrm{Op}_W(a)$ then $A^\dagger = \mathrm{Op}_W(a^*)$. Part (i) of the result above allows us, at last, to express the trace of a Weyl operator in terms of its symbol provided that the latter is absolutely integrable: in view of formula (43), giving the twisted symbol of the product of two Weyl operators, and of the Plancherel identity (6) for $F_\sigma$, one obtains $\operatorname{Tr}(A) = \big(\frac{1}{2\pi\hbar}\big)^n \int a(z)\,d^{2n}z$ for $a \in L^1(\mathbb R^{2n})$, which proves the formula (90).
The following result is very much in the spirit of the $C^*$-algebraic approach outlined in Section 2.1.3 (cf. Gracia-Bondía and Varilly [28,29]). Let us denote by $\mathrm{Op}_W$ the Weyl transform $a \longmapsto A = \mathrm{Op}_W(a)$; it associates to every symbol the corresponding Weyl operator.
Theorem 13 (i) $\mathrm{Op}_W$ is an isomorphism of the Hilbert space $L^2(\mathbb R^{2n})$ onto the algebra $L^2(L^2(\mathbb R^n))$ of Hilbert-Schmidt operators on $L^2(\mathbb R^n)$; (ii) ...

Proof. (i) In view of Theorem 10 a bounded operator is Hilbert-Schmidt if and only if its Weyl symbol $a$ is in $L^2(\mathbb R^{2n})$. The Weyl correspondence being linear and one-to-one, the statement follows.

Metaplectic Group and Symplectic Covariance
For a complete study of the metaplectic group in quantum mechanics see our book [38]; on a slightly more general and technical level see [33,34]. An excellent introduction to the symplectic group is given in Arvind et al. [2], also see García-Bullé et al. [30].

The generators of Sp(n)
Recall that the symplectic form on phase space $\mathbb R^{2n}$ can be defined by $\sigma(z,z') = (z')^T Jz$, where
$$J = \begin{pmatrix} 0_{n\times n} & I_{n\times n} \\ -I_{n\times n} & 0_{n\times n} \end{pmatrix}$$
is the standard symplectic matrix (we use the notation $z = (x,p)$, $z' = (x',p')$). By definition, the symplectic group $\mathrm{Sp}(n)$ consists of all real $2n \times 2n$ matrices $S$ such that $\sigma(Sz,Sz') = \sigma(z,z')$ for all vectors $z, z'$; such a matrix is called a symplectic matrix. Rewriting this condition as $(Sz')^T JSz = (z')^T Jz$ we thus have $S \in \mathrm{Sp}(n)$ if and only if $S^T JS = J$. It is an easy exercise to show that if $S$ is symplectic then $S^{-1}$ and $S^T$ are symplectic as well, hence this defining relation is equivalent to $SJS^T = J$. The symplectic group plays an essential role in classical mechanics in its Hamiltonian formulation; its role in quantum mechanics is no less important, in association with its double covering, the metaplectic group $\mathrm{Mp}(n)$, which we briefly describe now. There are several ways to introduce the metaplectic group. We begin by giving a definition using the notion of free symplectic matrix and its generating function (we are following here our presentation in [33,34]). Let
$$S = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in \mathrm{Sp}(n),$$
where $A, B, C, D$ are $n \times n$ blocks. One says that the block-matrix (91) is a free symplectic matrix if $B$ is invertible, i.e. $\det B \neq 0$. To a free symplectic matrix is associated a generating function: it is the quadratic form
$$\mathcal A(x,x') = \tfrac12\,DB^{-1}x\cdot x - B^{-1}x\cdot x' + \tfrac12\,B^{-1}Ax'\cdot x'.$$
The terminology comes from the fact that the knowledge of $\mathcal A(x,x')$ uniquely determines the free symplectic matrix $S$: we have
$$(x,p) = S(x',p') \Longleftrightarrow p = \partial_x\mathcal A(x,x') \text{ and } p' = -\partial_{x'}\mathcal A(x,x'),$$
as can be verified by a direct calculation. The interest of the notion of free symplectic matrix comes from the fact that such matrices generate the symplectic group $\mathrm{Sp}(n)$. More precisely, every $S \in \mathrm{Sp}(n)$ can be written as a product $S = S_{\mathcal A}S_{\mathcal A'}$ of two free symplectic matrices (we place the corresponding generating functions $\mathcal A$ and $\mathcal A'$ as subscripts).
Defining, for symmetric $P$ and invertible $L$, the symplectic matrices $V_{-P}$ and $M_L$ by
$$V_{-P} = \begin{pmatrix} I & 0 \\ P & I \end{pmatrix}, \qquad M_L = \begin{pmatrix} L^{-1} & 0 \\ 0 & L^T \end{pmatrix},$$
a straightforward calculation shows that the free symplectic matrix $S_{\mathcal A}$ can be factored as
$$S_{\mathcal A} = V_{-DB^{-1}}\,M_{B^{-1}}\,J\,V_{-B^{-1}A}.$$
This implies that the symplectic group $\mathrm{Sp}(n)$ is generated by the set of all matrices $V_{-P}$ and $M_L$ together with $J$. It is easy to deduce from this that the determinant of a symplectic matrix is always equal to one, since we obviously have $\det V_{-P} = \det M_L = \det J = 1$.
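These generators and their properties can be verified numerically. The sketch below (our own, with a random symmetric $P$ and an invertible $L$) checks that $V_{-P}$, $M_L$, $J$ and their products satisfy $S^TJS = J$ and $\det S = 1$:

```python
import numpy as np

n = 2
I, Z = np.eye(n), np.zeros((n, n))
J = np.block([[Z, I], [-I, Z]])          # standard symplectic matrix

def is_symplectic(S):
    return np.allclose(S.T @ J @ S, J)

rng = np.random.default_rng(2)
P = rng.standard_normal((n, n)); P = P + P.T          # symmetric P
L = rng.standard_normal((n, n)) + 3 * np.eye(n)       # (generically) invertible L
V = np.block([[I, Z], [P, I]])                        # shear V_{-P}
M = np.block([[np.linalg.inv(L), Z], [Z, L.T]])       # rescaling M_L

# All generators and arbitrary products of them are symplectic with det = 1
for S in (J, V, M, V @ M @ J, J @ V @ M):
    assert is_symplectic(S)
    assert np.isclose(np.linalg.det(S), 1.0)
```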

Generalized Fourier transforms
Now, to every free symplectic matrix $S_{\mathcal A}$ we associate two operators $\hat S_{\mathcal A,m}$ by the formula
$$\hat S_{\mathcal A,m}\psi(x) = \Big(\frac{1}{2\pi i\hbar}\Big)^{n/2} i^m \sqrt{|\det B^{-1}|}\int e^{\frac{i}{\hbar}\mathcal A(x,x')}\psi(x')\,d^nx',$$
where $m$ corresponds to a choice of argument for $\det B^{-1}$: $m = 0 \bmod 2$ if $\det B^{-1} > 0$ and $m = 1 \bmod 2$ if $\det B^{-1} < 0$. It is not difficult to prove that these generalized Fourier transforms $\hat S_{\mathcal A,m}$ are unitary operators on $L^2(\mathbb R^n)$. These operators generate a group: the metaplectic group $\mathrm{Mp}(n)$. One shows that, as for the symplectic group, every $\hat S \in \mathrm{Mp}(n)$ can be written (non-uniquely) as a product $\hat S_{\mathcal A,m}\hat S_{\mathcal A',m'}$. This group is a double covering of $\mathrm{Sp}(n)$, the covering projection being simply defined by $\pi^{\mathrm{Mp}}(\hat S_{\mathcal A,m}) = S_{\mathcal A}$. Here are three examples of free (or nearly free) symplectic matrices and of their metaplectic counterparts; these can be used to give an alternative definition of the metaplectic group: • The standard symplectic matrix $J$ has as generating function $\mathcal A(x,x') = -x\cdot x'$ and hence the two corresponding metaplectic operators are $\pm\hat J$, with $\hat J = i^{-n/2}F$ where $F$ is the usual ($\hbar$-dependent) Fourier transform; • The symplectic shear $V_{-P} = \begin{pmatrix} I & 0 \\ P & I \end{pmatrix}$ ($P = P^T$) is not free, but the product $V_{-P}J$ is, with generating function $\tfrac12 Px\cdot x - x\cdot x'$; the corresponding metaplectic operators are hence $\pm\hat U_{-P}$ with $\hat U_{-P}\psi(x) = e^{\frac{i}{2\hbar}Px\cdot x}\psi(x)$; • The symplectic rescaling matrix $M_L = \begin{pmatrix} L^{-1} & 0 \\ 0 & L^T \end{pmatrix}$ is not free, but the product $M_LJ$ is; the corresponding metaplectic operators are $\pm\hat M_{L,m}$ with $\hat M_{L,m}\psi(x) = i^m\sqrt{|\det L|}\,\psi(Lx)$. An easy calculation shows that, similarly to the factorization (96) of free symplectic matrices, the quadratic Fourier transform (97) can be written
$$\hat S_{\mathcal A,m} = \hat U_{-DB^{-1}}\,\hat M_{B^{-1},m}\,\hat J\,\hat U_{-B^{-1}A},$$
where $\hat M_{L,m}$ and $\hat J$ are defined as above and $\hat U_{-P}\psi(x) = e^{\frac{i}{2\hbar}Px\cdot x}\psi(x)$ when $P = P^T$. It follows that the elementary operators $\hat U_{-P}$, $\hat M_{L,m}$ and $\hat J$ generate $\mathrm{Mp}(n)$ (these operators are used in many texts to define the metaplectic group; our approach using (97) has some advantages since, among other things, it makes immediately clear that metaplectic operators are generalized Fourier transforms).

A conjugation property for the displacement and reflection operators
Everything here stems from the following observation: let $\hat S \in \mathrm{Mp}(n)$ have projection $S \in \mathrm{Sp}(n)$ (the symplectic matrix $S$ is thus "covered" by the two metaplectic operators $\pm\hat S$). Then for every phase space point $z_0 = (x_0,p_0)$ the displacement operators $D(Sz_0)$ and $D(z_0)$ are related by the conjugation formula
$$D(Sz_0) = \hat S D(z_0)\hat S^\dagger$$
(recall that $\hat S^\dagger = \hat S^{-1}$ since metaplectic operators are unitary). This formula is most easily proven using the generators of $\mathrm{Sp}(n)$ and the corresponding generators of $\mathrm{Mp}(n)$; for a complete proof see for instance de Gosson [34], §8.1.3, also [58]. It easily follows from (99) that the reflection operator (37) satisfies a similar relation:
$$\Pi(Sz_0) = \hat S\,\Pi(z_0)\,\hat S^\dagger.$$
In fact, recalling formula (72), to get (100) we have to show that $\hat S^\dagger\Pi\hat S = \Pi$, since we will then have
$$\hat S\,\Pi(z_0)\,\hat S^\dagger = \hat S D(z_0)\Pi D(z_0)^\dagger\hat S^\dagger = D(Sz_0)\,\hat S\Pi\hat S^\dagger\,D(Sz_0)^\dagger = \Pi(Sz_0).$$
It suffices for that purpose to show that $\hat S_{\mathcal A,m}^\dagger\Pi\hat S_{\mathcal A,m} = \Pi$, since the generalized Fourier transforms $\hat S_{\mathcal A,m}$ generate the metaplectic group. Now, $\Pi\hat S_{\mathcal A,m}\psi(x) = \hat S_{\mathcal A,m}\psi(-x)$; hence, by (97), making the change of variables $x' \longmapsto -x'$ in the integral and noting that the quadratic form satisfies $\mathcal A(-x,-x') = \mathcal A(x,x')$ (formula (94)), we get $\Pi\hat S_{\mathcal A,m} = \hat S_{\mathcal A,m}\Pi$; it follows that we have $\hat S_{\mathcal A,m}^\dagger\Pi\hat S_{\mathcal A,m} = \hat S_{\mathcal A,m}^\dagger\hat S_{\mathcal A,m}\Pi = \Pi$, as claimed.

Symplectic covariance
Collecting the facts above we have: Theorem 14 Let $z = (x,p)$ be a point in the phase space $\mathbb R^{2n}$ and $\hat S$ a metaplectic operator with projection $\pi^{\mathrm{Mp}}(\hat S) = S$ in $\mathrm{Sp}(n)$. (i) We have
$$W(\hat S\psi)(z) = W\psi(S^{-1}z), \qquad \mathrm{Amb}(\hat S\psi)(z) = \mathrm{Amb}\,\psi(S^{-1}z).$$
(ii) For every symbol $a$ we have
$$\hat S\,\mathrm{Op}_W(a)\,\hat S^\dagger = \mathrm{Op}_W(a\circ S^{-1}),$$
where $a\circ S^{-1}(z) = a(S^{-1}z)$.
Proof. (i) To prove the first identity (101) we recall (formula (60)) that $W\psi(z) = \big(\frac{1}{\pi\hbar}\big)^n\langle\psi|\Pi(z)\psi\rangle$ and hence, using (100) and the unitarity of metaplectic operators,
$$W(\hat S\psi)(z) = \Big(\frac{1}{\pi\hbar}\Big)^n\langle\hat S\psi|\Pi(z)\hat S\psi\rangle = \Big(\frac{1}{\pi\hbar}\Big)^n\langle\psi|\hat S^\dagger\Pi(z)\hat S\psi\rangle = \Big(\frac{1}{\pi\hbar}\Big)^n\langle\psi|\Pi(S^{-1}z)\psi\rangle,$$
which is precisely (101). The proof of the second identity (101) is similar, using the definition (74) of the ambiguity function together with property (99). (ii) Recall (formula (36)) that the Weyl operator $A = \mathrm{Op}_W(a)$ can be written $A = \big(\frac{1}{\pi\hbar}\big)^n\int a(z)\,\Pi(z)\,d^{2n}z$ and hence, using (100) and performing the change of variables $z' = Sz$, we have, since $\det S = 1$,
$$\hat S A\hat S^\dagger = \Big(\frac{1}{\pi\hbar}\Big)^n\int a(z)\,\Pi(Sz)\,d^{2n}z = \Big(\frac{1}{\pi\hbar}\Big)^n\int a(S^{-1}z')\,\Pi(z')\,d^{2n}z',$$
as claimed. Applying the machinery above to the density matrix we get: Corollary 15 Let $\{(\psi_j,\alpha_j)\}$ be a mixed state with density matrix $\rho$ and Wigner function $\rho(z)$. Let $\hat S \in \mathrm{Mp}(n)$. The mixed state $\{(\hat S\psi_j,\alpha_j)\}$ has density matrix $\hat S\rho\hat S^\dagger$ and Wigner function $\rho(S^{-1}z)$, where $S = \pi^{\mathrm{Mp}}(\hat S)$.
Proof. We have $W(\hat S\psi_j)(z) = W\psi_j(S^{-1}z)$ because of the first formula (101), and hence, by linearity, the Wigner function of $\{(\hat S\psi_j,\alpha_j)\}$ is $\rho(S^{-1}z)$. It follows that the Weyl symbol of the density matrix corresponding to $\{(\hat S\psi_j,\alpha_j)\}$ is $a(S^{-1}z)$, where $a = (2\pi\hbar)^n\rho$ is the Weyl symbol of $\rho$; that $\hat S\rho\hat S^\dagger$ is the density matrix of $\{(\hat S\psi_j,\alpha_j)\}$ follows from formula (102). Note that the fact that $\hat S\rho\hat S^\dagger$ is the density matrix of $\{(\hat S\psi_j,\alpha_j)\}$ can also be proven directly using the definition (8) of the density matrix in terms of projectors.

Variable Planck Constant
We now address one of the central themes of this Review, namely the mathematical consequences of possible changes in the value of Planck's constant h.

A consequence of Moyal's identity
We begin with a few straightforward observations involving the Moyal identity introduced in Section 4.3.2. Let $\eta$ be a real parameter; we assume for the moment that $\eta > 0$. This parameter will play the role of a variable $\hbar = h/2\pi$. For any square integrable $\psi$ we define the $\eta$-Wigner transform (or distribution) of $\psi$ by replacing $\hbar$ by $\eta$ in the usual definition:
$$W_\eta\psi(x,p) = \Big(\frac{1}{2\pi\eta}\Big)^n \int e^{-\frac{i}{\eta}p\cdot y}\,\psi\big(x + \tfrac12 y\big)\,\psi^*\big(x - \tfrac12 y\big)\,d^ny.$$
Of course $W_\hbar\psi = W\psi$ (the usual Wigner transform). The mathematical properties of $W_\eta\psi$ are of course the same as those of $W\psi$, replacing $\hbar$ everywhere with $\eta$. In particular, replacing the $\hbar$-Fourier transform (55) with the $\eta$-Fourier transform $F_\eta$, the marginal properties (53) become
$$\int W_\eta\psi(x,p)\,d^np = |\psi(x)|^2, \qquad \int W_\eta\psi(x,p)\,d^nx = |F_\eta\psi(p)|^2.$$
The Moyal identity 7 becomes
$$\int W_\eta\psi(z)\,W_\eta\phi(z)\,d^{2n}z = \Big(\frac{1}{2\pi\eta}\Big)^n|\langle\psi|\phi\rangle|^2,$$
valid for all square integrable functions $\psi$ and $\phi$ (see de Gosson [40]). Let us now address the following question: for $\psi$ given, can we find $\phi$ such that $W_\eta\phi = W\psi$ for $\eta \neq \hbar$? The answer is "no"! More generally: Theorem 16 (i) A pure state $|\psi\rangle$ does not remain a pure state if we vary $\hbar$: let $W\psi$ be the Wigner function of $|\psi\rangle$. There does not exist any state $|\phi\rangle$ such that $W_\eta\phi = W\psi$ if $\eta \neq \hbar$. (ii) Assume that $|\psi\rangle$ becomes a mixed state when $\hbar$ is replaced with $\eta$. Then we must have $\eta \leq \hbar$.
Proof. (i) We have $\psi, \phi \in L^2(\mathbb R^n)$. Assume that $W_\eta\phi = W\psi$; then, using the marginal properties (53), $|\phi(x)|^2 = |\psi(x)|^2$, so that $\phi$ and $\psi$ have the same norm: $\|\phi\| = \|\psi\|$. On the other hand, squaring and integrating the equality $W\psi = W_\eta\phi$ and using the Moyal identity (107) on both sides, we get
$$\Big(\frac{1}{2\pi\hbar}\Big)^n\|\psi\|^4 = \Big(\frac{1}{2\pi\eta}\Big)^n\|\phi\|^4,$$
which is only possible if $\eta = \hbar$. (ii) Assume that there exists a sequence $(\phi_j)$ of (normalized) functions in $L^2(\mathbb R^n)$ and a sequence of positive constants $\alpha_j$ summing up to one such that $W\psi = \sum_j\alpha_j W_\eta\phi_j$. Proceeding as above we get, using again the marginal properties, $\|\psi\| = 1$. On the other hand, squaring $W\psi$ we get
$$(W\psi)^2 = \sum_{j,k}\alpha_j\alpha_k\,W_\eta\phi_j\,W_\eta\phi_k;$$
hence, integrating and using respectively the Moyal identity for $(W\psi)^2$ and $W_\eta\phi_j W_\eta\phi_k$, and the Cauchy-Schwarz inequality $|\langle\phi_j|\phi_k\rangle| \leq 1$, we get
$$\Big(\frac{1}{2\pi\hbar}\Big)^n = \sum_{j,k}\alpha_j\alpha_k\Big(\frac{1}{2\pi\eta}\Big)^n|\langle\phi_j|\phi_k\rangle|^2 \leq \Big(\frac{1}{2\pi\eta}\Big)^n,$$
which implies that $(2\pi\hbar)^{-n} \leq (2\pi\eta)^{-n}$, that is $\eta \leq \hbar$ as claimed.
A caveat: property (ii) in the theorem above does not say that a pure state automatically becomes a mixed state if we decrease Planck's constant. It merely says that if a pure state becomes mixed, it can only happen if Planck's constant has decreased. We will see later that this is related to the uncertainty principle.

Bochner's theorem
We begin by recalling Bochner's theorem on the Fourier transform of a probability density. That theorem says that a (complex valued) function $f$ on $\mathbb R^m$, continuous at the origin and such that $f(0) = 1$, is the characteristic function of a probability density on $\mathbb R^m$ if and only if it is of positive type, that is, if for all choices of points $z_1,\dots,z_N \in \mathbb R^m$ the $N \times N$ matrix $F^{(N)} = \big(f(z_j - z_k)\big)_{1\leq j,k\leq N}$ is positive semidefinite (that is, the eigenvalues of $F^{(N)}$ are all $\geq 0$). Let us introduce two modifications of the symplectic Fourier transform $F_\sigma$. First, we allow the latter to depend on an arbitrary parameter $\eta \neq 0$ and set
$$F_{\sigma,\eta}a(z) = a_{\sigma,\eta}(z) = \Big(\frac{1}{2\pi\eta}\Big)^n\int e^{-\frac{i}{\eta}\sigma(z,z')}a(z')\,d^{2n}z'.$$
It coincides with $F_\sigma$ when $\eta = \hbar$. We next define the reduced symplectic Fourier transform $F_\diamond$ by
$$F_\diamond a(z) = a_\diamond(z) = \int e^{-i\sigma(z,z')}a(z')\,d^{2n}z'.$$
Obviously $F_\diamond a$ and $F_{\sigma,\eta}a = a_{\sigma,\eta}$ are related by the simple formula $a_\diamond(z) = (2\pi\eta)^n a_{\sigma,\eta}(\eta z)$.
With this notation Bochner's theorem on Fourier transforms of probability measures can be restated in the following way: a real function $\rho$ on $\mathbb R^{2n}$ is a probability density if and only if its reduced symplectic Fourier transform $\rho_\diamond$ is continuous, $\rho_\diamond(0) = 1$, and for all choices of $z_1,\dots,z_N \in \mathbb R^{2n}$ the $N \times N$ matrix $\Lambda^{(N)}$ whose entries are the complex numbers $\rho_\diamond(z_j - z_k)$ is positive semidefinite:
$$\Lambda^{(N)} = \big(\rho_\diamond(z_j - z_k)\big)_{1\leq j,k\leq N} \geq 0.$$
When condition (113) is satisfied one says that the reduced symplectic Fourier transform $\rho_\diamond$ is of positive type.
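The positive-type condition is easy to visualize numerically: for the characteristic function of a centred Gaussian on $\mathbb R^2$ the matrix $\big(f(z_j - z_k)\big)$ is positive semidefinite for any choice of sample points (the illustration and all names in it are ours):

```python
import numpy as np

rng = np.random.default_rng(3)

# Characteristic function of a centred Gaussian on R^2 with covariance s2*I:
# f(z) = exp(-s2 |z|^2 / 2); it is continuous with f(0) = 1
s2 = 0.7
f = lambda z: np.exp(-s2 * np.dot(z, z) / 2)

# Bochner: the N x N matrix F_jk = f(z_j - z_k) is positive semidefinite
# for every choice of points z_1, ..., z_N
N = 12
pts = rng.standard_normal((N, 2)) * 3
F = np.array([[f(zj - zk) for zk in pts] for zj in pts])
eigs = np.linalg.eigvalsh(F)
assert eigs.min() > -1e-10   # positive semidefinite up to round-off
```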

The notion of η-positivity
The notion of η-positivity, due to Kastler [52], generalizes Bochner's notion. Let a ∈ S′(R^{2n}) and let η be a real number; we say that a_♦ is of η-positive type if for every integer N the N × N matrix Λ^(N) with entries

Λ_jk = e^{(iη/2)σ(z_j,z_k)} a_♦(z_j − z_k)

is positive semidefinite (which we write "Λ^(N) ≥ 0" for short) for all choices of (z_1, z_2, ..., z_N) ∈ (R^{2n})^N; this is condition (114). The condition (114) is equivalent to the polynomial inequalities

Σ_{1≤j,k≤N} λ_j λ̄_k e^{(iη/2)σ(z_j,z_k)} a_♦(z_j − z_k) ≥ 0   (115)

for all N ∈ N, λ_j, λ_k ∈ C, and z_j, z_k ∈ R^{2n}. If a is of η-positive type then it is also of (−η)-positive type, as is immediately seen by taking the complex conjugate of the left-hand side of (115). When η ≠ 0 we can rewrite conditions (114)–(115) using the symplectic η-Fourier transform: replacing (z_j, z_k) with η^{−1}(z_k, z_j) and noting that σ(z_k, z_j) = −σ(z_j, z_k), the conditions (114) are equivalent to the requirement that the matrix with entries e^{(i/2η)σ(z_j,z_k)} a_{σ,η}(z_j − z_k) be positive semidefinite. The polynomial conditions (115) become in this case

Σ_{1≤j,k≤N} λ_j λ̄_k e^{(i/2η)σ(z_j,z_k)} a_{σ,η}(z_j − z_k) ≥ 0 .   (117)

KLM condition and the quantum Bochner theorem
We are now going to prove an essential result (the "quantum Bochner theorem") originally due to Kastler [52], and Loupias and Miracle-Sole [59,60]; also see Parthasarathy [75,76] and Parthasarathy and Schmidt [77] for different points of view. The proof we will give is simpler than that in [52,59,60], which uses the theory of C*-algebras; our proof is partially based on the discussions in [68,70,95]. For this we will need a technical result from linear algebra, which says that the entrywise product of two positive semidefinite matrices is also positive semidefinite:

Lemma 17 (Schur) If A = (A_jk)_{1≤j,k≤N} and B = (B_jk)_{1≤j,k≤N} are positive semidefinite, then their Hadamard (entrywise) product A ∘ B = (A_jk B_jk)_{1≤j,k≤N} is positive semidefinite.

For a proof of this result see for instance Bapat [3].
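Schur's Lemma can be checked numerically on random positive semidefinite matrices; the sizes and the random seed below are illustrative choices.

```python
import numpy as np

# Schur's Lemma (Hadamard product theorem): the entrywise product of two
# positive semidefinite matrices is positive semidefinite.
rng = np.random.default_rng(1)
L1 = rng.normal(size=(6, 6))
L2 = rng.normal(size=(6, 6))
A = L1 @ L1.T          # A = L1 L1^T is PSD by construction
B = L2 @ L2.T          # likewise PSD
H = A * B              # entrywise (Hadamard) product, NOT the matrix product

print(np.linalg.eigvalsh(H).min() >= -1e-10)   # prints True: H is PSD
```

Note the contrast with the ordinary matrix product A @ B, which is in general not even symmetric.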
Theorem 18 Let ρ = Σ_j α_j W_η ψ_j be the η-Wigner function of ρ̂. We have ρ̂ ≥ 0 if and only if the two following conditions hold: (i) the reduced symplectic Fourier transform ρ_♦ is continuous and ρ_♦(0) = 1; (ii) ρ_♦ is of η-positive type.
Proof. Let us first show that the conditions (i)–(ii) are necessary. Assume that ρ̂ ≥ 0; then ρ = Σ_j α_j W_η ψ_j for a family of normalized functions ψ_j ∈ L²(Rⁿ), the coefficients α_j being ≥ 0 and summing up to one. It is thus sufficient to show that the η-Wigner transform W_η ψ of an arbitrary ψ ∈ L²(Rⁿ) is of η-positive type. This amounts to showing that

Σ_{1≤j,k≤N} λ_j λ̄_k e^{(i/2η)σ(z_j,z_k)} (W_η ψ)_{σ,η}(z_j − z_k) ≥ 0   (119)

for every complex vector (λ_1, ..., λ_N) ∈ C^N and every sequence (z_1, ..., z_N) ∈ (R^{2n})^N (see condition (117)). Since the η-Wigner function W_η ψ and the η-ambiguity function Amb_η ψ are obtained from each other by the symplectic η-Fourier transform, the left-hand side of (119) can be rewritten in terms of Amb_η ψ. Taking into account the fact that D̂_η(−z_k)† = D̂_η(z_k) and using the product relation (76) for the operators D̂_η together with the relation (74) between Amb_η ψ and these operators, an expansion of the square shows that the left-hand side of (119) equals

(2πη)^{−n} || Σ_{1≤j≤N} λ_j D̂_η(z_j)ψ ||²_{L²} ,

which is the equality (120); the inequality (119) follows. Let us now show that, conversely, the conditions (i) and (ii) are sufficient, i.e. that they imply that (ρ̂ψ|ψ)_{L²} ≥ 0 for all ψ ∈ L²(Rⁿ); equivalently (see formula (57) in Theorem 8)

∫ ρ(z) W_η ψ(z) d²ⁿz ≥ 0   (122)

for all ψ ∈ L²(Rⁿ). Let us set, as above,

Λ′_jk = e^{(i/2η)σ(z_j,z_k)} ρ_{σ,η}(z_j − z_k) ,

where z_j and z_k are arbitrary elements of R^{2n}. To say that ρ_{σ,η} is of η-positive type means that the matrix Λ′ = (Λ′_jk)_{1≤j,k≤N} is positive semidefinite for all choices of z_1, ..., z_N. Setting

M_jk = Λ′_jk e^{−(i/2η)σ(z_j,z_k)} conj((W_η ψ)_{σ,η}(z_j − z_k)) = ρ_{σ,η}(z_j − z_k) conj((W_η ψ)_{σ,η}(z_j − z_k)) ,

we claim that the matrix M^(N) = (M_jk)_{1≤j,k≤N} is positive semidefinite. In fact, M^(N) is the Hadamard product of the positive semidefinite matrix Λ′ with the matrix (e^{−(i/2η)σ(z_j,z_k)} conj((W_η ψ)_{σ,η}(z_j − z_k)))_{1≤j,k≤N}, which is the complex conjugate of a matrix shown to be positive semidefinite in the first part of the proof, hence itself positive semidefinite; Schur's Lemma 17 implies that M^(N) is then also positive semidefinite. Since M_jk only depends on the difference z_j − z_k, it follows from Bochner's theorem that the function b whose Fourier transform is ρ_{σ,η} conj((W_η ψ)_{σ,η}) is, up to a positive factor, a probability density; in particular we must have b(0) ≥ 0.
Integrating the equality above with respect to z and using the Plancherel formula (6) for the symplectic η-Fourier transform, we see that the integral ∫ ρ(z) W_η ψ(z) d²ⁿz is a positive multiple of b(0); hence the inequality (122), since b(0) ≥ 0.

The covariance matrix
Let ρ(z) be a real function on phase space R^{2n}. We assume that this function is integrable, that ∫ ρ(z) d²ⁿz = 1, and that the marginal identities (53) hold. We will in addition assume that ρ decreases sufficiently fast at infinity:

∫ (1 + |z|²) |ρ(z)| d²ⁿz < ∞ .   (126)

Setting z_α = x_α if 1 ≤ α ≤ n and z_α = p_{α−n} if n + 1 ≤ α ≤ 2n, the covariances and variances of the variables z = (x, p) associated with ρ are the numbers

∆(z_α, z_β) = ∫ (z_α − ⟨z_α⟩)(z_β − ⟨z_β⟩) ρ(z) d²ⁿz   (127)

and

(∆z_α)² = ∆(z_α, z_α) .   (128)

In the formulas (127) and (128) above ⟨z_α⟩ is the average with respect to ρ of z_α; more generally one defines for an integer k ≥ 0 the moments

⟨z_α^k⟩ = ∫ z_α^k ρ(z) d²ⁿz .

Notice that our condition (126) guarantees the existence of both ⟨z_α⟩ and ⟨z_α²⟩, as follows from the trivial inequalities

|z_α| ≤ ½(1 + |z|²)   (130),   z_α² ≤ |z|²   (131).

It follows that the quantities (127) and (128) are well-defined in view of condition (126). Since the integral of ρ is equal to one, formulae (127) and (128) can be rewritten as

∆(z_α, z_β) = ⟨z_α z_β⟩ − ⟨z_α⟩⟨z_β⟩ ,  (∆z_α)² = ⟨z_α²⟩ − ⟨z_α⟩² .

We will call the symmetric 2n × 2n matrix

Σ = (∆(z_α, z_β))_{1≤α,β≤2n}

the covariance matrix associated with ρ. For instance, when n = 1,

Σ = [[(∆x)², ∆(x, p)], [∆(p, x), (∆p)²]] .

Here is an example: let ρ(z) be given by the formula

ρ(z) = (2π)^{−n} (det Σ)^{−1/2} e^{−½ Σ^{−1} z·z} ,

where Σ is a symmetric positive definite real 2n × 2n matrix. A straightforward calculation involving Gaussian integrals shows that Σ is precisely the associated covariance matrix. We will see later that ρ(z) is the Wigner function of a quantum state if Σ satisfies a certain condition related to the uncertainty principle. However, ρ(z) can always be viewed as a classical probability distribution.
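For the Gaussian example the claim that Σ is the covariance matrix can be checked by brute-force quadrature; the matrix Σ and the grid parameters below are illustrative choices for n = 1.

```python
import numpy as np

# Covariance matrix of rho(z) = (2 pi)^(-1) det(Sigma)^(-1/2) exp(-z.Sigma^{-1}z/2)
# for n = 1, recovered by direct numerical integration of the second moments.
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])          # illustrative positive definite choice
Sinv = np.linalg.inv(Sigma)

x = np.linspace(-12, 12, 601)
p = np.linspace(-12, 12, 601)
X, P = np.meshgrid(x, p, indexing="ij")
Q = Sinv[0, 0] * X**2 + 2 * Sinv[0, 1] * X * P + Sinv[1, 1] * P**2
rho = np.exp(-0.5 * Q) / (2 * np.pi * np.sqrt(np.linalg.det(Sigma)))

dz = (x[1] - x[0]) * (p[1] - p[0])      # area element of the grid
cov = np.array([[(X * X * rho).sum(), (X * P * rho).sum()],
                [(X * P * rho).sum(), (P * P * rho).sum()]]) * dz

print(np.allclose(cov, Sigma, atol=1e-3))   # prints True
```

Since the density is centered, the second moments coincide with the covariances; the grid is wide enough (about eight standard deviations) for the truncation error to be negligible.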

Two lemmas
We are going to state two preliminary results which will be used in the proof of Theorem 21 below characterizing the covariance matrix of a quantum state. We will not prove these results here, and refer to the original sources. That we need so much preparatory material is indicative of the difficulty of the topic we address. We begin with the well-known Williamson symplectic diagonalization theorem. Let us first recall the following terminology [33,34]: suppose that Σ is a symmetric positive definite real 2n × 2n matrix. The matrix ΣJ has the same eigenvalues as the antisymmetric matrix Σ^{1/2}JΣ^{1/2}, and they are therefore of the type ±iλ_1, ..., ±iλ_n with λ_j > 0. These numbers λ_j are called the symplectic eigenvalues of the matrix Σ (or sometimes also the Williamson invariants of Σ).

Lemma 19 (Williamson) Let Σ be a symmetric positive definite real 2n × 2n matrix. There exists S ∈ Sp(n) such that Σ = SᵀDS, where D = diag(Λ, Λ) and Λ = diag(λ_1, ..., λ_n) is the diagonal matrix of the symplectic eigenvalues of Σ.

Proof. See for instance [33,34] and the references therein. Williamson's diagonalization result generalizes to the multidimensional case the elementary observation that every 2 × 2 real symmetric matrix Σ = [[a, b], [b, c]] with a > 0 and ac − b² > 0 can be diagonalized by an area-preserving linear change of variables. In addition to Williamson's symplectic diagonalization result, we will need the following technical result:

Lemma 20 (Narcowich) Let f(z) be a twice differentiable function defined on phase space. If f is of η-positive type then

−f″(0) + (iη/2)J ≥ 0 ,

where f″(0) is the matrix of second derivatives (the Hessian matrix) of f(z) at z = 0.
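The symplectic eigenvalues just defined are easy to compute numerically. The sketch below (illustrative random positive definite Σ) compares the two equivalent routes mentioned in the text: the moduli of the eigenvalues ±iλ_j of JΣ, and the spectrum of the antisymmetric matrix Σ^{1/2}JΣ^{1/2}.

```python
import numpy as np

# Symplectic eigenvalues of a positive definite Sigma, computed two ways.
n = 2
J = np.block([[np.zeros((n, n)), np.eye(n)],
              [-np.eye(n), np.zeros((n, n))]])

rng = np.random.default_rng(2)
M = rng.normal(size=(2 * n, 2 * n))
Sigma = M @ M.T + 0.5 * np.eye(2 * n)        # symmetric positive definite

# Route 1: eigenvalues of J Sigma come in pairs +/- i lambda_j, lambda_j > 0.
# Sorting the moduli gives [l1, l1, l2, l2, ...]; keep every other entry.
lam1 = np.sort(np.abs(np.linalg.eigvals(J @ Sigma).imag))[::2]

# Route 2: Sigma^(1/2) J Sigma^(1/2) is antisymmetric with the same spectrum
w, V = np.linalg.eigh(Sigma)
Shalf = V @ np.diag(np.sqrt(w)) @ V.T
lam2 = np.sort(np.abs(np.linalg.eigvals(Shalf @ J @ Shalf).imag))[::2]

print(np.allclose(lam1, lam2))               # prints True: both routes agree
```

The agreement of the two routes reflects the similarity Σ^{1/2}(JΣ)Σ^{−1/2} = Σ^{1/2}JΣ^{1/2}.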
6.3.3 A necessary (but not sufficient) condition for a state to be "quantum"

We are going to prove an essential result, which goes back to Narcowich [69]. It says that the covariance matrix of a quantum state must satisfy a certain condition which implies, but is stronger than, the Robertson–Schrödinger uncertainty principle. We recall that it is assumed that the Wigner function ρ satisfies the condition ∫ (1 + |z|²)|ρ(z)| d²ⁿz < ∞.
This implies, among other things, that the Fourier transform (and hence also the symplectic Fourier transform) of ρ is twice continuously differentiable: differentiating twice under the integral sign brings down factors that are dominated by (1 + |z|²)|ρ(z)|, so the differentiated integrals converge absolutely in view of the inequalities (130), (131).
Theorem 21 Suppose that the phase space function ρ with associated covariance matrix Σ is the η-Wigner transform of a density matrix ρ̂. Then: (i) we have

Σ + (iη/2)J ≥ 0 ;   (138)

(ii) the same condition holds for every parameter η′ = rη with 0 < r ≤ 1.

Proof. Applying Lemma 20 (Narcowich) to f = ρ_♦, which is of η-positive type by Theorem 18, the symplectic eigenvalues λ_j of Σ satisfy λ_j ≥ ½|η| for all j, hence (138). Property (ii) immediately follows from (138); it can also be proved directly: setting η′ = rη with 0 < r ≤ 1 we have

Σ + (iη′/2)J = (1 − r)Σ + r(Σ + (iη/2)J) ≥ 0

since Σ itself is positive semidefinite. The relation of the result above with the uncertainty principle comes from the following observation (see Narcowich [69], de Gosson [34], Chapter 13, de Gosson and Luef [44]): the relation

Σ + (iη/2)J ≥ 0   (140)

(which we will call the "strong uncertainty principle") implies the Robertson–Schrödinger inequalities

(∆x_j)²(∆p_j)² ≥ ∆(x_j, p_j)² + ¼η² ,  1 ≤ j ≤ n.   (141)

It is however essential to note that (140) and (141) are not equivalent. Here is a counterexample in the case n = 2. Assume that Σ is a covariance matrix with (∆x_1)² = (∆x_2)² = 1 and (∆p_1)² = (∆p_2)² = 1, and with ∆(x_1, p_1) = ∆(x_2, p_2) = 0, so that the inequalities (141) are trivially satisfied for the choice η = 1 (they are in fact equalities). One can complete these data to a symmetric matrix Σ for which Σ + iJ is nevertheless indefinite (its determinant is −1); this Σ is not even invertible. It is also essential to realize that condition (140) is necessary but not sufficient for a phase space function to be the Wigner function of a quantum state. Let us check this on the following example, due to Narcowich and O'Connell [70] and further discussed in de Gosson and Luef [43]. Consider the function f(x, p) constructed for n = 1 in [70], depending on two positive constants a and b such that ab ≥ ¼ℏ². Now define ρ(x, p) as the symplectic Fourier transform of f, normalized so that the Fourier inversion formula yields ∫∫ ρ(x, p) dp dx = f(0, 0) = 1; since f(x, p) is an even function, ρ(x, p) is real, so that ρ(x, p) is a candidate for being the Wigner function of some density matrix. Calculating the covariance matrix Σ associated with ρ, one finds after some tedious calculations ([70], pp. 4–5) that Σ + (iℏ/2)J ≥ 0. However, ρ cannot be the Wigner function of a density matrix ρ̂, for if this were the case we would have ⟨p⁴⟩_ρ ≥ 0; but by definition of ρ,

∫∫ p⁴ ρ(x, p) dp dx = ∂⁴f/∂x⁴(0, 0) = −24a² < 0.

Gaussian Bosonic states
We focus now on the case where all involved states are Gaussian: we say that a pure or mixed quantum state is Gaussian if its Wigner function is of the type

ρ(z) = (2π)^{−n} (det Σ)^{−1/2} e^{−½ Σ^{−1}(z−z̄)·(z−z̄)} ,

where Σ (the "covariance matrix") is a positive definite 2n × 2n matrix satisfying a certain condition that will be stated later (intuitively speaking, Σ cannot be "too small", because ρ(z) would then be too sharply peaked and thus violate the uncertainty principle of quantum mechanics). Gaussian states appear naturally in every quantum system which can be described or approximated by a quadratic Bosonic Hamiltonian (Wolf et al. [98]); because of their peculiarities they play an exceptionally important role in quantum mechanics and optics (see Barnett and Radmore [4]).

Definition and examples
We will define a generalized Gaussian as any complex function on Rⁿ of the type

ψ_M(x) = (πℏ)^{−n/4} (det X)^{1/4} e^{−(1/2ℏ) Mx·x} ,

where M = X + iY is a complex symmetric invertible n × n matrix; X and Y are real matrices such that X = Xᵀ > 0 and Y = Yᵀ. The coefficient in front of the exponential is chosen so that ψ_M is normalized to unity: ||ψ_M|| = 1.
Suppose that X = I and Y = 0; then

ψ₀(x) = (πℏ)^{−n/4} e^{−|x|²/2ℏ}   (144)

is the standard (or fiducial) coherent state.

Proposition 22 The Wigner transform of the generalized Gaussian ψ_M is given by

W ψ_M(z) = (πℏ)^{−n} e^{−(1/ℏ) Gz·z} ,   (147)

where G is the symmetric symplectic matrix

G = [[X + YX^{−1}Y, YX^{−1}], [X^{−1}Y, X^{−1}]] ;   (148)

equivalently G = SᵀS, where

S = [[X^{1/2}, 0], [X^{−1/2}Y, X^{−1/2}]]   (149)

is a symplectic matrix.
Proof. To simplify notation we set C(X) = (πℏ)^{−n/4}(det X)^{1/4}. By definition of the Wigner transform we have

W ψ_M(z) = (2πℏ)^{−n} C(X)² ∫ e^{−(i/ℏ)p·y} e^{−(1/2ℏ)F(x,y)} dⁿy ,

where the phase F is defined by

F(x, y) = M(x + ½y)·(x + ½y) + M̄(x − ½y)·(x − ½y) = 2Xx·x + ½Xy·y + 2iYx·y .

Using the Fourier transformation formula (145) above (which also fixes the determination of (det M)^{−1/2}) with x replaced by p + Yx and M replaced by ½X, we get

W ψ_M(z) = (2πℏ)^{−n/2} det(½X)^{−1/2} C(X)² e^{−(1/ℏ)(Xx·x + X^{−1}(p+Yx)·(p+Yx))} .

On the other hand we have (2πℏ)^{−n/2} det(½X)^{−1/2} C(X)² = (πℏ)^{−n}, and hence W ψ_M(z) = (πℏ)^{−n} e^{−(1/ℏ)Gz·z}, where

Gz·z = Xx·x + X^{−1}(p + Yx)·(p + Yx) ,

so that G is given by (148). One immediately verifies that G = SᵀS, where S is given by (149), and that SᵀJS = J, hence S ∈ Sp(n), as claimed.
In particular, when ψ_M is the standard coherent state (144) we recover the well-known formula

W ψ₀(z) = (πℏ)^{−n} e^{−|z|²/ℏ} .
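This formula is easy to verify numerically. The sketch below (n = 1, ℏ set to 1, grid parameters chosen for illustration) discretizes the Wigner integral W ψ(x, p) = (1/2π) ∫ e^{−ipy} ψ(x + ½y) ψ̄(x − ½y) dy for the standard coherent state and compares it with the Gaussian on the right-hand side.

```python
import numpy as np

# Standard coherent state for n = 1, hbar = 1: psi0(x) = pi^(-1/4) exp(-x^2/2).
psi = lambda x: np.pi ** -0.25 * np.exp(-x ** 2 / 2)

def wigner(x, p):
    # Direct discretization of the Wigner integral (Riemann sum on a wide grid)
    y = np.linspace(-20.0, 20.0, 4001)
    dy = y[1] - y[0]
    integrand = np.exp(-1j * p * y) * psi(x + y / 2) * np.conj(psi(x - y / 2))
    return (integrand.sum() * dy / (2 * np.pi)).real

for (x, p) in [(0.0, 0.0), (0.5, -0.3), (1.0, 1.0)]:
    exact = np.exp(-(x ** 2 + p ** 2)) / np.pi   # (pi hbar)^(-n) exp(-|z|^2/hbar)
    assert abs(wigner(x, p) - exact) < 1e-8

print("Wigner transform of the coherent state matches the Gaussian formula")
```

The integrand decays like a Gaussian in y, so a plain Riemann sum on a wide grid already reaches near machine precision.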

A necessary and sufficient condition
We are going to discuss the following question: for which values of η can the Gaussian function ρ be the η-Wigner function of a density operator? Narcowich [67] was the first to address this question, using techniques from harmonic analysis in the spirit of Kastler's paper [52]; we give here a new and simpler proof using the multidimensional generalization of Hardy's uncertainty principle. Let us begin with what is usually called in the literature "Hardy's uncertainty principle". In what follows we denote by F_η ψ the η-Fourier transform given, for ψ ∈ L²(Rⁿ), by

F_η ψ(p) = (2πη)^{−n/2} ∫ e^{−(i/η)p·x} ψ(x) dⁿx .

An old result (1933) due to Hardy [46] quantifies the "folk theorem" according to which a function and its Fourier transform cannot be simultaneously arbitrarily sharply peaked. In fact, Hardy proved that if ψ ∈ L²(R) satisfies

|ψ(x)| ≤ C e^{−ax²/2η} and |F_η ψ(p)| ≤ C e^{−bp²/2η}

for constants a, b, C > 0, then we must have ab ≤ 1. In particular, if ab = 1 then ψ(x) = N e^{−ax²/2η} for some constant N (we are thus here in the presence of a particular quantum tomography result, which says that it suffices, in the Gaussian case, to know the position and momentum probabilities to determine the state). We will use the following generalization of Hardy's result to the multidimensional case:

Lemma 23 (Hardy) Let A and B be two real positive definite matrices and ψ ∈ L²(Rⁿ), ψ ≠ 0. Assume that

|ψ(x)| ≤ C e^{−½Ax·x} and |F_η ψ(p)| ≤ C e^{−½Bp·p}

for a constant C > 0. Then: (i) the eigenvalues λ_j, j = 1, ..., n, of the matrix AB are all ≤ 1/η²; (ii) if λ_j = 1/η² for all j, then ψ(x) = k e^{−½Ax·x} for some complex constant k.
Proof. See de Gosson and Luef [42,44], de Gosson [34]. We will also need the following positivity result:

Lemma 24 If R is a symmetric positive semidefinite 2n × 2n matrix, then

(Rz_j · z_k)_{1≤j,k≤N}

is a symmetric positive semidefinite N × N matrix for all z_1, ..., z_N ∈ R^{2n}.
Proof. There exists a matrix L such that R = LᵀL (Cholesky decomposition). Since the z_j are real vectors we have Rz_j·z_k = (Lz_j)·(Lz_k), hence, for all λ_1, ..., λ_N ∈ C,

Σ_{1≤j,k≤N} λ_j λ̄_k Rz_j·z_k = |Σ_{1≤j≤N} λ_j Lz_j|² ≥ 0 ,

hence our claim. We now have the tools needed to give a complete characterization of Gaussian η-Wigner functions. Recall from Theorem 21 that a necessary condition for a matrix Σ to be the covariance matrix of a quantum state is that it satisfies the condition Σ + (iη/2)J ≥ 0. It turns out that in the Gaussian case this condition is also sufficient:

Theorem 25 The Gaussian function (155) is the η-Wigner transform of a density matrix if and only if

|η| ≤ 2λ_min ,   (156)

where λ_min is the smallest symplectic eigenvalue of Σ; equivalently Σ + (iη/2)J ≥ 0.

Proof. Let us give a direct proof of the necessity of condition (156) for the Gaussian (155) to be the η-Wigner transform of a positive trace-class operator. Let ρ̂ = (2πη)ⁿ Op^W_η(ρ) and set a(z) = (2πη)ⁿ ρ(z). Let Ŝ ∈ Mp(n); the operator ρ̂ is of trace class if and only if Ŝρ̂Ŝ^{−1} is, in which case Tr(ρ̂) = Tr(Ŝρ̂Ŝ^{−1}). Choose Ŝ with projection S ∈ Sp(n) such that Σ = SᵀDS is a symplectic diagonalization of Σ. This choice reduces the proof to the case Σ = D = diag(Λ, Λ), Λ = diag(λ_1, ..., λ_n), that is to

ρ(z) = (2π)^{−n} (det Λ)^{−1} e^{−½(Λ^{−1}x·x + Λ^{−1}p·p)} .
In particular, the condition α_j ≥ 0 for every j = 1, 2, ..., n yields (156). Setting 2λ_min = ℏ and writing Σ in the block-matrix form

Σ = [[Σ_xx, Σ_xp], [Σ_px, Σ_pp]] ,

where Σ_xx = (∆(x_j, x_k))_{1≤j,k≤n}, Σ_xp = (∆(x_j, p_k))_{1≤j,k≤n}, and so on, one shows [44] that (157) implies the generalized uncertainty relations (the "Robertson–Schrödinger inequalities")

(∆x_j)²(∆p_j)² ≥ ∆(x_j, p_j)² + ¼ℏ² ,  1 ≤ j ≤ n,

where the (∆x_j)² = ∆(x_j, x_j) and (∆p_j)² = ∆(p_j, p_j) are viewed as variances and the ∆(x_j, p_k) as covariances; see de Gosson and Luef [44] for a detailed discussion of these inequalities from the symplectic point of view.
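The criterion Σ + (iη/2)J ≥ 0 is straightforward to test numerically, since the matrix on the left is Hermitian. The sketch below (n = 1, η = 1, illustrative covariance matrices) compares a "limiting" quantum covariance matrix with one that is too sharply peaked.

```python
import numpy as np

# Quantum condition Sigma + i(eta/2)J >= 0 for n = 1, eta = 1 (illustration).
n, eta = 1, 1.0
J = np.block([[np.zeros((n, n)), np.eye(n)],
              [-np.eye(n), np.zeros((n, n))]])

def is_quantum(Sigma, eta):
    # Sigma + i(eta/2)J is Hermitian, so eigvalsh applies; PSD means all
    # eigenvalues are >= 0 (up to round-off).
    H = Sigma + 1j * (eta / 2) * J
    return np.linalg.eigvalsh(H).min() >= -1e-12

good = 0.5 * np.eye(2)   # symplectic eigenvalue 1/2 = eta/2: limiting quantum case
bad = 0.2 * np.eye(2)    # symplectic eigenvalue 1/5 < eta/2: too peaked

print(is_quantum(good, eta), is_quantum(bad, eta))   # prints True False
```

For the "good" matrix the eigenvalues of Σ + (i/2)J are 0 and 1; for the "bad" one they are 0.2 ± 0.5, and the negative eigenvalue signals the violation of the strong uncertainty principle.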
7 Some Speculations

7.1 The fine structure constant

Dirac [17] speculated already in 1937 that physical constants such as the gravitational constant or the fine structure constant might be subject to change over time. These questions have since become a very active area of research (see the recent reviews [22,91]). Some scientists have suggested that the fine structure constant, α ≈ 1/137, might not be constant, but could vary over time and space. This dimensionless constant, introduced by Sommerfeld in 1916, measures the strength of the interaction between light and matter, or equivalently, how strong electrical and magnetic forces are. It can be expressed as a combination of three constants, the charge on the electron, the speed of light, and Planck's constant h:

α = e²/2ε₀hc .

The quest for testing the non-constancy of α is ongoing. The Oklo natural nuclear reactor is known to give limits on the variation of the fine structure constant over the period since the reactor was running (~1.8 billion years). In 1999, a team of astronomers using a telescope in Hawaii reported that measurements of light absorbed by very distant galaxy-like objects in space called quasars, which are so far away that we see them today as they looked billions of years ago, suggest that the value of the fine structure constant was once slightly different from what it is today. That claim remains controversial and unproven; but if true, it would mean that at least one of the three fundamental constants that constitute α must vary. Experiments can in principle only put an upper bound on the relative change per year; for the fine structure constant, this upper bound is comparatively low, at roughly 10⁻¹⁷ per year. The possibility that some constants of Nature could vary in space-time has remained a subject of fascination which has motivated numerous theoretical and experimental investigations [10,23].
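The numerical value of α follows directly from the formula above; the sketch below hard-codes the CODATA values of the three constants (exact by definition in the SI since 2019, except for ε₀).

```python
# Fine structure constant alpha = e^2 / (2 epsilon_0 h c) from CODATA values.
e = 1.602176634e-19        # elementary charge, C (exact)
eps0 = 8.8541878128e-12    # vacuum permittivity, F/m
h = 6.62607015e-34         # Planck's constant, J s (exact)
c = 299792458.0            # speed of light, m/s (exact)

alpha = e ** 2 / (2 * eps0 * h * c)
print(round(1 / alpha, 3))   # prints 137.036
```

Being a ratio of quantities whose units cancel, α is dimensionless; this is what makes its possible variation a meaningful physical question, independent of the choice of units.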

Planck's constant
Kentosh and Mohageg [53,54] have tested the constancy of h using the freely available data from GPS, focusing specifically on whether h depends on where (not when) you measure it. They were motivated by the fact that h also appears in the fine structure constant, whose possible variation is a very active area of research in experimental physics. If h changed from place to place, so would the frequencies, and thus the "ticking rate", of atomic clocks; any dependence of h on location would therefore translate into a tiny timing discrepancy between different GPS clocks. The physicist Freeman Dyson has suggested (private communication) that the increasing precision of measurements of time could lead to non-ambiguous results. After careful analysis of the data from seven highly stable GPS satellites, Kentosh and Mohageg concluded that h is identical at different locations to an accuracy of seven parts in a thousand. In other words, if h were a one-metre measuring stick, then two sticks in different places anywhere in the world would not differ by more than seven millimetres. At least as interesting is the possible time-variation of Planck's constant (see Mangano et al. [63]). This deserves to be explored, because if real it could shed some light on the early Universe, just after the Big Bang. In fact, if the fine structure constant has been increasing since the Big Bang, this could perhaps be due to a decrease of Planck's constant. If such a variation could be experimentally detected, then it would mean, following our discussion of the quantum Bochner theorem, that the early Universe was much more "quantum" than it is now; this would of course have major implications in terms of entanglement.

Units
Testing the constancy of a physical parameter means going to extraordinary lengths in terms of precision measurements, and is intimately related to the choice of unit systems. The physicist Michael Duff [20] remarked in 2002 (also see Duff [21,22]) that all the fundamental physical dimensions can be expressed using only one: mass. Duff first noticed the obvious, namely that lengths can be expressed as times using c, the velocity of light, as a conversion factor. One can therefore take c = 1 and measure lengths in seconds. The second step is to use the relation E = hν, which relates energy to a frequency, that is, to the inverse of a time. We can thus measure a time using the inverse of an energy. But energy is equivalent to mass, as shown by Einstein, so time can be measured by the inverse of mass. Thus, setting c = h = 1, we have reduced all the fundamental dimensions to one: mass. A further step consists in choosing a reference mass such that the gravitational constant (first measured by Cavendish in 1798) is equal to one: G = 1. Summarizing, we have obtained a theoretical system of units in which c = h = G = 1. Now, a very important physical parameter is, without doubt, the fine structure constant α = e²/2ε₀hc (ε₀ being the vacuum permittivity); it is a dimensionless number whose approximate value is 1/137. There are other ways to define irreducible unit systems. Already Stoney, noting that electric charge is quantized, derived units of length, time, and mass in 1881 by normalizing G, c, and e to unity; Planck suggested in 1898–1899 that it would suffice to use G, c, and h to define length, mass, and time units. His proposal led to what are today called the Planck length ℓ_P = √(Gh/c³), the Planck mass M_P = √(hc/G), and the Planck time T_P = √(Gh/c⁵).
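Planck's original units can be evaluated directly from these formulas; the sketch below uses h, as in the 1899 proposal (modern conventions usually substitute ℏ = h/2π, which shrinks ℓ_P and T_P by a factor √(2π)). The constants are hard-coded CODATA values.

```python
import math

# Planck's units from G, c and h (Planck's own 1899 convention, using h).
G = 6.67430e-11      # gravitational constant, m^3 kg^-1 s^-2
c = 299792458.0      # speed of light, m/s
h = 6.62607015e-34   # Planck's constant, J s

l_P = math.sqrt(G * h / c ** 3)   # Planck length, ~4.05e-35 m
m_P = math.sqrt(h * c / G)        # Planck mass,   ~5.46e-8  kg
t_P = math.sqrt(G * h / c ** 5)   # Planck time,   ~1.35e-43 s

print(f"{l_P:.3e} m  {m_P:.3e} kg  {t_P:.3e} s")
```

A quick dimensional check confirms that each square root indeed carries the units of a length, a mass, and a time respectively; note also that t_P = l_P / c, as it must be in a system where lengths are times.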