Thanks for the explanation. If you have the time, can you also explain _why_ we would be multiplying "each point of the universe with a different complex number of modulo 1"? What does it mean in physical reality; why multiply points by any number at all?
The simple answer is "because we can". In general, physicists have found that we should write down the most general mathematical theory compatible with what's observed. A famous example of this is Einstein's cosmological constant -- I'll leave that one to wikipedia [1] since I'm a particle physicist and not an expert on GR.
In the case of gauge theory, the idea that we should consider the most general case has been well proven. As GP pointed out, all observable phenomena ultimately depend only on the absolute modulus of the field Ψ, so a theoretical physicist naturally wonders what happens if you allow its complex phase to vary. It turns out nothing interesting happens if you apply a global phase, but if you allow the phase to vary at every point in spacetime, it ends up breaking the theory. That is, unless you include an additional field at every point in spacetime that precisely cancels out the change induced by the gauge freedom.
In other words, the motivation is that we can't simply look and "see" whether or not there is a locally varying phase on the wavefunction Ψ, since we can only measure |Ψ|^2. So we have to assume there is, until proven otherwise. Since a local phase would imply the existence of an extra field to cancel it out, we can indirectly check for this scenario by looking for the corresponding field. As pointed out by GP, in the case of a U(1) gauge, it turns out there is such a field, and electromagnetism (and all of its laws) exactly fits the bill.
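To make the argument above concrete, here is a minimal numerical sketch on a 1D line (an illustration, not something from the comment itself). The wave packet `psi`, the phase function `alpha`, the starting field `A` (taken to be zero), and the unit-charge sign convention D = d/dx - iA are all assumptions chosen for the example. It checks that |Ψ|^2 is unchanged by a local phase, that the plain derivative is not, and that shifting A by the gradient of the phase restores the match.

```python
import cmath
import math

def psi(x):
    # Hypothetical wave packet: Gaussian envelope times a plane wave.
    return math.exp(-x * x) * cmath.exp(2j * x)

def alpha(x):
    # Arbitrary position-dependent phase (the "gauge freedom").
    return math.sin(3.0 * x)

def psi_local(x):
    # Same packet multiplied by a different unit-modulus number at each point.
    return cmath.exp(1j * alpha(x)) * psi(x)

def ddx(f, x, h=1e-5):
    # Central finite-difference derivative.
    return (f(x + h) - f(x - h)) / (2.0 * h)

def A(x):
    # Assumed starting gauge field: zero everywhere.
    return 0.0

def A_local(x):
    # Compensating field: the original field shifted by the phase gradient.
    return A(x) + ddx(alpha, x)

def covariant(f, a, x):
    # Covariant derivative D = d/dx - i*a(x), unit charge assumed.
    return ddx(f, x) - 1j * a(x) * f(x)

x0 = 0.3
density_before = abs(psi(x0)) ** 2
density_after = abs(psi_local(x0)) ** 2
plain_before = abs(ddx(psi, x0))
plain_after = abs(ddx(psi_local, x0))
cov_before = abs(covariant(psi, A, x0))
cov_after = abs(covariant(psi_local, A_local, x0))

print(density_before, density_after)  # equal: the phase is invisible in |psi|^2
print(plain_before, plain_after)      # differ: the plain derivative notices the phase
print(cov_before, cov_after)          # equal up to finite-difference error
```

The only thing that rescues the derivative is the extra function `A_local`, which plays the role of the additional field in this toy setup.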
There are other "unmeasurable" symmetries you could apply to the wave function as well, beyond just a complex phase. SU(2) is a Lie group symmetry which would mean that the measurable properties of certain tuples of fields (Ψ, ϕ) are indistinguishable under a sort of complex-valued rotation of Ψ->ϕ and ϕ->Ψ. Again, if you assume such a symmetry varies locally at every point in spacetime, you end up requiring not one but three new fields to cancel out the effects on the SM Lagrangian. It turns out that the vector bosons W+, W-, and Z, which mediate the weak nuclear force, exactly fit the bill.
And at the risk of muddying the original point, it turns out that strictly speaking neither the neutral Z boson nor the electromagnetic photon can be said to come from SU(2) or U(1) alone. Instead, each field is a linear combination of some abstract fields (namely the neutral W0 and the weak hypercharge B bosons) from the unitary gauge induced by the compound symmetry SU(2)xU(1). This is because when both symmetries SU(2) and U(1) are present, there are different ways you can "mix" the two into the Lagrangian. The mixing that gives us the photon and the Z0 boson is known because of the experimental confirmation of these particles. This is what is meant when it is said that electromagnetism and the weak nuclear force are united in a higher-energy theory as a single "electroweak" force.
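For readers who want the mixing written out: in the usual textbook convention (added here for illustration, not taken from the comment above), the photon field A and the Z field are a rotation of B and W^3 (the W0 above) by the weak mixing angle θ_W:

```latex
\begin{aligned}
A_\mu &= B_\mu \cos\theta_W + W^3_\mu \sin\theta_W \\
Z_\mu &= -B_\mu \sin\theta_W + W^3_\mu \cos\theta_W
\end{aligned}
```

The symmetry itself does not fix the value of θ_W; it has to be measured, which is the sense in which the mixing is "known because of the experimental confirmation".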
This is great, you perfectly bridged the gap between having been 25 once and vaguely remembering the math terms mentioned and bringing it back to reality. Really epic how this (at least to me) abstract and arcane math not only explains the behaviors of the subatomic particles, but even predicts the existence of more and their related forces.
The work of theoretical physicists always seemed like the other side of an ocean of math away, but thanks to this explanation it feels like I can at least make out a lighthouse in the distance.
> That is, unless you include an additional field at every point in spacetime that precisely cancels out the change induced by the gauge freedom.
Could you elaborate on this a bit? To a layperson this sounds like a hack. "Things get screwy when you screw with them, UNLESSSSS we add a magic thing that undoes our work". Well yeah.
It's just a mathematical tool of curiosity that we have found very useful.
Here's an analogy: You find a box, and you can't see inside it. You have no reason to think there's something inside it. But also, boxes have stuff in them sometimes. So, you shake the box, and hear something clinking around. Therefore, you infer there's something in the box.
Somebody next to you says "This sounds like a hack. There was a box and you had to go shake it until it started making sounds that it wasn't making before. UNLESSSSS we now magically have to agree that there's something in the box."
It's a perfectly reasonable question, and I'm just turning your words on you in good faith :)
To take the technical discussion a bit further, it's exactly this kind of reasoning that led to the discovery of the Higgs boson. Strictly speaking, it's impossible for gauge bosons (that's what the particles are called that show up when you add these locally-varying symmetries) to have nonzero mass.
The photon and gluons (from the SU(3) strong force) are massless, but the W and Z bosons are VERY massive.
This was a big problem with the Standard Model; the vector gauge bosons had every property expected from the gauge theory, except for this one point about their mass, which was experimentally incontrovertible.
That is, until Brout/Englert/Higgs came along.
They said "Yeah the vector bosons must be massless UNLESSSSSSSS you assume there's this magic additional field that couples to every particle's mass, in which case it perfectly cancels out all the problems and allows the W and Z bosons to be heavy".
It took 50 years but we found that particle eventually.
You're a really great writer on this topic. If you're not already, and you have the time, you should seek out ways to do this in a way that has more reach. Thank you!
You know, I tend to agree with the previous poster, not that it's a hack, but that I've always felt it's a bit of backward reasoning. You say the physics should be invariant under curvature of the field, but it isn't unless you add another field to cancel it. But you might as well have said from the start that the field is curved by another field and we need to consider that in the derivatives that describe our local physics, making them covariant. The explanation that "it makes sense that the fields are locally gauge invariant" always seemed a bit constructed after the fact, so to speak.
The argument isn't against physicists inventing new fields or interactions to fit data. The argument is about why you can motivate it as "something that has to be added out of pure logic" :)
I don't think it's backwards. We started out with this big, sweeping and simple statement that (seemingly) holds true for physical reality, but it doesn't account for everything. The statement allows for two realities, one where there is one single global phase, and another where each point has its own local phase.
The first option might be true, no real way to measure it and that's sort of it, no explanation for all the other stuff going on. The second option is a little more complex, it requires an extra construct to make it work. And apparently when we do the math and work out this construct, it exactly maps to things we can measure in reality, that were not explained by the other simpler option.
So it's not backwards because it's simply the first full match in a depth first search through the possible realities that follow from this wave function theory.
It makes a lot more sense if you have the historical context though. When this is taught the context is left out and then it feels more like mathemagics to me and probably others as well. Maybe there are good books that explain this from a historical point of view, while still teaching the theory, but they must be pretty rare.
My comment (the GG..GGP) was an (over)reaction to the posted article. The article presents the new approach like a very "natural" explanation, but there are already some other "natural" explanations.
In a physics degree, the order is quite historical: roughly one full course for each of the first three items below, and all the others together. I'm not sure if there is a book with all of them; you probably need 4 or 5 books.
1) Non-Quantum Non-Relativistic Electromagnetism
2) Quantum Non-Relativistic Electromagnetism
3) Non-Quantum Relativistic Electromagnetism
4) Quantum Relativistic Electromagnetism
5) By the way, you can interpret the Quantum Relativistic Electromagnetism as a U(1) symmetry. (my comment)
6) It looks like a good idea. Let's use other groups to explain other known forces: the weak and strong force. (The G...GP comment about SU(2) and SU(3).)
7) ???
[See note 1]
For some reason, popular science articles love to show something almost magical and prefer to present something like the "5)". It makes it easier to hide the math and use hand waving.
Also, physicists working in particle physics believe that "5)" and "6)" are the correct approach, and that the others are just useful for teaching and for historical reasons. But to discover "7)" it's better to think about some weird new symmetry group [2].
For example, a few years ago it was popular to think the next step "7)" was using a new group, SU(5), that combines SU(2) and SU(3). The problem is that the experiments gave different results than the newly proposed theory, not too bad but like 1% off. I still remember my professor talking about how great SU(5) was and how the experiments disagreed, and he looked heartbroken because he really liked SU(5).
[1] You should add some material about the historical discovery of the weak and strong forces between "5)" and "6)".
[2] Others prefer superstrings for "7)"; there are other approaches, but all are weird.
>To a layperson this sounds like a hack. "Things get screwy when you screw with them, UNLESSSSS we add a magic thing that undoes our work". Well yeah.
What you describe as a "hack" is actually the empirical (look/probe and see what happens) nature of physics as a science.
It seems like a hack because there's no a priori theoretical reason to do it. But physics is not based on empirically verifying a set of a priori rules (who would give them?), but on building theories and rules by empirically looking at things and seeing what model fits; if a changed model fits better, we then verify it empirically with more experiments.
And this is not "Things get screwy when you screw with them, UNLESSSSS we add a magic thing that undoes our work", but more like a reverse engineering session:
"Behavior X appears to be described by this formula with parameter p. What if we changed the parameter to -p? Hmm, the results would still be consistent with X, if only there was an additional factor v in the formula.
Would this (-p,v) combo buy us anything over our previous (p)?
Wow, yeah, v would then perfectly match the behavior we see for this other thing Y too. So (-p, v) seems to describe both X and Y, whereas before with p we could only describe X."
A really simple example is voltage. What does it even mean if one cable is on a potential of 5 V? It's always compared with the Ground voltage since the voltage is always a difference between 2 electric potentials. That means you could add a constant to each potential and nothing would change. So in this case not gauging would be quite hacky... (This example has nothing to do with the phase though, but just to illustrate. Almost always when you measure something, some gauging is at least implicitly involved.)
So it turns out this happens quite often that there is some kind of constant that can be divided out. In case of particle physics a whole framework has been developed out of it that has really close relations to Lie group theory. (The experimentally confirmed parallels are just astonishing with group generators and elements corresponding to interaction particles and the normal particles.)
But there is an important distinction between global and local symmetries. Global symmetries, when you apply the same change to the field at every point, are physically meaningful, not so local symmetries, when you apply [potentially] different changes at different points.
For example, the laws of physics are time-translation invariant, i.e. it does not matter which point in time you call t = 0. This essentially means that the equations do not contain times but only time differences, so you can add the same constant to all your times and nothing changes, because the constant cancels out when calculating the difference between two times.
It's the same with voltage: where you put your reference potential does not matter, but this is again a global symmetry and you have to use the same reference potential everywhere. You will obviously get nonsense if you use different reference potentials at different points.
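A tiny sketch of the voltage point, with made-up node names and values (all hypothetical): adding the same offset to every potential leaves every voltage difference unchanged, while adding a different offset at each node does not.

```python
# Hypothetical node potentials in volts, relative to some reference.
potentials = {"a": 5.0, "b": 3.5, "c": 0.0}

def differences(p):
    # Voltage difference between every ordered pair of distinct nodes.
    return {(i, j): p[i] - p[j] for i in p for j in p if i != j}

# Global change: the same offset at every node.
global_shift = {k: v + 100.0 for k, v in potentials.items()}

# Local change: a different offset at each node.
offsets = {"a": 1.0, "b": -2.0, "c": 7.0}
local_shift = {k: v + offsets[k] for k, v in potentials.items()}

same_after_global = differences(global_shift) == differences(potentials)
same_after_local = differences(local_shift) == differences(potentials)

print(same_after_global)  # True: only differences are measurable
print(same_after_local)   # False: mixed reference potentials give nonsense
```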
Local symmetries, on the other hand, are kind of defects in the mathematics of physical theories; they are the expression of redundancies in the mathematical description. Say you want to describe the orientation - but not the strength - of the magnetic field on earth, and for simplicity let's assume the earth is flat and the magnetic field is parallel to the surface. Then you could do this by associating a two-dimensional vector with each point on earth that describes the tip of a compass needle placed at that point.
But there is a problem, there are longer and shorter compass needles but that is irrelevant for the orientation of the magnetic field, the length of the vector does not actually matter. You could multiply this vector field with a different constant at every point, i.e. independently change the length of all the compass needles and you would still describe the same magnetic field orientation across earth. What you do to fix this is to declare that all fields are physically equivalent if they only differ by a constant factor - which may depend on the point - at each point.
The other solution is to use a better mathematical representation without the redundancy: instead of compass needle tip vectors you use the bearing angle and take the compass needle length out of the equation to begin with. Problem solved. Now you can no longer change the field value, i.e. the angle, at every point independently and still describe the same physical situation. Also note that there is still a global symmetry: you can still change all the angles by the same constant, since you are free to pick which direction you label with angle zero. This is again physically meaningful and expresses that space is isotropic, i.e. there is no preferred direction, and the equations therefore do not depend on the direction but only on the angle between directions.
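The compass example can be sketched in a few lines. The needle vectors, the per-point scale factors, and the rotation angle below are made up for illustration. Rescaling each needle by its own positive factor (the local redundancy) leaves every bearing unchanged, while rotating all needles by the same angle (the global symmetry) changes each bearing but not the angle between neighbouring ones.

```python
import math

# Hypothetical needle-tip vectors (x, y) at three locations.
needles = [(1.0, 0.0), (1.0, 1.0), (0.0, 2.0)]

def bearing(v):
    # Angle of the vector; the needle length drops out here.
    return math.atan2(v[1], v[0])

def rotate(v, t):
    # Rotate a 2D vector counterclockwise by angle t.
    x, y = v
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))

def gaps(angles):
    # Differences between consecutive bearings.
    return [angles[i + 1] - angles[i] for i in range(len(angles) - 1)]

# Local change: a different positive length factor at each point.
factors = [0.5, 3.0, 10.0]
rescaled = [(f * x, f * y) for f, (x, y) in zip(factors, needles)]

# Global change: the same rotation applied everywhere.
rotated = [rotate(v, 0.4) for v in needles]

angles = [bearing(v) for v in needles]
angles_rescaled = [bearing(v) for v in rescaled]
angles_rotated = [bearing(v) for v in rotated]

print(angles, angles_rescaled)            # same bearings
print(gaps(angles), gaps(angles_rotated)) # same relative angles
```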
And you can do the same thing with the known physical laws; you can, for example, get rid of the U(1) symmetry - which should better be called a redundancy - in quantum electrodynamics. The price you have to pay is that the resulting equations lack some other properties often considered desirable; for example, they are no longer obviously local.
Nitpicking: IIRC adding 5V everywhere is like multiplying Ψ by exp(i 5V t cte), where t is the time and cte is a constant that may involve c, ℏ, e, and perhaps a number, but I'm too lazy to look it up.
Looking at my comment, it says "multiply each point of the universe by a different complex", but it's actually an "event" like in relativity, i.e. (ct, x, y, z). This is an easy case where the function you use to multiply does not depend on x, y, z, but only on t.
It turns out that the magic thing we have to add exactly fits the observations we have of electromagnetism. That's an indication that we screw with the theory in the same way that nature does, and in the end that's the goal of physics: to understand the way nature behaves.
You are excellent at explaining this stuff. You're bringing some clarity to things that typically sound like abstract gobbledygook to me. We need more people who can bridge that gap.
I don't know about quantum mechanics, but when we talk about space we should be free to add a quantity to the whole universe (like adding 1 to the x coordinate of everything) because this just shifts the whole universe, or accordingly, shifts the origin - the (0, 0, 0) point - in the opposite direction.
The origin is set at an arbitrary point so this "space shift invariance" is saying that it doesn't matter what point we set for the origin (and mathematically this corresponds to the conservation of momentum - see Noether's theorem[0])
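Here is a rough toy check of that link between shift invariance and momentum conservation. Everything in it (two equal-mass particles, the spring constant, the step size, the starting values) is an assumption made for illustration. Because the force depends only on the separation x1 - x2, shifting the origin changes nothing physical, and the total momentum stays constant.

```python
def simulate(x1, x2, p1=0.3, p2=-0.1, k=2.0, m=1.0, dt=0.001, steps=1000):
    # Two equal-mass particles joined by a spring. The force depends only on
    # the separation x1 - x2, so the dynamics are translation invariant.
    for _ in range(steps):
        force_on_1 = -k * (x1 - x2)
        p1 += force_on_1 * dt
        p2 -= force_on_1 * dt  # equal and opposite force on particle 2
        x1 += p1 / m * dt
        x2 += p2 / m * dt
    return x1, x2, p1, p2

# Original universe.
x1, x2, p1, p2 = simulate(0.0, 1.5)
total_momentum = p1 + p2  # started at 0.3 + (-0.1) = 0.2

# Same universe with the origin shifted by 100 units.
sx1, sx2, sp1, sp2 = simulate(100.0, 101.5)

print(total_momentum)     # stays at 0.2 up to rounding
print(x1 - x2, sx1 - sx2) # same separation in both runs, up to rounding
```

If the force instead depended on x1 alone (say, a spring anchored at the origin), the shift would matter and the total momentum would no longer be conserved.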
Hmm, maybe the "zero" for the quantum states is arbitrary, so you should be able to add anything to it for the whole universe, and this merely changes the zero state in the opposite direction... and since this should be a conservation law, pretty sure this is equivalent to the conservation of electric charge.
I may be starting to get it. The magic numbers we add to everything or multiply everything with really represent just a change in the viewpoint or origin of the observer. And because it should be possible to change the "location" of the observer (say the voltage we take to be 0 volts) and still get the same theory to hold up, we can discover that it can only hold up if we assume the existence of some new field. Something like this?
Not GP, or a physicist, but my understanding is that the different number at each point you multiply with represents a degree of freedom at each point of spacetime, and in this degree of freedom is where the electromagnetic field lives.