Planck: Demographics and Diversity

Another aspect of Planck’s legacy bears examining.

A couple of months ago, the 2018 Gruber Prize in Cosmology was awarded to the Planck Satellite. This was (I think) a well-deserved honour for all of us who have worked on Planck during the more than 20 years since its conception, for a mission which confirmed a standard model of cosmology and measured the parameters which describe it to accuracies of a few percent. Planck is the latest in a series of telescopes and satellites dating back to the COBE Satellite in the early 90s, through the MAXIMA and Boomerang balloons (among many others) around the turn of the 21st century, and the WMAP Satellite in the 2000s. (The Gruber Foundation seems to like CMB satellites: COBE won the Prize in 2006 and WMAP in 2012.)

Well, it wasn’t really awarded to the Planck Satellite itself, of course: 50% of the half-million-dollar award went to the Principal Investigators of the two Planck instruments, Jean-Loup Puget and Reno Mandolesi, and the other half to the “Planck Team”. The Gruber site officially mentions 334 members of the Collaboration as recipients of the Prize.

Unfortunately, the Gruber Foundation apparently has some convoluted rules about how it makes such group awards, and the PIs were not allowed to split the monetary portion of the prize among the full 300-plus team. Instead, they decided to share the second half of the funds amongst “43 identified members made up of the Planck Science Team, key members of the Planck editorial board, and Co-Investigators of the two instruments.” Those words were originally on the Gruber site but in fact have since been removed — there is no public recognition of this aspect of the award, which is completely appropriate as it is the whole team who deserves the award. (Full disclosure: as a member of the Planck Editorial Board and a Co-Investigator, I am one of that smaller group of 43, chosen not entirely transparently by the PIs.)

I also understand that the PIs will use a portion of their award to create a fund that all members of the collaboration can draw on for Planck-related travel over the coming years, now that there is little or no governmental funding remaining for Planck work. Those of us who receive a financial portion of the award will be encouraged to contribute to that fund as well (after, unfortunately, working out the tax implications of both receiving the prize and donating it back).

This seems like a reasonable way to handle a problem with no real fair solution, although, as usual in large collaborations like Planck, the communications about this left many Planck collaborators in the dark. (Planck also won the Royal Society 2018 Group Achievement Award which, because there is no money involved, could be uncontroversially awarded to the ESA Planck Team, without an explicit list. And the situation is much better than for the Nobel Prize.)

However, this seemingly reasonable solution reveals an even bigger, longer-standing, and wider-ranging problem: only about 50 of the 334 names on the full Planck team list (roughly 15%) are women. This is already appallingly low. Worse still, none of the 43 formerly “identified” members officially receiving a monetary prize are women (although we would have expected about 6 given even that terrible fraction). Put more explicitly, there is not a single woman in the upper reaches of Planck scientific management.

This terrible situation was also noted by my colleague Jean-Luc Starck (one of the larger group of 334) and Olivier Berné. As a slight corrective to this, it was refreshing to see Nature’s take on the end of Planck dominated by interviews with young members of the collaboration including several women who will, we hope, be dominating the field over the coming years and decades.

(Almost) The end of Planck

This week, we released (most of) the final set of papers from the Planck collaboration — the long-awaited Planck 2018 results (which were originally meant to be the “Planck 2016 results”, but everything takes longer than you hope…), available on the ESA website as well as the arXiv. More importantly for many astrophysicists and cosmologists, the final public release of Planck data is also available.

Anyway, we aren’t quite finished: those of you up on your roman numerals will notice that there are only 9 papers but the last one is “XII” — the rest of the papers will come out over the coming months. So it’s not the end, but at least it’s the beginning of the end.

And it’s been a long time coming. I attended my first Planck-related meeting in 2000 or so (and plenty of people had been working on the projects that would become Planck for a half-decade by that point). For the last year or more, the number of people working on Planck has dwindled as grant money has dried up (most of the scientists now analysing the data are doing so without direct funding for the work).

(I won’t rehash the scientific and technical background to the Planck Satellite and the cosmic microwave background (CMB), which I’ve been writing about for most of the lifetime of this blog.)

Planck 2018: the science

So, in the language of the title of the first paper in the series, what is the legacy of Planck? The state of our science is strong. For the first time, we present full results from both the temperature of the CMB and its polarization. Unfortunately, we don’t actually use all the data available to us — on the largest angular scales, Planck’s results remain contaminated by astrophysical foregrounds and unknown “systematic” errors. This is especially true of our measurements of the polarization of the CMB, which is probably Planck’s most significant limitation.

The remaining data are an excellent match for what is becoming the standard model of cosmology: ΛCDM, or “Lambda-Cold Dark Matter”, which is dominated, first, by a component which makes the Universe accelerate in its expansion (Λ, Greek Lambda), usually thought to be Einstein’s cosmological constant; and secondarily by an invisible component that seems to interact only by gravity (CDM, or “cold dark matter”). We have tested for more exotic versions of both of these components, but the simplest model seems to fit the data without needing any such extensions. We also observe the atoms and light which constitute the more prosaic kinds of matter of our day-to-day lives, and which make up only a few percent of the Universe.

All together, the sum of the densities of these components is just enough to make the curvature of the Universe exactly flat through Einstein’s General Relativity and its famous relationship between the amount of stuff (mass) and the geometry of space-time. Furthermore, we can measure the way the matter in the Universe is distributed as a function of the length scale of the structures involved. All of these are consistent with the predictions of the famous or infamous theory of cosmic inflation, which expanded the Universe when it was much less than one second old by factors of more than 10²⁰. This made the Universe appear flat (think of zooming into a curved surface) and expanded the tiny random fluctuations of quantum mechanics so quickly and so much that they eventually became the galaxies and clusters of galaxies we observe today. (Unfortunately, we still haven’t observed the long-awaited primordial B-mode polarization that would be a somewhat direct signature of inflation, although the combination of data from Planck and BICEP2/Keck gives the strongest constraint to date.)

Most of these results are encoded in a function called the CMB power spectrum, something I’ve shown here on the blog a few times before, but I never tire of the beautiful agreement between theory and experiment, so I’ll do it again: [Figure: Planck 2018 CMB power spectra] (The figure is from the Planck “legacy” paper; more details are in others in the 2018 series, especially the Planck “cosmological parameters” paper.) The top panel gives the power spectrum for the Planck temperature data, the second panel the cross-correlation between temperature and the so-called E-mode polarization, the bottom-left panel the polarization-only spectrum, and the bottom-right the spectrum from the gravitational lensing of CMB photons due to matter along the line of sight. (There are also spectra for the B mode of polarization, but Planck cannot distinguish these from zero.) The points are shown with “one sigma” error bars, and the blue curve gives the best-fit model.

As an important aside, these spectra per se are not used to determine the cosmological parameters; rather, we use a Bayesian procedure to calculate the likelihood of the parameters directly from the data. On small scales (corresponding to 𝓁>30, since the multipole 𝓁 is roughly the inverse of an angular scale on the sky), estimates of spectra from individual detectors are used as an approximation to the proper Bayesian formula; on large scales (𝓁<30) we use a more complicated likelihood function, calculated somewhat differently for data from Planck’s High- and Low-frequency instruments, which captures more of the details of the full Bayesian procedure (although, as noted above, we don’t use all possible combinations of polarization and temperature data to avoid contamination by foregrounds and unaccounted-for sources of noise).
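To make the distinction a bit more concrete, here is a toy sketch (in Python; entirely my own illustration, not the actual Planck likelihood code) of the sort of Gaussian band-power approximation that is adequate at high 𝓁; the real thing also marginalises over foreground and calibration parameters, and the low-𝓁 likelihood is a different beast altogether.

    import numpy as np

    def highell_gaussian_loglike(cl_data, cl_model, sigma_cl):
        """Toy Gaussian approximation to a high-ell band-power likelihood.
        Ignores band-to-band correlations, foregrounds, calibration, etc."""
        chi2 = np.sum(((cl_data - cl_model) / sigma_cl) ** 2)
        return -0.5 * chi2

    # Hypothetical band powers (in muK^2), model values and errors,
    # purely to show how the function is called:
    cl_data  = np.array([5750.0, 2480.0, 2260.0])
    cl_model = np.array([5700.0, 2500.0, 2250.0])
    sigma_cl = np.array([60.0, 40.0, 35.0])
    print(highell_gaussian_loglike(cl_data, cl_model, sigma_cl))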

Of course, not all cosmological data, from Planck and elsewhere, seem to agree completely with the theory. Perhaps most famously, local measurements of how fast the Universe is expanding today — the Hubble constant — give a value of H0 = (73.52 ± 1.62) km/s/Mpc (the units describe how much faster something is moving away from us, in km/s, for every megaparsec (Mpc) of distance), whereas Planck (which infers the value within a constrained model) gives (67.27 ± 0.60) km/s/Mpc. This is a pretty significant discrepancy and, unfortunately, it seems difficult to find an interesting cosmological effect that could be responsible for these differences. Rather, we are forced to expect that it is due to one or more of the experiments having some unaccounted-for source of error.
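For a rough sense of the size of the discrepancy, treating the two results as independent Gaussian measurements gives a separation of about 3.6σ — a back-of-the-envelope calculation (my own, in Python):

    import numpy as np

    # Hubble-constant values quoted above, in km/s/Mpc
    h0_local,  sigma_local  = 73.52, 1.62   # local distance-ladder measurement
    h0_planck, sigma_planck = 67.27, 0.60   # Planck, within the standard model

    # naive "number of sigma", assuming independent Gaussian errors
    n_sigma = (h0_local - h0_planck) / np.hypot(sigma_local, sigma_planck)
    print(round(n_sigma, 1))   # about 3.6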

The term of art for these discrepancies is “tension”, and indeed there are a few other tensions between Planck and other datasets: weak gravitational lensing measurements of the distortion of light rays due to the clustering of matter in the relatively nearby Universe show evidence for slightly weaker clustering than that inferred from Planck data. There are tensions even within Planck, when we measure the same quantities by different means (including things related to similar gravitational lensing effects). But, just as “half of all three-sigma results are wrong”, we expect that we have mis- or under-estimated (or, to quote the no-longer-in-the-running-for-the-worst president ever, “misunderestimated”) our errors much or all of the time, and should really learn to expect this sort of thing. Some of these tensions may turn out to be real, but many will be statistical flukes or systematic experimental errors.

(If you are looking for a briefer but more technical fly-through of the Planck results — from someone not on the Planck team — check out Renee Hlozek’s tweetstorm.)

Planck 2018: lessons learned

So, Planck has more or less lived up to its advanced billing as providing definitive measurements of the cosmological parameters, while still leaving enough “tensions” and other open questions to keep us cosmologists working for decades to come (we are already planning the next generation of ground-based telescopes and satellites for measuring the CMB).

But did we do things in the best possible way? Almost certainly not. My colleague (and former grad student!) Joe Zuntz has pointed out that we don’t use any explicit “blinding” in our statistical analysis. The point is to avoid our own biases when doing an analysis: you don’t want to stop looking for sources of error when you agree with the model you thought would be true. This works really well when you can enumerate all of your sources of error and then simulate them. In practice, most collaborations (such as the Polarbear team with whom I also work) choose to un-blind some results exactly to be able to find such sources of error, and indeed this is the motivation behind the scores of “null tests” that we run on different combinations of Planck data. We discuss this a little in an appendix of the “legacy” paper — null tests are important, but we have often found that a fully blind procedure isn’t powerful enough to find all sources of error, and in many cases (including some motivated by external scientists looking at Planck data) it was exactly low-level discrepancies within the processed results that have led us to new systematic effects. A more fully-blind procedure would be preferable, of course, but I hope this is a case of the great being the enemy of the good (or good enough). I suspect that those next-generation CMB experiments will incorporate blinding from the beginning.
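For readers unfamiliar with the jargon, a null test in this context is schematically very simple: split the data in two (by detector, by time period, and so on), difference the two halves, and check that the result is consistent with zero given the noise. A minimal sketch (mine, not the Planck pipeline):

    import numpy as np

    def null_test(spectrum_a, spectrum_b, noise_var_a, noise_var_b):
        """Difference two half-data band-power estimates and return the
        chi-squared of the difference against zero, plus the degrees of
        freedom; chi2 >> ndof would suggest a systematic effect."""
        diff = spectrum_a - spectrum_b
        var = noise_var_a + noise_var_b   # variance of the difference
        return np.sum(diff**2 / var), len(diff)

    # Hypothetical half-mission estimates, consistent by construction:
    rng = np.random.default_rng(1)
    noise_var = np.full(50, 25.0)
    half1 = rng.normal(0.0, np.sqrt(noise_var))
    half2 = rng.normal(0.0, np.sqrt(noise_var))
    print(null_test(half1, half2, noise_var, noise_var))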

Further, although we have released a lot of software and data to the community, it would be very difficult to reproduce all of our results. Nowadays, experiments are moving toward a fully open-source model, where all the software is publicly available (in Planck, not all of our analysis software was available to other members of the collaboration, much less to the community at large). This does impose an extra burden on the scientists, but it is probably worth the effort, and again, needs to be built into the collaboration’s policies from the start.

That’s the science and methodology. But Planck is also important as having been one of the first examples of what is now pretty standard in astrophysics: a collaboration of many hundreds of scientists (and many hundreds more engineers, administrators, and others without whom Planck would not have been possible). In the end, we persisted, and persevered, and did some great science. But I learned that scientists need to get better at communicating, both from the top of the organisation down, and from the “bottom” (I hesitate to use that word, since that is where much of the real work is done) up, especially when those lines of hoped-for communication run between different labs or universities, very often in different countries. Physicists, I have learned, can be pretty bad at managing — and at being managed. This isn’t a great combination, and I say this as a middle-manager in the Planck organisation, very much guilty on both fronts.

Leon Lucy, R.I.P.

I have the unfortunate duty of using this blog to announce the death a couple of weeks ago of Professor Leon B Lucy, who had been a Visiting Professor working here at Imperial College from 1998.

Leon got his PhD in the early 1960s at the University of Manchester, and after postdoctoral positions in Europe and the US, worked at Columbia University and the European Southern Observatory over the years, before coming to Imperial. He made significant contributions to the study of the evolution of stars, understanding in particular how they lose mass over the course of their evolution, and how very close binary stars interact and evolve inside their common envelope of hot gas.

Perhaps most importantly, early in his career Leon realised how useful computers could be in astrophysics. He made two major methodological contributions to astrophysical simulations. First, he realised that by simulating randomised trajectories of single particles, he could take into account more physical processes that occur inside stars. This is now called “Monte Carlo Radiative Transfer” (scientists often use the term “Monte Carlo” — after the European gambling capital — for techniques using random numbers). He also invented the technique now called smoothed-particle hydrodynamics which models gases and fluids as aggregates of pseudo-particles, now applied to models of stars, galaxies, and the large scale structure of the Universe, as well as many uses outside of astrophysics.

Leon’s other major numerical contributions comprise advanced techniques for interpreting the complicated astronomical data we get from our telescopes. In this realm, he was most famous for developing the methods, now known as Lucy-Richardson deconvolution, that were used for correcting the distorted images from the Hubble Space Telescope, before NASA was able to send a team of astronauts to install correcting lenses in the early 1990s.

For all of this work Leon was awarded the Gold Medal of the Royal Astronomical Society in 2000. Since then, Leon kept working on data analysis and stellar astrophysics — even during his illness, he asked me to help organise the submission and editing of what turned out to be his final papers, on extracting information on binary-star orbits and (a subject dear to my heart) the statistics of testing scientific models.

Until the end of last year, Leon was a regular presence here at Imperial, always ready to contribute an occasionally curmudgeonly but always insightful comment on the science (and sociology) of nearly any topic in astrophysics. We hope that we will be able to appropriately memorialise his life and work here at Imperial and elsewhere. He is survived by his wife and daughter. He will be missed.

WMAP Breaks Through

It was announced this morning that the WMAP team has won the $3 million Breakthrough Prize. Unlike the Nobel Prize, which infamously is only awarded to three people each year, the Breakthrough Prize was awarded to the whole 27-member WMAP team, led by Chuck Bennett, Gary Hinshaw, Norm Jarosik, Lyman Page, and David Spergel, but including everyone through postdocs and grad students who worked on the project. This is great, and I am happy to send my hearty congratulations to all of them (many of whom I know well and am lucky to count as friends).

I actually knew about the prize last week as I was interviewed by Nature for an article about it. Luckily I didn’t have to keep the secret for long. Although I admit to a little envy, it’s hard to argue that the prize wasn’t deserved. WMAP was ideally placed to solidify the current standard model of cosmology, a Universe dominated by dark matter and dark energy, with strong indications that there was a period of cosmological inflation at very early times, which had several important observational consequences. First, it made the geometry of the Universe — as described by Einstein’s theory of general relativity, which links the contents of the Universe with its shape — flat. Second, it generated the tiny initial seeds which eventually grew into the galaxies that we observe in the Universe today (and the stars and planets within them, of course).

By the time WMAP released its first results in 2003, a series of earlier experiments (including MAXIMA and BOOMERanG, which I had the privilege of being part of) had gone much of the way toward this standard model. Indeed, about ten years ago one of my Imperial colleagues, Carlo Contaldi, and I wanted to make that comparison explicit, so we used what were then considered fancy Bayesian sampling techniques to combine the data from balloons and ground-based telescopes (which are collectively known as “sub-orbital” experiments) and compare the results to WMAP. We got a plot like the following (which we never published), showing the main quantity that these CMB experiments measure, called the power spectrum (which I’ve discussed in a little more detail here). The horizontal axis corresponds to the size of structures in the map (actually, its inverse, so smaller is to the right) and the vertical axis to how large the signal is on those scales.

[Figure: “Grand unified spectrum” — combined power spectrum from WMAP and the sub-orbital experiments]

As you can see, the suborbital experiments, en masse, had data at least as good as WMAP on most scales except the very largest (leftmost; this is because you really do need a satellite to see the entire sky) and indeed were able to probe smaller scales than WMAP (to the right). Since then, I’ve had the further privilege of being part of the Planck Satellite team, whose work has superseded all of these, giving much more precise measurements over all of these scales: [Figure: Planck CMB power spectrum]

Am I jealous? Ok, a little bit.

But it’s also true, perhaps for entirely sociological reasons, that the community is more apt to trust results from a single, monolithic, very expensive satellite than an ensemble of results from a heterogeneous set of balloons and telescopes, run on (comparative!) shoestrings. On the other hand, the overall agreement amongst those experiments, and between them and WMAP, is remarkable.

And that agreement remains remarkable, even if much of the effort of the cosmology community is devoted to understanding the small but significant differences that remain, especially between one monolithic and expensive satellite (WMAP) and another (Planck). Indeed, those “real and serious” (to quote myself) differences would be hard to see even if I plotted them on the same graph. But since both are ostensibly measuring exactly the same thing (the CMB sky), any differences — even those much smaller than the error bars — must be accounted for, and almost certainly boil down to differences in the analyses or misunderstandings of each team’s own data. Somewhat more interesting are differences between CMB results and measurements of cosmology from other, very different, methods, but that’s a story for another day.

The first direct detection of gravitational waves was announced in February of 2016 by the LIGO team, after decades of planning, building and refining their beautiful experiment. Since that time, the US-based LIGO has been joined by the European Virgo gravitational wave telescope (and more are planned around the globe).

The first four events that the teams announced were from the spiralling in and eventual mergers of pairs of black holes, with masses ranging from about seven to about forty times the mass of the sun. These masses are perhaps a bit higher than we expect to be typical, which might raise intriguing questions about how such black holes were formed and evolved, although even comparing the results to the predictions is a hard problem, depending on the details of the statistical properties of the detectors and the astrophysical models for the evolution of black holes and the stars from which (we think) they formed.

Last week, the teams announced the detection of a very different kind of event, the collision of two neutron stars, each about 1.4 times the mass of the sun. Neutron stars are one possible end state of the evolution of a star, when its atoms are no longer able to withstand the pressure of the gravity trying to force them together. This was first understood by S. Chandrasekhar in 1930, who realised that there was a limit to the mass of a star held up simply by the quantum-mechanical repulsion of the electrons at the outskirts of the atoms making up the star. When you surpass this mass, known, appropriately enough, as the Chandrasekhar mass, the star will collapse in upon itself, combining the electrons and protons into neutrons and likely releasing a vast amount of energy in the form of a supernova explosion. After the explosion, the remnant is likely to be a dense ball of neutrons, whose properties are actually determined fairly precisely by similar physics to that of the Chandrasekhar limit (discussed for this case by Oppenheimer, Volkoff and Tolman), giving us the magic 1.4 solar mass number.
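The remarkable thing is that this mass is set almost entirely by fundamental constants; an order-of-magnitude version of the argument (my own illustration, in Python, leaving out the numerical prefactor of order a few that the full calculation supplies) already lands near a solar mass:

    # M_Ch ~ (hbar*c/G)**(3/2) / (mu_e*m_H)**2, with mu_e ~ 2 nucleons per electron
    hbar  = 1.0545718e-34   # J s
    c     = 2.99792458e8    # m/s
    G     = 6.674e-11       # m^3 kg^-1 s^-2
    m_H   = 1.6735575e-27   # kg
    M_sun = 1.989e30        # kg
    mu_e  = 2.0

    m_ch_scale = (hbar * c / G)**1.5 / (mu_e * m_H)**2
    print(m_ch_scale / M_sun)   # ~0.5: the right order of magnitude;
                                # the detailed calculation gives ~1.4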

(Last week also coincidentally would have seen Chandrasekhar’s 107th birthday, and Google chose to illustrate their home page with an animation in his honour for the occasion. I was a graduate student at the University of Chicago, where Chandra, as he was known, spent most of his career. Most of us students were far too intimidated to interact with him, although it was always seen as an auspicious occasion when you spotted him around the halls of the Astronomy and Astrophysics Center.)

This process can therefore make a single 1.4 solar-mass neutron star, and we can imagine that in some rare cases we can end up with two neutron stars orbiting one another. Indeed, the fact that LIGO saw one, but only one, such event during its year-and-a-half run allows the teams to constrain how often that happens, albeit with very large error bars, between 320 and 4740 events per cubic gigaparsec per year; a cubic gigaparsec is about 3 billion light-years on each side, so these are rare events indeed. These results and many other scientific inferences from this single amazing observation are reported in the teams’ overview paper.

A series of other papers discuss those results in more detail, covering everything from the physics of neutron stars to limits on departures from Einstein’s theory of gravity (for more on some of these other topics, see this blog, or this story from the NY Times). As a cosmologist, the most exciting of the results was the use of the event as a “standard siren”, an object whose gravitational wave properties are well-enough understood that we can deduce the distance to the object from the LIGO results alone. Although the idea came from Bernard Schutz in 1986, the term “standard siren” was coined somewhat later (by Sean Carroll) in analogy to the (heretofore?) more common cosmological standard candles and standard rulers: objects whose intrinsic brightness or size is known, and so whose distances can be measured from their apparent brightness or size, just as you can roughly deduce how far away a light bulb is by how bright it appears, or how far away a familiar object or person is by how big it looks.

Gravitational wave events are standard sirens because our understanding of relativity is good enough that an observation of the shape of the gravitational wave pattern as a function of time can tell us the properties of its source. Knowing that, we also then know the amplitude of that pattern when it was released. Over the time since then, as the gravitational waves have travelled across the Universe toward us, the amplitude has gone down (just as more distant objects look dimmer, more distant sirens “sound” quieter); the expansion of the Universe also causes the frequency of the waves to decrease — this is the cosmological redshift that we observe in the spectra of distant objects’ light.

Unlike LIGO’s previous detections of binary-black-hole mergers, this new observation of a binary-neutron-star merger was also seen in photons: first as a gamma-ray burst, and then as a “nova”: a new dot of light in the sky. Indeed, the observation of the afterglow of the merger by teams of literally thousands of astronomers in gamma and x-rays, optical and infrared light, and in the radio, is one of the more amazing pieces of academic teamwork I have seen.

And these observations allowed the teams to identify the host galaxy of the original neutron stars, and to measure the redshift of its light (the lengthening of the light’s wavelength due to the movement of the galaxy away from us). It is most likely a previously unexceptional galaxy called NGC 4993, with a redshift z=0.009, putting it about 40 megaparsecs away, relatively close on cosmological scales.

But this means that we can measure all of the factors in one of the most celebrated equations in cosmology, Hubble’s law: cz = H₀d, where c is the speed of light, z is the redshift just mentioned, and d is the distance measured from the gravitational wave burst itself. This just leaves H₀, the famous Hubble Constant, giving the current rate of expansion of the Universe, usually measured in kilometres per second per megaparsec. The old-fashioned way to measure this quantity is via the so-called cosmic distance ladder, bootstrapping up from nearby objects of known distance to more distant ones whose properties can only be calibrated by comparison with those more nearby. But errors accumulate in this process and we can be susceptible to the weakest rung on the ladder (see recent work by some of my colleagues trying to formalise this process). Alternately, we can use data from cosmic microwave background (CMB) experiments like the Planck Satellite (see here for lots of discussion on this blog); the typical size of the CMB pattern on the sky is something very like a standard ruler. Unfortunately, it, too, needs to be calibrated, implicitly by other aspects of the CMB pattern itself, and so ends up being a somewhat indirect measurement. Currently, the best cosmic-distance-ladder measurement gives something like 73.24 ± 1.74 km/s/Mpc whereas Planck gives 67.81 ± 0.92 km/s/Mpc; these numbers disagree by “a few sigma”, enough that it is hard to explain as simply a statistical fluctuation.
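Just to see how the pieces of Hubble’s law fit together, here is the deliberately naive version of the calculation (ignoring the peculiar velocity and the distance uncertainty discussed below — my own illustration, not the teams’ analysis):

    # Naive Hubble-constant estimate from the standard siren: H0 = c*z/d
    c = 299792.458    # speed of light, km/s
    z = 0.009         # redshift of the host galaxy NGC 4993
    d = 40.0          # distance in Mpc inferred from the gravitational waves
    print(c * z / d)  # roughly 67 km/s/Mpc, happily in the right ballpark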

Unfortunately, the new LIGO results do not solve the problem. Because we cannot observe the inclination of the neutron-star binary (i.e., the orientation of its orbit), this blows up the error on the distance to the object, due to the Bayesian marginalisation over this unknown parameter (just as the Planck measurement requires marginalization over all of the other cosmological parameters to fully calibrate the results). Because the host galaxy is relatively nearby, the teams must also account for the fact that the redshift includes the effect not only of the cosmological expansion but also the movement of galaxies with respect to one another due to the pull of gravity on relatively large scales; this so-called peculiar velocity has to be modelled which adds further to the errors.

This procedure gives a final measurement of 70.0 (+12.0, −8.0) km/s/Mpc, with the full shape of the probability curve shown in the Figure, taken directly from the paper. Both the Planck and distance-ladder results are consistent with these rather large error bars. But this is calculated from a single object; as more of these events are seen, these error bars will go down, typically by something like the square root of the number of events, so it might not be too long before this is the best way to measure the Hubble Constant.

[Figure: probability curve for the Hubble constant from the gravitational-wave standard-siren measurement]
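To get a feeling for that square-root scaling (rough numbers of my own, not from the paper): with a per-event error of very roughly 10 km/s/Mpc, reaching the ~1 km/s/Mpc precision needed to weigh in on the Planck–distance-ladder discrepancy takes of order a hundred such events.

    # error after N events ~ per-event error / sqrt(N)
    sigma_one = 10.0   # very rough per-event error, km/s/Mpc
    target    = 1.0    # precision needed to address the H0 discrepancy
    print((sigma_one / target)**2)   # => about 100 events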

[Apologies: too long, too technical, and written late at night while trying to get my wonderful not-quite-three-week-old daughter to sleep through the night.]

JSONfeed

More technical stuff, but I’m trying to re-train myself to actually write on this blog, so here goes…

For no good reason other than it was easy, I have added a JSONfeed to this blog. It can be found at http://andrewjaffe.net/blog/feed.json, and accessed from the bottom of the right-hand sidebar if you’re actually reading this at andrewjaffe.net.

What does this mean? JSONfeed is an idea for a sort-of successor to something called RSS, which may stand for really simple syndication, a format for encapsulating the contents of a blog like this one so it can be indexed, consumed, and read in a variety of ways without explicitly going to my web page. RSS was created by developer, writer, and all around web-and-software guru Dave Winer, who also arguably invented — and was certainly part of the creation of — blogs and podcasting. Five or ten years ago, so-called RSS readers were starting to become a common way to consume news online. NetNewsWire was my old favourite on the Mac, although its original versions by Brent Simmons were much better than the current incarnation by a different software company; I now use something called Reeder. But the most famous one was Google Reader, which Google discontinued in 2013, thereby killing off most of the RSS-reader ecosystem.

But RSS is not dead: RSS readers still exist, and it is still used to store and transfer information between web pages. Perhaps most importantly, it is the format behind subscriptions to podcasts, whether you get them through Apple or Android or almost anyone else.

But RSS is kind of clunky, because it’s built on something called XML, an ugly but readable format for structuring information in files (HTML, used for the web, with all of its < and > “tags”, is a close cousin). Nowadays, people use a simpler format called JSON for many of the same purposes as XML; it is quite a bit easier for humans to read and write, and (not coincidentally) quite a bit easier to write computer programs that read and write it.

So, finally, two more web-and-software developers/gurus, Brent Simmons and Manton Reece, realised they could use JSON for the same purposes as RSS. Simmons is behind NetNewsWire, and Reece’s most recent project is an “indie microblogging” platform (think Twitter without the giant company behind it), so they both have an interest in these things. And because JSON is so comparatively easy to use, there is already code that I could easily add to this blog so it would have its own JSONfeed. So I did it.
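For the curious, generating a minimal feed really is almost trivial; here’s a sketch using Python’s standard json module. The field names follow my reading of the JSON Feed (version 1) spec, and the item shown is a made-up example, so treat this as illustrative rather than as the code behind this site:

    import json

    # A minimal, illustrative JSON Feed; titles and item URLs are placeholders.
    feed = {
        "version": "https://jsonfeed.org/version/1",
        "title": "An example blog",
        "home_page_url": "http://andrewjaffe.net/blog/",
        "feed_url": "http://andrewjaffe.net/blog/feed.json",
        "items": [
            {
                "id": "http://example.net/blog/jsonfeed.html",   # hypothetical post
                "url": "http://example.net/blog/jsonfeed.html",
                "title": "JSONfeed",
                "content_html": "<p>More technical stuff, but…</p>",
                "date_published": "2017-05-20T00:00:00+01:00",
            }
        ],
    }

    with open("feed.json", "w") as f:
        json.dump(feed, f, indent=2)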

So it’s easy to create a JSONfeed. What there isn’t — so far — is a newsreader like NetNewsWire or Reeder that can ingest them. (In fact, Maxime Vaillancourt apparently wrote a web-based reader in about an hour, but it may already be overloaded…). Still, looking forward to seeing what happens.

Python Bug Hunting

This is a technical, nerdy post, mostly so I can find the information if I need it later, but possibly of interest to others using a Mac with the Python programming language, and also since I am looking for excuses to write more here. (See also updates below.)

It seems that there is a bug in the latest (mid-May 2017) release of Apple’s macOS Sierra 10.12.5 (ok, there are plenty of bugs, as there are in any sufficiently complex piece of software).

It first manifested itself (to me) as an error when I tried to load the jupyter notebook, a web-based graphical front end to Python (and other languages). When the command is run, it opens up a browser window. However, after updating macOS from 10.12.4 to 10.12.5, the browser didn’t open. Instead, I saw an error message:

    0:97: execution error: "http://localhost:8888/tree?token=<removed>" doesn't understand the "open location" message. (-1708)

A little googling found that other people had seen this error, too. I was able to figure out a workaround pretty quickly: this behaviour only happens when I want to use the “default” browser, which is set in the “General” tab of the “System Preferences” app on the Mac (I have it set to Apple’s own “Safari” browser, but you can use Firefox or Chrome or something else). Instead, there’s a text file you can edit to explicitly set the browser that you want jupyter to use, located at ~/.jupyter/jupyter_notebook_config.py, by including the line

c.NotebookApp.browser = u'Safari'

(although an unrelated bug in Python means that you can’t currently use “Chrome” in this slot).

But it turns out this isn’t the real problem. I went and looked at the code in jupyter that is run here, and it uses a Python module called webbrowser. Even outside of jupyter, trying to use this module to open the default browser fails, with exactly the same error message (though I’m picking a simpler URL at http://python.org instead of the jupyter-related one above):

>>> import webbrowser
>>> br = webbrowser.get()
>>> br.open("http://python.org")
0:33: execution error: "http://python.org" doesn't understand the "open location" message. (-1708)
False

So I reported this as an error in the Python bug-reporting system, and hoped that someone with more experience would look at it.

But it nagged at me, so I went and looked at the source code for the webbrowser module. There, it turns out that the programmers use a macOS command called “osascript” (which is a command-line interface to Apple’s macOS automation language “AppleScript”) to launch the browser, with a slightly different syntax for the default browser compared to explicitly picking one. Basically, the command is osascript -e 'open location "http://www.python.org/"'. And this fails with exactly the same error message. (The similar code osascript -e 'tell application "Safari" to open location "http://www.python.org/"' which picks a specific browser runs just fine, which is why explicitly setting “Safari” back in the jupyter file works.)
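If you want to poke at the two cases yourself without going through jupyter or the webbrowser module, you can call osascript directly from Python (this is just my minimal test harness, nothing from the jupyter or Python source):

    import subprocess

    URL = "http://www.python.org/"
    scripts = [
        'open location "%s"' % URL,                               # default browser: the failing case
        'tell application "Safari" to open location "%s"' % URL,  # named browser: works
    ]

    for script in scripts:
        result = subprocess.run(["osascript", "-e", script],
                                stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                                universal_newlines=True)
        print(script)
        print("  return code:", result.returncode)
        print("  stderr:", result.stderr.strip())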

But there is another way to run the exact same AppleScript command. Open the Mac app called “Script Editor”, type open location "http://python.org" into the window, and press the “run” button. From the experience with “osascript”, I expected it to fail, but it didn’t: it runs just fine.

So the bug is very specific, and very obscure: it depends on exactly how the offending command is run, so appears to be a proper bug, and not some sort of security patch from Apple (and it certainly doesn’t appear in the 10.12.5 release notes). I have filed a bug report with Apple, but these are not publicly accessible, and are purported to be something of a black hole, with little feedback from the still-secretive Apple development team.

Updates:

Knightian Uncertainty

[Update: I have fixed some broken links, and modified the discussion of QBism and the recent paper by Chris Fuchs — thanks to Chris himself for taking the time to read and find my mistakes!]

For some reason, I’ve come across an idea called “Knightian Uncertainty” quite a bit lately. Frank Knight was an economist of the free-market conservative “Chicago School”, who considered various concepts related to probability in a book called Risk, Uncertainty, and Profit. He distinguished between “risk”, which he defined as applying to events to which we can assign a numerical probability, and “uncertainty”, to those events about which we know so little that we don’t even have a probability to assign, or indeed those events whose possibility we didn’t even contemplate until they occurred. In Rumsfeldian language, “risk” applies to “known unknowns”, and “uncertainty” to “unknown unknowns”. Or, as Nassim Nicholas Taleb put it, “risk” is about “white swans”, while “uncertainty” is about those unexpected “black swans”.

(As a linguistic aside, to me, “uncertainty” seems a milder term than “risk”, and so the naming of the concepts is backwards.)

Actually, there are a couple of slightly different concepts at play here. The black swans or unknown-unknowns are events that one wouldn’t have known enough about to even include in the probabilities being assigned. This is much more severe than those events that one knows about, but for which one doesn’t have a good probability to assign.

And the important word here is “assign”. Probabilities are not something out there in nature, but in our heads. So what should a Bayesian make of these sorts of uncertainty? By definition, they can’t be used in Bayes’ theorem, which requires specifying a probability distribution. Bayesian theory is all about making models of the world: we posit a mechanism and possible outcomes, and assign probabilities to the parts of the model that we don’t know about.

So I think the two different types of Knightian uncertainty have quite a different role here. In the case where we know that some event is possible, but we don’t really know what probabilities to assign to it, we at least have a starting point. If our model is broad enough, then enough data will allow us to measure the parameters that describe it. For example, in recent years people have started to realise that the frequencies of rare, catastrophic events (financial crashes, earthquakes, etc.) are very often well described by so-called power-law distributions. These assign much greater probabilities to such events than more typical Gaussian (bell-shaped curve) distributions; the shorthand for this is that power-law distributions have much heavier tails than Gaussians. As long as our model includes the possibility of these heavy tails, we should be able to make predictions based on data, although very often those predictions won’t be very precise.
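To see just how different the tails are, compare the chance of a “ten sigma” event under a Gaussian with the chance of an event ten times the typical scale under a power law (the particular power-law index below is arbitrary, chosen only for illustration):

    from scipy import stats

    p_gauss = stats.norm.sf(10.0)            # Gaussian tail beyond 10 sigma
    p_power = stats.pareto.sf(10.0, b=2.0)   # power-law (Pareto) tail, index 2

    print("Gaussian :", p_gauss)   # ~8e-24 -- essentially never
    print("Power law:", p_power)   # 0.01   -- rare, but entirely possible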

But the “black swan” problem is much worse: these are possibilities that we don’t even know enough about to consider in our model. Almost by definition, one can’t say anything at all about this sort of uncertainty. But what one must do is be open-minded enough to adjust our models in the face of new data: we can’t predict the black swan, but we should expand the model after we’ve seen the first one (and perhaps revise our model for other waterfowl to allow more varieties!). In more traditional scientific settings, involving measurements with errors, this is even more difficult: a seemingly anomalous result, not allowed in the model, may be due to some mistake in the experimental setup or in our characterisation of the probabilities of those inevitable errors (perhaps they should be described by heavy-tailed power laws, rather than Gaussian distributions as above).

I first came across the concept as an oblique reference in a recent paper by Chris Fuchs, writing about his idea of QBism (or see here for a more philosophically-oriented discussion), an interpretation of quantum mechanics that takes seriously the Bayesian principle that all probabilities are about our knowledge of the world, rather than the world itself (which is a discussion for another day). He tentatively opined that the probabilities in quantum mechanics are themselves “Knightian”, referring not to a reading of Knight himself but to some recent, and to me frankly bizarre, ideas from Scott Aaronson, discussed in his paper, The Ghost in the Quantum Turing Machine, and an accompanying blog post, trying to base something like “free will” (a term he explicitly does not apply to this idea, however) on the possibility of our brains having so-called “freebits”, quantum states whose probabilities are essentially uncorrelated with anything else in the Universe. This arises from what is to me a mistaken desire to equate “freedom” with complete unpredictability. My take on free will is instead aligned with that of Daniel Dennett, at least the version from his Consciousness Explained from the early 1990s, as I haven’t yet had the chance to read his recent From Bacteria to Bach and Back: a perfectly deterministic (or quantum mechanically random, even allowing for the statistical correlations that Aaronson wants to be rid of) version of free will is completely sensible, and indeed may be the only kind of free will worth having.

Fuchs himself tentatively uses Aaronson’s “Knightian Freedom” to refer to his own idea

that nature does what it wants, without a mechanism underneath, and without any “hidden hand” of the likes of Richard von Mises’s Kollective or Karl Popper’s propensities or David Lewis’s objective chances, or indeed any conception that would diminish the autonomy of nature’s events,

which I think is an attempt (and which I admit I don’t completely understand) to remove the probabilities of quantum mechanics entirely from any mechanistic account of physical systems, despite the incredible success of those probabilities in predicting the outcomes of experiments and other observations of quantum mechanical systems. I’m not quite sure this is what either Knight or Aaronson had in mind with their use of “uncertainty” (or “freedom”), since at least in quantum mechanics, we do know what probabilities to assign, given certain other personal (as Fuchs would have it) information about the system. My Bayesian predilections make me sympathetic with this idea, but then I struggle to understand what, exactly, quantum mechanics has taught us about the world: why do the predictions of quantum mechanics work?

When I’m not thinking about physics, for the last year or so my mind has been occupied with politics, so I was amused to see Knightian Uncertainty crop up in a New Yorker article about Trump’s effect on the stock market:

Still, in economics there’s a famous distinction, developed by the great Chicago economist Frank Knight, between risk and uncertainty. Risk is when you don’t know exactly what will happen but nonetheless have a sense of the possibilities and their relative likelihood. Uncertainty is when you’re so unsure about the future that you have no way of calculating how likely various outcomes are. Business is betting that Trump is risky but not uncertain—he may shake things up, but he isn’t going to blow them up. What they’re not taking seriously is the possibility that Trump may be willing to do things—like start a trade war with China or a real war with Iran—whose outcomes would be truly uncertain.

It’s a pretty low bar, but we can only hope.

SOLE Survivor

I recently finished my last term lecturing our second-year Quantum Mechanics course, which I taught for five years. It’s a required class, a mathematical introduction to one of the most important sets of ideas in all of physics, and really the basis for much of what we do, whether that’s astrophysics or particle physics or almost anything else. It’s a slightly “old-fashioned” course, although it covers the important basic ideas: the Schrödinger Equation, the postulates of quantum mechanics, angular momentum, and spin, leading almost up to what is needed to understand the crowning achievement of early quantum theory: the structure of the hydrogen atom (and other atoms).

A more modern approach might start with qubits: the simplest systems that show quantum mechanical behaviour, and the study of which has led to the revolution in quantum information and quantum computing.

Moreover, the lectures rely on the so-called Copenhagen interpretation, which is the confusing and sometimes contradictory way that most physicists are taught to think about the basic ontology of quantum mechanics: what it says about what the world is “made of” and what happens when you make a quantum-mechanical measurement of that world. Indeed, it’s so confusing and contradictory that you really need another rule so that you don’t complain when you start to think too deeply about it: “shut up and calculate”. A more modern approach might also discuss the many-worlds approach, and — my current favorite — the (of course) Bayesian ideas of QBism.

The students seemed pleased with the course as it is — at the end of the term, they have the chance to give us some feedback through our “Student On-Line Evaluation” system, and my marks have been pretty consistent. Of the 200 or so students in the class, only about 90 bothered to give their evaluations, which is disappointingly few. But it’s enough (I hope) to get a feeling for what they thought.

[Figure: SOLE 2016 evaluation results chart]

So, most students Definitely/Mostly Agree with the good things, although it’s clear that our students are most disappointed in the feedback that they receive from us (this is a wider issue for us in Physics at Imperial, and indeed more generally, and may partially explain why most of them are unwilling to feed back to us through this form).

But much more fun and occasionally revealing are the “free-text comments”. Given the numerical scores, it’s not too surprising that there were plenty of positive ones:

  • Excellent lecturer - was enthusiastic and made you want to listen and learn well. Explained theory very well and clearly and showed he responded to suggestions on how to improve.

  • Possibly the best lecturer of this term.

  • Thanks for providing me with the knowledge and top level banter.

  • One of my favourite lecturers so far, Jaffe was entertaining and cleary very knowledgeable. He was always open to answering questions, no matter how simple they may be, and gave plenty of opportunity for students to ask them during lectures. I found this highly beneficial. His lecturing style incorporates well the blackboards, projectors and speach and he finds a nice balance between them. He can be a little erratic sometimes, which can cause confusion (e.g. suddenly remembering that he forgot to write something on the board while talking about something else completely and not really explaining what he wrote to correct it), but this is only a minor fix. Overall VERY HAPPY with this lecturer!

But some were more mixed:

  • One of the best, and funniest, lecturers I’ve had. However, there are some important conclusions which are non-intuitively derived from the mathematics, which would be made clearer if they were stated explicitly, e.g. by writing them on the board.

  • I felt this was the first time I really got a strong qualitative grasp of quantum mechanics, which I certainly owe to Prof Jaffe’s awesome lectures. Sadly I can’t quite say the same about my theoretical grasp; I felt the final third of the course less accessible, particularly when tackling angular momentum. At times, I struggled to contextualise the maths on the board, especially when using new techniques or notation. I mostly managed to follow Prof Jaffe’s derivations and explanations, but struggled to understand the greater meaning. This could be improved on next year. Apart from that, I really enjoyed going to the lectures and thought Prof Jaffe did a great job!

  • The course was inevitably very difficult to follow.

And several students explicitly commented on my attempts to get students to ask questions in as public a way as possible, so that everyone can benefit from the answers and — this really is true! — because there really are no embarrassing questions!

  • Really good at explaining and very engaging. Can seem a little abrasive at times. People don’t like asking questions in lectures, and not really liking people to ask questions in private afterwards, it ultimately means that no questions really get answered. Also, not answering questions by email makes sense, but no one really uses the blackboard form, so again no one really gets any questions answered. Though the rationale behind not answering email questions makes sense, it does seem a little unnecessarily difficult.

  • We are told not to ask questions privately so that everyone can learn from our doubts/misunderstandings, but I, amongst many people, don’t have the confidence to ask a question in front of 250 people during a lecture.

  • Forcing people to ask questions in lectures or publically on a message board is inappropriate. I understand it makes less work for you, but many students do not have the confidence to ask so openly, you are discouraging them from clarifying their understanding.

Inevitably, some of the comments were contradictory:

  • Would have been helpful to go through examples in lectures rather than going over the long-winded maths to derive equations/relationships that are already in the notes.

  • Professor Jaffe is very good at explaining the material. I really enjoyed his lectures. It was good that the important mathematics was covered in the lectures, with the bulk of the algebra that did not contribute to understanding being left to the handouts. This ensured we did not get bogged down in unnecessary mathematics and that there was more emphasis on the physics. I liked how Professor Jaffe would sometimes guide us through the important physics behind the mathematics. That made sure I did not get lost in the maths. A great lecture course!

And also inevitably, some students wanted to know more about the exam:

  • It is a difficult module, however well covered. The large amount of content (between lecture notes and handouts) is useful. Could you please identify what is examinable though as it is currently unclear and I would like to focus my time appropriately?

And one comment was particularly worrying (along with my seeming “a little abrasive at times”, above):

  • The lecturer was really good in lectures. however, during office hours he was a bit arrogant and did not approach the student nicely, in contrast to the behaviour of all the other professors I have spoken to

If any of the students are reading this, and are willing to comment further on this, I’d love to know more — I definitely don’t want to seem (or be!) arrogant or abrasive.

But I’m happy to see that most students don’t seem to think so, and even happier to have learned that I’ve been nominated “multiple times” for Imperial’s Student Academic Choice Awards!

Finally, best of luck to my colleague Jonathan Pritchard, who will be taking over teaching the course next year.

Electoral woes and votes

Like everyone else in my bubble, I’ve been angrily obsessing about the outcome of the US Presidential election for the last two weeks. I’d like to say that I’ve been channelling that obsession into action, but so far I’ve mostly been reading and hoping (and being disappointed). And trying to parse all the “explanations” for Trump’s election.

Mostly, it’s been about what the Democrats did wrong (imperfect Hillary, ignoring the white working class, not visiting Wisconsin, too much identity politics), and what the Republicans did right (imperfect Trump, dog whistles, focusing on economics and security).

But there has been an ongoing strain of purely procedural complaint: that the system is rigged, but (ironically?) in favour of Republicans. In fact, this is manifestly true: liberals (Democrats) are more concentrated — mostly in cities — than conservatives (Republicans) who are spread more evenly and dominate in rural areas. And the asymmetry is more true for the sticky ideologies than the fungible party affiliations, especially when “liberal” encompasses a whole raft of social issues rather than just left-wing economics. This has been exacerbated by a few decades of gerrymandering. So the House of Representatives, in particular, tilts Republican most of the time. And the Senate, with its non-proportional representation of two per state, regardless of size, favours those spread-out Republicans, too (although party dominance of the Senate is less of a stranglehold for the Republicans than that of the House).

But one further complaint that I’ve heard several times is that the Electoral College is rigged, above and beyond those reasons for Republican dominance of the House and Senate: as we know, Clinton has won the popular vote, by more than 1.5 million as of this writing — in fact, my own California absentee ballot has yet to be counted. The usual argument goes like this: the number of electoral votes allocated to a state is the sum of the number of members of congress (proportional to the population) and the number of senators (two), giving a total of five hundred and thirty-eight (including the three allocated to Washington, DC). For the most populous states, the addition of two electoral votes doesn’t make much of a difference. New Jersey, for example, has 12 representatives and 14 electoral votes, about a 15% difference; for California it’s only about 4%. But the least populous states (North and South Dakota, Montana, Wyoming, Alaska) have only one congressperson each, but three electoral votes, tripling their share relative to population. In a Presidential election, the power of a Wyoming voter is more than three times that of a Californian.

This is all true, too. But it isn’t why Trump won the election. If you changed the electoral college to allocate votes equal to the number of congressional representatives alone (i.e., subtract two from each state), Trump would have won 245 to 191 (compared to the real result of 306 to 232).¹ As a further check, since even the representative count is slightly skewed in favour of small states (since even the least populous state has at least one), I did another version where the electoral vote allocation is exactly proportional to the 2010 census numbers, but it gives the same result. (Contact me if you would like to see the numbers I use.)
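The recalculation itself is trivial once you have the state-by-state winners and representative counts; schematically it looks like the following (with a made-up three-state example standing in for the full 2016 data):

    # Schematic electoral-vote recount under different allocation rules.
    # The states and results below are made up; the real calculation just
    # runs the same loop over all fifty states (plus DC).
    states = {
        # name: (house_representatives, statewide_winner)
        "Bigstate":   (53, "D"),
        "Midstate":   (12, "R"),
        "Smallstate": (1,  "R"),
    }

    def electoral_votes(states, senators_per_state=2):
        totals = {"D": 0, "R": 0}
        for reps, winner in states.values():
            totals[winner] += reps + senators_per_state
        return totals

    print(electoral_votes(states))                        # current-style allocation
    print(electoral_votes(states, senators_per_state=0))  # representatives only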

Is the problem (I admit I am very narrowly defining “problem” as “responsible for Trump’s election”, not the more general one of fairness!), therefore, not the skew in vote allocation, but instead the winner-take-all results in each state? Maine and Nebraska already allocate their two “Senatorial” electoral votes to the statewide winner, and one vote for the winner of each congressional district, and there have been proposals to expand this nationally. Again, this wouldn’t solve the “problem”. Although I haven’t crunched the numbers myself, it appears that ticket-splitting (voting different parties for President and Congress) is relatively low. Since the Republicans retained control of Congress, their electoral votes under this system would be similar to their congressional majority of 239 to 194 (there are a few results outstanding), and would only get worse if we retain the two Senatorial votes per state. Indeed, with this system, Romney would have won in 2012.

So the “problem” really does go back to the very different geographical distribution of Democrats and Republicans. Almost any system which segregates electoral votes by location (especially if subjected to gerrymandering) will favour the more widely dispersed party. So perhaps the solution is just to use nationwide popular voting for Presidential elections. This would also eliminate the importance of a small number of swing states and therefore require more national campaigning. (It could be enacted by a Constitutional amendment, or a scheme like the National Popular Vote Interstate Compact.) Alas, it ain’t gonna happen.


  1. I have assumed Trump wins Michigan, and I have allocated all of Maine to Clinton and all of Nebraska to Trump; see below. ↩︎