Sunday, September 28, 2008
Thursday, September 11, 2008
"The government says its aim is to insulate every home in Britain by 2020 - and energy companies, councils and voluntary organisations will be making door-to-door visits in deprived areas to promote the scheme." Whether or not it is "enough" (and I'm sure plenty of people will be lining up to say it's inadequate), it is still a positive move. Although our old house will certainly not have cavity wall insulation by 2020, due to the fact that it has no cavity...
Back when I was living in the UK, Gordon was one of the most successful (if not the most successful) Chancellors ever. Now it seems that he is destined to be remembered primarily as one of the feeblest PMs. A sad but entirely predictable end to a political career (and yes, I did predict it confidently, although maybe not in writing). Of course it is well known that all political lives end in failure. I wonder why they never manage to quit while they are ahead. Did anyone ever get out at the top, leaving the masses desperate for more?
(Thinks for a minute.....ah, John Smith. That's taking things a little too far, though.)
Wednesday, September 10, 2008
The story starts way back with the IPCC AR4 drafts, actually. The very first semi-public version of this (ie the first version that people like me were allowed to see and comment on) attempted to address the issue of combining the different probabilistic estimates of climate sensitivity. They did this by averaging different pdfs that various authors have generated. It was immediately obvious to me (and possibly many others) that this was a completely invalid approach, and I said as much in my comments. In fact this episode - combined with the growing realisation that people were actually taking these outlandish "pdfs of climate sensitivity" seriously - is largely what prompted me to quickly write the "multiple constraints" paper. Anyway, this "averaging" result was removed from the next draft and I had hoped that our paper (which as far as I'm aware, no-one has argued against in the literature) might have put an end to that whole idea.
Fast forward a few months and I saw an AGU presentation (by Reto Knutti) which appeared to again address the question of combining constraints. Its punchline was an analysis where adding more observations appeared to result in an increase in uncertainty. Now, while it is certainly true that additional information can lead to an increase in uncertainty, such situations tend to be somewhat pathological and uncommon in practice, so I had a closer look. It only took a couple of emails back and forth with Reto to work out there was something badly wrong in what they had done - they had "combined" their constraints by averaging them, so I pointed out that this was invalid. Reto didn't reply to this point, and I assumed he had gone away and re-worked things. So, I was more than a little surprised that the next time I saw him speak, in the autumn of 2007, he presented the same result and mentioned that the paper (Sanderson et al) was now in press!
Time for a slight digression into why averaging the constraints is wrong. There is a standard theorem in probability theory that one always expects to learn from new information (and this somewhat woolly statement can be formalised in a rigorous manner). This general result crops up in all sorts of specific applications, and its correctness is not in doubt. Now, the sort of quadratic cost function that Sanderson et al use as their basic measure of model-data mismatch is pretty much ubiquitous in estimation, and it has a natural identification with the logarithm of the likelihood function corresponding to Gaussian uncertainties. It is no exaggeration to say that these quadratic log-likelihood cost functions underpin almost all practical probabilistic prediction and estimation in geosciences, all the way from a simple least-squares linear regression up to the advanced data assimilation methods used in modern numerical weather prediction. As a simple illustration, let's say we have an unknown variable x, and we make one observation o of it which has (Gaussian) observational uncertainty e. The resulting likelihood function P(o|x) is the Gaussian N(o,e), and taking the negative logarithm of this we reach the quadratic cost function 0.5*((x-o)/e)^2. If we have more than one observation, things are a little more complicated because we need to account not only for the uncertainties on each observation, but also the covariance of these uncertainties. The resulting multivariate Gaussian likelihood N(O,E) naturally gives rise to the cost function 0.5*(X-O)^T*E^-1*(X-O), where X and O are now vectors of the estimated and observed variables, and E is the covariance matrix. In the simple case where we have two observations of a single variable and the covariance matrix of the errors E is diagonal, this simplifies to 0.5*[((x-o1)/e1)^2 + ((x-o2)/e2)^2] - and this is the direct sum of the cost functions that arise from taking each observation individually.
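The additivity of independent cost terms is easy to verify numerically. Here is a minimal sketch (the observed values and error standard deviations are illustrative numbers, not anything from the paper):

```python
import numpy as np

# Two independent observations o1, o2 of a single unknown x,
# with Gaussian error std devs e1, e2 and no error correlation.
o = np.array([1.0, 1.4])   # observed values o1, o2
e = np.array([0.5, 0.8])   # observation error std devs e1, e2
E = np.diag(e**2)          # diagonal covariance matrix

x = 1.2                    # a trial value of the unknown

# Joint cost: 0.5 * (X-O)^T E^-1 (X-O), with X = (x, x)
r = x - o
joint_cost = 0.5 * r @ np.linalg.inv(E) @ r

# Direct sum of the two single-observation costs
sum_cost = 0.5 * ((x - o[0]) / e[0])**2 + 0.5 * ((x - o[1]) / e[1])**2

print(np.isclose(joint_cost, sum_cost))  # True: independent costs simply add
```

With off-diagonal terms in E the two expressions would differ, which is exactly why the full covariance matrix, rather than any ad hoc reweighting, is the correct way to handle dependent errors.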
And whatever the form of the covariance matrix E, the log-likelihood is never equivalent to 0.25*[((x-o1)/e1)^2 + ((x-o2)/e2)^2], which is the result of the "averaging the constraints" procedure that Sanderson et al invented.
The (normal, correct) cost function above corresponds to the Gaussian N((o1*e2^2+o2*e1^2)/(e1^2+e2^2), sqrt(e1^2*e2^2/(e1^2+e2^2))), this being the well-known formula for the optimal interpolation of two uncertain estimates. It is easy to see that the width of this Gaussian is lower than for either of the single-observation Gaussians, since sqrt(e1^2*e2^2/(e1^2+e2^2)) is less than the smaller of e1 and e2. (The same result holds if we use a full covariance matrix E with off-diagonal elements, but it's not as easy to show that within Blogger's limited maths capability!)
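A quick numerical check of the optimal interpolation formula, again with illustrative values:

```python
import numpy as np

e1, e2 = 0.5, 0.8   # error std devs of the two observations
o1, o2 = 1.0, 1.4   # observed values

# Optimal combination of two independent Gaussian estimates
mean = (o1 * e2**2 + o2 * e1**2) / (e1**2 + e2**2)
width = np.sqrt(e1**2 * e2**2 / (e1**2 + e2**2))

# The combined width is smaller than either individual width
print(width < min(e1, e2))  # True: a second observation always sharpens the estimate
```

Since e1^2*e2^2/(e1^2+e2^2) = e1^2/(1+(e1/e2)^2) < e1^2 (and symmetrically for e2), the inequality holds for any positive e1, e2, not just these values.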
If, however, we use the average of the two single-obs cost functions, the resulting Gaussian is now N((o1*e2^2+o2*e1^2)/(e1^2+e2^2), sqrt(2*e1^2*e2^2/(e1^2+e2^2))) and it is immediate that its width, sqrt(2*e1^2*e2^2/(e1^2+e2^2)), lies between e1 and e2. So, if we start with the more accurate observation, and then "combine" it with a less accurate one through this "averaging" methodology, then the final answer will always have greater uncertainty than we started with, which is contrary to the well-established theorem I mentioned previously. Effectively, this method is asserting that a weak observation with wide uncertainty should cause us to forget what we already knew!
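The perverse behaviour of the averaging recipe can be seen directly (same illustrative numbers as above):

```python
import numpy as np

e1, e2 = 0.5, 0.8   # e1 is the more accurate observation

# Width implied by averaging the two cost functions (extra factor of 2)
avg_width = np.sqrt(2 * e1**2 * e2**2 / (e1**2 + e2**2))

# Width from the correct joint likelihood (adding the cost functions)
correct_width = np.sqrt(e1**2 * e2**2 / (e1**2 + e2**2))

print(min(e1, e2) < avg_width < max(e1, e2))  # True: averaged width lies between e1 and e2
print(avg_width > min(e1, e2))                # True: the weak obs has *widened* the pdf
print(correct_width < min(e1, e2))            # True: the correct answer narrows it
```

So under the averaging recipe, a second, noisier observation always degrades the estimate we already had, whereas the correct combination always improves it.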
So in summary, averaging constraints is obviously nonsense that has no possible probabilistic interpretation. I spent a lot of time late last year trying to explain this in simple terms to several of the authors of the Sanderson et al paper, and in response they gave every appearance of trying to argue that averaging was a valid approach, although they also accused me of reading too much into the paper. To be fair, they had not directly generated any posterior pdfs, and did not explicitly state that these cost functions were log-likelihoods, but given (a) the theoretical background I have briefly outlined above, and (b) that the supposed increase in uncertainty as more observations were considered had been repeatedly presented as the main result of this work, I don't think my reading of it was unreasonable. Moreover, if they had really meant to disavow this interpretation, then it is puzzling that they continued to defend the methodology as valid. Myles Allen even challenged me to write a comment if I thought their paper was wrong, and given the potential impact of this erroneous "averaging the constraints" meme if it is widely adopted, I was originally tempted to do so. But I have just looked for the paper on-line, and am surprised (but pleased) to see several changes to the paper that have been made subsequent to our discussions. For starters, they have removed the most contentious claim - which was previously made in the abstract as well as the main text - that their cost function provides a constraint on climate sensitivity that weakens as more observations are used. In fact the paper now contains an entirely new paragraph that was not there before. This explicitly states that their results cannot be interpreted in terms of log-likelihoods.
They could have gone a little further and admitted that the widths of their cost functions cannot be directly related to uncertainty in any way whatsoever, but I suppose that would have been rather embarrassing given that the explicit goal of the paper is to explore this issue. They have also added a little discussion of their reasoning behind the averaging method (which was not justified at all in the earlier version of the paper). Now it is explicitly presented as an alternative to the "incorrect" alternative of adding the individual terms, which (as they state) is wrong if uncertainties are not independent. It is disappointing that they didn't find the space to mention here that the obvious alternative to adding the terms, if they are not independent, is simply to use the full covariance matrix as I have outlined above. While the addition of individual constraints is of course incorrect unless the uncertainties are judged to be independent, it may at least be a reasonable approximation if the dependencies are small (and it could in principle be biased either way), whereas their curious "averaging" method is guaranteed to be wrong for all possible covariance structures, with the resulting uncertainty guaranteed to be too wide. Indeed it is trivial to observe that averaging is algebraically equivalent to addition, but for the arbitrary and unjustifiable multiplicative scaling factor of sqrt(N) on the width of the implied joint likelihood function, where N is the number of individual terms.
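The sqrt(N) scaling is easiest to see in the symmetric case where all N observations share the same error std dev e: adding the cost terms gives the familiar e/sqrt(N) width, and dividing the summed cost by N inflates that width back by exactly sqrt(N). A minimal sketch:

```python
import numpy as np

e = 0.7  # common error std dev of each observation (illustrative)

for N in (2, 5, 10):
    # Correct: sum of N quadratic cost terms -> Gaussian of width e/sqrt(N)
    added_width = e / np.sqrt(N)
    # "Averaging": dividing the summed cost by N shrinks the quadratic
    # coefficient by N, so the implied width grows by sqrt(N)
    averaged_width = np.sqrt(N) * added_width
    assert np.isclose(averaged_width, e)

print("averaging pins the implied width at e, however many observations are used")
```

In other words, under the averaging recipe the implied uncertainty never falls below that of a single observation, no matter how much data is accumulated.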
I guess the authors would claim that these last-minute changes to the paper are nothing more than cosmetic - just a matter of explaining things in simple enough terms for people like me to understand, rather than representing any substantive modifications to their previous claims. (They certainly don't acknowledge my help in fixing the problems in the original version.) The real proof of the pudding will be in whether they and others continue to use this "averaging the constraints" meme...I will watch with interest to see if and how it spreads...
Wednesday, September 03, 2008
Snow fell on Mt Fuji Aug 9 at earliest time ever
However, it seems that this new record is due to an arbitrary decision to redefine some hail as snow. It has still been unseasonably wet and cold for the latter part of the summer - which I would have been quite happy about but for the fact that we were hoping to have another trip into the mountains. I'm certainly glad we made the effort earlier.
Police in Japan have been left red-faced by an apparent murder that turned out to be an unusual case of mistaken identity.
Back at the local police headquarters, officials notified reporters who had turned up early the same morning to cover an annual earthquake drill. They began preparing to write up the launch of a major murder investigation.
Dozens of extra officers were dispatched to interview potential witnesses, while the evening edition of the local newspaper carried a report of the gruesome find, complete with a photograph of the body's resting place.
What they don't mention in this story is that by the time the police realised it was a doll, it had already confessed to two charges of shoplifting and one of child neglect.
It is now rumoured to be considering standing as a candidate in the forthcoming election for the new Prime Minister.
So he's up to 8 out of 14, which means the p-value is still below 5%, but it is heading in the right direction. There is no sign of the September forecast yet. The August one only appeared on the 8th, by which time it was already well set to be a very wet month with cold days, but of course he claims that this forecast was really made some time previously...