Saturday, 6 June 2015

Statistics

At some point during the week just past, I accepted the offer from Ewell Library of a Macmillan A level statistics text from the Work Out Series for £1, thinking that my statistics could do with a brush up.

It turned out to be a book from 1991, so some years before PCs, laptops, tablets and telephones came to have the position in our life that they do now. There was even mention of the programmable calculators which played a small part in my own professional life while with the Manpower Services Commission and when, as I recall, Texas Instruments made the device of choice, possibly the TI-59. Nevertheless, I also had a soft spot for a Sharp Pocket Computer, possibly something from the PC 1250 or PC 1260 series, from roughly the same era, introduced to me by my brother. He could teach it to do the most wonderful things.

However, I rapidly moved out of the nostalgia fest when I came across a whole branch of statistics about which I knew nothing, the business of calculating means, medians and what have you from statistics which had been grouped. That is to say, for example, that rather than being given the birth dates of the people in your sample, you had been given their ages in ten year age bands. The text offers various ways of getting around these bands, presumably depending on various unstated assumptions about the distribution of the relevant statistics in the population as a whole.
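By way of illustration, here is a minimal sketch in Python of one common way of getting around the bands. It is not necessarily the method the book uses. It assumes that the people in each band are spread evenly across it, so that the mean can be taken from the band midpoints and the median by linear interpolation within the band containing the middle person. The function names and the sample figures are made up for the purpose.

```python
def grouped_mean(bands):
    """Estimate the mean from (lower, upper, count) bands.

    Assumption: every observation in a band sits at the band's midpoint.
    """
    total = sum(count for _, _, count in bands)
    weighted = sum((lower + upper) / 2 * count for lower, upper, count in bands)
    return weighted / total


def grouped_median(bands):
    """Estimate the median by linear interpolation within the median band.

    Assumption: observations are spread evenly across each band.
    """
    total = sum(count for _, _, count in bands)
    half = total / 2
    cumulative = 0
    for lower, upper, count in bands:
        if count > 0 and cumulative + count >= half:
            # Fraction of the way through this band at which the halfway point falls.
            fraction = (half - cumulative) / count
            return lower + fraction * (upper - lower)
        cumulative += count
    raise ValueError("no observations")


# Hypothetical sample: ages in ten year bands, as (lower, upper, count).
ages = [(20, 30, 5), (30, 40, 10), (40, 50, 5)]

mean_age = grouped_mean(ages)
median_age = grouped_median(ages)
print(mean_age, median_age)
```

Different assumptions about how people are spread within a band would give different answers, which is presumably why the text offers more than one method.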

Eventually I remembered about the business of 'age at death' tapes on which I once cut some statistical teeth. The problem here was that one was trying to estimate, for example, the number of people who were 59 on June 30th 1972, for which purpose you subtracted the number of people who had died aged 59 during the year from the number of such people alive on June 30th 1971, this being the way that the magnetic tapes containing the details of the deaths of the thousands of such people were compiled. Whereas graduates of the present statistics course would have much more quickly cottoned on to the fact that being 59 at death in the year preceding June 30th 1972 does not mean that you would have been 59 at that date. On such arcane matters were statistical reputations made and lost.
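The point can be checked with a couple of invented cases. The sketch below, in Python, takes two hypothetical people who both died aged 59 in the year to June 30th 1972 and works out how old each would have been on that date had they lived. The dates and the helper function age_on are made up for the illustration and have nothing to do with the real tapes.

```python
from datetime import date


def age_on(born, on):
    """Age in completed years on a given date."""
    years = on.year - born.year
    # Knock off a year if the birthday has not yet come round in that year.
    if (on.month, on.day) < (born.month, born.day):
        years -= 1
    return years


reference = date(1972, 6, 30)

# Hypothetical person A: birthday early in the year, died in December 1971.
born_a, died_a = date(1912, 1, 15), date(1971, 12, 1)

# Hypothetical person B: birthday late in the year, died in March 1972.
born_b, died_b = date(1912, 9, 1), date(1972, 3, 1)

for born, died in [(born_a, died_a), (born_b, died_b)]:
    print(age_on(born, died), age_on(born, reference))
```

Both died aged 59, but A would have been 60 on the reference date while B would still have been 59, so age at death alone does not settle which group a death should be subtracted from.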

And then, a few days later, something drew my attention to an article about the effects of the moon on human affairs, in particular on certain classes of hospital admissions, an article already mentioned once, at reference 1, and in which the author was very stern about the misdeeds of the author of a prior article, misdeeds which included a variation of this very problem. Rather than including the precise interval between admission and full moon in his data, this last author had simply coded the events in question to one of the 29 days of the lunar cycle, throwing away the modest number of events which properly belonged to day 30 and the overall effect of which was to further confuse an already confused story.

A bonus arose from the first author being an astronomer and I learned that the period of the moon, of the lunar cycle, varied in the relevant period from about 29.25 days to 29.75 days. I had had no idea that it varied so much and I still have no idea how long the whole cycle is, if indeed there is a cycle. Perhaps it will continue to vary in an apparently random way until the end of days.

A penalty arose from it being made even clearer that my grasp of statistics was not good enough to fully appreciate even the relatively straightforward stuff deployed by an astronomer. I had never heard of the all important - in this context anyway - Mann-Whitney test. There is an article in Wikipedia, but I am not sure that I am going to get around to reading it - beyond having got the impression, once again, that the temptation to draw complex conclusions from simple data is considerable, particularly in these days of easy to use statistical packages on computers - perhaps even of statistical apps on telephones - and that a little learning is very likely to be dangerous. We clearly need the same regime as we have for doctors and accountants, that is to say a total ban on practice by unqualified practitioners. You must pay your dues to the relevant professional organisation if you want a brass plate on your door!

All of this being capped by a piece in yesterday's DT about a row about the excessive difficulty of a recent GCSE examination question, illustrated above. It took me a few seconds, including taking a peek at the answer, to realise what the drift of the question was, let alone work out what the answer was. Deciding whether I would have got there without the answer is left as an exercise for the reader.

My exercise will be to ponder on the significance of the negative solution to the equation. What would it mean if there were two positive solutions? Alternatively, why could there not be two?

Reference 1: http://www.psmv2.blogspot.co.uk/2015/06/part-2-of-2.html.
