Wednesday 21 December 2016

Jennings, Hameed, Duckett .... and Bayes?

As the umpires of time take off the bails of 2016, and as the England team pick themselves off the mat after an absolute hammering from India, I bring you this year's festive look at the data. It's...a bit different.

England's test tour of Bangladesh and India featured the introduction of three new batsmen into test cricket. Keaton Jennings and Haseeb Hameed have been singled out as definite bright spots of the tour, while Ben Duckett may have to wait a little while for his next opportunity. But how can we sensibly assess each one's tour? And how likely is it that each one will be a medium-to-long term success at test level?

Things like this are difficult. Whenever a new player is picked, nobody- not the fans, not the selectors, not the player themselves- knows with certainty how it's going to go. Some demonstrably very talented players don't succeed in the long run, and some apparently more limited ones do, and it isn't always obvious why. As a player starts out in test cricket (or at any new level or form of cricket), we must acknowledge that there is a range of possible "outcomes" for their career- they may later be remembered as a legend, an unfulfilled talent, an "every dog has his day"... we just don't know.

But over time, we find out. With each fresh innings we see more of them, and gradually our uncertainty morphs into knowledge. A couple of years ago, I don't think it was enormously obvious which of Joe Root and Gary Ballance would be the better test player. Now, however, I think we have a certain degree of confidence about the answer to that.

This is, in essence, a problem of forecasting. We have data about the past and we want to make a (hopefully, educated) guess about the future. Which of two players will be better? With each new data point, we update our forecast and hopefully arrive closer to the truth*.

There's a famous theorem of mathematics- called Bayes' theorem, after Rev. Thomas Bayes- which we can use to do exactly this. As an equation it looks like this:

P(A|B) = P(B|A) × P(A) / P(B)

For our purposes: 'B' is something I have observed (e.g.  "Keaton Jennings scoring a century"). 'A' is the thing I want to find out the probability of being true (e.g. "Keaton Jennings will be a long term success at test level").

P(A|B) is the thing I want to know. It is the probability that A is true, now that I know that B is true.
P(A) is the "prior"- the probability I would have given to A being true before I knew that B was true.
P(B) is the probability of B happening without regard to whether A is true.
P(B|A) is the probability of B happening assuming A is true.

Let's try and apply this to England's new recruits, albeit in a slightly crude fashion.

To begin, let's suppose four possible long to medium term outcomes for a player's test career:

1) Very good
2) Good
3) Okay
4) Not so good

Let's now consider all the batsmen who made their debut for England batting in the top six between 2000 and 2014. (I cut it off at 2014 so as to have some reasonable chance to say how their career went after their debut).

We'll exclude Ryan Sidebottom from the sample because he was there as a nightwatchman, and Chris Woakes because I think most people would say he's mainly a bowler, leaving us with 22 players.

Ranking them by batting average and dividing the list according to the categories above, I would say they break down something like this (*controversy alert!*):
Group 1 ("very good")- Cook, Pietersen, Root
Group 2 ("good")- Bairstow, Bell, Strauss, Trescothick, Trott
Group 3 ("okay") Ali, Ballance, Collingwood, Compton, Stokes
Group 4 ("not so good") Bopara, Carberry, Clarke, Key, Morgan, Robson, Shah, Smith, Taylor

This gives us our initial values for P(A) in the Bayes theorem equation. For a generic England debutant batsman in the modern era:
P(very good)=3/22=13.6%
P(good)=5/22=22.7%
P(okay)=5/22=22.7%
P(not so good)=9/22=40.9%

We then want to know the probability for a player belonging to each of these groups to have a given outcome from one innings. We'll categorise the outcomes of an innings pretty crudely: 0-9, 10-49, 50-99, 100-149, and 150 or more.

Based on the records of the players listed above we can estimate the probability that (e.g.) a player belonging to Group 2 ('good') will score between 100 and 150 in any given innings. The table looks something like this:



Obviously, the best players are more likely to get high scores and less likely to get low scores. But crucially there's a non-zero probability for any innings outcome for any group of player. This actually gives us all the information we need to take one innings outcome for a player and use Bayes' theorem to generate a new forecast about the probability that they belong to each of our four categories.
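For anyone who wants to see the mechanics, here is a minimal Python sketch of a single update. The priors come from the 22-player breakdown above, but the likelihood table is purely hypothetical: the numbers in LIKELIHOOD are invented for illustration and are not the values estimated from the players' records. The names update, PRIOR and LIKELIHOOD are likewise just labels I've chosen.

```python
# Prior: group sizes from the 22-player sample in the text.
GROUP_COUNTS = {"very good": 3, "good": 5, "okay": 5, "not so good": 9}
TOTAL = sum(GROUP_COUNTS.values())
PRIOR = {group: n / TOTAL for group, n in GROUP_COUNTS.items()}

# HYPOTHETICAL likelihoods P(innings outcome | group). These numbers are
# invented for illustration only; each row sums to 1.
LIKELIHOOD = {
    "very good":   {"0-9": 0.25, "10-49": 0.40, "50-99": 0.20, "100-149": 0.11, "150+": 0.04},
    "good":        {"0-9": 0.28, "10-49": 0.42, "50-99": 0.18, "100-149": 0.09, "150+": 0.03},
    "okay":        {"0-9": 0.31, "10-49": 0.43, "50-99": 0.16, "100-149": 0.08, "150+": 0.02},
    "not so good": {"0-9": 0.36, "10-49": 0.44, "50-99": 0.13, "100-149": 0.06, "150+": 0.01},
}


def update(belief, outcome):
    """Return P(group | outcome) for every group, via Bayes' theorem."""
    # Numerator of Bayes' theorem: P(outcome | group) * P(group).
    joint = {g: LIKELIHOOD[g][outcome] * p for g, p in belief.items()}
    # Denominator: P(outcome), summed over all four groups.
    evidence = sum(joint.values())
    return {g: j / evidence for g, j in joint.items()}


# One innings: the player scores between 100 and 149.
posterior = update(PRIOR, "100-149")
for group in PRIOR:
    print(f"{group:12s} {PRIOR[group]:6.1%} -> {posterior[group]:6.1%}")
```

Feeding the posterior back in as the new belief for the next innings is all that an innings-by-innings forecast requires.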

So, let's take the example of Keaton Jennings. Before he batted, I thought of him as just a generic debutant and my forecast about his ability at test level looked like this:

P(Jennings is very good)=13.6%
P(Jennings is good)=22.7%
P(Jennings is okay)=22.7%
P(Jennings is not so good)=40.9%

After he scored a hundred, applying Bayes' theorem gives:
P(Jennings is very good)=16.8%
P(Jennings is good)=26.8%
P(Jennings is okay)=23.5%
P(Jennings is not so good)=32.9%

So the odds I would give to him turning out to be a very good player after the fashion of Root, Cook or Pietersen went up after his hundred, but only modestly. It's still only one data-point after all, and the fact remains that most batsmen don't turn out to be a Root, Cook or Pietersen.
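As a rough cross-check, the before-and-after numbers can be run backwards. Bayes' theorem says the posterior is proportional to prior times likelihood, so dividing each posterior by its prior gives something proportional to the chance of a hundred for each group. The sketch below uses only the rounded percentages quoted above, so treat the output as approximate; the variable names are mine.

```python
# Percentages quoted in the text, before and after Jennings's hundred.
prior = {"very good": 13.6, "good": 22.7, "okay": 22.7, "not so good": 40.9}
posterior = {"very good": 16.8, "good": 26.8, "okay": 23.5, "not so good": 32.9}

# posterior is proportional to prior * likelihood, so posterior / prior is
# proportional to P(hundred | group).
raw = {group: posterior[group] / prior[group] for group in prior}

# Express each group's chance of a hundred relative to the "not so good" group.
relative = {group: raw[group] / raw["not so good"] for group in raw}

for group, ratio in relative.items():
    print(f"{group:12s} {ratio:.2f}")
```

On these rounded figures a "very good" player comes out roughly one and a half times as likely to make a hundred in a given innings as a "not so good" one, which is why a single century only nudges the forecast.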

He then got two low scores and a fifty. Applying the process iteratively we end up at:

P(Jennings is very good)=14.2% 
P(Jennings is good)=28.0%
P(Jennings is okay)=24.3%
P(Jennings is not so good)=33.4%

So there's still a high degree of uncertainty. Relative to before his debut the probability that things won't work out is down and the probability that he'll turn out to be great is up. But only modestly. We don't know much.

For Hameed and Duckett we can do the same thing with their results on tour.

Hameed is in a similar boat to Jennings. The probability that he'll be a long term success is up, but only modestly. We'll have to wait to be sure.

P(Hameed is very good)=18.0% 
P(Hameed is good)=28.7%
P(Hameed is okay)=23.2%
P(Hameed is not so good)=30.1%

For Ben Duckett, the outlook is a bit poorer. Our calculation now gives over a 50% chance that he'll be in the "not so good" category and a less than 10% chance that he'll be in the "very good" category. Specifically:

P(Duckett is very good)=7.3% 
P(Duckett is good)=17.2%
P(Duckett is okay)=23.1%
P(Duckett is not so good)=52.4%

Still, though, the calculation calls us to be circumspect. We have some indications about Ben Duckett's test prowess, but not the full picture. A nearly 25% chance that he'll turn out either good or very good is far from nothing.

There are two things I like about this way of thinking. Firstly, it allows us to acknowledge the world's inherent uncertainty without throwing up our hands and giving up. We can't have absolute certainty, but we can have some idea. Secondly, it gives us a mechanism to build new information into our thinking, to update our view of the world as we get new information.

The calculation I've outlined above is clearly much too crude, and leaves too much out to be used for selection purposes. But I genuinely think this way of thinking- i.e. probabilistic and updating forecasts based on new information- is well suited to this kind of thing. "Keaton Jennings will open England's batting for years to come" is too certain a statement for a complicated and uncertain world. Maybe, "there's a 42% chance that Keaton Jennings will turn out to be at least as good as Marcus Trescothick" is closer to the truth.


* There's a really nice book about statistics and forecasting called "The Signal and the Noise" by Nate Silver. It doesn't mention cricket, more's the pity, but it covers many fields- from baseball to finance to earthquakes-  where some kind of forecasting is desirable and looks at why some of these areas have seen great successes using statistical methods and others have seen catastrophic failures. It's very readable and I very much recommend it if you're into that sort of thing.

Saturday 20 August 2016

One remarkably consistent aspect of test cricket

It seems that James Vince and I have at least one thing in common: despite starting the season full of high hopes, neither of us has had a very prolific summer. I haven't blogged much of late, and indeed I haven't paid as much attention to England's test summer as I normally would, due to various other things that have occupied my time and brain space. This is a shame for me, because the series with Pakistan seems to have been a great one, judging by the bits of coverage I did catch.

In today's return to the statistical fray, I was interested to have a look into how the relative importance of different parts of the batting order has changed over time in test cricket. For instance, it is a well worn claim that tail enders are better batsmen than they used to be- does this mean teams now rely on them more for runs compared to other parts of the team? England have relied heavily on their lower-middle order of late- is this part of a trend or just how things are in one team right now?

To get a sense of this I divided the batting order into 4 parts: openers (1-2), upper middle order (3-5), lower middle order (positions 6-8) and tail (9-11) and looked at the percentage of runs off the bat each part of the order contributed in tests in each year since 1946.
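A minimal Python sketch of that bookkeeping, for one made-up innings. The scorecard rows are hypothetical and the function name order_group is just my label; the real calculation would pool every test innings in a given year.

```python
from collections import defaultdict


def order_group(position):
    """Map a batting position (1-11) to one of the four groups used above."""
    if position <= 2:
        return "openers"
    if position <= 5:
        return "upper middle"
    if position <= 8:
        return "lower middle"
    return "tail"


# HYPOTHETICAL scorecard rows: (batting position, runs off the bat).
innings = [
    (1, 34), (2, 12), (3, 56), (4, 101), (5, 8), (6, 47),
    (7, 23), (8, 15), (9, 4), (10, 11), (11, 0),
]

# Total runs scored by each part of the order.
runs_by_group = defaultdict(int)
for position, runs in innings:
    runs_by_group[order_group(position)] += runs

# Convert totals to percentages of all runs off the bat.
total_runs = sum(runs_by_group.values())
shares = {group: 100 * runs / total_runs for group, runs in runs_by_group.items()}

for group, share in shares.items():
    print(f"{group:13s} {share:5.1f}%")
```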

I don't want to undermine my own blog too much, but the result was the most strikingly featureless dataset I have ever written about- as you can see in the graph below. The points show year by year data and the lines show 10 year averages.

Consistently, openers get about 26% of the runs, positions 3-5 get about 41%, numbers 6-8 get 25% and the tail about 8%. This has barely changed at all in the last 70 years.

The one small trend you can pick up is that the gap between openers and the lower middle order closes over time, from a position where openers were contributing 3-4% more than numbers 6-8 up until the present day, when the two contributions are basically equal (openers 25.3% vs lower middle 25.7% over the last 10 years). This change is consistent with the increased batting role of wicket keepers which we discussed in the last post. There is a big uptick in the lower middle order data just this year that stands out as rather an outlier- this part of the batting order has made 32.7% of the runs in 2016, several percentage points above the long term average. This is in large part driven by England's reliance on that part of the line up- fully 42.6% of England's runs off the bat have come from numbers 6-8 this year. I expect the global figure (and probably England's too) will regress to the mean a bit before the year is out.

Positions 3-5 consistently provide the biggest slice of the run scoring pie. The difference between their contribution and the openers is a couple of percentage points larger than can be explained by the fact there's simply one less player in the openers category. This is consistent with the notion that teams tend to put their best batsmen somewhere between 3 and 5.
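The arithmetic behind that "couple of percentage points" is simple enough to spell out. This sketch uses the approximate long-run shares quoted above (26% and 41%); the variable names are mine.

```python
# Approximate long-run shares of runs quoted in the text (percent).
openers_share = 26.0       # positions 1-2, two players
upper_middle_share = 41.0  # positions 3-5, three players

# Share contributed by each opener on average.
per_opener = openers_share / 2

# What positions 3-5 would contribute if each scored at the openers' rate.
expected_upper_middle = per_opener * 3

# The extra share that headcount alone does not explain.
surplus = upper_middle_share - expected_upper_middle

print(f"Expected from headcount alone: {expected_upper_middle:.1f}%")
print(f"Actual: {upper_middle_share:.1f}%, surplus: {surplus:.1f} points")
```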

Batsmen 9-11 meanwhile, for all the talk of improving tail enders, have chipped in with about 8% of the team's runs extremely consistently all this while and show no signs of changing.

Plus ça change, plus c'est la même chose.


Thursday 26 May 2016

Charting the evolving role of the wicketkeeper

Last week's test between England and Sri Lanka belonged to Jonny Bairstow. A century on his home ground and a match winning one at that- rescuing England from 83-5 and dragging them to a total out of the reach of Sri Lanka's callow batting line up. Behind the stumps in his role as wicketkeeper he took 9 catches, making it an all round good 3 days at the office.

Bairstow is an example of what would seem to have become a pretty established pattern for the modern test match side: picking your wicketkeeper with a heavy emphasis on their willow-wielding ability, and a lesser focus on their glovemanship than might have been seen in previous generations. I don't think I'm going too far out on a limb to suggest that Bairstow is not the best pure wicketkeeper available to England, but out of the plausible  keeping options he's the best of the batsmen, at least for the longer format.

This has made me wonder: how much has the wicketkeeper's role evolved over time? How much more are teams relying on their keepers to score runs? And has an increased emphasis on the batting prowess of keepers had a measurable cost in their performance behind the stumps?

The simplest thing to think would be that picking keepers based on their batting would come at a price in catches and stumpings. But can this be seen in the data?

I particularly enjoyed researching this post, not least because answering those questions will take not one, not two, not three but four graphs.

First of all, the run scoring. The graph below shows the run scoring output of designated  wicketkeepers, as a percentage of total runs scored by batsmen in tests from 1946-2015. The red points are the year by year data and the blue line is the decade by decade average. The decade by decade averages give you a better sense of the long term trends.
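For the curious, here is one way the decade-by-decade line could be built from the yearly points, sketched in Python. The yearly figures are hypothetical placeholders, and I'm assuming calendar decades and a simple unweighted mean of the yearly percentages; pooling the raw runs within each decade would be an equally reasonable choice.

```python
from collections import defaultdict
from statistics import mean

# HYPOTHETICAL yearly figures: year -> keepers' share of runs (percent).
yearly_share = {
    1946: 6.1, 1947: 5.8,
    1955: 6.5, 1956: 6.9,
    2014: 9.6, 2015: 9.9,
}

# Bucket each year into its calendar decade (1946 -> 1940, and so on).
by_decade = defaultdict(list)
for year, share in yearly_share.items():
    by_decade[year // 10 * 10].append(share)

# ASSUMPTION: the decade figure is the unweighted mean of the yearly values.
decade_average = {decade: mean(values) for decade, values in sorted(by_decade.items())}

for decade, average in decade_average.items():
    print(f"{decade}s: {average:.2f}%")
```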




This data shows a clear evolution towards a greater dependence on wicket keepers to provide runs. Wicket keepers provided only 6% of runs in the immediate post-war period, but they now provide nearly 10%. This is, of course, very much in line with conventional wisdom. One thing that struck me, however is how steady this increase has been. I had expected to see a rather more dramatic increase in the 90s and early 2000s after Adam Gilchrist made the swashbuckling batsman-keeper cool, but the importance of the wicketkeeper's runs had been rising steadily for a while (with a bit of a dip in the 1980s).

But what of their behind the stumps performance? If teams' enthusiasm for batsman-keepers is leading to a lower standard of keeping, one might expect that to be reflected in how wickets are taken. If keepers are worse than they used to be then perhaps modes of dismissal which depend on them- catches behind and stumpings- will decrease relative to other, non-keeper dependent, modes of dismissal.

The next graph shows the percentage of total wickets that were catches by the keeper in tests from 1946-2015. (Again, red points=year by year, blue line=decade by decade)



Far from decreasing, the reliance on wicketkeeper catches to provide wickets increases steadily post 1946- over the same period that keeper run scoring was on the rise- before hitting a plateau around the 1990s. Modern wicketkeepers provide about 19% of the total wickets through catches, and that figure has not shown any noticeable downward shift since keepers have been expected to provide more runs. It may well be that what this graph is telling us has more to do with the evolution of wicket keeping and bowling styles than with keeping quality, but in any case it's true that modern teams rely on wicket keepers both for more runs and for more catches than teams 70 years ago. As the batting responsibility of keepers has increased, their responsibility as glovemen has not diminished at all.

Wicket keepers can also contribute to dismissals via stumpings. This is a much rarer mode of dismissal than caught behind but, some may argue, it's a truer test of wicket keeping skill. The graph below shows the percentage of wickets that were stumpings over the same period as the graphs above.



The contribution of stumpings to the total wickets decreases in the post war years- over the same period that the contribution of catches increases (perhaps reflective of a decrease in standing up to the stumps? I'm not sure). But it's held steady between 1.3% and 1.9% for the last 50 years. So, wicket keepers continue to hold up their end in whipping off the bails.

If we can't see any strong changes in wicket keeping contributions to wickets, what about other ways of measuring wicket keeping quality? Byes, for instance. The graph below shows the number of byes conceded per 1000 deliveries in test cricket from 1946-2015.

The rate of conceding byes has hardly changed in 70 years. Looking at the decade by decade trends you could argue that it was on a steady decrease up to the 90s before taking an uptick, but these changes are minuscule- corresponding to maybe 1 extra bye conceded per 1000 deliveries.

So, while it's clear that more runs are indeed required of the modern keeper, the expectations behind the stumps have not shifted that much. Keepers contribute a consistent ~19% of wickets through catches with an additional ~1.5% through stumpings. They concede about 7 byes per 1000 balls and have barely budged from that for 70 years. Considering that the expectations on their batting have increased, while they have remained steady in other aspects of the game, keepers arguably have more on their plate than ever before.



Monday 16 May 2016

Reverse Swept Radio

This week I had the pleasure of being interviewed by Andy Ryan on the excellent Reverse Swept Radio podcast. If you would like to hear me talk about cricket, stats and this blog, the link is here:

http://reversesweptradio.podbean.com/e/rsr-81-a-cricket-podcast/

Friday 13 May 2016

How much more valuable are first division runs?

England announced their squad to play Sri Lanka this week, with Hampshire's James Vince getting the nod to take up the middle order slot unfortunately vacated by James Taylor. Nick Compton, meanwhile, keeps his place at number 3, at least for the time being. Essex's Tom Westley, who has had a productive start to the season and has been much talked up, was left out (I was hoping he would be picked, but not for any cricketing reason- I just wanted the opportunity to make some Princess Bride jokes).

As England squad selections draw near, with places up for grabs, attention often turns to the county championship averages. One of the few things everyone seems to agree on at this point is that runs made in the first division of the championship should be valued more highly, being made against higher quality attacks. This seems eminently reasonable, but raises a question: how much more valuable are they? Can we make the comparison quantitative?

I'm going to have a go.

What we want is to take a sample of batsmen who played in both divisions in successive seasons and ask, on average, how much did their run output drop/rise on switching divisions. Such a sample is provided to us by the championship's promotion and relegation system.

What I've done is go through the county averages for all the completed seasons since 2010, looking at the performance of players in teams that were relegated or promoted and then comparing their season's batting average before and after the change of divisions. (So, for example, I took the batsmen who played for Kent in division 1 in 2010 and compared each batsman's average to what they managed in division 2 in 2011).

I only included batsmen who played at least 10 matches in both seasons. The results are depicted in the graph below. The batting average in division 2 for each batsman in the sample is on the x-axis, with division 1 on the y-axis. Players in relegated teams are in red, promoted teams in blue. Points below the black line averaged higher in division 2 than division 1, and above vice versa. The green line is the best linear fit to the data.

Of the 81 players in the sample, 52 averaged higher in division 2 and 29 averaged higher in division 1. So, the intuition that runs are harder to get in division 1 seems solid, as expected. But how big is the difference?

Well, on average the relegated players in the sample increased their averages by 4.98 runs on going from division 1 to division 2. The promoted players saw their averages drop by an average of 7.12 runs on going from division 2 to division 1. So based on those numbers the difference is moderate but noticeable- able to turn a "very good" set of numbers into merely "good" ones and "good" into merely "acceptable".
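The comparison itself is nothing more than a filter and a couple of averages. Here is a hedged Python sketch of it: the player records are entirely hypothetical, and the names records, MIN_MATCHES and mean_gap are mine rather than anything from the actual spreadsheet.

```python
from statistics import mean

MIN_MATCHES = 10  # both seasons must reach this, as in the text

# HYPOTHETICAL records:
# (player, team's move, div 1 average, div 2 average, div 1 matches, div 2 matches)
records = [
    ("Player A", "relegated", 31.0, 38.5, 12, 14),
    ("Player B", "promoted", 35.0, 41.0, 15, 16),
    ("Player C", "relegated", 25.5, 33.0, 9, 13),   # dropped: only 9 matches in div 1
    ("Player D", "relegated", 36.8, 45.1, 16, 16),
    ("Player E", "promoted", 40.0, 38.0, 11, 12),
]

# Keep only players with enough matches in both seasons.
eligible = [r for r in records if r[4] >= MIN_MATCHES and r[5] >= MIN_MATCHES]


def mean_gap(direction):
    """Average of (div 2 average - div 1 average) for one group of players."""
    gaps = [div2 - div1 for _, move, div1, div2, _, _ in eligible if move == direction]
    return mean(gaps)


# How many eligible players averaged higher in division 2 than division 1.
higher_in_div2 = sum(div2 > div1 for _, _, div1, div2, _, _ in eligible)

print(f"{higher_in_div2} of {len(eligible)} averaged higher in division 2")
print(f"Relegated players: {mean_gap('relegated'):+.2f} runs in div 2 vs div 1")
print(f"Promoted players:  {mean_gap('promoted'):+.2f} runs in div 2 vs div 1")
```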

The linear fit which I attempted (which should be taken with absolute ladlefuls of salt) gives:

average in div 1=28.2 + 0.12 * (average in div 2)

so it would predict a player who averages 50 in division 2 to average only 34.2 in division 1. (As I say, don't take this equation too seriously, and possibly not seriously at all, not least since it predicts that players averaging less than 32 in div 2 should be expected to do better in div 1).
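Both of those numbers drop straight out of the fitted line, as this small Python sketch shows. The intercept and slope are the ones quoted above; the function name is just my label, and the same health warning about the fit applies.

```python
INTERCEPT = 28.2  # from the linear fit quoted in the text
SLOPE = 0.12


def predicted_div1_average(div2_average):
    """Division 1 average predicted by the (very rough) linear fit."""
    return INTERCEPT + SLOPE * div2_average


# Break-even point, where the fit predicts the same average in both divisions:
# x = INTERCEPT + SLOPE * x  =>  x = INTERCEPT / (1 - SLOPE)
break_even = INTERCEPT / (1 - SLOPE)

print(f"Averages 50 in div 2 -> {predicted_div1_average(50):.1f} in div 1")
print(f"Break-even average: {break_even:.1f}")
```

Below the break-even average of about 32, the line predicts an improvement on moving up to division 1, which is the oddity mentioned above.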

There is a chance that the difference between divisions is exaggerated in this data by a selection bias. Specifically, looking at players who were promoted from div 2 or relegated from div 1 may bias the sample towards players who under-performed their "true" ability when in div 1 or over-performed in div 2. In this case the shift in batting averages may in part be a case of regression to the mean, on top of the real change in the difficulty of run-getting.

This caveat notwithstanding, the difference in divisions seems quite considerable, and division 1 runs are worthy of their additional praise.

Thursday 5 May 2016

The candidates

Despite its title, this is not a surprise post about the extraordinary political wranglings currently in full swing in the land of baseball and chilli-dogs. No, this will be about the far weightier matter of whether certain batsmen are especially susceptible to being pinned LBW, and who those current players are.

In cricket commentary, it's common for players whose technique looks somehow prone to leave them trapped in front of their stumps to be described as "lbw candidates". This terminology seems to be applied specially to that particular means of dismissal- batsmen are rarely described as "caught behind candidates".

The questions I want to investigate in today's post stem from this.

Firstly, is "lbw candidate" a worthwhile category- is there a substantial subgroup of modern test batsmen who are especially more lbw prone than their peers?

Secondly, who are these prime candidates in the post-Shane Watson era? I've often heard Alastair Cook described as a "candidate". Does he deserve the title?

We'll also be touching on where in the world lbws are most prevalent.

To tackle this, I took a sample of 45 current test match players, representing all the test nations apart from Zimbabwe, who haven't had much opportunity to play recently. The sample was obtained by taking the most recent test for each nation and including all the batsmen in the top 7 who had played at least 15 tests and who weren't obvious night-watchmen. For each player I looked up the total number of LBW dismissals in their test career and divided it by the number of dismissals overall. This is what is on the x-axis of the graph below, with the batting average of each player on the y-axis. The colour/shape of each point indicates the country for which the batsman plays.

The black dashed line is the sample median (0.155) and the red dashed lines either side are the upper (0.187) and lower (0.125) quartiles. As you can see, the data is quite clustered horizontally suggesting only a fairly small degree of variation in vulnerability to LBW amongst current test batsmen. There's also no significant correlation between the LBWs/dismissal and the batting average, suggesting that having a high proportion of dismissals be LBW doesn't indicate much either way for a batsman's run scoring ability.

There are, however, a few noticeable outliers, far removed from the central cluster, to whom we now come:


  • The Shane Watson memorial award for excellence in attracting LBW decisions (I like the idea of this award- we could call it the "iron pad" and award it annually) goes to South Africa's JP Duminy, who is way off to the right of the graph with 39% of his dismissals being LBW. (A lot of these were against spin bowlers).
  • There's a select trio of players to the left of the graph who hardly ever get pinned LBW. Namely Pakistan's Sarfraz Ahmed (0 lbws/28 dismissals), England's Ben Stokes (1/41) and Bangladesh's Tamim Iqbal (2/79). It may not be significant but these are all quite aggressive batsmen, so perhaps more than being good at avoiding LBWs, they're finding other, more exciting, ways to get out first.
  • There's a foursome of Pakistan players separated from the main cluster, at around 0.25 LBWs/dismissal. These are: Younis Khan, Misbah ul Haq, Asad Shafiq and Mohammed Hafeez. It's tempting to wonder whether this might be because they play a lot of tests in the UAE, where the low, slow pitches are thought to be favourable for LBWs. Indeed, in the graph below you can see that the UAE does have the highest rate of LBWs per dismissal of top 6 batsmen amongst test match hosts since 2010. However, this probably doesn't fully account for it- if we exclude tests in the UAE for these four players only Hafeez sees his percentage of LBWs drop significantly.
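To put numbers on the trio at the left of the graph, here is a short Python sketch using the dismissal counts quoted above and the sample's lower quartile of 0.125. Only those three players' raw counts appear in this post, so the rest of the sample is not reproduced.

```python
# (LBW dismissals, total dismissals) as quoted in the text.
players = {
    "Sarfraz Ahmed": (0, 28),
    "Ben Stokes": (1, 41),
    "Tamim Iqbal": (2, 79),
}

LOWER_QUARTILE = 0.125  # sample lower quartile quoted in the text

# LBWs per dismissal for each player.
rates = {name: lbw / dismissals for name, (lbw, dismissals) in players.items()}

for name, rate in rates.items():
    position = "below" if rate < LOWER_QUARTILE else "at or above"
    print(f"{name:14s} {rate:.3f} LBWs/dismissal ({position} the lower quartile)")
```

All three sit far below the lower quartile, at well under 0.03 LBWs per dismissal.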


Overall, modern test batsmen don't vary too much in how frequently they're pinned leg before, with a small number of exceptions. For what it's worth, Alastair Cook falls close to the central cluster of data points in our first graph, albeit slightly on the high side, with a rate of 0.19 LBWs/dismissal. And with Pakistan's apparently quite LBW prone top order coming to England this summer, it could be quite a good season for the thump of ball on pad, and the slowly raised finger. Maybe.

Saturday 16 April 2016

Throwing out the form book?

So it's been quite a while since I posted anything here, but with the thrill of a new English cricket season upon me, I'm strapping on my pads of data, taking up my bat of analysis and striding out to the wicket of the internet.

As I scratch around, hoping to hit a bit of early season form, I'm going to attempt some rudimentary analysis of exactly that concept- "form". The point of this blog is meant to be to try and hold up some of cricket's hoariest old cliches and nuggets of received wisdom to the light of some data. The idea of being "in form" is surely one of the foremost such cliches in cricket- perhaps in all of sport.

The essential claim is this: a player is more likely to perform well at times when they have performed well in the recent past. A player who has performed well recently is usually said to be "in form".

The explanations for this tend to hinge on a player's confidence being high when their recent performances have been good. Or people may speak about players "being in a good rhythm", or "in a good place".

But to what extent is "a run of good form" distinguishable from a run of good luck? You sometimes hear commentators say something along the lines of:

"when you're in good form, it's amazing how the little bits of luck start going your way as well- playing and missing rather than nicking it,  balls in the air going between fielders rather than to them..."

At which I might want to say to them: "Is it amazing? Is it though? Or is it just that you only assign players the property of "good form" when they happen to be on a good run of scores- which requires a certain amount of luck?".

I'm not going to attempt a full analysis of whether form is a "real" phenomenon- in the sense of being meaningfully predictive of future performance- in one blog post. Although I may come back to different aspects of the question later.

I do, however, have some data to show which impacts on this question and I think it's interesting.

To make the question narrower, and therefore more tractable, I asked: "are test match batsmen more likely to score a century when they have already scored a test century in the last month?"

To answer this, I looked at the careers of the 23 most prolific test match century scorers in history. I did this because I needed a sample of players who had scored enough centuries that one could meaningfully compare the games when they hadn't scored one recently, with games where they had. Obviously, this does introduce quite a big selection bias- it's possible that the results I obtain may only be applicable to those players at the very top of cricketing history's tree. So be aware of that when you decide what to think of the results.

The graph below shows the rate of century scoring per match in games within a month of having previously scored a century against the total number of centuries scored per match for each player.
Points above the blue line represent players who had a higher rate of century scoring when they had recently scored a century and those below the blue line represent players who had a lower rate of century scoring when they had recently scored a century. The lone point way off to the top right of the graph is, of course, Sir Donald Bradman.

As a group these batsmen scored an overall total of 723 centuries in 2945 games- a rate of 0.246 centuries per match. In games within a month of having scored a test century my research puts them at a total of 182 centuries in 704 games- for a nearly identical (but slightly higher) rate of 0.258 centuries per match. On an individual level 11 of the players were more prolific when they'd recently hit a hundred and 12 were less so. For most players the difference was minor, as indicated by the fact that most points in the graph fall fairly close to the blue line.
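One rough way to ask whether that gap is more than noise is a two-proportion z-test, sketched below in Python. This is my own back-of-envelope addition rather than part of the original analysis, and it rests on loud assumptions: that the 704 "recent century" games are a subset of the 2945, that each game can be treated as an independent yes/no trial (it can't really, since a batsman can make two hundreds in a match and games cluster within careers), and that a normal approximation is good enough.

```python
from math import erfc, sqrt

# Totals quoted in the text.
total_centuries, total_games = 723, 2945
recent_centuries, recent_games = 182, 704

# ASSUMPTION: the "within a month" games are a subset of the overall total,
# so the remaining games are found by subtraction.
other_centuries = total_centuries - recent_centuries
other_games = total_games - recent_games

rate_recent = recent_centuries / recent_games
rate_other = other_centuries / other_games

# Two-proportion z-test using the pooled rate (a rough approximation only).
pooled = total_centuries / total_games
standard_error = sqrt(pooled * (1 - pooled) * (1 / recent_games + 1 / other_games))
z = (rate_recent - rate_other) / standard_error

# Two-sided p-value from the normal distribution.
p_value = erfc(abs(z) / sqrt(2))

print(f"Recent century: {rate_recent:.3f} per game, otherwise: {rate_other:.3f}")
print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
```

Under those assumptions the difference comes out at less than one standard error, which fits the picture of form mattering little for this group.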

There isn't enough evidence here for me to boldly claim that form makes no difference to batsmen. But it does suggest that form doesn't matter as much as you might imagine, at least for this sample of batsmen who belong among history's greatest.

For those out of form I would say this: take heart- form is an ephemeral thing which can return as suddenly as it departs. And maybe it doesn't matter so much whether you have it or not.