Batting Medians
Dave Wilson
I admit that I’ve never really been a fan of batting averages, at least in terms of what they say about what we should expect from a batsman each time he walks to the crease. I understand the premise, i.e. the batting average is the average between dismissals rather than the average of all innings, and of course by now we have a feel for what the batting average tells us about a player’s ability.
However there are many batsmen for whom the average is undoubtedly misleading, and a prime example is the first player ever to take guard in a Test match, Charles Bannerman. Bannerman’s average is a shade under sixty at 59.75, but this is of course grossly inflated by his first innings, that deservedly famous 165 retired not out. But is an average of 60 a reasonable guide to what should have been expected of Bannerman when he batted? If we look at Bannerman’s Test scores they look like this:-
165*, 4, 10, 30, 15, 15*
Arranged ordinally they look like this:-
4, 10, 15, 15, 30, 165
Now we can see that the 165 knock is a dramatic outlier and I think it’s fair to say an average of almost 60 is misleading as a guide to expected score in Bannerman’s case – in six innings he exceeded that figure only once. It was more likely that he would score somewhere between 4 and 30.
Admittedly this is a small sample, but it does highlight the potential problems with averages as currently utilised in ranking batsmen. Take as a further example Graham Gooch and VVS Laxman; they both played the same number of innings at 215, with Gooch outscoring Laxman by more than 300 runs, yet Laxman’s average is fully five runs higher than Gooch’s due to the latter’s low number of not outs (six, as against 34 by Laxman). But what else can be done to give a fairer approximation of a batsman’s ability?
There are a number of ways to summarise central tendency in data, and for a string of numbers these are average (or mean), median and mode. For those of you who aren’t familiar with these terms, here is a quick definition – average or mean is the sum of all of the numbers divided by the sample size; the median is the midpoint of the data set when ranked in order, i.e. there will be as many numbers above the median as below it; and the mode is the value which is most represented in the data set.
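For anyone who wants to check these definitions by hand, here is a minimal Python sketch applied to Bannerman’s six innings listed earlier. The variable names are purely illustrative, and the only assumption is that the two starred scores are the not outs.

```python
from statistics import mean, median, mode

# Bannerman's six Test innings as (runs, dismissed) pairs.
# Assumption: the two starred scores in the article are the not outs.
innings = [(165, False), (4, True), (10, True), (30, True), (15, True), (15, False)]

scores = [runs for runs, _ in innings]
dismissals = sum(1 for _, dismissed in innings if dismissed)

batting_average = sum(scores) / dismissals  # runs per dismissal
innings_mean = mean(scores)                 # runs per innings, not outs included
innings_median = median(scores)             # midpoint of the ranked scores
innings_mode = mode(scores)                 # most frequently occurring score

print(f"Batting average: {batting_average:.2f}")  # 59.75
print(f"Mean:            {innings_mean:.2f}")     # 39.83
print(f"Median:          {innings_median:.1f}")   # 15.0
print(f"Mode:            {innings_mode}")         # 15
```

Even in this tiny sample the three measures disagree sharply, which is the point of the discussion that follows.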
Below is the curve of all Test scores:-
It should be noted that this curve shows all Test innings, not just those by recognised batsmen. The various values of central tendency for this curve are as follows:-
Batting average: 30.17
Mean: 26.2
Median: 13.0
Mode: 0
Why is the mean lower than the batting average? As discussed earlier, the batting average is the average between dismissals, whereas the mean here is the average of all innings including not outs. The median as mentioned is the midpoint, i.e. as many scores above 13 as below, and as can be seen this is much lower than the mean. The mode is the total most often scored, which is zero.
The importance of selecting the most appropriate method of determining the central tendency of data is of course affected by the distribution of the sample.
In a normal distribution, the mean, median and mode coincide because the peak of the sample data is central – however if we look at the distribution of batting scores they typically show a positive skew as in the example above, that is the peak is towards the left hand side of the plot, i.e. towards the low end of the scores. For this reason, the average or mean misrepresents the central tendency of the batsman’s scores due to the likelihood of a small number of high-value outliers, as seen with the example of Bannerman above.
I would propose then that a better representation of the central tendency of batting performances, given their skewness, is the median. I don’t believe there’s been a detailed study of batting medians, at least I’m not aware of it, and so I’ve spent some time looking at all batsmen as far as the median of their batting performances is concerned.
Going back to the skew, the larger the skew value the longer the tail stretching to the right of the graph; the average skew is around 1.5, whereas for example RE Foster’s is 3.5, indicating a much longer tail due to his one very large innings of 287 – the highest I found is Wasim Akram, who comes in at 4.8. David Steele by comparison has a skew value of less than 0.5 indicating (because it’s closer to zero) that Steele very rarely failed, meaning that the distribution of his scores was more “normal”. In statistics, the degree of skewness is usually taken as highly skewed if larger than 1, moderately skewed between 0.5 and 1 and approximately symmetrical (e.g. as a normal distribution would be) if lower than 0.5. The data for all Test innings is a highly skewed +2.55.
There are very few batsmen who have a negatively-skewed distribution of scores, i.e. the median is much higher than the mean because the median score is at the high end. Consider Desmond Lewis as an example; his scores in ascending order were:-
4, 14, 72, 81, 88
With two not outs his average was 86.33, though the mean of the scores is 51.8. However the scores above the median of 72 (81, 88) are much closer to it than those below (4, 14), which results in a negative skew of -0.55.
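The -0.55 quoted for Lewis can be reproduced with the adjusted Fisher-Pearson sample skewness, which is the formula spreadsheet SKEW functions use. That this is the formula behind every skew value in the article is an assumption; the sketch below simply shows that it matches the Lewis figure. The function names are illustrative.

```python
from math import sqrt


def sample_skewness(scores):
    """Adjusted Fisher-Pearson sample skewness of a list of scores.

    Assumption: this is the skew measure used in the article; it does
    reproduce the -0.55 quoted for Lewis.
    """
    n = len(scores)
    if n < 3:
        raise ValueError("skewness needs at least three scores")
    centre = sum(scores) / n
    m2 = sum((x - centre) ** 2 for x in scores) / n  # second central moment
    m3 = sum((x - centre) ** 3 for x in scores) / n  # third central moment
    if m2 == 0:
        raise ValueError("all scores are identical")
    g1 = m3 / m2 ** 1.5                               # population skewness
    return g1 * sqrt(n * (n - 1)) / (n - 2)           # small-sample adjustment


def describe_skew(value):
    """Apply the rule-of-thumb thresholds quoted in the article."""
    size = abs(value)
    if size > 1:
        return "highly skewed"
    if size >= 0.5:
        return "moderately skewed"
    return "approximately symmetrical"


lewis = [4, 14, 72, 81, 88]
lewis_skew = sample_skewness(lewis)
print(round(lewis_skew, 2), describe_skew(lewis_skew))  # -0.55 moderately skewed
```

The sign carries the direction of the tail: negative for Lewis, whose low scores sit far below his median, and positive for the typical batsman with a few very large innings.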
Looking at all of the batsmen, it appears that a median of 30 is a measure of greatness. Here are some notables with their associated medians to give you a feel for how the medians vary:-
40.0 Hobbs
56.5 Bradman
32.0 Sutcliffe
28.5 Headley
33.0 Nourse
32.0 Hutton
28.0 Compton
34.5 Walcott
36.0 Weekes
28.0 Worrell
33.5 Sobers
46.0 Barrington
34.0 Pollock, RG
31.0 Chappell, GS
32.5 Richards, IVA
30.0 Javed Miandad
31.0 Border
25.5 Waugh, SR
34.0 Tendulkar
33.5 Lara
30.0 Ponting
34.5 Kallis
33.0 Dravid
31.0 Sehwag
32.5 Sangakkara
The above values, representing as they do the midpoint of all scores, give a more realistic approximation of the actual score we can expect from a batsman, which is much lower than the batting average. So as can be seen, 30 represents a reasonable level of greatness, with some of the all-time greats achieving medians over 40. It can also be seen that there is a significant difference between the batting average and the median score, and in some cases this difference is very large. Another point to make about the median is that it will either be an integer or between two integers, e.g. 34.5 – this reduced granularity may also be a better way to rank batsmen than comparing averages to two decimal places.
If we look at the differences between batting average and median score, they rank like this:-
Avg less Median
44.75 Bannerman
43.44 Bradman
38.85 Valentine
38.72 Dempster
37.66 Weekes, KH
35.20 Kambli
34.50 Wood
32.33 Kuruppu
The aforementioned Bannerman fares worst here, and the others in the list apart from Bradman can be seen to have their averages somewhat padded. Bradman is an exception here (as he usually is) – the high difference is of course largely a result of his already high batting average. The way to cope with that is to look at the median as a percentage of the average:-
25.10% Bannerman
25.49% Aamer Malik
27.57% Lewis, AR
28.18% North, MJ
29.31% Bonnor
29.63% Watkins
30.43% Raina
31.58% Fender
31.85% Twose
32.98% Robinson, RT
No sign of Bradman there. Turning to those batsmen who had a very small difference between their average and median, only twelve in the history of Test cricket have a median value higher than their average, the most significant of which was David Steele. Steele made an immediate impact upon his introduction to Test cricket, with five of his first seven innings being over 50 and none below 39, his median at that point being a whopping 66.0! While it dropped off somewhat after that he still ended up with a median of 43.0 (higher than his average of 42.06) and a very low skew value of 0.43, indicating he very rarely failed – 15 of his 20 innings were scores of 20 or more.
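The two comparisons used above, average less median and median as a percentage of average, are simple enough to sketch in Python. The averages and medians are the ones quoted in the article (Bannerman’s median of 15.0 follows from his six scores); the function names are illustrative only.

```python
# (batting average, median) pairs taken from the figures in the article.
# Bannerman's median of 15.0 is derived from his six listed scores.
players = {
    "Bannerman": (59.75, 15.0),
    "Bradman": (99.94, 56.5),
    "Steele": (42.06, 43.0),
}


def average_less_median(average, median):
    """Raw gap between batting average and median, in runs."""
    return average - median


def median_as_percentage(average, median):
    """Median expressed as a percentage of the batting average."""
    return 100 * median / average


for name, (average, median) in players.items():
    gap = average_less_median(average, median)
    share = median_as_percentage(average, median)
    print(f"{name:10s} gap {gap:6.2f}  median/average {share:6.2f}%")
```

On the raw gap Bannerman and Bradman are almost level, while the percentage separates them clearly; Steele is one of the rare cases where the percentage exceeds 100.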
It should be noted that the skew, while it is typically a reasonable indicator of consistency, will unfairly penalise those with long careers who have a number of high scores. For example, Lara has a skew value of 2.32, which would tend to suggest he wasn’t very consistent; however if we cap all of his high scores at 150 his skew value drops to a respectable 1.0.
Going back to the median, the highest medians recorded to date are shown below:-
72.0 Lewis
65.0 Richards, BA
56.5 Bradman
51.0 Hill, AJL
50.0 Gregory, RG
48.0 Duleepsinhji
46.0 Barrington
46.0 Walters, CF
43.0 Steele
42.0 Jaques
41.0 Barnes, SG
40.5 Ramaswami
40.5 Tyldesley, GE
40.0 Hobbs
Some of these batsmen did not feature in many Tests, Bradman and Barrington being the notable exceptions – nonetheless it’s fair to say these batsmen demonstrated consistent performances. This can be confirmed by looking at the skew measure of scores – Lewis was discussed above, but Cyril Walters, who played 18 Test innings, has a very low skew value of +0.1; in only four of those innings did he fail to score at least 20.
But to me the significant indicator of the median as far as measuring the central tendency of individual batsmen can be seen if we rank the batsmen on median and exclude those with fewer than twenty innings, in which case we have this list:-
Best medians all-time
56.5 Bradman
46.0 Barrington
42.0 Jaques
40.5 Tyldesley, GE
40.0 Hobbs
38.0 Sutcliffe
36.0 Weekes
35.0 Gunn, G
35.0 Barlow, EJ
35.0 Katich
I have always felt that it’s unreasonable to consider that Bradman, as some have said, is 50-60% better than everyone else based on the difference between his average and the next best (depending where you place the cut off from an innings played point of view) – here, Bradman enjoys a lead of just over 20%, which is still very significant but may be a better measure of his superiority over the rest.
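The size of Bradman’s lead is a one-line calculation, sketched here in Python with figures from this article: the medians of Bradman and Barrington from the list above, and for the averages Bradman against Pollock, whose 60.97 is the highest other average in the closing table. The helper name is illustrative, and the average comparison in particular depends on where the innings cut-off is placed.

```python
def lead_percent(best, runner_up):
    """How far `best` is ahead of `runner_up`, as a percentage of `runner_up`."""
    return 100 * (best / runner_up - 1)


# Medians: Bradman 56.5 against Barrington 46.0.
median_lead = lead_percent(56.5, 46.0)

# Averages: Bradman 99.94 against Pollock 60.97 (20+ innings cut-off;
# a different cut-off would change the runner-up and the result).
average_lead = lead_percent(99.94, 60.97)

print(f"Lead on median:  {median_lead:.1f}%")   # 22.8%
print(f"Lead on average: {average_lead:.1f}%")  # 63.9%
```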
To conclude, below is a list of the top batsmen, ranked on their medians, showing also skew and batting average (cut-offs 20+ innings, average 35+)
Player Median Skew Avg
Bradman 56.5 1.05 99.94
Barrington 46.0 1.19 58.67
Jaques 42.0 0.89 47.47
Tyldesley 40.5 0.39 55.00
Hobbs 40.0 1.30 56.94
Sutcliffe 38.0 1.05 60.73
Weekes 36.0 1.22 58.61
Barlow 35.0 1.46 45.74
Katich 35.0 0.82 45.03
Gunn 35.0 0.88 40.00
Kallis 34.5 1.16 57.43
Walcott 34.5 1.18 56.68
Pollock 34.0 1.83 60.97
Trott 34.0 1.62 57.79
Tendulkar 34.0 1.40 56.25
Hassett 34.0 1.52 46.56
Sobers 33.5 2.14 57.78
Lara 33.5 2.32 53.18
Paynter 33.0 2.00 59.23
Nourse 33.0 1.80 53.81
Dravid 33.0 1.67 53.00
Khan 33.0 2.10 50.93
Pietersen 33.0 1.51 50.48
Bland 33.0 1.31 49.08
Jardine 33.0 0.81 48.00
Watson 33.0 0.93 39.23
Taylor 33.0 0.72 35.60
Sangakkara 32.5 1.85 56.42
Richards 32.5 1.79 50.23
Hammond 32.0 2.18 58.45
Hutton 32.0 2.32 56.67
Hussey 32.0 1.17 53.26
Smith 32.0 1.93 50.30
Walters 32.0 1.82 48.26
Hendren 32.0 1.49 47.63
Kanhai 32.0 1.78 47.53
Brown 32.0 1.64 46.82
Richardson 32.0 0.91 44.77
Ryder 31.5 1.63 51.62
Chappell 31.0 1.67 53.86
Sehwag 31.0 2.19 52.41
Border 31.0 1.43 50.56
Mead 31.0 1.55 49.37
Chanderpaul 31.0 1.20 49.18
Jackson 31.0 0.89 48.79
Amla 31.0 1.74 46.95
Subba Row 31.0 1.02 46.85
Sharpe 31.0 0.85 46.23
Anwar 31.0 1.27 45.53
Hunte 31.0 2.19 45.06
Redpath 31.0 1.25 43.45
Fingleton 31.0 0.99 42.46
Hayden 30.5 2.37 50.22
McCabe 30.5 1.91 48.21
Stollmeyer 30.5 1.66 42.33
Iqbal 30.5 1.25 40.21
Ponting 30.0 1.56 53.13
Miandad 30.0 2.15 52.57
Jayawardene 30.0 2.30 51.53
Cook 30.0 2.03 49.72
Boycott 30.0 1.47 47.72
Simpson 30.0 2.48 46.81
Martyn 30.0 1.17 46.38
Collins 30.0 1.94 45.06
Richardson 30.0 1.56 44.39
Edrich 30.0 2.48 43.54
Amarnath 30.0 0.88 42.50
Fredericks 30.0 1.34 42.49
Ramesh 30.0 1.11 37.97
Sheppard 30.0 1.14 37.80
Richardson 30.0 0.97 37.47
Inzamam 29.5 1.97 49.60
Sidhu 29.5 1.27 42.13
Kamal 29.5 0.52 37.73
Laird 29.5 0.59 35.29
Davis 29.0 1.66 54.20
Flower 29.0 1.75 51.55
Gavaskar 29.0 1.44 51.12
Dexter 29.0 1.64 47.89
May 29.0 1.97 46.77
Morris 29.0 1.59 46.48
Leyland 29.0 1.43 46.06
Greig 29.0 1.26 40.43
Ahmed 29.0 1.74 40.41
Headley 28.5 1.71 60.83
Youhana 28.5 1.52 52.29
Compton 28.0 1.89 50.06
Worrell 28.0 2.09 49.48
Bell 28.0 1.54 49.28
Mitchell 28.0 1.18 48.88
Lawry 28.0 1.57 47.15
Robertson 28.0 1.25 46.36
Woodfull 28.0 1.16 46.00
Pullar 28.0 1.69 43.86
Rowan 28.0 2.31 43.66
Butcher 28.0 1.85 43.11
Washbrook 28.0 1.70 42.81
Viswanath 28.0 1.54 41.93
Gayle 28.0 3.08 41.65
McDonald 28.0 1.64 39.32
Haddin 28.0 2.04 38.86
Gregory 28.0 0.99 36.96
Ganguly 27.5 1.70 42.18
Harvey 27.0 1.47 48.41
Graveney 27.0 2.14 44.38
Gower 27.0 1.77 44.25
Taylor 27.0 2.49 43.50
Gooch 27.0 2.34 42.58
Strauss 27.0 1.39 41.98
Taylor 27.0 1.15 40.77
Umar 27.0 2.15 40.08
D'Oliveira 27.0 1.28 40.06
Fleming 27.0 2.55 40.06
Stewart 27.0 1.60 39.55
Armstrong 27.0 1.93 38.68
Ranatunga 27.0 1.11 35.70
Cowper 26.5 2.78 46.84
Jones 26.5 1.77 44.27
Ransford 26.5 1.87 37.84
Gambhir 26.0 1.45 48.34
Gilchrist 26.0 1.33 47.61
Nurse 26.0 2.04 47.60
de Villiers 26.0 2.14 47.41
Laxman 26.0 2.09 46.26
O'Neill 26.0 1.51 45.55
Misbah 26.0 1.32 44.97
Turner 26.0 2.45 44.64
Kal'charran 26.0 1.23 44.43
Chappell 26.0 1.62 42.42
Taylor 26.0 1.36 41.76
Wessels 26.0 1.59 41.00
Symonds 26.0 1.95 40.61
Denness 26.0 2.24 39.69
Kippax 26.0 1.42 36.12
Waugh 25.5 1.37 51.06
Hazare 25.0 1.40 47.65
Clarke 25.0 1.24 46.31
Azharuddin 25.0 1.58 45.04
Greenidge 25.0 1.86 44.72
Cowdrey 25.0 1.28 44.06
Malik 25.0 1.82 43.70
Smith 25.0 1.30 43.67
Boon 25.0 1.51 43.66
Rowe 25.0 2.82 43.55
Slater 25.0 1.58 42.84
Umrigar 25.0 1.80 42.44
McGlew 25.0 2.47 42.06
Gibbs 25.0 1.93 41.95
Macartney 25.0 1.52 41.78
Dilshan 25.0 1.70 41.69
McCosker 25.0 0.97 39.56
Gurusinha 25.0 1.70 38.92
McKenzie 25.0 2.29 37.39
Duff 25.0 1.68 35.59
Samaraweera 24.5 1.62 52.61
Rae 24.5 0.75 46.18
Kirsten 24.5 1.91 45.27
Parfitt 24.5 1.18 40.91
Sarwan 24.5 2.52 40.01
Manjrekar 24.5 1.86 39.12
Khan 24.5 1.32 38.92
Wright 24.5 1.51 37.82
Mohammad 24.5 1.54 35.81
Jones 24.0 1.83 46.55
Crowe 24.0 2.18 45.37
Ryder 24.0 1.99 44.85
Haynes 24.0 1.44 42.29
Vengsarkar 24.0 1.43 42.13
Vaughan 24.0 1.84 41.44
Gomes 24.0 1.39 39.63
Ponsford 23.5 2.10 48.22
Lehmann 23.5 1.44 44.95
Mohammad 23.5 1.65 39.17
Dhoni 23.5 1.27 38.14
Nazar 23.5 2.15 38.09
Abel 23.5 1.65 37.20
McCullum 23.5 2.18 36.70
Thorpe 23.0 1.30 44.66
Sutcliffe 23.0 2.36 40.10
Iqbal 23.0 1.68 38.85
Burge 23.0 2.05 38.16
Khan 23.0 1.54 37.69
Manjrekar 23.0 2.48 37.15
Hooper 23.0 2.10 36.47
Barber 23.0 2.49 35.59
Atherton 22.5 1.29 37.70
Kelleway 22.5 1.64 37.42
Stackpole 22.5 1.97 37.42
Hughes 22.5 1.66 37.41
Raja 22.5 1.25 36.16
Woolley 22.5 1.39 36.07
Prince 22.0 1.51 43.36
de Silva 22.0 1.88 42.98
Yallop 22.0 2.31 41.13
Bardsley 22.0 1.73 40.47
Sardesai 22.0 2.22 39.23
McMillan 22.0 1.34 38.46
Miller 22.0 1.62 36.97
Holt 22.0 1.83 36.75
Umar Akmal 21.5 1.34 35.82
Mohammad 21.0 2.85 43.98
Edrich 21.0 2.14 40.00
Hafeez 21.0 1.38 35.17
Mohammad 20.5 1.84 44.34
Jayasuriya 20.5 2.89 40.07
Logie 20.5 1.10 35.80
Warnapura 20.5 0.98 35.70
Stoddart 20.5 2.35 35.57
Shrewsbury 20.5 1.90 35.47
Hardstaff 20.0 1.46 46.74
Til'karatne 20.0 1.66 42.88
Goodwin 20.0 1.46 42.84
Matthews 20.0 1.24 41.09
Broad 20.0 1.50 39.55
Trumper 20.0 2.08 39.04
Lindsay 20.0 1.84 37.66
Khan 20.0 1.98 37.10
Astle 20.0 1.92 37.02
Patil 20.0 1.85 36.93
Dias 20.0 0.94 36.71
Borde 20.0 1.58 35.59
Ritchie 20.0 1.63 35.21
Kambli 19.0 1.77 54.20
Reid 19.0 1.37 46.29
Abbas 19.0 2.27 44.79
Prior 19.0 0.97 44.71
Hill 19.0 1.58 39.21
Coney 19.0 1.60 37.57
Hussain 19.0 1.63 37.19
Lamb 19.0 1.39 36.09
Luckhurst 19.0 1.36 36.05
Fowler 19.0 2.10 35.32
Gatting 18.5 2.13 35.55
Booth 18.0 1.29 42.21
Fletcher 18.0 2.03 39.90
Iredale 18.0 1.35 36.68
Oram 18.0 1.61 36.33
Styris 18.0 1.64 36.04
Steel 17.5 2.49 35.29
Ranji 17.0 1.77 44.95
Faulkner 17.0 1.82 40.79
Ames 17.0 1.56 40.56
Sandham 17.0 3.46 38.21
Umar 17.0 2.56 36.63
Afridi 17.0 1.45 36.51
Shastri 17.0 2.07 35.79
Atapattu 16.5 2.36 39.02
Lloyd 16.0 2.23 46.67
Amiss 16.0 1.99 46.30
Edwards 16.0 1.49 40.37
Tres'thick 15.0 2.38 43.80
How about compare each batsmans score against the median score (z-score)for each say 3-5 year period and average it so to get an idea how each batsman is rated according to his contemporaries
Comment by Joe Drake-Brockman | 12:00am GMT 24 November 2011
Very interesting article, but i doubt many lower order batsman would like it 😀
Comment by Leandro | 12:00am GMT 24 November 2011
Great work. I value consistency (median) as more important than averages. I would also like to see statistics on dot balls for one day match averages.
It should also be remembered that cricket is a team game and not all batsmen played for the team, rather they played for themselves and is more readily apparent in the one day format.
Comment by Steven Delit | 12:00am GMT 3 January 2012
Great article.
A question and a request 🙂
Question – are you taking the median of runs scored in an innings, or runs scored between dismissals?
Request – could you please let me know where you got the data from to do this analysis? I’d like to play around with it as well.
Thank you!
Comment by Masum | 12:00am BST 28 August 2013
I think this would also be very relevant for bowlers as an average of say 28 doesn’t translate to an actual analysis like a median of say 4/120 would and would allow for a better comparison of bowlers
Comment by Daniel | 5:39am GMT 8 February 2015