Making All-time Test Batting Averages Fully Comparable

Adorable Asshole · Nov 1, 2023

First theorem of CW : Any thread given enough time would turn into Murali v Warne.

shortpitched713 · Nov 1, 2023

I don't think anyone really likes this list, as small sample size kings Barry Richards, Voges, and Arif are ending up in the top 5.

I'm interested in what exactly the biggest factor for downgrading Bradman so severely is. I think it's a very realistic number, mind. Is it simply down to exclusion of a sheer weight of "dead runs"?

If such a characterization is indeed accurate (which I think you'll find many on here disputing shortly), then it would have been somewhat of a surreal spectacle to see. A batsman time and time again batting well beyond a total and match situation many other batsmen would have called for declarations, against a demoralized and essentially beaten bowling opposition, while not advancing the position of his own team, and seemingly all 21 other players taking part simply humoring him through the spectacle. There's anecdotal support for such a characterization of course, but mostly from those with an axe to grind (O'Reilly being the most prominent). And it certainly would have helped that the Don was immaculate in his ability to very quickly bludgeon and cash in on runs in such situations. However, the whole picture painted is in stark contrast to that painted of the immaculate GOAT of batting by historians and most contemporaries.

Coronis · Nov 1, 2023

shortpitched713 said:
I don't think anyone really likes this list, as small sample size kings Barry Richards, Voges, and Arif are ending up in the top 5.

I'm interested in what exactly the biggest factor for downgrading Bradman so severely is. I think it's a very realistic number, mind. Is it simply down to exclusion of a sheer weight of "dead runs"?

If such a characterization is indeed accurate (which I think you'll find many on here disputing shortly), then it would have been somewhat of a surreal spectacle to see. A batsman time and time again batting well beyond a total and match situation many other batsmen would have called for declarations, against a demoralized and essentially beaten bowling opposition, while not advancing the position of his own team, and seemingly all 21 other players taking part simply humoring him through the spectacle. There's anecdotal support for such a characterization of course, but mostly from those with an axe to grind (O'Reilly being the most prominent). And it certainly would have helped that the Don was immaculate in his ability to very quickly bludgeon and cash in on runs in such situations. However, the whole picture painted is in stark contrast to that painted of the immaculate GOAT of batting by historians and most contemporaries.

Well excluding certain scores above x seems disingenuous to me - that itself is a large part of Bradman’s skillset (18/29 hundreds were 150+, 12 were 200+) - especially considering such large scores are often the instrumental difference in the team winning. Also it would be almost impossible to define how many of these runs are “dead”. How do you define dead runs wrt a first innings score? Say, 100 runs less on the scoreboard changes the whole complexion of the match and the way the opposition approaches their innings.

For example (in the 30’s at least) Australia had O’Reilly and Grimmett of course, but their bowling depth after that was frankly awful compared with England. They lose a lot more of those matches without the Don.

shortpitched713 · Nov 1, 2023

Coronis said:
Well excluding certain scores above x seems disingenuous to me - that itself is a large part of Bradman’s skillset (18/29 hundreds were 150+, 12 were 200+) - especially considering such large scores are often the instrumental difference in the team winning. Also it would be almost impossible to define how many of these runs are “dead”. How do you define dead runs wrt a first innings score? Say, 100 runs less on the scoreboard changes the whole complexion of the match and the way the opposition approaches their innings.

I know, I find the whole exercise a bit dubious. It was just curious to me that such a dubious methodology lands Bradman with an adjusted average that I would find quite realistic instinctively, given his actual mortal coil.

AndrewB · Nov 1, 2023

The Sean said:
Every college physics professor in the world has more absolute physics knowledge than Einstein had, but that doesn’t mean they are greater than he was. Einstein himself also acknowledged the debt he owed to those who had gone before with his famous quote about “standing on the shoulders of giants.”

That was Newton, not Einstein (well, I suppose Einstein might have said it as well).

TheJediBrah · Nov 2, 2023

shortpitched713 said:
I know, I find the whole exercise a bit dubious. It was just curious to me that such a dubious methodology lands Bradman with an adjusted average that I would find quite realistic instinctively, given his actual mortal coil.

confirmation bias

Spark · Nov 2, 2023

So, like, according to this, NZ currently has no fewer than three of the greatest 30 batsmen ever to play the game currently in their Test match top order. Which by any reasonable estimation would make them one of the greatest if not the greatest top orders to ever play the game.

So my question is this: how do you explain NZ not being utterly dominant in Test cricket and steamrollering everyone they encounter by sheer weight of top order runs?

Actually no, my real question is: did you do any sanity checks on your methodology at all? While I also completely agree with all of @The Sean's criticisms, even disregarding the historical critiques the whole thing falls apart under the weight of its own arguments.

pbkettle · Nov 2, 2023

pbkettle said:
In my article on this topic (CW, 29 October), I suggested a general approach to the task - one which treats all players in a consistent way to ensure that criteria applied to a particular batter get applied to all.

I mentioned a hope that, after considering the method and findings obtained, readers may wish to introduce additional factors they consider would then produce a better reflection of batters' relative abilities at the crease. These “extensions” might be thought of as representing refinements to the basic scheme put forward, and some possibilities in my own mind were given at the end of the piece.

So I invite readers to formulate their own proposed improvements and see what these imply by incorporating them into the basic model or something similar.

For further details of the model used, and how to operate it, please contact me: pbkettle@gmail.com

ataraxia said:
I always knew Barry was better than Bradman

Glad you agree with me. I'd be interested to know the basis of your own view. Perhaps you could elaborate?

ataraxia said:
I always knew Barry was better than Bradman

Adorable Asshole said:
Always knew Sangakkara was better than hacks like Sachin and Sobers.

How did you know, I wonder. Please tell whether this is purely an intuitive judgement or from some sort of analysis.

The Sean said:
Ok, I’ve had a re-read of this so as to try to get to grips with it. I’m specifically avoiding pithy one-liner responses because it’s clear that a lot of effort has gone into this analysis and I respect anyone who puts in the hard yards to develop something like this, and it deserves a proper and considered response.

Obviously, the (scarcely believable) reduction in Bradman’s numbers is the striking headline, which I’ll come to but don’t want to focus on at the expense of everything else. Rather, I have questions generally about the methodology and what seem to be inconsistencies to me but which I may not have fully understood.

To start with, dead runs. Of course, I understand that we can find examples where a team has batted on and on beyond the point of really needing to, but is that really so common as to literally exclude all instances of particularly high scoring throughout the entirety of Test history? As cricket fans, we often lament the fact that players don’t go big when the opportunity presents itself, and yet here you have invented a criterion which seems specifically designed to do nothing except statistically punish players for doing so. To exclude all runs beyond 100/135/150 scored by any batsman in any innings they have played makes absolutely zero sense in a study literally focused on batting averages, and would do so only if you told me that Mark Waugh was a paid consultant on this exercise.

Let’s take just as one example Hanif Mohammad’s epic 337 against the West Indies in Bridgetown in 1958, which saved Pakistan from what seemed a certain heavy defeat. Pakistan had been rolled very cheaply in their first innings in the face of a huge West Indian total, and were forced to follow-on nearly 500 runs behind. If Hanif gets out for 150 there, Pakistan lose and lose huge. Why on earth should his last 187 runs be discounted as “dead”? In fact, I’d say that each run he scored beyond 150 was more valuable, not less.

Second, the dominance rating could do with more clarity (at least from my not-very-smart perspective). For starters, just the inclusion alone of certain players who have played so few innings is bound to skew the analysis. Barry Richards was surely a magnificent player, but seven Test innings can’t possibly be enough of a sample size for a study that presents itself as a statistical one. Graeme Pollock, it should be noted, played in that same series and scored more runs than Richards at a better average (albeit with one of those innings being a lot of “dead” runs according to your criteria). The inclusion of Taslim Arif is even more incongruous – he played only ten Test innings, and made 42% of his entire Test aggregate in just one of them! And yet there he is, looking down on almost everyone from fourth spot. If Richards and Taslim are included, why not Andy Ganteaume or Kurtis Patterson? And Lawrence Rowe must be kicking himself for not retiring after his first Test.

However, more than who is and is not included, what I don’t follow is how the dominance ratings are first applied and then adjusted. Notwithstanding the removal of dead runs which I have already discussed, I read your passage explaining the theory behind calculating the dominance rating, but can’t see any source numbers which show clearly why an individual batsman has a specific dominance rating and what his baseline 1.0 number is. Again I'd argue that the "dead runs" part of this analysis doesn't tally with the goal as we are looking at Test averages in the context of dominance over their peers, and yet are actively removing the huge scores which would indicate the kind of true dominance we are looking for.

Beyond even that though, is that I can’t see how or why the older batsmen have had their numbers reduced by so much in order to align with what their “equivalent” would be in the Present Era. To use an example that you note in your piece:

“A batsman who exhibits the same degree of dominance today as Bradman did in his own time would not require nearly such a high batting average as he achieved. Instead of an average of 91.0 (excluding dead runs), this reduces by 23%, to become 70.1 to allow for subsequent deflation in the level and spread of averages; and is further reduced by 4.5%, down to 66.9, to reflect the general increase in batting skills since Bradman’s playing days.”

Once again, I may have missed where this is clearly explained by why does Bradman (or someone like him) have his average reduced by 23% - what is the rationale for that specific number? It’s a colossal reduction, particularly given you are already removing “dead runs” (which Bradman made many, many more of than anyone else by proportion due to his almost inhuman ability to go big), and also applying the arbitrary “expertise” reduction (which I’ll come to later) as well. It seems like an excuse to apply reduction upon reduction for unclear reasons in order to reach a number you want it to.

You reference Charles Davis’ study during your piece, noting areas where he adjusted or downgraded Bradman and others in his standardisation that you haven’t, and have been “fairer” in your analysis. And yet Davis’ final conclusion – with Bradman’s standardised average of 84.5 – was that from a statistical point of view he really shouldn’t exist, being so many standard deviations from the mean, and that it would be reasonable to expect a player as statistically dominant as Bradman to emerge in the game of cricket approximately once in every 200,000 years. How has your exercise – which you suggest is fairer and more reasonable – reduced Bradman’s number by so much more than Davis did to the point that he’s not an outlier at all?

Third, the eras seem inconsistent. The Present Era as defined by the study encompasses a period of nearly 24 years, which is at odds with some previous eras being confined to a single decade. One of the core tenets of this study is that everything before 2000 is to be standardised to the baseline of what it would be in the Present Era, and yet this era is so broad in scope that I don’t see how that is meaningful or even possible. We’ve seen enough studies of standardised batting averages over the years to have concluded that Test batting conditions in recent times are very different to those of the early 2000s (nearly a quarter of a century ago!) but this exercise places it all in the same bucket.

For previous eras, the amount of time designated as an “era” varies widely. In many cases, it is a simple calendar decade. But then we also have the period after WWII which when designating eras to include batsmen encompasses a full 21 years inclusive from 1946-66, but when deciding on expertise % is split in half. Why? The few years after the war saw an explosion in runscoring to an all time peak. The 1950s then, as we know, saw a concerted effort to make pitches in most of the world more bowler friendly with a corresponding drop in batting averages. Then the 1960s saw a rebalance between bat and ball that took global averages back up past the 30 mark where they have remained more or less ever since (barring a dip into the 29s during the 1990s).

The blokes batting in the mid-1960s aren’t the same guys who were batting in the late-1940s, nor were they batting in necessarily comparable conditions to all the guys in between, so what is the rationale behind a 46-66 era – or why the expertise splits it with quite a significant percentage difference in 1955? Why not stop it at the end of a decade? The next era is 13 years inclusive from 67-79, followed by simple calendar decades again for the ‘80s and ‘90s. If that shortening is because of the sheer increase in cricketing volume over time – more cricket was played by more cricketers in later decades – then I get that, but it subsequently renders completely senseless the idea of a “Present Era” encompassing a period nearly two and a half times longer than those immediately before! I don’t see how you can meaningfully baseline or compare when the parameters are seemingly so arbitrary and inconsistent.

Fourth, expertise advancement. This is one of my pet hates in regard to cricketing discussions generally, not just this particular piece, because it so often relies on the concept of teleporting a player from the past without warning decades into the future and then judging him on that. You make this point during your piece where you write:

“…if Bradman were to be transported to the Present Era with his demonstrated abilities unchanged, he would be somewhat less dominant in relation to present day players than he was in relation to his contemporaries because batting expertise in general has risen in the intervening period. (Hence my lower, fully standardised, average for him.) The same point applies to all batsmen of each previous era.”

Yes, cricket has evolved over time, because of course it has. In every field of human endeavour we are faster and stronger and know more than we used to, and if we don’t then something has gone seriously wrong. We have access to tools, knowledge and technology that previous generations didn’t have and in many cases couldn’t even dream of. However, this to me is a natural function our advancement, rather than an indication that we are today inherently “better” than those who had gone before. Every college physics professor in the world has more absolute physics knowledge than Einstein had, but that doesn’t mean they are greater than he was. Einstein himself also acknowledged the debt he owed to those who had gone before with his famous quote about “standing on the shoulders of giants.”

If we are acknowledge – as we should – that the game of cricket has evolved over the course of its history then we also need to acknowledge those things which impacted players of the past which today’s champions no longer have to deal with. Pitches weren’t covered from the elements, boundaries weren’t roped in 20 yards from the fence, protective equipment protected **** all, bats didn’t have sweet spots which turned a piece of willow into a cannon, and bowlers delivered the ball from further up the pitch due to no-balls being determined by the position of the back foot, not the front. If we are going to use the evolution of the game to rate players, then we need to adjust in both directions.

And that’s without even addressing the arbitrary percentages applied to downgrade players’ “expertise” from previous eras. Why are batsmen from the 1990s downgraded 2%, but players from the 1980s by more than double that at 4.5%? What is the rationale behind the percentage decreases applied at each time period? To my point above about time period inconsistencies, why are we saying that expertise has increased by a certain (inconsistent) percentage in virtually every decade, but then there has been absolutely no increase at all in batting expertise between 2000 and 2023?

When it comes to the application of these reductions for previous players, why aren’t they consistent with the figures you’d previously allocated for each era. For example, Ken Barrington played in an era which you have said was 7.5% lower in expertise than the Present Era. However, on your ranked table in the column “Allow for Advance in Expertise”, you have adjusted Barrington’s average by 3.8%. Why not 7.5% - has that adjustment already been factored in elsewhere in another number? It was hard to tell, and throughout the list the % reduction on the table was lower than the % reduction for era which you’d nominated earlier. I may well be missing something there which you’d already explained somewhere else, but it would be good to understand why it is different.

Finally, a sense check. I always think that will studies like this one of the most interesting parts is coming to the end of your analysis and discovering something that you wouldn’t have expected to find or that casts new light on previously accepted wisdom. However, I also strongly believe that any conclusion should be balanced by a sense check, a smell test if you will – basically, does this make sense based on everything I know of the subject at hand and what could reasonably be expected? I won’t even address the top of the list, and the inclusion in an all time top ten of blokes for whom the hashtag #samplesizelol could have been invented.

Rather, I’d look further down the list for my sense check. If you conducted a study to rank Test batsmen and found that Viv Richards and Len Hutton were placed one after the other on your list, then you’d probably think that’s a fair starting point that you may well be on the right track. However, if the two consecutive positions they held on the list were 53rd and 54th, I’d reckon you might cross-reference that against your knowledge and appreciation of cricketing history and have another look at your methodology.

While I appreciate the effort which went into this, and accepting fully that I may well have gotten the wrong end of the stick on literally everything, this study screams to me to be one which started with the controversial conclusion that it wanted to reach, and then introduced ever more extreme criteria in order to reach it.

I didn't exclude all

The Sean said:
Ok, I’ve had a re-read of this so as to try to get to grips with it. I’m specifically avoiding pithy one-liner responses because it’s clear that a lot of effort has gone into this analysis and I respect anyone who puts in the hard yards to develop something like this, and it deserves a proper and considered response.

Obviously, the (scarcely believable) reduction in Bradman’s numbers is the striking headline, which I’ll come to but don’t want to focus on at the expense of everything else. Rather, I have questions generally about the methodology and what seem to be inconsistencies to me but which I may not have fully understood.

To start with, dead runs. Of course, I understand that we can find examples where a team has batted on and on beyond the point of really needing to, but is that really so common as to literally exclude all instances of particularly high scoring throughout the entirety of Test history? As cricket fans, we often lament the fact that players don’t go big when the opportunity presents itself, and yet here you have invented a criterion which seems specifically designed to do nothing except statistically punish players for doing so. To exclude all runs beyond 100/135/150 scored by any batsman in any innings they have played makes absolutely zero sense in a study literally focused on batting averages, and would do so only if you told me that Mark Waugh was a paid consultant on this exercise.

Let’s take just as one example Hanif Mohammad’s epic 337 against the West Indies in Bridgetown in 1958, which saved Pakistan from what seemed a certain heavy defeat. Pakistan had been rolled very cheaply in their first innings in the face of a huge West Indian total, and were forced to follow-on nearly 500 runs behind. If Hanif gets out for 150 there, Pakistan lose and lose huge. Why on earth should his last 187 runs be discounted as “dead”? In fact, I’d say that each run he scored beyond 150 was more valuable, not less.

Second, the dominance rating could do with more clarity (at least from my not-very-smart perspective). For starters, just the inclusion alone of certain players who have played so few innings is bound to skew the analysis. Barry Richards was surely a magnificent player, but seven Test innings can’t possibly be enough of a sample size for a study that presents itself as a statistical one. Graeme Pollock, it should be noted, played in that same series and scored more runs than Richards at a better average (albeit with one of those innings being a lot of “dead” runs according to your criteria). The inclusion of Taslim Arif is even more incongruous – he played only ten Test innings, and made 42% of his entire Test aggregate in just one of them! And yet there he is, looking down on almost everyone from fourth spot. If Richards and Taslim are included, why not Andy Ganteaume or Kurtis Patterson? And Lawrence Rowe must be kicking himself for not retiring after his first Test.

However, more than who is and is not included, what I don’t follow is how the dominance ratings are first applied and then adjusted. Notwithstanding the removal of dead runs which I have already discussed, I read your passage explaining the theory behind calculating the dominance rating, but can’t see any source numbers which show clearly why an individual batsman has a specific dominance rating and what his baseline 1.0 number is. Again I'd argue that the "dead runs" part of this analysis doesn't tally with the goal as we are looking at Test averages in the context of dominance over their peers, and yet are actively removing the huge scores which would indicate the kind of true dominance we are looking for.

Beyond even that though, is that I can’t see how or why the older batsmen have had their numbers reduced by so much in order to align with what their “equivalent” would be in the Present Era. To use an example that you note in your piece:

“A batsman who exhibits the same degree of dominance today as Bradman did in his own time would not require nearly such a high batting average as he achieved. Instead of an average of 91.0 (excluding dead runs), this reduces by 23%, to become 70.1 to allow for subsequent deflation in the level and spread of averages; and is further reduced by 4.5%, down to 66.9, to reflect the general increase in batting skills since Bradman’s playing days.”

Once again, I may have missed where this is clearly explained by why does Bradman (or someone like him) have his average reduced by 23% - what is the rationale for that specific number? It’s a colossal reduction, particularly given you are already removing “dead runs” (which Bradman made many, many more of than anyone else by proportion due to his almost inhuman ability to go big), and also applying the arbitrary “expertise” reduction (which I’ll come to later) as well. It seems like an excuse to apply reduction upon reduction for unclear reasons in order to reach a number you want it to.

You reference Charles Davis’ study during your piece, noting areas where he adjusted or downgraded Bradman and others in his standardisation that you haven’t, and have been “fairer” in your analysis. And yet Davis’ final conclusion – with Bradman’s standardised average of 84.5 – was that from a statistical point of view he really shouldn’t exist, being so many standard deviations from the mean, and that it would be reasonable to expect a player as statistically dominant as Bradman to emerge in the game of cricket approximately once in every 200,000 years. How has your exercise – which you suggest is fairer and more reasonable – reduced Bradman’s number by so much more than Davis did to the point that he’s not an outlier at all?

Third, the eras seem inconsistent. The Present Era as defined by the study encompasses a period of nearly 24 years, which is at odds with some previous eras being confined to a single decade. One of the core tenets of this study is that everything before 2000 is to be standardised to the baseline of what it would be in the Present Era, and yet this era is so broad in scope that I don’t see how that is meaningful or even possible. We’ve seen enough studies of standardised batting averages over the years to have concluded that Test batting conditions in recent times are very different to those of the early 2000s (nearly a quarter of a century ago!) but this exercise places it all in the same bucket.

For previous eras, the amount of time designated as an “era” varies widely. In many cases, it is a simple calendar decade. But then we also have the period after WWII which when designating eras to include batsmen encompasses a full 21 years inclusive from 1946-66, but when deciding on expertise % is split in half. Why? The few years after the war saw an explosion in runscoring to an all time peak. The 1950s then, as we know, saw a concerted effort to make pitches in most of the world more bowler friendly with a corresponding drop in batting averages. Then the 1960s saw a rebalance between bat and ball that took global averages back up past the 30 mark where they have remained more or less ever since (barring a dip into the 29s during the 1990s).

The blokes batting in the mid-1960s aren’t the same guys who were batting in the late-1940s, nor were they batting in necessarily comparable conditions to all the guys in between, so what is the rationale behind a 46-66 era – or why the expertise splits it with quite a significant percentage difference in 1955? Why not stop it at the end of a decade? The next era is 13 years inclusive from 67-79, followed by simple calendar decades again for the ‘80s and ‘90s. If that shortening is because of the sheer increase in cricketing volume over time – more cricket was played by more cricketers in later decades – then I get that, but it subsequently renders completely senseless the idea of a “Present Era” encompassing a period nearly two and a half times longer than those immediately before! I don’t see how you can meaningfully baseline or compare when the parameters are seemingly so arbitrary and inconsistent.

Fourth, expertise advancement. This is one of my pet hates in regard to cricketing discussions generally, not just this particular piece, because it so often relies on the concept of teleporting a player from the past without warning decades into the future and then judging him on that. You make this point during your piece where you write:

“…if Bradman were to be transported to the Present Era with his demonstrated abilities unchanged, he would be somewhat less dominant in relation to present day players than he was in relation to his contemporaries because batting expertise in general has risen in the intervening period. (Hence my lower, fully standardised, average for him.) The same point applies to all batsmen of each previous era.”

Yes, cricket has evolved over time, because of course it has. In every field of human endeavour we are faster and stronger and know more than we used to, and if we don’t then something has gone seriously wrong. We have access to tools, knowledge and technology that previous generations didn’t have and in many cases couldn’t even dream of. However, this to me is a natural function our advancement, rather than an indication that we are today inherently “better” than those who had gone before. Every college physics professor in the world has more absolute physics knowledge than Einstein had, but that doesn’t mean they are greater than he was. Einstein himself also acknowledged the debt he owed to those who had gone before with his famous quote about “standing on the shoulders of giants.”

If we are acknowledge – as we should – that the game of cricket has evolved over the course of its history then we also need to acknowledge those things which impacted players of the past which today’s champions no longer have to deal with. Pitches weren’t covered from the elements, boundaries weren’t roped in 20 yards from the fence, protective equipment protected **** all, bats didn’t have sweet spots which turned a piece of willow into a cannon, and bowlers delivered the ball from further up the pitch due to no-balls being determined by the position of the back foot, not the front. If we are going to use the evolution of the game to rate players, then we need to adjust in both directions.

And that’s without even addressing the arbitrary percentages applied to downgrade players’ “expertise” from previous eras. Why are batsmen from the 1990s downgraded 2%, but players from the 1980s by more than double that at 4.5%? What is the rationale behind the percentage decreases applied at each time period? To my point above about time period inconsistencies, why are we saying that expertise has increased by a certain (inconsistent) percentage in virtually every decade, but then there has been absolutely no increase at all in batting expertise between 2000 and 2023?

When it comes to the application of these reductions for previous players, why aren’t they consistent with the figures you’d previously allocated for each era. For example, Ken Barrington played in an era which you have said was 7.5% lower in expertise than the Present Era. However, on your ranked table in the column “Allow for Advance in Expertise”, you have adjusted Barrington’s average by 3.8%. Why not 7.5% - has that adjustment already been factored in elsewhere in another number? It was hard to tell, and throughout the list the % reduction on the table was lower than the % reduction for era which you’d nominated earlier. I may well be missing something there which you’d already explained somewhere else, but it would be good to understand why it is different.

Finally, a sense check. I always think that will studies like this one of the most interesting parts is coming to the end of your analysis and discovering something that you wouldn’t have expected to find or that casts new light on previously accepted wisdom. However, I also strongly believe that any conclusion should be balanced by a sense check, a smell test if you will – basically, does this make sense based on everything I know of the subject at hand and what could reasonably be expected? I won’t even address the top of the list, and the inclusion in an all time top ten of blokes for whom the hashtag #samplesizelol could have been invented.

Rather, I’d look further down the list for my sense check. If you conducted a study to rank Test batsmen and found that Viv Richards and Len Hutton were placed one after the other on your list, then you’d probably think that’s a fair starting point that you may well be on the right track. However, if the two consecutive positions they held on the list were 53rd and 54th, I’d reckon you might cross-reference that against your knowledge and appreciation of cricketing history and have another look at your methodology.

While I appreciate the effort which went into this, and accepting fully that I may well have gotten the wrong end of the stick on literally everything, this study screams to me to be one which started with the controversial conclusion that it wanted to reach, and then introduced ever more extreme criteria in order to reach it.

srbhkshk said:
I will have to say I have massive respect for the author purely going with whatever his algo spits out and making absolutely zero attempt at making the list look even remotely like the generally accepted wisdom. It's the kind of confidence I wish I had.

srbhkshk said:
In applying models of performance, results usually include what reviewers regard as some odd findings. This means potential improvements to the model should be formulated and tested, and examining the findings in overall terms - here to judge whether the ordering of players is considered more realistic than using the raw averages.

The Sean said:

Ok, I’ve had a re-read of this so as to try to get to grips with it. I’m specifically avoiding pithy one-liner responses because it’s clear that a lot of effort has gone into this analysis and I respect anyone who puts in the hard yards to develop something like this, and it deserves a proper and considered response.

Obviously, the (scarcely believable) reduction in Bradman’s numbers is the striking headline, which I’ll come to but don’t want to focus on at the expense of everything else. Rather, I have questions generally about the methodology and what seem to be inconsistencies to me but which I may not have fully understood.

To start with, dead runs. Of course, I understand that we can find examples where a team has batted on and on beyond the point of really needing to, but is that really so common as to literally exclude all instances of particularly high scoring throughout the entirety of Test history? As cricket fans, we often lament the fact that players don’t go big when the opportunity presents itself, and yet here you have invented a criterion which seems specifically designed to do nothing except statistically punish players for doing so. To exclude all runs beyond 100/135/150 scored by any batsman in any innings they have played makes absolutely zero sense in a study literally focused on batting averages, and would do so only if you told me that Mark Waugh was a paid consultant on this exercise.

Let’s take just as one example Hanif Mohammad’s epic 337 against the West Indies in Bridgetown in 1958, which saved Pakistan from what seemed a certain heavy defeat. Pakistan had been rolled very cheaply in their first innings in the face of a huge West Indian total, and were forced to follow-on nearly 500 runs behind. If Hanif gets out for 150 there, Pakistan lose and lose huge. Why on earth should his last 187 runs be discounted as “dead”? In fact, I’d say that each run he scored beyond 150 was more valuable, not less.

Second, the dominance rating could do with more clarity (at least from my not-very-smart perspective). For starters, just the inclusion alone of certain players who have played so few innings is bound to skew the analysis. Barry Richards was surely a magnificent player, but seven Test innings can’t possibly be enough of a sample size for a study that presents itself as a statistical one. Graeme Pollock, it should be noted, played in that same series and scored more runs than Richards at a better average (albeit with one of those innings being a lot of “dead” runs according to your criteria). The inclusion of Taslim Arif is even more incongruous – he played only ten Test innings, and made 42% of his entire Test aggregate in just one of them! And yet there he is, looking down on almost everyone from fourth spot. If Richards and Taslim are included, why not Andy Ganteaume or Kurtis Patterson? And Lawrence Rowe must be kicking himself for not retiring after his first Test.

However, more than who is and is not included, what I don’t follow is how the dominance ratings are first applied and then adjusted. Notwithstanding the removal of dead runs which I have already discussed, I read your passage explaining the theory behind calculating the dominance rating, but can’t see any source numbers which show clearly why an individual batsman has a specific dominance rating and what his baseline 1.0 number is. Again I'd argue that the "dead runs" part of this analysis doesn't tally with the goal as we are looking at Test averages in the context of dominance over their peers, and yet are actively removing the huge scores which would indicate the kind of true dominance we are looking for.

Beyond even that though, is that I can’t see how or why the older batsmen have had their numbers reduced by so much in order to align with what their “equivalent” would be in the Present Era. To use an example that you note in your piece:

“A batsman who exhibits the same degree of dominance today as Bradman did in his own time would not require nearly such a high batting average as he achieved. Instead of an average of 91.0 (excluding dead runs), this reduces by 23%, to become 70.1 to allow for subsequent deflation in the level and spread of averages; and is further reduced by 4.5%, down to 66.9, to reflect the general increase in batting skills since Bradman’s playing days.”

Once again, I may have missed where this is clearly explained by why does Bradman (or someone like him) have his average reduced by 23% - what is the rationale for that specific number? It’s a colossal reduction, particularly given you are already removing “dead runs” (which Bradman made many, many more of than anyone else by proportion due to his almost inhuman ability to go big), and also applying the arbitrary “expertise” reduction (which I’ll come to later) as well. It seems like an excuse to apply reduction upon reduction for unclear reasons in order to reach a number you want it to.

You reference Charles Davis’ study during your piece, noting areas where he adjusted or downgraded Bradman and others in his standardisation that you haven’t, and have been “fairer” in your analysis. And yet Davis’ final conclusion – with Bradman’s standardised average of 84.5 – was that from a statistical point of view he really shouldn’t exist, being so many standard deviations from the mean, and that it would be reasonable to expect a player as statistically dominant as Bradman to emerge in the game of cricket approximately once in every 200,000 years. How has your exercise – which you suggest is fairer and more reasonable – reduced Bradman’s number by so much more than Davis did to the point that he’s not an outlier at all?

Third, the eras seem inconsistent. The Present Era as defined by the study encompasses a period of nearly 24 years, which is at odds with some previous eras being confined to a single decade. One of the core tenets of this study is that everything before 2000 is to be standardised to the baseline of what it would be in the Present Era, and yet this era is so broad in scope that I don’t see how that is meaningful or even possible. We’ve seen enough studies of standardised batting averages over the years to have concluded that Test batting conditions in recent times are very different to those of the early 2000s (nearly a quarter of a century ago!) but this exercise places it all in the same bucket.

For previous eras, the amount of time designated as an “era” varies widely. In many cases, it is a simple calendar decade. But then we also have the period after WWII which when designating eras to include batsmen encompasses a full 21 years inclusive from 1946-66, but when deciding on expertise % is split in half. Why? The few years after the war saw an explosion in runscoring to an all time peak. The 1950s then, as we know, saw a concerted effort to make pitches in most of the world more bowler friendly with a corresponding drop in batting averages. Then the 1960s saw a rebalance between bat and ball that took global averages back up past the 30 mark where they have remained more or less ever since (barring a dip into the 29s during the 1990s).

The blokes batting in the mid-1960s aren’t the same guys who were batting in the late-1940s, nor were they batting in necessarily comparable conditions to all the guys in between, so what is the rationale behind a 46-66 era – or why the expertise splits it with quite a significant percentage difference in 1955? Why not stop it at the end of a decade? The next era is 13 years inclusive from 67-79, followed by simple calendar decades again for the ‘80s and ‘90s. If that shortening is because of the sheer increase in cricketing volume over time – more cricket was played by more cricketers in later decades – then I get that, but it subsequently renders completely senseless the idea of a “Present Era” encompassing a period nearly two and a half times longer than those immediately before! I don’t see how you can meaningfully baseline or compare when the parameters are seemingly so arbitrary and inconsistent.

Fourth, expertise advancement. This is one of my pet hates in regard to cricketing discussions generally, not just this particular piece, because it so often relies on the concept of teleporting a player from the past without warning decades into the future and then judging him on that. You make this point during your piece where you write:

“…if Bradman were to be transported to the Present Era with his demonstrated abilities unchanged, he would be somewhat less dominant in relation to present day players than he was in relation to his contemporaries because batting expertise in general has risen in the intervening period. (Hence my lower, fully standardised, average for him.) The same point applies to all batsmen of each previous era.”

Yes, cricket has evolved over time, because of course it has. In every field of human endeavour we are faster and stronger and know more than we used to, and if we don’t then something has gone seriously wrong. We have access to tools, knowledge and technology that previous generations didn’t have and in many cases couldn’t even dream of. However, this to me is a natural function our advancement, rather than an indication that we are today inherently “better” than those who had gone before. Every college physics professor in the world has more absolute physics knowledge than Einstein had, but that doesn’t mean they are greater than he was. Einstein himself also acknowledged the debt he owed to those who had gone before with his famous quote about “standing on the shoulders of giants.”

If we are acknowledge – as we should – that the game of cricket has evolved over the course of its history then we also need to acknowledge those things which impacted players of the past which today’s champions no longer have to deal with. Pitches weren’t covered from the elements, boundaries weren’t roped in 20 yards from the fence, protective equipment protected **** all, bats didn’t have sweet spots which turned a piece of willow into a cannon, and bowlers delivered the ball from further up the pitch due to no-balls being determined by the position of the back foot, not the front. If we are going to use the evolution of the game to rate players, then we need to adjust in both directions.

And that’s without even addressing the arbitrary percentages applied to downgrade players’ “expertise” from previous eras. Why are batsmen from the 1990s downgraded 2%, but players from the 1980s by more than double that at 4.5%? What is the rationale behind the percentage decreases applied at each time period? To my point above about time period inconsistencies, why are we saying that expertise has increased by a certain (inconsistent) percentage in virtually every decade, but then there has been absolutely no increase at all in batting expertise between 2000 and 2023?

When it comes to the application of these reductions for previous players, why aren’t they consistent with the figures you’d previously allocated for each era. For example, Ken Barrington played in an era which you have said was 7.5% lower in expertise than the Present Era. However, on your ranked table in the column “Allow for Advance in Expertise”, you have adjusted Barrington’s average by 3.8%. Why not 7.5% - has that adjustment already been factored in elsewhere in another number? It was hard to tell, and throughout the list the % reduction on the table was lower than the % reduction for era which you’d nominated earlier. I may well be missing something there which you’d already explained somewhere else, but it would be good to understand why it is different.

Finally, a sense check. I always think that will studies like this one of the most interesting parts is coming to the end of your analysis and discovering something that you wouldn’t have expected to find or that casts new light on previously accepted wisdom. However, I also strongly believe that any conclusion should be balanced by a sense check, a smell test if you will – basically, does this make sense based on everything I know of the subject at hand and what could reasonably be expected? I won’t even address the top of the list, and the inclusion in an all time top ten of blokes for whom the hashtag #samplesizelol could have been invented.

Rather, I’d look further down the list for my sense check. If you conducted a study to rank Test batsmen and found that Viv Richards and Len Hutton were placed one after the other on your list, then you’d probably think that’s a fair starting point that you may well be on the right track. However, if the two consecutive positions they held on the list were 53rd and 54th, I’d reckon you might cross-reference that against your knowledge and appreciation of cricketing history and have another look at your methodology.

While I appreciate the effort which went into this, and accepting fully that I may well have gotten the wrong end of the stick on literally everything, this study screams to me to be one which started with the controversial conclusion that it wanted to reach, and then introduced ever more extreme criteria in order to reach it.

Click to expand...

Innings above the specified thresholds which were used solely to identify matches worth looking at to see if dead runs were scored. More shortly.

The Sean said:
Ok, I’ve had a re-read of this so as to try to get to grips with it. I’m specifically avoiding pithy one-liner responses because it’s clear that a lot of effort has gone into this analysis and I respect anyone who puts in the hard yards to develop something like this, and it deserves a proper and considered response.

Obviously, the (scarcely believable) reduction in Bradman’s numbers is the striking headline, which I’ll come to but don’t want to focus on at the expense of everything else. Rather, I have questions generally about the methodology and what seem to be inconsistencies to me but which I may not have fully understood.

To start with, dead runs. Of course, I understand that we can find examples where a team has batted on and on beyond the point of really needing to, but is that really so common as to literally exclude all instances of particularly high scoring throughout the entirety of Test history? As cricket fans, we often lament the fact that players don’t go big when the opportunity presents itself, and yet here you have invented a criterion which seems specifically designed to do nothing except statistically punish players for doing so. To exclude all runs beyond 100/135/150 scored by any batsman in any innings they have played makes absolutely zero sense in a study literally focused on batting averages, and would do so only if you told me that Mark Waugh was a paid consultant on this exercise.

Let’s take just as one example Hanif Mohammad’s epic 337 against the West Indies in Bridgetown in 1958, which saved Pakistan from what seemed a certain heavy defeat. Pakistan had been rolled very cheaply in their first innings in the face of a huge West Indian total, and were forced to follow-on nearly 500 runs behind. If Hanif gets out for 150 there, Pakistan lose and lose huge. Why on earth should his last 187 runs be discounted as “dead”? In fact, I’d say that each run he scored beyond 150 was more valuable, not less.

Second, the dominance rating could do with more clarity (at least from my not-very-smart perspective). For starters, just the inclusion alone of certain players who have played so few innings is bound to skew the analysis. Barry Richards was surely a magnificent player, but seven Test innings can’t possibly be enough of a sample size for a study that presents itself as a statistical one. Graeme Pollock, it should be noted, played in that same series and scored more runs than Richards at a better average (albeit with one of those innings being a lot of “dead” runs according to your criteria). The inclusion of Taslim Arif is even more incongruous – he played only ten Test innings, and made 42% of his entire Test aggregate in just one of them! And yet there he is, looking down on almost everyone from fourth spot. If Richards and Taslim are included, why not Andy Ganteaume or Kurtis Patterson? And Lawrence Rowe must be kicking himself for not retiring after his first Test.

However, more than who is and is not included, what I don’t follow is how the dominance ratings are first applied and then adjusted. Notwithstanding the removal of dead runs which I have already discussed, I read your passage explaining the theory behind calculating the dominance rating, but can’t see any source numbers which show clearly why an individual batsman has a specific dominance rating and what his baseline 1.0 number is. Again I'd argue that the "dead runs" part of this analysis doesn't tally with the goal as we are looking at Test averages in the context of dominance over their peers, and yet are actively removing the huge scores which would indicate the kind of true dominance we are looking for.

Beyond even that though, is that I can’t see how or why the older batsmen have had their numbers reduced by so much in order to align with what their “equivalent” would be in the Present Era. To use an example that you note in your piece:

“A batsman who exhibits the same degree of dominance today as Bradman did in his own time would not require nearly such a high batting average as he achieved. Instead of an average of 91.0 (excluding dead runs), this reduces by 23%, to become 70.1 to allow for subsequent deflation in the level and spread of averages; and is further reduced by 4.5%, down to 66.9, to reflect the general increase in batting skills since Bradman’s playing days.”

Once again, I may have missed where this is clearly explained by why does Bradman (or someone like him) have his average reduced by 23% - what is the rationale for that specific number? It’s a colossal reduction, particularly given you are already removing “dead runs” (which Bradman made many, many more of than anyone else by proportion due to his almost inhuman ability to go big), and also applying the arbitrary “expertise” reduction (which I’ll come to later) as well. It seems like an excuse to apply reduction upon reduction for unclear reasons in order to reach a number you want it to.

You reference Charles Davis’ study during your piece, noting areas where he adjusted or downgraded Bradman and others in his standardisation that you haven’t, and have been “fairer” in your analysis. And yet Davis’ final conclusion – with Bradman’s standardised average of 84.5 – was that from a statistical point of view he really shouldn’t exist, being so many standard deviations from the mean, and that it would be reasonable to expect a player as statistically dominant as Bradman to emerge in the game of cricket approximately once in every 200,000 years. How has your exercise – which you suggest is fairer and more reasonable – reduced Bradman’s number by so much more than Davis did to the point that he’s not an outlier at all?

Third, the eras seem inconsistent. The Present Era as defined by the study encompasses a period of nearly 24 years, which is at odds with some previous eras being confined to a single decade. One of the core tenets of this study is that everything before 2000 is to be standardised to the baseline of what it would be in the Present Era, and yet this era is so broad in scope that I don’t see how that is meaningful or even possible. We’ve seen enough studies of standardised batting averages over the years to have concluded that Test batting conditions in recent times are very different to those of the early 2000s (nearly a quarter of a century ago!) but this exercise places it all in the same bucket.

For previous eras, the amount of time designated as an “era” varies widely. In many cases, it is a simple calendar decade. But then we also have the period after WWII which when designating eras to include batsmen encompasses a full 21 years inclusive from 1946-66, but when deciding on expertise % is split in half. Why? The few years after the war saw an explosion in runscoring to an all time peak. The 1950s then, as we know, saw a concerted effort to make pitches in most of the world more bowler friendly with a corresponding drop in batting averages. Then the 1960s saw a rebalance between bat and ball that took global averages back up past the 30 mark where they have remained more or less ever since (barring a dip into the 29s during the 1990s).

The blokes batting in the mid-1960s aren’t the same guys who were batting in the late-1940s, nor were they batting in necessarily comparable conditions to all the guys in between, so what is the rationale behind a 46-66 era – or why the expertise splits it with quite a significant percentage difference in 1955? Why not stop it at the end of a decade? The next era is 13 years inclusive from 67-79, followed by simple calendar decades again for the ‘80s and ‘90s. If that shortening is because of the sheer increase in cricketing volume over time – more cricket was played by more cricketers in later decades – then I get that, but it subsequently renders completely senseless the idea of a “Present Era” encompassing a period nearly two and a half times longer than those immediately before! I don’t see how you can meaningfully baseline or compare when the parameters are seemingly so arbitrary and inconsistent.

Fourth, expertise advancement. This is one of my pet hates in regard to cricketing discussions generally, not just this particular piece, because it so often relies on the concept of teleporting a player from the past without warning decades into the future and then judging him on that. You make this point during your piece where you write:

“…if Bradman were to be transported to the Present Era with his demonstrated abilities unchanged, he would be somewhat less dominant in relation to present day players than he was in relation to his contemporaries because batting expertise in general has risen in the intervening period. (Hence my lower, fully standardised, average for him.) The same point applies to all batsmen of each previous era.”

Yes, cricket has evolved over time, because of course it has. In every field of human endeavour we are faster and stronger and know more than we used to, and if we don’t then something has gone seriously wrong. We have access to tools, knowledge and technology that previous generations didn’t have and in many cases couldn’t even dream of. However, this to me is a natural function our advancement, rather than an indication that we are today inherently “better” than those who had gone before. Every college physics professor in the world has more absolute physics knowledge than Einstein had, but that doesn’t mean they are greater than he was. Einstein himself also acknowledged the debt he owed to those who had gone before with his famous quote about “standing on the shoulders of giants.”

If we are acknowledge – as we should – that the game of cricket has evolved over the course of its history then we also need to acknowledge those things which impacted players of the past which today’s champions no longer have to deal with. Pitches weren’t covered from the elements, boundaries weren’t roped in 20 yards from the fence, protective equipment protected **** all, bats didn’t have sweet spots which turned a piece of willow into a cannon, and bowlers delivered the ball from further up the pitch due to no-balls being determined by the position of the back foot, not the front. If we are going to use the evolution of the game to rate players, then we need to adjust in both directions.

And that’s without even addressing the arbitrary percentages applied to downgrade players’ “expertise” from previous eras. Why are batsmen from the 1990s downgraded 2%, but players from the 1980s by more than double that at 4.5%? What is the rationale behind the percentage decreases applied at each time period? To my point above about time period inconsistencies, why are we saying that expertise has increased by a certain (inconsistent) percentage in virtually every decade, but then there has been absolutely no increase at all in batting expertise between 2000 and 2023?

When it comes to the application of these reductions for previous players, why aren’t they consistent with the figures you’d previously allocated for each era. For example, Ken Barrington played in an era which you have said was 7.5% lower in expertise than the Present Era. However, on your ranked table in the column “Allow for Advance in Expertise”, you have adjusted Barrington’s average by 3.8%. Why not 7.5% - has that adjustment already been factored in elsewhere in another number? It was hard to tell, and throughout the list the % reduction on the table was lower than the % reduction for era which you’d nominated earlier. I may well be missing something there which you’d already explained somewhere else, but it would be good to understand why it is different.

Finally, a sense check. I always think that will studies like this one of the most interesting parts is coming to the end of your analysis and discovering something that you wouldn’t have expected to find or that casts new light on previously accepted wisdom. However, I also strongly believe that any conclusion should be balanced by a sense check, a smell test if you will – basically, does this make sense based on everything I know of the subject at hand and what could reasonably be expected? I won’t even address the top of the list, and the inclusion in an all time top ten of blokes for whom the hashtag #samplesizelol could have been invented.

Rather, I’d look further down the list for my sense check. If you conducted a study to rank Test batsmen and found that Viv Richards and Len Hutton were placed one after the other on your list, then you’d probably think that’s a fair starting point that you may well be on the right track. However, if the two consecutive positions they held on the list were 53rd and 54th, I’d reckon you might cross-reference that against your knowledge and appreciation of cricketing history and have another look at your methodology.

While I appreciate the effort which went into this, and accepting fully that I may well have gotten the wrong end of the stick on literally everything, this study screams to me to be one which started with the controversial conclusion that it wanted to reach, and then introduced ever more extreme criteria in order to reach it.

Re your sense check on findings: what improvements on my basic model do you suggest to deal with what you consider to be odd or anomalous findings? That's part of what my article is about. Alternatively, what basically different model do you propose?

TheJediBrah · Nov 2, 2023

The Sean said:
this study screams to me to be one which started with the controversial conclusion that it wanted to reach, and then introduced ever more extreme criteria in order to reach it.

Yeah this is blatantly obviously the case

pbkettle said:
Re your sense check on findings: what improvements on my basic model do you suggest to deal with what you consider to be odd or anomalous findings?

If I'm reading this right you've decided that some statistics are wrong and decided to build a method around "correcting" it? Basically the antithesis of the scientific method?

pbkettle · Nov 4, 2023

In my article on this topic (CW, 29 October), I suggested a general approach to the task - one which treats all players in a consistent way to ensure that criteria applied to a particular batter get applied to all.

I mentioned a hope that, after considering the method and findings obtained, readers may wish to introduce additional factors they consider would then produce a better reflection of batters' relative abilities at the crease. These “extensions” might be thought of as representing refinements to the basic scheme put forward, and some possibilities in my own mind were given at the end of the piece.

So I invite readers to formulate their own proposed improvements and see what these imply by incorporating them into the basic model or something similar.

Howe_zat · Nov 4, 2023

i think pbkettle is glitching out a bit

Adorable Asshole · Nov 4, 2023

Spark said:
So, like, according to this, NZ currently has no fewer than three of the greatest 30 batsmen ever to play the game currently in their Test match top order. Which by any reasonable estimation would make them one of the greatest if not the greatest top orders to ever play the game.

So my question is this: how do you explain NZ not being utterly dominant in Test cricket and steamrollering everyone they encounter by sheer weight of top order runs?

Actually no, my real question is: did you do any sanity checks on your methodology at all? While I also completely agree with all of @The Sean's criticisms, even disregarding the historical critiques the whole thing falls apart under the weight of its own arguments.

@Fuller Pilch thoughts?

honestbharani · Nov 5, 2023

Adorable Asshole said:
@Fuller Pilch thoughts?

3 is too low. It should be 10 obviously...

Fuller Pilch · Nov 5, 2023

honestbharani said:
3 is too low. It should be 10 obviously...

That's probably true since Rachin is better than Bradman and can't make the test team.

pbkettle · Nov 5, 2023

An overview of responses to my piece on standardising Test batting averages, and my comments – Peter Kettle

Firstly, I’m delighted that many responses have been posted. In commenting on them, I thought it useful to very briefly summarise the points made under distinct headings for different aspects, and give my comments in a consolidated note rather than scatter them around in many places.

(A) Views on the overall approach and main elements of the model used

TheJediBrah has this critical interpretation of the underlying aim: that I have “decided that some statistics are wrong and decided to build a method around correcting them? Basically the antithesis of the scientific method?” What I actually thought, before doing any analysis, is that raw batting averages across different generations are not strictly comparable, for a good number of reasons. Then I sought to neutralise those inter-generational factors that affect the level of averages which have nothing to do with the abilities of the batsmen themselves. In principle, the resulting adjusted averages could be either close to or far apart from the raw averages.

It is the case that I had a prior suspicion that the chasm in official (“raw”) averages exaggerated the underlying value of Bradman’s batting to his team. I sensed that - most unusually among elite batsmen - he was quite often continuing to pile up mountains of runs when a Test match had, it appeared, already been put out of the opposition’s reach. If true, the runs concerned will have contributed nothing material to the match result, as well as being gained more easily as the pressure was then off. One task was therefore to test this preliminary thought, and also to explore the potential presence of the same feature (“dead runs”) for the other leading Test batsmen being considered.

To answer a question by The Sean on the selection of batsmen for comparison, I included all batsmen who had at least 20 Test innings (plus eight others) and an average that is at least 1.0 standard deviations above the Mean value of all averages combined for a given era (ie for those with a minimum of 20 innings). The 1.0 standard deviation is then referred to as a dominance rating of 1.0, and similarly for players with averages of 1.5 standard deviations above the Mean (dominance rating of 1.5) and so on.

On the principles applied, some such as Coronis thought that in identifying dead runs I was excluding all runs that exceeded the stated thresholds – ie innings of 150 plus runs for post-WW1 eras, and 100 plus runs versus Zimbabwe and Bangladesh, with use of lower thresholds for earlier times. This wasn’t the case. What I said was that dead runs had been identified by examining these innings.

I should have made it clearer that “examining” meant assessing whether these innings at some stage crossed the point when the opposition had no more than a negligible likelihood of winning, and should also have explained that the thresholds were chosen, intuitively, as a signal for a potentially large imbalance in the team totals and hence the possibility of dead runs being scored.

Coronis thinks it would be “almost impossible” to define how many of the runs scored are “dead”. The test applied is clearly stated in my text with elaboration by footnote (v) on the first innings and by footnote (vi). Whether and when the test is satisfied calls for informed judgement, based mainly on the scorecard at different stages of a match.

Perhaps the above points might satisfy OverratedSanity who says that the notion of Dead Runs needs clarifying.

The Sean objects to applying the concept of Dead Runs because whether or not to exclude them is a matter of personal opinion. Yet so it is, for instance, with the basis of Wisden India’s “match impact” assessments and ICC’s rating of player performance, or with the qualities applied to identify the greatest Test innings of all time. Specifying the criteria that are deemed to be relevant and worth including, and deciding what relative importance is to be assigned to each of them, also involve personal opinion and hence subjectivity. (At the end of his post, The Sean hints that some regard could be paid to dead runs but they shouldn’t be completely discounted.)

No objection has been raised in principle to the key (second) element, being the concept of Relative Dominance over contemporaries and its application to translate the batting averages of past eras into equivalent averages of the present era. The Sean is puzzled by how the numbers get so much reduced (for players in post-WW1 times) in applying the dominance ratings. This is very largely because the spread of individual averages around the overall average for all players combined (the “Mean value”) has been steadily and systematically reducing over time. And so when the degree of dominance attained by a player of a past era is applied, unaltered, to the scoring context of the Present Era, the absolute increase above the overall average becomes smaller. It follows that the reduction is greater for eras the further back in time one goes.

There is acceptance by those posting that Batting Expertise (the model’s other main element) has advanced, more or less continuously, over time and so should be factored in. This is generally taken to be self-evident.

The Sean is bothered about what he sees as an undesirable implication of use of estimated advances in expertise. This is that it somehow judges Bradman adversely in relation to players of the Present Era. It actually doesn’t, as explained below. Two things are going on:

First, when Bradman is notionally transported to the Present Era with identical abilities to those he did have, if he is assumed to retain the same degree of relative dominance as he did in his own playing time, his average will be lowered considerably in absolute terms. This is because the spread of averages has been reduced over time. (Put another way, if someone today were to have that same degree of relative dominance as Bradman did in his own time, he would require a much lower average than Bradman had.)
Second, Bradman’s (unchanged) abilities, when viewed in relation to the collection of batsmen of the Present Era (ie its batsmen en masse), will not give him a degree of dominance as high as that which he achieved in his own time. This is because present day batsmen have, in overall terms, somewhat more advanced expertise than did Bradman’s contemporaries.

As to the specific number generated by the first calculation, in Bradman’s case 91.0 coming down to 70.1. This reduction is arrived at by:

determining his Dominance Rating for an average of 90.1 (after excluding his dead runs), which is 3.24 (ie 3.24 standard deviations above the overall average for his playing time),
multiplying this Dominance Rating by the value of 1 standard deviation applying to the Present Era which is 13.25: giving 42.93.
This is then added to the overall average for the Present Era being 27.15, producing the figure 70.1.

There is also comment on the issue of the specification of eras. I note The Sean refers in detail to changes in scoring levels from decade to decade, which might imply a need to disaggregate some of my eras. However, the numbers he quotes line up with those given by Zaheer Clarke in his December 2015 article The Decade Analysis. Clarke’s numbers are for the combined averages of all players, with no qualifying minimum number of innings, and his averages differ materially from the combined averages for those with 20 plus innings which I have used.

Some may feel that the 24 year Present Era is too lengthy. The Sean reckons “this era is so broad in scope that I don’t see how it is meaningful”. My basis for the division of time into individual eras is the relative stability of the overall (“Mean”) value of averages and their spread (for players with 20 plus innings). On examining these two factors over the course of the Present Era, it is found that:

Changes to the Mean value have been small: 2004-8 is 26.8; 2009-13 is 27.7; 2014-18 is 26.4 and 2019-21 is 25.8 (based on all Test players combined).
Likewise, only small changes have occurred to the coefficient of variation, denoting the extent of spread of averages around the Mean value. For the same periods it is: 0.79, 0.76, 0.78, 0.85.

(The associated standard deviations being: 21.3, 21.0, 20.7, 21.9)

(B) Agreements with my findings for specific batsmen

These are brief and usually playful comments - by Ataraxia, ma1978 and by Adorable Asshole.

(C) The intriguing case of Don Bradman

Some accept the finding of Bradman being part of a continuum and having a number of players close to him. Shortpitched713 considers his standardised average is “very realistic” and asks about the biggest factor in his downgrading (from 99.94 down to 66.96).

Excluding “dead runs” (done first) accounts for 9 of the 33 deduction, transferring his dominance rating to the scoring context of the Present Era accounts for as much as 21, and the subsequent improvement in general batting expertise contributes the remaining 3. (This is noted under the heading Combined Impacts and can be derived from the first table.)

If “dead runs” are not discounted, Bradman’s standardised average would be 76.7.

Elsewhere, Shortpitched713 says “I find it (the finding for Bradman) quite realistic instinctively, given his actual mortal coil”. This last point is one I emphasise – see text under the graph of the resulting top 25 batsmen.

(D) Reservations/ disagreements with my findings

These are mainly based on "generally accepted” views, and sometimes it seems on intuition, about the relative abilities of batsmen – occasionally, simply by assertion (Honestbharani).

I much like this trenchant phrasing of Spark: “…the whole thing falls apart under the weight of its own arguments”. Though I don’t have any feel for what this is getting at! You’ll have to enlighten me – please.

Spark asks whether I did any sanity checks on my methodology? My answer: methodology was considered sound in its principles, as far as it went. It did produce a few odd/perplexing results, but that is all part of the process. Iterate to refine the model - which is to come - by adding variables or imposing constraints (eg Not Outs must be below X% for all players, or number of innings must, in general, be at least, say, 18 but allow for very special cases).

In similar vein, Srbhkshk observes that I made no attempt “to make the (resulting) list look even remotely like the generally accepted wisdom”. I’m keen to know where the “generally accepted” ordering of prominent Test batsmen is to be found (other than raw averages). And, if such a list really does exist, why should it continue to prevail?

(E) Suggested improvements to my approach

Use of models that are of a basic nature can be expected to produce some results that may be considered odd/anomalous/perverse. Hence my hope that identifying these cases might now, or in due course, lead to suggested improvements by participants.

Shortpitched713 wants to exclude those batsmen having only a few innings - so altering the top ten, and excluding the little known player Taslim Arif as well as Barry Richards; also wants to exclude Voges, who had 31 innings though quite a high proportion of Not Out innings (22.6%). The Sean makes the same point about few innings.

On further reflection, I’d retain Barry Richards for all time as a worthy special case for the reasons mentioned (if not Arif or Voges).

Some other potential improvements that I think worth exploring were noted at the end of my piece.

(F) Are attempts to standardise averages across generations worth making?

Line and Length reckons that all these attempts are doomed to fail, and that “artificially contrived averages (ie standardised across eras) are even more meaningless” than raw averages; both are of little value because averages “aren't the be all and end all when comparing players” (a point I fully agree with). And do, please, let me know of the various attempts to standardise batting averages that you have come across; some may well be new to me.

Whilst Owzat contends that due to the many difficulties encountered, the “only sensible way to compare is to take players in one era, and compare the averages of the time”. Yes, there are many differences in playing conditions, the balance of the contest between bat and ball, etc. But these things are what the use of dominance ratings is designed to get around (as explained in my article).

My own test on worthwhile-or-not is to compare the ranking of batsmen on the raw and standardised averages and consider which gives a more “acceptable/realistic” picture in overall terms. And then try to deal with the oddities, either by trying to modify the model used or making an ad hoc repositioning of a player based on broader evidence (eg a player’s average for all first-class matches) – a next step.

Final comment: I wonder if any participants have views about the merit/demerit of Charles Davis approach to the same standardisation task in his book, The Best of The Best.

(The Sean refers to Davis’ study in his post, but doesn’t offer a view on its soundness or value.)

The Sean said:
Ok, I’ve had a re-read of this so as to try to get to grips with it. I’m specifically avoiding pithy one-liner responses because it’s clear that a lot of effort has gone into this analysis and I respect anyone who puts in the hard yards to develop something like this, and it deserves a proper and considered response.

Obviously, the (scarcely believable) reduction in Bradman’s numbers is the striking headline, which I’ll come to but don’t want to focus on at the expense of everything else. Rather, I have questions generally about the methodology and what seem to be inconsistencies to me but which I may not have fully understood.

To start with, dead runs. Of course, I understand that we can find examples where a team has batted on and on beyond the point of really needing to, but is that really so common as to literally exclude all instances of particularly high scoring throughout the entirety of Test history? As cricket fans, we often lament the fact that players don’t go big when the opportunity presents itself, and yet here you have invented a criterion which seems specifically designed to do nothing except statistically punish players for doing so. To exclude all runs beyond 100/135/150 scored by any batsman in any innings they have played makes absolutely zero sense in a study literally focused on batting averages, and would do so only if you told me that Mark Waugh was a paid consultant on this exercise.

Let’s take just as one example Hanif Mohammad’s epic 337 against the West Indies in Bridgetown in 1958, which saved Pakistan from what seemed a certain heavy defeat. Pakistan had been rolled very cheaply in their first innings in the face of a huge West Indian total, and were forced to follow-on nearly 500 runs behind. If Hanif gets out for 150 there, Pakistan lose and lose huge. Why on earth should his last 187 runs be discounted as “dead”? In fact, I’d say that each run he scored beyond 150 was more valuable, not less.

Second, the dominance rating could do with more clarity (at least from my not-very-smart perspective). For starters, just the inclusion alone of certain players who have played so few innings is bound to skew the analysis. Barry Richards was surely a magnificent player, but seven Test innings can’t possibly be enough of a sample size for a study that presents itself as a statistical one. Graeme Pollock, it should be noted, played in that same series and scored more runs than Richards at a better average (albeit with one of those innings being a lot of “dead” runs according to your criteria). The inclusion of Taslim Arif is even more incongruous – he played only ten Test innings, and made 42% of his entire Test aggregate in just one of them! And yet there he is, looking down on almost everyone from fourth spot. If Richards and Taslim are included, why not Andy Ganteaume or Kurtis Patterson? And Lawrence Rowe must be kicking himself for not retiring after his first Test.

However, more than who is and is not included, what I don’t follow is how the dominance ratings are first applied and then adjusted. Notwithstanding the removal of dead runs which I have already discussed, I read your passage explaining the theory behind calculating the dominance rating, but can’t see any source numbers which show clearly why an individual batsman has a specific dominance rating and what his baseline 1.0 number is. Again I'd argue that the "dead runs" part of this analysis doesn't tally with the goal as we are looking at Test averages in the context of dominance over their peers, and yet are actively removing the huge scores which would indicate the kind of true dominance we are looking for.

Beyond even that though, is that I can’t see how or why the older batsmen have had their numbers reduced by so much in order to align with what their “equivalent” would be in the Present Era. To use an example that you note in your piece:

“A batsman who exhibits the same degree of dominance today as Bradman did in his own time would not require nearly such a high batting average as he achieved. Instead of an average of 91.0 (excluding dead runs), this reduces by 23%, to become 70.1 to allow for subsequent deflation in the level and spread of averages; and is further reduced by 4.5%, down to 66.9, to reflect the general increase in batting skills since Bradman’s playing days.”

Once again, I may have missed where this is clearly explained by why does Bradman (or someone like him) have his average reduced by 23% - what is the rationale for that specific number? It’s a colossal reduction, particularly given you are already removing “dead runs” (which Bradman made many, many more of than anyone else by proportion due to his almost inhuman ability to go big), and also applying the arbitrary “expertise” reduction (which I’ll come to later) as well. It seems like an excuse to apply reduction upon reduction for unclear reasons in order to reach a number you want it to.

You reference Charles Davis’ study during your piece, noting areas where he adjusted or downgraded Bradman and others in his standardisation that you haven’t, and have been “fairer” in your analysis. And yet Davis’ final conclusion – with Bradman’s standardised average of 84.5 – was that from a statistical point of view he really shouldn’t exist, being so many standard deviations from the mean, and that it would be reasonable to expect a player as statistically dominant as Bradman to emerge in the game of cricket approximately once in every 200,000 years. How has your exercise – which you suggest is fairer and more reasonable – reduced Bradman’s number by so much more than Davis did to the point that he’s not an outlier at all?

Third, the eras seem inconsistent. The Present Era as defined by the study encompasses a period of nearly 24 years, which is at odds with some previous eras being confined to a single decade. One of the core tenets of this study is that everything before 2000 is to be standardised to the baseline of what it would be in the Present Era, and yet this era is so broad in scope that I don’t see how that is meaningful or even possible. We’ve seen enough studies of standardised batting averages over the years to have concluded that Test batting conditions in recent times are very different to those of the early 2000s (nearly a quarter of a century ago!) but this exercise places it all in the same bucket.

For previous eras, the amount of time designated as an “era” varies widely. In many cases, it is a simple calendar decade. But then we also have the period after WWII which when designating eras to include batsmen encompasses a full 21 years inclusive from 1946-66, but when deciding on expertise % is split in half. Why? The few years after the war saw an explosion in runscoring to an all time peak. The 1950s then, as we know, saw a concerted effort to make pitches in most of the world more bowler friendly with a corresponding drop in batting averages. Then the 1960s saw a rebalance between bat and ball that took global averages back up past the 30 mark where they have remained more or less ever since (barring a dip into the 29s during the 1990s).

The blokes batting in the mid-1960s aren’t the same guys who were batting in the late-1940s, nor were they batting in necessarily comparable conditions to all the guys in between, so what is the rationale behind a 46-66 era – or why the expertise splits it with quite a significant percentage difference in 1955? Why not stop it at the end of a decade? The next era is 13 years inclusive from 67-79, followed by simple calendar decades again for the ‘80s and ‘90s. If that shortening is because of the sheer increase in cricketing volume over time – more cricket was played by more cricketers in later decades – then I get that, but it subsequently renders completely senseless the idea of a “Present Era” encompassing a period nearly two and a half times longer than those immediately before! I don’t see how you can meaningfully baseline or compare when the parameters are seemingly so arbitrary and inconsistent.

Fourth, expertise advancement. This is one of my pet hates in regard to cricketing discussions generally, not just this particular piece, because it so often relies on the concept of teleporting a player from the past without warning decades into the future and then judging him on that. You make this point during your piece where you write:

“…if Bradman were to be transported to the Present Era with his demonstrated abilities unchanged, he would be somewhat less dominant in relation to present day players than he was in relation to his contemporaries because batting expertise in general has risen in the intervening period. (Hence my lower, fully standardised, average for him.) The same point applies to all batsmen of each previous era.”

Yes, cricket has evolved over time, because of course it has. In every field of human endeavour we are faster and stronger and know more than we used to, and if we don’t then something has gone seriously wrong. We have access to tools, knowledge and technology that previous generations didn’t have and in many cases couldn’t even dream of. However, this to me is a natural function our advancement, rather than an indication that we are today inherently “better” than those who had gone before. Every college physics professor in the world has more absolute physics knowledge than Einstein had, but that doesn’t mean they are greater than he was. Einstein himself also acknowledged the debt he owed to those who had gone before with his famous quote about “standing on the shoulders of giants.”

If we are acknowledge – as we should – that the game of cricket has evolved over the course of its history then we also need to acknowledge those things which impacted players of the past which today’s champions no longer have to deal with. Pitches weren’t covered from the elements, boundaries weren’t roped in 20 yards from the fence, protective equipment protected **** all, bats didn’t have sweet spots which turned a piece of willow into a cannon, and bowlers delivered the ball from further up the pitch due to no-balls being determined by the position of the back foot, not the front. If we are going to use the evolution of the game to rate players, then we need to adjust in both directions.

And that’s without even addressing the arbitrary percentages applied to downgrade players’ “expertise” from previous eras. Why are batsmen from the 1990s downgraded 2%, but players from the 1980s by more than double that at 4.5%? What is the rationale behind the percentage decreases applied at each time period? To my point above about time period inconsistencies, why are we saying that expertise has increased by a certain (inconsistent) percentage in virtually every decade, but then there has been absolutely no increase at all in batting expertise between 2000 and 2023?

When it comes to the application of these reductions for previous players, why aren’t they consistent with the figures you’d previously allocated for each era. For example, Ken Barrington played in an era which you have said was 7.5% lower in expertise than the Present Era. However, on your ranked table in the column “Allow for Advance in Expertise”, you have adjusted Barrington’s average by 3.8%. Why not 7.5% - has that adjustment already been factored in elsewhere in another number? It was hard to tell, and throughout the list the % reduction on the table was lower than the % reduction for era which you’d nominated earlier. I may well be missing something there which you’d already explained somewhere else, but it would be good to understand why it is different.

Finally, a sense check. I always think that will studies like this one of the most interesting parts is coming to the end of your analysis and discovering something that you wouldn’t have expected to find or that casts new light on previously accepted wisdom. However, I also strongly believe that any conclusion should be balanced by a sense check, a smell test if you will – basically, does this make sense based on everything I know of the subject at hand and what could reasonably be expected? I won’t even address the top of the list, and the inclusion in an all time top ten of blokes for whom the hashtag #samplesizelol could have been invented.

Rather, I’d look further down the list for my sense check. If you conducted a study to rank Test batsmen and found that Viv Richards and Len Hutton were placed one after the other on your list, then you’d probably think that’s a fair starting point that you may well be on the right track. However, if the two consecutive positions they held on the list were 53rd and 54th, I’d reckon you might cross-reference that against your knowledge and appreciation of cricketing history and have another look at your methodology.

While I appreciate the effort which went into this, and accepting fully that I may well have gotten the wrong end of the stick on literally everything, this study screams to me to be one which started with the controversial conclusion that it wanted to reach, and then introduced ever more extreme criteria in order to reach it.

PLEASE SEE MY CONSOLIDATED SET OF COMMENTS AT END OF PAGE 1 (ALSO APPEARS AT END PAGE 2)

Prince EWS · Nov 5, 2023

pbkettle said:
PLEASE SEE MY CONSOLIDATED SET OF COMMENTS AT END OF PAGE 1 (they also appears at end page 2) – pbkettle

We've got it.

Stop with the all-caps identical responses.

shortpitched713 · Nov 5, 2023

The dominance rating is an interesting concept, statistically. I'll have to look into that methodology a bit more. So thanks for that pbkettle! :thumbsup:

TheJediBrah · Nov 5, 2023

pbkettle said:
What I actually thought, before doing any analysis, is that raw batting averages across different generations are not strictly comparable, for a good number of reasons. Then I sought to neutralise those inter-generational factors that affect the level of averages which have nothing to do with the abilities of the batsmen themselves. In principle, the resulting adjusted averages could be either close to or far apart from the raw averages.

Whereas you've actually done the opposite, you've adjusted averages to be less representative of the abilities of the batsmen

Adders · Nov 15, 2023

Why is this thread stickied?

srbhkshk · Nov 15, 2023

pbkettle said:
In similar vein, Srbhkshk observes that I made no attempt “to make the (resulting) list look even remotely like the generally accepted wisdom”. I’m keen to know where the “generally accepted” ordering of prominent Test batsmen is to be found (other than raw averages). And, if such a list really does exist, why should it continue to prevail?

As I mentioned in the post, I actually like your method even if there is a certain degree of subjectivity to it, so I have no reason to give for why the general wisdom should continue to prevail.
That said, your methods focus completely on era adjustment and do not take into account the different conditions batsman playing in the same era would have encountered and as such reduce the validity of the list in terms of actually making the averages comparable.

Making All-time Test Batting Averages Fully Comparable

International Regular

International Captain

International Coach

International Captain

International Vice-Captain

Request Your Custom Title Now!

Global Moderator

Cricket Spectator

Request Your Custom Title Now!

Cricket Spectator

Audio File

International Regular

Whatever it takes!!!

Hall of Fame Member

Cricket Spectator

Global Moderator

International Captain

Request Your Custom Title Now!

Cricketer Of The Year

International Captain