The flaw in batting averages

weldone

Hall of Fame Member
Indeed. Migara's comparing the global average of batsmen after four years with the average of a select few after fifteen years. The data is distorted by the fact that Matthew Elliott, for example, didn't play for fifteen years - De Silva's average after fifteen years should be compared to what he averaged after four, not what everyone averaged after four.

Wow, I just realised I explained it the same way you did. It really makes a lot more sense in one's head. :D
Exactly my point...presented in a simpler way I feel...
 

Migara

International Coach
That pattern will be for batsmen of similar calibre...You can't say Dhoni averages 45 after 4 years and DeSilva 35 after 15 years hence average decreases by 10 in 11 years...You have to compare Dhoni with Dhoni and DeSilva with DeSilva to know how averages wane...Hope this time I've made it simpler...
No. It's not the point. If Dhoni loses his average faster than De Silva, that will tell us De Silva was a better player than Dhoni. But this is the performance of a standardized batsman, who's neither a legend nor a rabbit. We are comparing all the players to this standardized batsman. As shown by the R² of 0.40, it only describes 40% of the relationship. But 40% is a moderately strong relationship in stats. There may be outliers, but with stats, that's always possible.
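For readers unsure what an R² of 0.40 measures: it is the share of the variation in the observed averages that the fitted line accounts for, leaving 60% unexplained. A small sketch of the calculation, using made-up numbers chosen only so the result comes out at 0.40 (the function name `r_squared` and both lists are assumptions, not Migara's data):

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Total variation of the observed values around their own mean.
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    # Variation left over after the trend line's predictions are subtracted.
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_res / ss_tot

# Made-up career averages and made-up trend-line predictions (not data from this thread).
observed = [20, 30, 25, 35, 40]
predicted = [13, 31, 31, 43, 40]

r2 = r_squared(observed, predicted)
print(round(r2, 2))
```

Note that R² says how closely the points follow the line; it does not say whether the line was built from a comparable set of batsmen at each point.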
 

weldone

Hall of Fame Member
Indeed. The conclusion assumes that everyone plays the same exorbitant number of innings. To correctly measure how a player improves or declines over time, only the data of the players who have actually survived over that time should be considered.

To use a crude example, if there are only two batsmen in history - one who plays 100 games and scores 20 in all of them, and one who plays 50 games and scores 40 in all of them - the global average after 50 games is 30, and the global average after 51 games is 20. It'd be silly to draw the conclusion that players typically decline extremely sharply at game 51, and Migara's data uses the same flawed logic on a bigger scale.
Even simpler this time...
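The crude two-batsman example quoted above can be checked in a few lines of Python. This is only a sketch: the dictionary layout and the function name `global_average_at_game` are assumptions made for illustration.

```python
# Hypothetical data for the crude example: each batsman scores the same in every game.
careers = {
    "long_career": [20] * 100,   # 100 games, 20 every time
    "short_career": [40] * 50,   # 50 games, 40 every time
}

def global_average_at_game(careers, game_number):
    """Mean score in a given game (1-indexed) across batsmen still playing."""
    scores = [s[game_number - 1] for s in careers.values() if len(s) >= game_number]
    return sum(scores) / len(scores)

avg_50 = global_average_at_game(careers, 50)
avg_51 = global_average_at_game(careers, 51)
print(avg_50, avg_51)
```

Neither batsman's scoring changes at any point; the drop from game 50 to game 51 comes entirely from the second batsman leaving the sample.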
 

Prince EWS

Global Moderator
No. It's not the point. If Dhoni loses his average faster than De Silva, that will tell us De Silva was a better player than Dhoni. But this is the performance of a standardized batsman, who's neither a legend nor a rabbit. We are comparing all the players to this standardized batsman. As shown by the R² of 0.40, it only describes 40% of the relationship. But 40% is a moderately strong relationship in stats. There may be outliers, but with stats, that's always possible.
The standard batsman doesn't play for fifteen years though, so your data after however many innings that is will represent a different sample to your data of batsmen after one innings. You've created an unwanted variable.
 

weldone

Hall of Fame Member
Then, the point I highlight will be further strengthened. I always used to think a legend in ODIs should have a long, long career. As you suggest, if the wane is still more for a typical batsman, adjusted averages of people like Miandad, Tendulkar, Jayasuriya, De Silva and Inzi will jump up even further. That's why I also used the number of innings to dampen the effect.
Yes my friend, we all agree to this point...There's no qualms about the fact that a great player should have a long career...The point is different as I and Prince are saying in other posts...

However, one shouldn't use number of innings, but should use number of years, for several reasons...One of them is that some countries like Australia and India play more ODI cricket than some countries like England...So, an English player playing for 10 years will almost certainly play fewer innings than an Indian player playing for 10 years, although that doesn't belittle the English player.
 

Migara

International Coach
Indeed. Migara's comparing the global average of batsmen after four years with the average of a select few after fifteen years. The data is distorted by the fact that Matthew Elliott, for example, didn't play for fifteen years - De Silva's average after fifteen years should be compared to what he averaged after four, not what everyone averaged after four.

Wow, I just realised I explained it the same way you did. It really makes a lot more sense in one's head. :D
Exactly not the case. I am extrapolating average of a batsman after playing for 10 years (who played for 4 years already) using the global trend.

The trend for each batsman will differ, and it's unknown as well. All these differences add up and make up the trend line. The next best thing we can do is to extrapolate using the global trend.

On the other hand, I can interpolate what the average of a batsman would be if he had played for 4 years rather than 10, using the global trend. Once again the trend for each batsman is different, but they all even out onto the trend line.

So basically I am not comparing averages of players with short careers with those of players with long careers one to one.
 

Migara

International Coach
However, one shouldn't use number of innings, but should use number of years, for several reasons...One of them is that some countries like Australia and India play more ODI cricket than some countries like England...So, an English player playing for 10 years will almost certainly play fewer innings than an Indian player playing for 10 years, although that doesn't belittle the English player.
Yes. That's one of the problems. I used the median of both because there are players who have made comebacks after long periods, giving them a long career but very few matches.

Standardizing the number of innings by multiplying by matches per year will solve that problem IMO.
 

weldone

Hall of Fame Member
No. It's not the point. If Dhoni loses his average faster than De Silva, that will tell us De Silva was a better player than Dhoni. But this is the performance of a standardized batsman, who's neither a legend nor a rabbit. We are comparing all the players to this standardized batsman. As shown by the R² of 0.40, it only describes 40% of the relationship. But 40% is a moderately strong relationship in stats. There may be outliers, but with stats, that's always possible.
My friend you don't realize that this standardized batsman is becoming better and better as he is playing more and more...After 30 innings this standardized batsman is like Misbah and Raina...Suddenly after 300 innings this standardized batsman is starting to play like Ponting and Tendulkar...
 

weldone

Hall of Fame Member
Yes. That's one of the problems. I used the median of both because there are players who have made comebacks after long periods, giving them a long career but very few matches.

Standardizing the number of innings by multiplying by matches per year will solve that problem IMO.
No...For those who made a comeback after a long period, count only the years in which they actually played when measuring longevity...Say a batsman plays in 1997 after 1993...deduct those 3 years in between and that solves the problem...number of innings is not of huge importance...
 

Migara

International Coach
To use a crude example, if there are only two batsmen in history - one who plays 100 games and scores 20 in all of them, and one who plays 50 games and scores 40 in all of them - the global average after 50 games is 30, and the global average after 51 games is 20. It'd be silly to draw the conclusion that players typically decline extremely sharply at game 51, and Migara's data uses the same flawed logic on a bigger scale.
Don't be silly. When the number of observations increases, the "sharp" decline you are talking about will get less and less sharp. With 1200+ observations, most of those "sharp" edges will smooth out.

To put it in other words, there may be batsmen who'll deviate a lot from the line. They are few in number. There will be more and more batsmen as you come towards the trend line. Stats is all about that: ignoring a few outliers to describe the huge majority.
 

Prince EWS

Global Moderator
So basically I am not comparing averages of players with short careers with those of players with long careers one to one.
You are; you just don't quite realise it.

The trend for each batsman will differ, and it's unknown as well. All these differences add up and make up the trend line. The next best thing we can do is to extrapolate using the global trend.

On the other hand, I can interpolate what the average of a batsman would be if he had played for 4 years rather than 10, using the global trend. Once again the trend for each batsman is different, but they all even out onto the trend line.
Yeah, I realise that. What I'm proposing is that the method you use to find the global trend is incorrect and accounts for batsmen who are totally irrelevant.

If you want to compare the global average of batsmen at innings 7 and innings 27 and hence calculate the trend that occurs across this time, what you should be doing is removing all those who never played 27 innings from your data, as they distort the results. Your trend result takes them into account at one point of your graph and not at another.
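A minimal sketch of that filtering step, assuming each batsman's record is just a list of innings scores. The records are made up, the names `average_after` and `trend_between` are hypothetical, and not-outs are ignored, so "average" here simply means runs per innings.

```python
def average_after(scores, n):
    """Runs per innings over the first n innings (not-outs ignored for simplicity)."""
    return sum(scores[:n]) / n

def trend_between(records, start, end, survivors_only):
    """Change in mean career average between innings `start` and innings `end`."""
    at_end = [s for s in records if len(s) >= end]
    # Either reuse the same cohort at both points, or take everyone who reached `start`.
    at_start = at_end if survivors_only else [s for s in records if len(s) >= start]
    mean_start = sum(average_after(s, start) for s in at_start) / len(at_start)
    mean_end = sum(average_after(s, end) for s in at_end) / len(at_end)
    return mean_end - mean_start

# Made-up records: every batsman scores the same in every innings, so nobody improves.
records = [[35] * 30, [30] * 27, [10] * 7, [5] * 8]

mixed = trend_between(records, 7, 27, survivors_only=False)
cohort = trend_between(records, 7, 27, survivors_only=True)
print(mixed, cohort)
```

With everyone included at innings 7 the "trend" shows a 12.5-run improvement; restricted to those who reached innings 27 it shows none, which is the true picture for this toy data.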
 

Prince EWS

Global Moderator
Don't be silly. When the number of observations increases, the "sharp" decline you are talking about will get less and less sharp. With 1200+ observations, most of those "sharp" edges will smooth out.

To put it in other words, there may be batsmen who'll deviate a lot from the line. They are few in number. There will be more and more batsmen as you come towards the trend line. Stats is all about that: ignoring a few outliers to describe the huge majority.
The point of the above example though is that no-one has declined or improved at all - we've merely seen a change in the batsmen making up the data. It exploits a flaw.

If you want to measure the trend of decline or improvement, you can only take into account those who completed the full journey. If someone is averaging 17 and they suddenly get removed from the data, that's going to have an effect. Even if no-one actually improves, an improvement will appear because that batsman is no longer weakening the averages. As the number of innings goes up, more and more batsmen will cease to make their team and hence you will get more and more of an exaggerated improvement (or under-stated decline, in fact).
 

weldone

Hall of Fame Member
Stats is all about that: ignoring a few outliers to describe the huge majority.
I know that...And there lies the problem, after a certain number of innings those outliers become the only available data in this case...We don't get data for an ordinary batsman after 200 innings because an ordinary batsman doesn't play that many innings...
 

Migara

International Coach
My friend you don't realize that this standardized batsman is becoming better and better as he is playing more and more...After 30 innings this standardized batsman is like Misbah and Raina...Suddenly after 300 innings this standardized batsman is starting to play like Ponting and Tendulkar...
Let's put it in an easier way. A batsman who averages 5 will last about one and a half years. A batsman who averages 30 may have played for 6 years, or may have played for 18 years. The one that played for 6 years is improving. The one that played for 18 years is getting worse. Does it defy common logic?

This standardized batsman will achieve a maximum average of about 33 after playing for 10 years. After that he'll start to get worse. So he'll never be playing like Tendulkar or Ponting, who average 40+.

This also tells us that if you calculate the average of all batsmen who played for a certain amount of time, the highest average will be among batsmen who played 10-12 years.
 

weldone

Hall of Fame Member
Don't be silly. When the number of observations increases, the "sharp" decline you are talking about will get less and less sharp. With 1200+ observations, most of those "sharp" edges will smooth out.

To put it in other words, there may be batsmen who'll deviate a lot from the line. They are few in number. There will be more and more batsmen as you come towards the trend line. Stats is all about that: ignoring a few outliers to describe the huge majority.
The point Prince made was not about the 'sharp edge'...Go through it once again and you'll understand...
 

Migara

International Coach
I know that...And there lies the problem, after a certain number of innings those outliers become the only available data in this case...We don't get data for an ordinary batsman after 200 innings because an ordinary batsman doesn't play that many innings...
That's exactly the situation, but the weight of the runs of batsmen who are in the middle of the graph will make that outlier effect minimal.
 

weldone

Hall of Fame Member
This also tells us that if you calculate the average of all batsmen who played for a certain amount of time, the highest average will be among batsmen who played 10-12 years.
So, according to you averaging 40 after 1 year is tougher than averaging 40 after 10 years...Now you are making it laughable Migara...
 

Migara

International Coach
So, according to you averaging 40 after 1 year is tougher than averaging 40 after 10 years...Now you are making it laughable Migara...
An average of 40 after one year is nowhere near the line. That batsman is a clear outlier. You can't use outliers to extrapolate.
 

weldone

Hall of Fame Member
An average of 40 after one year is nowhere near the line. That batsman is a clear outlier. You can't use outliers to extrapolate.
O my God, I am astonished that you are not getting a simple point after so many posts...batsmen playing for 10-12 years in your plot average better than batsmen playing for 1 year because they were better batsmen, and not because playing well for 10-12 years is easier...If even now you are not getting it then I am sorry, I can't make it simpler, I have my limitations...
 

Migara

International Coach
The maximum errors will occur at the ends of the line. So we can basically forget extrapolations for batsmen who played less than two years. But we can interpolate with no difficulty if the target playing period is roughly more than two years.

The slope of the curve in the first two years is absurd, but that's due to the assumption I made that it should go through the origin.

If you treat it purely mathematically and discard cricketing sense, an intercept of about 16 results. The curve still has the same shape, only less convex.
 
