Stats do come up more often in cricket than other sports though. You might, for example, point to Ponting's record against India as a reason why you think Oz will lose this series. You'd never point to Jonny Wilkinson's poor kicking record against a given team as a reason why England won't win the Six Nations. Maybe it is overdone a bit in cricket: can't imagine that in any stats-related job 8 matches played would be considered anywhere near a reasonable number of trials, yeah?
See what you're saying, and if you were designing an experiment where you had to decide how many matches in one country is 'fair', probably not. However, as so often with real-world data, you're not there at the start of the experiment, especially in a longitudinal 'study' like the career of a batsman. Excuse the nerdery (and apologies if you've seen all this before).
In terms of statistical analysis, a batsman's career is something resembling an experimenter's nightmare. There are so many factors to consider when coming up with a number which means something, and they're impossible to control for, so it becomes an issue of sampling. Okay, so let's concentrate on taking a representative sample. Considering the complexities involved, the obvious (geez, ONLY) choice is a stratified random sample. Too easy, now we have to define the strata.
If your criterion is the number of matches he's played, you might pick matches in proportion to the country he's played in. Okay, so 9 matches in India out of 120 = around 8% of all his matches played. Multiplying that share by the 9 matches gives you the number of matches to pick from the 9 (around 0.7; since we're talking whole Tests here, you round up and randomly select 1 match from his 9 in India). Repeat for all countries he's played in. One more example is Australia; 68 Tests in Australia is ~57% of the total, which equates to taking a random sample of ~39 matches from that 68.
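For anyone who wants to see that arithmetic spelled out, here's a rough Python sketch of the allocation described above. The function names are mine, the match IDs are just placeholder integers, and lumping the remaining 43 Tests into a single "Elsewhere" stratum is purely an assumption to make the totals add up to 120; in practice you'd have one stratum per country.

```python
import math
import random

def stratum_sample_size(matches_in_country, total_matches):
    """Matches to draw from one country: (share of career) x (matches there), rounded up."""
    share = matches_in_country / total_matches
    return math.ceil(share * matches_in_country)

def stratified_sample(matches_by_country, seed=None):
    """Randomly draw the allocated number of matches from each country (stratum)."""
    rng = random.Random(seed)
    total = sum(len(matches) for matches in matches_by_country.values())
    return {
        country: rng.sample(matches, stratum_sample_size(len(matches), total))
        for country, matches in matches_by_country.items()
    }

# Placeholder match IDs only; "Elsewhere" is a hypothetical catch-all stratum.
career = {
    "India": list(range(9)),
    "Australia": list(range(68)),
    "Elsewhere": list(range(43)),
}

sample = stratified_sample(career, seed=1)
for country, picked in sample.items():
    print(country, len(picked))
```

The India and Australia strata come out at 1 and 39 matches respectively, matching the figures above. Note that the rounding-up rule is just the one used in this post; other allocation rules exist.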
Soon enough, you have what you think is a representative sample based on the countries he's played in. Awesome! Oh wait.....
Here's a non-exhaustive list of the factors which are uncontrolled and impact on whether the sample is representative. They seem basic but impact on each stratum:
- Significantly different people comprising bowling attacks within the strata
- Significantly different performance of the bowlers he faces multiple times, based on irreversible changes to their ability through age, injury, wear-and-tear, etc. (and, in reality, each of those impacts on a bowler at a different rate)
- Significantly different bowler types comprising bowling attacks within the strata
- Significantly different atmospheric conditions within the same match, let alone between matches, let alone over the course of all the Tests Ponting has played in any given country
- Significantly different pitch conditions within matches, let alone between matches, let alone over the course of all the Tests Ponting has played in any given country
- Significantly different times of day when the runs are scored within the same match, let alone between matches, let alone over the course of all the Tests Ponting has played in any given country
- Significantly different condition of the balls when runs are scored within the same match, let alone between matches, let alone over the course of all the Tests Ponting has played in any given country
Etc., etc. And, of course, none of those factors acts in isolation; they pretty much all interact. Experimental reality is that you cannot assume a sample is representative until huge factors such as the above are controlled for. Or you could pick different strata, but then you'd still have to control for a bunch of factors and their interactions.
That's why I say an arithmetic mean is a blunt-force measure. It doesn't take into account any of the above factors (by definition, being number of runs scored/completed innings) and, essentially, is reliant upon all of the above factors having been encountered in equal measure and smoothed out for a fair analysis of a batsman's career. Even if your career has been as long as that of, say, Steve Waugh, Ricky Ponting, etc., you can't say with certainty that this is the case. There are far more complex phenomena analysed than cricketers' careers, but then they're generally analysed with an incredible number of caveats attached, use a heap more data to ensure the sample is representative and, in the case of demographic stats, take decades.
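To make the "blunt-force" point concrete, here's a small Python sketch of the average as defined above (runs scored divided by completed innings). The two lists of innings are entirely made up for illustration, and the function name is my own; the only point is that very different sets of scores collapse to the same number, and nothing about the conditions they were made in survives.

```python
def batting_average(innings):
    """Runs scored divided by completed innings (i.e. innings where the batsman was out)."""
    runs = sum(score for score, dismissed in innings)
    completed = sum(1 for score, dismissed in innings if dismissed)
    if completed == 0:
        raise ValueError("average is undefined with no completed innings")
    return runs / completed

# Hypothetical innings as (runs, was_dismissed); not real scorecards.
boom_or_bust = [(120, True), (0, True), (30, True)]
steady = [(50, True), (45, True), (15, False), (40, True)]

print(batting_average(boom_or_bust))
print(batting_average(steady))
```

Both lists give the same average even though the innings behind them look nothing alike, and that's before you even try to account for who was bowling, the pitch, the ball, and so on.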
This is why a cricketer's average, for me, is merely a guide and arguing over % points in deciding who's better is a waste of time for all involved. We're talking really basic experimental techniques here; arguing over averages is just completely wrong. Could write about this for days, really......