cricrate: new cricket ratings website

OverratedSanity · Jun 1, 2015

Dan said:
Would it be better to consider Laxman's entry point as 1/-225, rather than 1/50?

Exactly how it should be considered imo.

viriya · Jun 1, 2015

Dan said:
Would it be better to consider Laxman's entry point as 1/-225, rather than 1/50?

But then you have the case where you give undue credit for irrelevant innings in an innings defeat.. it's an interesting idea but I feel that it's going to be impractical to calibrate how 1/-225 compares with say 50/3.

Dan · Jun 1, 2015

Then factor exit point into it too -- an innings that takes you from 1/-225 to 9/-100 is still wholly irrelevant, whereas 1/-225 to 5/150 is hugely valuable.
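A rough sketch of how the entry/exit idea above could be scored, purely for illustration. Nothing here is cricrate's actual method: `MatchState`, `position_value`, `situation_credit` and the `runs_per_wicket` and `scale` constants are all made-up names and uncalibrated guesses. The idea is just to put a number on the position at entry and at exit and credit only the improvement.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchState:
    wickets_down: int  # wickets lost in the current team innings (0-10)
    lead: int          # runs ahead (+) or behind (-), so 1/-225 is MatchState(1, -225)


def position_value(state: MatchState,
                   runs_per_wicket: float = 20.0,   # assumed, not calibrated
                   scale: float = 150.0) -> float:  # assumed, not calibrated
    """Hypothetical 0..1 strength of the batting side's position."""
    wickets_left = 10 - state.wickets_down
    # Crude projection: current lead plus a flat allowance per wicket still in hand.
    projected_lead = state.lead + wickets_left * runs_per_wicket
    # Logistic squash so hopeless and dominant positions both flatten out.
    return 1.0 / (1.0 + math.exp(-projected_lead / scale))


def situation_credit(entry: MatchState, exit_: MatchState) -> float:
    """Credit only the improvement between entry and exit, never below zero."""
    return max(0.0, position_value(exit_) - position_value(entry))


entry = MatchState(1, -225)
futile = situation_credit(entry, MatchState(9, -100))   # still behind, nearly all out
valuable = situation_credit(entry, MatchState(5, 150))  # well ahead, wickets in hand
print(f"1/-225 -> 9/-100: {futile:.3f}")
print(f"1/-225 -> 5/150:  {valuable:.3f}")
```

With these placeholder constants the first knock gets no situation credit and the second gets a large one, which is the ordering described above. How such a number should be weighed against something like 50/3 in a first innings is exactly the calibration problem viriya raises.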

OverratedSanity · Jun 1, 2015

Dan said:
Then factor exit point into it too -- an innings that takes you from 1/-225 to 9/-100 is still wholly irrelevant, whereas 1/-225 to 5/150 is hugely valuable.

Exactly. Because this isn't followed in viriya's methodology, Laxman's 167, which was an excellent knock but in the end meaningless, is rated higher than the far greater and more meaningful 281. The reason McCullum's 302 and VVS's 281 are amazing is because, as I said before, they dragged the team from a hopeless position all the way to a dominant one. Neither was a high-quality but futile hit-and-giggle session that was never going to change anything (a la Astle's 222 or the VVS 167).

viriya · Jun 1, 2015

Dan said:
Then factor exit point into it too -- an innings that takes you from 1/-225 to 9/-100 is still wholly irrelevant, whereas 1/-225 to 5/150 is hugely valuable.

OverratedSanity said:
Exactly. Because this isn't followed in viriya's methodology, Laxman's 167, which was an excellent knock but in the end meaningless, is rated higher than the far greater and more meaningful 281. The reason McCullum's 302 and VVS's 281 are amazing is because, as I said before, they dragged the team from a hopeless position all the way to a dominant one. Neither was a high-quality but futile hit-and-giggle session that was never going to change anything (a la Astle's 222 or the VVS 167).

Good points, I'll think about it - seems like a good way to give credit for hopeless -> winning knocks.

weldone · Jun 1, 2015

Dan said:
Then factor exit point into it too -- an innings that takes you from 1/-225 to 9/-100 is still wholly irrelevant, whereas 1/-225 to 5/150 is hugely valuable.

perfect

Top_Cat · Jun 1, 2015

Red Hill said:
I like your passion Viriya, but I reckon the flaw in these sorts of methods is that there are SO MANY factors to be taken into account. Highlighting one factor that might be important might mask another equally important flaw or factor. In the end, a great innings is a great innings, and as much as I like rankings (because I like orderly lists) the only real way to "rank" innings is to watch them and make an individual decision. Looking at everything statistically or mathematically has serious limitations and always will.

Yeah. It's screaming for some sort of feature extraction/ML exercise from commentary text. Runs scored/wickets taken, although unbiased, are not very descriptive when it comes to really understanding the value of an innings. Tweaking/weighting/etc. just adds extra components to what are data with very poor (by scientific standards) predictive ability. For Tests without good text data, could even run ASR on TV commentary and analyse the text using ML methods. The computational firepower required for something like this in analysing even one Test would be enormous, though. Know a bloke who used lecture audio to build language models for use in his own ASR software and he needed a cluster for any given series of 1-hour lectures so you can imagine what it would take for 6 hours/5 days of it....

Even if you could do it (it's possible), gotta wonder to what end. The insights might not be any better than experienced watchers of the game.

RossTaylorsBox · Jun 1, 2015

ASR for cricket matches would be a pretty great research topic. I wonder how feasible it would be to generate scorecards and highlights packages.

Top_Cat · Jun 1, 2015

OverratedSanity said:
Exactly. Because this isn't followed in viriya's methodology, Laxman's 167, which was an excellent knock but in the end meaningless, is rated higher than the far greater and more meaningful 281. The reason McCullum's 302 and VVS's 281 are amazing is because, as I said before, they dragged the team from a hopeless position all the way to a dominant one. Neither was a high-quality but futile hit-and-giggle session that was never going to change anything (a la Astle's 222 or the VVS 167).

One could disagree that Laxman's 167 was futile because it gave him the confidence to play with the freedom he did in Kolkata. His place in the side was very shaky to that point and he was developing a nasty habit of being bogged down then flashing at the sucker ball. The way he went about the 167, it wasn't just how many shots but where. Plus, more importantly, he had a way to play against the Aussie bowlers who he'd be facing in the return series that others hadn't figured out. The half-ton in the first dig in Kolkata is obviously overshadowed by the big double that came after it but, again, the style of it was a decent pointer to what was coming. I remember watching him score that knock and thinking when SWaugh enforced the follow-on that maybe someone should have had a word to the skipper before doing so. But then, after crushing India in the first Test, Waugh obviously wanted to murder them dead and not enforcing it would have overturned a few decades of Aussie tradition in those circumstances but hey hindsight, etc.

Laxman, in Sydney, nicks out to McGrath again in the second dig and none of it may have happened at all.

Uppercut · Jun 1, 2015

Top_Cat said:
Yeah. It's screaming for some sort of feature extraction/ML exercise from commentary text. Runs scored/wickets taken, although unbiased, are not very descriptive when it comes to really understanding the value of an innings. Tweaking/weighting/etc. just adds extra components to what are data with very poor (by scientific standards) predictive ability. For Tests without good text data, could even run ASR on TV commentary and analyse the text using ML methods. The computational firepower required for something like this in analysing even one Test would be enormous, though. Know a bloke who used lecture audio to build language models for use in his own ASR software and he needed a cluster for any given series of 1-hour lectures so you can imagine what it would take for 6 hours/5 days of it....

Even if you could do it (it's possible), gotta wonder to what end. The insights might not be any better than experienced watchers of the game.

I don't think that conclusion makes any sense. "Experienced watchers of the game" hold entirely different opinions on pretty much everything.

I don't think your methodology would be useful either. The point of quantitative analysis is surely to distinguish reality from bias. But both reality and bias are captured in a commentary variable. So the variable doesn't really tell you anything.

Daemon · Jun 1, 2015

viriya said:
Good points, I'll think about it - seems like a good way to give credit for hopeless -> winning knocks.

Should be done in a way such that 2nd innings knocks aren't given a disproportionate advantage over 1st innings knocks though.

Top_Cat · Jun 1, 2015

Uppercut said:
I don't think your methodology would be useful either. The point of quantitative analysis is surely to distinguish reality from bias. But both reality and bias are captured in a commentary variable. So the variable doesn't really tell you anything.

What you're talking about is the basis of statistical machine learning. That's the thing it's meant to do with text and, given, it's not easy. The limitations of data from descriptive/structured datasets have been known for a long time, can only mine and manipulate them so many ways. What I'm talking about, feature extraction and learning from large-scale, unstructured datasets (read: text data with no numbers) is what Google, Microsoft and Twitter have made billions from and what the humanities, medicines, etc. have well and truly woken up to with regards the sometimes hundreds of years of text data many are busy computerising. It is, of course, an evolving science, need to carefully curate and evaluate data, etc. but you might be hard-pressed to convince them they're wasting their time.

OverratedSanity · Jun 1, 2015

Top_Cat said:
One could disagree that Laxman's 167 was futile because it gave him the confidence to play with the freedom he did in Kolkata. His place in the side was very shaky to that point and he was developing a nasty habit of being bogged down then flashing at the sucker ball. The way he went about the 167, it wasn't just how many shots but where. Plus, more importantly, he had a way to play against the Aussie bowlers who he'd be facing in the return series that others hadn't figured out. The half-ton in the first dig in Kolkata is obviously overshadowed by the big double that came after it but, again, the style of it was a decent pointer to what was coming. I remember watching him score that knock and thinking when SWaugh enforced the follow-on that maybe someone should have had a word to the skipper before doing so. But then, after crushing India in the first Test, Waugh obviously wanted to murder them dead and not enforcing it would have overturned a few decades of Aussie tradition in those circumstances but hey hindsight, etc.

Laxman, in Sydney, nicks out to McGrath again in the second dig and none of it may have happened at all.

Thinking way too much into it imo. The 167 might have done all you mentioned, sure.

It does not, however, make it in isolation, a great innings. An important innings for VVS, yes. But an ATG knock better than the 281 or 302? Nah, not close.

Days of Grace · Jun 1, 2015

viriya said:
Ok so of the factors considered for batting innings (http://www.cricrate.com/test/batting/index.php):
1. Runs scored: 281, big +ve
2. Not out: nil, minor factor either way
3. Percentage of total: 42% of total, good but not 50%+ as is the case for most great innings
4. Bowling quality: The 4 main bowlers were McGrath, Warne, Gillespie and Kasprowicz. This was during Warne's mid-career low and only McGrath (with 872) had a very good rating. The average rating comes out to be ~600 which is good but not great
5. Point of entry: 52/1, openers set a decent platform, even if the new ball isn't deadly in Eden Gardens, McGrath is most effective then.
6. Wickets at crease: Laxman is 5th out, so he's there for 3 wickets - not significant
7. Support: Dravid makes 28% of the team's total, so Laxman doesn't get much from this factor because he had very good support
8. Strike rate: 62.17, very good but minor factor
9. Location: Home, no bonus
10. Match status: Follow-on under pressure, big +ve
11. Result: This ended Australia's 16 match win-streak, but it was also only the start of Australia's decade-long dominance. The team rating was 148 before the match which is similar to what SA and Australia are rated right now - very good but not the amazing levels (175-200) the Aussies reached in the mid-2000s.
12. Close match: It actually didn't end up being a close match, and this only really affects the 4th innings so irrelevant
13. Milestone: Double hundred, minor credit

So with all these factors, the innings is rated at 2631, which misses the top 100 by a small margin.

I'm curious if you guys think there are other factors that should be considered. IMO I do think Laxman's innings deserves a top 100 spot - just that it's not so clear cut. I think it's slightly overrated because of the fact that it ended a streak and winning after a follow-on rarely happens. For instance, if Harbhajan had not won them the game, would we still consider this such a great innings?

It's not the factors, it's the weighting you give each one IMO. I have Laxman at no.4 in my system but one thing I have done that is different from you is I give points for the highest partnership a batsman is involved in. Therefore, players like BJ Watling can get significant credit for supporting McCullum's 300.

Also, I calculate the significance of the innings not only for the team innings, but for the entire match. Therefore, a double century is seen as the significant innings of the match and not just a large score in a large team innings. Likewise, a 60* in a team score of 100 but in a match aggregate of 1500/35 loses a lot of points compared to if you just rated the significance compared to his teammates in the same team innings.

Therefore, with my method, one can see how Laxman's 281 would be rated significantly higher than his 167 which, incidentally, doesn't make my top 100.

Hope this makes sense.
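To put numbers on the 60* example above, here is a small sketch comparing the same score against its own team innings and against the whole match. This is not my actual formula, just an illustration: the helper names `share` and `runs_vs_average` are invented for it, and it assumes the side making 100 was all out.

```python
def share(runs: int, total_runs: int) -> float:
    """Fraction of a run total contributed by one innings."""
    return runs / total_runs


def runs_vs_average(runs: int, total_runs: int, total_wickets: int) -> float:
    """Innings score as a multiple of the average runs per wicket."""
    return runs / (total_runs / total_wickets)


# 60* in a team score of 100 (assumed all out, so 10 wickets fell).
team_share = share(60, 100)
vs_team_avg = runs_vs_average(60, 100, 10)

# The same 60* measured against a match aggregate of 1500/35.
match_share = share(60, 1500)
vs_match_avg = runs_vs_average(60, 1500, 35)

print(f"share of team innings: {team_share:.2f}, share of match: {match_share:.2f}")
print(f"vs team runs/wicket: {vs_team_avg:.1f}x, vs match runs/wicket: {vs_match_avg:.1f}x")
```

Measured against his own team's innings the knock looks dominant; measured against a high-scoring match it is only a little above the going rate per wicket, which is why it loses points.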

Uppercut · Jun 1, 2015

Top_Cat said:
What you're talking about is the basis of statistical machine learning. That's the thing it's meant to do with text and, given, it's not easy. The limitations of data from descriptive/structured datasets have been known for a long time, can only mine and manipulate them so many ways. What I'm talking about, feature extraction and learning from large-scale, unstructured datasets (read: text data with no numbers) is what Google, Microsoft and Twitter have made billions from and what the humanities, medicines, etc. have well and truly woken up to with regards the sometimes hundreds of years of text data many are busy computerising. It is, of course, an evolving science, need to carefully curate and evaluate data, etc. but you might be hard-pressed to convince them they're wasting their time.

Hey I have a lot of time for textual analysis as a field. But it's not being used for this type of question. Where I am it's used to track how the media drives asset-price bubbles, for example. What you suggested is comparable to trying to use it to determine whether asset prices were accurate, by compiling the opinions of well-respected financial columnists. That would be complete nonsense.

No doubt if you had such a dataset there'd be a lot of interesting stuff you could do with it. Just not that. Or rather, I personally wouldn't find it at all convincing.

Top_Cat · Jun 1, 2015

I don't think we're talking about the same things. If I'm right, what you're talking about is more in the predictive analytics/data mining domain whereas what I'm talking about is more under the machine learning banner. There's a lot to say about this sort of stuff but I feel like it'll derail the thread.

viriya · Jun 1, 2015

Days of Grace said:
Also, I calculate the significance of the innings not only for the team innings, but for the entire match. Therefore, a double century is seen as the significant innings of the match and not just a large score in a large team innings. Likewise, a 60* in a team score of 100 but in a match aggregate of 1500/35 loses a lot of points compared to if you just rated the significance compared to his teammates in the same team innings.

I actually disagree with this.. Each innings in a match has its own shape. A ton made in the 4th innings on a 5th day is more valuable/tougher than 150-200 made in the 1st in a lot of cases..

Days of Grace · Jun 2, 2015

viriya said:
I actually disagree with this.. Each innings in a match has its own shape. A ton made in the 4th innings on a 5th day is more valuable/tougher than 150-200 made in the 1st in a lot of cases..

Yes, so then you also factor in the match situation/significance of the innings.

viriya · Jun 2, 2015

Days of Grace said:
Yes, so then you also factor in the match situation/significance of the innings.

I already do.. anyway, adding that factor wouldn't make a huge difference in Laxman's case. More weight to bowling quality + possible credit for backs-to-the-wall efforts using "effective point of entry" possibly could.

longranger · Jun 2, 2015

I don't mean to sound like a broken record, but if the VVS 281 is not a part of a Top 10 batting performance list, a lot of people (including me) will fail to have interest in that list. Whilst I'm not asking you to totally reverse engineer to ensure this knock has a top-10 rating, you definitely have to relook at your criteria. And if it doesn't find a place in the Top 100, well...
