The point is that tracking the ball isn't as simple as first thought (i.e. just keep your eye on the ball). It's not purely perception (i.e. faulty vs awesome eyes) or purely processing (processing speed once the signal leaves the cortex); it's both and then, on top of that, cognition. I'm no expert in this area but I believe it's well established that peripheral vision has a much wider field of view and better motion detection than the fovea, because it is dominated by rods rather than cones. The trade-off is that, being rod-dominant, it doesn't perceive detail as well: cones handle detailed/high-resolution perception, which is why the fovea is cone-dominant. So peripheral vision is more sensitive and works better at night but has a lower resolution than the fovea.
Getting into fuzzier territory: because more of the visual cortex is devoted to what the far smaller fovea 'sees' (>50%), the brain keeps up processing speed by using a heap of short-cuts and making educated guesses about the speed, direction, etc. of a moving object perceived by the peripheral vision. This means it's more likely to make mistakes and, as I said, by the time the ball passes from the peripheral vision to the fovea and you're seeing how the ball is really moving, it's probably too late to back out of the shot (assuming you're going to play one).
All this is independent of what your brain does with the info once it comes in. A quick calculation suggests that, against a bloke who lets go of the ball at 150 km/h, you've got about 450ms to play a shot. It takes 80ms (from memory, heh) for the signal to go from your eyes to the visual cortex, so that leaves 370ms to bounce that signal around your pre-frontal cortices and motor cortices and get your arms and legs moving to do something with the ball. It's fairly well established that a lot of the physical side is achieved with pre-emptive movement built up over years of practice (450ms is nowhere near enough time otherwise) but still, it's not a lot of time. And, of course, this assumes your brain only perceives and reacts to one signal, which is obviously untrue (i.e. you track the ball down the pitch, taking in new information as it comes).
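For anyone who wants to play with the numbers, here's a rough sketch of that back-of-the-envelope calculation. The assumptions are mine, not measured: the ball travels about 17.7m from release to the batsman (roughly popping crease to popping crease on a 22-yard pitch) and holds its release speed the whole way. The function names are made up for illustration, and the 80ms is the same from-memory figure as above.

```python
# Rough time budget for facing a fast delivery.
# ASSUMPTION: travel distance of ~17.68 m (popping crease to popping crease).
# ASSUMPTION: the ball keeps its release speed for the whole flight
# (no drag, no pace lost off the pitch), so this is a lower bound.

RELEASE_TO_BAT_M = 17.68   # assumed travel distance in metres
EYE_TO_CORTEX_MS = 80.0    # eye-to-visual-cortex delay, quoted from memory


def flight_time_ms(speed_kmh, distance_m=RELEASE_TO_BAT_M):
    """Time in milliseconds for the ball to cover distance_m at constant speed."""
    speed_m_per_s = speed_kmh * 1000.0 / 3600.0
    return distance_m / speed_m_per_s * 1000.0


def time_budget_ms(speed_kmh, latency_ms=EYE_TO_CORTEX_MS):
    """Flight time minus the assumed eye-to-cortex delay."""
    return flight_time_ms(speed_kmh) - latency_ms


if __name__ == "__main__":
    for speed in (130, 140, 150):
        print(f"{speed} km/h: flight {flight_time_ms(speed):.0f} ms, "
              f"left after eye-to-cortex delay {time_budget_ms(speed):.0f} ms")
```

With those assumptions 150 km/h comes out at roughly 424ms of flight and about 344ms after the 80ms delay, so the same ballpark as the 450ms/370ms above. The ball loses some pace through the air and off the pitch, so the real figure should be a bit longer than this constant-speed estimate.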
Processing speed doesn't differ too much from person to person so, to me, there are a couple of bottlenecks: how your brain deals with the hand-over from peripheral to foveal vision as the ball gets closer, and the numerous decision-making phases it goes through along the way. Maybe the best players just go through fewer decision-making cycles and make the decision to play earlier, maybe they do have faster processing, maybe their hand-over between the different parts of the eye is smoother, who knows? This is also assuming a fairly homogeneous physiological response to a stressful event (i.e. facing someone bowling 150 km/h), which obviously would not be the case from person to person.
They talk a lot about this in aviation, actually, although obviously on a timescale of seconds/minutes rather than milliseconds. If you're in a stressful situation, the latest research suggests the best approach is to make a decision early and go all the way with it rather than second-guessing yourself all the way down to the ground. Second-guessing = delays and, under stress, the info feeding subsequent decisions will already be warped by your stressed state, which leads to more bad decisions, more stress, more warped info, etc. Maybe that's what separates the best from the rest: the ability to decide early that they have all the info needed to play the ball correctly, so they just go with it. Maybe, at the millisecond scale, that's what people really mean when they say 'play your natural game', because filling your head with other stuff (including new info as the ball flies down the pitch) causes delays/errors.