Cutler or McCown? The NFL’s Sample Size Problem


Photo Credit: Chicago Bears Official Site

There are 16 games in an NFL season. That is not breaking news for you, I’m sure. But it creates a problem when evaluating player performances; as anyone with a passing interest in sabermetrics can tell you, 16 games for a baseball player is not nearly enough to tell you anything statistically significant. But for football, that’s what we’re stuck with. Obviously the two sports are not strictly analogous; a quarterback has an active role in many more plays in any given game than any single baseball player would. So in that sense, we can mine more data on a per-game basis than we can for a baseball player, and that has some merit, for sure; I’m not saying that there are zero valuable statistics in football. I think there are multiple metrics that can be used to discern how well a player performed in any given game, or over the course of the season.

But I think the problem comes into play when people attempt to use stats derived from an inherently small sample to predict player performance. This phenomenon is currently playing a major role in the ongoing media debate surrounding the potential return of Jay Cutler. By any reasonable metric, Josh McCown has been very good, and that includes the “eye test.” By all the same measures, Jay Cutler was very good as well, when healthy. But due to a variety of factors, chief among them the obvious coolness of McCown’s story and the general dislike of Cutler (itself a tired and by now obsolete narrative), there has been a growing chorus of voices who think McCown should remain the starter regardless of Cutler’s health situation.

I can understand that urge. No one ever wants to rock the boat, and the Bears have waited so long for a competent offense that it’s easy to want to handle it gingerly, like a baby bird, afraid to break something so seemingly fragile. But when you consider that the offense didn’t miss a beat with the transition to McCown, I think it’s fair to speculate that they wouldn’t miss a beat with the transition back to Cutler, and if any bumps were to arise I doubt they’d be too jarring. Which leaves the “McCown is just better” crowd, and those people have to be basing their case strictly on McCown’s performance this season; before this year, he had done little that would lead you to believe he was capable of playing as well as he’s played this season. And make no mistake about it, he’s playing well. His performance Monday night tied for the second-highest rated quarterback performance of the season according to ESPN’s QBR metric. (Which isn’t a perfect stat, but it’s better than passer rating, in my opinion. I’ll also note that if a Cowboy catches either of the two balls McCown threw right to defenders, or if the refs don’t throw a flag for defensive holding on a ball that was actually intercepted, things might look a little different.) But here we run into the problem with the sample size, as McCown has started just five games. Jay Cutler has started and finished six games, and he only played poorly in one of them, the Week 4 loss to Detroit. McCown’s body of work is more recent, and his performance Monday was the best performance by either quarterback this season, but it also came against the league’s worst passing defense. Context matters.

McCown’s performance has also been buoyed by the very impressive play of his skill position players, notably the suddenly crazy exploits of Alshon Jeffery. Marc Trestman’s game-planning, play-calling, and game-management have all been very, very good. Those things would be there for Jay Cutler, as well; McCown is not doing things that Cutler is incapable of doing. (Well, maybe the spinning, Elway-like dive into the end zone; I’m not sure Cutler’s ankle would be up for that one.) But because McCown has played nearly a half-season’s worth of games, it becomes easy for fans, media, and players alike to view that as a large enough sample to justify an ongoing entrenchment as the starter.

But that’s not enough. Data collected from 5.5 games just isn’t enough to be used to gauge future performance. On Twitter, I equated this “QB controversy” to an above-average MLB player missing two weeks, while his below-average replacement filled in and played well. That drew questions as to the validity of that analogy, and I’d like to try to clear that up a bit. As was noted, 5 games in the NFL roughly corresponds to 50 MLB games in terms of share of the season. But it does not correspond to 50 games in terms of statistical sampling. (I’ll also note that 50 games is still a small sample by baseball standards.) It’s not a difficult difference to illustrate; football teams within the past decade have finished the regular season with every possible record, from 16-0 to 0-16. Teams in the NFL routinely go on lengthy winning streaks; the Chiefs began this season at 9-0. But no MLB team has ever gone 162-0, or 0-162. No team wins or loses 90 games in a row. The NFL is often praised for its parity, the idea that any team could beat any team at any time, or that any team could make the playoffs in any given year.

Both of those ideas are true of the NFL, and the first is true in baseball as well. The second isn’t, because the sample size is much greater. If the MLB season were 16 games, any team could make the playoffs. (I’m using team results to compare the leagues, but the sample size issues work for analyzing individuals as well.) The sports are different, and it’s not an apples-to-apples comparison; the nature of baseball success is more random, which certainly plays into this discussion, although not to a degree that makes me think my greater point is incorrect. But there’s no way that five games can tell us anything useful about McCown’s likelihood of sustaining success, any more than Rex Grossman’s various multi-game hot streaks told us about his ability to be consistently successful.
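For anyone who wants to see the sample size point in numbers, here is a rough sketch in Python. It is not a model of either league; it assumes a hypothetical dead-average team that wins each game independently with probability 0.5 (the function names and the 0.5 figure are mine, chosen only for illustration) and asks how far luck alone can push that team's record over 16 games versus 162.

```python
import math

def win_pct_std(p, games):
    """Standard deviation of season win percentage for a team that
    wins each game independently with probability p (an assumption)."""
    return math.sqrt(p * (1 - p) / games)

def prob_at_least(p, games, wins):
    """Binomial probability that the team wins at least `wins` games."""
    return sum(
        math.comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )

TRUE_TALENT = 0.5  # assumption: a perfectly average team

for games in (16, 162):
    spread = win_pct_std(TRUE_TALENT, games)
    # Wins needed for a .625 record or better: 10 of 16, or 102 of 162.
    target = math.ceil(0.625 * games)
    chance = prob_at_least(TRUE_TALENT, games, target)
    print(f"{games} games: spread {spread:.3f}, "
          f"P(at least {target} wins) = {chance:.4f}")
```

Under those assumptions, the average team finishes 10-6 or better in roughly 23 percent of 16-game seasons, while the equivalent 102-win pace over 162 games happens well under 1 percent of the time. Real games are not independent coin flips, so treat this as an illustration of how noise shrinks with sample size rather than as a forecast.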

I think the fact that Marc Trestman has been unwavering in his support of Cutler should give you a good idea as to who should start. If you wanted to read into it further, you could probably look at it as insight into their organizational belief in Cutler. This is a team that is fighting for a playoff spot, coached by one of the more analytically minded men in the league, who works for one of the more analytically minded general managers. Neither of them was involved with acquiring Cutler (who will be a free agent after this year), and they know how well McCown has played. If there were ever a time for the Bears to make a move similar to San Francisco’s benching of Alex Smith last season, this would be it. But all indications are that if Cutler can play, he’ll play. That should tell you more about the gulf between their abilities than anything I can say, and certainly more than anything six games’ worth of stats could tell you.

It’s also important to note that if Jay comes back and doesn’t play well, that doesn’t mean they made the wrong call. The results might not have been what they wanted, but if they believe that a healthy Cutler gives them a better chance to win than McCown would give them, they have to do it. Cutler not playing well wouldn’t automatically mean that McCown would have been better. But let’s hope we won’t have to worry about that particular narrative twist.

Anyway, it’s a good problem to have, isn’t it? Beats the hell out of debating Grossman vs. Orton.

Jay Rigdon is the editor and lead writer at Bleacher Nation Bears, and can also be found @BearsBN on Twitter.