Wednesday, October 19, 2005

fun with RRI

I took the RRI for pool A (my pool) and added some variability to see how the pool might play out. First, the expectations:

RRI:
FG 2636
DoG 2610
CL 2563
PBR 2459

Scores:
FG 15 DoG 14.3
FG 15 CL 13.1
FG 15 PBR 10.5
DoG 15 CL 13.8
DoG 15 PBR 11.1
CL 15 PBR 12.3

Next, I assumed a 100-point standard deviation in what the true RRI was for a team, and a 100-point sd for a game between any two teams (implying each team has a 71-point sd in a given game, since 100/sqrt(2) is about 71). Is that the right amount of variability? Too much? Too little? I don't know. It's probably too much, I guess. Over the range we're dealing with, the relationship is roughly linear: about 39 points of RRI for each point of margin in a game to 15.
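As a rough check on the numbers above, here is a minimal Python sketch of the conversion from an RRI gap to an expected score. The function and constant names are mine, not part of RRI; the only inputs taken from the post are the ratings, the 39-points-of-RRI-per-game-point figure, and the 100-point game sd.

```python
import math

# Figures quoted in the post.
RATINGS = {"FG": 2636, "DoG": 2610, "CL": 2563, "PBR": 2459}
RRI_PER_POINT = 39.0  # approx. RRI points per game point, in a game to 15
GAME_TO = 15.0
GAME_SD = 100.0       # assumed sd of a single game, in RRI points


def expected_loser_score(rri_favorite, rri_underdog):
    """Expected underdog score when the favorite wins 15-x (linear approximation)."""
    return GAME_TO - (rri_favorite - rri_underdog) / RRI_PER_POINT


# If two independent, equally variable team performances combine into a
# 100-point game sd, each team contributes 100 / sqrt(2) points.
PER_TEAM_SD = GAME_SD / math.sqrt(2)

if __name__ == "__main__":
    teams = list(RATINGS)
    for i, fav in enumerate(teams):
        for dog in teams[i + 1:]:
            score = expected_loser_score(RATINGS[fav], RATINGS[dog])
            print(f"{fav} 15 {dog} {score:.1f}")
    print(f"per-team sd: {PER_TEAM_SD:.0f}")
```

Run as a script, this reproduces the expected scores listed earlier (for example FG 15, DoG 14.3) and the 71-point per-team figure.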

Then I ran the pool 30 times. The results:

PP% is % making it to the power pools.
Finish    1    2    3    4   PP%
FG       13   11    5    1   80%
DoG      12    9    7    2   70%
CL        5    8   12    5   43%
PBR       0    2    6   22    7%
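For anyone who wants to try this at home, below is a sketch of one way to run the pool in Python. It is not the exact procedure used for the table above. Assumptions that are mine: the noise is normally distributed, ties in wins are broken by total point margin (the real tiebreakers are more involved), and PP% counts 1st- and 2nd-place finishes, which is consistent with the percentages in the table. The game margin is not capped at 15.

```python
import itertools
import random

RATINGS = {"FG": 2636, "DoG": 2610, "CL": 2563, "PBR": 2459}
TEAM_SD = 100.0       # sd of a team's true RRI around its listed RRI
GAME_SD = 100.0       # sd of a single game result, in RRI points
RRI_PER_POINT = 39.0  # approx. RRI points per game point


def simulate_pool(rng):
    """Play one round-robin pool; return the teams ordered 1st to 4th."""
    # Draw each team's "true" strength for this run (normal noise is an assumption).
    true_rri = {t: rng.gauss(r, TEAM_SD) for t, r in RATINGS.items()}
    wins = {t: 0 for t in RATINGS}
    margin = {t: 0.0 for t in RATINGS}
    for a, b in itertools.combinations(RATINGS, 2):
        # The observed RRI gap in this game, then converted to points.
        gap = rng.gauss(true_rri[a] - true_rri[b], GAME_SD)
        points = abs(gap) / RRI_PER_POINT
        winner, loser = (a, b) if gap > 0 else (b, a)
        wins[winner] += 1
        margin[winner] += points
        margin[loser] -= points
    # Simplified tiebreaker (assumption): wins first, then total point margin.
    return sorted(RATINGS, key=lambda t: (wins[t], margin[t]), reverse=True)


def run_pools(n_runs, seed=0):
    """Count how often each team finishes 1st, 2nd, 3rd and 4th."""
    rng = random.Random(seed)
    finishes = {t: [0, 0, 0, 0] for t in RATINGS}
    for _ in range(n_runs):
        for place, team in enumerate(simulate_pool(rng)):
            finishes[team][place] += 1
    return finishes


def power_pool_pct(finishes, n_runs):
    """Percentage of runs in which each team finished in the top two."""
    return {t: 100.0 * (c[0] + c[1]) / n_runs for t, c in finishes.items()}


if __name__ == "__main__":
    n = 30
    results = run_pools(n)
    pct = power_pool_pct(results, n)
    for team, counts in results.items():
        print(team, *counts, f"{pct[team]:.0f}%")
```

With only 30 runs the counts move around a lot from seed to seed, so a different seed will not reproduce the table above exactly.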

Oddly, there wasn't a single three-way tie for 1st in those 30 runs, and only two occasions where there was a two-way tie (one of the teams lost to PBR). It took another 10 runs for a three-way tie for 1st, although PBR was involved. It wasn't until run #53 that Chain won a three-way tie, thanks to a 15-9.7 win over FG.

PBR finally won the pool in run #36, eking out two 1-point wins and a 2-point win, while Furious turned out to be a lot worse than anyone thought (true RRI of 2363 instead of 2636; maybe there were a lot of visa problems or the flight with MG/Shank/Drew got cancelled).

I'm tempted to run the full tournament, but the tiebreakers and crossovers are just a little too conditional to do it simply.

4 comments:

Anonymous said...

Jim:

Couldn't you figure out a standard deviation for each team's score relative to its predicted score, based on its RRI prior to the tournament?

I am not sure it's a good assumption that all teams have the same standard deviation. Also, I am not sure that the std deviation should be the same against teams of differing strengths (as measured by RRI).

I guess you could plot a curve for each team, with the team's performance (Team Score - Opponent's Score) on one axis and the difference in the RRIs (Team RRI going into the tournament - Opponent's RRI) on the other.

I think that would give you a better measure of how the teams are likely to match up against each other. It would take into account the fact that some teams crush weaker opponents, while others have a tendency to play down to their level.

It's been years since I took statistics, but I would be curious how that approach would work.
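One way the per-team curve described in this comment could be sketched is an ordinary least-squares line through a team's games, with the scatter around that line as the team's own sd. This is only an illustration: the function name is hypothetical, a straight line is an assumption (the comment says "curve"), and it needs the raw game data, which is not shown here.

```python
import math


def fit_margin_curve(rri_diffs, margins):
    """Fit margin = intercept + slope * rri_diff for one team's games.

    rri_diffs: team RRI minus opponent RRI, one entry per game
    margins:   team score minus opponent score, one entry per game
    Returns (slope, intercept, residual_sd).
    """
    n = len(rri_diffs)
    if n != len(margins) or n < 3:
        raise ValueError("need at least 3 games, with one margin per RRI difference")
    mean_x = sum(rri_diffs) / n
    mean_y = sum(margins) / n
    sxx = sum((x - mean_x) ** 2 for x in rri_diffs)
    if sxx == 0:
        raise ValueError("all RRI differences are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(rri_diffs, margins))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sd: how much the team's results scatter around its own line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(rri_diffs, margins))
    residual_sd = math.sqrt(sse / (n - 2))
    return slope, intercept, residual_sd
```

A team that matched the 39-points-of-RRI-per-game-point rule exactly would come out with a slope near 1/39 and an intercept near 0. A steeper slope would suggest a team that crushes weaker opponents, and a flatter one a team that plays down to its competition.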

parinella said...

Justin,

Good points, all. But doing what you suggest would take a) access to the raw data and b) work, a lot more than would be merited by the accuracy of any results.

I looked at some college teams and found that their average sd from tournament to tournament was about 100 (maybe 75), and surmised that club teams would vary more since they have jobs and lives and don't all travel together. DoG had an sd of 70 this year, but also an upward trend of 23 points per tournament, so we ought to play at 2632 at Nationals.

Anonymous said...

The wind is usually much heavier at Nationals than at a typical tournament.
Are the fields set up so it is mostly a crosswind game, an upwind-downwind game, or a mixture? Crosswind games would make upsets less likely, whereas upwind-downwind games would make an upset more likely.

,Bob Koca

parinella said...

The fields are crosswind, if the winds are typical. Yes, that means that the better team will win more often. You can also play games to 1 if you are interested in seeing upsets.

Conceptually, I actually wouldn't mind if the games were shorter in order to make upsets more likely, but the idea of making the games shorter by playing them upwind/downwind doesn't appeal to me. In a real high wind game, or in a game with almost no turnovers, it's almost as if you're playing to 2 or 3, if you count only breaks as scores. As Tarr said a couple months ago, the game is most interesting when a team has about a 50% chance of scoring.