Friday, April 10, 2015
Friday, January 17, 2014
Monday, August 26, 2013
In the early 1990s, my teams (Earth Atomizer and Big Brother) recorded every pass of the season, entering them in a notebook using a shorthand notation during games, and some friends and I would compile them afterwards. Among other things, we found that forehands were thrown away about 50% more frequently than backhands, about 60% of hucks were complete (except for a certain anti-stat hothead who went 4 for 16), and 1.5% of passes were dropped. We did use that first piece of knowledge (coupled with “scouting” observations) about forehands to decide to force forehand most of the time. But what else did we gain for all that time spent?
I have come around to believe that for the time being, the concrete value of tracking individual statistics to predict or to evaluate is doomed by two things, context and sample size. We tried to make one adjustment for context, namely, separating out “tough games” from “chump games”. But then that fed into the second issue, sample size, since we now had fewer games to draw from. And was the line between “tough” and “chump” in the right place? Some of the games were tough because of bad playing conditions, others because we just played badly or exceptionally well in a game that would normally be a blowout, and still others because we had a skeleton crew. Oh, and some teams played zone and forced the handlers to pile up twice as many throws as usual (without, I hope you realize, playing twice as well). But we counted them all equally.
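To put rough numbers on the sample-size problem, here is a minimal sketch that computes a 95% Wilson score interval for a rate such as turnover percentage. The Wilson interval is a standard choice swapped in for illustration, not something used in the original stat-keeping. The function name `wilson_interval` and the 7% rate in the loop are hypothetical.

```python
import math

def wilson_interval(successes, attempts, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    p = successes / attempts
    denom = 1 + z * z / attempts
    center = (p + z * z / (2 * attempts)) / denom
    half = z * math.sqrt(p * (1 - p) / attempts
                         + z * z / (4 * attempts ** 2)) / denom
    return center - half, center + half

# Hypothetical player who turns it over about 7% of the time:
# see how the plausible range narrows as the number of throws grows.
for throws in (100, 400, 1600):
    turnovers = round(0.07 * throws)
    low, high = wilson_interval(turnovers, throws)
    print(f"{turnovers}/{throws}: {low:.3f} to {high:.3f}")
```

Splitting a season into "tough" and "chump" games shrinks `attempts` for each bucket, which widens every interval.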
(I should also add that “opportunities” is something that must be accounted for when trying to analyze. But it’s not always as simple as dividing by the number of touches. In the seminal basketball analytics book “Basketball on Paper”, Dean Oliver (now in charge of stats at ESPN) highlighted that player efficiency decreases with increased usage as the players who bear the brunt of the offensive load have to make plays that are closer to the margin. Furthermore, these players will also draw the toughest defenders.)
But we could still tell who was good at completing passes, right? Well, as I like to say at my job where I analyze my company’s engineering performance, it depends. They recorded individual stats on last year’s NexGen tour. I was very excited to get this dataset, because every game was against a quality opponent, almost everyone played almost every game, and each game was a showcase and not just one of many in a long weekend. As it turns out, this dataset too suffers from some confounders, such as the first half of the tour being spent figuring out how to play together and what roles to settle into, but it was still the purest dataset I know of. For the complete tour, turnover percentage* of the players ranged from 3.4% to 12.4%. But for the most part, the guys at the higher end of the turnover range also threw a higher percentage of their passes for goals, while the low-turnover guys didn’t throw as many goals. Here’s the graph for all of them, split out by how often they touched the disc per point:
*They didn’t separate out drops from throwaways so we’ll have to use this instead of incompletion rate.
Note also that the high touch players were in the lower left corner. I can think of two explanations for this besides them being conservative handlers. One, an in-bounds pull almost always results in an uncontested completed pass. Two, passes in general in that half of the field are typically easier to complete because the defense has to respect the threat of the long pass, and I think that handlers have a higher percentage of the touches there than they do closer to the endzone, where pass frequency is more evenly distributed. At the other end of the graph, deeps are going to be catching more of their passes near the endzone, resulting in relatively more opportunities for goal throws but also with each completion a little more difficult because of the reduced space. The risk/benefit of a few extra yards changes near the goal line as well. I wrote in “Ultimate Techniques and Tactics”, a book co-authored with Eric Zaslow and published by Human Kinetics and still available through your favorite Internet reseller, that being in the endzone instead of just on the goal line increases your chance of scoring as much as being 10 yards closer elsewhere on the field.
I once set up a simulation of an offense where the players were equally talented (i.e., had the same incompletion rate per yard of throw) but had different roles in the offense and different throw choices. The first thing I noticed is that a particular player sometimes had MVP-level tournaments and sometimes had tournaments where he would have been benched. The more important point, though, was that the players’ stat lines resembled those of real teams such as NexGen, with some players racking up the goals and turnovers while others had lots of touches but few fantasy league stats. This leads me to conclude that much of the difference between the stat lines of any two players is not a difference in effectiveness but simply a matter of taste. (Note that there are still some players who stand out, either good or bad, but you generally don’t need a calculator to know that.) Two equally-efficient players can have drastically different stat lines due not to any difference in skill or on-field decision-making but to the difference in their roles.
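The original simulation isn't reproduced here, but the idea can be sketched under loudly stated assumptions: every player misses at the same hypothetical per-yard rate (`PER_YARD_MISS`), and roles differ only in a made-up mix of throw lengths (`ROLES`). None of these numbers come from real data.

```python
import random

PER_YARD_MISS = 0.004  # hypothetical: identical skill for every player

# Hypothetical throw mixes as (yards, share of that role's throws).
ROLES = {
    "handler": [(5, 0.70), (15, 0.25), (40, 0.05)],
    "deep thrower": [(5, 0.30), (15, 0.40), (40, 0.30)],
}

def completion_prob(yards, per_yard_miss=PER_YARD_MISS):
    """Chance a throw is completed if each yard carries the same miss risk."""
    return (1 - per_yard_miss) ** yards

def expected_turnover_rate(role):
    """Long-run turnover rate implied by a role's throw mix."""
    return sum(share * (1 - completion_prob(yards))
               for yards, share in ROLES[role])

def simulate_tournament(role, throws, rng):
    """Observed turnover rate over one tournament's worth of throws."""
    lengths = [yards for yards, _ in ROLES[role]]
    shares = [share for _, share in ROLES[role]]
    turnovers = 0
    for yards in rng.choices(lengths, weights=shares, k=throws):
        if rng.random() > completion_prob(yards):
            turnovers += 1
    return turnovers / throws

rng = random.Random(1)
for role in ROLES:
    observed = [simulate_tournament(role, 100, rng) for _ in range(5)]
    print(role, round(expected_turnover_rate(role), 3), observed)
```

Even in this toy version, the two roles end up with different long-run turnover rates despite equal skill, and a single role's observed rate swings from one 100-throw tournament to the next.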
So what do these detailed individual stats (at this stage in our history, where we have only the stats of our own team against a wide range of opponents in vastly different environments) bring to the table? Accountability and self-awareness. Lord Kelvin wrote, “If you cannot measure it, you cannot improve it.” Simply being aware of your actual completion percentage on hucks should force you to contemplate whether you are making good choices. I remember going over each of my turnovers in a weekend (with the aid of the stat pad) and being shocked to learn how many of them were simply poor risk/reward decisions, and I was able to eliminate some of those.
Lest you think I’ve given up on stats, I haven’t. But I think the payoff for now would come on analyzing team decisions. The first priority would be to get realistic baselines for performance. I routinely see people write that five turnovers in a game is typical or that drops never happen or that hucks are completed 75% of the time. While there are certainly examples of these happening, I would guess that they aren’t the typical performance. The other area is quantifying the value of particular scenarios. For instance, how much harder is it for a team to score off a deep, high pull vs a low pull vs a brick, and how consistent are good pullers at achieving good pulls? Could someone who is an otherwise bad defender still be a good D player simply by virtue of his pulls? On the offensive side, exactly how costly is it to rest one of your top players? How deadly is it to turn it over in your own half of the field? Might the Huck-‘n’-Hope offensive style actually be a reasonable strategy due to the long field left after a turnover? We might have opinions about those now, but until we measure these, we don’t know.
Friday, March 01, 2013
Thursday, July 07, 2011
But by springtime, I was back to 100%, though of course 100% ain't what it used to be. I did some sessions with a personal trainer through an online coupon, then I discovered a cardio/core group workout in town and have been going once or twice a week since then. Add in the usual basketball/softball/tournaments/other workouts and I'm actually feeling pretty strong these days (again, see above 100% comment).
Because it was free, I applied for the World Championship of Beach Ultimate team, and got picked for the Masters team. When applying, I thought that I probably wouldn't go if selected, but once the selection actually happened, I got a bit stoked about it, so I'll be heading to Italy this August.
My frisbee season kicked off at another Italian beach tournament, Paganello, which is like Spring Break but with a four-day ultimate tournament thrown in. I played again with the team known this year as Los Rabbit. We had 17 players, up from about 11 two years ago when we lost in the finals as Los Ox. (The team won last year as Los Tiger but I couldn't make it.) This time I spent the day in Milan on my way there and walked around the city. I'm always impressed by the huge churches, in this case the Duomo, which when built was supposed to be able to accommodate all 40 000 of the city's inhabitants. As always, hanging out with friends and taking part in the event's festivities are a large part of the tournament. We had cocktail hour at the seaside hotel every night, including one night where the hotel had a wine and cheese party for its guests (we assumed at first that there was a private function, but then we found out it was for us, fresh off a late game). The big tournament party as always featured lots of people wearing weird costumes to fit the theme.
This was the tournament where I felt most like a role player. I belonged on the team, and I could have played more without the team getting worse as a result, but I could have also played less without the team getting worse. PT was fairly even in pool play (we never called subs), and I was moving and playing very well. Prior to the quarterfinals, for some still undetermined reason, I completely hit the wall and felt like I was running in very thick and deep sand. I couldn't even play without feeling like I couldn't make it through the point if we turned it. (I did get a layout block early but am pretty sure it was gift-wrapped for me by the thrower.) I took myself out of the game because it was so close and we had lots of options. I recovered a bit for the semis later that day but still felt pretty crappy. Even the next day after a relatively calm Sunday night, I still felt like crap, so in some ways, my performance in the finals should rank among my career highlights, even though I only played 4 or 5 points (about half of our O points), since I had to go all-out just to play (and I distinctly remember hearing myself breathing fast while running down the field). Anyway, got my first Paganello championship. Perhaps my biggest accomplishment, though, was in making my flight back despite the Italian transportation system doing its best to thwart me. Don't believe it when you hear "at least the trains run on time."
A few weeks later was the White Mountain Open. Rain forced us to move to a multi-purpose sports facility in Quechee. But never before had I seen a combination driving range/inclined polo field. We started off the day with only 7 players and added two late in the first round. We played well enough through 1.5 games before collapsing. I had to start calling timeouts to give us some extra rest. (It didn't help me that I had done a particularly hard cardio/core workout the day before.) We got a few extra people on Sunday and that made a big difference, and we stormed back to take 9th place. At 13-13 in the finals we threw it away in their end zone, but Alex made the defensive play of the day. He ran "full speed" into an opponent and his girlish yelp of pain/fear threw off the cutter enough that he stopped his cut to see what was going on and the disc (which was in the air) hit the ground. We punched it in, then got a break to win 15-13.
Next was the GM qualifier. One of the teams bailed and blamed the USAU for their not knowing what was going on, so we played only two games. Again I had a hard cardio/core the day before so was a bit fatigued, but it didn't matter. Our whole team played a bit sloppy. We won, though, and qualified for the GM championship, which is this weekend in Ohio.
A few weeks later was the Boston Invite. The Masters RC was able to work it out with the TD that we could have a pool of Masters teams on Saturday, thus counting as a Masters tournament that will require one fewer team at fall Masters Regionals in order to avoid the anti-wildcard. We had our best day of DoG Masters in quite some time, winning all four games, including 15-10 against the Canadian team GLUM (who weren't at full strength). We played a team of Dominicans + Brodie + a couple other Americans in the 9-24 pre-quarters, jumped out to an 8-4 lead, and limped home to a 14-12 win. This put us in the 9-16 quarters against Mephisto. We were already starting to lose players and so did open subbing. We started out well, going up a break, and even had a second break but it was called back on a pick that the defender would have had no chance on, we turned it, and they didn't look back. We were then scheduled for two consolation games, but we were down to fewer than 10 people who _could_ play and nearly 1 who actually _wanted_ to play, so we discussed with the other teams and arranged it so that we didn't have to play and the teams who wanted to play could play.
And as I mentioned, this leads us to today. We are seeded 2nd in the GM tournament, with a likely semifinal matchup against Surly. Top seed and defending champ Old And In The Way is most likely not going to be as strong as last year due to having to leave Colorado this year (and the rest of us will not have to acclimate). It's always a pleasant change to go from playing against young kids who are eager to lay out into you to playing against old guys who are even more afraid of hurting themselves.
Friday, March 25, 2011
I would like to hear your statistically informed opinion on the following thought experiment: assume that there are thirteen players of roughly average (on the scale of all ultimate players) and equal ability (compared to each other). The fourteenth is a player of outstanding ability--someone widely thought to be one of the best players in the game.
They play a pickup game in which everyone is trying their best to win. What is the probability that the team with the elite player wins?
Hey, good question. I did some simulations about 15 years ago for a UPA Newsletter article. I will use the chart in there to make estimates.
(First, what is an "average" ultimate player? What is the average income between a homeless guy, Joe the Plumber, and Bill Gates? When you have such a range between high and low, "average" becomes a funny concept. I'll assume "average" is someone who would fit in nicely on a low-level regionals team.)
Two teams that score at equal rates will of course win an equal amount of the time (with a slight advantage to the team that receives in the first half, but we'll ignore that). A team that has a 5 percentage point advantage (e.g., 40% vs 35% of the time they touch the disc, they score) will win 65-75% of the time (with the bigger advantage when the percentages are at the lower end). A 10 percentage point advantage translates to a win rate of 76-87%.
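The newsletter simulations aren't available here, so the following is only a minimal sketch of how such an estimate could be produced. It assumes a game to 15 with no win-by-two rule and no halftime, and that each possession ends in either a goal or a turnover, so the disc changes hands after every possession. The names `simulate_game` and `win_rate` are hypothetical.

```python
import random

def simulate_game(p_a, p_b, target=15, rng=random):
    """Return True if team A wins; A receives first.

    p_a and p_b are each team's chance of scoring on a possession.
    After a goal the other team receives, and after a turnover the
    other team picks up, so possession simply alternates.
    """
    score_a = score_b = 0
    a_has_disc = True
    while score_a < target and score_b < target:
        if a_has_disc:
            if rng.random() < p_a:
                score_a += 1
        elif rng.random() < p_b:
            score_b += 1
        a_has_disc = not a_has_disc
    return score_a >= target

def win_rate(p_a, p_b, games=10000, seed=0):
    """Fraction of simulated games won by the team scoring at rate p_a."""
    rng = random.Random(seed)
    wins = 0
    for i in range(games):
        if i % 2 == 0:
            wins += simulate_game(p_a, p_b, rng=rng)
        else:
            # Swap who receives first to cancel the receiving advantage.
            wins += not simulate_game(p_b, p_a, rng=rng)
    return wins / games

for p_a, p_b in ((0.40, 0.35), (0.45, 0.35)):
    print(p_a, p_b, win_rate(p_a, p_b))
```

Because the model is this crude, its output should be read as a sanity check on the ranges quoted above, not as a reproduction of the original chart.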
With the average groups, I'll assume that teams score about 30% of the time. Top Open teams playing against top Open teams in moderate wind might be around 50%. What effect does this awesome player have?
First, I think the effect on defense will be less than on offense. He will get some poach blocks but since there is no star on the other team he won't be able to thwart their offense. Let's assume he gets 3 additional blocks but otherwise has no effect on their offensive efficiency (such a player at the elite Open level would be possibly the best player in history). Previously they were 15/50 in a game to 15, change that to 15/53, that's a drop to only 28.3%. To lower their % to 25%, he'd need to get 10 blocks a game.
Let's pause for a minute and consider what a superstar team would do against this team. I'd guess 15-1 or 15-2 is a fairly typical score for a game like this, though there is a question of whether they are trying their best to win, if for no other reason than they have 4 games that day (but so does the other team, and I'll guess they aren't in as good shape so would be further from peak efficiency). If they had 5 turnovers, that'd only be 75%. So, adding 7 elite players to an average team would take you from 30% up to 75%. I suspect that most of the benefits come from the first one or two, and almost nothing from 5-7. (Dennis suggested 20 years ago that the highest marginal value is provided by the second player, because that gives the first player someone to throw to). So, to get those 45 percentage points, I'll say it's 14, 14, 9, 4, 2, 1, 1 for each added player.
That puts the O efficiency at 44%, D efficiency at 28%. That means the O will score 15/34 times instead of 15/50. The other team will score 28.3% of 33 times or 9.3 goals. Set the point spread at 5.5.
Using a Pythagorean exponent of somewhere between 4 and 6, which my earlier research has suggested, that gives an expected winning percentage of 87-95%. Interpolating my table would give an estimate of about 93%.
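The arithmetic of the last few paragraphs can be checked with a short sketch. It assumes the usual Pythagorean form, goals_for^k / (goals_for^k + goals_against^k); the function name `pythagorean_win_pct` is hypothetical, and the inputs are the estimates given above.

```python
def pythagorean_win_pct(goals_for, goals_against, exponent):
    """Expected winning fraction from expected scores (Pythagorean form)."""
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    return gf / (gf + ga)

# Estimates taken from the text above.
opp_efficiency = 15 / (50 + 3)   # 3 extra blocks: 15/53, about 28.3%
own_efficiency = 0.30 + 0.14     # first elite player adds 14 points: 44%

own_possessions = 15 / own_efficiency   # about 34 possessions to reach 15
opp_goals = opp_efficiency * 33         # about 9.3 goals on 33 possessions

print(round(own_possessions, 1), round(opp_goals, 1))
for k in (4, 6):
    print(k, round(pythagorean_win_pct(15, opp_goals, k), 3))
```

The two exponents bracket the estimate: a larger exponent treats the same scoring margin as more decisive.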
Also, IIRC, a 40 point difference in RRI translated to a 1 point difference in expected score.
Tuesday, March 08, 2011
For those of you who have been under a rock these last five years, the SSAC, termed "Dorkapalooza" by ESPN's Bill Simmons, brings together bigwigs from across professional sports and focuses on what the data can and cannot tell you. Somewhat surprisingly, it seems that much of the focus is turning toward the squishier side of things. One talk suggested that they could predict achievement and likelihood of arrest for NFL players based solely on what they said during pre-draft interviews. (I missed most of this one, but they had one dimension that was Distrust and two that dealt with how players handled nuance.)
The biggest problem was in trying to figure out what to attend. Except for the opening and closing panels, there were always five sessions going on at once. (Some or all of these will eventually appear on the web site; all were filmed.) The big ones were all panels with a moderator and four speakers. My favorite panel was the Referee Analytics hosted by the Sports Guy and featuring noted bad-ref-hater and Dallas Mavericks owner Mark Cuban, longtime NFL ref Mike Carey, controversial author Jon Wertheim (who claims that most of home field advantage is due to referee bias), and sabermetrician Phil Birnbaum (who on his blog has rebutted (and confirmed) many of the claims made by Wertheim and his co-author). Cuban muzzled himself a bit for fear of being fined yet again by the NBA (this was a frequent source of joking during the panel) but still managed to let his opinions be known. One of his pet peeves is that he feels (and Wertheim's book "Scorecasting" asserts as well) that NBA games are not called consistently over the course of the game. Carey couldn't stress strongly enough that in the NFL, a foul is a foul is a foul (though he did differentiate between grasping a jersey at the point of attack and doing so away from the play). Another interesting point brought up was whether refs profiled based on past history and whether it's more fair to do so or not.
The opening session was moderated by Malcolm Gladwell, author of the book "Outliers" which focused on talent and how people become experts. His book cited the "10 000 hour rule", that a person (who is talented above some threshold level) still has to do focused practice for 10 000 hours to truly become an expert. I calculated that I have only about 6000 or so hours of ultimate (counting games as 1 hour and practices as 2 hours, figuring that much of the time I'm at the field is down time). So does this mean that there are no experts at ultimate or other amateur sports? I've often wondered (not that it's even a meaningful question) how the best ultimate players rate compared to other sports, each within their sport. Obviously ultimate players aren't as good at ultimate as Tiger Woods is at golf, but where would they fall? My gut feel now is that it's somewhere around scratch golfers or low single-digit handicap (pros are 5-10 better than scratch), which is to say pretty damn good, but with inconsistencies and weaknesses and probably no aspect of their game truly world-class. (There are something like 25 million golfers in the US.)
Anyway, good discussion at this panel, which included Houston Rockets GM Daryl Morey (one of the founders of the conference), NYG DE Justin Tuck, former NBA coach and current announcer Jeff Van Gundy, and training scientist/CEO Mark Verstegen. It was pointed out that talent can be a curse if the talent possessor relies too much on the talent during their development and doesn't hone the other skills that will be necessary when he gets to a high enough level. (In some cases, like with Tracy McGrady, talent alone might be enough to be a perennial All-Star but still be considered an underachiever, and that if he had had "a desire to practice", he could have been one of the best ever.) I really liked a quote from Van Gundy: "Soft, stupid, or selfish. You can be ONE of these, but not TWO." The most amazing statement I heard was that the panel thought that intelligence was more necessary for defense than for offense, the reason being that stupid players will make mistakes that are easily exploited and if this is on defense, the whole D will fall apart. This contrasts with my image of ultimate, where the guys who can run but don't know the game or aren't skilled get put on D, while the O players have to recognize patterns and feel the flow of the game and identify the open field space. I felt that intelligence is more useful on offense to do these things and to be able to recognize those defensive mistakes as soon as they happen and punish them. Perhaps this again speaks to the immaturity of ultimate, that "punishing mistakes" is not a given for elite players.
There were a couple underlying themes throughout. One is that the pro teams aren't especially interested in ranking the players from top to bottom with a single metric but are more interested in the marginal production they will have on their team in a particular role. Mike Zarren of the Celtics maintained that the Kendrick Perkins trade actually made them more likely to win this year and was not a trade for future considerations (and that they felt really bad trading him since they all liked him so much, but hey, it's a business). Another theme was that you need not only play-by-play data but inside information (blocking schemes, pass coverage responsibilities) to make sense of what happens on a lot of plays, and that pro teams are hoarding this information (except for baseball, which is doing amazing things with Pitch F/X, Hit F/X and Field F/X). A final theme, mentioned above, is that teams are trying to get analytics on what might be better thought of as psychology. How can a team decide whether Player A or Player B is more likely to develop based on their personalities? They are trying to quantify this to improve their drafts and their development systems. This was a big topic in the opening panel. I felt that the panel seemed to place too much of the blame on the players when they fail to develop to their full potential and not enough on the coaches or on the specific player/coach/organization interaction. I'm not sure how they would measure this, but how much of an organization's success at "developing talent" is due to making smart personnel picks and how much is due to having a good organization? What if JaMarcus Russell had been picked by the Patriots instead of the Raiders?
One final theme is that all the panelists were gracious except for Aaron Schatz of Football Outsiders, who seemed irritated that all these idiots were preventing him from being known as the smartest man in football and that all these little people at the conference would deign to bother him. I talked to Mike Carey for several minutes one-on-one at the reception and discussed my own observing experiences and asked him how they deal with certain tough issues like profiling. (He said that they call everyone equally but did admit, I think, to focusing more on certain matchups where fouls were more likely to occur.) I chatted for several minutes with two of the golf panelists. (Mark Broadie, who developed the "strokes gained" formula, told me that all handicaps actually have about the same first putt distance on average but the pros will be hitting it to that distance with their five irons while the hackers are doing it with their chip shots.) I chatted for a couple minutes with basketball stat guru Dean Oliver about ultimate (he is friends with some West Coast ultimate players). I made acquaintance with one of the MIT Sloan students whose team won the AECOM business case contest and he was friendly.
Another favorite topic was the "optical tracking" in the NBA. They put three cameras per half-court in a few arenas and captured 25 frames per second and so were able to track where everyone was all the time. Some interesting stats they came up with:
A tip-in attempt is 22 percentage points lower in shooting average than a putback.
Every 1.5' in extra shot distance costs 1 percentage point.
A contested shot is 12 percentage points lower than an uncontested one from the same distance.
Defenders space themselves from the shooters close to optimally, so a shooter who steps back costs himself as much by having a longer shot as he gains by being more open.
I really enjoyed listening to Mark Cuban, who seemed to be on every panel. As befits a self-made billionaire, he seemed quite sharp and tech-savvy and any fan should love to have him as his team's owner (though the Mavs do not use statistical process control on their metrics, he said). I do have to admit that I don't like the "game experience" that Cuban and other places offer these days, with the nonstop lights and loud noises that aren't part of the game-viewing experience. He seems to feel that it's about way more than just the game, that he needs to offer entertainment (in addition to a quality team playing basketball) in order to draw in fans and keep them. If you want to talk about the game, you can do it afterwards.
I was never quite sure from what perspective I was supposed to be listening during the conference, whether as an individual ultimate player, as an ultimate team leader, as a regular sports fan, as a wannabe sports stats nerd, or as an engineering metrics and stats guy. One of the presentations on some analytics software would have needed to have been modified only slightly to be presented at work, and they seemed even to have borrowed their "Analytics Maturity Model" from the Capability Maturity Model Integration (CMMI), which specifies best business practices for engineering organizations and with which I deal frequently (we're undergoing a CMMI appraisal right now, in fact). I do occasional little studies with baseball or basketball stats (the most recent one was to examine whether NBA players and coaches target round scoring numbers like 40 or 50 points (they do; there are about 50% more 50-52 point games than would be expected based on the number of all other high scoring games)) but nothing too rigorous or involving too much database diving, though I keep telling myself I'll start one day (and could even justify the time as professional training).
All in all, it was a fun and worthwhile use of $275 and a vacation day. I brought home some insights, possibly some tips for work, and a heightened interest in sports. I did not bring home any ideas on the ultimate ultimate stat, and indeed have come to the conclusion that this stat is impossible to obtain due to sample size and context issues, but there is still hope for evaluating some strategic questions.