How does NJIT Professor Bruce Bukiet’s baseball projection model work? Baseball’s Chief Geek and I discussed what it takes, in terms of logic and computing time, to model a baseball game. We also discussed the difficulty of modeling baseball compared to sports like basketball or football. It is a credit to Bukiet, and to his talent in applied mathematics, that he expressed his thoughts in a thoroughly accessible way.
It turns out there are two things that make baseball easier to model than sports such as basketball or football. The first is that, unlike football, where many things are going on at once, “a baseball game is pretty much a set of one-on-one confrontations between a pitcher and a batter. Everybody else doesn’t matter so much,” Bukiet said. “It’s the pitcher and the batter, something happens, then the pitcher and the batter, something happens.” The second is that in baseball, order matters. “If a person gets a single and then a home run, that’s two runs. If you get a home run and then a single, that’s one run. So order matters big time in baseball.”
Back in 1988 and ’89, when he first began creating projections for the upcoming baseball season, Bukiet used what he refers to as a brute-force approach to modeling. He determined there were only six things a batter could do. “You could have a walk, a single, a double, a triple, a home run, and an out. Then after one guy, any of those six things could have happened, after two guys, you could have had six times six because order matters. After three guys, it’s six times six times six is 216. By the time you got to the tenth guy, it’s 6 to the 10th possibilities.” Bukiet found the computer zoomed through the calculations for the first six or seven batters, but because each additional batter multiplies the number of possible sequences by six, the running time soon exploded. It took a full minute to get from the 10th to the 11th batter, and closer to six minutes to get from the 11th to the 12th. “In a game, about 40 people get up,” Bukiet said. “I recognized that it was going to take many millions of years to do one lineup.”
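To get a feel for how quickly the brute-force count grows, here is a minimal Python sketch. It only counts the ordered sequences of the six outcomes Bukiet lists; the names `OUTCOMES` and `sequence_count` are made up for illustration and are not taken from his program.

```python
# Illustrative sketch only: these names are not from Bukiet's program.
OUTCOMES = ("walk", "single", "double", "triple", "home run", "out")


def sequence_count(batters: int) -> int:
    """Number of distinct ordered outcome sequences for this many batters."""
    return len(OUTCOMES) ** batters


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 11, 12):
        print(f"{n:>2} batters: {sequence_count(n):,} sequences")
```

Each extra batter multiplies the count by six, which matches the jump from about one minute to about six minutes that Bukiet describes. At 40 batters the count is roughly 1.3 × 10^31.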
Bukiet needed a better way. After some further study, he realized he could apply a Markov process to his projections. In a Markov process, the probabilities of what happens next depend only on the current state, not on the sequence of events that led to it. For Bukiet’s model, this meant he had only 25 situations to work with. There are three possible out counts while the inning is still going: 0, 1, or 2 outs. There are eight possible arrangements of base runners: nobody on base; a guy on first; a guy on second; a guy on third; guys on first and second; guys on first and third; guys on second and third; and bases loaded. That makes 3 × 8 = 24 combinations, and the 25th situation is three outs, which ends the half-inning.
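The count of 25 can be checked by listing the situations directly. This is a small sketch under one assumption of mine: each situation is written as an out count paired with a (first, second, third) occupancy tuple, which is a convenient encoding and not necessarily the one Bukiet uses.

```python
from itertools import product

# Assumed encoding: (outs, (first, second, third)), with 1 meaning the base is occupied.
OUT_COUNTS = (0, 1, 2)
BASE_STATES = tuple(product((0, 1), repeat=3))  # 8 runner arrangements

# 3 out counts x 8 runner arrangements = 24 situations while the inning is live.
LIVE_STATES = [(outs, bases) for outs in OUT_COUNTS for bases in BASE_STATES]

# The 25th situation: three outs, which ends the half-inning.
END_OF_INNING = "three outs"
ALL_STATES = LIVE_STATES + [END_OF_INNING]

if __name__ == "__main__":
    print(len(BASE_STATES), len(LIVE_STATES), len(ALL_STATES))
```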
“Everything in baseball turns your team from one of the 25 things to another of the 25 things. … Instead of things growing six to the six to the six, you basically have 25 things that you’re going among and it’s got to be the same 25 things,” Bukiet said. “So I streamlined it and turned this many year’s operation thing into something that would take about a second and a half.”
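The idea of moving among a fixed set of situations can be sketched in a few lines of Python. This is not Bukiet’s model: the batting probabilities are invented, the base-running rules are deliberately simplified assumptions (runners advance one base on a single and two on a double), and the same hypothetical batter comes up every time instead of a real lineup. It only shows how the expected runs in a half-inning can be found with addition and multiplication over the 24 live situations.

```python
# Hypothetical probabilities for one plate appearance (illustrative, sums to 1).
PROBS = {"walk": 0.08, "single": 0.16, "double": 0.05,
         "triple": 0.005, "home run": 0.03, "out": 0.675}


def advance(bases, event):
    """Return (new_bases, runs) under simplified, assumed base-running rules."""
    first, second, third = bases
    if event == "walk":  # runners move only when forced
        if first and second and third:
            return (1, 1, 1), 1
        if first and second:
            return (1, 1, 1), 0
        if first:
            return (1, 1, third), 0
        return (1, second, third), 0
    if event == "single":  # assumption: every runner moves up one base
        return (1, first, second), third
    if event == "double":  # assumption: every runner moves up two bases
        return (0, 1, first), second + third
    if event == "triple":
        return (0, 0, 1), first + second + third
    if event == "home run":
        return (0, 0, 0), first + second + third + 1
    raise ValueError(f"unknown event: {event}")


def expected_runs_per_inning(probs, max_batters=60):
    """Expected runs in a half-inning, propagating probability over the states."""
    dist = {(0, (0, 0, 0)): 1.0}  # start: no outs, bases empty
    expected = 0.0
    for _ in range(max_batters):
        nxt = {}
        for (outs, bases), p in dist.items():
            for event, q in probs.items():
                if event == "out":
                    if outs + 1 < 3:  # the third out ends the inning
                        key = (outs + 1, bases)
                        nxt[key] = nxt.get(key, 0.0) + p * q
                    continue
                new_bases, runs = advance(bases, event)
                expected += p * q * runs
                key = (outs, new_bases)
                nxt[key] = nxt.get(key, 0.0) + p * q
        dist = nxt
    return expected


if __name__ == "__main__":
    print(round(expected_runs_per_inning(PROBS), 3))
```

However many batters come up, the dictionary `dist` never holds more than 24 entries, which is why this approach stays fast where the brute-force enumeration does not. The `advance` function also reproduces the point about order: a single followed by a home run scores two runs, while a home run followed by a single scores one.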
As for the math in the model, Bukiet said it’s made up of basic stats. “There is no calculus in it. There’s a lot of addition and multiplication, a little bit of subtraction and division, and a lot of just logical thinking.”