Estimating the win probability in a hockey game
Abstract
While a hockey game is being played, its data arrive continuously. It is therefore possible to use stream mining to estimate a team's win probability (WP) once the game begins. Using eight seasons of NHL data from 2003-2014, we provide three methods for estimating the win probability in a hockey game. The first model is a statistics-based win probability calculation built from a summary of the historical data. The second model is based on data mining classification techniques: we applied several classification algorithms to our data, compared the results, and chose the best algorithm to build the win probability model. Naive Bayes, SVM, VFDT, and Random Tree are the classification methods compared in this thesis on the hockey dataset. The third model uses stream mining and is a real-time prediction model that can be interpreted as a training-update-training model. The events in a hockey game are split into windows of 20 events each. We use the previous window as the training set to obtain decision tree rules, which are then used to classify the current window. A parameter is then calculated from the rules trained on these two windows; it tells us which rule is better than another for training the next window. In our models, the variables time, lead size, number of shots, number of misses, and number of penalties are combined to calculate the win probability. Our WP estimates can provide useful evaluations of plays, predictions of the game result, and, in some cases, guidance for coaching decisions.
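The windowing step of the stream model can be illustrated with a minimal sketch. It only shows the loop of training on one 20-event window and classifying the next; it is not the thesis's implementation. The event format (dictionaries with hypothetical `lead` and `home_win` keys) is an assumption, and the learner is a deliberately simple stand-in (majority outcome per lead size) rather than the decision tree described above. The rule-comparison parameter is not modeled here because the abstract does not define it.

```python
from collections import Counter, defaultdict

WINDOW_SIZE = 20  # events per window, as stated in the abstract


def split_into_windows(events, size=WINDOW_SIZE):
    """Split an event stream into consecutive windows of `size` events.

    The final window may be shorter if the stream length is not a multiple of `size`.
    """
    return [events[i:i + size] for i in range(0, len(events), size)]


def train_majority_by_lead(window):
    """Stand-in learner (NOT a decision tree): for each lead size seen in the
    window, remember the most common outcome label.

    Assumes each event is a dict with hypothetical keys "lead" and "home_win".
    """
    counts = defaultdict(Counter)
    for event in window:
        counts[event["lead"]][event["home_win"]] += 1
    # Fallback label for lead sizes that never appeared in the training window.
    default = Counter(e["home_win"] for e in window).most_common(1)[0][0]
    rules = {lead: c.most_common(1)[0][0] for lead, c in counts.items()}
    return rules, default


def classify(model, event):
    """Predict the outcome label for one event using the learned rules."""
    rules, default = model
    return rules.get(event["lead"], default)


def train_then_classify(events, size=WINDOW_SIZE):
    """Train on each window, classify the following one, and return the
    accuracy obtained on each classified window."""
    windows = split_into_windows(events, size)
    accuracies = []
    for previous, current in zip(windows, windows[1:]):
        model = train_majority_by_lead(previous)
        correct = sum(classify(model, e) == e["home_win"] for e in current)
        accuracies.append(correct / len(current))
    return accuracies
```

Calling `train_then_classify(events)` on a list of labeled events returns one accuracy value per window after the first. In a fuller version, the stand-in learner would be replaced by a decision tree over time, lead size, shots, misses, and penalties.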