Sportpunter's
Tennis Model
The Rating Method
The first step in creating
a model that can predict tennis matches as accurately as possible
is to produce a rating system. Whilst many punters would use
the official ATP tennis rankings, there is a lot of evidence
that suggests that the ATP tennis ratings are not a good indication
of current form.
Clarke (1994) produced what
is called a SPARKS method (short for set-point-marks) which
calculates the margin of victory of a tennis match.
A margin of victory is important
when looking at a rating system. Bedford and Clarke (2000)
found that the SPARKS method produced significantly better
predictions than the official ATP tennis ratings. Their method,
though amongst the first to look at this area, was relatively
simple. It did not look at surfaces or many
other factors; however, it was a definite step in the right
direction.
In the Australian Open 2002,
there was a lot of media coverage about why most of the top
seeds were eliminated from the event early on. In a newspaper
article in the Australian Financial Review, written by the head
of Champion Data and myself, we outlined that a lot of the
top seeds were not in the best of form especially on the hard
court surface.
So why aren't the ATP ratings
a good predictor of tennis matches?
There are a number of reasons
why.
- The ATP ratings do not take into consideration
the quality of the opposition.
If Lleyton Hewitt defeats
Pete Sampras in the first round of the Australian Open, he
would gain just as much as if he had defeated Jakub Herm-Zahlava.
- As mentioned previously, the ATP ratings
do not take into consideration the margin of victory that
the player won or lost by.
Lleyton Hewitt would gain
just as many points in defeating Pete Sampras 6-0 6-0 6-0
as he would if he had defeated him 6-4 2-6 7-6 0-6 10-8.
- The ATP ratings give points for players
who win due to the opposition retiring.
If Lleyton Hewitt were behind
2-6 2-4 and Pete Sampras then retired, Hewitt would receive
points for progressing to the next round.
- The ATP ratings system gives players
points for walkovers.
If Pete Sampras sustained an
injury between matches and could not front up for the next
game against Hewitt, Hewitt would receive points for progressing
to the next round despite not playing a game.
- The ATP ratings system does not take
the playing surface into account.
Many players play better or
worse on certain surfaces and this has to be taken into consideration
when looking at a player's performance. Defeating Pete Sampras
on grass is a much better win than defeating him on clay.
- Probably most importantly, the ATP ratings
system does not look at current form.
According to the ATP ratings
system, the last 12 months of tennis is taken into consideration
for their ratings. This means that a player's performance
12 months ago has just as much weighting as what that player
did last week. As a good example of this, Gustavo Kuerten
was ranked number 2 at the end of 2001; however, he had lost
his last seven games.
- The ATP ratings system does not take
into consideration any home ground advantage.
- The ATP ratings system does not take
into consideration how long it has been since a player played
on a particular surface.
- The ATP ratings do not account for head
to head performances.
- The ATP ratings system does not take
into consideration other important tournaments like the
Davis Cup, challenger and qualifying results.
- The ATP ratings system gives bonus points
to players for good sportsmanship and other factors, which
are hardly representative of their current form.
Our Ratings System
Sportpunter's ratings system
does take all this into consideration and awards each player
a rating based on the SPARKS method. Each player also has
a surface rating which is added to their overall rating to
get an expected margin of victory, in marks, on each particular surface.
These ratings are added together using linear regression.
Based on each week's performance, a player's rating might go up
or down, which is shown here
for men and here
for women. Likewise their surface rankings might also
change as shown here
for men and here
for women. The information on player preferred surfaces
is only given by rank. The amount by which a player's rating changes
based on the outcome of a match is found by a statistical
method called ‘exponential smoothing’, which changes the ratings
by a percentage after each match.
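As a rough illustration of exponential smoothing, the sketch below moves a rating by a fixed fraction of the difference between the actual and the expected margin. The function name and the smoothing constant `alpha` are assumptions made for this example; the value Sportpunter actually uses is not stated here.

```python
def update_rating(rating, expected_margin, actual_margin, alpha=0.1):
    """Move a rating by a fixed fraction of the prediction error.

    alpha is a hypothetical smoothing constant, not Sportpunter's value.
    """
    error = actual_margin - expected_margin  # how far the result beat expectation
    return rating + alpha * error


# Invented example: a player rated 50 was expected to win by 4 marks
# but actually won by 10, so the rating rises by 10% of the 6 mark surprise.
new_rating = update_rating(50.0, expected_margin=4.0, actual_margin=10.0)
print(round(new_rating, 2))
```

A larger `alpha` makes the rating react faster to recent form, while a smaller one gives more weight to older matches.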
Interestingly, a couple of
other factors were also considered. One is the head to head approach.
Many punters look at past head to heads to predict what is
going to happen in a current match. However, punters will generally
only look at how many wins each player has, whereas they should
be looking at each player's form at the time they played head to
head and their margin of victory. For a hypothetical example,
Ferrera might have played Hewitt 5 times head to head; Ferrera
might have won the first 4, but Hewitt the last game. Does
Ferrera currently have the edge over Hewitt based on his 4-1
head to head record, or did Ferrera's wins come about when
he was number one in the world and Hewitt was only just starting
in tennis?
Sportpunter's tennis model
takes these into consideration, by looking at the expected
margin of victory and the actual result based on the quality
of the players at the time, the surface it was played on and
many other factors.
Home ground advantage in tennis,
whilst smaller than in many other sports, still exists.
Sportpunter has added this into its model. By a full analysis
of every single match played in every tournament around the
world, Sportpunter has been able to determine exactly how
much home ground advantage comes into play and how much of
this is merely due to the court surface and other factors.
The theory that players will
not come back well after playing a five set match
does hold up. It has been shown statistically that
players do tend to play below their usual performance if they
last played a five set match in grand slam tournaments. Sportpunter's
model accounts for this, and has accurately determined how
much of a factor it is.
Tiredness and unfamiliarity
are other factors which Sportpunter's tennis model takes into
consideration. If a player is "first-up" on a particular
surface, Sportpunter has determined what effect this will
have on their performance on average. Likewise Sportpunter's
model looks at how many matches a player has played on a surface
throughout the tournament.
Sportpunter's predictions
are updated every day, taking into account every match previously
played.
One of the advantages of a
computer model for analysing tennis is its memory. While
humans will forget how a tennis player played one month ago,
a computer model will never forget. And more importantly it
will determine to what degree that match a month ago has an
effect.
Head to Head Probabilities
The expected outcome of each
match can easily be converted to a probability. The conversion
from expected marks to a probability is done by
another statistical procedure called ‘binary logistic regression’.
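A minimal sketch of that conversion is given below, assuming a fitted logistic curve with no intercept. The coefficient `scale` is a made-up value for illustration; in practice it would be estimated from past results.

```python
import math


def win_probability(expected_margin, scale=0.1):
    """Map an expected margin (in marks) to a win probability.

    scale is a hypothetical fitted logistic coefficient.
    """
    return 1.0 / (1.0 + math.exp(-scale * expected_margin))


# An expected margin of zero is a coin flip; larger margins approach certainty.
for margin in (0.0, 5.0, 20.0):
    print(margin, round(win_probability(margin), 3))
```

Because the curve is symmetric, the two players' probabilities always add to one: a player expected to lose by 5 marks gets exactly one minus the probability of the player expected to win by 5.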
Calculating the Tournament
Probabilities
Given that we have the probabilities
for head to head matches, we can simulate the entire tournament.
The total number of possible outcomes of a tournament is incredibly
high. For a small 32 player tournament, there are a total
of 31 matches, and therefore there are 2^31 possible
outcomes, which is over two billion different outcomes. Given
this, the best way to calculate the probabilities for a tournament
is via computer simulation. Each match is simulated and the players'
ratings are adjusted after each round. This is an important
step because if a little known player with a low rating
before the tournament made the final, his rating at
that time would be a lot higher than it was originally, due to his good
form throughout the tournament so far.
So the tournament is simulated
approximately 10,000 times depending on the size and the number
of matches remaining.
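The simulation idea can be sketched as follows. Everything specific here is an assumption for illustration: the four invented players and ratings, the logistic coefficient `scale`, and the simple `boost` rule that lifts a winner's rating after each round.

```python
import math
import random
from collections import Counter


def win_probability(rating_a, rating_b, scale=0.1):
    """Chance that A beats B; scale is a made-up logistic coefficient."""
    return 1.0 / (1.0 + math.exp(-scale * (rating_a - rating_b)))


def simulate_once(ratings, draw, rng, boost=2.0):
    """Play one knockout tournament; draw lists players in bracket order."""
    current = dict(ratings)   # fresh copy so every simulation starts equal
    players = list(draw)      # length must be a power of two
    while len(players) > 1:
        winners = []
        for a, b in zip(players[0::2], players[1::2]):
            p_a = win_probability(current[a], current[b])
            a_wins = rng.random() < p_a
            winner = a if a_wins else b
            p_winner = p_a if a_wins else 1.0 - p_a
            # Hypothetical form adjustment: an upset lifts the winner more.
            current[winner] += boost * (1.0 - p_winner)
            winners.append(winner)
        players = winners
    return players[0]


def title_probabilities(ratings, draw, n_sims=10_000, seed=0):
    """Estimate each player's chance of winning the tournament."""
    rng = random.Random(seed)
    titles = Counter(simulate_once(ratings, draw, rng) for _ in range(n_sims))
    return {player: titles[player] / n_sims for player in draw}


ratings = {"A": 20.0, "B": 0.0, "C": 5.0, "D": 0.0}  # invented ratings
print(title_probabilities(ratings, ["A", "B", "C", "D"]))
```

Counting how often each player wins across the simulated tournaments gives the tournament probabilities without having to enumerate every possible outcome.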
Using the model to gamble
Originally this was not the
purpose of the model; it was just a matter of public interest.
But seeing if the model is profitable is an important part
of any statistical model when predicting sport outcomes.
Why is this? Well it’s quite
simple. An important distinction to make is that bookmakers
do not make odds based on the probabilities of winning, but
rather what the general public thinks the probabilities of
winning are. Their main concern is to balance the books based
on what the average Joe Blow believes.
Therefore the model could
be proven a statistically better predictor than the general
public if it is profitable based on bookmakers' odds. Given
below is a step by step method of how one can gain an advantage
over bookmakers and have the potential to make money by gambling
on tennis matches.
The Gambling Technique
Converting Bookmakers Odds to Probabilities
By converting a bookmaker's
odds to probabilities we can directly compare these to our
own probabilities to see if there is a possibility of an advantage
in a gamble. The inverse of the bookmaker's price is the expected
probability.
For example in late March
20001 the bookmakers gave Fernando Gonzalez (CHI) odds as
high as $4.50 to defeat Pete Sampras (USA). This means that
the bookmakers (or the general public) believe that Fernando
Gonzalez (CHI) has approximately a 1 / 4.50 = 22.2% chance
of winning. We had predicted a 71.7% chance for Pete Sampras
(USA) to win the match; consequently this means that Fernando
Gonzalez (CHI) has a 28.3% chance. This probability is higher
than what the bookmakers have Fernando Gonzalez (CHI) at,
and therefore this is where we have an advantage over the
bookies and would gamble on Fernando Gonzalez (CHI) for this
game.
Put simply, we have a 28.3%
chance of returning $4.50 from a $1 bet, so on average our
$1 bet will return 0.283 * $4.50 = $1.27. Hence an expected
profit of 27%.
How much of an advantage
do we have?
The advantage over the bookmakers,
or overlay, is calculated by taking the bookmaker's price
into account by the following formula:
Overlay = [Our probability
* Bookies Price] – 1
Therefore in this game, we
had an overlay of (0.283 * 4.50) - 1 = 27.3%
This overlay is very large,
and represents a very good betting opportunity, even though
we still believe Pete Sampras (USA) will win the match.
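The two calculations used in this example, the probability implied by a price and the overlay, can be written as a short sketch. The function names are chosen for illustration only; the numbers are the ones from the Gonzalez example.

```python
def implied_probability(price):
    """Probability implied by a decimal price (the inverse of the price)."""
    return 1.0 / price


def overlay(our_probability, price):
    """Overlay = (our probability * bookmaker's price) - 1."""
    return our_probability * price - 1.0


price = 4.50             # bookmaker's price for Gonzalez
our_probability = 0.283  # our predicted chance for Gonzalez

print(round(implied_probability(price), 3))       # 0.222
print(round(overlay(our_probability, price), 3))  # about 0.27
```

A positive overlay means our probability is higher than the one implied by the price; a negative overlay means the price offers no value.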
It is important though not
to bet on any match that has a small overlay. To take into
account some error, one should only bet on matches where a
large overlay is recorded.
We will not have
an advantage over the bookmakers in every match, however. If the bookmaker's
price is similar to our probabilities then there is no room
for an advantage. This is mainly due to the fact that the
bookmaker takes a 5% to 8% margin per game.
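The bookmaker's margin on a match can be estimated by adding up the implied probabilities of both players and seeing how far the total exceeds 100%. In the sketch below the pair of prices is hypothetical; only the $4.50 price comes from the example above.

```python
def bookmaker_margin(prices):
    """Sum of implied probabilities minus 1, for all outcomes of one match."""
    return sum(1.0 / price for price in prices) - 1.0


# Hypothetical two-way market: $1.18 for the favourite, $4.50 for the outsider.
margin = bookmaker_margin([1.18, 4.50])
print(f"{margin:.1%}")
```

With these invented prices the margin comes out at roughly 7%, which sits inside the 5% to 8% range mentioned above.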
How much should be bet?
Even if the odds are on your
side, you still need to guard against losing all your bank.
We can work out mathematically the percentage
of your bank you should bet to maximise your rate of growth.
The amount to bet is given
using a system called the 'Kelly' method, which was devised by
Kelly in 1956. It uses the bookmaker's price, your probability
and the amount of overlay that you have in determining how
much to gamble. For more information on money management please
see our money management page.
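For a single bet at a decimal price, the standard Kelly formula divides the overlay by the price minus one. The sketch below assumes that standard form; the `multiplier` argument, used here for fractional Kelly staking, is a name chosen for illustration.

```python
def kelly_fraction(probability, price, multiplier=1.0):
    """Fraction of the bank to stake under the Kelly criterion.

    probability: our estimated chance of winning
    price: bookmaker's decimal price
    multiplier: 1.0 for full Kelly, 0.25 for quarter Kelly
    """
    edge = probability * price - 1.0  # this is the overlay
    if edge <= 0:
        return 0.0                    # no advantage, so no bet
    return multiplier * edge / (price - 1.0)


# Using the Gonzalez example: 28.3% chance at a price of $4.50.
print(round(kelly_fraction(0.283, 4.50), 4))        # full Kelly
print(round(kelly_fraction(0.283, 4.50, 0.25), 4))  # quarter Kelly
```

Full Kelly suggests staking a little under 8% of the bank on this bet, and quarter Kelly a little under 2%.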
When shouldn't we bet?
Although some will say this
is up to the individual, I believe there are a few times when
one shouldn't bet on an event. One of these is when a player
has not played many games on a particular surface. When the
matches are shown from our website, there is also a column
that shows 'Games (p1)' and 'Games (p2)' which refer to how
many games player one and two have played on the current surface
that the tournament is being played on. Betting on matches
where these values are low could result in more long term
losses. The reason for this is that when a player starts
his first game, he is given a surface rating of zero,
because we have little information about this player.
He might be a clay court specialist, but our ratings will
not yet reflect this. Therefore it is only recommended to gamble
on a player once he has played several games on that surface
and has developed a satisfactory surface rating.
Another time in which one
probably should not gamble is when a player is returning from
injury or currently has an injury. Likewise I would refrain
from gambling if a player retired recently from singles or
doubles matches.
There are also times when
in the past it has been proven statistically that betting
on certain matches might not be profitable in the long term.
This is due to a number of factors including a bias towards
the underdogs in bookmakers' prices. To see a full analysis
of the tennis model, click
here.
When does Sportpunter bet?
Sportpunter follows the model
and bets on exactly what it suggests. This way you
can feel secure in that we want to make the model as good
as possible because we too are betting on it. Our suggested
bets that we display are for when both players have played
at least 5 games on that particular surface. We don't recommend
betting on games where one of the players has played fewer
than 5 games on the surface.
Similarly we look at overlays
when making our bets. History has shown that there is little
value in betting on big outsiders in tennis; the favourite
wins in tennis more often when compared to other sports. Bookmakers
know this, and because of the favourite long-shot bias we don't
bet on players where the calculated probability of them winning
is less than 30%. When a player has a probability of winning
between 30% and 50% we only bet on them if the overlay is
above 25%. But when we calculate the player as a favourite,
we bet on them no matter what the overlay is.
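These rules can be put together as a simple filter. The sketch below is one reading of the rules above: the handling of exactly 30% and 50%, and the requirement that a favourite still has a positive overlay, are assumptions. The `min_games` argument reflects the 5 game minimum on the surface mentioned earlier.

```python
def should_bet(probability, overlay, games_p1, games_p2, min_games=5):
    """Return True if a bet passes the selection rules described above."""
    if games_p1 < min_games or games_p2 < min_games:
        return False           # too little surface history for either player
    if overlay <= 0:
        return False           # assumption: never bet without an advantage
    if probability < 0.30:
        return False           # big outsiders are avoided
    if probability <= 0.50:
        return overlay > 0.25  # underdogs need a large overlay
    return True                # favourites: any positive overlay


print(should_bet(0.40, 0.30, games_p1=12, games_p2=8))  # True
print(should_bet(0.40, 0.10, games_p1=12, games_p2=8))  # False
print(should_bet(0.60, 0.02, games_p1=12, games_p2=3))  # False
```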
This is what comprises Sportpunter's
tennis selections, and is exactly what goes into our betting
history. The Excel spreadsheet that you have received has
a system where you can judge the minimum overlay to bet based
on the probability. It is currently more "smooth"
than the simple method mentioned above and might be more to
your taste. If not, then simply change it so that your bets
will equal the ones that Sportpunter suggests.
Where did you find those
odds?
In fact, we didn't. Because
different tennis matches are played each week, bookmakers can
only offer odds the day before they are played. This
means that bookmakers often release odds close to the start
of the tennis match. Whilst we would like to use these odds
to help you with your suggested bets, many people like to
know far in advance who they should be betting on. Hence we
create "average odds" that are standardised over
the average of all bookmakers to 103%. Pinnacle
Sports has about that same margin, and betfair
has as low as 102% and even less if you know how to lay.
Hence the odds given for the suggested bets are more like
average odds. You should be able to find odds that are better
than the ones suggested.
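One way such "average odds" could be produced is sketched below: average each player's price across bookmakers, then rescale so that the implied probabilities add up to 103%. This is only an assumed procedure for illustration, and the prices are invented.

```python
def average_prices(books):
    """Average each player's decimal price across several bookmakers."""
    return [sum(column) / len(column) for column in zip(*books)]


def standardise_prices(prices, target_book=1.03):
    """Rescale prices so their implied probabilities sum to target_book."""
    implied = [1.0 / price for price in prices]
    scale = target_book / sum(implied)
    return [1.0 / (probability * scale) for probability in implied]


# Hypothetical prices from three bookmakers for (favourite, outsider).
books = [[1.20, 4.50], [1.22, 4.20], [1.18, 4.40]]
standard = standardise_prices(average_prices(books))
print([round(price, 2) for price in standard])
```

Whatever margin the individual bookmakers started with, the rescaled prices always carry a 3% margin, so matches can be compared on the same footing.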
How have we gone so far?
Our past record for ATP tennis
is shown here.
From this it can be seen that in only half of the year 2003,
we managed to increase our bank account from $500 to over
$10,000. Our %ROI or %return on investment has constantly
hovered around the 5% mark.
Given our good long term history,
the potential to make money is very high. There are 3,000
ATP matches in a year and 2,500 WTA matches. Hence let's suggest
that we bet on 1000 matches. The average full Kelly bet size has been
shown to be 20% of your bank, and with the quarter
Kelly method of betting this means an average bet size
of 5% of your bank per bet. Let's assume that you have a bank
size of $5,000. Given this, one can approximate how much they
can expect to earn in a year.
Potential Profit = Bank Size
* %BankBet * #Games * %ROI
Potential Profit = $5,000
* 5% * 1000 * 5% = $12,500.
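The same arithmetic as a short sketch, so that other bank sizes or bet counts can be tried. Like the formula, it assumes the bet size stays a fixed share of the starting bank and does not compound as the bank grows.

```python
def potential_profit(bank, bank_fraction, n_bets, roi):
    """Potential Profit = Bank Size * %BankBet * #Games * %ROI."""
    return bank * bank_fraction * n_bets * roi


# $5,000 bank, 5% of bank per bet, 1000 bets, 5% return on investment.
print(round(potential_profit(5000, 0.05, 1000, 0.05), 2))  # 12500.0
```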
So with a modest bank size
of $5,000, one could gain a profit of $12,500 should the results
in the future follow those of the past, and there is no reason
why they shouldn't. Although the betting history has been updated
since April 2003, long term followers will know that the model
has been running since Jan 2001 with equally good results.
And Finally…
I hope that you enjoy my website
and get the most out of it for yourself. Whether you’re a
punter, or just interested in tennis, or maybe interested
in sports statistics and mathematics, I’m sure that you will
get something interesting out of this website.
If you have any questions,
please feel free to email me at jlowe@sportpunter.com.
Otherwise happy punting!