Rivals manager: introduction
I’ve been playing at Rivals manager for a while. It’s a fantasy cycling game rolling all over the season. The principle is simple, you select a team of 30 to 35 riders before the beginning of the season, and they will score points riding and performing during the cycling road season, from January to October. There are a few constraints though:
- Riders all have a cotation. Cheaper riders cost 3 credits, while the most expensive riders cost around 35-38 credits depending on their performances during past seasons.
- Your team has to have between 30 to 35 riders. You have to find a balance between expensive and already accomplished riders and young guns or lesser known riders who didn’t perform very well precedent years.
- Your team cannot cost more than 225 credits. Although you can technically keep it much lower than that - you could take a team of 30 riders costing 3 each - it seems a good idea to aim for 225 credits and spend as much as you can.
- There is a special category of riders called Top Ligue - top league in English. For the 2023 edition, 9 riders lie in this category: Tadej POGACAR, Wout VAN AERT, Remco EVENEPOEL, Primoz ROGLIC, Jonas VINGEGAARD, Julian ALAPHILIPPE, Mathieu VAN DER POEL, Jasper PHILIPSEN, Aleksander VLASOV. They are cycling superstars. You do not have to pick superstars necessarily, but at the same time, you cannot spend more than 63 credits in this category. This is the rule for the 2023 edition but the number of top league riders allowed or the cost constraint has changed from time to time.
Then, riders will score points everytime they perform well in a race. The more prestigious the race, the more points riders can get. Similarly, more riders will be awarded points in a Grand Tour or a Monument than in a continental race at .1 level.
Season 23’ is well under way. Based on past years, can we have an idea of what has proven to be the best strategies - and by what margin the oracle would have won each season?
Knapsack problem
So, to summarize, our problem can be formulated as follow:
- Given a total number of credit of 225,
- And a constraint on the number of team members which has to be between 30 to 35
- With an extra constraint on the cost of the most expensive riders
How can we maximize the number of points collected at the end of the year?
I’m not very familiar with classical optimization problems, but it turns out that this is exactly the formulation of a Knapsack problem. The original formulation goes as follow:
Given a set of items, each with a weight and a value, determine which items to include in the collection so that the total weight is less than or equal to a given limit and the total value is as large as possible.
Replace the collection by your squad, an item by a rider, the weight by the cotation of the rider, and the value by the total of points scored at the end of the season, and you almost got your Rivals’ challenge! We’ll have to add the two constraints over the ‘top league’ riders and the fact that the total number of riders is constrained too.
Collecting data
Let’s begin by grabbing some data for our experiment. The rivals manager website is quite old and is not necessarily at the state of the art, and there is no API available to perform the request. So let’s do a little bit of web scrapping. We’ll use rvest to do so. We’ll also use dplyr and tidyr as utility packages for data handling, and purrr for functional oriented programming functions.
First function to read the riders’ results.
Scrap web page for riders’ results
#' read results table from rivals manager website
#'
#' @param year Year of interest
<- function(year) {
read_riders_results <- paste0("http://velo-club.com/historique.php?choix=Classement&annee=", year)
url read_html(url) |>
html_element("table") |>
html_table() |>
select(-Pays, -Place) |>
rename(Name = Nom, Cost = Valeurs)
}
Second function to read the riders’ original cotations. It’s a bit redundant but riders who didn’t score any point have to be found in this table as they do not appear in the yearly results tables above (only riders with at least 1 point do).
Scrap riders’ cotation
#' read values table from rivals manager website
#'
#' @param year Year of interest
<- function(year) {
read_riders_values <- paste0("http://velo-club.com/historique.php?choix=Cotation&annee=", year)
url read_html(url) |>
html_element("table") |>
html_table() |>
select(-Pays) |>
rename(Name = Nom, Team = Equipe, Cost = Valeurs)
}
Third function is necessary to grab actual results from human players.
Collect human performances
#' read scores from human players
#'
#' @param year Year of interest
<- function(year) {
read_team_scores <- paste0("http://velo-club.com/historique.php?choix=Classement&annee=",
url
year,"&lstClassement=Managers")
read_html(url) |>
html_element("table") |>
html_table() |>
select(-Pays, -Place) |>
rename(Name = Nom)
}
Let’s now request the data for the period 2013-2022 for which we have historical data. We’ll join the results and the values table to make sure we have all the riders available at the beginning of every season.
Bind riders data for each year altogether
map(2013:2022, function(x) {
<- read_riders_results(x)
riders_results <- read_riders_values(x)
riders_values left_join(riders_values, riders_results, by = c("Name", "Cost"))
|> setNames(2013:2022) |>
}) bind_rows(.id = "Year") -> data_riders
Let’s make sure we flag properly the ‘top league’ for every year in the history. Sadly, it is hard to find valid info about who was in the top league in the past. I tried to collect all the information I could on the forum, but it’s not easy to find. I think I managed to find every member of the top league, but couldn’t find the maximum cost for top league riders you were able to pick for years 2013, and 2015 to 2019. There was no constraint at all in 2020. Since it seems the cotation strategy was the same for the 2013-2019 period, I’ll make the hypothesis that the limit for top league riders was identical to the one for the year 2014, i.e. 93.
Add Top League and compute performance ratio (Points over Cost)
<- list(
top_league "2013" = c("Joaquin RODRIGUEZ", "Alberto CONTADOR", "Philippe GILBERT",
"Peter SAGAN", "Bradley WIGGINS", "Vincenzo NIBALI", "Alejandro VALVERDE",
"Edvald BOASSON HAGEN", "Cadel EVANS", "Samuel SANCHEZ", "Tom BOONEN",
"Fabian CANCELLARA", "Chris FROOME", "Thomas VOECKLER", "John DEGENKOLB",
"Andre GREIPEL", "Andy SCHLECK", "Mark CAVENDISH", "Rui Alberto COSTA",
"Simon GERRANS", "Robert GESINK", "Ryder HESJEDAL", "Tony MARTIN",
"Rigoberto URAN"),
"2014" = c("Peter SAGAN", "Alejandro VALVERDE", "Joaquin RODRIGUEZ", "Chris FROOME",
"Vincenzo NIBALI", "Alberto CONTADOR", "Rui Alberto COSTA", "Nairo QUINTANA",
"Fabian CANCELLARA", "Philippe GILBERT", "Bauke MOLLEMA", "Mark CAVENDISH",
"Greg VAN AVERMAET", "Edvald BOASSON HAGEN", "John DEGENKOLB",
"Sylvain CHAVANEL", "Andre GREIPEL", "Sergio HENAO", "Daniel MORENO",
"Richie PORTE", "Bradley WIGGINS"),
"2015" = c("Alejandro VALVERDE", "Peter SAGAN", "Alberto CONTADOR", "Vincenzo NIBALI",
"Chris FROOME", "Joaquin RODRIGUEZ", "John DEGENKOLB", "Alexander KRISTOFF",
"Michal KWIATKOWSKI", "Nairo QUINTANA", "Rui Alberto COSTA",
"Fabian CANCELLARA", "Philippe GILBERT", "Greg VAN AVERMAET",
"Bauke MOLLEMA", "Simon GERRANS", "Romain BARDET", "Nacer BOUHANNI",
"Arnaud DEMARE", "Daniel MARTIN", "Daniel MORENO", "Tejay VAN GARDEREN"),
"2016" = c("Alejandro VALVERDE", "Alexander KRISTOFF", "Peter SAGAN",
"Chris FROOME", "Alberto CONTADOR", "Vincenzo NIBALI", "Nairo QUINTANA",
"Greg VAN AVERMAET", "Joaquin RODRIGUEZ", "John DEGENKOLB", "Fabio ARU",
"Michal KWIATKOWSKI", "Thibaut PINOT", "Rui Alberto COSTA", "Philippe GILBERT",
"Romain BARDET", "Tom DUMOULIN", "Tony GALLOPIN", "Michael MATTHEWS",
"Andre GREIPEL", "Bauke MOLLEMA", "Daniel MORENO"),
"2017" = c("Peter SAGAN", "Alejandro VALVERDE", "Alexander KRISTOFF",
"Greg VAN AVERMAET", "Chris FROOME", "Nairo QUINTANA", "Alberto CONTADOR",
"Romain BARDET", "Vincenzo NIBALI", "Michael MATTHEWS", "Thibaut PINOT",
"Julian ALAPHILIPPE", "Fabio ARU", "Jhoan Esteban CHAVES", "Rui Alberto COSTA",
"Tom DUMOULIN", "Giacomo NIZZOLO", "Nacer BOUHANNI", "John DEGENKOLB",
"Bauke MOLLEMA", "Diego ULISSI", "Edvald BOASSON HAGEN", "Richie PORTE"),
"2018" = c("Peter SAGAN", "Greg VAN AVERMAET", "Chris FROOME", "Alejandro VALVERDE",
"Alexander KRISTOFF", "Nairo QUINTANA", "Tom DUMOULIN", "Michael MATTHEWS",
"Vincenzo NIBALI", "Thibaut PINOT", "Romain BARDET", "Michal KWIATKOWSKI",
"Julian ALAPHILIPPE", "Fabio ARU", "Daniel MARTIN", "Nacer BOUHANNI",
"Arnaud DEMARE", "Philippe GILBERT", "Rigoberto URAN", "Sonny COLBRELLI",
"Andre GREIPEL", "Richie PORTE", "Diego ULISSI", "Ilnur ZAKARIN"),
"2019" = c("Peter SAGAN", "Alejandro VALVERDE", "Greg VAN AVERMAET", "Julian ALAPHILIPPE",
"Chris FROOME", "Tom DUMOULIN", "Elia VIVIANI", "Alexander KRISTOFF",
"Michael MATTHEWS", "Thibaut PINOT", "Simon YATES", "Romain BARDET",
"Nairo QUINTANA", "Primoz ROGLIC", "Arnaud DEMARE", "Michal KWIATKOWSKI",
"Geraint THOMAS", "Tim WELLENS", "Jasper STUYVEN", "Sonny COLBRELLI",
"Ion IZAGIRRE", "Miguel Angel LOPEZ", "Philippe GILBERT", "Daniel MARTIN",
"Vincenzo NIBALI"),
"2020" = c("Julian ALAPHILIPPE", "Primoz ROGLIC", "Alejandro VALVERDE",
"Peter SAGAN", "Greg VAN AVERMAET", "Egan BERNAL", "Jakob FUGLSANG", "Elia VIVIANI",
"Alexander KRISTOFF","Matteo TRENTIN",
"Thibaut PINOT", "Pascal ACKERMANN", "Michael MATTHEWS",
"Oliver NAESEN", "Tim WELLENS", "Tom DUMOULIN", "Miguel Angel LOPEZ", "Bauke MOLLEMA",
"Mathieu VAN DER POEL", "Adam YATES"),
"2021" = c("Primoz ROGLIC", "Julian ALAPHILIPPE", "Tadej POGACAR", "Wout VAN AERT",
"Mathieu VAN DER POEL", "Egan BERNAL", "Remco EVENEPOEL", "Peter SAGAN"),
"2022" = c("Tadej POGACAR", "Wout VAN AERT", "Primoz ROGLIC", "Julian ALAPHILIPPE",
"Mathieu VAN DER POEL", "Egan BERNAL", "Sonny COLBRELLI", "Joao ALMEIDA", "Remco EVENEPOEL", "Adam YATES")
)
<- c(`2013` = 93, `2014` = 93, `2015` = 93, `2016` = 93, `2017` = 93,
max_top_league `2018` = 93, `2019` = 93, `2020` = 225, `2021` = 65, `2022` = 63
)
# transform top-league data into a df
map2_vec(top_league, names(top_league),
~list(mutate(tibble(Name = .x), Year = .y, TopLeague = TRUE))) |>
bind_rows() |>
right_join(data_riders, by = c("Name", "Year")) |>
replace_na(list(TopLeague = FALSE)) |>
arrange(Year, desc(Cost)) |>
mutate(Ratio = round(Points/Cost, 2L)) -> data_riders
We now have a single data.frame will all the scores for every rider over the 2013 to 2022 period, and the information about the top ligue and the constraints which apply to it.
Table: scores and costs for every riders for 2013 to 2022
<- function(value) {
topleague_format # dash for false, checkmark for true
if (!value) " \u2013 " else "\u2713"
}
<- function(maxWidth = 55, ...) {
topleague_column colDef(cell = topleague_format, maxWidth = maxWidth, align = "center", class = "cell number", ...)
}
reactable(rename(data_riders, Top = TopLeague),
searchable = TRUE,
columns = list(
Top = topleague_column(),
Name = colDef(width=200)
))
Human performances
Similarly, we can retrieve all the results for any player over the same period. Let’s display the podium for each year.
Bind human data for each year altogether
map(2013:2022, read_team_scores) |>
setNames(2013:2022) |>
bind_rows(.id = "Year") |>
group_by(Year) |>
slice_max(Points, n = 3) |>
mutate(Ratio = round(Points/225, 2L))-> data_teams
We can observe several changes in the performance scores. Several phenomenom are at stake here. Firstly, rules have changed in the way the game was played. There were more riders in the top league category and even if the budget allowed was more important, it had an impact on the way teams were made. Secondly, 2020 was different because of covid. Thirdly, some riders have become easily predictable lastly, whith Tadej Pogacar or Wout Van Aert (and a few others) being extremly dominant. It doesn’t mean that they will necessarily be in the top teams, but they are somehow safe picks as they will win a lot of rewarding races all year long.
Plot for podium, from 2013 to 2022
library(ggplot2)
library(emojifont)
<- data_teams |>
tmp_plt group_by(Year) |>
arrange(desc(Points)) |>
mutate(Rank = fontawesome('fa-trophy'),
color = c("#FFD700", "#C0C0C0", "#CD7F32"),
Year = as.numeric(Year)) |>
ungroup()
ggplot(tmp_plt, aes(x = Year, y = Points, label = Rank, color = color)) +
geom_text(family='fontawesome-webfont', size=12) +
scale_color_manual(values=c("#C0C0C0", "#CD7F32", "#FFD700")) +
scale_x_continuous(breaks = 2013:2022) +
theme(legend.position='none',
plot.title = element_text(size = 30),
plot.subtitle = element_text(size = 20),
axis.text.y = element_text(size = 20),
axis.title.y = element_text(size = 20),
axis.title.x = element_text(size = 20),
axis.text.x = element_text(size = 20)
+
) ggtitle("Points for podium from 2013 to 2022")
Best team: Oracle
Now, let’s try to find what could have been the best team, the one we would have chosen if we were an oracle. One way to solve this problem is by using linear programming. The idea is to represent the problem as a value to be maximized while respecting the different constraints. One way to do it in R is to use the lpSolve package which is an interface to the free and open source solver lpSolve.
We can now specify our LP problem. We want to maximize the value of our team, i.e:
\[ \max \sum_{i=1}^N x_i\cdot\mathrm{Points}_i \]
where \(N\) is the number of riders in the game, \(x_i\) a binary variable (1 or 0) wether rider \(i\) is included in the team or not, and \(\mathrm{Points}_i\) the total of points for rider \(i\) at the end of the season. We have to respect the following contraints :
\[ s.t. \sum_i x_i \leq 35 \]
the team must not exceed 35 riders, and
\[ \sum_i x_i \geq 30 \]
must at least have 30 riders at minimum, and
\[ \sum_i x_i\cdot\mathrm{Cost}_i \leq 225 \]
the total cost must not exceed 225, and
\[ \sum_i x_i\cdot\mathrm{TopLeague}_i\cdot\mathrm{Cost}_i \leq \mathrm{MaxTopLeague} \]
the ‘top league’ cost must not exceed a max value which changes every year. In the above equations, \(Cost_i\) is the cost of the rider \(i\), \(TopLeague_i\) indicates if the rider \(i\) is in the top league category and \(\mathrm{MaxTopLeague}\) is a constant value for every year.
Now, we can specify our LP problem. Let’s solve the problem for the whole period now. Let’s build a maximize_score
function to easily compute the scores for each year.
Function to find the optimal team
#' @param data A df with columns Name, Cost, Points, TopLeague
#' @param max_top_league Maximum budget for top league riders
#' @return A dataframe, with one row corresponding to one rider from the best team.
<- function(data, max_top_league) {
maximize_score # define obj function
# this corresponds to maximizing the sum of points
<- data[["Points"]]
f.obj
<- nrow(data)
N # define constraints
# 1st and 2nd constraints concern all x_i
# 3rd constraints is for cost constraint
# 4th is about cost constraint for top league
<- matrix(c(rep(1, N),
f.con rep(1, N),
"Cost"]],
data[["TopLeague"]]),
data[[nrow = 4, byrow = TRUE)
# Inequality signs for constraints
<- c("<=",
f.dir ">=",
"<=",
"<=")
# Threshold values for constraints (rhs)
<- c(35,
f.rhs 30,
225,
max_top_league)
# Problem definition
<- lp(direction = "max",
linprog
f.obj,
f.con,
f.dir,
f.rhs,binary.vec = 1:length(f.obj))
# We return as a result the dataframe's rows corresponding to
# riders in the best team possible
<- data[as.logical(linprog$solution), ]
result return(list(result))
}
And let’s iterate over the years. Here are all the best teams for the 2013 to 2022 period.
Iteration on 2013 to 2022 period
# we use purrr::map2_vec() to iterate simultaneously on the data
# and on the max value for top league
<- data_riders |> nest(.by = Year)
data_nested <- map2_vec(
results_all_years $data, max_top_league,
data_nested.f = maximize_score) |>
setNames(2013:2022)
<- bind_rows(results_all_years, .id = "Year") |>
df_results rename(Top = TopLeague)
# function for table formatting
<- function(data, year) {
best_team_table reactable(filter(df_results, Year=={{year}}) |> select(-Year),
pagination=FALSE, compact=TRUE,
columns = list(
Top = topleague_column(),
Name = colDef(width=200)
))
}
# we must define tabsets programmaticaly
<- c(
template "## {{year}}\n",
"```{r, echo = FALSE}\n",
"best_team_table(df_results, {{year}})\n",
"```\n",
"\n"
)
<- lapply(
yearly_tables unique(df_results$Year),
function(year) knitr::knit_expand(text = template)
)
Let’s compare how well human did compared to the best teams.
Human vs best team
|>
data_teams group_by(Year) |>
mutate(Rank = c("First", "Second", "Third")) |>
select(Year, Rank, Points) |>
pivot_wider(names_from = Rank, values_from = Points) -> tmp
|>
df_results group_by(Year) |>
summarize(Score = sum(Points)) |>
left_join(tmp, by = "Year") |>
mutate(`Difference with best` = Score - First) |>
reactable()
Conclusion
Whoever plays that kind of game is always full of regrets. Why didn’t I pick this rider instead of this one? Why did I trust him, again? The good news is, there are a lot of possible teams better than the best human player, so hope is still there for you!
Session Info
Toggle
─ Session info ───────────────────────────────────────────────────────────────
setting value
version R version 4.2.2 Patched (2022-11-10 r83330)
os Ubuntu 22.04.1 LTS
system x86_64, linux-gnu
ui X11
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz Europe/Berlin
date 2023-04-07
pandoc 2.18 @ /usr/lib/rstudio-server/bin/quarto/bin/tools/ (via rmarkdown)
quarto 1.0.36 @ /usr/lib/rstudio-server/bin/quarto/bin/quarto
─ Packages ───────────────────────────────────────────────────────────────────
! package * version date (UTC) lib source
P dplyr * 1.1.1 2023-03-22 [?] CRAN (R 4.2.2)
P emojifont * 0.5.5 2021-04-20 [?] CRAN (R 4.2.2)
P ggplot2 * 3.4.1 2023-02-10 [?] CRAN (R 4.2.2)
P lpSolve * 5.6.18 2023-02-01 [?] CRAN (R 4.2.2)
P purrr * 1.0.1 2023-01-10 [?] CRAN (R 4.2.2)
P reactable * 0.4.4 2023-03-12 [?] CRAN (R 4.2.2)
P rvest * 1.0.3 2022-08-19 [?] CRAN (R 4.2.1)
P sessioninfo * 1.2.2 2021-12-06 [?] CRAN (R 4.2.2)
P tidyr * 1.3.0 2023-01-24 [?] CRAN (R 4.2.2)
[1] /tmp/RtmpYfV9cl/renv-library-fe231bbb94
[2] /home/raphael/perso/website/posts/04-2023-Velogames-springclassics/renv/library/R-4.2/x86_64-pc-linux-gnu
[3] /home/raphael/perso/website/posts/04-2023-Velogames-springclassics/renv/sandbox/R-4.2/x86_64-pc-linux-gnu/9a444a72
[4] /tmp/RtmpYfV9cl/renv-sandbox
P ── Loaded and on-disk path mismatch.
──────────────────────────────────────────────────────────────────────────────
Last updated
2023-04-07 10:26:14 CEST
Details
Reuse
Citation
@online{nedellec2023,
author = {Raphaël Nedellec},
title = {What Was the Best {Rivals} Manager Team Possible? {A}
{Knapsack} Problem.},
date = {2023-04-05},
url = {https://raphael-nedellec.github.io/website/posts/03-2023-velogames-knapsack},
langid = {en}
}