There has been a lot of great discussion and feedback on the FF6WC Discord about the scoring system of Ultros League. Given the level of interest, I’d like to take some time to walk through the current scoring system, why the league is scored the way it is, and some of the alternatives that were considered during the league’s development. For those with an interest in the how and why of Ultros League, it might provide some interesting insight into the thought process behind some of the mechanics of the league.
“All scoring systems are bad…”
Before I dive in much further, I want to start off by saying that I don’t think there’s a perfect scoring system. Anyone who works in data science, statistics, or other analytics-based fields is likely familiar with the aphorism, ‘all models are wrong, but some are useful.’ I believe this holds true for building a scoring system for a league like Ultros League as well.
Ultros League is not alone in this struggle. Formula One (from which UL’s scoring borrows much of its DNA, as we’ll get into later) has adjusted its scoring system and points breakdown dozens of times over the last 50 years. NASCAR has iterated on its approach several times. The NHL has never seemed to land on a perfect points approach to handle ties, MLB shuffles up its playoff structure on a seemingly annual basis, fantasy sports are perpetually evolving and refining, and so on.
Standings, scores, even league structure can be sliced up and structured in a thousand different ways, and all of those ways carry benefits and drawbacks. Some of these are obvious, while others only start to become evident after a lot of testing, experience, and investigation. Ultimately, any scoring schema is going to involve compromises in one area to benefit another, and personal preferences will always have an impact on how people feel about it.
Long story short, we’re trying our best, but no scoring system will ever be perfect.
Stay on target
Ultros League’s main goal is to provide a “light competitive” experience for both newcomers and veteran WC players. While we don’t often emphasize the “light” portion of that, it’s important to reiterate here. While the results definitely matter, and folks get invested in their performance (including me), this is still an async race league.
Ultros League is also set up to reward players for winning races. Rather than just measuring who goes fastest, one of UL’s design goals is to capture who is best at outperforming similarly skilled racers on a variety of seeds using the same flagset, under the pressure of an async race environment. As Keith Collantine writes here:
To me, the most important thing [a points system] should do is give the title of ‘champion’ to the most deserving driver. The definition of ‘champion’ is a person who “has defeated or surpassed all rivals in competition”. Therefore I believe it’s wrong to give the title to anyone other than the driver who wins the most races.
https://www.racefans.net/2014/11/17/design-points-system-formula-one/
Finally, Ultros League is more than just a winner-take-all competition. In each division, there are actually two to three things going on: the competition to win the division and get in the Airship of Fame, the competition to get promoted (Megalixir excluded), and the competition to avoid being demoted (Tonic excluded). The more of these that are in play for more players at a time, the better.
So, given that basic league design philosophy, we have some basic touchpoints for scoring:
- We want a scoring system that offers a fair competitive landscape while maintaining casual appeal
- We want a scoring system that incentivizes and rewards winning races against the other racers
- We want a scoring system that ideally keeps the whole division engaged in one or more of fighting to win/get promoted/avoid demotion throughout a season
- We want a scoring system that is as simple and easy-to-understand as possible while achieving the above goals
Ultros League is inherently different from an MMR- or Elo-based ladder, which (in my opinion) has different fundamental design goals. Ladders likely do a better (or more accurate) job of establishing a top-to-bottom ranking of who’s better than who. However, they can struggle when the sample size of matches is low (as it generally tends to be in Worlds Collide), and the rankings can tend towards stagnation. Ultros League specifically tries to solve for this with promotion/demotion, a somewhat brute-force mechanism that provides a clear means of moving up/down the hierarchy and (when paired with the divisional structure) exposes WC players to more of their peers in the hopes of building community bonds, friendships, friendly rivalries, and so forth.
Standing on the shoulders of racing plumbers
Given the above goals, my original touchstone for looking at a scoring system was the lightly competitive but casual friendly favourite, Mario Kart. GP mode playthroughs in Mario Kart in most releases stay interesting throughout the 4 races because, thanks to the points system offering a premium reward for winning, leads are not always 100% safe and the top few players/CPU are often at some level of risk of being kicked out of the top spot (and losing the honour of getting a trophy spit out at them by a giant fish).
From Mario Kart, I moved pretty quickly to the original source material: Formula One. I quickly learned that Formula One has gone through a lot of points systems (to varying extremes of fan satisfaction). However, I noticed that F1 maintained a relatively unchanged system for nearly 30 years, from 1962 through to 1990:
| Finish | Points |
| --- | --- |
| 1st | 9 |
| 2nd | 6 |
| 3rd | 4 |
| 4th | 3 |
| 5th | 2 |
| 6th | 1 |
| 7th+ | 0 |
Look familiar? This is basically where UL landed, with one modification: any racer who finishes the seed in UL earns at least 1 point, even outside the top six, offering an incentive against outright forfeiting races.
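For anyone who prefers to see it spelled out, here is a minimal Python sketch of that lookup. The names `UL_POINTS` and `ul_points` are made up for illustration, a forfeit is assumed to score 0 (inferred from the forfeit incentive above), and ties are not handled.

```python
# Points for 1st through 6th place, as described above.
UL_POINTS = (9, 6, 4, 3, 2, 1)


def ul_points(place: int, finished: bool = True) -> int:
    """Return the points for a single race result (illustrative helper)."""
    if not finished:
        # Assumption: a forfeit earns nothing.
        return 0
    if place < 1:
        raise ValueError("place must be 1 or higher")
    if place <= len(UL_POINTS):
        return UL_POINTS[place - 1]
    # Anyone who finishes outside the top six still gets 1 point.
    return 1
```

So a 7th-place finisher would still come away with 1 point, while a racer who forfeits would get none under this assumption.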
From this point, Formula One started tinkering with their points, a journey which is detailed in this excellent post from 2014: https://www.racefans.net/2014/11/17/design-points-system-formula-one/. Later permutations expanded the gap between 1st and lower places, and then the numbers start getting really big (which starts to stray away from one of our design goals of a simple, easy-to-understand system).
Another good discussion on the current F1 scoring system that I drew from is here: https://www.reddit.com/r/formula1/comments/p3n56w/the_great_bore_is_here_lets_talk_f1_points_system/. Near the end of the OP’s post, they identify a couple of key elements that I think are important for UL’s scoring as well:
- Point distribution should incentivice[sic] battles in the midfield
- Drivers should be in the title-fight for as long as possible
Why this works
While I did a bunch of Excel tinkering with the points spread before the preseason, ultimately I felt that the 9/6/4/3/2/1 spread hit a sweet spot for a few reasons.
Simplicity
You can basically describe the scoring approach of Ultros League with one sentence and a few numbers. There are certainly more precise or complex solutions out there, but more complexity risks introducing complication, and it becomes a lot less satisfying for race results to go into a black box and come out with what can seemingly be an arbitrary outcome.
Volatility
With the mechanics of promotion and demotion, UL is designed to be fluid and dynamic from season to season. The scoring system reflects that on a micro-level within each division. With the large points gap between first- and last-place finishes, leads are rarely totally safe, and players at the bottom are rarely completely eliminated early. Often, the entire division stays relevant for the entire season, making every race matter. Reducing the potential to clinch early also encourages front runners to keep competing, rather than clinch a win and then disappear.
Catchup Mechanics
Thanks to the premium points reward earned for a first-place finish, racers are rarely ever “out of it” in Ultros League. A single win at any point in the season can be enough to stave off demotion, squeeze into promotion, or leapfrog into first place. Remember: your division-mates are intended to be relatively comparable to you in terms of skill, so winning a race against them is no easy feat (and a rich reward is well-deserved).
On narratives, underdogs, and the thrill of uncertainty
“What percentage of the time should a better player beat a worse player?” While the knee-jerk answer is going to be “always,” this is a surprisingly tricky question to answer. In practice, if the number is too low, players become disengaged and feel that winning/losing is purely a result of chance. If the number is too high, competition stagnates as outcomes feel predetermined and as improving players struggle to see gains.
This is a recurring theme in sports. Baseball provides some good examples here, where the strongest teams in the league typically have a 60 to 65% winrate (and the best teams of all time topped out at 76%). This arguably means that the “worse” team wins roughly a third of the time.
In theory, this sounds broken, but in practice, this really gets us to the root of the excitement of sports. Underdog stories of a weak team beating the odds and winning it all are the stuff dreams (and sports movies) are made of. Moneyball is a classic story of a team that is slowly and cleverly built to become one of the strongest in the league, which amasses a huge winning streak and then fails to advance past the first round of the playoffs.
The uncertainty factor of these games gives us adages like football’s “any given Sunday,” and Ultros League leans into this. The best player doesn’t always win every race or every season. Any given race is an opportunity for a top-tier racer to have an untimely poor performance and an under-performer to have that one perfect race. With the current scoring system, that underdog win is, as often as possible, a meaningful and impactful event. Again, your division-mates are intended to be challenging opponents (for your skill level), and beating them is an achievement that deserves to be richly rewarded.
Looked at another way, the points system of Ultros League purposely blurs the line between pure player skill and the random variance of seed and race outcomes from week to week. On the whole, the best players will often win their races and generally win their divisions, but the points system intentionally muddies the waters there a little bit. We still take the results from 6 races to aim to reward players who perform consistently in races versus players who generally only succeed under narrow circumstances or specific/ideal seed conditions, but there’s enough wiggle room here that being in a division with an established top-level player doesn’t mean your season is over on week one.
Alternative approaches
Early on, I did look at a few other scoring systems. Here are my thoughts on them.
Incremental scoring
Instead of 9/6/4/3/2/1, this would be something like 6/5/4/3/2/1. It’s simple and it’s even easier to track and remember than Ebot’s Rock coral! However, it doesn’t succeed as well on our goal to incentivize and reward winning. In this scheme, players who consistently get 2nd and 3rd have a very good chance of being promoted despite never winning a single race. Without question, these are good and consistent WC players, but again, the idea of Ultros League is to reward winning races. If the goal is to purely identify the best and most consistent performer, the better approach would probably just be…
Using total/average time
Sure, we could just run eight races and have everybody take their best six times. It’s a good sample size, and you’ll probably end up with a pretty good top-to-bottom ranking of players. But is it very exciting? It lends itself to insurmountable leads, and it almost entirely kills the fun of weekly competition. On a week-to-week basis, there’s far less excitement here in watching your division-mates and seeing how they do. Finally, I think it’s just kind of dull.
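To make the idea concrete, here is a rough Python sketch of a "best six of eight" total. The function name `best_six_total` and the sample times (in seconds) are invented for illustration; this was never an actual league rule.

```python
def best_six_total(times_in_seconds: list[int], keep: int = 6) -> int:
    """Sum a racer's fastest `keep` times and ignore the rest (hypothetical)."""
    if len(times_in_seconds) < keep:
        raise ValueError("not enough completed races")
    # Sort ascending so the fastest times come first, then keep the best ones.
    return sum(sorted(times_in_seconds)[:keep])


# Eight made-up race times for one racer, in seconds.
sample_times = [5400, 5100, 6000, 4980, 5250, 7200, 5050, 5300]
season_time = best_six_total(sample_times)
```

Here the two slowest races (6000 and 7200 seconds) are dropped, and the racer with the lowest remaining total would top the standings.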
“Golf”
I don’t know if this is the most accurate way to describe this concept, but essentially this would be giving the fastest racer each week a 0 score, and then having everyone else add the difference between their time and the winner’s time to their score. I think it’s a bit reductive since in a sense you can feel like you’re racing against just one person each week, specifically the fastest person. You can modify this to have a points-system as well, but this is also not a super intuitive scoring approach to begin with, and further levels of complexity start pushing this one into complicated territory.
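As a sketch of the basic version (before any points are layered on top), the running score could be computed like this in Python. The function name `golf_scores`, the racer names, and the times are all made up, and the sketch assumes every racer finishes every week.

```python
def golf_scores(weekly_times: list[dict[str, int]]) -> dict[str, int]:
    """Add up each racer's weekly gap (in seconds) to that week's fastest time."""
    totals: dict[str, int] = {}
    for week in weekly_times:
        fastest = min(week.values())
        for racer, seconds in week.items():
            # The week's winner adds 0; everyone else adds their deficit.
            totals[racer] = totals.get(racer, 0) + (seconds - fastest)
    return totals


# Two made-up weeks of results, times in seconds.
weeks = [
    {"Terra": 5000, "Locke": 5120, "Edgar": 5300},
    {"Terra": 5400, "Locke": 5250, "Edgar": 5260},
]
standings = golf_scores(weeks)
```

As in golf, the lowest total leads. In this example Locke (120) edges out Terra (150) despite each winning one week, because only the gap to the fastest racer counts.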
The current system, but compress the scores
Probably the alternative I dislike the least, but it’s hard to compress the numbers much more. Dropping first place from 9 to 8 points (to 8/6/4/3/2/1) still gives a win some weight, but it’s a bit less, and you might be more likely to have the top few players trading places with less chance for outsiders to break in. If there are any changes made, this is the most likely, but I’m not sure that I’m convinced the benefits outweigh the negatives (and I think the answer is probably pretty subjective). Something like 7/5/4/3/2/1 gives a very small bonus to first place, but this is essentially the incremental approach described above at that point and we lose a lot of the “any given Sunday” excitement I mentioned earlier.
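To put rough numbers on the spreads discussed above, here is a small Python comparison using two invented racers over six races: a "steady" racer who alternates 2nd and 3rd, and a "streaky" racer with three wins and three weaker finishes. The racers, the names `SPREADS` and `season_total`, and the choice to apply the 1-point finisher minimum to every spread are all assumptions for illustration.

```python
# Candidate spreads: points for 1st through 6th place.
SPREADS = {
    "9/6/4/3/2/1": (9, 6, 4, 3, 2, 1),
    "8/6/4/3/2/1": (8, 6, 4, 3, 2, 1),
    "7/5/4/3/2/1": (7, 5, 4, 3, 2, 1),
    "6/5/4/3/2/1": (6, 5, 4, 3, 2, 1),
}


def season_total(places: list[int], spread: tuple[int, ...]) -> int:
    """Total points for a list of finishing places under one spread."""
    # Assumption: finishers outside the listed places still earn 1 point.
    return sum(spread[p - 1] if p <= len(spread) else 1 for p in places)


# Two hypothetical six-race seasons (finishing places).
steady = [2, 3, 2, 3, 2, 3]
streaky = [1, 5, 1, 6, 1, 4]

# Map each spread to (steady total, streaky total).
results = {
    name: (season_total(steady, spread), season_total(streaky, spread))
    for name, spread in SPREADS.items()
}
```

In this one made-up case, the streaky racer comes out ahead 33 to 30 under 9/6/4/3/2/1, the two are tied under both 8/6/4/3/2/1 (30 each) and 7/5/4/3/2/1 (27 each), and the steady racer leads 27 to 24 under 6/5/4/3/2/1. A single example proves nothing on its own, but it does show how quickly the reward for winning shrinks as the spread is compressed.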
Counting wins
This is the proposed F1 solution from this page that I keep linking (you opened it in another tab but never actually read it, I bet). I actually don’t mind this either, as it mostly hits our design goals. My biggest complaint against it is that it’s maybe a little less “fun” than the Mario Kart/F1 approach. Accruing points is fun, and I really do like the level of uncertainty that the stepped points system introduces. I’m also not sure if this system is quite as intuitive: it makes sense with a little explanation, but you can’t scrawl it on the back of a napkin quite as easily as 9/6/4/3/2/1.
Getting to the point(s)
Ultimately, there’s no perfect answer or solution to this, and a lot of the design decisions around the scoring system are linked to what Ultros League itself strives to be (and not be). Subjectivity and personal preference are always going to be factors. Again, all scoring systems are inherently flawed in some manner, but a lot of time and thought was put into how this would be approached for Ultros League, and my hope is that we’ve been able to present a scoring system that is as minimally flawed as possible (or at least equally moderately flawed for everyone’s tastes).
No single event or structure will ever scratch every itch, either. Some events will be more competitive or scored/structured for more precision, and some will be less. Ultros League isn’t niche by design, and is in fact designed to appeal as broadly as possible to as many WC players as possible. The beauty of the diverse and engaged Worlds Collide community is that even if Ultros League is not checking enough of your boxes, there are so many other great events and activities out there that might satisfy your interests better, and if there isn’t one yet, then you can always start one of your own!
All that said, league participant feedback is extremely valuable. I’ve really enjoyed the discussions on this topic and look forward to having more great conversations on making Ultros League the most satisfying and fun WC activity that it can be. So please keep playing and making Ultros League the exciting and fun experience that you’ve all made it, keep sharing your insights and experiences in the #ultros-league channel on the Discord, and above all else, please don’t tease the Octopus!