A case for in-game time using statistics; recommendations for leaderboards[LONG]
6 years ago
Madison, WI, USA

[big]tl;dr[/big] Math can detect cheaters, load times are inconsistent, people cheated (intentionally or not), RTA is bad. The leaderboards should be in-game time. Dolphin should maybe be legalized.

[section=Introduction] Yeah, I wrote 5 pages about Mario Kart load times, but I spent 3 months on this so what do you expect.

[big]Why[/big] I was curious how easy it would be to add up times for a 32 track run using a spreadsheet and I got very carried away. I’ve already been interested in load times and the in-game timer from previous forum posts on here, and compiling all this data produced some helpful results.

[big]The spreadsheet[/big] [big][big]Here it is <-- This is a link, it's hard to see[/big][/big] This spreadsheet contains all the data I will reference in this post. It consists of data from all of the runs in the individual cup leaderboards that have a full video and a mostly legible on-screen in-game timer. I didn’t do the 16 track and 32 track categories because I’m sane and I’d prefer not to watch even more runs. If anyone wants to help log those times, feel free.

The Suspicious Runs sheet is discussed in detail in the Suspicious runs section.

The All Cup Stats sheet contains a data summary for all the non-play times calculated in the spreadsheet. The median non-play time was used instead of an average to rule out any skew caused by mistimed runs like cookie587’s 51 second mistime. The 1st Quartile is the 25th percentile time for each cup, representing an approximate best-possible non-play time for each console.

The Wii Non-Play and WiiU Non-Play sheets are the Google statistics add-on output for the lists of non-play times at the bottom of the All Cup Stats sheet. Look at these if you love statistics and graphs. These are among the few things I have to update manually, so they may be out of date occasionally.

The X Cup sheets contain the specific data for each run. Runs submitted to two leaderboards (like No Skips and Skips) are listed twice. Runs that were mistimed are listed twice, with both the mistimed time and the corrected time. The Real Time column on these sheets is the time claimed on speedrun.com unless noted as a corrected time. These sheets also make up an unofficial IGT leaderboard if you’re interested in that.

The Deleted Runs sheet will contain reported runs that were removed from the leaderboard and also removed from my X Cup sheet.
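To illustrate why the median holds up better than an average when a mistimed run is in the data, here is a minimal sketch. The non-play times are made-up numbers, not values from the spreadsheet, and the "inclusive" quartile method is my assumption about how a spreadsheet would compute the 1st quartile.

```python
import statistics

# Hypothetical non-play times in seconds (NOT taken from the spreadsheet).
non_play = [112.0, 113.5, 114.0, 114.5, 115.0, 116.0, 117.5]

# Same list plus one run whose claimed time is 51 s too fast,
# which makes its computed non-play time 51 s too small.
with_mistime = non_play + [114.0 - 51.0]


def summarize(times):
    """Return (mean, median, first quartile) for a list of times."""
    # method="inclusive" is an assumption about the quartile convention.
    q1, _median, _q3 = statistics.quantiles(times, n=4, method="inclusive")
    return statistics.mean(times), statistics.median(times), q1


mean_a, median_a, q1_a = summarize(non_play)
mean_b, median_b, q1_b = summarize(with_mistime)

print(f"mean shift:   {mean_a - mean_b:.2f} s")
print(f"median shift: {median_a - median_b:.2f} s")
```

With these made-up numbers, the one bad run drags the mean down by several seconds while the median barely moves.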

[big]Terms[/big]

  • IGT – In-game time. Timing a speedrun using Mario Kart’s on-screen timer. You time a whole run by adding up the on-screen times for each track.
  • RTA – Real time attack. The method of timing a run using an external timer (or video timecode) without pausing the timer. This is the method used by most speedrun leaderboards and is currently used on the MKWii leaderboard.
  • Claimed time – The time the runner submitted to speedrun.com. Note this doesn’t always match the actual time of the run.
  • Non-play time – The time you get when you subtract the in-game time from the real time of the run. This consists of load screens, cutscenes, and menus; the parts where you’re not really playing and just mashing buttons.
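The two calculations in these definitions (adding up track times for IGT, then subtracting IGT from real time) can be sketched like this. The function names and the example times are hypothetical, and I'm assuming times are written as M:SS.mmm.

```python
def to_ms(text):
    """Convert 'M:SS.mmm' (or 'SS.mmm') to integer milliseconds."""
    minutes, _, seconds = text.rpartition(":")
    whole, _, frac = seconds.partition(".")
    millis = int(frac.ljust(3, "0")[:3])  # pad or trim to 3 digits
    return (int(minutes or 0) * 60 + int(whole)) * 1000 + millis


def format_ms(ms):
    """Format integer milliseconds back to 'M:SS.mmm'."""
    minutes, rem = divmod(ms, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{minutes}:{seconds:02d}.{millis:03d}"


def in_game_time(track_times):
    """IGT: the sum of the on-screen times for each track."""
    return sum(to_ms(t) for t in track_times)


def non_play_time(real_time, track_times):
    """Non-play time: real time minus in-game time."""
    return to_ms(real_time) - in_game_time(track_times)


# Hypothetical 4-track cup run (made-up times).
tracks = ["1:25.310", "1:58.442", "1:31.207", "1:12.865"]
igt = in_game_time(tracks)
non_play = non_play_time("8:05.000", tracks)

print(format_ms(igt))       # 6:07.824
print(format_ms(non_play))  # 1:57.176
```

Working in integer milliseconds avoids floating-point rounding when many track times are added together.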

[section=Where the non-play time can fall] The non-play time for each run can fall into 3 categories:

  • Outliers greater than the median
  • Around the median
  • Outliers less than the median

Runs with non-play time greater than the median can result from 3 things (in order of likelihood):

  • Slow load times
  • Not mashing well through menus and cutscenes
  • An actual time lower than the claimed time (claiming a time slower than the run actually was)

These runs are pretty uninteresting, but are helpful for the runners that fall into this category to understand why a similar (or even faster) run may result in a slower time.

Runs with a non-play time around the median are regular data and are important to help find outliers.

The most interesting runs are ones with non-play time less than the median. They can result from 2 things:

  • Fast load times (due to alternate forms of loading such as Dolphin or a USB/SD loader)
  • An actual time higher than the claimed time (claiming a time faster than the run actually was)

This category is great because with this data, we’re able to consistently find what some would call ‘cheaters’ by just inputting a few numbers. Not all the players in this category are cheaters. They’re probably the minority. I believe most of the ‘cheated’ runs here are not intentionally malicious; many of the runs are here due to a timing error or maybe a misunderstanding. Detailed coverage of these runs can be found under Suspicious runs.

[section=Suspicious runs] The first tab on the MKWii IGT spreadsheet is a list of all of the suspicious runs found from my data. Most of these runs were found because the non-play time was an outlier. I have a rule on the spreadsheet that highlights any time that falls more than 1.5*IQR below the 1st quartile for each console. This is a standard statistical method for finding outliers. Here I’ll cover a few of the runners and how I reached my conclusions.
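As a rough sketch of that kind of outlier rule, here is the usual lower-fence check (1st quartile minus 1.5*IQR) applied to one console's non-play times. The run names and times are hypothetical, the quartile method is an assumption, and the spreadsheet's actual formula may differ.

```python
import statistics


def low_outliers(non_play_by_run, k=1.5):
    """Return (fence, flagged) where flagged holds runs below Q1 - k*IQR."""
    values = sorted(non_play_by_run.values())
    # method="inclusive" is an assumption about the quartile convention.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    fence = q1 - k * (q3 - q1)
    flagged = {run: t for run, t in non_play_by_run.items() if t < fence}
    return fence, flagged


# Hypothetical non-play times in seconds for one console (made-up data).
wii_runs = {
    "run_a": 118.0, "run_b": 117.2, "run_c": 119.1, "run_d": 116.8,
    "run_e": 118.4, "run_f": 117.9, "run_g": 108.9, "run_h": 118.9,
}

fence, flagged = low_outliers(wii_runs)
print(f"fence: {fence:.2f} s, flagged: {sorted(flagged)}")
```

Only the low side is checked here, since slow non-play times are the uninteresting case. Each console would get its own call, because Wii and Wii U load times differ.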

[big]RobBobTheCornCob[/big] Mushroom Cup/Skips/Balanced is the fourth most popular category on the MKWii leaderboards and this guy got WR by just lying about his claimed time. This is one of the most egregious examples of the poor moderation on these leaderboards. In the submission comment, RobBob says, [quote]For some reason the Roxio extends black screens and these also mess up the audio. So,the audio is annoying in this whole video and I used Insane's time to confirm the time of this run.[/quote] What I’m concluding from this comment is that he used his capture card as an excuse to just copy Insane’s time. The non-play time for this run is 7 seconds faster than the 1st quartile for Mushroom Cup Wii U times. Timing by the video it’s 7:22.126 which puts his run within a second of the median. To be clear, the run is great and deserves 2nd place, but I’m not sure how this got by the mods. I even reported this in the Discord in October and nothing was done about it. Please just watch the videos.

[big]Gabbbii633[/big] Gabbbii’s runs are an example of what I might consider intentional cheating. Gabbbii has 4 verified runs that are pretty obvious cheats just from watching the video. In the 3 (actually two, one is just submitted under two categories) Lightning Cup runs, Gabbbii shows a phone timer at the start and after the run. The times on the phone are wildly different from the times you get using the video. Even comparing the amounts cut off by what I’m assuming is stopping the timer early offscreen yields different amounts, which rules out a ‘slow’ timer. In Gabbbii’s 7:10 Shell Cup run, only the last two tracks are recorded on video, but with that I’m still able to conclude that the time is probably false. Getting the claimed time would require WR level in-game times on tracks 1 & 2. This seems too unreasonable, especially considering the skill level and other suspicious times submitted by the player. I wouldn’t expect the mods to catch this, especially when the submitted times are outside of the range that requires a video. I don’t want to punish someone for submitting a run with a recording (which is uncommon on this leaderboard), so I’d recommend just editing the time to be proper.

[big]Menozen[/big] Menozen has two runs (including one WR) in Mushroom Cup that have outlier non-play times. The 8:05 WR No Skips/Strategic run has already been discussed a bit here. This run does not suffer from any timing errors, and it’s evident that the load times are faster. An interesting fact about this run is that timing by IGT, Francesco’s run is 8 seconds faster. Timing by real time, Menozen’s run is 9 seconds faster. That means Francesco loses 17 seconds to Menozen in just load times. This is exaggerated by the fact that Francesco has the slowest Wii I’ve documented in my data. The No Skips/No Items run also features people getting screwed by RTA timing. Timing by IGT, Spoofy’s 8:08 run and Spec’s 8:13 run are both faster than Menozen’s 8:06 run. These 3 people are all playing better runs and yet are still slower than Menozen due to RTA.

So how are Menozen’s load times so fast? The possibilities are either Dolphin or a USB loader, so how can one tell which? Recently I figured out that each console has a unique audio signature visible in a spectrogram. Luckily, Menozen included audio in his run so we can check it using that. In the following image, I show samples of the audio from runs done on various consoles. https://i.imgur.com/qBZ305C.png

You can see that the Wii has a distinct noise during the load times, which is due to the analog audio output the Wii is limited to. The Wii U has a different audio signature with less low frequency sound going into the load and no noise, due to the nature of recording digitally through HDMI audio. Dolphin also uses a purely digital route, but shows an audio signature unique to itself, featuring strong low frequencies and a light noise elsewhere going into the load. Just by looking at the image, it’s clear that Menozen’s run matches up with Dolphin’s audio signature. The image also shows how much faster his load times are compared to Wii, Wii U, and even the Dolphin build I used.

[big]TheSeineTV[/big] Similar to Menozen, TheSeineTV has two runs that have fast load times. The 8:11 Mushroom Cup Skips/No Items run has already been discussed a bit here. An interesting fact to point out about TheSeineTV is that they have a point-a-camera-at-your-TV Shell Cup run, with the Wii in frame, that has legit load times, but the game is in German. All the suspected Dolphin fast load runs are in English, not to mention the high resolution and anti-aliasing. Just by that, it’s clear the suspected runs are Dolphin. TheSeineTV’s Mushroom Cup run also features Spec’s 8:13 getting screwed by RTA timing due to fast loads. I’ll also note that TheSeineTV has a Retro Tracks run that is in English and high resolution that I’d suspect is Dolphin.

[big]Insane[/big] Insane, the king of individual cup MKWii speedruns. He seems to have first place in nearly every category. Unfortunately, all of the runs done on his Wii have load times 9 seconds faster than the median time for a Wii. Insane switched to Wii U recently, and his load times are perfectly normal on there, but his Wii load times are well…insane! I asked Insane about his load times in a YouTube comment: https://i.imgur.com/MANRGiW.png He claims he left his Wii on as long as possible to improve load times, which ultimately led to its demise. I have never heard of this method to improve load times, and research turned up nothing, so I’m still skeptical. My guess is he was just using a USB loader. Regardless of his load times, his IGT is still really impressive. Unfortunately, he gets a 9 second head start compared to others attempting to beat his times on Wii. In my opinion, this is one of the strongest reasons to switch the leaderboard to IGT. A few of these times are unbeatable on console with a disc using RTA. Some of them have already been broken in IGT, but place 2nd on the leaderboards due to Insane’s load times. There’s no way for Insane to prove he had a magical Wii, and he deserves no punishment. His runs are super impressive and deserve a spot on the leaderboard. Just give other players a chance and switch to IGT.

[big]Other suspicious runs[/big] Most of the other suspicious runs are too low on the leaderboard to discuss individually, and many of them are just mistimed. I still don’t understand how these get past the mods; all you have to do is watch the beginning and end of the video and do some subtraction. It takes less than 30 seconds. cookie587 claimed a run 51 seconds faster than it actually was and no one noticed! Some of these ‘other’ category suspicious runs have faster load times, but are low enough on the leaderboard that it’s not worth my time to figure out exactly why.
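The "watch the beginning and end and do some subtraction" check can be sketched as below. The timecodes and claimed time are made up (chosen to show a 51 second gap like the one mentioned above), and exactly which video frames count as the start and end of a run is an assumption left to the leaderboard rules.

```python
def timecode_to_ms(text):
    """Convert 'H:MM:SS.mmm', 'M:SS.mmm' or 'SS.mmm' to integer milliseconds."""
    *units, last = text.split(":")
    whole, _, frac = last.partition(".")
    total_seconds = 0
    for unit in units:  # hours and/or minutes, most significant first
        total_seconds = total_seconds * 60 + int(unit)
    total_seconds = total_seconds * 60 + int(whole)
    return total_seconds * 1000 + int(frac.ljust(3, "0")[:3])


def retime(video_start, video_end):
    """Real time of the run measured from the video timecodes."""
    return timecode_to_ms(video_end) - timecode_to_ms(video_start)


def claim_error_ms(claimed, video_start, video_end):
    """Positive result: the claimed time is faster than the video shows."""
    return retime(video_start, video_end) - timecode_to_ms(claimed)


# Hypothetical submission (made-up numbers).
error = claim_error_ms("7:05.500", video_start="0:14.200", video_end="8:10.700")
print(f"claimed time is off by {error / 1000:.1f} s")  # 51.0 s
```

A verifier could flag anything where the error exceeds some tolerance, say a second, and then correct the time rather than reject the run.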

[big]Conclusion[/big] Don’t ban these people for claiming a faster time or having faster loads. For the mistimed runs, just edit the run time to be correct. For the fast load time runs, it gets trickier. The obvious answer is to switch the leaderboards to in-game time, but you still must question if we then start allowing USB loaders and/or Dolphin. I think USB loaders are fine if we switch to IGT, and I’ll discuss Dolphin in the Dolphin legality section.

[section=Dolphin legality] To be honest, I’m still unsure about whether or not Dolphin should be allowed. Pros:

  • Easier recording of runs, therefore better proof standards and better capture quality
  • Better accessibility for people without a console and capture card

Cons:

  • Easier splicing of runs using savestates. Submitting a Dolphin movie (.dtm) file alongside the run to check for rerecord counts would NOT work due to the ability to easily hex edit the rerecord count.
  • Possible slowdown/stuttering: Dolphin could run the game below full speed, giving players an advantage when timing with IGT.
  • Ease of locking an RNG variable to have consistent RNG throughout the run.

On the topic of Gecko codes, cheating is no easier on Dolphin than console. Cheats like a speed modifier can easily be done on console with homebrew, as well as on Dolphin without changing the code.

The first two cons could be resolved by requiring that the Dolphin frame be shown in the recording. This ensures we are not watching a movie playback, and that the game is running at or near 100% speed. The values shown here usually vary, so just superimposing a static image of a Dolphin window over a movie playback could be detected.

Locking the RNG variable in the RAM is maybe possible, but I haven’t figured out how to do that. I’m also still unsure how much that would help in a run. Memory access is always a threat with emulators and there’s really no way to prevent it. I’d also imagine it’s possible to do the same with a Gecko code on console.

[section=Next steps]

  1. Update proof standards. Either require video or update the video requirement standards for the individual cup runs.
  2. Switch to in-game time to resolve unfairness in load times. This brings up the issue of runs without videos. I think it would be okay to subtract the median non-play time for the appropriate console. Most of these runs are not near the top of the leaderboard, so 100% accuracy is not needed. For the 16 track and 32 track leaderboards, it’s going to take more work, but it’s doable. We should still require/encourage a real-time field when submitting a run to encourage players not to take long breaks between tracks to ensure the spirit of a speedrun.
  3. Remove or edit suspicious runs appropriately.
  4. Legalize Dolphin and require video with Dolphin window in view
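For step 2, converting a run without video could look roughly like this: subtract the median non-play time for the matching console from the claimed real time. The sample values and names are hypothetical, and I'm assuming the samples would come from video runs of the same cup.

```python
import statistics

# Hypothetical non-play samples (seconds) from video runs of one cup,
# grouped by console. Made-up numbers, not spreadsheet values.
non_play_samples = {
    "Wii": [118.0, 117.2, 119.1, 116.8, 118.4],
    "Wii U": [110.3, 111.0, 109.8, 110.6],
}


def estimate_igt(claimed_real_time, console, samples):
    """Estimate IGT as claimed real time minus that console's median non-play time."""
    return claimed_real_time - statistics.median(samples[console])


# A no-video run claimed at 8:20 (500 s) real time.
wii_estimate = estimate_igt(500.0, "Wii", non_play_samples)
wii_u_estimate = estimate_igt(500.0, "Wii U", non_play_samples)
print(f"Wii: {wii_estimate:.2f} s, Wii U: {wii_u_estimate:.2f} s")
```

This is only an estimate, so it makes sense to mark converted runs as estimated and keep the original claimed time alongside them.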
Edited by the author 6 years ago
Madtaz64, Alayan and 11 others like this
Virginia, USA

The changes you present make it way too easy to splice, TAS, or otherwise cheat runs, especially on Dolphin. Dolphin is too easy to exploit, and it would be hard to tell a cheated run from a legit one.

Spain

Lafungo told me I could be a mod if you guys want a lot of changes that would really help improve the quality of the site. If Jcool contacts Lafungo, I will be added so I can change this. I have no problem with retiming the runs with video, but it has to be widely accepted, so just try to convince Jcool to contact Lafungo to add me. I am his friend.

Madison, WI, USA

@AmayaMKWii

In the Dolphin section I mentioned: [quote]The first two cons could be resolved by requiring that the Dolphin frame be shown in the recording.[/quote]

I forgot to include any screenshots, so here's some examples of why this would prevent splicing via TAS movie playback. When playing back a TAS, the Dolphin title bar includes the Input and VI fields which aren't shown during normal playback.

TAS playback: https://i.imgur.com/U49MUZy.png

Normal play: https://i.imgur.com/6Q3b9qE.png

As for Gecko codes, Dolphin is no different from the Wii and Wii U for cheat enforcement.

Edited by the author 6 years ago
Spain

I will tell Lafungo about your idea @charlocharlie and hopefully I will change them. I have no problem with retiming runs, and this is actually a really good idea. If you want to become a mod and help me change these things, just tell me.

Madison, WI, USA

@Javi17 I'd love to help switch over.

As an update, jcool edited the times of many of the mistimed runs and rejected the runs with fast load times. I updated the spreadsheet to reflect the changes. I don't entirely agree with removing these runs (the non-Dolphin ones) because they are legit runs. Hopefully we can switch to IGT soon and get these runs back on the leaderboards.

Francesco and Tigercat5 like this
Spain

Nice, @charlocharlie I will tell Lafungo about your idea on Monday (When I get Dem PC) and hopefully he'll do something about this. It's also good that Jcool retimed some runs.

Michigan, USA

Sorry to kinda- sorta necro this post, but I just realized I was mentioned in the original post. Yeah my wii load times are ridiculous. I've calculated that on my old wii my 32 track skips balanced run would be second place if load times were equal. Just another tidbit I thought I'd add to try to help the cause to switch to IGT. I'm all for it.

charlocharlie likes this
Spain

Sorry guys for late post, I told Lafungo about this but he didn't give me an answer

United States

In my un-humble opinion, the fact that all of this overwhelming evidence for the case of IGT being better than RTA (which has been shown for months) exists, yet IGT still isn't implemented, just shows how much of a shit show the current mods are. Maybe JC does a few things here and there, but I think it is well obvious a shift in power is going to be necessary considering that the people who have done the most for the community have no power, yet those who couldn't give a shit less do.

On another note, I will HAPPILY go through every goddamn time on this site and convert them all to IGT if that's what it takes - because it pisses me off that I can't even run my favorite game due to the way we regulate this.

tl;dr: Mod CharloCharlie, for fucks sake, and something will finally get done in this shit hole.

Edited by the author 6 years ago
mariorules64, akori and 2 others like this
United States

In response to the general sentiments in this thread, as opposed to any specific post, I bumped Jcool up to super mod in hopes he is able to get a couple more people involved as opposed to site staff choosing sides.

EDIT/UPDATE: After some discussion I think the plan is to have someone else handle this due to having more interest.

Edited by the author 6 years ago
Javi17 and charlocharlie like this
Utah, USA

@kirkq Can you point me to where this discussion occurred? We were in the process of adding moderators/verifiers when this change was reverted, so an update on what the current plan is would be greatly appreciated so we can get things moving forward.

AmayaMKWii and Francesco like this
Virginia, USA

in regards to the original post, i'm slowly but surely liking the idea of igt more, but im still very hesitant about dolphin since it's not the same hardware and traditionally the community has never accepted unofficial emulators.

Spec87 and Frikkinfriks14 like this
Wisconsin, USA

I don't think the suggestions are dead, due to charlocharlie's recent promotion.

AmayaMKWii likes this
England

I like the idea of IGT. But for people without video, would we write down our times from each course and add them up?

Florida, USA

okay so after submitting a run that got rejected for being SD loaded (i knew emulators are banned but i thought sd loading was okay, my bad) if we do switch to IGT which its pretty obvious we should, will sd/usb loading be accepted?

Madison, WI, USA

@Legoman Thanks for being honest. I really appreciate it.

On the topic of USB/SD loading being allowed in IGT: I'd imagine so. There may be some backlash about Gecko codes and such, but I doubt it would be enough to prevent it.

Virginia, USA

For USB/SD, we could probably require people who use that method to stream their startup or include it in their submission, to prevent discreet speed hacks from being passed off as legit.

LukeSaward, Spec87, and Francesco like this