Comments
thread: The Site
SgtKabukiman (Norway), 5 years ago

Moin everyone,

most of you reading this will probably have noticed the sudden performance problems and inconsistency bugs throughout the site. Let me just give a little bit of background as to what happened during the last two days.

We have been struggling to keep up with the growth of the speedrunning community for a while now and are constantly looking for ways to improve the site's performance. Right now the site is a big monolith, running on a single, beefy server. To reduce that server's load, we rented a second dedicated machine just to host the database. On Saturday we made the switch and moved MariaDB over to the new server.

In the hours that followed, performance decreased substantially: mean page load times, for example, went from 100ms to 400ms. To give the new server time to warm up its caches, we decided to let the experiment run until Monday morning, closely watching the server's metrics in the meantime.

Unfortunately the situation did not improve, and we saw a lot more failed and slow requests during the nights. This, plus the fact that user registration, game requests and other features broke, made us decide to revert the experiment a few hours ago. We moved the database back and the site should be behaving normally (meaning "not great, but certainly much better than over the weekend") again.

We've learned that the database indeed requires 80-90% of this server's performance and that all other services (nginx, Redis, PHP, e-mail) are negligible in terms of memory and CPU usage. We also learned how valuable our monitoring is and improved that setup a lot over the weekend.

Our next action items are

  • Debug why moving the database caused weird consistency errors. We configured MariaDB to run in full ACID compliance, so we don't expect transactions to just disappear without any kind of error.
  • Improve our database queries in general. We had a 100 Mbit/s connection between the two servers and our queries alone nearly saturated that link. We could also see that MariaDB spent a considerable amount of time just sending out packets. Reducing the result set sizes should free up some time for actual query logic.
  • Further improve the caching layer and make more use of Redis in general. During the weekend we saw that the API rate limiting was causing 90% of all locks in MariaDB, and just moving that logic to Redis was a small yet quick win for performance.
  • We will most likely run a similar experiment with a dedicated database server in the future. We are thinking about replication and running multiple read-slaves, but still fear the additional complexity in our setup.
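To illustrate the rate-limiting item above: the following is only a minimal sketch of a fixed-window counter, not the site's actual implementation (the site runs PHP, and its real logic is not shown in this post). `InMemoryCounterStore` and `allow_request` are hypothetical names; the store is an in-memory stand-in whose `incr` and `expire` methods mirror the Redis INCR and EXPIRE commands, so no database row has to be locked per request. The limit and window values are made up.

```python
import time


class InMemoryCounterStore:
    """Stand-in for Redis: only the INCR and EXPIRE behaviour used below."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._values = {}   # key -> current counter value
        self._expires = {}  # key -> absolute expiry time

    def _purge(self, key):
        # Drop the key if its time-to-live has run out.
        expiry = self._expires.get(key)
        if expiry is not None and self._clock() >= expiry:
            self._values.pop(key, None)
            self._expires.pop(key, None)

    def incr(self, key):
        self._purge(key)
        self._values[key] = self._values.get(key, 0) + 1
        return self._values[key]

    def expire(self, key, seconds):
        if key in self._values:
            self._expires[key] = self._clock() + seconds


def allow_request(store, client_id, limit=100, window_seconds=60):
    """Fixed-window rate limit: at most `limit` requests per window."""
    key = f"ratelimit:{client_id}"  # hypothetical key layout
    count = store.incr(key)
    if count == 1:
        # First hit in this window: start the countdown.
        store.expire(key, window_seconds)
    return count <= limit
```

With a real Redis client the same two calls would go over the network instead, and the counter would expire on its own without any cleanup query.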

We apologise for the disruptions over the weekend. As they say, you can't make an omelette without breaking eggs.

-- The Team

andyrockin123, Goodigo and 20 others like this
thread: The Site
SgtKabukiman (Norway), 5 years ago

Guys, dan9er is totally right. Let's throw the code away and rebuild the site with Weebly, like a real 10x l33t rockstar coder.

EDIT: Accidentally mixed up dan9er and Duxez. Sorry.

Timmiluvs, Lonne and 3 others like this
SgtKabukiman (Norway), 5 years ago

I was watching Highspirits' any% run (https://www.speedrun.com/Dragon_Quest_Builders/run/8yved96m) and noticed that it's classified as PS3 -- but in the VOD he says he's playing on PS4 (I cba to find the exact timestamp). From the accidental save screens it sure does look like the PS4 as well.

thread: Speedrunning
SgtKabukiman (Norway), 7 years ago

Closing this now. Thanks to those who actually answered OP's question. Please move meta discussions to the more appropriate channels (i.e. tumblr) or retreat to your personal safe spaces.

Zachoholic, Trollbear666 and 4 others like this
thread: The Site
SgtKabukiman (Norway), 7 years ago

Dutchj, I can't reproduce your problem. Can you give me more details (what game, what browser are you using)?

thread: The Site
SgtKabukiman (Norway), 7 years ago

I'm on it.

thread: The Site
SgtKabukiman (Norway), 8 years ago

Yeah the algorithm for relative times is not 100% identical, but I personally like it ;-)

thread: The Site
SgtKabukiman (Norway), 8 years ago

It should work now.

Deln and ROMaster2 like this
thread: The Site
SgtKabukiman (Norway), 8 years ago

I see what you mean. I'll get on that.

thread: The Site
SgtKabukiman (Norway), 8 years ago

Hi,

yes we have just updated stuff. If you see non-personalized times, then the JavaScript hasn't been executed. Do you have it disabled?

Cheers.

thread: The Site
SgtKabukiman (Norway), 8 years ago

Hi everyone,

we have just changed how timezones are handled throughout the site. Until now, we used a cookie to render times in your timezone on the server. Over time, this caused all sorts of issues, starting with endless reload loops on the PS4 and some other browsers and ending in a lot of hacks to handle DST changes. It also prevented us from caching HTML output, as the output depended on each user's preferences.

On top of that, when the site was moved to a new server, we also switched from running in CET/CEST (UTC+1/2) to using UTC. The many hacks throughout the site to compensate for the original timezone led to some problems (like new run times being off by an hour), which motivated us to finally do something about it.

The old cookie-based approach was replaced by outputting HTML5 <time> tags and using JavaScript to convert those into the browser's timezone. There are two minor downsides to this approach:

  • On large pages with lots of times (like the forums index), this takes a few milliseconds and the page is delayed by a moment. In most cases you won't notice it.
  • Users without JavaScript enabled will not see localized times anymore, but rather UTC values. As many things on speedrun.com depend on JavaScript, we don't think this will affect many users.
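For the curious, the idea can be sketched roughly as follows. This is a hedged illustration in Python, not the code running on the site: `render_time_tag` stands for the server-side step (always UTC, so the HTML is identical for every user and can be cached), and `localize` mimics what the browser script does with the tag's `datetime` attribute. Both function names and the exact output formats are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from html import escape


def render_time_tag(moment):
    """Server side: emit a <time> tag whose datetime attribute is always UTC."""
    utc = moment.astimezone(timezone.utc)
    machine = utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    fallback = utc.strftime("%Y-%m-%d %H:%M UTC")  # what users see if no script runs
    return f'<time datetime="{escape(machine)}">{escape(fallback)}</time>'


def localize(machine_value, utc_offset_minutes):
    """Client-side step, mimicked in Python: UTC string -> viewer's local time."""
    utc = datetime.strptime(machine_value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    local = utc.astimezone(timezone(timedelta(minutes=utc_offset_minutes)))
    return local.strftime("%Y-%m-%d %H:%M")
```

The fallback text inside the tag is what explains the second downside: without JavaScript nothing rewrites it, so the UTC value stays visible.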

A note to marathon managers: We noticed that times seem off by 1-2 hours for newer marathons (with our schedule being the only source for the start time, it's hard to verify if it's correct ;-)). Please check your schedules and make sure the start time is correct. Please note that you need to configure it in UTC, not -- as earlier -- in your local timezone.
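If you are unsure how to get from your local start time to UTC, the conversion is just a matter of subtracting your UTC offset. A small sketch (the function name `local_start_to_utc` is made up for this example, and you have to supply the offset that applies on the day of your marathon, including DST):

```python
from datetime import datetime, timedelta, timezone


def local_start_to_utc(local_start, utc_offset_hours):
    """Interpret a naive local start time at the given UTC offset and return it in UTC."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return local_start.replace(tzinfo=local_zone).astimezone(timezone.utc)
```

For instance, a marathon starting at 18:00 CEST (UTC+2) would be entered as 16:00 UTC.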

Havi, zoton2 and 5 others like this
thread: The Site
SgtKabukiman (Norway), 8 years ago

Zephiles, thanks for reporting. We had indeed forgotten to adjust the upload limits for the new webserver.

Uploads up to 64MB should now work as expected.

thread: The Site
SgtKabukiman (Norway), 8 years ago

This is not a mistake, the 1-Loop category looks like non-misc to me. =)

Concerning your other question: No, this is not possible and not planned, you would have to crawl all categories and runs (you would probably fetch a plain list of runs instead of querying each category for its runs).
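A rough sketch of what such a crawl could look like, assuming offset-based paging. Everything here is illustrative: `fetch_page` is a hypothetical callable that you would implement as one HTTP request against the API, and the page size and the `category` field name are assumptions, so check them against the API documentation before relying on them.

```python
from collections import defaultdict


def fetch_all_runs(fetch_page, game_id, page_size=200):
    """Collect every run of a game by walking offset-based pages."""
    runs = []
    offset = 0
    while True:
        page = fetch_page(game_id, offset, page_size)
        runs.extend(page)
        if len(page) < page_size:
            # A short (or empty) page means there is nothing left to fetch.
            return runs
        offset += page_size


def group_by_category(runs):
    """Sort the flat run list into categories on the client side."""
    grouped = defaultdict(list)
    for run in runs:
        grouped[run["category"]].append(run)
    return dict(grouped)
```

Fetching one flat list and grouping it locally keeps the number of requests proportional to the number of runs rather than the number of categories.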

Cheers.

thread: The Site
SgtKabukiman (Norway), 8 years ago

Fixed, thanks. There was a typo in the code, killing the XSS prevention.

thread: The Site
SgtKabukiman (Norway), 8 years ago

At the moment, we don't have such a list (except the one on the homepage), but there seem to be other sites using our API to fetch the latest verified runs and list/tweet them.

thread: The Site
SgtKabukiman (Norway), 8 years ago

We encourage external sites to integrate with us and use our data. That's why I've built the REST API over the last few months. All we ask for in return is that those users give credit. Really, a small link to us will probably be sufficient. That is not much to ask.

Without a license on our site, external sites must assume that they are not allowed to use the data at all. Just because we don't have a license does not mean our data is automatically public domain. We therefore need the license to enable others to safely use our stuff.

MASH, cutenice and 3 others like this
thread: The Site
SgtKabukiman (Norway), 8 years ago

Just for reference: Images are supposed to be cached for one hour in everyone's browser cache. You can, after uploading a new one, check your changes by doing a hard reload (Ctrl+F5 in most browsers on Windows), but most users will only get the change after that one hour.
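One common way to get this behaviour is a `Cache-Control` response header with a one-hour `max-age`. The snippet below is only an illustration of that header, not our actual webserver config; the function name and the `public` directive are assumptions.

```python
def image_cache_headers(max_age_seconds=3600):
    """Response headers telling browsers to reuse an image for one hour."""
    return {"Cache-Control": f"public, max-age={max_age_seconds}"}
```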

n1nj4ofshr3d likes this
thread: The Site
SgtKabukiman (Norway), 8 years ago

The site required cookies to work. Browsers that had them disabled ended up in a reload loop.

I "fixed" that, accepting that for users without cookies, dates would be in the wrong timezone (a proper fix would take more time to implement, and atm the first priority was that those users can at least see *something* of the site).

Right now, I can't reproduce your problems [any more]. http://www.speedrun.com/gtasa works fine for me in Vivaldi.

thread: The Site
SgtKabukiman (Norway), 8 years ago

No problem, I will kill your account. As for why this is not an option for everyone, I don't really know. Maybe a good chunk of our users is constantly drunk and would delete their account during binges. ;-)

thread: The Site
SgtKabukiman (Norway), 8 years ago

I banned him. Case hopefully closed.

About SgtKabukiman
Joined: 10 years ago
Online: 3 years ago
Runs: 0