Outage and Server upgrade
Posted: Mon Dec 06, 2021 1:25 am
by chase
The site was knocked offline by a database issue that seems to have been caused by an OOM condition, most likely due to the excessive (bot) traffic on the site lately. The VPS has been upgraded to have 4x the RAM.
Re: Outage and Server upgrade
Posted: Tue Dec 07, 2021 11:21 pm
by chase
Seems like the bots are still having fun: the httpd processes were maxed out and stuck. I upgraded Apache and changed some config settings.
Re: Outage and Server upgrade
Posted: Tue Dec 07, 2021 11:42 pm
by klange
chase,
Can you please extend some administrative access to additional parties who can more actively respond to these issues? There have been many qualified and trustworthy volunteers.
Re: Outage and Server upgrade
Posted: Tue Dec 07, 2021 11:59 pm
by Kazinsal
klange wrote:chase,
Can you please extend some administrative access to additional parties who can more actively respond to these issues? There have been many qualified and trustworthy volunteers.
Seconded. If it were just the forum collapsing every few days I would be slightly less concerned, but we're losing the wiki for days at a time, which is an important resource that often is a top-level google search result for OS-related technical terms. We need to be able to fix this issue when it arises, and preferably also fix it pre-emptively so the VPS doesn't need to be kicked every 72 hours.
Re: Outage and Server upgrade
Posted: Wed Dec 08, 2021 10:10 am
by waltster
I would suggest enabling something like Cloudflare to serve static backups of the wiki/forum in the event of an outage on the back end. Much of the service is free and it wouldn't be a hassle to set up.
Re: Outage and Server upgrade
Posted: Wed Dec 08, 2021 12:26 pm
by chase
I'm open to all suggestions.
Something I've done some work towards, and that will happen next year, is moving off a basic Linode VPS. The destination isn't final, but I likely want a managed load balancer forwarding to 2+ dockerized (maybe k8s) Apache instances and possibly a managed MySQL service.
Cloudflare is a strong possibility; I really want some better DDoS protection.
Having additional admins would be nice and I'm open to doing that again (we've tried before), but I also want auto-scaling and health checks with automatic restarts so that we have a more automated solution.
I think we are doing pretty good, our uptime is comparable with AWS
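For anyone wondering what "health checks with automatic restarts" could look like in its simplest form, here is a minimal Python sketch. It is not what the site runs, and a managed load balancer or orchestrator would normally provide this itself. The class name `HealthMonitor`, the `max_failures` threshold, and the idea of passing in the probe and restart actions as callables are all assumptions made for illustration.

```python
from typing import Callable
import urllib.request


class HealthMonitor:
    """Restart a service after several consecutive failed health probes."""

    def __init__(self, probe: Callable[[], bool],
                 restart: Callable[[], None], max_failures: int = 3):
        self.probe = probe              # returns True when the service is healthy
        self.restart = restart          # hypothetical restart action
        self.max_failures = max_failures
        self.failures = 0               # consecutive failures seen so far

    def tick(self) -> bool:
        """Run one probe. Returns True if a restart was triggered."""
        if self.probe():
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.restart()
            self.failures = 0
            return True
        return False


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Treat any HTTP 200 response within the timeout as healthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False
```

In use, `tick()` would be called on a timer, with `probe` wrapping `http_probe` for a page on the site and `restart` wrapping whatever restarts the web server. Requiring several consecutive failures avoids restarting on a single slow response.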
Re: Outage and Server upgrade
Posted: Wed Dec 08, 2021 12:38 pm
by nexos
+1 for migrating to something like Cloudflare. Even AWS would be better than the current state.
Re: Outage and Server upgrade
Posted: Wed Dec 08, 2021 3:04 pm
by BigBuda
chase wrote:I think we are doing pretty good, our uptime is comparable with AWS
And higher than Office 359.
Re: Outage and Server upgrade
Posted: Wed Dec 08, 2021 5:16 pm
by waltster
I think that with anything like this, communication with the community is key. Thank you for working hard to keep the site online; of course, I am happy to help with any migration/setup tasks.
Re: Outage and Server upgrade
Posted: Thu Dec 09, 2021 1:06 am
by chase
Continuing to look into the recent issues, I think the bots are causing a possibly unintentional slowloris attack. We are seeing really large amounts of traffic from a small number of IPs. The site handles it okay for a while, but at some point (maybe while there are transient network issues somewhere) the httpd process count starts spiking, we reach the max servers number, and no more httpd processes can be started. I've already bumped the max servers a couple of times to take advantage of the newly increased server memory.
I've added some mod_reqtimeout configuration that I hope will help if it is a slowloris issue.
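For readers unfamiliar with mod_reqtimeout: it drops connections that send their request too slowly, which is what a slowloris-style client does. The Python sketch below is only a simplified model of that rule. It is not Apache's implementation and not the settings used on this server. The numbers (20 s initial allowance, 40 s cap, one extra second per 500 bytes received) are assumptions in the style of the example in the module's documentation.

```python
def read_deadline(bytes_received: int, initial: float = 20.0,
                  maximum: float = 40.0, min_rate: int = 500) -> float:
    """Seconds a client is allowed for sending its request headers.

    The allowance starts at `initial` and grows by one second for every
    `min_rate` bytes received, but never beyond `maximum`.
    """
    extended = initial + bytes_received // min_rate
    return min(maximum, extended)


def is_timed_out(elapsed: float, bytes_received: int) -> bool:
    """True if the connection has taken longer than it is allowed."""
    return elapsed > read_deadline(bytes_received)
```

Under these assumed numbers, a client that trickles a few bytes every ten seconds is cut off after 20 seconds, while a client sending at a normal rate earns more time up to the cap. That frees the httpd slot instead of letting it sit occupied indefinitely.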
I've also added additional bot configuration to phpBB, which generates output a little differently for them. Most importantly, it leaves off the session id query parameter, so the bots get fewer "unique" URLs if they are still considering the sid query parameter to be part of a unique URL. Besides the obvious Bing/Googlebot traffic, the newly configured high-traffic bots are:
DotBot
PetalBot
SemrushBot
Amazonbot
Neevabot
It was interesting to see that Amazon/Alexa is crawling the web; it makes sense if they are going to compete with Google on the voice search front.
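To illustrate the session id point above: if a crawler treats the whole query string as part of a page's identity, every new sid value looks like a new page. The Python sketch below shows how two such URLs collapse to one once the sid parameter is removed. The host name `forum.example.org` and the helper name `strip_sid` are made up for the example; this is not phpBB's code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_sid(url: str) -> str:
    """Return the URL with any `sid` query parameter removed."""
    parts = urlsplit(url)
    # Keep every query parameter except the session id, preserving order.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "sid"]
    return urlunsplit(parts._replace(query=urlencode(query)))


# Two visits to the same topic with different (hypothetical) session ids.
first = "https://forum.example.org/viewtopic.php?f=1&t=2&sid=aaa111"
second = "https://forum.example.org/viewtopic.php?f=1&t=2&sid=bbb222"
same_page = strip_sid(first) == strip_sid(second)
```

Here `same_page` is `True`: without the sid, both URLs reduce to the same `viewtopic.php?f=1&t=2` address, so a crawler has far fewer distinct URLs to fetch.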
The worst offender by far is Neevabot, with over a million requests to the site in just a couple of weeks. If the phpBB bot settings don't help with them, I might have to block their IP.
The corporate bots aren't the only offenders. We have what appear to be several individuals who are concerned about having copies of the wiki and have implemented bots of various forms to try to archive it. If the bots were all well written to only get the content it wouldn't be much of a problem, but most of them tend to do things like archive the user pages, the talk pages, the special pages, and every single page diff. Some of them are causing 40k hits per day to the site, so I'll probably need to block them also.
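Numbers like "a million requests" or "40k hits per day" usually come from tallying the web server's access log. A rough Python sketch of that kind of tally follows. It assumes the log is in Apache's combined log format, which may not match this server's configuration, and the function names and threshold are hypothetical.

```python
import re
from collections import Counter

# Assumed layout: Apache "combined" log format.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def count_requests(lines):
    """Count requests per client IP and per user agent."""
    by_ip, by_agent = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        by_ip[match["ip"]] += 1
        by_agent[match["agent"]] += 1
    return by_ip, by_agent


def heavy_clients(counts, threshold):
    """Return (client, hits) pairs at or above the threshold, busiest first."""
    return [(client, hits) for client, hits in counts.most_common()
            if hits >= threshold]
```

Feeding it a day's worth of log lines, for example `heavy_clients(count_requests(open(path))[0], 40000)`, would list the IPs at or above 40k hits. The same call on the second counter groups by user agent instead, which helps when a bot spreads its requests over several addresses.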
We do have a copy of the wiki available for download at https://files.osdev.org/osdev_wiki.zip if you want the information in an offline form. It's not perfect but it mostly works. It was a little stale because the generation was hanging; I've fixed that. The issue was that people have been uploading larger animated GIFs for their OS images, so I had to adopt the change from https://gerrit.wikimedia.org/r/c/mediaw ... e/+/91501/ to allocate more memory to the image conversion process.
The wiki archive is generated with a simple wget command:
Code:
wget --inet4-only --no-check-certificate --mirror -k -p --reject '*=*,User:*,Special:*,User_talk:*' --exclude-directories='User:*,User:*/*,User:*/*/*,User_talk:*,User_talk:*/*,User_talk:*/*/*,Special:*,Special:*/*,Special:*/*/*' --user-agent="osdev-mirror" https://wiki.osdev.org/Main_Page
If anyone wants to suggest better options for generating an offline copy of the content pages in the wiki, or post-processing that should be performed, I'd welcome it.
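As one possible post-processing step, the mirrored file names could be checked again against the same patterns the wget command rejects, to catch anything that slipped through. The Python sketch below does that with `fnmatch`. It is only an approximation: Python's pattern matching is not guaranteed to behave exactly like wget's, and the function names and sample file names are hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns copied from the --reject option of the wget command above.
NON_CONTENT_PATTERNS = ["*=*", "User:*", "Special:*", "User_talk:*"]


def is_content_page(name: str) -> bool:
    """True if the file name matches none of the non-content patterns."""
    return not any(fnmatchcase(name, pattern)
                   for pattern in NON_CONTENT_PATTERNS)


def content_pages(names):
    """Keep only the names that look like real content pages."""
    return [name for name in names if is_content_page(name)]
```

Run over the file names in the mirror directory, `content_pages` returns the ones worth keeping in the archive. Everything else (user pages, special pages, anything with a `=` from a query string) could be deleted before zipping.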
Re: Outage and Server upgrade
Posted: Tue Dec 28, 2021 7:32 pm
by Ethin
Since we're doing server upgrades and all, would it not be worthwhile to update both MediaWiki and phpBB to their latest versions? I feel like this is something that should've been done quite a while ago. I'm pretty sure it would resolve issues like passwords not being allowed to contain certain characters, and it would introduce Unicode support. (Also, if it hasn't been done, a full OS upgrade is most likely an extremely good idea.)
Re: Outage and Server upgrade
Posted: Tue Dec 28, 2021 9:10 pm
by BigBuda
Ethin wrote:Since we're doing server upgrades and all, would it not be worthwhile to update both MediaWiki and phpBB to their latest versions? I feel like this is something that should've been done quite a while ago. I'm pretty sure it would resolve issues like passwords not being allowed to contain certain characters, and it would introduce Unicode support. (Also, if it hasn't been done, a full OS upgrade is most likely an extremely good idea.)
I know it would be quite the task, but I'd actually recommend going with Bookstack instead of MediaWiki. Bookstack is much more pleasant to work with, and the way it organizes things makes much more sense for a site such as this. Besides, it's much easier to integrate Bookstack with different authentication sources (like LDAP or Active Directory, for example) than MW (which requires a plugin, not always a trivial task). It would have to be a long phased process but, in my opinion as someone who's had to deal with both for a long time (and MW for over a decade), quite worthwhile. And Bookstack looks better too.
Re: Outage and Server upgrade
Posted: Tue Dec 28, 2021 10:09 pm
by Ethin
BigBuda wrote:Ethin wrote:Since we're doing server upgrades and all, would it not be worthwhile to update both MediaWiki and phpBB to their latest versions? I feel like this is something that should've been done quite a while ago. I'm pretty sure it would resolve issues like passwords not being allowed to contain certain characters, and it would introduce Unicode support. (Also, if it hasn't been done, a full OS upgrade is most likely an extremely good idea.)
I know it would be quite the task, but I'd actually recommend going with Bookstack instead of MediaWiki. Bookstack is much more pleasant to work with, and the way it organizes things makes much more sense for a site such as this. Besides, it's much easier to integrate Bookstack with different authentication sources (like LDAP or Active Directory, for example) than MW (which requires a plugin, not always a trivial task). It would have to be a long phased process but, in my opinion as someone who's had to deal with both for a long time (and MW for over a decade), quite worthwhile. And Bookstack looks better too.
I haven't tried bookstack. I wonder how accessible it is with assistive technology? Hmmm... I should set up a test instance on my local machine and play with it a bit.
Re: Outage and Server upgrade
Posted: Wed Dec 29, 2021 3:52 pm
by BigBuda
Ethin wrote:I haven't tried bookstack. I wonder how accessible it is with assistive technology? Hmmm... I should set up a test instance on my local machine and play with it a bit.
Disclaimer: I admit I haven't tested that part. I also don't really have any experience in that area (assistive technologies). What I know about Bookstack is from a regular user and administrator point of view. We've been migrating all our MW instances to Bookstack for a while. For the type of content involved, it makes much more sense. Let me know how your experience turns out.