Bug #16436

Feature #10034: Translation web platform

Make the setup production-ready, adjust resource allocation and optimize stuff if needed

Added by u 7 months ago. Updated 6 days ago.

Status:
In Progress
Priority:
Normal
Assignee:
Category:
-
Target version:
Start date:
02/11/2017
Due date:
% Done:

100%

Estimated time:
16.00 h
Spent time:
(Total: 2.90 h)
Feature Branch:
Type of work:
Sysadmin
Blueprint:
Starter:
Affected tool:
Translation Platform

Description

Sysadmin work

Budgeted time: 16h

Suggestions:

  • Give the VM one more vCPU? Sometimes the webapp (which runs within the Apache process) and mysqld compete for resources, causing timeouts in the web UI.
  • Optimize MariaDB config according to the size and usage of our database? Sometimes MariaDB is eating a full CPU core for a long time, while there's quite some free RAM. Presumably bigger caches etc. might help.

Subtasks

Feature #12220: Set up monitoring for weblate (Resolved)

Feature #15359: List parts of code/packages/configs to be puppetized for translation platform & its clone (Resolved)

Feature #16135: Consider filtering abusive requests to Weblate upstream (Resolved)

Feature #16450: Use puppet logic to not copy language list (Resolved)

Bug #16525: translation-server: logrotate logs of weblate script. (Resolved)

History

#1 Updated by intrigeri 5 months ago

  • Estimated time changed from 160.00 h to 16.00 h

#2 Updated by intrigeri about 2 months ago

  • Description updated (diff)

#3 Updated by hefee about 2 months ago

#4 Updated by u about 1 month ago

  • Status changed from Confirmed to Needs Validation

@groente: 10 days ago you said you'd like to add Apache mod_security before we send out the call for testing. As said over email (July 9th 2019), I am preparing to send out this call today. Is there a chance you'll still implement this, or should I ignore this for now?

#5 Updated by groente about 1 month ago

u wrote:

@groente: 10 days ago you said you'd like to add Apache mod_security before we send out the call for testing. As said over email (July 9th 2019), I am preparing to send out this call today. Is there a chance you'll still implement this, or should I ignore this for now?

working on this right now!

#6 Updated by u about 1 month ago

#7 Updated by groente 22 days ago

@u, @hefee: So far, I've:

- deployed mod_sec
- given the host extra disk space for the audit log
- given the host an extra vCPU
- adjusted the MariaDB configuration according to mysqltuner recommendations
- given the host considerable extra memory to deal with the new MariaDB config

Please let me know if you encounter any problems due to lack of system resources and/or 403 errors due to mod_sec false positives.

#8 Updated by intrigeri 21 days ago

Hi @groente!

Please let me know if you encounter any problems due to lack of system resources and/or 403 errors due to mod_sec false positives.

I'm seeing 403 for every Weblate page I'm trying now (using Tor Browser, FWIW). It did work just fine until a dozen minutes ago or so.
Is there any additional info I shall provide so you can debug & fix this?

#9 Updated by groente 21 days ago

intrigeri wrote:

Hi @groente!

I'm seeing 403 for every Weblate page I'm trying now (using Tor Browser, FWIW). It did work just fine until a dozen minutes ago or so.
Is there any additional info I shall provide so you can debug & fix this?

Timestamps and URLs, please! Thank you!

#10 Updated by intrigeri 21 days ago

Timestamps and URLs, please! Thank you!

https://translate.tails.boum.org/translate/tails/wikisrcsandboxpo/fr/?type=suggestions at 14:47 and 14:48 UTC today.

#11 Updated by zen 21 days ago

groente wrote:

@u, @hefee: So far, I've:

- deployed mod_sec
- given the host extra disk space for the audit log

Maybe you are aware, but I will note here that we currently have about 8G of mod_security logs and that is giving a disk warning in the monitoring system. Logs are being rotated, though.

#12 Updated by groente 21 days ago

zen wrote:

groente wrote:

@u, @hefee: So far, I've:

- deployed mod_sec
- given the host extra disk space for the audit log

Maybe you are aware, but I will note here that we currently have about 8G of mod_security logs and that is giving a disk warning in the monitoring system. Logs are being rotated, though.

I am aware, yes. Later on I'll reduce the rather excessive logging, but during the current testing phase it's producing useful info that helps us adjust our ruleset more effectively.

#13 Updated by groente 8 days ago

  • Assignee changed from groente to hefee

Hey Hefee,

If you don't mind, I'd like your thoughts on the following:

I've been seeing these kinds of error messages pop up on a few occasions in the logfiles:

Unable to connect to WSGI daemon process 'translate.tails.boum.org' on '/var/run/apache2/wsgi.21667.5.1.sock' after multiple attempts as listener backlog limit was exceeded.

Do you think this is related to https://github.com/GrahamDumpleton/mod_wsgi/issues/181 ? To get the mod_wsgi version which supports the new settings, we would have to upgrade to buster...

#14 Updated by hefee 7 days ago

  • Assignee changed from hefee to groente

groente wrote:

Hey Hefee,

If you don't mind, I'd like your thoughts on the following:

I've been seeing these kinds of error messages pop up on a few occasions in the logfiles:

Unable to connect to WSGI daemon process 'translate.tails.boum.org' on '/var/run/apache2/wsgi.21667.5.1.sock' after multiple attempts as listener backlog limit was exceeded.

Do you think this is related to https://github.com/GrahamDumpleton/mod_wsgi/issues/181 ?

I don't think it is related.

What may be worth looking into is using uwsgi to serve Weblate instead of mod_wsgi from Apache. I switched from mod_wsgi to uwsgi several years ago and have never regretted it when it comes to shipping Python apps. uwsgi performs very well, and since it is not bundled with Apache, it is easy to use nginx etc. as the web server. We would need https://httpd.apache.org/docs/2.4/en/mod/mod_proxy_uwsgi.html, as the uwsgi protocol is faster than HTTP; alternatively, we could use an HTTP socket from uwsgi.

An initial ini file for uwsgi:


# cat /etc/uwsgi/apps-enabled/translate.tails.boum.org.ini 
[uwsgi]
master = true
# touch-reload is a nice feature: if you touch this file, the app is restarted automatically with the new settings
touch-reload = %p

chown-socket = www-data:www-data

uid = weblate
gid = weblate

plugin = python3
chdir = <%= @code_git_checkout %>
file = <%= @code_git_checkout %>/weblate/wsgi.py

vacuum=True
max-requests=5000

That creates the uwsgi socket /run/uwsgi/app/translate.tails.boum.org/socket, which Apache can connect to.
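
A minimal sketch of what the Apache side could look like, assuming mod_proxy and mod_proxy_uwsgi are enabled and the socket lives at the path mentioned above. This has not been tested on this host. As far as I know, the `localhost` part of the URL is only a placeholder that the `unix:...|` syntax requires, and the handling of static files and the surrounding vhost are left out on purpose:

```apache
# Hypothetical fragment for the existing translate.tails.boum.org vhost.
# Assumes: a2enmod proxy proxy_uwsgi (module availability depends on the
# Apache version / Debian package installed).

# Forward all requests to the uwsgi daemon over its Unix socket,
# speaking the uwsgi protocol instead of HTTP.
ProxyPass "/" "unix:/run/uwsgi/app/translate.tails.boum.org/socket|uwsgi://localhost/"
```

With `chown-socket = www-data:www-data` in the ini file above, the Apache worker user should be able to open the socket; mod_security would keep filtering requests in Apache before they are proxied.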

#15 Updated by intrigeri 7 days ago

What may be worth looking into is using uwsgi to serve Weblate instead of mod_wsgi from Apache. I switched from mod_wsgi to uwsgi several years ago and have never regretted it when it comes to shipping Python apps. uwsgi performs very well, and since it is not bundled with Apache, it is easy to use nginx etc. as the web server.

Our Weblate is already sitting behind an nginx reverse proxy, so if we go down the uwsgi road, it would be tempting to drop the additional hop via Apache. But we rely on Apache mod_security. ModSecurity supports nginx, but AFAICT that's not in Debian, which is a show-stopper. So my understanding is that the only workable option is replacing mod_wsgi with uwsgi, i.e. we would do nginx → Apache → uwsgi, instead of the current nginx → Apache → mod_wsgi.

#16 Updated by groente 6 days ago

  • Status changed from Needs Validation to In Progress

Thanks for the feedback! I haven't actually seen the wsgi error again since the last parameter adjustments I made a few days ago. If it keeps behaving, I'll just stick with the setup as is; otherwise I'll look into mod_proxy_uwsgi.
