I have hosted LibreTranslate on Ubuntu 20.04 by following the instructions at
https://github.com/LibreTranslate/LibreTranslate and https://github.com/argosopentech/LibreTranslate-init.
Initially, the application runs perfectly fine. However, after running for some time, it starts to return 500 Internal Server Error. I investigated and found that a new thread is created in the Gunicorn workers with each request, but these threads are not terminated after the request is processed. In the long run this becomes a problem, because eventually there are no resources left to create more threads. I am not sure whether the issue is with the library or with Gunicorn.
I have set the Gunicorn worker count to 4.
By the time I start receiving 500 errors, each worker has around 18k threads. I used the following command to get the thread count:
watch ps -o thcount <pid>
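The same number can also be read from a script, which makes it easier to log over time. This is a minimal sketch, assuming Linux, where /proc/<pid>/status contains a "Threads:" field; the helper name thread_count is made up for illustration.

```python
import os
import re


def thread_count(pid: int) -> int:
    """Return the number of threads of a process, read from /proc (Linux only)."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            match = re.match(r"Threads:\s+(\d+)", line)
            if match:
                return int(match.group(1))
    raise RuntimeError(f"no Threads field found for pid {pid}")


if __name__ == "__main__":
    # Demo on the current process; pass a Gunicorn worker pid instead.
    print(thread_count(os.getpid()))
```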
The Gunicorn error log shows the following:
[2022-05-11 16:04:19 +0100] [553213] [ERROR] Error handling request /detect
Traceback (most recent call last):
  File "/home/support/LibreTranslate/env/lib/python3.8/site-packages/gunicorn/workers/sync.py", line 136, in handle
    self.handle_request(listener, req, client, addr)
  File "/home/support/LibreTranslate/env/lib/python3.8/site-packages/gunicorn/workers/sync.py", line 179, in handle_request
    respiter = self.wsgi(environ, resp.start_response)
  File "/home/support/LibreTranslate/wsgi.py", line 14, in app
    instance = main()
  File "/home/support/LibreTranslate/app/main.py", line 121, in main
    app = create_app(args)
  File "/home/support/LibreTranslate/app/app.py", line 113, in create_app
    remove_translated_files.setup(get_upload_dir())
  File "/home/support/LibreTranslate/app/remove_translated_files.py", line 23, in setup
    scheduler.start()
  File "/home/support/LibreTranslate/env/lib/python3.8/site-packages/apscheduler/schedulers/background.py", line 38, in start
    self._thread.start()
  File "/usr/lib/python3.8/threading.py", line 852, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
ERROR:apscheduler.scheduler:Error submitting job "remove_translated_files (trigger: interval[0:30:00], next run at: 2022-05-12 05:56:34 BST)" to executor "default"
Traceback (most recent call last):
  File "/home/support/LibreTranslate/env/lib/python3.8/site-packages/apscheduler/schedulers/base.py", line 978, in _process_jobs
    executor.submit_job(job, run_times)
  File "/home/support/LibreTranslate/env/lib/python3.8/site-packages/apscheduler/executors/base.py", line 71, in submit_job
    self._do_submit_job(job, run_times)
  File "/home/support/LibreTranslate/env/lib/python3.8/site-packages/apscheduler/executors/pool.py", line 28, in _do_submit_job
    f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name)
  File "/usr/lib/python3.8/concurrent/futures/thread.py", line 188, in submit
    self._adjust_thread_count()
  File "/usr/lib/python3.8/concurrent/futures/thread.py", line 213, in _adjust_thread_count
    t.start()
  File "/usr/lib/python3.8/threading.py", line 852, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
Of all requests, around 30% returned 200 OK; the rest returned 500 Internal Server Error because no new threads could be created.
Below is my system configuration:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 43 bits physical, 48 bits virtual
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 8
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz
Stepping: 2
CPU MHz: 2294.686
BogoMIPS: 4589.37
Hypervisor vendor: VMware
Virtualisation type: full
L1d cache: 256 KiB
L1i cache: 256 KiB
L2 cache: 2 MiB
L3 cache: 200 MiB
NUMA node0 CPU(s): 0-7
I have uploaded both the access and error log files of Gunicorn to ufile.io (gunicorn_logs.zip).