Hi,
I am trying to develop a dual stanza/spacy SBD for Argos, and I have tried replacing the files from argos 1.9.6 in my LT lab with those from the master branch.
With 1.9.6:
debian@libretranslate...:~$ argos-translate --from en --to de "Hello World!"
Hallo Welt!
With master:
debian@libretranslate-venv-b3-16-gra11:~$ argos-translate --from en --to de "Hello World!"
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/argostranslate/sbd.py", line 28, in __init__
self.nlp = spacy.load("xx_sent_ud_sm", exclude=["parser"])
File "/usr/local/lib/python3.9/dist-packages/spacy/__init__.py", line 50, in load
return util.load_model(
File "/usr/local/lib/python3.9/dist-packages/spacy/util.py", line 472, in load_model
raise IOError(Errors.E050.format(name=name))
OSError: [E050] Can't find model 'xx_sent_ud_sm'. It doesn't seem to be a Python package or a valid path to a data directory.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/argos-translate", line 3, in <module>
from argostranslate import cli
File "/usr/local/lib/python3.9/dist-packages/argostranslate/cli.py", line 6, in <module>
from argostranslate import translate
File "/usr/local/lib/python3.9/dist-packages/argostranslate/translate.py", line 411, in <module>
sentencizer: sbd.ISentenceBoundaryDetectionModel = SpacySentencizerSmall(),
File "/usr/local/lib/python3.9/dist-packages/argostranslate/sbd.py", line 31, in __init__
spacy.cli.download("xx_sent_ud_sm")
AttributeError: module 'spacy' has no attribute 'cli'
Of course, I installed spacy before doing this, and from my experience developing the same kind of dual dependency kit and using it on the Locomotive machine I have on the same infra, I can tell that spacy.cli.download works just fine on Locomotive.
Can anyone tell what may be wrong with the code in argos-translate/master? Do I need to open an issue on GitHub for this?
I think I know what’s wrong with the argos code: it’s not waiting until spacy has finished downloading… and returns an error straight away.
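For what it is worth, the last line of the traceback (AttributeError: module 'spacy' has no attribute 'cli') is what Python raises when a submodule has not been imported in the running process, so an explicit "import spacy.cli" may be part of the fix. Below is a minimal sketch of a load-then-download fallback, not the actual Argos code. The names load_with_download_fallback and load_spacy_model are hypothetical, and the loader and downloader are passed in as arguments so the logic can be checked without spacy installed.

```python
from typing import Any, Callable


def load_with_download_fallback(
    name: str,
    load: Callable[[str], Any],
    download: Callable[[str], None],
) -> Any:
    """Try to load a model; on OSError, download it once and load again."""
    try:
        return load(name)
    except OSError:
        # spacy.load raises OSError (E050) when the model is not installed.
        download(name)
        # This line is only reached after download() has returned,
        # so the retry cannot run ahead of the download.
        return load(name)


def load_spacy_model(name: str = "xx_sent_ud_sm") -> Any:
    """Wire the helper to spacy (hypothetical wrapper; needs spacy installed)."""
    import spacy
    import spacy.cli  # explicit submodule import, so spacy.cli is bound

    return load_with_download_fallback(
        name,
        lambda n: spacy.load(n, exclude=["parser"]),
        spacy.cli.download,
    )
```

Calling load_spacy_model() would then either return the installed model or download it first. Whether a freshly downloaded model can always be loaded in the same process is something I would still test on the target machine.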
I will code a function “get_spacy” in the networking module, cache the spacy model and recode the class to include a Path argument that’s set in the package module depending on whether spacy is explicit (i.e. contained in the package) or not.
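A rough sketch of what that could look like. Everything here is an assumption for illustration: the signatures, the cache directory, the fetch callback that stands in for the real download, and the idea that a package shipping its own model keeps it in a "spacy" subdirectory.

```python
from pathlib import Path
from typing import Callable

MODEL_NAME = "xx_sent_ud_sm"


def get_spacy(cache_dir: Path, fetch: Callable[[Path], None]) -> Path:
    """Return the cached model directory, fetching it only if it is missing."""
    cached = cache_dir / MODEL_NAME
    if not cached.is_dir():
        cache_dir.mkdir(parents=True, exist_ok=True)
        fetch(cached)  # hypothetical downloader, expected to create `cached`
        if not cached.is_dir():
            raise FileNotFoundError(f"download did not create {cached}")
    return cached


def spacy_model_path(
    package_path: Path, cache_dir: Path, fetch: Callable[[Path], None]
) -> Path:
    """Pick the Path handed to the sentencizer class."""
    bundled = package_path / "spacy"  # assumed layout for an explicit model
    if bundled.is_dir():
        return bundled
    return get_spacy(cache_dir, fetch)
```

The point of returning a Path is that the sentencizer class no longer needs to know where the model came from, it just loads whatever directory it is given.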
Actually, there’s some code about downloading an sbd package that’s obsolete now; I’ll rewrite it too.
The code is in the final debugging stages.
I’ll finish next week: I’ll test the last packages where I included spacy xx, also try a zh package with zh_spacy instead of stanza, and make the PR afterwards.
Argos should then support either stanza or spacy language-specific SBDs (in the package), or the spacy xx out of packages.
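The three cases can be sketched as a small dispatch on the package contents. This is only an illustration, not the code in the PR: SbdBackend and choose_sbd_backend are made-up names, the "spacy" subdirectory is an assumed layout, and stanza/resources.json is used as the marker because that is the file stanza looks for inside a package.

```python
from enum import Enum
from pathlib import Path


class SbdBackend(Enum):
    STANZA = "stanza model in the package"
    SPACY_PACKAGED = "spacy model in the package"
    SPACY_XX = "shared spacy xx model outside the package"


def choose_sbd_backend(package_path: Path) -> SbdBackend:
    """Decide which sentence boundary detector a package should use."""
    if (package_path / "stanza" / "resources.json").is_file():
        return SbdBackend.STANZA
    if (package_path / "spacy").is_dir():  # assumed directory name
        return SbdBackend.SPACY_PACKAGED
    # Nothing language-specific shipped: fall back to the multilingual model.
    return SbdBackend.SPACY_XX
```

Checking for the files first, instead of building a stanza pipeline and catching the failure, keeps a package without stanza resources from raising at translation time.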
OK, so that looks like a type error within the LT code following the commits made since 1.9.6 in argos. Somewhat weird, but who knows.
My server (proxied with wsgi/Apache) returns a 500 error now, and activating LT_DEBUG does not yield anything.
I am going to check spacy on the CLI before debugging the LT code on my workstation w/ PyCharm. I already got some decent sw-fr packages yesterday.
OK, so, I checked every scenario and opened the PR in Argos.
Sorry, I am not great with Git: I sent 14 commits within this thing… tried to rebase, but to no avail.
One thing though, LT is still down.
I’ll first upgrade it to the very latest version, maybe it’ll work again. If not, I’ll set about finding out what’s wrong with it.
OK, now that I am sure the code for Argos works, I built a LibreTranslate conda environment and plugged it into a PyCharm project.
Installed the environment and the packages, including the ones with spacy (Swahili and Tatar, the latter I use for dev).
The flask interface on localhost is operational, and debug mode returns logs in the conda prompt. I’ll get to the bottom of it.
For now, with argos 1.9.6, translating German is no problem
[2025-01-30 13:20:11,890] ERROR in app: Exception on /translate [POST]
...
File "C:\Users\nglec\.conda\envs\LibreTranslate\lib\site-packages\stanza\pipeline\core.py", line 70, in __init__
raise Exception(f"Resources file not found at: {resources_filepath}. Try to download the model again.")
Exception: Resources file not found at: C:\Users\nglec\.local\share\argos-translate\packages\translate-sw_fr-1_2\stanza\resources.json. Try to download the model again.
(There’s some room for improvement on the Tatar model; I’ll see what I can improve in Locomotive to make better datasets and, when I get some free time, how to display alternatives as tabs in LT: the third alternative is way more accurate than the first.)
As for my lab server, the guys at web ops did not give the service account permission to create a subdir in its $HOME … so it had some trouble creating the new $HOME/.config/argos-translate directory…
Issue closed, but if you host an instance, beware of permissions.
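If you want to catch this before the service starts, a pre-flight check along these lines might help. It is a small sketch under my own assumptions: ensure_writable_dir is a made-up helper, not something Argos or LT provides, and it has to run as the service account to be meaningful.

```python
import os
from pathlib import Path


def ensure_writable_dir(path: Path) -> bool:
    """Create the directory if needed and report whether it is usable."""
    try:
        path.mkdir(parents=True, exist_ok=True)
    except OSError:
        # Covers PermissionError as well as a parent that is not a directory.
        return False
    # The account needs write and search (execute) permission on the directory.
    return os.access(path, os.W_OK | os.X_OK)


if __name__ == "__main__":
    config_dir = Path.home() / ".config" / "argos-translate"
    print(config_dir, "usable:", ensure_writable_dir(config_dir))
```

Note that running it creates the directory when it can, which is what the service would do anyway.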