Hope this can be useful to others! Some quirks might still be present; I haven’t tested on Linux/macOS, but testing and adding support for those platforms is next. I’m also looking at the possibility of having an easy “install.py” command to add the model to the local argos-package directory so that it can be used in argos-translate/LT.
Nice! This should help to increase the production rate for Argos Translate models. I read through the code and here are some comments:
The docs are great!
I like this syntax: "file://D:\\path\\to\\mydataset-en_es"
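For reference, a local dataset referenced this way would presumably sit in the config's source list alongside any remote datasets. A sketch of what that might look like (the surrounding keys here are assumptions, not Locomotive's confirmed schema; the point is just the file:// URI):

```json
{
  "sources": [
    "file://D:\\path\\to\\mydataset-en_es",
    "https://example.com/some-remote-dataset"
  ]
}
```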
The “.txt” extensions in the data packages work well. I don’t currently use any file extensions on the source and target files in “.argosdata” packages because I thought I might want to support other types of data in the future (images, audio, who knows). If the data is explicitly text, I think the “.txt” extension is better.
The functionality for automatically calculating BLEU scores is nice to have, despite the limitations we’ve found with BLEU scores.
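To make that limitation concrete, here is a toy single-reference BLEU in pure Python (real evaluation should use a proper implementation such as sacrebleu; this sketch just shows why exact n-gram matching penalizes valid paraphrases):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis, reference, max_n=4):
    """Toy sentence-level BLEU with a single reference."""
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams, ref_ngrams = ngrams(hyp, n), ngrams(ref, n)
        overlap = sum((hyp_ngrams & ref_ngrams).values())  # clipped matches
        total = max(sum(hyp_ngrams.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # any empty n-gram overlap zeroes the geometric mean
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    bp = min(1.0, math.exp(1 - len(ref) / len(hyp)))  # brevity penalty
    return bp * geo_mean

print(bleu("the cat sat on the mat", "the cat sat on the mat"))      # 1.0
print(bleu("the cat is sitting on the mat", "the cat sat on the mat"))  # 0.0
```

The second call scores 0.0 because no 4-gram matches the reference, even though the meaning is essentially identical; this is the kind of blind spot single-reference BLEU has.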
The option to run in toy mode is a great feature. I’ve found Argos Train difficult to test in large part because a complete training run takes so long.
I’ve found that int8 quantization, like you’re using, works well, but this is something you could experiment with.
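As a toy illustration of what int8 weight quantization trades away (independent of CTranslate2's actual implementation), here is a symmetric per-tensor quantizer in pure Python:

```python
import random

random.seed(0)
# Stand-in for a tensor of fp32 model weights
weights = [random.gauss(0.0, 0.1) for _ in range(1000)]

# Symmetric quantization: map [-max|w|, +max|w|] onto the int8 range [-127, 127]
scale = max(abs(w) for w in weights) / 127.0
quantized = [max(-127, min(127, round(w / scale))) for w in weights]
dequantized = [q * scale for q in quantized]

# The round-trip error is bounded by half a quantization step
max_err = max(abs(w - d) for w, d in zip(weights, dequantized))
print(max_err <= scale / 2 + 1e-12)  # True
```

CTranslate2 also supports other compute types (int16, float16) that could be compared the same way: smaller types shrink the model and speed up inference at the cost of a bounded rounding error like the one above.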
I think the OpenNMT devs have been working a lot on OpenNMT data transforms, which could be useful if you want to filter or clean datasets. OpenNMT-py also has functionality for dataset weighting; for example, you could sample data from one dataset at double the rate during training.
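In OpenNMT-py's YAML config, per-corpus weights and transforms look roughly like this (a sketch based on the OpenNMT-py docs; the corpus names and paths are made up):

```yaml
data:
  corpus_a:
    path_src: data/corpus_a.en
    path_tgt: data/corpus_a.es
    weight: 2                    # sampled at double the rate of corpus_b
  corpus_b:
    path_src: data/corpus_b.en
    path_tgt: data/corpus_b.es
    weight: 1
    transforms: [filtertoolong]  # drop overly long examples from this corpus
src_seq_length: 150
tgt_seq_length: 150
```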
It’s actually the same code, just edited for use as a module rather than as a script. I found that running Python within Python (i.e. spawning a Python subprocess from a Python process) was giving me some issues with module imports, so rather than troubleshoot those I decided to just edit the average_models.py script and use it as a module.
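The difference between the two approaches can be sketched with a self-contained stand-in script (a toy; average_models.py itself is not involved here):

```python
import importlib.util
import pathlib
import subprocess
import sys
import tempfile

# A stand-in "script" exposing a function, runnable both ways.
SCRIPT = """\
def average(values):
    return sum(values) / len(values)

if __name__ == "__main__":
    import sys
    print(average([float(v) for v in sys.argv[1:]]))
"""

with tempfile.TemporaryDirectory() as tmp:
    path = pathlib.Path(tmp) / "avg_script.py"
    path.write_text(SCRIPT)

    # Approach 1: Python within Python. The child process resolves imports
    # from its own sys.path/cwd, which is where import issues can creep in.
    out = subprocess.run([sys.executable, str(path), "1", "2", "3"],
                         capture_output=True, text=True)
    print(out.stdout.strip())  # 2.0

    # Approach 2: load the same file as a module and call it in-process,
    # keeping everything in one interpreter with one import environment.
    spec = importlib.util.spec_from_file_location("avg_script", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    print(module.average([1, 2, 3]))  # 2.0
```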
Yep, I was looking at the weight options for different corpora. I figured we can experiment with those once I have a simple pipeline working with a single corpus, just like argos-train uses, but it should be relatively easy to expand the program to have that ability.
I was able to run Locomotive on a Vast.ai Linux server with an RTX 4090 GPU, and the demo mostly worked out of the box.
python3 train.py --config model-config.json
I did get this error from CTranslate2 failing to use CUDA, but I fixed it by setting the CTranslate2 device to “cpu”:
[email protected]:~/Locomotive$ python3 eval.py --config model-config.json
Starting interactive mode
(en)> Hello this is a test
Traceback (most recent call last):
File "/root/Locomotive/eval.py", line 239, in <module>
translation_obj = data["model"].translate_batch(
RuntimeError: Library libcublas.so.11 is not found or cannot be loaded
I also decreased the number of train steps to make the training run shorter:
Despite the short training time and solely using the TildeMODEL dataset, the model seems to generate decent-quality translations.
Interactive mode
[email protected]:~/Locomotive$ python3 eval.py --config model-config.json
Starting interactive mode
(en)> Hope this can be useful to others! Some quirks might still be present, I haven’t tested on Linux/macOS, but that testing/adding support for those platform is next. I’m also looking at the possibility of having an easy “install.py” command to add the model to the local argos-package directory so that it can be used in argos-translate/LT. :tada:
(it)> E' possibile che ciò possa essere utile per altri!, che potrebbero ancora essere presenti, hontoto sulla Linux/macOS, ma che le prove/astendono il sostegno a questa piattaforma sono prossime. I'm cerca anche di avere un semplice "impianto di comando" per aggiungere il modello al repertorio locale argos-pacchetto, affinché possa essere usato in unrgos-trans/LTda:
(en)> ^C
BLEU
[email protected]:~/Locomotive$ python3 eval.py --config model-config.json --bleu
Downloading flores200 dataset...
Tokenizer 'spm' has been changed to 'flores101', and may be removed in the future.
BLEU score: 52.79772
Note: to override valid_steps and train_steps you can place those directly into config.json; no need to edit the .py files. Strange about CTranslate2, I’ll need to investigate that!
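For example, the config.json overrides would look like this (the step values here are just illustrative, not recommendations):

```json
{
  "train_steps": 5000,
  "valid_steps": 1000
}
```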