Pre-requisites
- A CUDA-capable GPU (I am using an Nvidia GeForce RTX 3060 Ti; others may work, but may need additional setup outside this guide to enable CUDA in WSL)
- Windows OS
- WSL Kernel Version > 5.10.43.3
- WSL 2
- Python 3
Download Text Automatically
git clone https://github.com/Interaction-Bot/opus-nlp-downloader.git
cd opus-nlp-downloader
pip install -r requirements.txt
python main.py get en th
python main.py download en th data/
{'wikimedia': {'links': 'https://object.pouta.csc.fi/OPUS-wikimedia/v20210402/moses/en-th.txt.zip', 'sentences': 26597}, 'CCAligned': {'links': 'https://object.pouta.csc.fi/OPUS-CCAligned/v1/moses/en-th.txt.zip', 'sentences': 10746372}, 'OpenSubtitles': {'links': 'https://object.pouta.csc.fi/OPUS-OpenSubtitles/v2018/moses/en-th.txt.zip', 'sentences': 3281533}, 'XLEnt': {'links': 'https://object.pouta.csc.fi/OPUS-XLEnt/v1.2/moses/en-th.txt.zip', 'sentences': 1236145}, 'Tanzil': {'links': 'https://object.pouta.csc.fi/OPUS-Tanzil/v1/moses/en-th.txt.zip', 'sentences': 93540}, 'QED': {'links': 'https://object.pouta.csc.fi/OPUS-QED/v2.0a/moses/en-th.txt.zip', 'sentences': 264677}, 'GNOME': {'links': 'https://object.pouta.csc.fi/OPUS-GNOME/v1/moses/en-th.txt.zip', 'sentences': 78}, 'NeuLab-TedTalks': {'links': 'https://object.pouta.csc.fi/OPUS-NeuLab-TedTalks/v1/moses/en-th.txt.zip', 'sentences': 102773}, 'bible-uedin': {'links': 'https://object.pouta.csc.fi/OPUS-bible-uedin/v1/moses/en-th.txt.zip', 'sentences': 124386}, 'TED2020': {'links': 'https://object.pouta.csc.fi/OPUS-TED2020/v1/moses/en-th.txt.zip', 'sentences': 160762}}
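The dictionary printed by the get command can help decide which corpora are worth downloading. The following is a minimal sketch, assuming the output has been pasted into a Python string (shortened here to three of the entries above). The helper name summarize_corpora is hypothetical and is not part of opus-nlp-downloader.

```python
import ast

# Shortened copy of the output above; paste the full dictionary in practice.
raw_output = """{
 'wikimedia': {'links': 'https://object.pouta.csc.fi/OPUS-wikimedia/v20210402/moses/en-th.txt.zip', 'sentences': 26597},
 'GNOME': {'links': 'https://object.pouta.csc.fi/OPUS-GNOME/v1/moses/en-th.txt.zip', 'sentences': 78},
 'TED2020': {'links': 'https://object.pouta.csc.fi/OPUS-TED2020/v1/moses/en-th.txt.zip', 'sentences': 160762},
}"""


def summarize_corpora(text):
    """Return (name, sentence_count) pairs, largest corpus first."""
    corpora = ast.literal_eval(text)  # parses Python literals without executing code
    pairs = [(name, info["sentences"]) for name, info in corpora.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


summary = summarize_corpora(raw_output)
for name, count in summary:
    print(f"{name:12} {count:>10,}")
print("total sentences:", sum(count for _, count in summary))
```

Very small corpora such as GNOME (78 sentences) add little training signal, so a listing like this makes it easy to see which downloads matter most.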
data-<dataSource>-<codeFrom>_<codeTo>/
    metadata.json
    README
    source
    target
metadata.json
{
"name": "<dataSource>",
"type": "data",
"from_code": "<codeFrom>",
"to_code": "<codeTo>",
"size": <sentences>,
"reference": ""
}
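The folder layout and metadata.json shown above can be produced with a short script instead of by hand. This is only a sketch under stated assumptions: the function name build_argosdata is hypothetical, the README text is a placeholder, and it assumes an .argosdata file is a zip archive that keeps the data folder as its top-level entry. The source and target lists are expected to hold one sentence per item, in matching order.

```python
import json
import zipfile
from pathlib import Path


def build_argosdata(name, from_code, to_code, source_lines, target_lines, out_dir):
    """Write the package folder and zip it; returns the path of the archive."""
    if len(source_lines) != len(target_lines):
        raise ValueError("source and target must have the same number of lines")

    folder_name = f"data-{name}-{from_code}_{to_code}"
    folder = Path(out_dir) / folder_name
    folder.mkdir(parents=True, exist_ok=True)

    # Same fields as the metadata.json template above.
    metadata = {
        "name": name,
        "type": "data",
        "from_code": from_code,
        "to_code": to_code,
        "size": len(source_lines),
        "reference": "",
    }
    (folder / "metadata.json").write_text(json.dumps(metadata, indent=4), encoding="utf-8")
    (folder / "README").write_text(f"{name} {from_code}-{to_code} data\n", encoding="utf-8")
    (folder / "source").write_text("\n".join(source_lines) + "\n", encoding="utf-8")
    (folder / "target").write_text("\n".join(target_lines) + "\n", encoding="utf-8")

    # Assumption: the archive keeps the folder as its top-level entry.
    archive = Path(out_dir) / f"{folder_name}.argosdata"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(folder.iterdir()):
            zf.write(path, arcname=f"{folder_name}/{path.name}")
    return archive
```

For example, build_argosdata("wikimedia", "en", "th", english_lines, thai_lines, "packages") would write packages/data-wikimedia-en_th.argosdata, where english_lines and thai_lines are lists read from the downloaded files.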
Zip each folder into an archive with the .argosdata extension.
Copy the .argosdata files to a folder and serve that folder over HTTP:
python3 -m http.server
http://host.docker.internal:8000/<your-file>.argosdata
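Before starting a long training run, it can be worth confirming that the served file is actually reachable at that URL. The sketch below uses only the Python standard library; the helper names serve_directory and is_reachable are hypothetical. serve_directory is a rough programmatic equivalent of python3 -m http.server, and is_reachable can be run from inside the container against the host.docker.internal address.

```python
import functools
import threading
import urllib.request
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def serve_directory(directory, host="0.0.0.0", port=8000):
    """Serve `directory` over HTTP in a background thread and return the server."""
    handler = functools.partial(SimpleHTTPRequestHandler, directory=str(directory))
    server = ThreadingHTTPServer((host, port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def is_reachable(url):
    """Return True if `url` answers with HTTP 200, False on any network or HTTP error."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return False
```

Inside the container, is_reachable("http://host.docker.internal:8000/<your-file>.argosdata") returning False usually points to the server not running or to the file sitting in a different folder than the one being served.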
Install GeForce Experience
Install GeForce Game Ready Driver
Click check for updates and install if required
Do NOT install CUDA separately. The Game Ready Driver already supports it, and that support is passed through to the WSL environment automatically.
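docker run --gpus all --name argostrain -it argosopentech/argostrain /bin/bash
If the container already exists, attach to it instead (if it is stopped, start it first with docker start argostrain):
docker container attach argostrain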
su argosopentech
source ~/argos-train-init
Add the data from Download Text for Training to data-index.json, with one entry for each .argosdata file:
{
"name": "<dataSource>",
"type": "data",
"from_code": "<codeFrom>",
"to_code": "<codeTo>",
"size": <size>,
"reference": "",
"links": [
"<linkToArgosdataFile>"
]
}
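With several .argosdata files, writing these entries by hand gets error-prone. The following sketch appends an entry in the shape shown above. It assumes that data-index.json holds a single top-level JSON array of such entries; the function name add_index_entry is hypothetical and is not part of argos-train.

```python
import json
from pathlib import Path


def add_index_entry(index_path, name, from_code, to_code, size, link):
    """Append one data entry to the index file, skipping exact duplicates."""
    index_path = Path(index_path)
    # Assumption: the file contains one top-level JSON array of entries.
    entries = json.loads(index_path.read_text(encoding="utf-8"))
    entry = {
        "name": name,
        "type": "data",
        "from_code": from_code,
        "to_code": to_code,
        "size": size,
        "reference": "",
        "links": [link],
    }
    if entry not in entries:
        entries.append(entry)
    index_path.write_text(
        json.dumps(entries, indent=4, ensure_ascii=False) + "\n", encoding="utf-8"
    )
    return entry
```

A call might look like add_index_entry("data-index.json", "wikimedia", "en", "th", 26597, "http://host.docker.internal:8000/<your-file>.argosdata"), with the size taken from the sentence count reported for that corpus.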
argos-train
You will then get some prompts. For English to Thai, enter the following:
From code (ISO 639): en
To code (ISO 639): th
From name: English
To name: Thai
Version: 1.0.0
On finish you should see something like this
pip install argostranslate
python
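import pathlib
import argostranslate.package
import argostranslate.translate
argostranslate.package.update_package_index()
# Install the trained model from the local .argosmodel file
package_path = pathlib.Path("<file_name>.argosmodel")
argostranslate.package.install_from_path(package_path)
# Translate a sample term from English (en) to Thai (th)
term, from_code, to_code = "Hello", "en", "th"
translated_text = argostranslate.translate.translate(term, from_code, to_code)
print(translated_text)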
If you have CUDA issues, this could be because your WSL version is not up to date. To update, perform the following steps:
Go to Settings > Check for Updates
Select Advanced Options and turn on receiving updates for other products (WSL)
Go back, select “Check for updates”, and install any updates. Restart and you should have the latest WSL version. You can check this by running wsl cat /proc/version, which should report a kernel version greater than 5.10.43.3.
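The version check can also be scripted. Below is a minimal sketch that parses the text of /proc/version and compares it with the 5.10.43.3 threshold from the prerequisites. The function names are hypothetical, and the regular expression assumes the text begins with "Linux version" followed by a dotted number, as it does on typical WSL kernels.

```python
import re

# Threshold taken from the prerequisites list.
MINIMUM_KERNEL = (5, 10, 43, 3)


def parse_kernel_version(proc_version):
    """Extract the dotted kernel version from /proc/version text as a tuple of ints."""
    match = re.search(r"Linux version (\d+(?:\.\d+)*)", proc_version)
    if match is None:
        raise ValueError("no kernel version found")
    return tuple(int(part) for part in match.group(1).split("."))


def kernel_is_newer(proc_version, minimum=MINIMUM_KERNEL):
    """Return True if the kernel version is strictly greater than `minimum`."""
    return parse_kernel_version(proc_version) > minimum
```

Inside WSL this could be called as kernel_is_newer(open("/proc/version").read()). Tuples compare element by element, so 5.10.102.1 counts as newer than 5.10.43.3 even though "102" sorts before "43" as text.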
Prerequisites: Windows 10 version 2004 and higher (Build 19041 and higher) or Windows 11.
If you have an older PC with WSL 1 you will need to upgrade to WSL 2 and follow instructions from Step 4.
Then install Ubuntu (or your favourite Linux distro)
wsl --set-default-version 2
wsl --install -d Ubuntu
You should see the two highlighted items in your side bar for Remote Explorer and Docker management respectively