Thank you for your reply, argosopentech. Here is the output of nvidia-smi:
$ nvidia-smi
Wed Mar 8 08:19:04 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.86.01    Driver Version: 515.86.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
| 30%   28C    P8    10W / 220W |     10MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
As you can see, it reports no running processes, even though conda (but not Anaconda) is installed on my machine. Could this be part of the problem?
The stack trace of the "list index out of range" error is:
"
Traceback (most recent call last):
  File "/home/argosopentech/env/bin/onmt_train", line 33, in <module>
    sys.exit(load_entry_point('OpenNMT-py', 'console_scripts', 'onmt_train')())
  File "/home/argosopentech/OpenNMT-py/onmt/bin/train.py", line 172, in main
    train(opt)
  File "/home/argosopentech/OpenNMT-py/onmt/bin/train.py", line 157, in train
    train_process(opt, device_id=0)
  File "/home/argosopentech/OpenNMT-py/onmt/train_single.py", line 64, in main
    configure_process(opt, device_id)
  File "/home/argosopentech/OpenNMT-py/onmt/train_single.py", line 19, in configure_process
    torch.cuda.set_device(device_id)
  File "/home/argosopentech/env/lib/python3.10/site-packages/torch/cuda/__init__.py", line 326, in set_device
    torch._C._cuda_setDevice(device)
  File "/home/argosopentech/env/lib/python3.10/site-packages/torch/cuda/__init__.py", line 229, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
Traceback (most recent call last):
  File "/home/argosopentech/env/bin/argos-train", line 7, in <module>
    exec(compile(f.read(), __file__, 'exec'))
  File "/home/argosopentech/argos-train/bin/argos-train", line 20, in <module>
    train.train(from_code, to_code, from_name, to_name, version, package_version, argos_version, data_exists, epochs_count)
  File "/home/argosopentech/argos-train/argostrain/train.py", line 173, in train
    str(opennmt_checkpoints[-2].f),
IndexError: list index out of range
"
Does this help?