Improving package download

Failing to download packages seems like a common issue we’re having. I’m looking at adding retry logic to Argos Translate but we can also consider adding this type of functionality to LibreTranslate.

I use the standard library to download packages in Argos Translate to reduce dependencies but LibreTranslate already includes requests and we may have better results using it. The Argos Translate packaging code should be modular enough that you can get package paths from the index, download them manually, and then install them with Argos Translate.

I’ve noticed this issue with DigitalOcean Spaces in the past (the download sometimes ends up corrupted). In my experience, retries are always needed with DO Spaces.

Another, simpler way to deal with this could be to load the models onto a droplet serving static files with nginx or Apache. You won’t have the CDN out of the box (that can be outsourced to Cloudflare), but downloads should be much more reliable.

Thanks for the input.

I added download retries to Argos Translate. If a download fails, it retries, for up to three attempts.
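
For anyone who wants the same behaviour in their own download code, here is a minimal sketch of a retry loop using only the standard library. It is not the actual Argos Translate implementation: the function name `download_with_retries`, the `delay_seconds` pause, and the injectable `opener` parameter are all assumptions made for illustration.

```python
import time
import urllib.request


def download_with_retries(url, attempts=3, delay_seconds=1.0,
                          opener=urllib.request.urlopen):
    """Return the response body for url, trying up to `attempts` times.

    Hypothetical helper, not part of Argos Translate. `opener` is
    injectable so the loop can be exercised without network access.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            with opener(url) as response:
                return response.read()
        except OSError as error:
            # urllib.error.URLError is a subclass of OSError, so this
            # covers both HTTP-level and socket-level failures.
            last_error = error
            if attempt < attempts:
                time.sleep(delay_seconds)
    raise RuntimeError(
        f"Download failed after {attempts} attempts: {url}"
    ) from last_error
```

Note that a retry like this only catches downloads that raise an error. A download that completes but is corrupted would need a separate check, for example comparing a checksum.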

I’m also experimenting with setting up a CDN using Cloudflare pointing to an Apache server at cdn.argosopentech.com. It mostly works, but I’m getting 403 Forbidden errors when trying to download from Python.
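
One possible cause of the 403 errors, which is a guess and not something confirmed in this thread: some CDN and firewall configurations reject the default `Python-urllib/x.y` User-Agent that the standard library sends. A minimal sketch of sending an explicit User-Agent instead is below. The helper name `build_request` and the `"ArgosTranslate"` string are hypothetical, chosen only for illustration.

```python
import urllib.request


def build_request(url, user_agent="ArgosTranslate"):
    """Build a request with an explicit User-Agent header.

    Hypothetical helper. The default user_agent value is an assumption,
    not a string that Argos Translate is known to send.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})
```

The resulting object can be passed to `urllib.request.urlopen` in place of a plain URL string. If the 403 persists, the cause is likely elsewhere, for example in the CDN's security settings.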

Cloudflare seems like a really good deal for CDN caching: on the free plan you get effectively unlimited bandwidth.

I added load balancing to Argos Translate: it now tries to download from the list of available URLs in a random order. This should make downloads more robust in case the primary CDN goes down, and it spreads the load over multiple sources.
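
The idea can be sketched roughly as follows. This is an illustration under assumptions, not the code that ships in Argos Translate: `download_from_mirrors` is a hypothetical name, and `fetch` stands in for whatever function actually downloads a single URL.

```python
import random


def download_from_mirrors(links, fetch, rng=random):
    """Try each link in a random order; return the first successful result.

    Hypothetical helper. `fetch` is any callable that takes a URL and
    returns the downloaded data, raising OSError on failure.
    """
    shuffled = list(links)  # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    errors = []
    for link in shuffled:
        try:
            return fetch(link)
        except OSError as error:
            # Remember the failure and fall through to the next mirror.
            errors.append((link, error))
    raise RuntimeError(f"All {len(shuffled)} mirrors failed: {errors}")
```

Shuffling on every call spreads requests roughly evenly across mirrors, and falling through on errors means a single mirror outage does not break the download.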

I’m planning to disable Cloudflare on cdn.argosopentech.com for now and point directly to my Apache server as another option for downloads. I also added links to @jefs42’s libretranslate.fortytwo-it.com to mirror the Danish models.

Great! I tried it and it seems to have solved the model download problems that were happening occasionally (especially thanks to the addition of the retries, I think).
I’m going to see if I can make a progress bar for model downloads in the CLI. I think it would be useful, because when you download all the models (58) there is no indication of the estimated time remaining.

Could you release a new version of Argos Translate on PyPI so that we can use it in LibreTranslate? There have also been quite a few updates to CTranslate2, which add support for architectures, etc.

Sounds good, a progress bar for downloads in the CLI would be nice.

I’m planning to do a new PyPI release soon.

I saw several libraries, including https://github.com/tqdm/tqdm

Is it OK to use a library for this, or should we try to make a more lightweight progress bar without a dependency?

I don’t think I’d want to add dependencies just to do a progress bar.

Also, looking at it more, I think even getting the progress of the downloads isn’t straightforward.
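
For reference, a dependency-free version could look something like the sketch below, which reads the response in chunks and reports the running byte count. It is only an illustration: `copy_with_progress` and `print_progress` are hypothetical names, and it assumes the download code can be restructured to read in chunks, which may be part of why this isn’t straightforward.

```python
import sys


def print_progress(done, total_bytes):
    """Write a one-line progress indicator to stderr."""
    if total_bytes:
        percent = 100 * done / total_bytes
        sys.stderr.write(f"\r{percent:5.1f}% ({done} / {total_bytes} bytes)")
    else:
        # Total size unknown: only the running byte count can be shown.
        sys.stderr.write(f"\r{done} bytes")
    sys.stderr.flush()


def copy_with_progress(response, out_file, total_bytes=None,
                       chunk_size=64 * 1024, report=print_progress):
    """Copy response to out_file in chunks, calling report after each one.

    Hypothetical helper. `response` is any object with a read(n) method,
    such as the object returned by urllib.request.urlopen.
    """
    done = 0
    while True:
        chunk = response.read(chunk_size)
        if not chunk:
            break
        out_file.write(chunk)
        done += len(chunk)
        report(done, total_bytes)
    return done
```

With `urllib`, the total could be taken from `response.headers.get("Content-Length")`, but that header may be missing, in which case only the byte count is available and a remaining-time estimate isn’t possible.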

I released Argos Translate 1.7.0 on PyPI with CTranslate2 updates and download retries.

People can now mirror or submit their own packages by making a pull request to argospm-index and Argos Translate will randomly distribute the downloads over the available links.
