I'm wondering why the docker image is 3.7 GB

I managed to get a test instance up and running locally on my NAS with a very simple docker-compose.yml using libretranslate/libretranslate:latest to avoid building my own image.

Here it is:

version: "2.4"

services:
  libretranslate:
    image: libretranslate/libretranslate:latest
    container_name: libretranslate
    hostname: libretranslate
    restart: "no"
    ports:
      - 7500:5000
    environment:
      # Booleans are quoted so Compose passes them through as strings.
      LT_DEBUG: "true"
      LT_FRONTEND_LANGUAGE_SOURCE: de
      LT_FRONTEND_LANGUAGE_TARGET: en
      LT_LOAD_ONLY: 'de,en'
      LT_THREADS: 4
      LT_SUGGESTIONS: "true"
    networks:
      - libretranslate
    cpus: 2
    mem_limit: 4G

networks:
  libretranslate:
    external: true
    name: libretranslate

Here are a few questions related to this:

  • Can I map a volume to a certain path to avoid downloading the language files on every restart?
  • Why is the image around 3.7 GB? That seems highly unusual. At first glance, with:
docker exec -ti libretranslate bash
libretranslate@libretranslate:/app$ ls -al

I am wondering whether all of that is really necessary, as it looks like the complete git repo is in there (only at a glance, of course).

docker exec -ti libretranslate bash
du -cksh /app/venv/lib/*

reveals that /app/venv/lib/python3.8/ is 2.5 GB

It might be possible to optimize the image size; I don’t think a whole lot of work has gone into it. That said, most of the size will come from the language models and the pip dependencies.

You should be able to map $HOME/.local/share/argos-translate to a local folder for the models.


I mount all of .local as a bind mount (you could use a Docker volume as well; I just prefer bind mounts). This captures all of .local/share as well as .local/cache.

(Docker-compose style)

    volumes:
      - ./api_keys:/app/db/api_keys
      - ./local:/home/libretranslate/.local
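For anyone not using Compose, here is a rough `docker run` equivalent that mounts only the model directory. It is a sketch, not something from the LibreTranslate docs: `run_libretranslate` and the `argos-models` folder are made-up names, the container path is taken from the mapping above, and it assumes the host folder is writable by the container's user.

```shell
# Sketch: start LibreTranslate with the Argos Translate model directory
# bind-mounted from the host, so downloaded models survive restarts.
# run_libretranslate is a hypothetical helper name, not part of LibreTranslate.
run_libretranslate() {
  local models_dir="${1:-$PWD/argos-models}"

  # Create the host directory and resolve it to an absolute path for -v.
  mkdir -p "$models_dir"
  models_dir="$(cd "$models_dir" && pwd)"

  # Port mapping and LT_LOAD_ONLY mirror the compose file earlier in the thread.
  docker run --rm \
    -p 7500:5000 \
    -e LT_LOAD_ONLY=de,en \
    -v "$models_dir:/home/libretranslate/.local/share/argos-translate" \
    libretranslate/libretranslate:latest
}
```

Calling `run_libretranslate ./argos-models` should download the models on the first start only; later starts can reuse whatever is already in that folder.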

I too think the image (not the running container) is “fat”. The entire build code, as well as the compilers etc. used to build it, are in the Dockerfile. Also, I am running without CUDA, but the CUDA libraries are added to it anyway. It’s really fat!

— edited later —
Oh, the compiler is only in an intermediate image, so I’m not sure whether it affects the total size.


Looks like /app/venv/, specifically site-packages, is 3.6 GB.

Of that, nvidia and torch are 1.4 GB and 1.9 GB respectively. The nvidia libraries could certainly be removed, and I’m not sure whether Torch is used without nvidia.
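To reproduce a breakdown like this, something along these lines should work. It is only a sketch: `largest_entries` is a made-up helper name, and the site-packages path in the usage note is assumed from the python3.8 path mentioned earlier in the thread.

```shell
# Sketch: print the largest entries directly under a directory, biggest first.
# Sizes are in KiB (du -k) so that sort -n can order them numerically.
# largest_entries is a hypothetical helper name.
largest_entries() {
  local dir="$1"
  local count="${2:-10}"

  du -sk "$dir"/* | sort -rn | head -n "$count"
}
```

Inside the container (after `docker exec -ti libretranslate bash`), paste the function and run `largest_entries /app/venv/lib/python3.8/site-packages 5` to list the five largest packages.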


PyTorch is used by Stanza for sentence boundary detection in Argos Translate (but it won’t be necessary with Argos Translate 2). We could try removing the extra Nvidia files when we’re not using CUDA.
