Is PDF translation possible? My general experience with PDFs has been that they work well as a document format, but trying to edit them is difficult.
Yeah, I think you could by using PyPDF2.
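For anyone who wants to try this, here is a rough sketch of the extraction step, assuming PyPDF2 2.0 or later (`PdfReader` and `extract_text()`). The `split_for_translation` helper and its `max_chars` limit are hypothetical names made up for illustration and are not part of LibreTranslate or PyPDF2. Note that this only recovers plain text; the layout is lost, so it does not give you a translated PDF back.

```python
def extract_page_texts(pdf_path):
    """Return one plain-text string per page of the PDF."""
    # Imported lazily so the helper below can be used without PyPDF2 installed.
    from PyPDF2 import PdfReader  # assumes PyPDF2 >= 2.0

    reader = PdfReader(pdf_path)
    # extract_text() may yield nothing for scanned or image-only pages.
    return [page.extract_text() or "" for page in reader.pages]


def split_for_translation(text, max_chars=2000):
    """Hypothetical helper: pack blank-line-separated paragraphs into chunks
    of at most max_chars characters each (oversized paragraphs stay whole)."""
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = paragraph if not current else current + "\n\n" + paragraph
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be sent to a translation backend of your choice, one page at a time.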
TL;DR: any update? Roadblocks?
I’m just trying to understand, do the selfhost open source devs who make & use this to translate .txt, .odt, .odp, .docx?!, .pptx?!, .epub, .html, DO THEY USE GOOGLE TRANSLATE FOR THEIR PDFS? For all these years? Or what?
Sorry to vent but I’m rigging my own pdf to odt/epub to pdf converter just to be able to translate my private pdf docs offline, libretranslate being the only offline doc translator which supports more than a handful of languages…
How are you supporting ppt but not pdf? It’s the most irrational contrast, we all use pdf’s, and you out here doing docx and ppts…
LibreTranslate is open source software; I understand you want PDF support, but no one has had time to work on this feature yet.
If PDF support is important to you, you could either contribute a pull request, or offer to sponsor its development.
Like @pierotofy said this is open source software. And you’re not paying us anything to use it. Most of the people working on it are volunteers. You’re welcome to submit a pull request to add support for PDF translation yourself.
PDF is a more difficult format to parse than ppts and docx. That’s why we currently support ppts and not PDF.
I can convert a pdf to epub preserving style, translate it with your software, then convert it back to pdf. But ‘no pdfs because ppts are easier’.
Zero talk about pdfs for 3 years. I would contribute if I saw any kind of acknowledgement and encouragement of this critical need.
So the devs working on this, for years, are translating their private/company pdfs, with google translate.
Sorry for being a ‘nuisance’ & Bye.
PS: This thread sorely called for -dev talk- not -mod talk-. Read the product vibes.
Argos Translate’s main function is translating input. The supported formats are provided because they are straightforward to implement and benefit users without much maintenance burden. PDFs are messy, all over the place, and just not straightforward to develop around.
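To give a feel for the "messy" part: a PDF page generally stores text as short runs placed at coordinates, not as paragraphs in reading order, so a tool has to guess which runs belong to the same line before it can translate anything. The sketch below is only an illustration of that guessing step. The `(x, y, text)` fragment format, the `rebuild_lines` name, and the `y_tolerance` value are all assumptions for the example, not anything from Argos Translate or a PDF library.

```python
def rebuild_lines(fragments, y_tolerance=2.0):
    """Group (x, y, text) fragments into lines, top to bottom, left to right.

    Assumes PDF-style coordinates: origin at the bottom-left, y grows upward.
    """
    lines = []  # each entry: (y of the line's first fragment, [(x, text), ...])
    # Larger y means higher on the page, so visit fragments in descending y.
    for x, y, text in sorted(fragments, key=lambda f: -f[1]):
        if lines and abs(lines[-1][0] - y) <= y_tolerance:
            lines[-1][1].append((x, text))  # close enough: same line
        else:
            lines.append((y, [(x, text)]))  # start a new line
    # Within each line, order fragments by x and join them with spaces.
    return [" ".join(t for _, t in sorted(parts)) for _, parts in lines]


# Fragments as they might appear in a file: out of order, slightly misaligned.
fragments = [
    (150.0, 700.0, "world"),
    (72.0, 700.5, "Hello"),
    (72.0, 686.0, "second line"),
]
lines = rebuild_lines(fragments)
```

Even this toy version needs a tolerance heuristic, and it ignores columns, tables, hyphenation, and fonts. Writing the translated text back into the same layout is harder still, since the translation rarely has the same length as the original.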
Again, you are welcome to contribute a pull request as we’d love to add support for PDFs. As others have mentioned, you’ll find it’s not as straightforward as you might think it is.