Can't figure out how to download an embedded PDF

nooneescapesthelaw@lemmy.ml · edit-2 1 year ago

Norah - She/They@lemmy.blahaj.zone · 1 year ago

Okay so, PDF documents like this one are basically already “a collection of images”. This website is clearly trying to make downloading one step harder by loading the images individually as you browse the document. You could manually save/download all the images and use a tool to turn them back into a PDF. I haven’t heard of a tool that does this automatically, but it should be possible for a web scraper to make the GET requests sequentially and then stitch the PDF back together.
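
A rough sketch of that scraper idea in Python, in case it helps. Everything site-specific here is an assumption: the URL template, the `{n}` page counter and the page count are hypothetical and would have to be read off the real image requests in the browser's network tab. The function names are made up for illustration, and the PDF step relies on the third-party Pillow library.

```python
import io
import urllib.request
from typing import Callable, List


def page_urls(template: str, count: int, start: int = 1) -> List[str]:
    """Build one image URL per page from a template containing '{n}'."""
    return [template.format(n=n) for n in range(start, start + count)]


def http_get(url: str) -> bytes:
    """Plain GET request using only the standard library."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def download_pages(urls: List[str],
                   fetch: Callable[[str], bytes] = http_get) -> List[bytes]:
    """Fetch the page images one after another, in page order."""
    return [fetch(url) for url in urls]


def images_to_pdf(pages: List[bytes], out_path: str) -> None:
    """Write the downloaded images into a single multi-page PDF."""
    # Pillow is a third-party dependency: pip install pillow
    from PIL import Image

    if not pages:
        raise ValueError("no pages to write")
    images = [Image.open(io.BytesIO(data)).convert("RGB") for data in pages]
    images[0].save(out_path, format="PDF", save_all=True,
                   append_images=images[1:])


# Hypothetical usage, the URL pattern is NOT taken from the real site:
#   urls = page_urls("https://example.com/viewer/page-{n}.jpg", count=12)
#   images_to_pdf(download_pages(urls), "document.pdf")
```

If the site checks cookies or a referrer header, the plain `http_get` above probably won't be enough and the request would need the same headers the browser sends.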

Pantoffel@feddit.de · 1 year ago

I would go this route as well. As a developer, this sounds easy enough. If you don’t get vertical sequences of images, but instead a grid of images, then I would apply traditional image stitching techniques. There are tons of libraries for that on GitHub.
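
For the grid case, a minimal sketch in Python, assuming the tiles are all the same size, don't overlap, and are listed row by row from the top left. Under those assumptions plain pasting is enough; tiles that overlap or vary in size would need real feature-based stitching from a library such as OpenCV, which this does not do. The function names are hypothetical and the image handling uses the third-party Pillow library.

```python
from typing import List, Tuple


def tile_offsets(rows: int, cols: int,
                 tile_w: int, tile_h: int) -> List[Tuple[int, int]]:
    """Top-left (x, y) paste position of each tile, in row-major order."""
    return [(c * tile_w, r * tile_h)
            for r in range(rows)
            for c in range(cols)]


def stitch_grid(tile_paths: List[str], cols: int, out_path: str) -> None:
    """Paste equally sized, non-overlapping tiles into one page image."""
    # Pillow is a third-party dependency: pip install pillow
    from PIL import Image

    if not tile_paths or cols <= 0 or len(tile_paths) % cols != 0:
        raise ValueError("tile count must be a non-zero multiple of cols")

    tiles = [Image.open(path).convert("RGB") for path in tile_paths]
    tile_w, tile_h = tiles[0].size  # assumes every tile has this size
    rows = len(tiles) // cols

    canvas = Image.new("RGB", (cols * tile_w, rows * tile_h), "white")
    offsets = tile_offsets(rows, cols, tile_w, tile_h)
    for tile, offset in zip(tiles, offsets):
        canvas.paste(tile, offset)
    canvas.save(out_path)
```

Each stitched page image can then be treated like any other page image when building the final PDF.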