Digital textbook extraction

TokyoMonsterTrucker@lemmy.dbzer0.com · 1 year ago

I can print pages to PDF without a VM. The problem with printing is that these books are over 1,000 pages, so I need to automate a good chunk of the process. Ideally, I’d also like to capture the XML text for the PDF, since it will look much better and I won’t have to manually crop 1,000 PDFs with annoying borders.

KevonLooney@lemm.ee · 1 year ago

Yeah, I believe you can do that by printing to a non-existent printer and then finding the spooled file waiting in the print queue. I don’t know if it works on Windows 11, but it used to work pretty well.
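
For anyone who wants to try the print-queue trick, here is a rough Python sketch of the "find the file" step. It assumes the default Windows spool folder (C:\Windows\System32\spool\PRINTERS), which normally needs admin rights to read, and it assumes the job data sits in files ending in .SPL. The function name find_spooled_jobs is made up for this example, and the spooled data may be in a printer-specific format rather than something directly usable, so treat this as a starting point only.

```python
from pathlib import Path

# Assumption: default Windows spool folder; reading it usually needs admin rights.
DEFAULT_SPOOL_DIR = Path(r"C:\Windows\System32\spool\PRINTERS")


def find_spooled_jobs(spool_dir=DEFAULT_SPOOL_DIR):
    """Return spool data files (*.SPL) in spool_dir, oldest first.

    Hypothetical helper for illustration; returns an empty list if the
    folder does not exist.
    """
    spool_dir = Path(spool_dir)
    if not spool_dir.is_dir():
        return []
    # Keep only the job data files; skip companion files such as *.SHD.
    jobs = [
        p for p in spool_dir.iterdir()
        if p.is_file() and p.suffix.lower() == ".spl"
    ]
    # Sort by modification time, using the name to break ties.
    return sorted(jobs, key=lambda p: (p.stat().st_mtime, p.name))


if __name__ == "__main__":
    for job in find_spooled_jobs():
        print(job, job.stat().st_size, "bytes")
```

Copy the files somewhere else before clearing the queue, since the spooler deletes them once a job finishes or is cancelled.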