Pdfshuffler .sfs - Edit pdfs :) fantastic!
attach files to a pdf
If you are interested in "pdf portfolios" i.e. pdfs with attached files, pdfdetach can extract them. But mutool can do that and also attach them in the first place.
mutool does several other useful things - I haven't used it extensively, but I assume it is good as it is a brother of mupdf.
Last edited by disciple on Wed 03 Apr 2019, 05:18, edited 2 times in total.
Do you know a good gtkdialog program? Please post a link here
Classic Puppy quotes
ROOT FOREVER
GTK2 FOREVER
Origami-pdf is another tool that may be worth mentioning, although it is in ruby, and I'm not sure if it is currently being developed anywhere:
Features
•Create PDF documents from scratch.
•Parse existing documents, modify them and recompile them.
•Explore documents at the object level, going deep into the document structure, uncompressing PDF object streams and desobfuscating names and strings.
•High-level operations, such as encryption/decryption, signature, file attachments...
•A GTK interface to quickly browse into the document contents.
There is a similar python project that seems to be dead: https://github.com/jesparza/peepdf
And a Java one that is alive: https://github.com/itext/rups/
Since I'm recording my most important knowledge about pdfs in this thread:
Qpdf is the best option if you need to remove restrictions (e.g. can't print, can't edit, or can't copy text) from pdfs, which in most cases doesn't require a password. N.B. this doesn't help if you have a pdf where the text has not been encoded according to a standard character set (which was a problem with the old "cups-pdf" virtual printer in Puppy, i.e. copied text was gibberish because the characters in the pdf had all been randomly remapped).
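For anyone who wants to script this, here is a minimal Python sketch that builds (and optionally runs) the qpdf command line. The `--decrypt` and `--password=` options are qpdf's own; the function names and file names are made up for illustration, and the run step assumes qpdf is on the PATH.

```python
import shutil
import subprocess


def qpdf_decrypt_command(src, dst, password=None):
    """Build the argv for writing an unrestricted copy of src to dst."""
    cmd = ["qpdf"]
    if password is not None:
        # Only needed when the pdf cannot even be opened without a password.
        cmd.append("--password=" + password)
    cmd += ["--decrypt", src, dst]
    return cmd


def remove_restrictions(src, dst, password=None):
    """Run qpdf if it is installed; return True on success."""
    if shutil.which("qpdf") is None:
        return False
    result = subprocess.run(qpdf_decrypt_command(src, dst, password))
    # qpdf exits with 3 when it succeeded but printed warnings.
    return result.returncode in (0, 3)


if __name__ == "__main__":
    # Hypothetical file names, just to show the resulting command.
    print(" ".join(qpdf_decrypt_command("restricted.pdf", "unrestricted.pdf")))
```

Note that qpdf writes a new file rather than modifying the pdf in place, so the original is left untouched.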
Pdfshuffler users may want to note that the version some have treated as an unofficial upstream has now forked as pdfarranger, I guess because there has been a little activity on the original upstream lately.
Re: attach files to a pdf
disciple wrote: If you are interested in "pdf portfolios" i.e. pdfs with attached files, pdfdetach can extract them. But mutool can do that and also attach them in the first place.
Poppler now also has a pdfattach.
Sejda has an option to unpack attachments, and an option to create a "portfolio/collection of attachments". I'm not sure whether or not that is actually different from attaching a file with mutool or pdfattach.
Some PDFs have named pages i.e. if you open them in some viewers (e.g. Adobe's) instead of displaying the logical page number they display a number in a different format (perhaps i, ii, iii... in the preface and 1, 2, 3 in the main body of the document), and/or some other text.
I spent quite some time looking for tools which can handle this. I have now discovered that these are called "page labels", and sejda has a tool to apply them to a pdf. I'm not sure if there are any free tools which preserve page labels (e.g. when splitting or merging pdfs), or can list the labels of pages in a pdf. Apparently poppler has supported page labels for a very long time, but tools like pdfunite don't seem to preserve them...
A document can contain more than one page with the same label, which I guess complicates things, and I think rather than being attached to individual pages, they are defined as a kind of metadata that says "starting from this logical page, number using this format".
What I would really like is a way to create bookmarks matching the page labels, and vice versa, and to split a document based on page labels, or the page label prefix.
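To make the "starting from this logical page, number using this format" idea concrete, here is a rough Python sketch of how a list of such rules expands into one label per page. It is only my reading of how page labels behave, not any tool's actual code; the function names and the style names ("decimal", "roman", "ROMAN") are invented for the example.

```python
def to_roman(n):
    """Convert a positive integer to an uppercase roman numeral."""
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in pairs:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def expand_labels(num_pages, ranges):
    """Expand label rules into one label per logical page.

    ranges: list of (first_page, style, prefix, start), first_page 1-based.
    style is "decimal", "roman", "ROMAN" or None (prefix only).
    Pages before the first rule fall back to their logical number.
    """
    ordered = sorted(ranges, key=lambda r: r[0])
    labels = []
    for page in range(1, num_pages + 1):
        # The rule in force is the last one starting at or before this page.
        active = None
        for rng in ordered:
            if rng[0] <= page:
                active = rng
        if active is None:
            labels.append(str(page))
            continue
        first_page, style, prefix, start = active
        number = start + (page - first_page)
        if style == "decimal":
            text = str(number)
        elif style == "roman":
            text = to_roman(number).lower()
        elif style == "ROMAN":
            text = to_roman(number)
        else:
            text = ""
        labels.append(prefix + text)
    return labels


if __name__ == "__main__":
    # Three preface pages in roman numerals, then the body restarting at 1.
    print(expand_labels(6, [(1, "roman", "", 1), (4, "decimal", "", 1)]))
    # Two rules that both start at 1 give duplicate labels.
    print(expand_labels(4, [(1, "decimal", "", 1), (3, "decimal", "", 1)]))
```

The second call shows why duplicate labels complicate things: a label alone does not identify a page, so any "split by label" tool would still have to work in logical page numbers underneath.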
disciple wrote: I'm not sure if there are any free tools which preserve page labels (e.g. when splitting or merging pdfs), or can list the labels of pages in a pdf
...
What I would really like is a way to create bookmarks matching the page labels, and vice versa, and to split a document based on page labels, or the page label prefix.
Ah, knowing the right terminology helps.
You can get information about the page labels with pdftk:
Code: Select all
# pdftk Drawing1.pdf dump_data
InfoBegin
InfoKey: ModDate
InfoValue: D:20190718221353
InfoBegin
InfoKey: CreationDate
InfoValue: D:20190718221353
InfoBegin
InfoKey: Title
InfoValue: sill name (2)
InfoBegin
InfoKey: Creator
InfoValue: AutoCAD 2019 - English 2019 (23.0s (LMS Tech))
InfoBegin
InfoKey: Producer
InfoValue: pdfplot15.hdi 15.00.152.000
NumberOfPages: 2
BookmarkBegin
BookmarkTitle: Sheets and Views
BookmarkLevel: 1
BookmarkPageNumber: 0
BookmarkBegin
BookmarkTitle: Random name
BookmarkLevel: 2
BookmarkPageNumber: 1
BookmarkBegin
BookmarkTitle: sill name (2)
BookmarkLevel: 2
BookmarkPageNumber: 2
PageMediaBegin
PageMediaNumber: 1
PageMediaRotation: 0
PageMediaRect: 0 0 1191 842
PageMediaDimensions: 1191 842
PageMediaBegin
PageMediaNumber: 2
PageMediaRotation: 0
PageMediaRect: 0 0 1191 842
PageMediaDimensions: 1191 842
PageLabelBegin
PageLabelNewIndex: 1
PageLabelStart: 1
PageLabelPrefix: [1] Random name
PageLabelNumStyle: NoNumber
PageLabelBegin
PageLabelNewIndex: 2
PageLabelStart: 1
PageLabelPrefix: [2] sill name (2)
PageLabelNumStyle: NoNumber
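Since the dump is plain text, turning the page labels into bookmarks could be scripted. Below is a minimal Python sketch that pulls the PageLabel records out of `dump_data` output and writes matching Bookmark records in the same format as the dump above. The function names are invented, it only handles labels that have a prefix, and I have not verified which pdftk versions will accept the result via `update_info`.

```python
SAMPLE = """PageMediaBegin
PageMediaNumber: 2
PageMediaRotation: 0
PageLabelBegin
PageLabelNewIndex: 1
PageLabelStart: 1
PageLabelPrefix: [1] Random name
PageLabelNumStyle: NoNumber
PageLabelBegin
PageLabelNewIndex: 2
PageLabelStart: 1
PageLabelPrefix: [2] sill name (2)
PageLabelNumStyle: NoNumber
"""


def parse_page_labels(dump_text):
    """Collect the PageLabel records from `pdftk ... dump_data` output."""
    records = []
    current = None
    for line in dump_text.splitlines():
        line = line.strip()
        if line == "PageLabelBegin":
            current = {}
            records.append(current)
        elif line.startswith("PageLabel") and current is not None:
            # Split at the first colon only, so prefixes may contain colons.
            key, _, value = line.partition(":")
            current[key] = value.strip()
        elif line.endswith("Begin"):
            # Some other record type (Info, Bookmark, PageMedia) started.
            current = None
    return records


def bookmarks_from_labels(records, level=1):
    """Emit one bookmark record per page label range, titled by its prefix."""
    lines = []
    for rec in records:
        title = rec.get("PageLabelPrefix", "")
        if not title:
            continue  # nothing sensible to call the bookmark
        lines += [
            "BookmarkBegin",
            "BookmarkTitle: " + title,
            "BookmarkLevel: " + str(level),
            "BookmarkPageNumber: " + rec["PageLabelNewIndex"],
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(bookmarks_from_labels(parse_page_labels(SAMPLE)))
```

This creates one bookmark per label range, not per page, which matches a file like the one above where every page has its own prefix. A document numbered i, ii, iii... would need the labels expanded per page first.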
disciple wrote: Does anybody by any chance know of a Linux program to change the default view settings in a pdf e.g. change from "continuous view" to "single page" view, or "fit width" to "100%" or "fit page"?
disciple wrote: I haven't tried it, but I think there's a good chance Softmaker's new "Flexipdf Basic" would run in Wine. This functionality is provided, albeit in a rather strange place: File>Preferences>Loading, in the bottom section.
Ah, sejda looks like the best command line option to change settings like this.
It looks like the latest alternative to pdfmod, pdfshuffler etc is pdfslicer.
It uses gtkmm3
The backend is qpdf, so I imagine it will do the best job at handling the most pdfs.
disciple wrote: Pdfshuffler users may want to note that the version some have treated as an unofficial upstream has now forked as pdfarranger, I guess because there has been a little activity on the original upstream lately.
Also note that the latest version of pdfarranger uses pikepdf (a python interface to libqpdf) as its backend if it is installed, rather than PyPDF2. I believe this will be better (check out the matrix on the pikepdf web page comparing it to PyPDF2), although I haven't done any testing.
The latest version also introduces undo/redo.
disciple wrote: Does anybody by any chance know of a Linux program to change the default view settings in a pdf e.g. change from "continuous view" to "single page" view, or "fit width" to "100%" or "fit page"?
Xpdf does that. It can be set as the default in its prefs, and be modified on the spot in a document dialog window.
I have only used Xpdf and pdftk (don't use the pdftk-1.41-static pet) for the last 10-15 years. In my view, Xpdf produces the cleanest and best looking fonts in a .pdf.
I also tend to drag a .pdf file that I want to edit into Abiword, which sometimes opens it like any other text document for editing. It depends on the origin of the document; for example, a .pdf printout of a browser page can very often be modified in Abi later. I have always given it a try.
BTW: I just tested my own claim, and opened a 90-page Huawei .pdf user manual (downloaded from Huawei as a .pdf) in Abi, along with some .pdf email attachments and bills. All very editable. I'm afraid I am an Abi-lover.
Hmm, on second thoughts (the other cell awakened), I have only been using Xpdf and pdftk, as long as I have been using Linux, some 20 years now...
True freedom is a live Puppy on a multisession CD/DVD.
Correct me if I'm wrong, but I think you misunderstand what I was asking.
Pdf viewers commonly allow you to configure defaults for the viewer, and it sounds like that is what you're describing i.e. it affects every pdf you open in that viewer. I am talking about the settings in the actual pdf i.e. if I change them using sejda or flexipdf and send the file to someone else, it affects how the file opens in their viewer, assuming their viewer is set up to respect the settings.
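To illustrate what "settings in the actual pdf" means: as far as I understand the PDF format, the page layout is stored in the document catalog as /PageLayout and the initial zoom as an /OpenAction destination. The sketch below is not how sejda or flexipdf do it; it swaps in pikepdf (assumed to be installed for the last function only), and the mapping tables and function names are my own. It also assumes pikepdf turns Python `None` into a PDF null.

```python
# Viewer terms from this thread mapped to PDF catalog /PageLayout names.
PAGE_LAYOUTS = {
    "single page": "/SinglePage",
    "continuous": "/OneColumn",
}


def open_destination(zoom):
    """Return the destination type plus arguments for an initial zoom."""
    if zoom == "fit page":
        return ["/Fit"]
    if zoom == "fit width":
        return ["/FitH", None]          # None: leave the vertical position alone
    if zoom == "100%":
        return ["/XYZ", None, None, 1]  # left, top unchanged; zoom factor 1
    raise ValueError("unknown zoom setting: " + zoom)


def set_default_view(src, dst, layout, zoom):
    """Write a copy of src whose catalog asks viewers for this layout/zoom."""
    import pikepdf  # third-party; imported here so the helpers work without it

    with pikepdf.open(src) as pdf:
        pdf.Root.PageLayout = pikepdf.Name(PAGE_LAYOUTS[layout])
        dest = open_destination(zoom)
        # The destination array starts with the page to open (the first one).
        parts = [pdf.pages[0].obj, pikepdf.Name(dest[0])] + dest[1:]
        pdf.Root.OpenAction = pikepdf.Array(parts)
        pdf.save(dst)


if __name__ == "__main__":
    print(PAGE_LAYOUTS["single page"], open_destination("fit page"))
```

Whether the recipient actually sees the change still depends on their viewer honouring these entries.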
Another couple of tools along similar lines as something like pdfsam (although less mature) i.e. non-wysiwyg gui utilities:
https://github.com/muriloventuroso/pdftricks (vala/gtk3/ghostscript)
https://gitlab.com/scarpetta/pdfmixtool (c++/qt5/podofo, although looking at qpdf now)
Re: attach files to a pdf
disciple wrote: If you are interested in "pdf portfolios" i.e. pdfs with attached files, pdfdetach can extract them. But mutool can do that and also attach them in the first place.
disciple wrote: Poppler now also has a pdfattach.
Which only attaches one file at a time, and you can't attach it in place i.e. you need to write out the pdf to a new file.
I tested it for sending Windows executables via email (both Outlook and Gmail block executables, and at least Gmail blocks e.g. zip files these days). Success. Dealing with it seems nice and simple.
disciple wrote: Sejda has an option to unpack attachments, and an option to create a "portfolio/collection of attachments". I'm not sure whether or not that is actually different from attaching a file with mutool or pdfattach.
If I create a pdf portfolio with sejda, when opening it in Adobe Reader it complains that it needs to install Flash, although it seems to work without it. I'm not sure if pdf portfolios always use Flash, or if it is just the way they've chosen to implement it in sejda. I guess for pdfs to support Flash it must be written into the standard, which seems rather stupid as one day soon (if not already) most people won't have Flash...
The versions of mutool I have to hand don't actually seem to have the portfolio feature... perhaps it is a compile time option?
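The one-file-at-a-time limitation of pdfattach can be worked around by chaining calls through scratch files. Here is a small Python sketch that plans the chain, assuming the usual `pdfattach input.pdf file-to-attach output.pdf` argument order; the function names and the scratch file pattern are invented, and the run step assumes poppler's pdfattach is on the PATH.

```python
import subprocess


def pdfattach_plan(src, attachments, dst, tmp_pattern="step{}.pdf"):
    """Return a list of pdfattach argv lists that attach every file in turn."""
    commands = []
    current = src
    for i, path in enumerate(attachments, start=1):
        # The last step writes the real output; earlier ones use scratch files.
        out = dst if i == len(attachments) else tmp_pattern.format(i)
        commands.append(["pdfattach", current, path, out])
        current = out
    return commands


def run_plan(commands):
    """Run each planned command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical file names, printed rather than executed.
    for cmd in pdfattach_plan("in.pdf", ["tool.exe", "notes.zip"], "out.pdf"):
        print(" ".join(cmd))
```

Afterwards `pdfdetach -list out.pdf` should show what ended up attached. The scratch files are left behind, so a real script would want to clean them up.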
I haven't tried the linux version lately, but the Windows Foxit Reader has a good interface for attaching files. Even Adobe Reader can attach files on Windows, although the interface isn't good.
https://github.com/arrufat/pdftag
Gui to edit pdf metadata, written in vala and uses poppler
People might find these snippets from http://www.imagemagick.org/Usage/formats/#ps interesting:
Multi-paged PDF Documents...
You can use perl to combine multiple PDF files, without resorting to a IM, and its rasterization problem...
Code: Select all
#!/usr/bin/perl
# Script pdf-combiner.pl
use strict;
use warnings;
use PDF::Reuse;

prFile('combo.pdf');   # Output.
for (qw/a b c d/)      # Inputs.
{
  prImage("result_$_.pdf");
  prPage();
}
prEnd();
You can also use a JAVA toolkit to merge IM generated images into a PDF producing a better PDF than a simpler one that IM will generate...
Code: Select all
#!/bin/bash
for x in ./*.jpeg
do
  echo $x to ${x}.pdf
  convert $x -quality 75 ${x}.pdf
done
echo Merging...
java tool.pdf.Merge *.pdf
disciple wrote: Another program I don't think I've mentioned, particularly for doing ocr on scanned pdfs, is the Windows freeware "pdf-xchange viewer", which apparently runs well in Wine.
I know there are some other topics here about linux OCR engines and guis, but I thought I'd mention ocrmypdf, which is probably the easiest solution for adding a layer of ocred text to a raster pdf. It is from the same author as pikepdf, which is basically a python wrapper library for qpdf.
EDIT
FWIW I did some testing with ocrmypdf.
IIRC the ocr backend it uses is tesseract. Recognition was perfect except for white space; so more accurate than pdf-xchange, which I had handy for a comparison.
It shrinks test files from the scanner at my work a bit. If I install jbig2enc (which requires leptonica) it shrinks monochrome test files even more.
I wanted to know how to remove scanned text so I converted to a new pdf using pdftocairo, which removed the text and made the file a lot bigger, so presumably it reencoded without jbig2. Interestingly, if I rerun that output through ocrmypdf the result is even smaller. I was dealing with a very small single page file though, so metadata and stuff might show as a big difference in size which wouldn't be noticeable with a large file.
Last edited by disciple on Thu 24 Oct 2019, 20:04, edited 1 time in total.