pdf2book
Convert a PDF to a book
Turn any A4 PDF into an A5 booklet and bind it, e.g. with rings.
Rearrange the pages
How it works (schematically): PDF -> rotate specific pages -> PS -> reorder pages (as in a book) -> pack two pages onto one sheet -> PDF
Hints:
- The script is not really parameterised; the values below are hard-coded.
- Page 45 is rotated by 180° (A45south).
- Pages are scaled down by a factor of 0.7.
But in the end this little script does the job:
#!/bin/bash

FILE_PATH="$1"
FILE="$(basename "$FILE_PATH")"
TITLE="${FILE%%\.pdf}"

# Rotate page 45 by 180 degrees, keep all other pages as they are
pdftk A="$FILE_PATH" \
    cat A1-44 A45south A46-end \
    output "$TITLE"_rot.pdf
# Convert to PostScript and reorder the pages into booklet order
pdftops "$TITLE"_rot.pdf "$TITLE".ps
psbook -q "$TITLE".ps "$TITLE"_BOOK.ps
# Put two pages on each A4 sheet (landscape), scaled by 0.7
psnup -q -l -pa4 -Pa4 -2 \
    -b1mm -m1mm \
    -s0.7 \
    "$TITLE"_BOOK.ps \
    "$TITLE"_SIG.ps
# Writes "$TITLE"_SIG.pdf to the current directory
ps2pdf "$TITLE"_SIG.ps
rm "$TITLE"{,_BOOK,_SIG}.ps
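To illustrate the reordering step, here is a minimal sketch of the page order a single-signature booklet needs. The function name booklet_order is a hypothetical helper for illustration and not part of psutils; it assumes all pages go into one signature and pads the page count up to a multiple of 4 (numbers above the real page count stand for blank pages).

```shell
#!/bin/bash
# Hypothetical helper (not part of psutils): print the page order of a
# booklet that consists of one single signature.
booklet_order () {
    local pages=$1
    # Pad the page count up to the next multiple of 4
    local total=$(( (pages + 3) / 4 * 4 ))
    local lo=1 hi=$total order=()
    while [ "$lo" -lt "$hi" ]; do
        # Front of the sheet: last, first. Back of the sheet: second, second to last.
        order+=("$hi" "$lo" "$((lo + 1))" "$((hi - 1))")
        lo=$((lo + 2))
        hi=$((hi - 2))
    done
    echo "${order[*]}"
}

booklet_order 8    # prints: 8 1 2 7 6 3 4 5
```

Each group of four numbers is one sheet: two pages side by side on the front, two on the back.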
Printing
Print duplex (flip on short edge) on A4 with:
- auto center
- no scaling
- no border control
Then:
- cut the A4 sheets in half
- bind
Reduce Size of PDF - v1
High-quality scans can be huge, since the images are simply embedded. This is a little script that effectively reduces the size of PDFs using ghostscript.
aptitude install ghostscript
Create the script in a directory that is listed in your PATH environment variable, e.g. /usr/local/bin/pdf_minify.sh
#!/bin/bash

### INIT
SRC="$1"
DST="$2"

### SANITIZE
if [ -z "$SRC" ]; then
    echo "Please specify a source file."
    exit 2
fi

if [ -z "$DST" ]; then
    echo "Please specify a destination file."
    exit 2
fi

if [ "$SRC" = "$DST" ]; then
    echo "Both files must not match."
    exit 2
fi

set -eu
ps2pdf -dDetectDuplicateImages=true \
    -dCompressFonts=true \
    -dEmbedAllFonts=false \
    -dSubsetFonts=true \
    -dPDFSETTINGS=/ebook \
    "$SRC" "$DST"
Make it executable
chmod a+x /usr/local/bin/pdf_minify.sh
Hint: Change to the destination directory and wrap the call in a for-loop.
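A minimal sketch of such a loop, assuming the v1 script above is installed as pdf_minify.sh. The function name minify_all and the MINIFY variable are hypothetical, introduced only for illustration. Instead of changing directory, the function takes the source and destination directories as arguments, which has the same effect.

```shell
#!/bin/bash
# Hypothetical wrapper around pdf_minify.sh (v1: source file, destination file).
# MINIFY can be overridden, e.g. with "echo" for a dry run.
MINIFY="${MINIFY:-pdf_minify.sh}"

minify_all () {
    local src_dir="$1" dst_dir="$2" f
    mkdir -p "$dst_dir" || return 1
    for f in "$src_dir"/*.pdf; do
        [ -e "$f" ] || continue    # the glob matched nothing
        "$MINIFY" "$f" "$dst_dir/$(basename "$f")"
    done
}
```

Calling `minify_all . minimal` would write the reduced copies of all PDFs in the current directory to ./minimal under their original names.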
Comparison:
% ll -Rh
.:
total 7.3M
-rw-r--r-- 1 user user 842K Jul  4 14:25 G1.pdf
-rw-r--r-- 1 user user 4.3M Jun 17 11:13 G2.pdf
-rw-r--r-- 1 user user 976K Jul  4 14:24 K1.pdf
drwxr-xr-x 1 user user  326 Jul  4 14:47 minimal
-rw-r--r-- 1 user user 1.3M Jul  4 14:32 R1.pdf

./minimal:
total 1.8M
-rw-r--r-- 1 user user  69K Jul  4 14:47 G1.pdf
-rw-r--r-- 1 user user 1.2M Jul  4 14:47 G2.pdf
-rw-r--r-- 1 user user 270K Jul  4 14:47 K1.pdf
-rw-r--r-- 1 user user 343K Jul  4 14:47 R1.pdf
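To put a number on such a comparison, the saving per file can be computed from the byte counts. This is a small sketch; size_saving is a hypothetical helper name, and it assumes the original file is not empty.

```shell
#!/bin/bash
# Hypothetical helper: percentage saved between an original and a reduced file.
size_saving () {
    local before after
    before="$(wc -c < "$1")"
    after="$(wc -c < "$2")"
    # Integer arithmetic, rounded down to whole percent
    echo $(( (before - after) * 100 / before ))
}
```

For example, `size_saving G1.pdf minimal/G1.pdf` prints the saving for G1.pdf; going by the rounded sizes in the listing (842K to 69K), that is roughly 90 percent.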
Reduce Size of PDF - v2
Here is a more convenient version of the script, which can:
- replace the files and create backups,
- choose the output name itself based on a suffix,
- write the output to another directory,
- process multiple files.
When copying the script, remember that here-documents opened with <<- strip only leading tabs (^I). If the closing EOH is indented, nothing but tabs may precede it.
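The difference can be seen with two tiny generated scripts. This is a sketch for illustration only; the variable names with_tabs and with_spaces are made up, and the scripts are built with bash's $'…' quoting so that the tabs survive copying.

```shell
#!/bin/bash
# Body and terminator indented with a tab: <<- strips the tabs.
with_tabs=$'cat <<-EOH\n\thello\n\tEOH\n'
# Body and terminator indented with spaces: nothing is stripped, and the
# indented EOH is not recognised as the terminator.
with_spaces=$'cat <<-EOH\n  hello\n  EOH\n'

bash -c "$with_tabs"                  # prints: hello
bash -c "$with_spaces" 2>/dev/null    # prints "  hello" and "  EOH"
```

In the second case bash reads on to the end of the input and only warns that the here-document was delimited by end-of-file, so in a real script everything after the here-document would be swallowed.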
/usr/local/bin/pdf_minify.sh
#!/bin/bash

### DEFAULTS
RUN_ONCE=false
INPLACE=false
BACKUP=false
SUFFIX='minified'
SUFFIX_APPEND=true
SUFFIX_APPEND_FORCE=false

SELF="$(basename "$0")"

# Here-document bodies and their EOH terminators are kept flush left,
# so copying the script cannot break them.
usage () {
    cat <<-EOH

$SELF [options] [--] file1 [file2, …]

Options:
  -b|--backup [yn]    Force override creation of backup file
                      in the same directory as the input file.
  -d|--dir-out        Create directory and
                      write output in this directory.
                      No backup file is written.
  -h|--help           Display this help.
  -i|--in-place       Replace input file.
                      Creates a backup.
  -o|--out            Path to the output file and
                      process only one pdf.
  -s|--suffix [opt]   Suffix is appended after an underscore "_".
                      May be specified up to 2 times because
                      it has an optional argument
                      (with a bit special parsing).
                        Without opt   Force append suffix.
                        With opt      Change suffix (default: minified).
                        Shortform     Don't separate with a space.
                                      (examples:
                                        correct: '-sCUSTOMSUFFIX'
                                        wrong:   '-s CUSTOMSUFFIX')
                        Longform      Separate with a "="
                                      (examples:
                                        correct: '--suffix=CUSTOMSUFFIX'
                                        wrong:   '--suffix CUSTOMSUFFIX')
EOH
}

# Note that we use "$@" to let each command-line parameter expand to a
# separate word. The quotes around "$@" are essential!
# We need TEMP as the 'eval set --' would nuke the return value of getopt.
TEMP=$(getopt \
    -o 'b:d:hio:s::' \
    --long 'backup:,dir-out:,help,in-place,out:,suffix::' \
    -n "$SELF" -- "$@")

if [ $? -ne 0 ]; then
    echo 'Terminating...' >&2
    exit 1
fi

# Note the quotes around "$TEMP": they are essential!
eval set -- "$TEMP"
unset TEMP

while true; do
    case "$1" in
        '-d'|'--dir-out')
            DIR_OUT="$2"
            SUFFIX_APPEND=false
            BACKUP=false
            shift 2
            continue
            ;;
        '-h'|'--help')
            usage
            exit 0
            ;;
        '-i'|'--in-place')
            INPLACE=true
            SUFFIX_APPEND=false
            BACKUP=true
            shift
            continue
            ;;
        '-b'|'--backup')
            BACKUP_FORCE="$2"
            shift 2
            continue
            ;;
        '-o'|'--out')
            OUT="$2"
            shift 2
            continue
            ;;
        '-s'|'--suffix')
            # optional argument. As we are in quoted mode,
            # an empty parameter will be generated if its optional
            # argument is not found.
            case "$2" in
                '')
                    SUFFIX_APPEND_FORCE=true
                    ;;
                *)
                    SUFFIX="$2"
                    ;;
            esac
            shift 2
            continue
            ;;
        '--')
            shift
            break
            ;;
        *)
            echo 'Internal error!' >&2
            exit 1
            ;;
    esac
done

### SANITIZE
if [ "$#" -lt 1 ]; then
    echo "Please specify at least one source file." >&2
    exit 2
fi

if [ -n "$OUT" ]; then
    cat <<-EOH
Output file has been given.
Processing only first argument.
EOH
fi

### MAIN

### FORCE OVERRIDE BACKUP DEFAULTS
# -q keeps grep from echoing the matched value to stdout
if [ -n "$BACKUP_FORCE" ]; then
    if grep -Eqi '^(y(es)?|true)$' <<< "$BACKUP_FORCE"; then
        BACKUP=true
    elif grep -Eqi '^(no?|false)$' <<< "$BACKUP_FORCE"; then
        BACKUP=false
    else
        echo "Unknown switch '$BACKUP_FORCE'" >&2
    fi
fi

### CREATE DESTINATION DIRECTORY
[ -n "$DIR_OUT" ] && [ ! -d "$DIR_OUT" ] \
    && mkdir "$DIR_OUT"

### CREATE A TEMPORARY DIRECTORY
DIR_TMP="$(mktemp -d)"

for ARG; do
    echo "Processing '$ARG'"

    ### CREATE A BACKUP AND LEAVE IT UNTOUCHED
    # The backup goes next to the input file, as stated in the usage text.
    if $BACKUP \
        && ! cp -p "$ARG" "${ARG%.pdf}_backup.pdf"; then
        cat <<-EOH
Backup of file "$ARG" to "${ARG%.pdf}_backup.pdf" failed.
Exiting…
EOH
        exit 1
    fi

    ### CREATE A TEMPORARY WORKING COPY
    cp "$ARG" "$DIR_TMP"
    BASENAME="$(basename "$ARG")"

    IN="$DIR_TMP/$BASENAME"

    if [ -n "$OUT" ]; then
        RUN_ONCE=true
    else
        OUT="$ARG"
    fi

    if ! $INPLACE && [ -n "$DIR_OUT" ]; then
        OUT="$DIR_OUT/$BASENAME"
    fi

    # || and && have equal precedence: this is (A || B) && assignment
    $SUFFIX_APPEND || $SUFFIX_APPEND_FORCE \
        && OUT="${OUT%.pdf}_${SUFFIX}.pdf"

    ps2pdf -dDetectDuplicateImages=true \
        -dCompressFonts=true \
        -dEmbedAllFonts=false \
        -dSubsetFonts=true \
        -dPDFSETTINGS=/ebook \
        "$IN" "$OUT"

    $RUN_ONCE && break
    unset OUT
done

### CLEANUP
rm -r "$DIR_TMP"
Just invoke it like this:
pdf_minify.sh -i -- file1.pdf file2.pdf file3.pdf
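The special parsing of the optional suffix argument can be tried out in isolation. The following sketch assumes the enhanced getopt from util-linux, which the script relies on as well; show_suffix is a hypothetical demo function that declares only the -s/--suffix option.

```shell
#!/bin/bash
# Hypothetical demo: show how getopt treats an option with an optional argument.
show_suffix () {
    local parsed
    parsed="$(getopt -o 's::' --long 'suffix::' -n demo -- "$@")" || return 1
    eval set -- "$parsed"
    # $1 is now the option, $2 its (possibly empty) argument
    printf 'option=%s arg=[%s]\n' "$1" "$2"
}

show_suffix -sCUSTOM file.pdf          # prints: option=-s arg=[CUSTOM]
show_suffix -s CUSTOM file.pdf         # prints: option=-s arg=[]
show_suffix --suffix=CUSTOM file.pdf   # prints: option=--suffix arg=[CUSTOM]
```

In the second call CUSTOM is not taken as the suffix; it ends up among the positional arguments and would be treated as an input file.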
PDF Web-Optimization - Linearization
It's not all about size. A linearized PDF lets a viewer start displaying content while the file is still downloading, which matters for larger PDFs. A PDF can be linearized e.g. with qpdf:
qpdf --linearize --replace-input input.pdf

### RECURSIVELY FIND AND OPTIMIZE DOCUMENTS
### (-print0 / -0 KEEP FILE NAMES WITH SPACES INTACT)
find . -name '*.pdf' -print0 \
    | xargs -0 -P 16 -I{} \
        qpdf --linearize --replace-input {}

### YOU MAY HAVE TO CHECK AND DELETE ".~qpdf-orig" FILES
### WHICH ARE CREATED BY QPDF ON WARNING OR ERROR
find . -name '*.~qpdf-orig' #-delete
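A quick way to see whether a file has already been linearized is to look for the /Linearized key, which a linearized PDF carries in a dictionary within its first 1024 bytes. The sketch below is a heuristic only, and is_linearized is a hypothetical helper name; `qpdf --check-linearization file.pdf` performs the thorough check.

```shell
#!/bin/bash
# Hypothetical helper: heuristic test for a linearized PDF.
# Succeeds if the /Linearized key appears in the first 1024 bytes.
is_linearized () {
    head -c 1024 "$1" | grep -aq '/Linearized'
}
```

Usage: `is_linearized input.pdf && echo "already linearized"`.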
Join PDFs into one
Make sure the PDFs are in the correct order.
With ghostscript