Python tutorials of Full Circle Magazine in a single PDF
Please read the update information at the end of the post first.
Full Circle Magazine (FCM) started a Python tutorial series in issue #27. At the time of writing, the current issue is #45, and the tutorial is still there :)
Problem: it would be nice to extract these tutorials from the issues and collect them in a single PDF document.
For the lazy pigs, here is the PDF (6 MB). Get it while it’s hot :)
How to produce the single PDF
For those who are interested, here I explain how to produce the single PDF above.
First, download the issues of FCM. I assume that the required files are named issue27_en.pdf through
issue45_en.pdf. Put them in a directory called
Here is a CSV file that contains data about which pages to extract from the issues:
# issue; start page; end page
27;7;10
28;7;11
29;7;11
30;7;9
31;8;11
32;8;12
33;8;12
34;8;15
35;10;13
36;7;11
37;7;11
38;7;11
39;7;11
40;8;14
41;8;12
42;8;11
43;7;9
44;7;9
45;7;8
Save this file as python.csv (the name the script below expects) in the same directory as the PDF files. There, create a subdirectory called
pieces. The extracted PDFs will be stored there.
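The setup described above can also be checked from Python. The following is only a minimal sketch: the function name `prepare_workdir` is my own invention, and it assumes the issues follow the `issueNN_en.pdf` naming pattern used in this post.

```python
import os


def prepare_workdir(base_dir, issues):
    """Create base_dir/pieces and return the names of missing issue PDFs.

    prepare_workdir is a hypothetical helper, not part of the original script.
    """
    # exist_ok=True makes the call safe to repeat
    os.makedirs(os.path.join(base_dir, "pieces"), exist_ok=True)
    missing = []
    for issue in issues:
        name = "issue%d_en.pdf" % issue
        if not os.path.isfile(os.path.join(base_dir, name)):
            missing.append(name)
    return missing
```

Calling `prepare_workdir(".", range(27, 46))` in the directory that holds the PDFs should return an empty list once all issues from #27 to #45 are in place.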
We will use the following Python script to produce the commands that will do the extraction:
#!/usr/bin/env python
# extract.py
# Print one pdftk command for each data line of python.csv.

with open('python.csv') as f1:
    for line in f1:
        line = line.strip()
        # skip comment lines and blank lines
        if not line or line.startswith('#'):
            continue
        (issue, start_page, end_page) = line.split(';')
        command = "pdftk issue%s_en.pdf cat %s-%s output pieces/%s-python.pdf" % (issue, start_page, end_page, issue)
        print(command)
Executing the script produces the following output:
pdftk issue27_en.pdf cat 7-10 output pieces/27-python.pdf
pdftk issue28_en.pdf cat 7-11 output pieces/28-python.pdf
pdftk issue29_en.pdf cat 7-11 output pieces/29-python.pdf
pdftk issue30_en.pdf cat 7-9 output pieces/30-python.pdf
pdftk issue31_en.pdf cat 8-11 output pieces/31-python.pdf
pdftk issue32_en.pdf cat 8-12 output pieces/32-python.pdf
pdftk issue33_en.pdf cat 8-12 output pieces/33-python.pdf
pdftk issue34_en.pdf cat 8-15 output pieces/34-python.pdf
pdftk issue35_en.pdf cat 10-13 output pieces/35-python.pdf
pdftk issue36_en.pdf cat 7-11 output pieces/36-python.pdf
pdftk issue37_en.pdf cat 7-11 output pieces/37-python.pdf
pdftk issue38_en.pdf cat 7-11 output pieces/38-python.pdf
pdftk issue39_en.pdf cat 7-11 output pieces/39-python.pdf
pdftk issue40_en.pdf cat 8-14 output pieces/40-python.pdf
pdftk issue41_en.pdf cat 8-12 output pieces/41-python.pdf
pdftk issue42_en.pdf cat 8-11 output pieces/42-python.pdf
pdftk issue43_en.pdf cat 7-9 output pieces/43-python.pdf
pdftk issue44_en.pdf cat 7-9 output pieces/44-python.pdf
pdftk issue45_en.pdf cat 7-8 output pieces/45-python.pdf
As can be seen, the extraction is done with pdftk. So far, these commands are only printed to the standard output. Here is how to execute them as well:
./extract.py | sh
That is, pass the commands to the shell “sh”, which will execute them line by line.
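As an alternative to piping text into the shell, the commands could be run directly from Python. This is a sketch rather than the script from this post: `build_command` and `extract_all` are hypothetical names, and it assumes pdftk is on the PATH and that the CSV has the format shown earlier.

```python
import subprocess


def build_command(issue, start_page, end_page):
    """Return the pdftk call for one issue as an argument list."""
    return [
        "pdftk", "issue%s_en.pdf" % issue,
        "cat", "%s-%s" % (start_page, end_page),
        "output", "pieces/%s-python.pdf" % issue,
    ]


def extract_all(csv_lines, run=subprocess.run):
    """Run one pdftk command per data line and return the commands used.

    The run parameter exists only so that the function can be tried
    without pdftk installed; by default it is subprocess.run.
    """
    commands = []
    for line in csv_lines:
        line = line.strip()
        # skip comment lines and blank lines
        if not line or line.startswith("#"):
            continue
        issue, start_page, end_page = line.split(";")
        command = build_command(issue, start_page, end_page)
        # check=True raises CalledProcessError if pdftk reports a failure
        run(command, check=True)
        commands.append(command)
    return commands
```

With the files in place, `extract_all(open("python.csv"))` would do the same work as `./extract.py | sh`, but it stops at the first failing command instead of silently continuing.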
Okay, now we have the pieces in the directory “pieces”. Enter the directory “pieces” and join the PDFs:
pdftk *.pdf cat output all.pdf
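The shell expands `*.pdf` in lexicographic order, which matches the issue order only as long as all issue numbers have the same number of digits. Running the command a second time would also pick up an existing all.pdf as an input. A sketch that builds the same pdftk call with an explicit numeric order (the helper name `merge_command` is hypothetical) could look like this:

```python
import re


def merge_command(piece_names, output="all.pdf"):
    """Return a pdftk call that joins the pieces in issue order."""

    def issue_number(name):
        # pieces are named like 27-python.pdf; anything else yields None
        match = re.match(r"(\d+)-python\.pdf$", name)
        return int(match.group(1)) if match else None

    # keep only real pieces, so a previous all.pdf is ignored
    pieces = [name for name in piece_names if issue_number(name) is not None]
    # numeric sort, so issue 100 would come after issue 45
    pieces.sort(key=issue_number)
    return ["pdftk"] + pieces + ["cat", "output", output]
```

Inside the “pieces” directory, `subprocess.run(merge_command(os.listdir(".")), check=True)` would then perform the merge.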
To tell the truth, this method produces a huge PDF. The extracted pieces are themselves very big (5 to 10 MB each), and the final PDF is about 130 MB! So I actually used Adobe Acrobat 8 Professional to merge the pieces with the conversion setting “Smaller File Size”. Acrobat Pro optimized the files and produced a 6 MB file. If you know how to get a similar result with open source tools, let me know.
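One open source candidate for the size problem is Ghostscript, whose pdfwrite device can re-encode a PDF with a quality preset. I have not measured it on these files, so whether it gets anywhere near 6 MB is an open question. The sketch below only builds the command line; the name `shrink_command` is hypothetical, and it assumes the `gs` executable is installed.

```python
def shrink_command(src, dst, preset="/ebook"):
    """Return a Ghostscript call that rewrites src into a smaller dst.

    preset is one of Ghostscript's PDFSETTINGS values, for example
    /screen (smallest), /ebook, /printer or /prepress.
    """
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        "-dCompatibilityLevel=1.4",
        "-dPDFSETTINGS=" + preset,
        "-dNOPAUSE",
        "-dQUIET",
        "-dBATCH",
        "-sOutputFile=" + dst,
        src,
    ]
```

For example, `subprocess.run(shrink_command("all.pdf", "all-small.pdf"), check=True)` would write a recompressed copy next to the original. Lower presets shrink the file more but also reduce image quality, so it is worth comparing the results by eye.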
It seems FCM came up with a similar idea: http://fullcirclemagazine.org/python-special-edition-1/. They collected the first 8 parts of the already published Python tutorials in a special edition.
I pushed this project to GitHub, see https://github.com/jabbalaci/Full-Circle-Magazine-Series. I added some changes but I won’t rewrite this post each time. For the latest version, please refer to GitHub.