Python tutorials of Full Circle Magazine in a single PDF

Please read the update information at the end of the post first.

Description

Full Circle Magazine (FCM) started a Python tutorial series in issue #27. At the time of writing, the current issue is #45, and the tutorial is still there :)

Problem: it would be nice to extract these tutorials from the individual issues and collect them all in a single PDF.

Download

For the lazy pigs, here is the PDF (6 MB). Get it while it’s hot :)

How to produce the single PDF

For those who are interested, here is how to produce the single PDF above.

First, download the issues of FCM. I assume that the required files are named issue27_en.pdf, issue28_en.pdf, …, issue45_en.pdf. Put them in a directory called full-circle.

Here is a CSV file that contains data about which pages to extract from the issues:

# issue; start page; end page
27;7;10
28;7;11
29;7;11
30;7;9
31;8;11
32;8;12
33;8;12
34;8;15
35;10;13
36;7;11
37;7;11
38;7;11
39;7;11
40;8;14
41;8;12
42;8;11
43;7;9
44;7;9
45;7;8

Save this file (download link) as python.csv in the same directory as the PDF files. In that directory, create a subdirectory called pieces. The extracted PDFs will be stored there.
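Before generating any commands, the page ranges can be read into Python and sanity-checked. This is only a minimal sketch: the helper names parse_page_ranges and total_pages are made up for illustration, and the sample string stands in for the contents of python.csv.

```python
import csv
import io


def parse_page_ranges(text):
    """Return a list of (issue, start_page, end_page) integer tuples."""
    ranges = []
    reader = csv.reader(io.StringIO(text), delimiter=';')
    for row in reader:
        # skip blank lines and comment lines such as the header row
        if not row or row[0].lstrip().startswith('#'):
            continue
        issue, start_page, end_page = (int(field) for field in row)
        if start_page > end_page:
            raise ValueError("issue %d: start page after end page" % issue)
        ranges.append((issue, start_page, end_page))
    return ranges


def total_pages(ranges):
    """Count the pages the combined PDF would contain."""
    return sum(end - start + 1 for _, start, end in ranges)


# Two rows copied from the CSV file, used here as a stand-in for the whole file
sample = """# issue; start page; end page
27;7;10
28;7;11
"""
ranges = parse_page_ranges(sample)
print(ranges)               # [(27, 7, 10), (28, 7, 11)]
print(total_pages(ranges))  # 9
```

For the 19 rows listed above, the same count comes to 87 pages, which is a handy number to compare against the page count of the final PDF.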

We will use the following Python script to produce the commands that will do the extraction:

#!/usr/bin/env python

# extract.py
# Prints one pdftk command per row of python.csv.

with open('python.csv', 'r') as f1:
    for line in f1:
        line = line.strip()
        # skip comment lines and blank lines
        if not line or line.startswith('#'):
            continue
        (issue, start_page, end_page) = line.split(';')
        command = "pdftk issue%s_en.pdf cat %s-%s output pieces/%s-python.pdf" % (issue, start_page, end_page, issue)
        # print() with a single argument works in both Python 2 and Python 3
        print(command)

By executing the script (download link), you will get the following output:

pdftk issue27_en.pdf cat 7-10 output pieces/27-python.pdf
pdftk issue28_en.pdf cat 7-11 output pieces/28-python.pdf
pdftk issue29_en.pdf cat 7-11 output pieces/29-python.pdf
pdftk issue30_en.pdf cat 7-9 output pieces/30-python.pdf
pdftk issue31_en.pdf cat 8-11 output pieces/31-python.pdf
pdftk issue32_en.pdf cat 8-12 output pieces/32-python.pdf
pdftk issue33_en.pdf cat 8-12 output pieces/33-python.pdf
pdftk issue34_en.pdf cat 8-15 output pieces/34-python.pdf
pdftk issue35_en.pdf cat 10-13 output pieces/35-python.pdf
pdftk issue36_en.pdf cat 7-11 output pieces/36-python.pdf
pdftk issue37_en.pdf cat 7-11 output pieces/37-python.pdf
pdftk issue38_en.pdf cat 7-11 output pieces/38-python.pdf
pdftk issue39_en.pdf cat 7-11 output pieces/39-python.pdf
pdftk issue40_en.pdf cat 8-14 output pieces/40-python.pdf
pdftk issue41_en.pdf cat 8-12 output pieces/41-python.pdf
pdftk issue42_en.pdf cat 8-11 output pieces/42-python.pdf
pdftk issue43_en.pdf cat 7-9 output pieces/43-python.pdf
pdftk issue44_en.pdf cat 7-9 output pieces/44-python.pdf
pdftk issue45_en.pdf cat 7-8 output pieces/45-python.pdf

As can be seen, the extraction will be done with pdftk (more info here). Now, these commands are simply printed to the standard output. Here is how to execute them too:

./extract.py | sh

That is, pass the commands to the shell “sh”, which will execute them line by line.
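As an alternative to piping text into the shell, the commands could be run directly from Python. The following is a rough sketch, not the script used for this post: build_command and extract_all are hypothetical names, and it assumes pdftk is on the PATH and that it is run from the directory containing python.csv and the issues.

```python
import subprocess


def build_command(issue, start_page, end_page):
    """Build the pdftk argument list for one issue (same arguments as the printed command)."""
    return [
        'pdftk', 'issue%s_en.pdf' % issue,
        'cat', '%s-%s' % (start_page, end_page),
        'output', 'pieces/%s-python.pdf' % issue,
    ]


def extract_all(csv_path='python.csv', run=subprocess.check_call):
    """Run one pdftk extraction per CSV row.

    check_call raises CalledProcessError on a non-zero exit status,
    so the loop stops at the first extraction that fails.
    """
    with open(csv_path) as f:
        for line in f:
            line = line.strip()
            # skip comment lines and blank lines
            if not line or line.startswith('#'):
                continue
            issue, start_page, end_page = line.split(';')
            run(build_command(issue, start_page, end_page))
```

Calling extract_all() would then perform the extraction. Passing the arguments as a list avoids shell quoting problems, and a failing command stops the run, whereas sh simply carries on with the next line.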

Okay, now we have the pieces in the directory “pieces”. Enter the directory “pieces” and join the PDFs:

pdftk *.pdf cat output all.pdf
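Note that the shell expands *.pdf in alphabetical order. That matches the issue order here only because every issue number has two digits, and a second run would also pick up all.pdf itself. A small sketch of building the join command with an explicit numeric order follows; issue_number and build_join_command are illustrative names, and the file name pattern is the one produced by the extraction step.

```python
import os
import re


def issue_number(path):
    """Extract the leading issue number from a name like '27-python.pdf'."""
    match = re.match(r'(\d+)-python\.pdf$', os.path.basename(path))
    if match is None:
        raise ValueError('unexpected file name: %s' % path)
    return int(match.group(1))


def build_join_command(paths, output='all.pdf'):
    """Build the pdftk argument list that joins the pieces in issue order."""
    ordered = sorted(paths, key=issue_number)
    return ['pdftk'] + ordered + ['cat', 'output', output]


# A hypothetical issue 100 sorts last, not before issue 27
print(build_join_command(['100-python.pdf', '45-python.pdf', '27-python.pdf']))
# ['pdftk', '27-python.pdf', '45-python.pdf', '100-python.pdf', 'cat', 'output', 'all.pdf']
```

Inside the pieces directory, the list of paths could come from glob.glob('*-python.pdf'), and the resulting argument list could be handed to subprocess.check_call.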

Known issue

Well, to tell the truth, this method will produce a huge single PDF. The extracted pieces are also very big (5 to 10 MB each), and the final PDF is about 130 MB! So I actually used Adobe Acrobat 8 Professional to merge the pieces with the conversion setting “Smaller File Size”. Acrobat Pro optimized the files and produced a 6 MB file. If you know how to achieve a similar result with open source tools, let me know.
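One open source candidate that might help is Ghostscript, whose pdfwrite device can re-encode a PDF with downsampled images. The sketch below only builds the command line. It has not been tried on these files, so there is no claim that it gets anywhere near the 6 MB that Acrobat produced; build_shrink_command and the output name all-small.pdf are made up for illustration.

```python
import subprocess


def build_shrink_command(source, target, quality='/ebook'):
    """Build a Ghostscript command that rewrites source into a smaller target.

    quality is a pdfwrite preset such as '/screen', '/ebook' or '/printer';
    lower-quality presets downsample images more aggressively.
    """
    return [
        'gs', '-sDEVICE=pdfwrite', '-dCompatibilityLevel=1.4',
        '-dPDFSETTINGS=%s' % quality,
        '-dNOPAUSE', '-dQUIET', '-dBATCH',
        '-sOutputFile=%s' % target,
        source,
    ]


def shrink(source, target, quality='/ebook'):
    """Run Ghostscript; assumes the gs executable is installed and on the PATH."""
    subprocess.check_call(build_shrink_command(source, target, quality))
```

For example, shrink('all.pdf', 'all-small.pdf') would write a recompressed copy next to the original, which can then be compared by size and visual quality.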

Update (20110305):

It seems FCM came up with a similar idea: http://fullcirclemagazine.org/python-special-edition-1/. They collected the first 8 parts of the already published Python tutorials in a special edition.

Update (20110329):

I pushed this project to GitHub, see https://github.com/jabbalaci/Full-Circle-Magazine-Series. I have made some changes since, but I won’t rewrite this post each time. For the latest version, please refer to GitHub.

  1. January 6, 2012 at 09:59 | #1

    Nice Article…

    thanks for share..
