Archive for the ‘python’ Category

Scraping AJAX web pages (Part 5.5)

July 13, 2016

Don’t forget to check out the rest of the series too!

This post is very similar to the previous one (Part 5), which scraped a webpage with PhantomJS from the command line and sent the output to stdout.

This time we use PhantomJS again, but we do it from a Python script and wrap Selenium around PhantomJS. The generated HTML source will be available in a variable. Here is the source:

#!/usr/bin/env python3
# encoding: utf-8

"""
required packages:
* selenium
optional packages:
* bs4
* lxml
"""

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
# from bs4 import BeautifulSoup

url = "http://simile.mit.edu/crowbar/test.html"

dcap = dict(DesiredCapabilities.PHANTOMJS)
dcap["phantomjs.page.settings.userAgent"] = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/53 "
    "(KHTML, like Gecko) Chrome/15.0.87"
)
driver = webdriver.PhantomJS(desired_capabilities=dcap)
driver.get(url)
html = driver.page_source
print(html)
# soup = BeautifulSoup(driver.page_source, "lxml")  # page_source holds the page after rendering
# driver.save_screenshot('screen.png') # save a screenshot to disk

driver.quit()

The script sets the user agent (optional but recommended) and stores the rendered source in the variable html. The two commented-out lines before driver.quit() also work if you uncomment them: the first feeds the source to BeautifulSoup so that you can extract parts of the HTML (remember to uncomment the import as well), and the second saves a screenshot of the webpage.
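To illustrate the "extract part of the HTML source" step without needing a browser, here is a minimal sketch. It deliberately swaps BeautifulSoup for the standard library's html.parser so that it runs with no extra packages; the names LinkCollector and extract_links and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html):
    """Return all link targets found in the given HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Stand-in for driver.page_source (sample markup, not the real test page).
sample_html = """
<html><body>
  <p>Hello</p>
  <a href="http://example.com/a">first</a>
  <a href="/b">second</a>
</body></html>
"""

print(extract_links(sample_html))
```

In the script above you would pass html instead of sample_html.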

email notification from a script

June 15, 2016

Problem
I have a DigitalOcean VPS on which several scripts are running. Some of them run for a day. I would like to get an email notification when a particular script starts or ends, or when something notable happens.

In short: how to send an email from the command line?

Solution
First, do the necessary configuration to be able to send emails from the command line (more details here).

Sending email without a body:

mailx -s "subject" "to@email.com" < /dev/null 2>/dev/null

Sending email with a body:

echo 'this is the body of the email' | mailx -s "subject" "to@email.com" 2>/dev/null

I also made a Python wrapper for it that you can find here.
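The author's wrapper is not reproduced here. As a rough idea of what such a wrapper could look like, here is a minimal sketch that assumes mailx is installed and configured as described above; the function names build_mailx_command and send_mail are hypothetical.

```python
import subprocess


def build_mailx_command(subject, to):
    """Return the argument list for: mailx -s SUBJECT TO"""
    return ["mailx", "-s", subject, to]


def send_mail(subject, to, body=""):
    """Send a mail through mailx and return True if it exited with status 0.

    The body is passed on stdin, like the echo pipe above; an empty body
    corresponds to the '< /dev/null' variant.
    """
    result = subprocess.run(
        build_mailx_command(subject, to),
        input=body,
        text=True,
        stderr=subprocess.DEVNULL,  # same effect as 2>/dev/null
    )
    return result.returncode == 0
```

A script could then call send_mail("job finished", "to@email.com", "this is the body of the email") at the point where it wants to notify you.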

Categories: bash, python

[vim] run current file with Python

Problem
You use (neo)vim for editing your Python code and you want to execute the source code in your editor. The output of the script should appear in the editor.

Solution
I came up with a dynamic solution, i.e. the interpreter is taken from the first line of the code. If you specified “#!/usr/bin/env python2”, then python2 is used; if you have “#!/usr/bin/env python3”, then python3 is used.

But what if you use Anaconda and you have for instance “#!/opt/anaconda3/bin/python3” in the first line? Then simply this interpreter is used.

Here is the snippet from my config file:

" run python script {{{
    function! RunWithPython()
        let first = getline(1)
        let first = substitute(first, "^#!", "", "")
        let first = substitute(first, "\n", "", "")
        let exe = ""    " the Python binary to call

        if first =~ "/usr/bin/env "
            let exe = split(first)[-1]
        elseif first == "/opt/anaconda3/bin/python3"
            let exe = first
        endif
        if exe == ""
            echo "Error: unknown Python interpreter in the first line."
            return
        endif
        " echo exe
        " shellescape() keeps file names with spaces or special characters intact
        echo system(exe . " " . shellescape(expand('%')))
    endfunction

    au FileType python nnoremap <buffer> <F9> :call RunWithPython()<cr>
" }}}

If your Anaconda lives in a different location, simply adjust the path in the elseif comparison (line 10 of the snippet).
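For readers less familiar with Vim script, the interpreter selection can be restated in Python. This is only a sketch that roughly mirrors the function above; the name interpreter_from_shebang is made up, and the Anaconda path is the one used in the example.

```python
ANACONDA_PYTHON = "/opt/anaconda3/bin/python3"  # path taken from the example above


def interpreter_from_shebang(first_line):
    """Return the interpreter to call, or None if it cannot be determined."""
    first = first_line.strip()
    if first.startswith("#!"):
        first = first[2:]

    if "/usr/bin/env " in first:
        # "#!/usr/bin/env python3" -> "python3"
        return first.split()[-1]
    if first == ANACONDA_PYTHON:
        return first
    return None


print(interpreter_from_shebang("#!/usr/bin/env python3"))
print(interpreter_from_shebang("#!/opt/anaconda3/bin/python3"))
print(interpreter_from_shebang("#!/usr/bin/python3"))
```

Any other first line, such as a direct “#!/usr/bin/python3”, is treated as unknown, which corresponds to the error branch in the Vim function.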

Categories: python, vim

setting the volume from the command line

Problem
I have a laptop where the default volume is weak. In the system tray the volume is at 100% but it’s still weak. Until now I have been starting “pavucontrol”, which is a GUI application, and setting the volume to 150% there (that’s the maximum). However, if I watch a YouTube video and pause it, the volume in pavucontrol falls back to 100%, so I need to adjust it after each pause.

Solution
I found a command line program that allows one to set the volume. It’s independent of pavucontrol. So I made a script that runs automatically when the graphical interface comes up:

#!/usr/bin/env bash

#
# from http://askubuntu.com/questions/44680
# listing current volume:
#
#     pacmd list-sinks | grep volume
#

cmd="pacmd set-sink-volume 0 100000"
echo "#" $cmd
$cmd

First list your sinks with “pacmd list-sinks”. I only had one, hence the id 0. The value 100000 is the volume (on my system it’s equivalent to 153%). The value 65536 corresponds to a volume of 100%.
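The arithmetic behind these numbers can be sketched in a few lines. The sketch assumes that PulseAudio's reference value for 100% is 65536 (0x10000) and that the mapping is linear, which matches the 153% reported for 100000; the function names are made up for illustration.

```python
PA_VOLUME_NORM = 65536  # assumed raw value for 100%


def percent_to_raw(percent):
    """Convert a volume in percent to the raw value expected by pacmd."""
    return round(PA_VOLUME_NORM * percent / 100)


def raw_to_percent(raw):
    """Convert a raw pacmd volume to a rounded percentage."""
    return round(raw * 100 / PA_VOLUME_NORM)


def pacmd_command(percent, sink=0):
    """Build the pacmd call as an argument list (sink 0 as in the script above)."""
    return ["pacmd", "set-sink-volume", str(sink), str(percent_to_raw(percent))]


print(raw_to_percent(100000))   # the value used in the script above
print(pacmd_command(150))
```

The argument list returned by pacmd_command can be handed to subprocess.run to apply the volume.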

Tip from here.

Update (20160604)
I made a wrapper script around pacmd; you can find it here on GitHub. Its usage is very simple. Do you want to increase the volume? Just call “volume.py 140%” and you are done.

drawing trees / graphs easily

Problem
You want to draw a binary tree / graph quickly. You’ve already used graphviz but you forgot its DOT language and you don’t want to read its documentation again. What do you do?

Solution
I found a nice Python project that does exactly this: pygraph.

Command to execute:

$ pygraph -u -e neato circle ab bc cd de ea

Output: circle.png

See the project’s page for more examples and figures.

I sent a request to the author and he was kind enough to implement the --dot option, which writes the .dot source to a file. This way you can easily tweak your graph. Don’t write the .dot file from scratch: generate a basic source quickly with pygraph, then refine it manually if necessary.
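If you have forgotten what DOT source looks like, the following sketch builds it for the circle example. It is not pygraph's implementation and its output may differ from what --dot produces; it assumes that each two-letter argument such as ab names an edge between the single-letter nodes a and b, and that -u means undirected. The function name edges_to_dot is made up.

```python
def edges_to_dot(name, edges, directed=False):
    """Return DOT source for a graph given edges like "ab" (node a to node b)."""
    keyword = "digraph" if directed else "graph"
    arrow = "->" if directed else "--"
    lines = [f"{keyword} {name} {{"]
    for pair in edges:
        src, dst = pair[0], pair[1]
        lines.append(f"    {src} {arrow} {dst};")
    lines.append("}")
    return "\n".join(lines)


print(edges_to_dot("circle", ["ab", "bc", "cd", "de", "ea"]))
```

Saved to a file, this kind of source can be rendered with graphviz, for instance with the neato engine used in the command above.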

Categories: python

vuze crashes after some time

April 12, 2016

Problem
I like the bittorrent client Vuze but it crashes on some of my machines after a while.

Solution
I wrote a monitoring script that periodically checks whether Vuze is running. If Vuze dies, the script restarts it automatically.

The source code is here: https://github.com/jabbalaci/Vuze-Restarter .
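The linked repository contains the actual script. The general idea of such a watchdog can be sketched as follows; this is not the author's code. It assumes that pgrep is available and that the process can be recognised by a pattern in its command line. All function names, the pattern, and the start command are hypothetical.

```python
import subprocess
import time


def is_running(pattern):
    """True if pgrep finds a process whose command line matches pattern."""
    result = subprocess.run(
        ["pgrep", "-f", pattern],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def start(command):
    """Launch the program in the background without waiting for it."""
    subprocess.Popen(command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)


def watch(pattern, command, interval=60, checks=None,
          is_running_fn=is_running, start_fn=start, sleep_fn=time.sleep):
    """Restart the program whenever it is not running.

    checks=None loops forever; a number limits the loop (handy for testing).
    Returns the number of restarts performed.
    """
    restarts = 0
    done = 0
    while checks is None or done < checks:
        if not is_running_fn(pattern):
            start_fn(command)
            restarts += 1
        done += 1
        sleep_fn(interval)
    return restarts
```

A call such as watch("vuze", ["vuze"]) would then poll once a minute; the exact pattern and command depend on how Vuze is installed on your machine.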

Categories: linux, manjaro, python

dropbox: command-line interface

September 12, 2015

Problem
I wanted to check the status of my Dropbox client from the terminal. More precisely, I wanted to write a script that executes an action once my Dropbox folder is fully synced, so I needed to find out whether the client is still “working” or already “synced”.

Solution
I found the solution here. It turned out that Dropbox has an official command-line script that can do this and much more. First, get it:

wget -O ~/dropbox.py "https://www.dropbox.com/download?dl=packages/dropbox.py"
chmod u+x ~/dropbox.py
~/dropbox.py status

This is a Python script, written in Python 2, thus I modified the first line to be “#!/usr/bin/env python2”.

This script can do several things for you:

$ dropbox.py 
Dropbox command-line interface

commands:

Note: use dropbox help <command> to view usage for a specific command.

 status       get current status of the dropboxd
 help         provide help
 puburl       get public url of a file in your dropbox
 stop         stop dropboxd
 running      return whether dropbox is running
 start        start dropboxd
 filestatus   get current sync status of one or more files
 ls           list directory contents with current sync status
 autostart    automatically start dropbox at login
 exclude      ignores/excludes a directory from syncing
 lansync      enables or disables LAN sync

Some years ago I wrote a simple script to get the public URL of a file in my Dropbox folder. This script can do that too with the “puburl” command.
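Coming back to the original goal of running an action once the folder is fully synced, the check could be sketched like this. The sketch assumes that the script was saved as ~/dropbox.py as shown above and that the status text contains “Up to date” when syncing has finished; that marker is an assumption, so verify it against the output on your own machine. The function names are hypothetical.

```python
import os
import subprocess

DROPBOX_CLI = os.path.expanduser("~/dropbox.py")  # location used in this post
SYNCED_MARKER = "Up to date"  # assumed status text when fully synced


def is_synced(status_text, marker=SYNCED_MARKER):
    """True if the status text contains the marker (case-insensitive)."""
    return marker.lower() in status_text.lower()


def dropbox_status(cli=DROPBOX_CLI):
    """Run '<cli> status' and return its output as a stripped string."""
    result = subprocess.run(
        [cli, "status"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    return result.stdout.strip()
```

A script would then test is_synced(dropbox_status()) in a loop, sleeping between attempts, and run its action as soon as the test succeeds.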

Categories: bash, python