Archive for the ‘web’ Category

[wget] downloading images

December 11, 2017

While working on a Python script, I wanted to download images from various websites. I delegated the job to wget, which I called as an external program. However, some of the downloads failed. I checked those images, and they opened nicely in my browser. What da hell?

Some web servers check the client, and if it doesn’t look like a browser, they simply block it. Our job is to make wget pretend it’s a normal browser. Put the following content in your “~/.wgetrc”:

header = Accept-Language: en-us,en;q=0.5
header = Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
header = Connection: keep-alive
user_agent = Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0
referer = /
robots = off

Problem solved. I found this tip here.
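
If you would rather not touch “~/.wgetrc”, the same settings can be passed on the command line from the Python script. Here is a minimal sketch: the helper names `build_wget_command` and `download` are made up for illustration, while `--user-agent`, `--header`, `--referer`, `-e` and `-O` are standard wget options. The values are copied from the config above.

```python
import subprocess

# Values copied from the ~/.wgetrc settings above.
HEADERS = [
    "Accept-Language: en-us,en;q=0.5",
    "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Connection: keep-alive",
]
USER_AGENT = (
    "Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:40.0) "
    "Gecko/20100101 Firefox/40.0"
)
REFERER = "/"


def build_wget_command(url, dest):
    """Return the argv list for a wget call that presents itself as a browser."""
    cmd = [
        "wget",
        "--user-agent=" + USER_AGENT,
        "--referer=" + REFERER,
        "-e", "robots=off",  # same effect as "robots = off" in .wgetrc
    ]
    for header in HEADERS:
        cmd.append("--header=" + header)
    cmd += ["-O", dest, url]  # write the download to dest
    return cmd


def download(url, dest):
    """Run wget (hypothetical helper) and report whether it exited with 0."""
    result = subprocess.run(build_wget_command(url, dest))
    return result.returncode == 0
```

Calling `download(image_url, "image.jpg")` would then fetch one image. Passing the arguments as a list avoids any shell quoting problems with the long header values.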

Categories: bash, web

Write your first Firefox extension

November 20, 2017
Categories: firefox, web

Extract the significant parts of a web page

October 1, 2017

From a web page you want to extract the significant parts: title, author, date of publication, body, etc.

Mercury Web Parser does exactly this. It’s free. After registration you get an API key. Their web service returns a structured JSON response. I tried it with my previous post:

curl -H "x-api-key: <my_api_key>" "" | python3 -m json.tool

{
    "title": "Re-run a command in the terminal every X\u00a0seconds",
    "author": "Jabba Laci",
    "date_published": "2017-09-30T22:02:19.000Z",
    "dek": null,
    "lead_image_url": "",
    "content": "<div class=\"content\"> <p><strong>Problem</strong><br>\nYou want re-execute a command in the terminal every X seconds. For instance, you copy a lot of big files to a partition and you want to monitor the size of the free space on that partition.</p>\n<p><strong>Solution</strong><br>\nA naive and manual approach to the problem mentioned above is to execute the commands “<code>clear; df -h</code>” regularly, say every 2 seconds.</p>\n<p>A better way is to use the command “<code>watch</code>“. Usage:</p>\n<pre> watch -n 2 df -h </pre>\n<p>That is: execute “<code>df -h</code>” every two seconds. <code>watch</code> will also clear the screen and print the result to the top. You can quit with <code>Ctrl + c</code>.</p>\n<p>Tip from <a href=\"\">here</a>.</p> </div>",
    "next_page_url": null,
    "url": "",
    "domain": "",
    "excerpt": "Problem You want re-execute a command in the terminal every X seconds. For instance, you copy a lot of big files to a partition and you want to monitor the size of the free space on that partition.\u2026",
    "word_count": 108,
    "direction": "ltr",
    "total_pages": 1,
    "rendered_pages": 1
}

Pretty impressive.
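
Once you have the JSON, picking out the interesting fields in Python is straightforward. The sketch below is only an illustration: `SAMPLE_RESPONSE` is a shortened copy of the response above, and `summarize` is a made-up helper name. It assumes the date always comes in the format seen in this one response.

```python
import json
from datetime import datetime, timezone

# Shortened copy of the response shown above (only a few fields kept).
SAMPLE_RESPONSE = r'''
{
    "title": "Re-run a command in the terminal every X\u00a0seconds",
    "author": "Jabba Laci",
    "date_published": "2017-09-30T22:02:19.000Z",
    "word_count": 108
}
'''


def summarize(raw_json):
    """Pick a few fields from the parser's JSON and normalize them."""
    data = json.loads(raw_json)
    # Assumption: the timestamp is UTC, formatted like the sample above.
    published = datetime.strptime(
        data["date_published"], "%Y-%m-%dT%H:%M:%S.%fZ"
    ).replace(tzinfo=timezone.utc)
    return {
        # Replace the non-breaking space (\u00a0) with a plain space.
        "title": data["title"].replace("\u00a0", " "),
        "author": data["author"],
        "published": published,
        "word_count": data["word_count"],
    }


if __name__ == "__main__":
    info = summarize(SAMPLE_RESPONSE)
    print(info["title"], "by", info["author"], "on", info["published"].date())
```

In a real script you would feed `summarize` the body returned by the web service instead of the hard-coded sample.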

Categories: web

[webdev] sortable table

January 29, 2015

If you create a table with HTML, it’s static. It would be great if you could sort it by various columns. How to do that?

Create the static table (like before) and integrate it with a JavaScript library that will make it sortable. I found a great solution for this called tablesorter. You can also find it on GitHub, though the first link contains more documentation and examples.

Categories: javascript, web

[webdev] add a stylish JavaScript image gallery

January 24, 2015

I wanted to add a modern JavaScript image gallery to a website. I had some thumbnails and I wanted the following: (1) clicking on a thumbnail should display the full image, (2) the gallery should be browsable in both directions.

The Lightbox script worked like a charm for me. I’m not a JavaScript expert but I could integrate it in 5 minutes. Awesome script!

Check out the project’s home page for a demo.

Categories: javascript, web

10 Minute Mail: a great alternative

Usually I use the service but 10 Minute Mail is even simpler.

Thanks Jeszy for the tip.


Analyze a User-Agent string

You can analyze a user-agent string with
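
For a quick local look, without any web service, a few lines of Python can split a user-agent string into its parts. This is only a rough sketch with a made-up helper name (`parse_user_agent`), and it assumes the common "Name/Version (platform; details) Name/Version" layout. Real user-agent strings vary a lot, so don't expect it to handle every case.

```python
import re

# Example string: the Firefox user agent used in the wget post above.
UA = "Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0"


def parse_user_agent(ua):
    """Split a user-agent string into platform details and product tokens."""
    # Platform details: the first parenthesized group, separated by ';'.
    match = re.search(r"\(([^)]*)\)", ua)
    platform = [part.strip() for part in match.group(1).split(";")] if match else []
    # Product tokens: every "Name/Version" pair in the string.
    products = re.findall(r"([A-Za-z][\w.-]*)/([\w.]+)", ua)
    return {"platform": platform, "products": products}


if __name__ == "__main__":
    info = parse_user_agent(UA)
    print(info["platform"])
    print(info["products"])
```

With this layout, the last product token is usually the browser itself.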

Categories: web