Archive for the ‘bash’ Category

mirror a website with wget

June 15, 2014

I want to crawl a website recursively and download it for local (offline) use.


wget -c --mirror -p --html-extension --convert-links --no-parent "$url"

The options:

  • -c: continue (if you stop the process with CTRL+C and relaunch it, it will continue)
  • --mirror: turns on recursion and time-stamping, sets infinite recursion depth and keeps FTP directory listings
  • -p: get all images, etc. needed to display the HTML page
  • --html-extension: save HTML docs with .html extensions (newer wget versions call this option --adjust-extension, or -E)
  • --convert-links: make links in downloaded HTML point to local files
  • --no-parent: do not ever ascend to the parent directory when retrieving recursively. This is a useful option, since it guarantees that only the files below a certain hierarchy will be downloaded.

Tips from here.

Get URLs only
If you want to spider a website and get the URLs only, check this post out. In short:

wget --spider --force-html -r -l2 "$url" 2>&1 | grep '^--' | awk '{ print $3 }'

Here -l2 limits the recursion depth to 2 levels. You may have to change this value.
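The pipeline above can be wrapped in a small filter. This is only a sketch: the function name wget_log_urls is made up for illustration, and it assumes wget's usual log format, where every request line starts with "--", followed by a timestamp and then the URL. Since wget may request the same URL more than once, the sketch also drops duplicates with sort -u.

```bash
# wget_log_urls: hypothetical helper (not from the original post).
# Reads wget log output on stdin and prints each requested URL once.
wget_log_urls() {
    grep '^--' |              # request lines look like: --DATE TIME--  URL
        awk '{ print $3 }' |  # the URL is the third whitespace-separated field
        sort -u               # remove repeated URLs
}

# Usage (assumes $url holds the start page):
#   wget --spider --force-html -r -l2 "$url" 2>&1 | wget_log_urls > urls.txt
```

Writing the result to a file makes it easy to inspect the list before downloading anything.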

Categories: bash

using gtk-recordmydesktop

Here I sum up how I use gtk-recordmydesktop. It produces huge .ogv files that I like to convert to much smaller .mp4 files (without visible quality loss).

My gtk-recordmydesktop settings:

  • video quality 100%
  • audio quality 100%
  • in Advanced -> Performance:
    • frames per second: 20
    • full shots at every frame: yes

Convert .ogv to .mp4:

/opt/ffmpeg/ffmpeg -i "input.ogv" -codec:v libx264 -quality good -cpu-used 0 -profile:v baseline -level 30 -y -maxrate 2000k -bufsize 2000k -threads 4 -codec:a copy -b:a 128k "output.mp4"

I record audio with a microphone but it always has some white noise. To get rid of it, extract the audio:

/opt/ffmpeg/ffmpeg -i file.mp4 -f wav output.wav

Open the .wav file with Audacity, remove the noise and save the result in .mp3 format. Finally, replace the audio in the .mp4 file:

/opt/ffmpeg/ffmpeg -i audio.mp3 -i video.mp4 -map 1:v:0 -map 0:a:0 -c copy final_video.mp4
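
The extract and replace steps can be kept in two small shell functions. This is a sketch under assumptions: the function names extract_audio and replace_audio are invented here, and the FFMPEG variable simply defaults to the /opt/ffmpeg/ffmpeg path used above. The explicit -map options state which input supplies the video stream and which supplies the audio stream, so the old noisy track cannot be picked by accident.

```bash
# Path to the ffmpeg binary; defaults to the self-compiled one used in this post.
FFMPEG=${FFMPEG:-/opt/ffmpeg/ffmpeg}

# extract_audio VIDEO WAV: hypothetical helper.
# Writes the audio of VIDEO to a WAV file that Audacity can open.
extract_audio() {
    "$FFMPEG" -i "$1" -f wav "$2"
}

# replace_audio VIDEO AUDIO OUT: hypothetical helper.
# Takes the video stream from VIDEO (input 1) and the audio stream from
# AUDIO (input 0), and copies both into OUT without re-encoding.
replace_audio() {
    "$FFMPEG" -i "$2" -i "$1" -map 1:v:0 -map 0:a:0 -c copy "$3"
}

# Usage:
#   extract_audio video.mp4 output.wav
#   (clean output.wav in Audacity and export it as audio.mp3)
#   replace_audio video.mp4 audio.mp3 final_video.mp4
```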

Categories: bash


I have an .ogv file that I want to convert to .mp4.

My ffmpeg is compiled from source. With the following command I got output of good quality:

/opt/ffmpeg/ffmpeg -i "input.ogv" -codec:v libx264 -quality good -cpu-used 0 -profile:v baseline -level 30 -y -maxrate 2000k -bufsize 2000k -threads 4 -codec:a copy -b:a 128k "output.mp4"

Categories: bash

measure execution time in bash

While working in bash, I often need to launch scripts or programs that take some time to finish, and I would like to know how long they took. How can I do that?

Solution #1 and #2 (easy to forget)
In Unix, there is a command called time, which can measure the execution time of a process. Example:

$ time sleep 3

real    0m3.001s
user    0m0.000s
sys     0m0.001s

Nice, but you don’t want to start everything with time, do you?

Another way is to print the date before and after the process:

$ date; echo "serious calculation in progress..."; sleep 3; date
Tue May 13 11:40:42 CEST 2014
serious calculation in progress...
Tue May 13 11:40:45 CEST 2014

Now you can see when the process started and when it finished.
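
If you would rather see the difference than two timestamps, the same idea can be sketched with date +%s. The function name run_timed is hypothetical; it is not a standard command. The sketch only has a resolution of whole seconds.

```bash
# run_timed: hypothetical helper name (not from the post).
# Runs the given command, then reports the elapsed wall-clock seconds on stderr.
run_timed() {
    local start end status
    start=$(date +%s)     # seconds since the epoch, before the command
    "$@"                  # run the command exactly as given
    status=$?             # remember its exit status
    end=$(date +%s)
    echo "elapsed: $(( end - start ))s" >&2
    return "$status"      # pass the command's exit status through
}

# Usage:
#   run_timed sleep 3
```

The report goes to stderr so that it does not get mixed into output you may be piping elsewhere.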

Fine, but… What if you start a process normally and then realize after a few minutes that it won’t finish soon? Stopping it and restarting it with one of the two methods above is hardly appealing. So you wait until it finishes, and if you are not at the computer, you will have no idea how long it ran. Damn!

Is there an easy solution for this problem? A painless, straightforward way? Well, yes, there is. See below.

Solution #3 (the easy way)
In bash, you can customize the prompt via the PS1 variable. By default, its value is set to something similar to this:

$ export PS1="\u@\h:\w\$ "
jabba@nancy:~$ cd /trash/

All we need to do is add the time to the prompt. Then, when a process terminates, you will see right away when it stopped, and the time in the previous prompt tells you the earliest it could have started. To add the time, simply insert “\t” into PS1:

$ export PS1="\u@\h [\t] \w\$ "
jabba@nancy [11:52:20] ~$ cd /trash
jabba@nancy [11:52:29] /trash$

This is the basic version. In addition, my prompt is colored. To have a colored prompt, I include a file in my .bashrc. Thus, the end of my .bashrc looks like this:

# this is the last line in my .bashrc:
source ~/.bash_prompt

You can find my .bash_prompt file here.
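
For illustration, a colored prompt file could look like the sketch below. It is a hypothetical minimal version, not the actual file mentioned above, and the colors are chosen arbitrarily.

```bash
# ~/.bash_prompt -- hypothetical minimal version, not the author's actual file.
# \[ and \] tell bash that the enclosed escape codes take up no screen width.
green='\[\e[0;32m\]'
yellow='\[\e[0;33m\]'
blue='\[\e[0;34m\]'
reset='\[\e[0m\]'

# user@host in green, the time (\t) in yellow, the working directory in blue.
# The final part is single-quoted so that \$ reaches bash unchanged.
PS1="${green}\u@\h${reset} [${yellow}\t${reset}] ${blue}\w${reset}"'\$ '

# The helper variables are not needed once PS1 is set.
unset green yellow blue reset
```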


Categories: bash

ménage de printemps (spring cleaning)

March 22, 2014

My Dropbox folder was at 98.5%, so it was time to do some cleanup. Which directories are the largest? Which files are the largest?


alias top10dirs='du -hsx * | sort -rh | head -10'
alias top10files='find . -type f -print0 | du -h --files0-from=- | sort -hr | head -n 10'

The first one shows the top 10 largest directories, while the second one prints the top 10 largest files. Directory and file sizes are shown in a human-readable format.


$ top10dirs 
60M     20090629-deploy
60M     20090327-deploy
56M     kgm
55M     exist-deploy-v3-20100710
55M     exist-deploy-v3-20100521
$ top10files 
60M     ./20090629-deploy/
60M     ./20090327-deploy/
55M     ./exist-deploy-v3-20100710/
55M     ./exist-deploy-v3-20100521/
49M     ./exist-deploy-v3-20100409/


  • top10dirs is from here
  • for top10files I wrote a Python script, but reddit user farsass pointed out that it can be solved more easily in the shell
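
The top10files alias always works on the current directory and always prints ten entries. A function version can take both as arguments. The name largest_files is made up for this sketch; like the alias, it assumes GNU du and GNU sort.

```bash
# largest_files [DIR] [N]: hypothetical function name.
# Prints the N largest regular files below DIR (defaults: current directory, 10).
# Relies on GNU du (--files0-from) and GNU sort (-h), like the aliases above.
largest_files() {
    local dir=${1:-.} n=${2:-10}
    find "$dir" -type f -print0 |   # NUL-separated, so odd file names are safe
        du -h --files0-from=- |     # human-readable size of each file
        sort -rh |                  # largest first
        head -n "$n"
}

# Usage:
#   largest_files ~/Dropbox 20
```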

Find the largest subdirectories

March 21, 2014

The free space on your HDD is low. Which directories are the largest? What consumes so much space?

Install “ncdu”, which stands for NCurses Disk Usage.

“ncdu (NCurses Disk Usage) is a curses-based version of the well-known ‘du’, and provides a fast way to see what directories are using your disk space.” (source: man)

For a command line solution, check out this post: How Do I Find The Largest Top 10 Files and Directories On a Linux / UNIX / BSD?

Categories: bash

extract .tar.xz

February 15, 2014
$ tar xvJf file.tar.xz
# or (GNU tar detects the compression automatically when extracting)
$ tar xvf file.tar.xz
# or
$ tar --xz -xvf file.tar.xz

Why is *.tar.gz still much more common than *.tar.xz?
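
Since both formats are still around, a small function can pick the tar flag from the file extension. This is a sketch: the name extract_tar is hypothetical, and the flags are the usual GNU tar ones (-z for gzip, -j for bzip2, -J for xz).

```bash
# extract_tar ARCHIVE: hypothetical helper.
# Chooses the decompression flag from the file extension and extracts
# the archive into the current directory.
extract_tar() {
    case "$1" in
        *.tar.gz|*.tgz) tar -xzvf "$1" ;;
        *.tar.bz2)      tar -xjvf "$1" ;;
        *.tar.xz)       tar -xJvf "$1" ;;
        *.tar)          tar -xvf "$1" ;;
        *)
            echo "extract_tar: unsupported archive: $1" >&2
            return 1
            ;;
    esac
}

# Usage:
#   extract_tar file.tar.xz
```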

Categories: bash

bittorrent client from the command line

January 14, 2014


sudo apt-get install transmission-cli


$ transmission-cli  -w <download_dir>  <file|url|magnet>

Categories: bash

print the content of a file with line numbers

November 4, 2013
cat -n file.txt
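
cat -n numbers every line, including empty ones, while cat -b numbers only the non-empty lines. If a different layout is needed, an awk one-liner is one possible alternative. The function name numbered and the chosen format are assumptions made for this sketch.

```bash
# numbered [FILE...]: hypothetical helper.
# Prints each input line prefixed with its line number, right-aligned in
# four columns and followed by two spaces. Reads stdin if no file is given.
numbered() {
    awk '{ printf "%4d  %s\n", NR, $0 }' "$@"
}

# Usage:
#   numbered file.txt
```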

Categories: bash
