Archive for the ‘bash’ Category

MongoDB: connect remotely

August 8, 2016

Problem
I have a Digital Ocean VPS running MongoDB. There is a web application on this machine that is on port 80. MongoDB is hidden from the outside world and can only be accessed internally. There is also an SSH port where I can log in.

How can I connect to my MongoDB server from home? Say I want to use a graphical client, e.g. MongoChef. The client runs on my home machine and I want to use it to connect to MongoDB on my DO VPS. How can I do that?

Solution
I found the solution here. In short: we connect securely to our database through an SSH tunnel.

Make sure that:

  • you can SSH into your Mongo droplet
  • your MongoDB is bound to localhost

For connecting, I use this script:

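#!/usr/bin/env bash
REMOTE_SSH_PORT=1234
LOCAL_PORT=2345
REMOTE_MONGO_PORT=27017
# build the command as an array so that every argument stays a single word
cmd=(ssh -p "${REMOTE_SSH_PORT}" -L "${LOCAL_PORT}:localhost:${REMOTE_MONGO_PORT}" user@your.remote.ip)
echo "# ${cmd[*]}"
echo "# connect on your home machine to port ${LOCAL_PORT}"
echo "# example:    mongo --port ${LOCAL_PORT}"
# run ssh; the tunnel stays open as long as this session is alive
"${cmd[@]}"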

The default SSH port is 22, but it’s a good idea to change it. With the command “ssh -p ${REMOTE_SSH_PORT} user@your.remote.ip” I could log in to my VPS. However, MongoDB was not accessible from outside, thus executing “mongo --host your.remote.ip --port ${REMOTE_MONGO_PORT}” failed.

The SSH tunnel above works as follows. On your home machine, ssh opens the port ${LOCAL_PORT} and forwards every connection made to it through the SSH connection (remote port ${REMOTE_SSH_PORT}) to localhost:${REMOTE_MONGO_PORT}. Here localhost is resolved on the remote side, so it means the remote machine where we logged in with SSH.

So, when you execute the script above, you’ll have to log in to your remote machine via SSH. Then open a new terminal and type “mongo --port 2345” and voilà, you are connected to MongoDB on your remote machine!

If you use a Mongo client (e.g. MongoChef), then simply create a new connection and specify localhost with port 2345. Connect, and you are in.

It works as long as you are logged in via SSH in that terminal. When you log out, the local port that is tunneled to your remote machine closes.
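
The script above also opens an interactive shell on the VPS. If only the tunnel is needed, the -N option of ssh (do not execute a remote command) keeps the connection open without a shell. A minimal sketch, reusing the placeholder ports and address from the script; REMOTE_HOST, DRY_RUN and open_tunnel are names made up for this example:

```shell
#!/usr/bin/env bash
# Placeholders taken from the script above; replace them with your own values.
REMOTE_SSH_PORT=1234
LOCAL_PORT=2345
REMOTE_MONGO_PORT=27017
REMOTE_HOST="user@your.remote.ip"   # hypothetical variable name

# open_tunnel: forward LOCAL_PORT to MongoDB on the remote machine.
# -N tells ssh not to run a remote command, so no shell is opened.
# With DRY_RUN=1 the command is only printed, not executed.
open_tunnel() {
  set -- ssh -N -p "${REMOTE_SSH_PORT}" \
    -L "${LOCAL_PORT}:localhost:${REMOTE_MONGO_PORT}" "${REMOTE_HOST}"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

Calling open_tunnel blocks until you press Ctrl-C, which also closes the local port. “DRY_RUN=1 open_tunnel” only prints the ssh command, which is handy for checking the ports first.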

Scraping AJAX web pages (Part 5)

July 13, 2016

Don’t forget to check out the rest of the series too!

I’ve already written about PhantomJS, for instance here. Recall: PhantomJS is a headless WebKit scriptable with a JavaScript API.

The problem is still the same: we have a webpage that contains lots of JavaScript code and we want to get the final HTML that is produced after the JavaScript code has been executed.

Example: http://simile.mit.edu/crowbar/test.html. If you download it with “wget” for instance, you get the text “Hi lame crawler” in the source. However, JavaScript code changes this text to “Hi Crowbar!” in the browser and we want to get this generated source. How?

This time we’ll use PhantomJS. We also need a JavaScript script that will instruct PhantomJS what to do. Let’s call it printSource.js:

var system = require('system');
var page   = require('webpage').create();
// system.args[0] is the filename, so system.args[1] is the first real argument
var url    = system.args[1];
// render the page, and run the callback function
page.open(url, function () {
  // page.content is the source
  console.log(page.content);
  // need to call phantom.exit() to prevent from hanging
  phantom.exit();
});

Note that this code comes from here.

If you want to set the user agent, use this modified script:

var system = require('system');
var page   = require('webpage').create();
page.settings.userAgent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)';
// system.args[0] is the filename, so system.args[1] is the first real argument
var url    = system.args[1];
// render the page, and run the callback function
page.open(url, function () {
  // page.content is the source
  console.log(page.content);
  // need to call phantom.exit() to prevent from hanging
  phantom.exit();
});

Then launch the following command:

$ phantomjs printSource.js http://simile.mit.edu/crowbar/test.html

The output is printed to the standard output.
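
Since the generated HTML goes to the standard output, it can be piped into other tools. A small sketch that checks whether the rendered source contains a given text; rendered_contains and the PHANTOMJS variable are hypothetical names, and printSource.js is assumed to be in the current directory:

```shell
#!/usr/bin/env bash
# Command used to start PhantomJS; overridable, e.g. with a full path (assumption).
PHANTOMJS="${PHANTOMJS:-phantomjs}"

# rendered_contains URL TEXT
# Exit status 0 if the HTML produced by printSource.js contains TEXT.
rendered_contains() {
  # -q: print nothing, -F: treat TEXT as a fixed string, not a regex
  "${PHANTOMJS}" printSource.js "$1" | grep -q -F -- "$2"
}
```

With the example page above, one could call it like this: rendered_contains http://simile.mit.edu/crowbar/test.html 'Hi Crowbar!' && echo "JavaScript was executed"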

If you want to do the same thing from a Python script, check out Part 5.5 of the series.

Categories: bash

limit the CPU usage of Firefox / Dropbox / etc.

June 24, 2016

Problem
I have an older dual-core laptop where Firefox sometimes uses 120%-130% CPU and slows down the machine completely. Restarting Firefox solves the problem for a few minutes, but then it eats up the CPU again. What to do?

Solution
I don’t have many tabs open but I still have this problem. I also uninstalled the Flash plugin but it didn’t solve the problem.

However, I found a nice tool called cpulimit:

“Cpulimit is a tool which limits the CPU usage of a process (expressed in percentage, not in CPU time). It is useful to control batch jobs, when you don’t want them to eat too many CPU cycles. The goal is prevent a process from running for more than a specified time ratio. It does not change the nice value or other scheduling priority settings, but the real CPU usage. Also, it is able to adapt itself to the overall system load, dynamically and quickly. The control of the used CPU amount is done sending SIGSTOP and SIGCONT POSIX signals to processes. All the children processes and threads of the specified process will share the same percentage of CPU.” (from the README of the project)

The following setting worked for me:

$ cpulimit -l 80 firefox

Firefox uses several threads but, as mentioned in the documentation, they will share the same percentage of CPU.

The CPU usage may jump higher than the specified value, but cpulimit will push it back in a few seconds.

My old laptop has become useable again :)
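
cpulimit can also attach to a process that is already running, through its -p (PID) option. The sketch below looks up the PID with pgrep; limit_running is a hypothetical helper name, and the process name may differ on your system (for example firefox-bin instead of firefox):

```shell
#!/usr/bin/env bash
# limit_running NAME PERCENT
# Attach cpulimit to the oldest running process whose name is exactly NAME.
limit_running() {
  # pgrep -o: oldest matching process, -x: the name must match exactly
  pid="$(pgrep -o -x "$1")" || {
    echo "no running process named $1" >&2
    return 1
  }
  cpulimit -l "$2" -p "${pid}"
}
```

For instance, “limit_running firefox 80” would apply the same 80% limit as above to a Firefox that was started earlier.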

Update (with Dropbox)
I noticed that Dropbox also loves my CPU. Here is how I could limit this greedy beast. Originally, I started “$HOME/.dropbox-dist/dropboxd” automatically at each startup. Create the file “$HOME/bin/cpulimit_dropboxd.sh” with the following content:

#!/usr/bin/env bash

# quote the path so that it also works if $HOME contains spaces
cpulimit -l 50 "$HOME/.dropbox-dist/dropboxd"

Make it executable (chmod u+x cpulimit_dropboxd.sh) and call this script (cpulimit_dropboxd.sh) when your system comes up. Here I give 50% CPU to Dropbox but you can play with that value.

Categories: bash, firefox

email notification from a script

June 15, 2016

Problem
I have a Digital Ocean VPS box where several scripts are running. Some of them run for a day. I would like to get an email notification when a particular script starts / ends, or when something happens.

In short: how to send an email from the command line?

Solution
First, do the necessary configuration to be able to send emails from the command line (more details here).

Sending email without a body:

mailx -s "subject" "to@email.com" < /dev/null 2>/dev/null

Sending email with a body:

echo 'this is the body of the email' | mailx -s "subject" "to@email.com" 2>/dev/null

I also made a Python wrapper for it that you can find here.
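
The start / end notifications mentioned in the problem can also be sketched directly in the shell, by wrapping the mailx call in a function. This is only a sketch: notify, run_notified, MAILX and NOTIFY_TO are names invented for this example, and the address is the placeholder used above.

```shell
#!/usr/bin/env bash
MAILX="${MAILX:-mailx}"     # mail command, overridable (assumption)
NOTIFY_TO="to@email.com"    # placeholder address from the examples above

# notify SUBJECT [BODY]: send one email; the body may be empty.
notify() {
  printf '%s\n' "${2:-}" | "${MAILX}" -s "$1" "${NOTIFY_TO}" 2>/dev/null
}

# run_notified CMD [ARGS...]
# Send an email when CMD starts and another one when it ends,
# then return the exit status of CMD.
run_notified() {
  notify "started: $*"
  status=0
  "$@" || status=$?
  notify "finished: $* (exit status ${status})"
  return "${status}"
}
```

A long-running job could then be started as “run_notified ./my_long_script.sh”, where my_long_script.sh stands for any of your own scripts.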

Categories: bash, python

untruncated output of “ps”

Problem
You must have noticed that when you use the Unix command “ps”, the output is truncated to fit the screen’s width. How to get the full output?

Note: Instead of “ps” I usually use “ps aux”.

Solution
Use the “ww” options too:

ps auxww

Are you interested in one particular PID?

ps ww PROCESS_PID

(This is another thing that bugged me for years but a quick Google search enlightened me.)

Categories: bash

compile and try a Go project on GitHub

Problem
I found an interesting Go project on GitHub (https://github.com/ichinaski/pxl) that I wanted to try. How to compile it?

(This project “pxl” can display images in the terminal).

Solution

GOBIN=$(pwd) GOPATH=/tmp/gobuild go get github.com/ichinaski/pxl

Under Manjaro I had to install the package “gc”, which contains the official Go compiler.
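
In Go 1.18 and later, “go get” no longer builds and installs programs; “go install” with a version suffix does that instead. A sketch of the equivalent step, wrapped in a hypothetical helper called install_here (the GO variable is also an assumption of this example, and it is not verified here that pxl still builds this way):

```shell
#!/usr/bin/env bash
GO="${GO:-go}"   # Go command, overridable (assumption)

# install_here PKG
# Build the latest version of PKG and put the binary in the current directory.
install_here() {
  GOBIN="$(pwd)" "${GO}" install "$1@latest"
}
```

For the project above, the call would be “install_here github.com/ichinaski/pxl”.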

Categories: bash