Firebug: sitescraper’s best friend

When you scrape a site, you usually know exactly which part of a webpage you want to extract. The naive approach is to download the page’s source code and read through it to find the interesting part(s). But there is a better way: use Firebug.

Firebug is a Firefox add-on for web developers that lets you edit, debug, and monitor CSS, HTML, and JavaScript live in any web page. The interesting feature for us is that you can point at any element of a webpage and Firebug shows you its exact location in the source. You can also get the CSS path and/or the XPath of that element.

First, install Firebug and restart the browser. In the top right corner of the browser you’ll see a little bug icon (part A in the figure below). Clicking it opens the Firebug console. At the top of the console, click the second icon from the left (part B). Then click the element in the page that you want to inspect (part C). The relevant HTML source code will be highlighted in the console (part D). Right-click it and choose the CSS Path or XPath option from the popup menu. Now you only have to write a script that extracts this part of the page.
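
Such an extraction script could look roughly like the sketch below. It is only an illustration under a few assumptions: the sample page, the XPath string, and the helper name `extract_by_xpath` are all made up for this example, and the page is assumed to be well-formed markup so that Python’s standard `xml.etree.ElementTree` module can parse it. Real-world pages are often not well-formed, in which case a more forgiving HTML parser would be the better choice.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page; a real scraper would download the HTML instead.
SAMPLE_PAGE = """<html>
  <body>
    <div id="header"><h1>Example site</h1></div>
    <div id="content">
      <p>First paragraph.</p>
      <p>The part we want.</p>
    </div>
  </body>
</html>"""


def extract_by_xpath(page_source, absolute_xpath):
    """Return the text of the element at an absolute XPath, or None."""
    root = ET.fromstring(page_source)
    steps = absolute_xpath.strip("/").split("/")
    if steps[0] != root.tag:
        raise ValueError("XPath does not start at the root element")
    # ElementTree only accepts paths relative to an element, so drop the
    # leading root step and search from the root element instead.
    relative_path = "." if len(steps) == 1 else "./" + "/".join(steps[1:])
    element = root.find(relative_path)
    if element is None:
        return None
    # Join the text of the element and all of its descendants.
    return "".join(element.itertext()).strip()


# Assumed to be the XPath copied from the inspector for the second paragraph.
XPATH_FROM_FIREBUG = "/html/body/div[2]/p[2]"

print(extract_by_xpath(SAMPLE_PAGE, XPATH_FROM_FIREBUG))
```

Note that `ElementTree` understands only a small subset of XPath (child steps and simple positional predicates like `div[2]`), which is enough for the kind of absolute path used here but not for arbitrary expressions.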
