Firebug: sitescraper’s best friend
When you scrape a site, you usually know exactly which part of a web page you want to extract. The naive way is to download the page and read through its source code, trying to identify the interesting part(s). But there is a better way: use Firebug.
First, install the Firebug extension in Firefox and restart the browser. In the top right corner of the browser you'll see a little bug (part A on the figure below). Clicking on it opens the Firebug console. At the top of the console, click on the second icon from the left (part B). Then click on the element in the page that you want to inspect (part C). The relevant HTML source code will be highlighted in the console (part D). Right-click on it and choose the CSS path or XPath entry from the popup menu. Now you only have to write a script that uses this path to extract that part of the page.
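As a rough illustration of that last step, here is a minimal Python sketch that takes an absolute XPath of the kind copied from Firebug and pulls the matching text out of a page. Everything in it is an assumption made for the example: the PAGE content, the extract_text helper and the path /html/body/div[2]/p are hypothetical, and the page is assumed to be well-formed enough for the standard library's XML parser. Real-world HTML is often messier, so a dedicated HTML parsing library would usually be a better fit there.

```python
import xml.etree.ElementTree as ET

# Hypothetical page content; a real script would download this first.
PAGE = """<html>
  <head><title>Example shop</title></head>
  <body>
    <div id="header">Welcome</div>
    <div id="content">
      <h1>Product list</h1>
      <p class="price">19.99 EUR</p>
    </div>
  </body>
</html>"""


def extract_text(page_source, firebug_xpath):
    """Return the text of the first element matching an absolute XPath
    (for example one copied from Firebug), or None if nothing matches."""
    root = ET.fromstring(page_source)  # root is the <html> element
    prefix = "/" + root.tag
    if not firebug_xpath.startswith(prefix):
        raise ValueError("XPath does not start at the root element")
    # ElementTree only accepts relative paths, so turn "/html/body/..."
    # into "./body/..." before searching.
    relative_path = "." + firebug_xpath[len(prefix):]
    element = root.find(relative_path)
    if element is None:
        return None
    # Join the text of the element and all of its descendants.
    return "".join(element.itertext()).strip()


if __name__ == "__main__":
    # Assumed XPath: the first <p> inside the second <div> of the body.
    print(extract_text(PAGE, "/html/body/div[2]/p"))
```

The helper only handles simple absolute paths made of tag names and positions, which is the subset of XPath that ElementTree understands. Anything more elaborate would need a full XPath implementation.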