Scraping AJAX web pages (Part 1.5)
Before attacking Part 2, I think it would be useful to investigate what the generated source of a page looks like.
Consider the following source:
<html> <body> <script>document.write("Hello World!");</script> </body> </html>
If you open it in a browser, you’ll see the text “Hello World!”. It’s not a big surprise :) But what is the generated source? How does the browser interpret the original HTML above? Here are two candidates:
A) <html> <body> Hello World! </body> </html>
B) <html> <head></head> <body> <script>document.write("Hello World!");</script>Hello World! </body> </html>
Well, the correct answer is B, the second one: the browser keeps the script element, adds the missing head element, and places the written text right after the script. If you install the Web Developer add-on for Firefox, you’ll be able to see both sources: the original one (which is downloaded from the web server) and the generated one (which the browser produces after interpreting the original source).
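To see why this difference matters for scraping, here is a minimal Python sketch that extracts the visible text (everything outside script tags) from both sources. It uses only the standard library's html.parser; the names ORIGINAL, GENERATED, VisibleText and visible_text are my own, chosen for illustration, and the two strings are simply copied from the example above. It doesn't run any JavaScript, so it behaves like a scraper that only downloads the page.

```python
from html.parser import HTMLParser

# The two sources from the example above, copied as plain strings.
ORIGINAL = ('<html> <body> <script>document.write("Hello World!");</script> '
            '</body> </html>')
GENERATED = ('<html> <head></head> <body> '
             '<script>document.write("Hello World!");</script>Hello World! '
             '</body> </html>')


class VisibleText(HTMLParser):
    """Collect text nodes that are not inside a <script> element."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Skip JavaScript code and whitespace-only text between tags.
        if not self.in_script and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the text a reader would see, without executing any scripts."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


if __name__ == "__main__":
    print(repr(visible_text(ORIGINAL)))
    print(repr(visible_text(GENERATED)))
```

For the original source the function finds no visible text at all, because “Hello World!” exists only as a string inside the script. For the generated source it finds the text. That is the whole problem in a nutshell: a scraper that only downloads the page gets the first result, while the browser shows the second.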