Scraping tabular data from the web

"Needle in haystack" CC-BY-NC-ND by t_buchtele

I’ve been looking for a quick and easy way to scrape an HTML table into a usable format. Of course, there are numerous ways to do that with a small Perl, PHP or Python programme, but I found another path especially elegant. It turns out that Google Docs has an importHTML() function in Spreadsheets:

=importHTML("http://www.parlamentswahlen-2011.ch/resultate-a-z.html","table",1)

scrapes the first (1) HTML table element (“table”) from http://www.parlamentswahlen-2011.ch/resultate-a-z.html into your Google spreadsheet. Very nice!
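For comparison, here is roughly what the "small Python programme" route might look like. This is only a minimal sketch using the standard library's html.parser. The names TableExtractor and import_html_table are my own inventions, the sample markup is made up for illustration, and the parser assumes tidy HTML with explicit closing tags and no nested tables. Like importHTML(), the index counts from 1.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects the cell text of every <table> in a document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per table
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def import_html_table(html, index=1):
    """Return the index-th table (1-based, as in importHTML) as a list of rows."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables[index - 1]


# Made-up sample markup, not taken from the election site.
sample = """
<table>
  <tr><th>Party</th><th>Seats</th></tr>
  <tr><td>Example Party</td><td>12</td></tr>
</table>
<table><tr><td>other</td></tr></table>
"""

rows = import_html_table(sample, 1)
for row in rows:
    print(", ".join(row))
```

To run this against a live page, the HTML string could be fetched first, for instance with urllib.request.urlopen(), assuming you know the page's character encoding. Real-world pages are often messier than this sketch can handle, which is exactly why the one-line spreadsheet formula is so appealing.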

Hat tip to OUseful.Info for this trick :)
