Screenscraping the Senate for the semantic web
XML.com: Screenscraping the Senate
This is interesting. As a proof of concept for the Semantic Web, Paul Ford has scraped HTML from the US Senate website, combined it with a CSV list of Senators, and generated RDF from the result. “After years of reading and writing about the Semantic Web, I still can’t tell you how to build a complete Semantic Web application from scratch. At first that was because the Semantic Web was only a vague set of half-finished specifications. But now, with publicly available triple stores like Redland and Kowari, and well-established specifications for ontology development and the like, it seems like a good time to start thinking in triples.”
He also mentions another worthy project, the Open Government Information Awareness Project. Boing Boing, where I found the link to the XML.com article, mentioned a project I hadn’t heard of before called TheyWorkForYou. It evidently does for the UK Parliament what Mr Ford is attempting with the US Senate site.
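To give a feel for what "scrape the HTML, merge it with a CSV, emit RDF" can look like in practice, here is a rough Python sketch. It is not Ford's code and does not reflect the real Senate markup: the sample table, the CSV columns, the `example.org` namespace and every function name below are made up for illustration, and the output is plain N-Triples written by hand rather than through an RDF library.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical stand-in for a scraped page; the real Senate markup differs.
SAMPLE_HTML = """
<table>
<tr><td class="name">Jane Example</td><td class="state">NY</td></tr>
<tr><td class="name">John Sample</td><td class="state">CA</td></tr>
</table>
"""

# Hypothetical CSV keyed on the same name, adding a party column.
SAMPLE_CSV = "name,party\nJane Example,Independent\nJohn Sample,Independent\n"

BASE = "http://example.org/senate/"  # made-up namespace, an assumption


class SenatorTableParser(HTMLParser):
    """Collect one dict per <tr>, keyed by each <td>'s class attribute."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._field = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = {}
        elif tag == "td" and self._row is not None:
            self._field = dict(attrs).get("class")

    def handle_data(self, data):
        if self._field and data.strip():
            self._row[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr":
            if self._row:
                self.rows.append(self._row)
            self._row = None


def scrape(html):
    """Return the table rows found in the HTML as a list of dicts."""
    parser = SenatorTableParser()
    parser.feed(html)
    return parser.rows


def merge(rows, csv_text):
    """Add the CSV columns to each scraped row, matching on the name."""
    extra = {r["name"]: r for r in csv.DictReader(io.StringIO(csv_text))}
    return [{**row, **extra.get(row["name"], {})} for row in rows]


def to_ntriples(records):
    """Emit one subject-predicate-object line per field of each record."""
    lines = []
    for rec in records:
        subject = "<%s%s>" % (BASE, rec["name"].replace(" ", "_"))
        for key, value in sorted(rec.items()):
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            lines.append('%s <%sterms/%s> "%s" .' % (subject, BASE, key, escaped))
    return lines


if __name__ == "__main__":
    for line in to_ntriples(merge(scrape(SAMPLE_HTML), SAMPLE_CSV)):
        print(line)
```

Each printed line is a single triple, which is the "thinking in triples" step: every fact about a Senator becomes its own statement that a triple store could load. A real scraper would need to cope with messier markup and would more likely use an existing RDF toolkit to handle URIs and escaping.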