Sun, 30 Nov 2008
As for single people,
"I don't know, try eating chocolate cake," he said.
-- Pastor Young (source).
[/scribble/rhetoric] permanent link and comments
Scrape the Web: Strategies for programming websites that don't expect it
I just received this email:
It means:
I get to stand in front of people I don't know in Chicago and talk about web scraping!
(Unless the talk is canceled because no one signs up.)
So I guess it means I'll be going to PyCon 2009! It's in Chicago, from March 25 to April 2. If you'll be there, too, drop me a line!
[/note/debian] permanent link and comments
RDFa for the Debian Package Tracking System?
I was happy to read Zack's post about adding machine-readable metadata for the Package Tracking System. Not only can one query it via SOAP, he also provides XPath recipes for how to screen-scrape data out of the web pages.
He writes:
This is awesome. In fact, as you can see when he links to the SOAP backend, the SOAP interface is implemented using that XPath screen-scraping!
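To give a feel for what an XPath screen-scraping recipe looks like, here is a minimal sketch in Python. The markup in `PAGE`, the `class` names, and the `scrape` helper are all made up for illustration; they are not the real Package Tracking System pages or Zack's actual recipes.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified XHTML fragment. The real PTS markup may differ.
PAGE = """
<html>
  <body>
    <div id="general">
      <span class="name">example-package</span>
      <span class="maintainer">Jane Doe</span>
    </div>
  </body>
</html>
"""


def scrape(page, field):
    """Return the text of the first <span> whose class equals `field`."""
    root = ET.fromstring(page.strip())
    # ElementTree supports a limited XPath subset, including
    # attribute-equality predicates like [@class='...'].
    node = root.find(f".//span[@class='{field}']")
    return node.text if node is not None else None


maintainer = scrape(PAGE, "maintainer")
print(maintainer)
```

The weakness of this approach is visible even in the sketch: the query depends on the page's layout, so a cosmetic change to the HTML can silently break every consumer.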
What I think would be even more awesome would be to present the data to a machine user of the web page as RDFa, "RDF in attributes." RDF (short for Resource Description Framework) is a standard for metadata statements. Although early versions of RSS, the web site syndication format, were built on it, RDF in general has nothing to do with syndication.
A few sample RDF statements might be:
RDF generally uses URIs to represent information (though you can still use literal values like numbers where appropriate). This allows different users to create namespaced terminology. That way, when Debian defines what "maintainer" means, Fedora can choose whether they want to use the Debian term meaning "maintainer."
If they do, then Fedora people and Debian people could use the same query (on a different set of data) to answer the same question. And if they choose to use a different term, the two data sets can co-exist; the namespacing prevents any conflict.
As for the term URI: URIs ("Uniform Resource Identifiers") are just like URLs, except that instead of names of locations, they are just identifiers. So it's true that every URL is a URI, but you aren't necessarily intended to be able to wget every URI; they're just names.
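The namespacing idea can be sketched with plain Python tuples standing in for RDF statements (subject, predicate, object). Everything here is an assumption made for illustration: the `example.org` URIs, the package and people names, and the `objects_for` helper. Neither Debian nor Fedora publishes these terms.

```python
# Hypothetical namespaces, for illustration only.
DEBIAN_TERMS = "http://example.org/debian/terms#"
FEDORA_TERMS = "http://example.org/fedora/terms#"

# Each statement is a (subject, predicate, object) triple.
debian_data = [
    ("http://example.org/debian/pkg/foo", DEBIAN_TERMS + "maintainer", "Alice"),
    ("http://example.org/debian/pkg/bar", DEBIAN_TERMS + "maintainer", "Bob"),
]

# In this sketch Fedora reuses Debian's "maintainer" term for one statement
# and uses its own, differently namespaced term for another.
fedora_data = [
    ("http://example.org/fedora/pkg/foo", DEBIAN_TERMS + "maintainer", "Carol"),
    ("http://example.org/fedora/pkg/foo", FEDORA_TERMS + "maintainer", "Dave"),
]


def objects_for(triples, predicate):
    """Return the object of every triple that uses the given predicate URI."""
    return [obj for _subject, pred, obj in triples if pred == predicate]


# The same query runs unchanged against either data set.
debian_maintainers = objects_for(debian_data, DEBIAN_TERMS + "maintainer")
fedora_maintainers = objects_for(fedora_data, DEBIAN_TERMS + "maintainer")
print(debian_maintainers, fedora_maintainers)
```

Because the two "maintainer" predicates are different URIs, the Fedora-specific statement never shows up in a query for the Debian term, and the two vocabularies can sit in the same data set without colliding.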
Ben Adida and Mark Birbeck wrote a fantastic RDFa primer that explains the concepts and implementation, peppering it with diagrams where they might help. The key is that using RDFa gives you the ability to automatically interoperate with the world of RDF-aware tools, including query and reasoning systems, and it is architected in a way that anyone can add RDFa data to any page without possibly stepping on the toes of other extra-metadata technologies. (Microformats don't have most of these benefits.) Ben and Michael Hausenblas at W3C also wrote a document listing some further use cases for machine-readable web pages.
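As a rough illustration of "RDF in attributes," the sketch below pulls statements out of a page using only Python's standard `html.parser`. It is deliberately incomplete and is not a conforming RDFa processor: it looks only at the `about`, `property`, and `content` attributes, does not expand prefixes such as `deb:` into full URIs, and ignores element nesting. The markup, the `deb:` vocabulary, and the `RDFaSketch` class are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; the deb: vocabulary is made up for illustration.
PAGE = """
<div about="http://example.org/pkg/foo"
     xmlns:deb="http://example.org/debian/terms#">
  <span property="deb:maintainer">Alice</span>
  <span property="deb:version" content="1.0-2">version 1.0-2</span>
</div>
"""


class RDFaSketch(HTMLParser):
    """Collect (subject, property, value) triples from a few RDFa attributes."""

    def __init__(self):
        super().__init__()
        self.subject = None   # most recent about="..." value seen
        self.pending = None   # property waiting for its text value
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self.subject = attrs["about"]
        if "property" in attrs:
            if "content" in attrs:
                # An explicit content="..." overrides the visible text.
                self.triples.append(
                    (self.subject, attrs["property"], attrs["content"])
                )
            else:
                self.pending = attrs["property"]

    def handle_data(self, data):
        if self.pending is not None and data.strip():
            self.triples.append((self.subject, self.pending, data.strip()))
            self.pending = None

    def handle_endtag(self, tag):
        # Never let a property leak onto text outside its element.
        self.pending = None


parser = RDFaSketch()
parser.feed(PAGE)
triples = parser.triples
print(triples)
```

Unlike an XPath recipe, nothing here depends on where the `span` elements sit in the page; the metadata travels with the attributes, so the layout can change without breaking the consumer.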
When I have some spare time, I'd be happy to help. But first I hope to make Zack and others aware that there is a standard for machine-readable metadata, designed with use cases like ours in mind!
[/note/debian] permanent link and comments
Tue, 18 Nov 2008
Obama's digital writing
The New York Times says this about Obama:
There's hope for me still!
[/note/me] permanent link and comments
Sat, 01 Nov 2008
Mouse cursors
John Goerzen was surprised by a mouse pointer change. His mouse changed from X.org's classic black mouse pointer to the new GNOME translucent set. Upset, he wrote:
Windows users seem to place similar importance on that clicky thing. A recent PC Magazine article writes, "Few things are more important in Windows than the mouse pointers." Dave Taylor discussed mouse pointers once, showing this picture of Windows XP's mouse pointers:
Windows XP's mouse pointers, then, don't look like the ones John Goerzen got. They look like bent versions of the normal X11 pointers with inverted colors. Windows Vista's mouse cursors do look like GNOME's (via a BlogIsEverything post):
For this reason, Windows Vista feels like a cheap knock-off of GNOME to me whenever I use it.
[/note/debian] permanent link and comments
Two Girls, One Cupid
[/scribble/hash-joiito] permanent link and comments