Yahoo Pipes is pretty incredible .. and it’s just started. About a dozen laptop screens were distracted at Lift07 with Pipe dreams .. no coincidence the favicon for Lift and Pipes resemble each other.
I’m really excited about this class of web native scripting services (I’ll try to define what that class is a few paragraphs in). For now I’m going to talk about my first go at building a Pipe, which takes a tag and builds a GeoRSS feed of people posting about that tag, Tag 2 GeoRSS.
Tag 2 GeoRSS
The basic idea is to request a list of tag citations from Technorati, then loop through them with the For Each: Replace module, requesting bloginfo for each one in another pipe. You may not know that Technorati stores GeoURL tags, and they’re available through the bloginfo API.
Pipes initially only groks RSS, and the Technorati API can return RSS, but that RSS contains less information than their XML format, and the missing pieces are crucial to construct this Pipe. First, the RSS feed from the tag API contains a direct link to the post but not to the weblog, and bloginfo only works on the root blog URL. Second, bloginfo RSS does not contain the GeoURL location.
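The loop-and-replace step can be modelled roughly in Python. This is only a sketch of the idea, not how Pipes implements the module: `for_each_replace`, `bloginfo_sub_pipe`, the `weblog_url` key and the canned lookup table are all hypothetical names standing in for the real sub-pipe and its Technorati request.

```python
def for_each_replace(items, sub_pipe):
    """Replace every input item with whatever items the sub-pipe returns for it."""
    replaced = []
    for item in items:
        replaced.extend(sub_pipe(item))
    return replaced


# Hypothetical canned data standing in for a live bloginfo request.
FAKE_BLOGINFO = {
    "http://example.org/blog": {"title": "Example Blog", "lat": 46.2, "lon": 6.15},
}


def bloginfo_sub_pipe(citation):
    """Look up the weblog behind a tag citation; return no items if it is unknown."""
    info = FAKE_BLOGINFO.get(citation["weblog_url"])
    return [info] if info else []


citations = [
    {"weblog_url": "http://example.org/blog"},
    {"weblog_url": "http://example.net/unknown"},
]
located = for_each_replace(citations, bloginfo_sub_pipe)
```

Because the sub-pipe returns a list, a citation with no bloginfo simply drops out of the result instead of leaving an empty item behind.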
So I decided to go with XSL to transform the Technorati API XML into the RSS that I want. The W3C hosts an XSLT Service, so I hacked up tapitag2rss.xml and tapiblog2rss.xml. If I had those hosted on S3, then this whole Pipe could be entirely web native. The Technorati request UrlBuilder is piped to another UrlBuilder which builds the W3C XSLT request.
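The chaining of the two UrlBuilders amounts to nesting one URL inside the query string of another. Here is a minimal sketch of that, with some assumptions flagged: the Technorati endpoint and its `key` and `tag` parameters, the W3C service address and its `xslfile` parameter are written from memory rather than taken from the Pipe itself, and the stylesheet location on example.org is made up.

```python
from urllib.parse import urlencode

# Assumed endpoints; check them against the services' own documentation.
TECHNORATI_TAG_API = "http://api.technorati.com/tag"
W3C_XSLT_SERVICE = "http://www.w3.org/2005/08/online_xslt/xslt"


def technorati_tag_url(tag, api_key):
    """First UrlBuilder: the Technorati tag request (XML output)."""
    return TECHNORATI_TAG_API + "?" + urlencode({"key": api_key, "tag": tag})


def xslt_request_url(stylesheet_url, xml_url):
    """Second UrlBuilder: ask the XSLT service to transform xml_url."""
    query = urlencode({"xslfile": stylesheet_url, "xmlfile": xml_url})
    return W3C_XSLT_SERVICE + "?" + query


def tag_to_rss_url(tag, api_key, stylesheet_url):
    """Pipe the first URL into the second, as the two chained modules do."""
    return xslt_request_url(stylesheet_url, technorati_tag_url(tag, api_key))
```

`urlencode` percent-encodes the inner URL, so its own `?` and `&` don’t get mistaken for parameters of the outer request.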
Iterating through the development was frustrating because of some bugs. The xmlfile argument for the XSLT UrlBuilder kept reverting to Hostname, rather than [url], any time the Pipe was reloaded. After editing the “sub-pipe”, the main pipe had trouble refreshing the sub-pipe. Many requests seem to be cached behind the scenes, so changes to the XSL weren’t picked up, and I’d have to do something like rename the XSL file in order to get a fresh request. Also, the Filter module couldn’t be set to screen out items that don’t have Geo tags.
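For comparison, here is roughly what the missing filter would do if run outside Pipes. It is a sketch that assumes a plain RSS 2.0 feed whose located items carry either a GeoRSS Simple point or a W3C Geo lat/long pair; `has_location` and `drop_unlocated_items` are names invented for the example.

```python
import xml.etree.ElementTree as ET

NS = {
    "georss": "http://www.georss.org/georss",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
}


def has_location(item):
    """True if the item has a georss:point, or both geo:lat and geo:long."""
    if item.find("georss:point", NS) is not None:
        return True
    return (item.find("geo:lat", NS) is not None
            and item.find("geo:long", NS) is not None)


def drop_unlocated_items(rss_text):
    """Parse an RSS 2.0 document and remove every item without a location."""
    root = ET.fromstring(rss_text)
    channel = root.find("channel")
    if channel is None:
        raise ValueError("expected an RSS 2.0 document with a <channel>")
    # Copy the list first, since items are removed while looping.
    for item in list(channel.findall("item")):
        if not has_location(item):
            channel.remove(item)
    return root
```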
The weirdest thing was trying to push through GeoRSS. I was certain GeoRSS was supported from Brady’s Deconstruction, yet when I had the XSL output GeoRSS Simple or W3C Geo, it wasn’t present in the Pipes output. It seems that within a Pipe, geo stuff is acted upon only in the y:location element, which produces GeoRSS when output .. and having my XSL output y:location did work. I think this is wrong on two counts. Pipes should grok GeoRSS on the input as well. And any namespace present in source pipes should be passed through, even if not known to Pipes.
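For reference, the two encodings tried on the input side look like this when built by hand. The sketch uses Python’s ElementTree rather than XSL, and `located_item` is a made-up helper; the y:location element is left out because its structure isn’t described here.

```python
import xml.etree.ElementTree as ET

GEORSS_NS = "http://www.georss.org/georss"
W3C_GEO_NS = "http://www.w3.org/2003/01/geo/wgs84_pos#"

# Use the conventional prefixes when the tree is serialized.
ET.register_namespace("georss", GEORSS_NS)
ET.register_namespace("geo", W3C_GEO_NS)


def located_item(title, link, lat, lon):
    """Build an RSS item carrying the same point in both encodings."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    # GeoRSS Simple: one element, latitude then longitude, space separated.
    ET.SubElement(item, f"{{{GEORSS_NS}}}point").text = f"{lat} {lon}"
    # W3C Geo: separate lat and long elements.
    ET.SubElement(item, f"{{{W3C_GEO_NS}}}lat").text = str(lat)
    ET.SubElement(item, f"{{{W3C_GEO_NS}}}long").text = str(lon)
    return item


item = located_item("Example Blog", "http://example.org/blog", 46.2, 6.15)
xml_text = ET.tostring(item, encoding="unicode")
```

A real feed would normally pick one encoding; both appear here only to show them side by side.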
Still it’s an amazing start, and the Pipes team has done a good turn releasing early and working with the web to find out what’s needed. GeoRSS is such a key part of the “mashup” environment, it’s available now in some form, and featured in some of the top pipes.
Web Native Scripting
There’s been a need for a web native scripting language, an abstraction to cover the bits and pieces necessary for mashups. Major parts of the mashup toolbox have become codified, but it’s still awkward to code these things in server-based scripting languages, and if you’re not a programmer it’s still out of reach.
I’ve grown to admire Excel for how much power it puts in non-programmer hands, and new services like Dabble DB and EditGrid have taken that model for its explanatory leverage, but are also embracing the web native scripting approach. Transformation services like Dappit and FeedBurner have their own partial approaches. Swivel and Many Eyes focus on the visualization. Ning has of course been cultivating a new style of development, and they have abstracted out many of the common mashups and APIs, but it hasn’t hit any sweet spot for non-programmers (PHP is too hard) or programmers (Rails). Even Greasemonkey was in some sense an iteration in this space. And I could count Mapufacture among these tools too (btw, here’s Pipes Nearby Something in Mapufacture.)
All these tools are pushing forward the overall idea of making the web a programming environment, and making that environment as widely friendly as possible. It’s hard to know what the right balance will be .. surely it shouldn’t grow to resemble any other scripting language. But should it try and embrace some of the heavyweight ideas of Web Orchestration, in some lightweight fashion? For instance, my Technorati API key was maxed out during development of this Pipe, so I signed up for a new one .. figuring out this was the problem required leaving Pipes and making requests directly. There could be more ways to access the underlying data flow, and set up some kinds of triggers for unexpected behaviors. Could Pipes support OpenID, XSLT, Microformat operations and RESTful services?
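One very small version of such a trigger is a check between steps that refuses to pass an error response downstream as if it were data. The sketch below assumes, hypothetically, that the upstream service reports problems in an element named `error` somewhere in its XML; `UpstreamError` and `check_response` are invented names, and the real Technorati response layout may differ.

```python
import xml.etree.ElementTree as ET


class UpstreamError(Exception):
    """Raised when an upstream response carries an error instead of data."""


def check_response(xml_text):
    """Parse a response and raise if it contains an <error> element anywhere."""
    root = ET.fromstring(xml_text)
    error = next(root.iter("error"), None)
    if error is not None:
        message = (error.text or "").strip() or "unspecified upstream error"
        raise UpstreamError(message)
    return root
```

With a check like this in the flow, a maxed-out key would show up as an explicit failure at the step that caused it rather than as an empty feed several modules later.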
The scaling problems seem real enough, and who knows, maybe this eventually pushes some processing back towards the browser (just in time for Firefox 4). Can the concept of Pipes be portable .. is there an abstract way to encode this .. there are open source projects like Plagger in which to pursue these ideas too.