Amazon YP: Running Up and Down the Street


Amazon has assembled a huge georeferenced photo database that has value in its own right, beyond shopping. What can be done with it? Is it possible to associate an Amazon photograph with an arbitrary latitude/longitude?

It’s not clear yet, but I’ve been playing around a bit. “a9: Run up and down the street” starts with any business listed in Amazon and automatically grabs all the images to the left and the right, until Amazon runs out.

I’ve gathered some observations on the ids and file names of these images below. Is there any pattern to make use of?


Each image has a unique id. It resembles an md5 hash, but that’s just a
guess. Could this id possibly be a hash of the lat/lon (plus some other
stuff)?

View Source on one of these pages, and search for “snapImages”. It’s a
javascript array, containing the id and urls for each image in the
filmstrip. Immediately after this, the method “addFilmstripData” is
called, with the ids referenced for walking left and right.

The javascript class “YpSnapshotControl” builds the filmstrip from the
snapImages array. Walking left or right triggers a request in a hidden
iframe; the source of that request is javascript containing a new
snapImages array. For example, one such request is:

http://www.amazon.com/gp/yp/snapshot/pan.html/102-0306354-8911335?snapshot=650f16a3e247db76

The value after snapshot= is an image id.
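
As a small illustration, here is a sketch of pulling that id out of a pan.html URL and checking that it has the shape seen so far. The function names are my own, and the check assumes ids are always 16 or 32 hex characters, which is only what the handful of samples suggests.

```python
import string
from urllib.parse import urlparse, parse_qs

def snapshot_id(url):
    """Return the value of the snapshot= query parameter, or None if absent."""
    values = parse_qs(urlparse(url).query).get("snapshot")
    return values[0] if values else None

def looks_like_id(value):
    """True for a 16- or 32-character hex string (assumption: the only shapes observed)."""
    return len(value) in (16, 32) and all(c in string.hexdigits for c in value)

url = ("http://www.amazon.com/gp/yp/snapshot/pan.html/"
       "102-0306354-8911335?snapshot=650f16a3e247db76")
print(snapshot_id(url), looks_like_id(snapshot_id(url)))
```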

So given any id, this request returns its images and the surrounding
ids. That makes it possible to do things like “Run up and down the block”:

http://brainoff.com/a9/pano.pl?id=0f5861031b445205 (lots of images, big
download)
http://brainoff.com/a9/pano.pl?id=0f5861031b445205&size=sm (thumbnails)
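
The walking logic can be sketched roughly like this. It is not the code behind pano.pl, just a minimal version of the idea. The `neighbors` argument is a hypothetical caller-supplied function that returns a (left_id, right_id) pair for an id, for instance by fetching the pan.html response and reading the ids out of it; the `limit` parameter is my own addition to keep a runaway walk in check.

```python
def run_street(start_id, neighbors, limit=200):
    """Collect ids to the left and right of start_id until the chain runs out.

    neighbors(id) is a hypothetical function returning (left_id, right_id),
    with None on a side where there is no further image.
    """
    seen = {start_id}
    ordered = [start_id]
    for side in (0, 1):  # 0 = walk left, 1 = walk right
        current = start_id
        while len(ordered) < limit:
            nxt = neighbors(current)[side]
            if nxt is None or nxt in seen:  # end of street, or a loop
                break
            seen.add(nxt)
            if side == 0:
                ordered.insert(0, nxt)  # keep left-to-right order
            else:
                ordered.append(nxt)
            current = nxt
    return ordered

# Toy street with made-up ids, standing in for real requests.
street = {"a": (None, "b"), "b": ("a", "c"), "c": ("b", "d"), "d": ("c", None)}
print(run_street("b", lambda i: street[i]))
```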

Looking at just a few sample ids and associated image urls hasn’t
cracked any codes, but I noticed a few patterns.

* The ids are either 16 or 32 characters. The last 16 characters of any
32-character id don’t seem to vary at all.

* The image file names start with a 64-character preamble that doesn’t
seem to vary between images.

* Following this is an 11- or 22-character section. Its length
corresponds to the id length. This section doesn’t vary across different
resolutions of the same image, so it seems there’s some relationship
between this section and the id.

* The remainder of the image file name varies depending on resolution,
and perhaps city. In other words, it’s sometimes constant across
different images in the same city, at the same resolution.

* Sometimes adjacent images with 32-character ids will have the
first or last 11 characters of the “middle section” in common.
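
To compare more samples, the layout described above can be turned into a small splitter. This is only a sketch: the function names are mine, and it assumes “file name” means the last path segment of the image url. The last function tests one pure guess. 11 and 22 happen to be the lengths of unpadded base64 for 8 and 16 bytes, which are the byte counts of 16- and 32-character hex ids, and the shared 11-character halves hint that each 8-byte half might be encoded separately. I have not confirmed that the middle section is base64 at all, nor which alphabet it would use.

```python
import base64

PREAMBLE_LEN = 64  # observed length of the constant prefix

def split_name(name, id_len):
    """Split an image file name into (preamble, middle, remainder)."""
    mid_len = 11 if id_len == 16 else 22  # middle length tracks id length
    end = PREAMBLE_LEN + mid_len
    return name[:PREAMBLE_LEN], name[PREAMBLE_LEN:end], name[end:]

def shared_halves(mid_a, mid_b):
    """For two 22-character middles, report whether the first/last 11 match."""
    return mid_a[:11] == mid_b[:11], mid_a[11:] == mid_b[11:]

def b64_guess(id_hex):
    """Untested guess: unpadded standard base64 of each 8-byte half of the id."""
    raw = bytes.fromhex(id_hex)
    parts = [raw[i:i + 8] for i in range(0, len(raw), 8)]
    return "".join(base64.b64encode(p).decode().rstrip("=") for p in parts)
```

If `b64_guess(some_id)` ever equals the middle section of that id’s image name, the relationship is found; if it never does, the matching lengths are a coincidence or a different encoding is in play.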
