It looks up photos matching the given text, examines their assigned locations (drawn from 10 million+ geotagged Flickr photos), clusters the results, and returns the largest cluster's location.
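The clustering step can be sketched in a few lines. The post doesn't say which clustering method is used, so as a stand-in this sketch buckets geotags into a coarse lat/lon grid and returns the centroid of the densest bucket; `cell_deg` is a hypothetical parameter for the bucket size:

```python
from collections import defaultdict


def largest_cluster(points, cell_deg=0.1):
    """Grid-bucket (lat, lon) geotags and return the centroid
    of the densest cell. A stand-in for whatever clustering the
    real system uses; cell_deg is the bucket size in degrees."""
    buckets = defaultdict(list)
    for lat, lon in points:
        # floor-divide into grid cells so nearby points share a key
        key = (int(lat // cell_deg), int(lon // cell_deg))
        buckets[key].append((lat, lon))
    # the densest cell stands in for "the largest location"
    cluster = max(buckets.values(), key=len)
    n = len(cluster)
    return (sum(p[0] for p in cluster) / n,
            sum(p[1] for p in cluster) / n)


# e.g. photos tagged "red square": most geotags near Moscow, one stray
pts = [(55.754, 37.620), (55.753, 37.621), (55.755, 37.619), (40.7, -74.0)]
print(largest_cluster(pts))
```

The stray New York point lands in its own bucket and is outvoted by the Moscow cluster, which is the whole trick: outliers don't shift the answer, they just lose.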
It works remarkably well for cities, neighborhoods (“haight ashbury”, “hells kitchen”, “calle ocho”), and places and landmarks (“empire state building”, “red square”). It doesn’t always work precisely (“statue of liberty”), but I suspect this could be fixed with some tweak to the algorithm.
And this seems like just scratching the surface of what’s possible with this data.
One possible extension is detection of linear features; for example, take a look at the “Bay to Breakers” Flickr map.
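One way to tell a linear feature (a parade route, a street) from a compact landmark is to measure how elongated the point cloud is. This is a hypothetical heuristic, not anything the post describes: compute the eigenvalues of the 2×2 covariance matrix of the geotags and look at their ratio.

```python
def elongation(points):
    """Ratio of the principal-axis variances of a set of (lat, lon)
    points. A large ratio suggests a linear feature (e.g. a route);
    a ratio near 1 suggests a compact blob. Hypothetical heuristic."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # eigenvalues of the 2x2 covariance matrix [[sxx, sxy], [sxy, syy]]
    tr, det = sxx + syy, sxx * syy - sxy * sxy
    disc = max(tr * tr / 4 - det, 0.0) ** 0.5
    l1, l2 = tr / 2 + disc, tr / 2 - disc
    return l1 / max(l2, 1e-12)  # guard against a degenerate (zero) axis
```

Points strung along a line produce a huge ratio, while a square blob of points gives a ratio of about 1, so thresholding this value could flag queries like “bay to breakers” as routes rather than spots.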