Kenyan Election Open Data

Election boundary and polling station geo data for Kenya can now be downloaded.

Kenyans are gearing up for the presidential election in March, including tech community projects like Uchaguzi and Map Kibera. One limiting factor is the availability of data. The Independent Electoral and Boundaries Commission (IEBC) finalized new constituency boundaries earlier this year, but has only released PDF maps that are not machine readable. That’s led Map Kibera to resurvey the new boundaries and polling places on the ground in Kibera, Mathare and Mukuru. But outside that excellent work, open data on the most basic geographies of democracy is not available for Kenya. Election boundaries should be the number one data set available on OpenDataKE.

The IEBC released a nice site to look up Kenyan polling places. It doesn’t directly offer download, but quickly looking at network requests for the app revealed simple endpoints to request json versions of constituency and county boundaries, and the locations of polling places. I wrote a script which iterates through every ward in Kenya, caches the data, and produces outputs. There are shapefiles for download here. If you need something tweaked, let me know, and I’ll see what I can do.
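For the curious, the fetch-and-cache loop looks roughly like the sketch below. This is not the actual script. The `fetch` callable stands in for the HTTP request to the lookup site, and the payload shape (a `places` list with `name`, `lat` and `lon`) is an assumption made for illustration. It writes GeoJSON rather than shapefiles, which a tool like ogr2ogr can convert.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


def fetch_cached(ward_id: str, fetch: Callable[[str], dict], cache_dir: Path) -> dict:
    """Return the payload for one ward, calling `fetch` only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{ward_id}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    payload = fetch(ward_id)  # hypothetical: wraps the HTTP request to the lookup site
    path.write_text(json.dumps(payload), encoding="utf-8")
    return payload


def collect(ward_ids: Iterable[str], fetch: Callable[[str], dict], cache_dir: Path) -> dict:
    """Build one GeoJSON FeatureCollection of every mapped polling place."""
    features = []
    for ward_id in ward_ids:
        payload = fetch_cached(ward_id, fetch, cache_dir)
        for place in payload.get("places", []):  # assumed payload shape
            if place.get("lat") is None or place.get("lon") is None:
                continue  # polling place with no location in the source data
            features.append({
                "type": "Feature",
                "geometry": {"type": "Point", "coordinates": [place["lon"], place["lat"]]},
                "properties": {"name": place.get("name"), "ward": ward_id},
            })
    return {"type": "FeatureCollection", "features": features}
```

Caching per ward means a rerun after a crash, or after a tweak to the output step, does not hit the server again for wards already fetched.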

This map displays approximately 16000 polling places across Kenya. In total, there are 26447, but a significant number are not mapped. Also, there are no ward boundaries available. This is a big opportunity for Kenyans to contribute back to their government’s data. For instance, Lindi Mosque has no location in the government data set, but is present in OpenStreetMap. It would be possible to repurpose some OpenStreetMap microtasking tools to quickly map these places (HOT Tasking Server, MapMill or MapRoulette).

Why make this data open? So we can build things like
* Send an SMS to get a list of polling places in your ward
* Geocoder and custom base maps to make it easier to locate reports to Uchaguzi
* Print map services, so you can easily distribute polling place locations in your area
* Analysis, such as distribution of polling places according to population density
and plenty of creative things I haven’t imagined yet.

I hope this step of acting first and begging forgiveness later, rather than asking permission (in, I hope, the best tradition of mySociety), will spark curiosity and consideration by the IEBC to more fully embrace openness, not only by releasing data, but by inviting conversation and feedback from the citizens it serves, and by becoming more transparent in how decisions are made and how the complicated and incredibly important business of running an election is being carried out.

Moabi and Big Important Challenges for the GeoWeb

Recently finished a technical review of WWF’s Moabi platform, their geodata sharing platform to track deforestation, and came out with some important technical challenges for the GeoWeb I want to share with the mapping nerds.

Just following TAI’s bridging session on natural resource governance, I met Charles Huang of the World Wildlife Fund at a similarly styled event at the World Bank. After self-identifying as a developer, Charles asked if I’d like to talk more about Moabi … they needed a developer’s eye on the platform as they considered how to move to their 2.0. This isn’t GroundTruth’s usual sort of work, but considering the TAI Bridge effort to connect technologists with sector experts, our other work with Drupal and mapping, and my own deep interest in conservation, I dove in.

Moabi is ambitious. In its current expression with DRC and deforestation, the platform is meant to collect geographic data and detailed profiles from all sorts of sources, including the “crowd”, on everything from mining concessions, informal logging and road construction to REDD projects, and to enable conversation and coordination among everyone: conservation workers on the ground, government officials, and the interested public on the other side of the world. And in the long run, this platform is intended to apply to other places and focus areas of WWF. This is such an ambitious, wide mandate, and I think it touches on 3 key technical developments for the GeoWeb.

Features as Full Social Objects

In Moabi, every single feature, such as an individual mine or road, has a profile page, which should be shareable, pivotable, a focus point for social interaction among people and groups organizing to prevent deforestation. Most often in GIS and web mapping, we talk about layers of information, and while individual features often have expression in various forms on a site, there is no open source software package that deals with feature-level data in this way out of the box. And there’s not really that many examples to draw on. Every feature in OpenStreetMap has an individual url, with details on its tags, last editor, etc. But it’s not quite used as a social object, excepting maybe the awesome machine tag integration with flickr. Foursquare has some elements of this, every venue primarily being a place, and users able to connect to that place. But besides these sorta examples, a geographic feature is generally treated as a property of social objects, rather than social objects in themselves.

Multi-master sync and flexible schemas

Data in Moabi is sourced from government, civil society, and interactions on the site. Those source databases are very likely to update over time, and very possibly they will receive updates on Moabi as well. Given a foreign key, or even a clever lookup, at least identifying which features have updated is possible. But this inevitably leads to thorny issues of handling conflicts during synchronization. This is commonly experienced in the simple situation of editing in OSM during a mapping party, where two editors independently modify the same feature. The OSM API can detect the potential conflict, but once detected the interface and process to resolve the conflict in JOSM is not intuitive or easy at all. When writing code, this situation happens all the time, and simply being text, it’s simple enough to throw the problem to the programmer to resolve. But we’re talking about geographic features here, and that’s going to take very well designed interfaces to solve properly.
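To make the conflict problem concrete, here is a minimal sketch of a three-way merge over a feature's attributes, the same idea version control applies to text. It assumes each side can be compared against a shared base version (found through that foreign key) and that attributes are flat key-value pairs. Geometry conflicts, the genuinely hard part, are left out. The function name and sample tags are made up for illustration.

```python
def merge_tags(base: dict, ours: dict, theirs: dict):
    """Three-way merge of attribute dicts.

    Returns (merged, conflicts). A key lands in `conflicts` only when both
    sides changed it, and changed it to different values.
    None marks a missing key, so real values are assumed never to be None.
    """
    merged, conflicts = {}, {}
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:
            value = o  # both sides agree (including both deleting the key)
        elif o == b:
            value = t  # only their side changed it
        elif t == b:
            value = o  # only our side changed it
        else:
            conflicts[key] = (o, t)  # changed on both sides: needs a human
            continue
        if value is not None:
            merged[key] = value
    return merged, conflicts


base = {"highway": "track", "surface": "dirt"}
ours = {"highway": "track", "surface": "gravel", "patrol": "weekly"}
theirs = {"highway": "tertiary", "surface": "paved"}
merged, conflicts = merge_tags(base, ours, theirs)
```

Here `highway` and `patrol` merge cleanly, and only `surface` is handed to a person. Detection is the cheap part. The interface for resolving what is left is where the work is.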

A related issue is that a single feature may contain pieces of information from several different masters. Take roads. Yes, common enough is the geometry, maybe the name (or names) of the road, the surface. But with Moabi, they may also want to know if the road has a regular anti-poaching patrol, or if it’s commonly used to transport lumber. OSM deals with this unpredictability of attributes through tags, arbitrary key-value pairs, with no predetermined mandated structure, with conventions of use worked out over time. In contrast, traditional GIS requires pre-decided database schema, leading to long running discussions which often never resolve to the point of actually collecting and sharing data. So why not just use OSM? Well I’m not sure it’s best to keep every single thing interesting to Moabi about roads in OSM, and some features, like say mining concessions or REDD projects, may simply not be appropriate for OSM (this is up for discussion of course). So assuming that something else is needed, what are the options for a tagging based system that’s not OSM? The closest I’ve seen is Matt Berg’s work at Earth Institute on georegistery, an open source db with RESTful API for sharing and versioning geographic objects in GeoJSON, with arbitrary attributes.

Complex interactive experiences without a lot of fuss

Moabi has a lot of data, two dozen layers, thousands of features. Currently the only way to work through the data is via query, which can be a little slow both in usability and response time. Interactive exploration is best accomplished through Tiles and Grids, as enabled so well in TileMill and MapBox. And there’s been a lot of activity in OpenLayers with Grids, including support for interaction in multiple layers.
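Assuming “Grids” here means the UTFGrid format that TileMill exports, the client side of the interaction is tiny: each tile ships with a low resolution character grid, and a hover or click is just a lookup. The sketch below follows the UTFGrid decoding rules as I understand them. The sample grid further down is invented.

```python
def decode_char(ch: str) -> int:
    """Map one grid character back to an index into the `keys` list."""
    code = ord(ch)
    if code >= 93:
        code -= 1  # the encoder skipped the backslash
    if code >= 35:
        code -= 1  # the encoder skipped the double quote
    return code - 32


def feature_at(grid: dict, x: int, y: int, tile_size: int = 256):
    """Return the data record under pixel (x, y) of a tile, or None."""
    rows = grid["grid"]
    resolution = tile_size // len(rows)  # pixels per grid cell
    key = grid["keys"][decode_char(rows[y // resolution][x // resolution])]
    return grid["data"].get(key) if key else None


# Invented 4x4 grid for a 256 pixel tile: one mine, one road, empty elsewhere.
sample = {
    "grid": ["  !!", "  !!", "##  ", "##  "],
    "keys": ["", "mine-1", "road-7"],
    "data": {"mine-1": {"name": "Mine"}, "road-7": {"name": "Road"}},
}
```

No server round trip is needed for the lookup, which is why this feels so much faster than the query interface.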

The issue comes with how to make this architecture efficiently queryable. Say you wanted to show only mines operated by Chinese companies in the DRC which are not located in mining concessions. The most obvious way is to send this query to the server, and render the result on the client side. But perhaps there are other ways, utilising the same approach MapBox Streets takes to customize styles on the fly, by manipulating the color table of tile images. If a tile set handled just a single layer, then the color table could somehow be used to encode other properties. This would essentially pre-can a set of interesting queries during tile generation, with the query results rendered by the server. If that’s too limiting, perhaps we need to start thinking further in the future, where something like SVG based tiles could be efficiently rendered and queried in the browser.
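A toy version of the color table idea, to show what I mean. Everything here is hypothetical: the tile is a grid of palette indices, and at tile generation time each index is assigned to one combination of attributes. Answering a query then means swapping the palette, not touching the pixels.

```python
TRANSPARENT = (0, 0, 0, 0)


def palette_for(classes, predicate, colour=(200, 30, 30, 255)):
    """Build a palette that shows only the classes matching `predicate`.

    `classes[i]` holds the attributes baked into palette index i when the
    tile was generated. None marks the background index.
    """
    return [
        colour if attrs is not None and predicate(attrs) else TRANSPARENT
        for attrs in classes
    ]


def apply_palette(indexed_tile, palette):
    """Expand a tile of palette indices into RGBA pixels."""
    return [[palette[i] for i in row] for row in indexed_tile]


# Hypothetical single-layer tile set for mines.
classes = [
    None,  # 0: background
    {"operator": "CN", "in_concession": True},
    {"operator": "CN", "in_concession": False},
    {"operator": "other", "in_concession": False},
]
tile = [[0, 1], [2, 3]]
query = palette_for(classes, lambda a: a["operator"] == "CN" and not a["in_concession"])
pixels = apply_palette(tile, query)
```

The limit is plain: only attribute combinations given a palette index at generation time can be queried, which is the “pre-canned” trade-off described above.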

Thoughts?

Would love to hear feedback on whether these are actually interesting challenges in your opinion, and what are some other ways of moving forward with them.

All I Want for OpenStreetMap is … Social and Attention

Kate posted her wishlist for OpenStreetMap and I second everything she wrote … OSM can be fine tuned to be even easier. In my wishlist post, I want to cover improving connections between mappers, and focusing mappers’ attention.

In short, OSM should be more like Facebook. Seriously. A news feed should show you the most relevant activity and news for you personally, and the profile page should show what you’re about in OSM in a more approachable way.

State of the Map Group Photo

The sociality of OSM is its biggest strength. The intricacies of maps, tags, places are discussed in minute detail, and are ultimately the result of conversation, not top down dictate. The result is maps that are more expressive for more situations than any other platform. And we even meet in person! At the pub, the hack weekend, the LogCluster meeting. Anyone who’s ever been to a State of the Map conference can attest, it’s a vibrant, fun, exciting movement.

OSM has hundreds of thousands of users, and hundreds of millions of map features. Yet the tools to connect to mappers and monitor an area are still the same as 5 years ago. Changing this is my wishlist … and yes, I know that almost all of OSM’s features are coded voluntarily, and it’s something I could do myself rather than list out … I appreciate all the time and energy that goes into OSM coding, this is my *wish* list.

Finding the Local Community

If I’m travelling some place new for an event, I want to connect to the local OSM community, and the process is complicated and not obvious. First, browse to the area in OSM and click the “History” tab, which should show recent edits in the area. Problem is, this often includes large, global edits which overlap this local area and aren’t relevant. And if you page through in a somewhat mind-numbing fashion, the OSM rails app starts to have performance problems past page 5. OWL is potentially a solution to this, but it has already been a priority development item for years.

And what the history tab won’t do in either form is tell you who the most active, prominent mappers in the community are. For that, I usually jump over to ITO’s excellent OSM Mapper, which presents and visualizes summary statistics for an arbitrary area. Excellent, but you have to know it’s there, and it takes a separate authentication, and is a little counterintuitive to use. But from there, I have a list of 5-10 top mappers, with links to the OSM profile pages, from which I can send them a message. And wait, and hope they respond.

Next, I’ll jump into the wiki, and look through the relevant Wiki Projects. Sometimes this will have listing of Users in an area, or links to their website, or mailing lists. The User profile pages sometimes have a way to contact that user, but just as often not. If there’s a mailing list, then it’s about subscribing to yet another mailing list, perhaps one in a foreign language, and posting. If there’s a project website, that’s usually golden and a sign of an active local community that will be responsive.

Focusing Attention

And that’s just for a one time connection. Most mappers have been involved in mapping many areas. How do you keep informed? You basically need to have a ton of RSS feeds, poke around the monitoring tools, read a lot of lists.

A lot of activity, most of the time, you only need to be dimly aware of. If a user with a good reputation is editing in a familiar area, there’s little need to check into it further. If a new user has just signed up and is editing, it would be good to take a close look at their edits, and reach out and welcome them to OSM and offer to support their work. Reputation needs to be calculated, and used to filter.
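What might “calculated” mean? Here is a deliberately naive sketch. The formula and the threshold are invented for illustration and are not how OSM or any existing tool scores mappers. It rewards volume and time in the project, and discounts for the share of changesets that were reverted.

```python
import math


def reputation(changesets: int, days_active: int, reverted: int) -> float:
    """Toy reputation score. The formula is an illustrative assumption."""
    if changesets <= 0:
        return 0.0
    experience = math.log10(1 + changesets) * math.log10(1 + days_active)
    return experience * (1 - reverted / changesets)


def needs_a_look(mapper: dict, threshold: float = 1.0) -> bool:
    """True when a mapper's edits deserve a closer look, or a welcome message."""
    score = reputation(mapper["changesets"], mapper["days_active"], mapper["reverted"])
    return score < threshold


newcomer = {"changesets": 3, "days_active": 1, "reverted": 0}
veteran = {"changesets": 5000, "days_active": 1500, "reverted": 10}
```

With these numbers the newcomer gets flagged and the veteran drops out of the feed. A low score is not an accusation. For a new mapper it is the prompt to reach out.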

And not just edits. There’s so much going on in the community. The Weekly OSM Summary is super useful, but is only the cream of the global activity. There are OSM events all the time, but it seems mostly German mappers are adding their events to the wiki. Expand User Diaries so these posts are accessible in all sorts of ways, for all kinds of notifications.

Being Social Should Be Simpler

Maybe the general concept is around “Interest”. There are places you might be interested in. There are particular mappers and communities you might be interested in … either because they’re active in an area, or perhaps because they make suspicious edits. Directly in the OSM website, you should be able to manually indicate this interest (by friending a user, or setting a list of locations); or it can be automatically calculated through analysis and stats of where you’ve been active before.

I started on this idea by adding friend and home location filtering to Diary Entries and Changesets. An ok start, but not everyone uses Diary entries to communicate, and the changesets still include everything … it’s both too much information and not enough. A lot is happening in OSM at any time, and you have limited attention.

There’s definitely a need for some system to do processing and analysis, from the planet diffs. Make these stats visually intriguing, and directly a part of a user’s or group’s activity stream. There are great experiments visualizing group stats with rankings on altogetherlost, and overall stats. And the personal statistics on neis-one, and heat map are excellent. Bring this directly into osm.org.

Social Design

Currently the user profile page on OSM has two functions … one to view other people, and another as a dashboard for yourself. I think these should start to diverge into two different pages. The User Profile should be grown to give a better idea of what a mapper is up to. The Dashboard would be a focus point for a mapper to stay on top of everything they care about in OSM. So in one page, give a summary of what friends and nearby users are up to, but also what’s happening in places you care about. For many users, myself included, there are many areas you’ve mapped, and would like to keep track of. Right now, you explicitly express interest by declaring your home location. This could be expanded to multiple, manually specified locations (my top 10 areas); or by analysis, a list of your top areas for editing could be automatically compiled.
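The “by analysis” option could start out very simple. A sketch, assuming we already have the centre point of each of a mapper's changesets: snap each centre to a grid cell and count. The cell size and the choice of a plain lat/lon grid are arbitrary assumptions.

```python
import math
from collections import Counter


def top_areas(centres, cell_deg: float = 0.1, n: int = 10):
    """Rank grid cells by how many changeset centres fall inside them.

    `centres` is an iterable of (lat, lon) pairs. Returns a list of
    ((row, col), count) tuples, where row and col are integer cell indices
    (multiply by `cell_deg` to get the cell's south-west corner).
    """
    cells = Counter(
        (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        for lat, lon in centres
    )
    return cells.most_common(n)


centres = [(-1.31, 36.79), (-1.32, 36.78), (47.61, -122.33)]
areas = top_areas(centres)
```

Two of the three sample changesets land in the same cell over Nairobi, so that cell ranks first. A real version would want to ignore giant bounding boxes from bots and imports before counting.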

It’s not only a design challenge, but it’s going to be an architectural challenge as well. More social takes a lot more servers.

And don’t require mappers to visit OSM.org to get this. Yes, send emails out with weekly summaries of activity.

Some summary ideas for an activity stream:

* Recent mapping activity in multiple areas you’re interested in. Should be higher level than individual changesets (SteveCoast has made 12 edits over the last week in Seattle), but allow you to dig in further if you want.
* Some visual distinction between trusted vs non-trusted
* Also deal better with “big” edits, bots, etc. Filter these, or make them visually distinct.
* New mappers in these areas, and their activity. Highlight these, it’s a good opportunity to reach out.
* Highly trusted or experienced mappers in these areas, and their activity. Again, sometimes highlight, sometimes filter, depending on the need.
* Non-mapping activity in the area … upcoming mapping parties, other local news.
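The first item above, rolling changesets up into one line per mapper and area, could be sketched like this. The changeset dicts are a made-up shape (`user`, `area`, `num_changes`, `closed_at`), and assigning a changeset to a named area is assumed to have happened already.

```python
from collections import Counter
from datetime import datetime, timedelta


def summarize(changesets, now: datetime, days: int = 7):
    """One line per (mapper, area) for activity in the last `days` days."""
    cutoff = now - timedelta(days=days)
    totals = Counter()
    for cs in changesets:
        if cs["closed_at"] >= cutoff:
            totals[(cs["user"], cs["area"])] += cs["num_changes"]
    return [
        f"{user} has made {count} edits over the last {days} days in {area}"
        for (user, area), count in totals.most_common()
    ]


now = datetime(2012, 3, 10)
changesets = [
    {"user": "SteveCoast", "area": "Seattle", "num_changes": 7, "closed_at": datetime(2012, 3, 8)},
    {"user": "SteveCoast", "area": "Seattle", "num_changes": 5, "closed_at": datetime(2012, 3, 5)},
    {"user": "new_mapper", "area": "Nairobi", "num_changes": 3, "closed_at": datetime(2012, 3, 9)},
    {"user": "SteveCoast", "area": "Seattle", "num_changes": 100, "closed_at": datetime(2012, 2, 1)},
]
lines = summarize(changesets, now)
```

The same rollup could feed both the dashboard and the weekly email.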

When it comes to politics, crowds can be manipulated, but not communities

The women and children in the photo are suffering, and the story tells of hidden revolutionaries challenging brutal rule in all arenas, and victorious on the maps of the most powerful company on the planet. They’re renaming the infrastructure after revolutionary heroes. You can’t help but cheer on such clever efforts for freedom.

Google supports liberation! Another front in American online diplomacy?! How far from the truth. Another lame attempt to boost American companies’ sales with puff pieces about their support for the Arab Spring. Let’s watch and see how long before Google scrambles to show its commitments to national governments (i.e. customers) and their maps switch back to their Assad era names.

The real story here is the nature of “crowdsourcing” (a term I’m increasingly despising), and power over and control of our geographic reality. Stefan Geens says that such false information, such politically motivated editing, is a risk of crowdsourcing; it’s not, it’s rather the result of a false community and opaque processes. This write-up (“regime change, hardly”) is an excellent blow by blow, but there is absolutely no way to fully penetrate this proprietary system.

Could this have happened on OpenStreetMap? Sort of. Anyone can edit anytime, I could change these names right now. The difference is that the change would be spotted soon by the community which cares for this data, all past changes by the user easily identifiable, discussion and questions posed in public, and reverts applied if necessary. In the event that the inaccurate edits continue, the case can be escalated to mediation, and the DWG can finally take actions like warnings and blocks. It’s happened before, in Northern Cyprus, and OSM dealt with it well.

OpenStreetMap does not support the Assad regime, nor does it support the rebellion. It supports everyone’s access to the facts, and the equal ability for those common facts to respond to reality. OSM was the first map to display the world’s newest country of South Sudan. And in the event the brutality in Syria ends, and the streets are renamed on the ground, you know where to edit.

Only Possible With Open Data

Arguments about the importance of Open Data often come down to a principled stance, or a licensing discussion … that kind of argument doesn’t make much impression on folks who aren’t way in the weeds. And it’s more than just licensing … there are equal parts issues of legality, technical freedom, and community. Clear examples of what you can do only with OpenStreetMap, and not with say, Google Map Maker, make this stuff real. Here are just a few, among many.

Mapping of Jalalabad and surrounding countryside is unique to OSM. They collect data with GPS and Smart Phones and Walking Papers. With GMM, you can only trace imagery on your laptop computer. With OSM you can go into the field to actually talk with people about what to put on the map. Afghanistan is not an option at all on MapMaker, for political reasons. And the Jalalagood guys are organized as a company, but do mapping largely in their free time, voluntarily … so they get nixed for being “commercial”.

Great video on Jalalagood and a detailed mapping trip report.

In Haiti, TapTapMap maps local bus routes, free to add any sort of data in OSM. In GMM, users are not allowed to add bus routes, only Google does that, and only by getting data from transit agencies. Haiti’s system, like many places in the developing world, does not have a single authority which maps and controls the routes. You can only get these routes by riding with your GPS.

Another project that certainly couldn’t happen without Open Data. This is a Tourist Map of the Gaza Strip. Produced by a spin-off company from a university that took part in OpenStreetMap mapping in Gaza. This is not an area that can even be mapped in Google because of a political decision, it’s mostly blank on the map. While the map is free, it’s produced by a commercial entity and contains ads for local restaurants, hotels and sites.

In the Philippines, recent humanitarian response to flooding relied on open source tools to process and make available satellite imagery to create OSM data. The GMM toolset would not permit integration of any other data sources except those controlled by Google. So it would stifle local ability to respond to disasters that don’t make huge media splashes (G Crisis Response has not been active at all in the Philippines).

A Week for the Record Books

That was a week where I really was truly and completely welcomed to DC. Some great things, some other things, and some things I can’t talk about yet.

Was invited to fill in for Kate Chapman at the Mapping the World of Humanitarianism workshop (not the geo kind of mapping). Talked about the challenges with community centered social technologies, within the humanitarian system. Most difficult question: “Do you consider yourself a humanitarian?”

Seda Muradyan invited me to present within a multi-day workshop with Armenian journalists. Was via Skype, with translation (the translator did an amazing job). Spoke about OpenStreetMap and how we’ve applied it within community-centered journalism projects. Just missed saying hi to Noha Atef who had been in Armenia for this. Still amazed that the Internet makes this kind of thing possible.

TC103, Tech Tools for Emergency Management is an online course led by TechChange. I’ve been moderating, and the quality of the discussion and participants is phenomenal. Rob Baker and I sat down for a chat in front of the camera, good fun … may be released at the end of the course.

World Bank and Google

* http://www.nytimes.com/2012/01/14/opinion/empowering-citizen-cartographers.html?_r=2
* http://www.globalintegrity.org/blog/bank-responds-to-google-maps-deal
* http://irevolution.net/2012/01/20/google-inc-world-bank-empowering-citizen-cartographers/#comment-11284

How many beers, coffees and phone calls discussing this boggling move last week? I’m still thinking about how to respond and where to go from here.

OpenStreetMap and Google

Google contractors were caught vandalizing OpenStreetMap. Oy vey.

Thanks to the 2012 OpenStreetMap Foundation Board. This is going to be the year.

Last weekend in Seattle, the OSM Foundation Board met “face-to-face”. We get together because no matter how much you try otherwise, there’s way more done in person in a couple intensive days. It cost about 4 or 5k USD this time, and it’s worth the cost. But, I think we’ve always done a terrible job explaining what happens at the Board meetings, and a middling job following up, and those two things are totally related.

I want this meeting to be different. It must be different. This is my fifth and final year on the Board (I was elected again this year, but will stand down at the next AGM), and to me, and the entire Board, this is a crucial year for OSM. The face-to-face was the most productive yet, and the most difficult yet. I’m very satisfied. In years past, the minutes got published, and various announcements went out through working groups, and that will happen. But it’s insufficient, maybe distilling too far the atmosphere and the messiness of these get togethers.

The Stage

Steve is based in Redmond, and expecting a child any day, so he offered to host and avoid travel. I wasn’t far, relatively, in Chicago. The rest of the Board (excepting RichardF who couldn’t make it) flew in from Europe. I found a cabin near a lake on airbnb, quiet, cosy, and cheap. Henk hired a car, and drove everyone around. We had a meetup Friday night, made some burritos and played Kinect at Steve and Hurricane’s place (and tried to forget we watched Crank 2), and enjoyed the Seattle sunshine (no joke). Sunday Hurricane gave us horse riding lessons!

A regular vacation! Except for the part where we spent 18 hours of our weekend discussing/arguing about OSM in windowless meeting rooms at Microsoft (which we very much appreciate btw!). And the rest of the weekend continuing to talk about it, or even dream about it. Being on the Board is a sacrifice of time, because we all feel deeply responsible to the project and our position.

Presentations

The Board meeting proper started with presentations by Steve and Oliver. Steve hit many of the same themes from his SOTM and SOTM-EU talks, except he left out all the stuff about how awesome OSM is doing. We looked at and discussed several graphs of recent statistics. OSM’s growth to date has been beyond imagination, but there’s no shortage of projects that changed the world and then met reality, hard. Looking at some of these, the factors in decline included insular community, lack of direction, and no innovation. That’s what we have to avoid.

Oliver made the point that “We are the Board! Shape the project!”. The Board, and the Foundation, needs to be a functional team, with clear goals and activities, all within the limited volunteer time we have to contribute. Fact at this point is, the Foundation doesn’t have clear objectives, beyond the mission to support but not control the OSM project. To meet goals, we can take action, we can guide and steer, we can spend money. At the end of the workshop, there should be a target that guides all our activities towards achievement. Some of the slides were beyond funny management clip art (a guy looking forlorn into the mirror, facing reality) but the point was important. “We are the Board! Shape the project!”

At this point, I thought it would be useful to look at some of the management lessons and differences from HOT. While we are by no means perfect, I do feel there’s good alignment between the organizational side of HOT and the community, largely the same community as OSM. In contrast to OSMF, HOT is very focused in what it does, with clear guidance and priorities and steering. We aren’t afraid of spending money when it’s necessary. We value marketing by the organization (though could be better). There are clear technical needs, and we pay for them. There’s a key attention to the consumption side of map data collection, seeking strategic partnerships with other organizations. We’ve been selective and directive with responsibility, and when necessary, have taken it away. We try to be as transparent as possible, publishing very detailed board minutes.

Goals

We took Oliver’s point and started strategic planning.

OSMF Board meetings traditionally use a simple technique to come to consensus on a topic, whether it’s the agenda of the meeting, or in the case of Seattle, the objectives and activities of the OSMF this year. We brainstorm all our choices on the subject, write them on the whiteboard. Each person gets some number of votes, say 5, and distributes them among the topics. If topics can be grouped together, their counts are added together. There’s discussion about the meaning of terms, sometimes a lot of discussion. Iteration to ensure that we all have a common understanding. At the end, there’s a list of priorities. I always squirm in this process, because somehow I don’t believe it can work, but it inevitably does a pretty good job, and if we need to override, we’re not strict about the methodology.
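For the programmers in the audience, the counting step fits in a few lines. This is only a sketch of the tally as described above. The ballots and the grouping are invented, and nothing here enforces the per-person vote limit.

```python
from collections import Counter


def tally(ballots, groups=None):
    """Add up votes per topic, folding grouped topics into one entry.

    `ballots` is a list of {topic: votes} dicts, one per person.
    `groups` maps a topic to the topic it was merged into.
    """
    groups = groups or {}
    totals = Counter()
    for ballot in ballots:
        for topic, votes in ballot.items():
            totals[groups.get(topic, topic)] += votes
    return totals.most_common()


ballots = [
    {"site redesign": 3, "PR": 2},
    {"usability": 4, "PR": 1},
]
priorities = tally(ballots, groups={"usability": "site redesign"})
```

In this made-up round, grouping “usability” under “site redesign” puts it well ahead of PR. The discussion about what the terms mean is the part that can’t be coded.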

In less than an hour, we had these goals for 2012.

The World’s Most Used Map: OSM is clearly the world’s most used open map, and most open map, and the best map. We want as many people as possible contributing to and using OSM, and to do that, the experience of using OSM needs to improve, and where you use OSM can improve.

More Than Just Streets: Do you know everything OSM is capable of mapping? Does your neighbor? Does your mayor? OSM is relatively well known in some circles, but its full potential is still opaque to many. We want everyone to know what OSM is about.

Cultivating Leadership of Mappers. Shared Goals Between the Community & OSMF: Mapping is driven by mappers, with a clear goal (make the map!), and there’s every reason that with clear goals and empowered members, the OSMF can act strongly. We now have clear goals, and clear expectations of what the management team and working groups can do and achieve, without much prescription on how things happen. This all frees the Board to provide the direction, and the management team and working groups to make the operational decisions.

Easier Contribution for Non-Geeks: We debated how this differed from the Most Used Map, and decided it was an important enough focus to stand on its own. Usability is certainly related, but more broadly, there’s much to do to improve all kinds of involvement in OSM.

And Again

The bulk of our time was spent translating these goals into actions, and this really was the most difficult part. Some things were quick to decide, like the final switch over to ODbL, but others became very drawn out and very detailed, like the process for site redesign. We touched on every standing issue, and aligned each clearly to the goals. PR, list moderation, license change, the management team, working group budgets, SOTM, site redesign, the articles of association.

We all agreed that short term action was needed on almost everything, with mind to how things should play out in the longer term. This meant drawing the above diagram, a lot, to remind ourselves of the urgency. We set big, audacious goals for all parts of the Foundation, with clear deadlines.

With so much on the table, we decided to stay in the room until we had decided on everything, which ended up meaning staying hours late, til there was little sunshine outside (or metaphorically sunshine inside the room) and tension rising. At one point, I was so fed up, I almost walked out, really seeing that if we didn’t resolve the issue at the Board, it wouldn’t resolve in the Foundation and the project, the goals wouldn’t be met, and decline was inevitable. And for me personally, that would mean a slow turning away from a project ingrained in almost everything I do in the world. We had to push through.

And we did. Despite looking over the brink, we had resolve. I felt tense, but knew I’d be happy with what we accomplished.

And after it was all done, we had some beers. The next day we rode horses. Group hug.

Thanks to the 2012 Board. This is going to be the year.

And thanks to Oliver Kuhn for the photos!

Opening Data in Kenya. My Method is to Hack.

There’s good reason to join the excitement about Open Data in Kenya. As Tariq says on the World Bank blog

Open data in Kenya is special: it comes at a time of national change; it’s got a head start on tools and expertise from the global open data community and it’s happening in a country where the information ecosystem is still maturing.

I’m proud that our work with Map Kibera has any relation to this at all. And it’s certainly due to the hard work of passionate people, in a tough environment, especially Dr. Bitange Ndemo (if you have the time, Dr. Ndemo’s talk at the World Bank is recommended).

Now that the launch has subsided, and I have a spare moment in the air from Tanzania, I want to look in depth at what data and how data has been released on OpenDataKE, the means of working with the data and collaborating on the data, and how this resource can relate to other open data sets in Kenyan society. Now that the government has made a bold move, I think it’s the responsibility of the software development community and civil society to really step up and test out the data, and suggest how this can become a really vibrant and social resource. Again, Tariq says this succinctly

the call for open data should go hand in hand with a call for better quality data: data that might be collected by official government agencies or in this age, by citizens themselves.

Transect across data

My “method” is to hack. I want to make an interesting simple visualization with some data from OpenDataKE, focusing on Nairobi, using openly available tools. Browsing data sets, the Population Density per Constituency, derived from the 2009 census, seemed promising. The difference in density across the urban landscape of Nairobi is extreme. For a sense of it, just look at the density of features in OpenStreetMap in the map of the slum of Mathare compared to nearby leafy Muthaiga. And to help the hack, the population density data set even has a handy location column.

Or maybe not. The usual practice in tabular data is to split the latitude and longitude into two columns, but here both values are formatted along with the unnecessary name of the province in which the constituency is located. Anyone who has had to work with data is used to little problems like this, and it’s easy enough (for a programmer) to write a quick script to clean this up. So I selected Export to CSV (side note, the other options presented by the platform seem hardly useful), filtered just the constituencies in Nairobi, and cleaned it up just by hand (I was too lazy to script this for just a handful of values).
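For anyone who would rather script the cleanup than do it by hand, here is a minimal Python sketch. The exact layout of the location column is an assumption (something like a province name followed by a latitude, longitude pair in parentheses), and the function name split_location and the sample value are made up for illustration.

```python
import re

# Assumed (hypothetical) cell layout: "NAIROBI (-1.3127, 36.7869)".
# The pattern looks for two decimal numbers separated by a comma and
# ignores any text around them, such as the province name.
PAIR = re.compile(r"(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)")


def split_location(cell):
    """Return (lat, lon) as floats, or None if no coordinate pair is found."""
    match = PAIR.search(cell)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))
```

Each cleaned pair can then be written out as two separate columns, which is what most mapping tools expect from a CSV.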

Gaps and Errors

I uploaded the CSV to GeoCommons, which has facility to deal with many formats of data and easily layer together interactive maps, and was surprised to see that several points weren’t placed in Nairobi at all. Turns out there are several errors in the location column, at least in Nairobi, and possibly in the rest of the country (I didn’t check). I’d have to correct these by hand. My knowledge of the location and extent of the constituencies is limited, so I needed another source, and that is not something you can find on OpenDataKE. It took some searching until I found scanned maps of constituencies on the Mars Group site. An overview map of all the constituencies was missing, so I used the adjacent constituency names in order to place the mistaken ones.
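A quick way to catch this kind of error before uploading is a bounding box check. The sketch below is a rough illustration only: the box around Nairobi is approximate, and the function name and the sample rows are invented, not taken from the data set.

```python
# Rough bounding box around Nairobi. These limits are approximate,
# assumed values for illustration, not an official boundary.
NAIROBI_BBOX = {"south": -1.45, "north": -1.15, "west": 36.65, "east": 37.11}


def outside_bbox(rows, bbox=NAIROBI_BBOX):
    """Return the names of rows whose (lat, lon) falls outside the box.

    rows is an iterable of (name, lat, lon) tuples.
    """
    suspects = []
    for name, lat, lon in rows:
        inside = (bbox["south"] <= lat <= bbox["north"]
                  and bbox["west"] <= lon <= bbox["east"])
        if not inside:
            suspects.append(name)
    return suspects
```

This only flags points that land well outside the city. A point placed in the wrong constituency but still inside Nairobi would pass, which is why the scanned maps were still needed.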

This worked well, but I’m left with questions. Why isn’t constituency boundary data available on OpenDataKE? How did Mars Group get these maps? And now that I’ve gone to the bother of correcting this data set, how can I contribute the changes back, or at least alert the holders of the data to the errors? There is a nomination section on OpenDataKE, which was wonderfully active until July 9, and then went quiet (did Socrata’s support contract expire then?). Anyway, I’m hopeful these will start getting attention again, so I’ve submitted two requests (pending approval to post), one for constituency boundaries, and another for a way to correct the location column in the population density data set.

My second surprise was that when I made the annotation size relative to the population density, I didn’t see a big difference among the constituencies. The area where Kibera is located, Langata, is about the same density as Westlands, and both are less than CBD and Eastlands. What’s happening here is that constituencies aren’t aligned to uniform urban settlement patterns. Langata, the home constituency of the Prime Minister, includes both the slum of Kibera and the wealthy and sparse suburb of Karen. A more useful and telling metric would be population density per Ward, the sub-unit of constituency which does have fairly good alignment to settlement patterns. The census can and has been aggregated to this level, because there was a large promotion of the census count of population in Kibera.

So again I’ve nominated a data set, for the population density aggregated at ward level. And I’ve also made a request for meta-information on the methodology of the census in Kibera and other informal settlements. While the 170,000 figure is surely closer to reality than the wild 1 million figures of the past, comparing that number with estimates derived from other methods shows a discrepancy; the others agree on an average closer to 250,000. Additionally, and admittedly anecdotally, many people in Kibera say they and their neighbors were never counted. Now this happens in any census, and it does not delegitimize the census, but in order to interpret data, openness on the methodology of data collection and analysis is also necessary.

The Civil Society of Data

Open government data exists in a wider ecosystem. Just a few months ago, Columbia University released amazing data sets of Nairobi, including high detail land use under open knowledge licenses. A truly beautiful and informative data set. Another place to find many a Kenyan civil society data set is Virtual Kenya. I thought the population density dataset would be interesting to layer with land use.

This data is distributed as Shapefiles, and I need tiles to use it as a base map. This is the purpose of MapBox, a rapidly developing tool set to make it easy to build beautiful map tiles. I loaded the Shapefiles in my locally running TileMill, styled the landuse categories based on Columbia’s pdf using carto, assigned interaction, and exported as mbtiles. These were dropbox’d, and posted to TileStream, as this map.

Mouseover or click on the map to get more detail about each parcel. This interaction technique is really interesting (as a geek), it’s entirely javascript and lightweight in the browser; it still has a few rough edges, but overall, a nice experience. There are limits, like TileMill doesn’t work with CSV, or permit multiple interactive layers, but it’s a great work in progress. Thanks to DevSeed for the TileStream account, and Dane Springmeyer, who spent some time with me hacking and bug hunting the interaction features of mapnik.

Like the OpenDataKE data set, and actually all data sets, there are errors … there is no such thing as a perfect map. The Ethiopian Church, across from YaYa, is not indicated nor is its land zoned as “public use” as other church lands in Nairobi are. And the Sarakasi Dome, home of our yoga practice in Nairobi, is not shown as a unified plot at all. Now Columbia makes their contact information known on the site, and I’ve met them personally, so feedback here is direct over email, but I wonder from here … what is the method and intention to continually correct, update and discuss these data sets? Does it need to?

Of course, that is the primary approach of OpenStreetMap … geographic data in a wiki, that gets constantly examined, updated, and discussed, completely openly. OpenStreetMap can provide another overlay, so we can have some roads and points of reference for the final map. So on GeoCommons, I configured the tiles from the land use data on TileMill (this required some hidden configuration of the tile scheme), composited over semi-transparent OSM data (provided by GeoCommons through Acetate), and then finally, the population density points. This is the result for now of the data transect.

I hope I can improve this. You’ll see that the OSM streets don’t overlay precisely with land use. This I believe, but haven’t confirmed, to be the result of a projection error in the Land Use data set. And an even better representation of the population density would have been a geo-join with area boundaries, had they been available. This would clearly show a thematic variation of population density. And of course, finer grained detail will be required to fulfill the original intention to show Nairobi’s vast differences in population density.

Where have we gone

Government data sets, authoritative civil society data sets, and completely crowd sourced data sets, layered together in a single map, revealing a little more about Nairobi, and about the data itself. Each is collected, distributed, and updated in different methods. In some ways, I feel OSM leads the wild edge here of what’s possible, and what we want: a truly social environment for data. Data without community is data dry and unimportant. Of course, I’m not saying OSM is the final repository for all data: OSM doesn’t deal with demographic and private data of a census, and the methods to authoritatively certify versions of OSM data are just starting. But this hasn’t stopped several kinds of OSM and government interaction already beyond the “traditional” import, with the likes of Portland and the USGS interacting with the OSM community.

The ultimate promise of all this OpenDataKE is not necessarily in the data itself, but in the deep and wide serving conversations openness triggers. My own personal metric for this will be when government officials from OpenDataKE and slum dwellers from Kibera and Mathare (and Mukuru) openly collaborate and work together. Can’t wait to see this happen. To get there, I challenge you too … get geeky with some data and write about it!

Jerusalem, Moving the Ladder

(x-posted from GroundTruth Initiative)

After 4 weeks, we’re leaving Jerusalem. The finest puzzle of human passion, and passion beyond, resting solidly and unsteadily on 5000 years or so of accumulated white stone and dirt. The most complicated and absurd and somehow, sometimes wonderful city. Our host Micha Kurz of Grassroots Jerusalem warned us that 4 weeks would be just enough to just begin understanding Jerusalem. In fact, it’s only enough time for the city to get a healthy grip on you so that you really don’t want to leave. And it’s definitely not enough time to come up for air for any writing and reflection … hopefully now I have a little space, on my trip back down to Dar es Salaam, the other end from here of the Great Rift Valley, where in complete contrast, the biggest conflict is that there might, maybe, be another political party in a couple years.

There is a ladder resting on the front balcony of the Church of the Holy Sepulchre. Centuries of delicate negotiation guide how priests and monks of various sects of Christianity move throughout the twisting bizarre space that might possibly be the site of the crucifixion of Jesus. No set of rules for behavior is comprehensive enough to cover every situation, and you hope you have good faith enough for dialogue when the loopholes come up. Not so here, where no one knows which sect originally placed the ladder, so they all refuse to move it. In the middle of all this, hordes of Russian pilgrims take pouty glamour shots in front of what might possibly be the site of a great suffering of Jesus. Thank you for the absurdity and the warning, that ladder should be the symbol of the city.

Remember this is also the country where there is meticulous debate on automated milking of cows on Shabbat.

Jerusalem is a place where they play excellent music in the streets on Friday, and no one notices a soldier dancing with a machine gun. In Silwan, down the slopes of the Old City, young men ride horses through streets defiantly kept 2-way as they have been for thousands of years, and gallop against slowly trudging tourist buses making their way to the City of David, where archeological diggings expose Biblical history, and expose too much of the present day, with houses and mosques collapsing above the excavations. Kites flew above all of this, capturing the view from cameras. In one day, you can rave off Jaffa Street, visit the birthplace of Jesus (grottos!), have a lovely fish dinner in the shadow of the security wall (thankfully just mapped, else we would’ve missed it), be told “have a nice day!” by a teenage soldier at an eerily deserted checkpoint, share a taxi with two Ethiopian priests in town from Dublin, and night cap it with a bottle of Brooklyn Lager. And check which beer you drink, ’cause drinking Taybeh vs Goldstar might be partisan (or wearing your hair a certain way, or the length of your skirt). Weekends are so confusing in Jerusalem! With 3 religions and 3 different holy days, it depends what side of town you’re on. Not recommended to go back and forth more than once between East and West Jerusalem in the course of one day, your mind will not be able to take such different worlds living in one city. You can hear Rock the Casbah performed in Arabic, and have multiple two hour discussions on the name of Jerusalem in OpenStreetMap.

Through it all, there has been such a strong reaffirmation of the mission of GroundTruth. With layers upon layers of history, of too much subtlety of meaning, of confusion, of deadly conflict, it seems like the only possible response is coming to some reckoning and witness to it all, to see the change over time, to let everyone speak up about the reality of their lives. Let’s map Jerusalem. Let’s let people expose their regular humanness, carefully pace ourselves through the bullshit and maybe just find some small piece of reason. On Micha’s tour around Jerusalem, we saw an ancient city transformed yet again at the founding of Israel in 1948, and after 1967, the Green Line is now the smoothest highway through town. The municipal boundary of Jerusalem slices through old villages, envelops completely new ones, and the security wall takes yet another course, and pressing close is the Areas ABC of the Oslo Accords. It hardly makes sense even if you understand it, and by showing what it’s like right on a piece of ground you may never visit otherwise, perhaps finally some understanding will happen.

We have at least 3 more posts to talk about the experience with the amazing Grassroots Jerusalem, who cut their chops mapping the Salah ad Din shopping district, and passed it on to al Walaja a small and inspiring village, which has a piece of just about everything complicated in Israel and Palestine including a completely encircling section of the wall, and it’s all being wrapped up in a potent brew of technology and training inspired by a slum in East Africa.

Doing

The past few months have almost entirely been about discussing, strategizing, arguing, planning. Really missing actually doing stuff.

So I decided to ignore the emails, concept notes, budgets, design documents. Build something. Just small things, but cool things. Makes me happy.

osm-changesets

On the OpenStreetMap site, the history tab is super important for keeping tabs on changes, but visually dull and hard to immediately parse. I added a map, with highlighting between the list and map views. TomH cleaned up the code and improved the layout. Basically, just some javascript and view changes, and I had to get my OSM installation up to date. Needs some tweaking, but a good start.

I was pushed to do this by the Strategic Working Group. We’re discussing the OSM front page and usability, going in circles, and realizing the only way forward was to just do some things, however minor, to iterate our discussion.

Looking at the history tab, I realize there are many ways to go in the realm of monitoring change. Make monitoring more interactive, customized to the places you care about, and the mappers you trust. There is a whole set of external monitoring tools, and perhaps some of the tools, or the ideas, could be integrated.


Kibera-1961

That is Kibera in 1961. Choose “Nairobi 1961” from this map. Really, just a preview, it’s going to blow people’s minds, especially with Brian Ekdale’s concise history of Kibera.

Last year in Nairobi, I became aware of an amazing archive of historic aerial imagery, flown by the RAF, acquired over all the British Colonial possessions, several times over decades. The images were used for creating ordnance survey maps … but since then the original images have been sitting in various dusty basements, most recently Oxford. Paolo Paron has a plan to get it all openly online. But in the mean time, you have to go to Oxford, sort through boxes of square images, and request photographic copies. I had the chance to do this back in December, and it was wonderful. 1961 is 50 years ago, meaning these images are out of copyright!

I received from Oxford eight largish tiff images. Seeking first to rectify them, I tried various incarnations of map warper (including a local version) but with 30 MB images, it was choking. Next I tried to find a desktop option, and found good results with QGIS georectification plugin. I stitched together 6 of the images, roughly (if I spent more time, I would’ve adjusted the different brightness levels and tried to better align the two North-South runs that covered Nairobi, one at a slight angle to another). Then loaded a shapefile of Kenya OSM into QGIS, set control points, and warped. Took the resulting GeoTiff onto the generous resource of hypercube, warped the image to spherical mercator using gdalwarp, and after some hints from winkey and crschmidt on #telascience, “gdal_translate -expand rgb”, to take this single band image into RGB, which TileLayers seems to prefer.
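The last two steps can be sketched as a small Python wrapper around the GDAL command line tools. This is a sketch under assumptions, not the exact commands I ran: the file names are placeholders, EPSG:3857 is assumed as the code for spherical mercator (older GDAL installs may only know it as EPSG:900913), and the helper names are made up. Only the "-expand rgb" flag comes from the description above.

```python
import subprocess


def warp_command(src, dst, target_srs="EPSG:3857"):
    """Build a gdalwarp call that reprojects src into target_srs.

    EPSG:3857 is an assumption; it is the usual code for spherical mercator.
    """
    return ["gdalwarp", "-t_srs", target_srs, src, dst]


def expand_command(src, dst):
    """Build the gdal_translate call that expands a single band to RGB."""
    return ["gdal_translate", "-expand", "rgb", src, dst]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    run_all([
        warp_command("nairobi_1961.tif", "nairobi_1961_mercator.tif"),
        expand_command("nairobi_1961_mercator.tif", "nairobi_1961_rgb.tif"),
    ])
```

Building the commands as lists keeps them easy to inspect or print before anything is run against a 30 MB image.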

There’s a lot more to say about this image, and that’ll be built up in the upcoming new mapkibera.org site. And of course, there’s those 1.5 million time capsule images sitting in an Oxford basement to open up.

Flow-Liberia

Friends of mine from Nairobi, working for the World Bank, undertook a substantial data collection of all water distribution points in Liberia. Pretty interesting compare and contrast with community mapping. I was pinged about this on twitter by Ned of Water for People, who helped develop the Android/Appspot application for this data collection, called FLOW. My immediate response was that this should be open source and open data, so other people can do cool stuff with it. Max took up the conversation and obliged (as much as he could, the data license is still under discussion). So I had to respond and do something cool with the data.

The flow web app is ok, but a little clunky. It uses the Google Maps API, and with nearly 1000 markers, it loads slowly and interaction is tricky. So I wanted to try a tile-based approach, to improve the speed and visualization, and this seemed like the best opportunity to get into TileMill, part of the MapBox toolset by Development Seed. After going through the tools, I am extremely impressed with what DevSeed is building, and hope I can help out by finding good uses, digging up bugs, and contributing here and there.

So I installed TileMill, updating mapnik in the process. Really nicely designed app, with laser focus on building nice tiles. One of the amazing features is map interactivity, which easily and efficiently turns input data sets into mouseover and infowindow bubbles. The trick it uses is grid renderer, Dane’s extension to mapnik which builds tiled json files, used by the wax library for OpenLayers (and ModestMaps, and GMaps) to define hit areas. Basically, much more efficient than creating markers.
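To give a feel for how those tiled json files define hit areas, here is a minimal Python sketch of the lookup. It assumes the grid follows the UTFGrid layout as published by MapBox (a "grid" list of strings, a "keys" list, and a "data" dictionary); the function names are made up for illustration, and the real wax code is JavaScript, not this.

```python
def resolve_code(ch):
    """Turn one grid character back into an index into the keys list.

    The encoding skips the double quote and backslash characters, so the
    decoder has to step back over them before subtracting the offset of 32.
    """
    code = ord(ch)
    if code >= 93:
        code -= 1
    if code >= 35:
        code -= 1
    return code - 32


def lookup(grid, px, py, tile_size=256):
    """Return the data record under pixel (px, py) of a tile, or None."""
    rows = grid["grid"]
    resolution = tile_size // len(rows)  # pixels covered by one grid cell
    ch = rows[py // resolution][px // resolution]
    key = grid["keys"][resolve_code(ch)]
    # An empty key marks cells with no feature under them.
    return grid.get("data", {}).get(key)
```

The browser never has to create a marker per feature: one small json file per tile answers "what is under the mouse" with a couple of array lookups.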

Finally, Dane showed me gridsforkids, a simple sandbox for building image and json tile sets for standalone, offline map apps. I had to grab the “.mml” file created by TileMill, have carto convert that to mapnik xml, and then use the custom generate_tiles.py to build everything. Found a small bug in how TMS is handled there, and had to hack in a few things for my data set, but it basically works (a few problems with the hit areas, still investigating). One way this could go is streamlining the process, where one of the outputs from TileMill was a tar ball bundling all the tools to run the map offline.
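For context on the TMS issue: the usual source of confusion is that TMS counts tile rows from the bottom of the map, while the XYZ scheme used by OSM-style tiles counts from the top. The sketch below shows only that conversion; it is not the fix I made in generate_tiles.py, and the function name is made up.

```python
def flip_y(y, zoom):
    """Convert a tile row between the TMS and XYZ schemes at a zoom level.

    At zoom z there are 2**z rows, so the row counted from one edge is
    (2**z - 1 - y) counted from the other. The same function converts
    in both directions.
    """
    return (2 ** zoom) - 1 - y
```

Applying it twice returns the original row, which makes it an easy thing to sanity check when tiles show up vertically mirrored.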