Data Source Handbook by Pete Warden

There are no rate limits, but you do have to get an API key and use OAuth to authenticate your calls:

,-122.389388.json

{
  "query": {
    "latitude": 37.778381,
    "longitude": -122.389388
  },
  "timestamp": 1291766899.794,
  "weather": {
    "temperature": "65F",
    "conditions": "light haze"
  },
  "demographics": {
    "metro_score": 9
  },
  "features": [
    {
      "handle": "SG_4H2GqJDZrc0ZAjKGR8qM4D_37.778406_-122.389506",
      "license": "",
      "attribution": "(c) OpenStreetMap ( and contributors CC-BY-SA (",
      "classifiers": [
        {
          "type": "Entertainment",
          "category": "Arena",
          "subcategory": "Stadium"
        }
      ],
      "bounds": [ -122.39115, 37.777233, -122.387775, 37.779731 ],
      "abbr": null,
      "name": "AT&T Park",
      "href": " 1.0/features/SG_4H2GqJDZrc0ZAjKGR8qM4D_37.778406_-122.389506.json"
    },
    ...

Yahoo!
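As a rough illustration of working with a response like the one above, the sketch below pulls the name, category, and approximate center point out of each feature. It is only a sketch under stated assumptions: the function name `summarize_features` is hypothetical, the sample is a trimmed copy of the record shown above, and the ordering of `bounds` as west, south, east, north is inferred from the values rather than confirmed by the source.

```python
import json

# Trimmed copy of the feature record shown above (only the fields used here).
SAMPLE = """
{
  "features": [
    {
      "name": "AT&T Park",
      "classifiers": [
        {"type": "Entertainment", "category": "Arena", "subcategory": "Stadium"}
      ],
      "bounds": [-122.39115, 37.777233, -122.387775, 37.779731]
    }
  ]
}
"""


def summarize_features(response_text):
    """Return (name, category, center_lat, center_lon) for each feature."""
    response = json.loads(response_text)
    summaries = []
    for feature in response.get("features", []):
        # Assumption: bounds are ordered [west, south, east, north].
        west, south, east, north = feature["bounds"]
        # Fall back to an empty classifier if the list is missing or empty.
        classifiers = feature.get("classifiers") or [{}]
        summaries.append((
            feature.get("name"),
            classifiers[0].get("category"),
            (south + north) / 2,  # center latitude
            (west + east) / 2,    # center longitude
        ))
    return summaries


if __name__ == "__main__":
    for row in summarize_features(SAMPLE):
        print(row)
```

Taking the midpoint of the bounding box is a crude stand-in for a real centroid, but it is usually good enough to drop a marker on a map.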

<parameters applicable-location="point1">
  <temperature type="maximum" units="Fahrenheit" time-layout="k-p24h-n8-1">
    <name>Daily Maximum Temperature</name>
    <value>38</value>
    <value>33</value>
    <value>41</value>
    <value>41</value>
    <value>35</value>
    <value>32</value>
    <value>30</value>
    <value>35</value>
  </temperature>
  <temperature type="minimum" units="Fahrenheit" time-layout="k-p24h-n7-2">
    <name>Daily Minimum Temperature</name>
    <value>22</value>
    <value>28</value>
    <value>34</value>
    <value>22</value>
    <value>24</value>
    <value>17</value>
    <value>20</value>
  </temperature>
</parameters>
</data>
</dwml>

OpenStreetMap

The volunteers at OpenStreetMap have created a somewhat-chaotic but comprehensive set of geographic information, and you can download everything they’ve gathered as a single massive file. One unique strength is the coverage of areas in the developing world that are absent from commercial databases, and since it’s so easy to change, even US locations are often more up-to-date with recent changes than more traditional maps.
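Because the full download is a single very large file, it is usually read as a stream rather than loaded whole. The sketch below shows one way that might look for the XML flavor of the data, using Python's standard `xml.etree.ElementTree.iterparse`. The function name `scan_osm` and the tiny sample document are made up for illustration; the element names (`node`, `way`, `relation`, and `tag` with `k`/`v` attributes) follow the OSM XML format.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Top-level element types in an OSM XML file.
TOP_LEVEL = ("node", "way", "relation")


def scan_osm(source):
    """Stream an OSM XML file; return (element counts, names found)."""
    counts = Counter()
    names = []
    # "end" events fire once an element and all its children are parsed.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag not in TOP_LEVEL:
            continue
        counts[elem.tag] += 1
        # Collect the value of any name tag attached to this element.
        for tag in elem.iter("tag"):
            if tag.get("k") == "name":
                names.append(tag.get("v"))
        # Drop the element's children and attributes to limit memory use.
        elem.clear()
    return counts, names


# Invented sample data, shaped like a small OSM XML extract.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6">
  <node id="1" lat="37.7784" lon="-122.3895">
    <tag k="name" v="Example Stadium"/>
  </node>
  <node id="2" lat="37.7790" lon="-122.3900"/>
  <way id="10">
    <nd ref="1"/>
    <nd ref="2"/>
    <tag k="highway" v="residential"/>
  </way>
</osm>
"""

if __name__ == "__main__":
    counts, names = scan_osm(io.BytesIO(SAMPLE))
    print(dict(counts), names)
```

For a real extract you would pass an open file (or a decompressing file object) instead of the in-memory sample. The emptied elements still hang off the root, so a production version would want to release those as well.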

The example Carr highlights is Wikipedia, which, although popular and extensive, has grown in a haphazard way that matches the selective interests of participants, and has incomplete, sometimes poorly written, trivial and highly contested articles that undermine its authority and usability. Carr contends that ‘if Wikipedia weren’t free, it is unlikely its readers would be so forgiving of its failings’ (2007: 4). OpenStreetMap can suffer from a lack of coverage in some areas where there are few volunteers. There are also concerns as to the sustainability of volunteered crowdsourced labour, with Carr (2007) arguing that the connections that bind a virtual crowd together are often superficial, lacking depth and obligatory commitment, are liable to dispersion, and are reliant on a small core group to keep the project going and provide the bulk of the labour. In contrast, others have noted that, in the case of OpenStreetMap, the quality of data produced matches that of professional companies and the coverage is diverse (Haklay 2010; Mooney et al. 2011). What this discussion highlights is that just because a dataset is huge in volume, it is not necessarily random, representative, clean, of high fidelity, or trustworthy.

., who the email was sent to or received from, the time/date, subject, attachments). Even if the e-mail is downloaded locally and deleted it is still retained on the server, with most institutions and companies keeping such data for a number of years. Like other forms of data, spatial data has grown enormously in recent years, from real-time remote sensing and radar imagery, to large crowdsourced projects such as OpenStreetMap, to digital spatial trails created by GPS receivers being embedded in devices. The first two seek to be spatially exhaustive, capturing the terrain of the entire planet, mapping the infrastructure of whole countries and providing a creative commons licensed mapping dataset. The third provides the ability to track and trace movement across space over time; to construct individual time–space trails that can be aggregated to provide time–space models of behaviour across whole cities and regions.

As a consequence, vast quantities of data are routinely generated concerning interactions across ICT networks.

Volunteered data

In contrast to data generated by surveillance directed at people or things by individuals and agencies, or captured automatically as an inherent feature of a device or system, much big data are actively volunteered by people. In such cases, individuals generate and input data and labour either to avail themselves of a service (such as social media) or to take part in a collective project (such as OpenStreetMap or Wikipedia). Such labour has been called prosumption as the modes of production and consumption have been partially collapsed onto one another, with individuals assuming a role in the production of the service or product they are consuming (Ritzer and Jurgenson 2010). For example, the content of a social media site is simultaneously produced and consumed by individual users inputting comments, uploading photos and videos, and engaging in discussion and the exchange of sentiment (‘liking’ or ‘disliking’ something).


Rescue workers didn’t know where to start; even the ones with GPS receivers quickly discovered that there were no good digital maps of Haiti. Google, to its credit, gave the United Nations full access to the usually proprietary data in its collaborative Map Maker tool, but the real hero of the hour was the OpenStreetMap project, an open-source alternative to Map Maker. OpenStreetMap is essentially the Wikipedia of maps: anyone can use it, anyone can change it in real time, and its data is free and uncopyrighted in perpetuity. When the earthquake struck, late Tuesday afternoon, Haiti was a white void in OpenStreetMap. Within hours, thousands of amateur mappers were collaborating all over the world, adding roads and buildings from aerial imagery to the database, until every back alley and footpath in Port-au-Prince had been charted. Relief workers updated the maps with traffic revisions, triage centers, and refugee centers, and just days later, the volunteer-drawn map was the United Nations’ go-to source of transportation information.

“Many thanks to all crisis mappers for great contributions,” posted UNICEF emergency officer Jihad Abdalla. “You made my life much easier, since I’m a one-man show here . . . million thanks.”

Port-au-Prince, as it looked in OpenStreetMap when the earthquake hit and the way it looked a week later

After reading about the lives saved in Haiti by OpenStreetMap, I used it to look at my own neighborhood and found that the cul-de-sac we live on was also missing from the map. After hesitating a moment—is it really okay to draw on a map?—I added and labeled my street by hand, Wikipedia-style. It was a surprising rush to add something new, however trivial, to the world’s sum of geographical knowledge. For a brief moment, I was Captain Cook charting the New Zealand coastline, a veritable Stanley of the suburbs.

Also available in Australia from McArthur Maps, 208 Queens Parade, North Fitzroy, 3068, Australia; phone: 0011 614 3155 5908; e-mail:

Further credits: Images on page 66 courtesy of NASA; map on page 81 courtesy of Altea Gallery (; map on page 118 © Dragonsteel Entertainment, LLC; photograph on page 118 © Mayang Murni Adnin; photograph on page 171 by Jim Payne; images on page 230 © OpenStreetMap and contributors, CC-BY-SA

For my parents. And for the kid with the map.

CONTENTS
Chapter 1: ECCENTRICITY
Chapter 2: BEARING
Chapter 3: FAULT
Chapter 4: BENCHMARKS
Chapter 5: ELEVATION
Chapter 6: LEGEND
Chapter 7: RECKONING
Chapter 8: MEANDER
Chapter 9: TRANSIT
Chapter 10: OVEREDGE
Chapter 11: FRONTIER
Chapter 12: RELIEF
Notes
Index

MAPHEAD

Chapter 1
ECCENTRICITY
n.: the deformation of an elliptical map projection

My wound is geography.


Structured social data and geospatial mapping suggest one direction where these tools are evolving in the field. A web application from ESRI deployed during historic floods in Australia demonstrated how crowdsourced social intelligence provided by Ushahidi can enable emergency social data to be integrated into crisis response in a meaningful way. The Australian flooding web app includes the ability to toggle layers from OpenStreetMap, satellite imagery, and topography, and then filter by time or report type. By adding structured social data, the web app provides geospatial information system (GIS) operators with valuable situational awareness that goes beyond standard reporting, including the locations of property damage, roads affected, hazards, evacuations and power outages. Long before the floods or the Red Cross joined Twitter, however, Brian Humphrey of the Los Angeles Fire Department (LAFD) was already online, listening.

After the devastating 2010 earthquake in Haiti, the evolution of volunteers working collaboratively online also offered a glimpse into the potential of citizen-generated data. Crisis Commons has acted as a sort of “geeks without borders.” Around the world, developers, GIS engineers, online media professionals and volunteers collaborated on information technology projects to support disaster relief for post-earthquake Haiti, mapping streets on OpenStreetMap and collecting crisis data on Ushahidi.

Healthcare

What happens when patients find out how good their doctors really are? That was the question that Harvard Medical School professor Dr. Atul Gawande asked in the New Yorker, nearly a decade ago. The narrative he told in that essay makes the history of quality improvement in medicine compelling, connecting it to the creation of a data registry at the Cystic Fibrosis Foundation in the 1950s.


NATIONAL GEOSPATIAL-INTELLIGENCE AGENCY
https://nga.maps.arcgis.com/home/
The National Geospatial-Intelligence Agency provides public access to large volumes of satellite and other geo-data and imagery in support of scientific research, natural disaster recovery operations, and crisis management.

NORSE ATTACK MAP
http://map.norsecorp.com/
Norse, a cyber-threat analysis firm, provides real-time visualizations of global cyber war based on data collected every second from Internet and Dark Web sources, plotting origins of attackers and target attacks.

OPENSTREETMAP
https://www.openstreetmap.org/
OpenStreetMap is a crowdsourced mapping platform maintained by a user community that constantly updates data on transportation networks, store locations, and myriad other content generated and verified through aerial imagery, GPS devices, and other tools.

PLANET LABS
https://www.planet.com/
Planet Labs uses a network of low-orbit satellites to capture the most current images of the entire earth and form composite digital renderings that can be used for commercial or humanitarian applications.

Edison; European Energy Supply Security; Gazprom; International Energy Institute; Natural Earth; Norsk Oljemuseum; OpenStreetMap; Petroleum Economist; U.S. Energy Information Administration; White Stream.

pai1.31 The New Arctic Geography. Map created by Jeff Blossom. Arctic Council; Durham University; Grenatec; IBRU; IFT; Ministry of Foreign Affairs of Denmark; Natural Earth; The New York Times; Theodora.

pai1.32 The World: 4 Degrees Celsius Warmer. © 2009 Reed Business Information—UK. All rights reserved. Distributed by Tribune Content Agency.

pai1.33 One Mega-City, Many Systems. Created by University of Wisconsin–Madison Cartography Laboratory. Government of the Hong Kong Special Administrative Region; Global Administrative Areas; Natural Earth; Noun Project; OpenStreetMap; timeout.com.

pai1.34 Global Data Flows Expanding and Accelerating.

If we are an urban species, then producing data-driven cityscapes—mapping cities from within—is as important as capturing their scale. In the 1980s, GPS technology firms began painstakingly driving and geo-coding roads all over the world, building up databases for the suites of navigational tools that are now in almost every new car’s dashboard. Google soon joined the fray, adding more satellite imagery and street views. Today every individual can become a digital cartographer: Maps have gone from Britannica to Wiki. OpenStreetMap, for example, crowdsources street views from millions of members who can also tag and label any structure, infusing local knowledge and essential insight for everything from simple commuting to delivering supplies during humanitarian disasters. We can now even insert updated imagery from Planet Labs’ two dozen shoe-box-size satellites into 3-D maps and fly through the natural or urban environment.


Inspired by Wikipedia’s model of collaborative knowledge production, in 2004 British computer scientist Steve Coast launched OpenStreetMap. Suddenly, anyone could upload a record of his or her movements along the nation’s road network. By systematically traveling the streets of every city, town, and village in the United Kingdom, an army of volunteers set out to make a freely-usable map. As of 2013, after years of collective surveying and annotation, the crowdsourced street map of England was finally nearing completion. The effort has since expanded around the world, and in poor countries often rivals the government’s own maps. After the 2010 Haiti earthquake, which obliterated the nation’s mapping agency in a building collapse, OpenStreetMap provided essential data to relief organizations. The Indian activists who pioneered slum mapping in the 1990s saw their work as a way to begin integrating poor communities into existing city-planning efforts in the hope of securing a fairer share of government resources.

But with the new chart living online in OpenStreetMap, Map Kibera is focused instead on powering new tools that change how the community is represented in the media, and how organizers lobby the government to address local problems. Voice of Kibera, for instance, is a citizen-reporting site built using another open-source tool called Ushahidi. The name means “testimony” in Swahili, and it was developed in 2008 to monitor election violence in Kenya. Voice of Kibera plots media stories about the community onto the open digital map, and allows residents to send in their own reports by SMS. Another Map Kibera effort recruits residents to monitor the progress of infrastructure projects.

And there will always be an urge to “do something,” if only for self-preservation. As Heeks argues, “In a globalized world, the problems of the poor today can, tomorrow—through migration, terrorism, and disease epidemics—become the problems of those at the pyramid’s top.” This brings us to the final dilemma: crowdsourcing and the future role of government in delivering basic services. In smart cities, there will be many new crowdsourcing tools that, like OpenStreetMap, create opportunities for people to pool efforts and resources outside of government. Will governments respond by casting off their responsibilities? In rich countries, governments facing tough spending choices may simply withdraw services as citizen-driven alternatives expand, creating huge gaps in support for the poor. In the slums of the developing world’s megacities, where those responsibilities were hardly acknowledged to begin with, crowdsourced alternatives may allow governments to free themselves from the obligation to equalize services in the future.


People from around the world, especially Haitians living elsewhere, started adding in street names. The map was, according to Dickover, “insanely detailed” within just a couple of weeks, and was routinely used by the World Bank, the United Nations, the US Southern Command, the US Marine Corps, the Coast Guard—“anyone who needed to get across town.” By the third week, the World Bank was funding people from OpenStreetMap to train local Haitians in the use of GPS equipment to add more and more local knowledge. Dickover wants to make this sort of partnership of local people with a distributed network of developers more routine, so that we don’t have to wait for disasters to spur action. So, he has been organizing State Department–sponsored “TechCamps” around the world. For example, at the TechCamp in Santiago, Chile, at the end of 2010, people from Brazil and Argentina said they’d like to have a way to aggregate the data gathered by election monitors so they could get an overall picture of the situation on polling days.


Wikipedia is so identified with web-based collaboration that its name has been incorporated into book titles (Wikinomics: How Mass Collaboration Changes Everything) and related initiatives such as the leaked document site WikiLeaks. In Benkler’s The Wealth of Networks, Wikipedia plays a prominent role as an exemplar of “commons-based peer production.” But Wikipedia turned out to be more the exception than the rule. While there are other not-for-profit large-scale collaborative platforms (OpenStreetMap, for example), no other non-commercial site has reached anything resembling Wikipedia’s influence. As Sue Gardner, then Executive Director of the Wikimedia Foundation, wrote in 2011: Wikipedia represents the fulfilment of the original promise of the internet: that it’s a kind of poster child for online collaboration in the public interest. Because back when the internet started, we figured it would be full of stuff like Wikipedia.


Etsy, an online marketplace for makers, is not like a really big craft fair. eBay is different from both classifieds and yard sales. Airbnb is much more than a listing of 1 million bed-and-breakfasts. What distinguishes and transforms these activities is that platforms connect, organize, aggregate, and empower the participating peers. Without the platform—without Airbnb, Etsy, Lyft, TopCoder, or OpenStreetMap, to name a few—the peer co-creators would not engage, the leveraged excess capacity would be limited, and the consumers of these products and services would not return again and again. Excess capacity turns out to be a key input into a Peers Inc product or service. The cost of experimentation is lowered as new value is extracted out of something that already exists and is already substantially (or entirely) paid for.




Weight each edge (by number of replies, whether it’s symmetric, and so on) and set limits on the number of links from any node. This sharply reduces the intermediate data size, yet still does a reasonable job of estimating cohesiveness.

—Philip (flip) Kromer, Infochimps

* * *

[144] [145] [146] [147] [148] All are steady-state network flow problems. A flowing crowd of websurfers wandering the linked-document collection will visit the most interesting pages the most often. The transfer of social capital implied by social network interactions highlights the most central actors within each community. The year-to-year progress of students to higher or lower test scores implies what each school’s effect on a generic class would be
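The edge-weighting advice above (weight each edge, then cap the number of links per node) can be sketched roughly as follows. This is an illustration only, not the author's implementation: the function names, the `symmetric_bonus` multiplier, and the rule of keeping an edge only when it ranks among the strongest links of both endpoints are all assumptions chosen to make the idea concrete.

```python
from collections import Counter, defaultdict


def weight_edges(replies, symmetric_bonus=2.0):
    """Turn (sender, recipient) reply pairs into weighted undirected edges.

    Hypothetical scoring: each reply counts once, and replies on a pair
    where both sides have replied to each other are multiplied by
    symmetric_bonus.
    """
    counts = Counter(replies)
    weights = {}
    for (src, dst), n in counts.items():
        pair = tuple(sorted((src, dst)))
        reciprocal = (dst, src) in counts
        score = n * (symmetric_bonus if reciprocal else 1.0)
        weights[pair] = weights.get(pair, 0.0) + score
    return weights


def cap_links(weights, max_links):
    """Keep an edge only if it is among the top max_links of both endpoints."""
    by_node = defaultdict(list)
    for edge, w in weights.items():
        for node in edge:
            by_node[node].append((-w, edge))  # negative weight sorts heaviest first
    top = {
        node: {edge for _, edge in sorted(ranked)[:max_links]}
        for node, ranked in by_node.items()
    }
    return {
        edge: w
        for edge, w in weights.items()
        if edge in top[edge[0]] and edge in top[edge[1]]
    }


if __name__ == "__main__":
    replies = [("a", "b"), ("b", "a"), ("a", "b"),
               ("a", "c"), ("a", "d"), ("d", "a")]
    print(cap_links(weight_edges(replies), max_links=2))
```

Capping links per node bounds how many edges any single busy node can contribute, which is what keeps the intermediate data from exploding when a later step joins edges against each other.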


Write-centered communities

For some communities, collaboration goes much further. It becomes much deeper, more intrinsic, and more accessible to all. Instead of merely enjoying things together, collaboration goes so far as to help people create things together. In these environments, the community also assumes the role of producer of the content. The typical example here is one of the many Free Culture communities, such as Linux, Wikipedia, OpenStreetMap, Creative Commons, and so on. In these communities, community members have the opportunity to change the very content that brings them together. The Ubuntu community is one such example. Ubuntu is an entirely free Linux operating system that is designed to provide a complete, free, stable, and secure system for desktops, servers, or mobile devices. Ubuntu is built using hundreds of pieces of preexisting Free Software tools that we refer to as upstream applications.