Tag: Analysis

Welcome to an additional family member – OSMstats

Maybe some of you are already familiar with “OSMstats”, a website that provides numerous statistics about the OpenStreetMap (OSM) project. The site was created and is maintained by the two guys at altogetherlost.com. However, OSMstats has now been moved to the ResultMaps domain at osmstats.neis-one.org. I added several new features too. First of all, you can now select a specific date for your stats. Secondly, the main menu panel has been extended with a new entry for statistical information about OSM changesets.

osmstats

Additionally, the graphs for the country statistics, the active members and daily edits are also available in a “year”-overview. I hope you like the new extensions. A big thanks to both guys at altogetherlost.com who originally created OSMstats!

OSMstats is now available at: http://osmstats.neis-one.org
Feel free to check out my ResultMaps too, which offer many helpful and fun OSM tools: http://resultmaps.neis-one.org

Notice: OSMstats was introduced in 2011, so the webpage cannot provide statistics prior to that year. Also, the newly created Changeset tab only has data from July 2014 onward.

Thanks to maɪˈæmɪ Dennis

The Average Age of OpenStreetMap Objects

Joseph Reeves asked me on Twitter the other day if “anyone knows the average age of @openstreetmap objects?”. Here we go: based on the complete OSM data history file from here (June 14th, 2014) and a few additional lines of code, I conducted a simple analysis.

Overall, 400,000 mappers of the more than 1.7 million registered members have contributed to the OSM project. Almost 375,000 contributors created at least one Node, 325,000 at least one Way and 70,000 at least one Relation. In total the contributors collected more than 2.7 billion Nodes, 263 million Ways and 3 million Relations. The percentage of newly created OSM objects (Nodes, Ways & Relations) has been more or less at the same level for the past few years (2010 to 2014), at 17% to 20% per year. The following diagram shows the percentage of each created OSM object type.

created_objects

Additionally, I evaluated the number of objects based on the date of their last modification. Utilizing the object timestamps of the last modification, we see a slightly different result for the last 4 years. 55% of the Nodes, 67% of the Ways and 74% of the Relations in the OSM database do not have a timestamp dated before 2012.

last_modifier
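For readers who want to reproduce this kind of number, here is a minimal Python sketch of the calculation. It is not the code used for the analysis above. It assumes the last-modification timestamps are available as ISO 8601 strings, as they appear in OSM XML files; the function name `share_modified_since` and the sample values are made up for illustration.

```python
from datetime import datetime


def share_modified_since(timestamps, year):
    """Return the fraction of timestamps that fall in `year` or later."""
    if not timestamps:
        return 0.0
    recent = sum(
        1
        for ts in timestamps
        if datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").year >= year
    )
    return recent / len(timestamps)


# Toy input for illustration only, not real OSM data.
sample = [
    "2009-05-01T10:00:00Z",
    "2012-03-04T08:30:00Z",
    "2013-11-20T17:45:00Z",
    "2014-01-02T00:00:00Z",
]
share = share_modified_since(sample, 2012)
print(f"{share:.0%} of the sample objects were last modified in 2012 or later")
```

Run once per object type (Nodes, Ways, Relations), this gives the kind of percentages quoted above.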

However, I think it would make an interesting visualization if we could put those numbers on a world map, similar to the “OpenStreetMap availability” by Stefano De Sabbata. You can also find some up-to-date OSM statistics here.

Thanks to maɪˈæmɪ Dennis

A précis: Where are the US mappers at?

This blog post is a summary of Dennis’ and my State of the Map (SotM) United States presentation. Maybe some of you already know about our publication: “Comparison of Volunteered Geographic Information Data Contributions and Community Development for Selected World Regions”. From the abstract: “Our findings showed significantly different results in data collection efforts and local OSM community sizes. European cities provide quantitatively larger amounts of geodata and number of contributors in OSM …”. “Furthermore, the results showed significant data contributions by members whose main territory of interest lies more than one thousand kilometers from the tested areas.” Especially the last finding is quite interesting when considering “arm-chair-mapping” in OSM.

However, for our SotM US session we repeated some of the conducted analyses for 50 urban areas in the United States to see whether similar patterns could be determined. You can find the session abstract here; additionally the ppt slides and also a video are online. The following animation shows how the number of contributors in the US evolved from 2007 to 2014.

us_mapper_animation_tiny

Similar to our prior research results for the selected 12 world regions, the US urban areas showed different individual patterns. Some cities such as Fargo (ND) experienced several data imports in the past which resulted in strong data density values (Nodes and Ways), whereas other areas solely rely on a small community of volunteers and contributors.

We also conducted a simple statistical analysis to evaluate whether certain socio-economic factors have an impact on the development of OSM communities in the different cities. Variables such as population density, per capita income and education showed a moderate to strong correlation with contributor numbers, highlighting that all of the aforementioned factors can have an impact on the success of OSM in the selected urban areas in the US.

us_socio_economic_factors
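As a rough illustration of what such a correlation check looks like, here is a small Python sketch that computes the Pearson correlation coefficient by hand. It is an assumption that Pearson's r is the right measure here, since the post does not name the statistic used, and the numbers below are invented toy values, not the figures from our study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError if one of the sequences is constant.
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: population density (per km²) and contributor
# counts for four imaginary urban areas.
density = [1000, 2000, 3000, 4000]
contributors = [12, 19, 33, 41]
r = pearson_r(density, contributors)
print(f"r = {r:.2f}")
```

Values of r close to +1 or -1 indicate a strong linear relationship, values near 0 a weak one. A correlation alone does not show that a factor causes higher contributor numbers.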

It was also quite useful to take a look at the local contributor numbers vs. external contributors. Certain cities such as Miami (FL) heavily rely on data contributions made by mappers whose home region is more than 1000 km away, whereas other cities such as Los Angeles (CA) show large values for both local and external mappers. You can check out your own area here too. The corresponding blog post is online here: “The OpenStreetMap Contributors Map aka Who’s around me?”. A more detailed analysis that is currently being conducted will reveal whether cities with large external mapper contributions show the same quality as areas with lots of local mappers.

 

It’s about time – OpenStreetMap Contributor Activity Report 2013

One and a half years ago (end of 2011), one of my open access publications (“Analyzing the Contributor Activity of a Volunteered Geographic Information Project — The Case of OpenStreetMap”) was published. It contained several interesting findings about the contributions made by the community of the OSM project. The results showed that the community follows a particular pattern that many other online community-based projects tend to struggle with too: only a small number of the members really contribute in a meaningful way to the project. Additionally, the publication illustrated how many contributors are located in Europe and other areas of the world, and how and where mappers contribute data over a certain period of time.

I thought it was time to update this information with some new statistics. Between the end of 2011 and July 2013 the number of registered OSM members has increased more than two-and-a-half times to almost 1.34 million. Based on the freely available changeset dump of the project it is quite easy to check how many members created at least one changeset and thus hopefully made an edit to the database. The following figure shows the increase of registered members and the aforementioned results of the analysis of the changeset dump of July 31st, 2013.

2013_Members
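To give an idea of how such a check can be done, here is a minimal Python sketch that streams through a changeset dump and counts the distinct user IDs. It relies on the `uid` attribute that `<changeset>` elements carry in the dump; the function name, the tiny inline sample and the `registered` figure are made up for illustration, and this is not the exact script behind the figure.

```python
import io
import xml.etree.ElementTree as ET


def count_contributors(xml_source):
    """Count distinct uid values on <changeset> elements."""
    uids = set()
    # iterparse streams the file, so the whole dump never sits in memory.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == "changeset":
            uid = elem.get("uid")
            if uid is not None:  # changesets without a uid are skipped
                uids.add(uid)
        elem.clear()  # free the element once it has been processed
    return len(uids)


# Tiny inline sample instead of the real dump.
sample = io.BytesIO(b"""<osm>
  <changeset id="1" uid="10" user="a"/>
  <changeset id="2" uid="10" user="a"/>
  <changeset id="3" uid="11" user="b"><tag k="comment" v="x"/></changeset>
  <changeset id="4"/>
</osm>""")
active = count_contributors(sample)
registered = 8  # hypothetical number of registered members
print(f"{active} of {registered} members ({active / registered:.0%}) "
      "created at least one changeset")
```

For the real dump, which is distributed bz2-compressed, you could pass something like `bz2.open("changesets-latest.osm.bz2", "rb")` instead of the inline sample (the file name is an assumption).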

By the end of 2011 almost 43% of the 505,000 registered members had created at least one changeset. By July 2013 this share had decreased to only 26% (355,000) of the 1.34 million registered members. As some of you already know, the real number of actual contributors is far below this. I decided to look into this in a little more detail and created some diagrams that show the number of changesets and active contributors per month. We can see a few events that had an impact on the numbers in the diagrams: first, the license change in April 2012, followed by the run of the redaction bot in July 2012 (HDYC-profile), and finally the release of the new OSM iD editor in May 2013. The number of changesets has not changed a lot when comparing current (July 2013) numbers with prior months of last year.

2013_Changesets

The last diagram of this blog post shows the active contributors per month. The collected information tells us that the total number of “long-time” contributors is increasing whereas the number of “new” contributors is more or less on the same level in recent months.

2013_Contributors

It is also interesting to see the impact of “new” members in the month before the license change (March 2012). Anyway, for the last one and a half years the number of active contributors per month has remained steady at between 19,000 and 23,000. What do you think?

You will find additional OSM editor usage statistics (by Oli-Wan) in the OSM Wiki. Also, it is interesting to see that currently the number of registered members is only growing by 700 to 900 per day. Before August 2013 it was between 3,000 and 4,000 per day! Did anyone change something in the registration process at OSM.org, e.g. a new security/login mechanism during account creation?

Thanks to maɪˈæmɪ Dennis

PS: Happy 9th Birthday OpenStreetMap!

The State of the Map. United States. Street Network. 2013

Last year we wrote a journal paper analyzing the OpenStreetMap (OSM) dataset of the United States; it was published on May 28th, 2013 in the Transactions in GIS journal. You can download a free pre-print version here. The paper has been published just in time to add to the discussion at the upcoming State of the Map United States conference, which will take place in San Francisco and includes some presentations about data imports to OSM. Unfortunately, Dennis and I cannot attend the conference this year, so we decided to write a blog post with some additional and up-to-date numbers.

In January there was an announcement on the OSM mailing list that in the past few months many connectivity errors in the United States OSM dataset had been fixed. Probably a lot of these fixes can be attributed to Martijn’s Maproulette website or to Geofabrik’s OSM Inspector (OSMI) Routing View. However, a short discussion started on the mailing list about the total number of errors that are left and how long it would take to fix them all. Thus, we downloaded four OSM planet files dated Jan 4th 2012, June 13th 2012, Jan 2nd 2013 and Jun 2nd 2013 to get some new results. After cutting the United States dataset from the planet files, we used the same algorithm as utilized in OSMI’s Routing View to obtain some stats about the street network of the US datasets.

First of all, the following image shows the number of errors for each dataset that we included in the analysis. The errors that were detected are separated into unconnected and duplicate ways. You can find some additional information about both error types here.

As you can see, the number of unconnected OSM ways has been rapidly reduced in the past 17 months from around 141,000 to 19,000. The number of “duplicate way” errors has been reduced from 17,500 to 11,500. You can find the exact numbers in the following table and an updated error layer on the mentioned OSMI website. In certain cases the duplicate way check reported several errors for one and the same way. For these cases the number of unique OSM way IDs was counted.

Date – Unconnected Ways – Duplicate Ways

  • Jan 4th, 2012 – 141,578 – unique 17,563 (overall errors: 535,923)
  • June 13th, 2012 – 145,468 – unique 17,977 (overall errors: 518,536)
  • Jan 2nd, 2013 – 15,911 – unique 12,287 (overall errors: 257,388)
  • Jun 2nd, 2013 – 19,073 – unique 11,582 (overall errors: 220,451)
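The relative changes behind these numbers can be worked out with a few lines of Python. The values are taken straight from the table above; only the helper name `pct_change` is made up.

```python
# (date, unconnected ways, unique duplicate ways), from the table above
snapshots = [
    ("2012-01-04", 141578, 17563),
    ("2012-06-13", 145468, 17977),
    ("2013-01-02", 15911, 12287),
    ("2013-06-02", 19073, 11582),
]


def pct_change(old, new):
    """Relative change from old to new in percent."""
    return (new - old) / old * 100


first, last = snapshots[0], snapshots[-1]
unconnected_change = pct_change(first[1], last[1])
duplicate_change = pct_change(first[2], last[2])
print(f"Unconnected ways: {unconnected_change:.1f}%")
print(f"Duplicate ways:   {duplicate_change:.1f}%")
```

Note that the comparison is between the first and the last snapshot only; between January and June 2013 the number of unconnected ways actually went up again slightly.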

Overall the length of the US street network did not really change a lot. At the beginning of 2012 it was around 11.07 million km while in 2013 it is 11.1 million km, which means an increase of around 30,000 km. The following image shows the distribution of the US street network divided by different OSM road classes.

The length of the residential roads is still decreasing (-496,000 km), similar to what we saw during the analysis for our paper, while the length of the other road types (+276,000 km) and secondary/tertiary roads (+205,000 km) is increasing. This is the result of a massive retagging process of the imported TIGER/Line dataset in OSM. Dennis mentioned this already in his SotM US 2012 presentation. Motorways also experienced an increase of around 44,000 km in 2012. You will find some additional, quite interesting statistics, charts and of course maps in the aforementioned journal publication. In particular, a few more thoughts and facts about the effect and impact of data imports on OSM can be found in our research study about the United States OSM dataset.
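As a quick sanity check, the per-class changes can be added up and compared with the overall growth of around 30,000 km mentioned earlier. The sketch below assumes that all four figures refer to the same time span, which the text does not state explicitly; the dictionary keys are just labels.

```python
# Change in network length per road class in km, as quoted above.
changes_km = {
    "residential": -496_000,
    "other": 276_000,
    "secondary/tertiary": 205_000,
    "motorway": 44_000,
}

net_change_km = sum(changes_km.values())
print(f"Net change: {net_change_km:+,} km")
```

The rounded class values add up to roughly the overall increase, which fits the picture of roads being retagged from one class to another rather than newly mapped.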

I Like OpenStreetMap (OpenLayers Plugin)

A few months ago, Frederik Ramm posted an idea on the German OpenStreetMap mailing list about a new (stochastic) approach to OSM data quality assurance. You can find his original German post here. His idea was to allow users to “like” or “dislike” a specific region on the OSM map, a function that other popular websites such as YouTube or Facebook implemented to let users provide feedback on videos or status updates. For OSM this function could give some indicators or trends about the quality of the OSM map data.

I really liked his idea and in collaboration with Frederik I created an Open Source OpenLayers plugin. For all new readers: OpenLayers is an Open Source library which can embed a dynamic (OSM) map into more or less any webpage. One of our goals was to make the integration of the ILikeOSM plugin as easy as adding a tile server to your OpenLayers map.

The following image shows the plugin in more detail, including the “like” and “dislike” buttons to provide feedback about the area on the map.

An additional feature of the plugin shows how many users have been viewing the same area of the map that the current user is looking at. More precisely: how many other users have viewed a similar area of the map within the past two minutes at a zoom level within ±3 of yours. All components of the plugin are Open Source and available on GitHub. The database which saves the likes and dislikes is running on a German OSM dev server. A database dump file can be downloaded on a daily basis. It is important to note that no private data is saved in the database when a user leaves his or her feedback. The plugin only saves an independent, randomly generated user ID, the feedback type (i.e. thumbs up/down), the zoom level, the layer name and the bounding box of the map section. A map view is generally not saved to the database until the user agrees to it via a pop-up window.
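To make the “similar area” idea a bit more concrete, here is a small Python sketch of such a check. It is only an illustration and not the plugin's actual server code: it assumes that “similar” means overlapping bounding boxes, and all names (`MapView`, `is_similar_view`) are hypothetical. The two-minute window and the zoom range of ±3 are the values described above.

```python
from dataclasses import dataclass


@dataclass
class MapView:
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float
    zoom: int
    timestamp: float  # seconds since some common reference point


def boxes_overlap(a, b):
    """True if the two bounding boxes share at least one point."""
    return (a.min_lon <= b.max_lon and b.min_lon <= a.max_lon
            and a.min_lat <= b.max_lat and b.min_lat <= a.max_lat)


def is_similar_view(mine, other, max_zoom_diff=3, window_seconds=120):
    """Assumed rule: recent enough, close in zoom, overlapping bbox."""
    recent = 0 <= mine.timestamp - other.timestamp <= window_seconds
    close_zoom = abs(mine.zoom - other.zoom) <= max_zoom_diff
    return recent and close_zoom and boxes_overlap(mine, other)


mine = MapView(8.0, 49.0, 9.0, 50.0, zoom=12, timestamp=1000.0)
others = [
    MapView(8.5, 49.5, 9.5, 50.5, zoom=14, timestamp=950.0),   # similar
    MapView(8.5, 49.5, 9.5, 50.5, zoom=14, timestamp=800.0),   # too old
    MapView(8.5, 49.5, 9.5, 50.5, zoom=16, timestamp=990.0),   # zoom too far
    MapView(20.0, 10.0, 21.0, 11.0, zoom=12, timestamp=990.0), # elsewhere
]
viewers = sum(is_similar_view(mine, o) for o in others)
print(f"{viewers} other user(s) viewing a similar area")
```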

Do you like this feature?
It is quite easy to integrate it into your own webpage. Here is how it works:
1. Add the following line below your OpenLayers script-tag:
<script src="http://ilike.openstreetmap.de/ILikeOSM.min.js" type="text/javascript"></script>
2. Then add the following lines to your OpenLayers Controls:
new OpenLayers.ILikeOSM()
3. Styling
<style type="text/css">
div.olILikeOSM { position: absolute; top: 15px; left: 50px; padding: 7px; color:white; border-radius: 10px; background: rgba(0, 0, 0, 0.6); }
div.olILikeOSM a { color: white; font-size:12px; text-decoration: underline; }
</style>
4. That’s it!

What is the benefit of this plugin or of the saved ILikeOSM data?
Based on the saved likes, dislikes and map views we can generate some statistics to provide you with information about the number of people who like or dislike your particular area of interest. Maybe we can even see some proof of Linus’s law, “given enough eyeballs, all bugs are shallow”; meaning, in this case, that a larger number of users checking a certain region of the map results in “better” OSM data quality. As a first prototype, I generated a static webpage which shows an example result map.

Further ideas?
The plugin could potentially be expanded with an additional textbox in which a user could leave a comment on why the area is not well represented in OSM. This information could then be saved e.g. in OpenStreetBugs. Anyway, we think that the current version of the plugin could provide some very useful information. You will find a webpage with all information, examples and downloads here: http://ilike.openstreetmap.de. As a first step we integrated the plugin into the OpenStreetMap Germany webpage.

Frederik will give a short talk about the ILikeOSM plugin at the upcoming State of the Map 2012 in Tokyo. If our proposed session abstract about another topic for the State of the Map 2012 US gets accepted, Dennis will try to present it there too.

Thank you very much for your feedback: Frederik, Jonas, Dennis, Sven & Marc

Where are the new OpenStreetMap Contributors?

Since last Friday the OpenStreetMap project has more than 600 000 registered members. As many of you may know, not every newly registered member starts contributing to the project right away. Based on my “How did you contribute to OSM?” database I created a small (but neat) webpage which shows where the newest registered OpenStreetMap (OSM) members made one of their first edits. The following image shows a screenshot of the new webpage:

The visualized data will be updated on a daily basis. At the moment there are two layers available: one layer displays the latest members of the past two days, while the other layer does the same for the past seven days. At lower zoom levels the icons are clustered and only show the number of new members. At higher zoom levels, however, you can click on the individual icons to get further information about the new project member. Thanks to Stamen for their really nice looking watercolor map. Would you like to see more statistics about the number of new contributors for each individual country?

The new webpage is online here: http://resultmaps.neis-one.org/newestosm.php

thx @ maɪˈæmɪ Dennis

Which country has the most OpenStreetMap GPS Points?

Some of you might already know that OpenStreetMap released a first bulk GPS point dataset last weekend. It contains almost 2.8 milliard (or for readers in the US 2.8 billion) points and is provided in its raw format, which means that only coordinate information is available for each point. Unfortunately it does not include any additional information or metadata. You can read more about it at the OSM Foundation Blog.

The first idea that came to my mind was a simple comparison analysis to answer the following questions: Where are all those points located and which country has the most GPS points? A first attempt produced results showing that the points are distributed over 238 countries. For my analysis I used the OSM Mapnik world boundaries from the wiki. As you can see in the following pie chart, nearly 21% of the points are located in Russia (about 570 million points) and another 18% in Germany (about 500 million points). Does Russia have so many GPS points because of the size of the country, or is the community just exceptionally active with GPS devices? However, the strange thing is that Germany is, with about 18%, “only” in second place this time, weird isn’t it? 😉

I think overall these are some quite interesting numbers. We all hope to see some more metadata information in the OSM GPS point dataset soon.
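The country size question can be approached with a quick back-of-the-envelope calculation in Python. The point counts are the rounded numbers from above; the land areas (about 17.1 million km² for Russia and about 357,000 km² for Germany) are rounded outside figures that I am assuming here, they are not part of the GPS dataset.

```python
TOTAL_POINTS = 2.8e9  # "almost 2.8 billion" points in the bulk dataset

# country: (GPS points, approximate area in km²); areas are assumptions
countries = {
    "Russia": (570e6, 17_100_000),
    "Germany": (500e6, 357_000),
}

share = {}
density = {}
for name, (points, area_km2) in countries.items():
    share[name] = points / TOTAL_POINTS * 100      # percent of all points
    density[name] = points / (area_km2 / 1000)     # points per 1000 km²
    print(f"{name}: {share[name]:.1f}% of all points, "
          f"{density[name]:,.0f} points per 1000 km²")
```

With these rounded inputs Germany ends up with a far higher point density than Russia, which suggests that the sheer size of Russia plays a big role in its first place.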

thx @ maɪˈæmɪ Dennis & Good luck for next week!

*UPDATE* April 11th, 2012
The following map shows the OSM GPS points per 1000 km²:

The second map shows the OSM GPS points per 1000 inhabitants:

OpenStreetMap in Germany (2007-2011)

Due to requests from several German OpenStreetMap contributors, here is a blog post (originally written in German) about the results of the article: “The Street Network Evolution of Crowdsourced Maps: OpenStreetMap in Germany 2007–2011.” By Pascal Neis, Dennis Zielstra & Alexander Zipf. 2012. Future Internet 4, no. 1: 1-21. (doi:10.3390/fi4010001) Link: http://www.mdpi.com/1999-5903/4/1/1/

Note: The following presents and summarizes selected results and figures from the English article. If you are interested in more, please read the original journal paper. It contains far more information and figures!

The OpenStreetMap (OSM) project is the best-known project in the field of Volunteered Geographic Information (VGI). Worldwide, several hundred thousand members take part in collecting information for a “free” geodatabase. The growth of the data is quite heterogeneous across the world, but Germany is one of the most active countries globally and the number of project participants rises from year to year. Currently (June 2011), more than 40,000 different members in total have contributed to the Germany dataset. As the following figure shows, different numbers of members created the three OSM object types (Node, Way & Relation) in Germany. The figure also shows another important piece of information: 98% of the Nodes were generated by about 8,500 members, 98% of the Ways by about 7,500 and 98% of the Relations by about 2,600 members (if the last owner is counted as the creator).

(c) MDPI
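A number like “98% of the objects come from about 8,500 members” can be derived by sorting the members by their object count and adding them up until the desired share is reached. The following Python sketch shows the idea; the function name `members_needed` and the toy counts are made up and are not the data from the paper.

```python
def members_needed(counts, share=0.98):
    """Smallest number of top contributors who together hold `share`
    of all objects, given the number of objects per member."""
    total = sum(counts)
    running = 0
    for n, count in enumerate(sorted(counts, reverse=True), start=1):
        running += count
        if running >= share * total:
            return n
    return len(counts)


# Toy data: objects owned by eight imaginary members (1000 in total).
objects_per_member = [500, 300, 150, 30, 10, 5, 3, 2]
top = members_needed(objects_per_member)
print(f"{top} of {len(objects_per_member)} members hold 98% of the objects")
```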

The following figure shows the development of the total street network for Germany over the past four years (2007-2011). For the sake of clarity and for better analysis and comparison, the many different road categories were combined into four groups (motorways/expressways, county/municipal roads, roads in or near residential areas, and others such as service roads or field/forest tracks).

(c) MDPI

Following the growth of the different categories, one can see that from a certain point in time some categories no longer increase. From this we can infer when a category was probably captured almost “completely”, or where new roads are still being added. For this first comparison, however, the following has to be kept in mind: the TomTom dataset is only suitable for comparing the road network for car navigation (i.e. three of the four categories). The category “other routes” can only be considered to a limited extent in the comparison. In this fourth category OSM already has a much larger network than the commercial provider. Based on the assumptions just mentioned and the comparison with the TomTom road lengths per category, we arrive at the following results:

  1. Motorways/expressways were already completely captured by mid-2008
  2. By mid-2009 the county/municipal roads in Germany were captured
  3. Roads in or near residential areas are not yet completely captured
  4. By the end of 2009 OSM already had more “other routes” than the commercial TomTom dataset
  5. In terms of the total network length, OSM has surpassed TomTom since mid-2010, although the many field and forest tracks are certainly an advantage for OSM here.
  6. Currently (June 2011), work in OSM Germany is mostly limited to occasional edits to the network in and near residential areas and, increasingly, to the other route network (forest, meadow and field tracks).

The development of the individual road categories compared with the TomTom dataset is shown in the following figure.

(c) MDPI

Thus, in Germany the OSM street network for car navigation has currently (June 2011) come within 9% of comparable datasets, and for the total route network it even holds over 27% more information. Given the current growth in the missing road categories, OSM should close the remaining gap in the street network by the middle or end of 2012.

In addition to the route network, the total numbers of turn restrictions per road category were also compared.

(c) MDPI

As the figure above shows, the difference between TomTom and OSM is not small. Currently, more than five times as many turn restrictions are available for Germany from TomTom as from OSM. The number of turn restrictions in OSM is rising steadily, but at the current level and rate of growth it will probably still take several years for OSM to catch up.

The complete (English) article with further analyses and figures can be downloaded free of charge here: http://www.mdpi.com/1999-5903/4/1/1/

thx @ maɪˈæmɪ Dennis

Updated Status for Unmapped Places

It has been nearly eight months since I conducted the last unmapped places analysis for OpenStreetMap, so I figured it was about time to create a new one. You can read in the last blog post exactly how my algorithm works.

However, at the moment (Nov. 4th, 2011) we have (according to the Geofabrik extract) about 597 000 entries in OSM for places that are located within “Europe”. This means we have an overall increase of about 90 000 places within the past eight months. We can separate them into several types with different values:

  • City: 1093 (as of March 11th, 2011 it was 1055 ; +3.6%)
  • Town: 16213 (as of March 11th, 2011 it was 16106 ; +0.7%)
  • Suburb: 29642 (as of March 11th, 2011 it was 24913 ; +19.0%)
  • Village: 301638 (as of March 11th, 2011 it was 278691 ; +8.2%)
  • Hamlet: 238717 (as of March 11th, 2011 it was 184326 ; +29.5%)
  • Isolated dwelling: 9064 (new in my stats)
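The growth rates in this list are simple percentage changes between the two snapshots. Here is a short Python sketch that recomputes them from the counts above; the variable names are made up.

```python
# place type: (count on March 11th, 2011, count on Nov. 4th, 2011)
places = {
    "city": (1055, 1093),
    "town": (16106, 16213),
    "suburb": (24913, 29642),
    "village": (278691, 301638),
    "hamlet": (184326, 238717),
}
isolated_dwellings = 9064  # new in the stats, no March value available

growth = {
    name: (new - old) / old * 100
    for name, (old, new) in places.items()
}
for name, pct in growth.items():
    print(f"{name}: {pct:+.1f}%")

total_now = sum(new for _old, new in places.values()) + isolated_dwellings
print(f"Total places now: {total_now}")
```

The total of 596367 matches the “about 597 000 entries” mentioned above.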

The results showed that of the 301638 village entries for Europe in the database, about 154445 (51%; in March 2011 it was 55%) have not been detected or mapped yet. It is also possible that some places are tagged incorrectly (e.g. village vs. hamlet). Anyway, the following figures show the distribution of the values for each country (in different scales).

It is nice to see that Austria (-688), Czech Republic (-633), France (-1978), Georgia (-721), Germany (-1192), Italy (-926), Poland (-2364), Spain (-1472) and the United Kingdom (-829) were able to reduce their number of “unmapped places” considerably. As usual you can find my results as a GPX-overlay here: http://resultmaps.neis-one.org

(Remarks for http://resultmaps.neis-one.org: Not each and every country is available as an overlay. Some countries such as France or Poland showed longer browser loading times to display the GPX-overlays!)

UPDATE: Download the complete GPX-files of this analysis here.

thx @ maɪˈæmɪ Dennis