Category: OpenStreetMap

Additional insights about OSM changeset discussions: Who requests, receives and responds?

Last year I wrote two blog posts about the OpenStreetMap (OSM) feature that allows commenting on a contributor’s map changes within a changeset. The first blog post showed some general descriptive statistics about the number of changeset discussions created, the affected countries, the origin of the commenting contributors and their mapping reputation. The second post described a newly introduced feature that lets contributors flag their changeset so that their map edits can be reviewed. This blog post follows up on the topic and presents similar but updated research.

The first chart shows the number of created comments, the number of discussed changesets and the contributors involved over the last 15 months. The number of created comments and discussed changesets fluctuates over time, whereas the number of contributors who take part in changeset discussions stays fairly constant at around 1,500 per month. Around 3,200 contributors per month received a comment on at least one of their changesets.
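The monthly figures described above can be derived from a flat list of comments. The following is only a minimal sketch: the record layout (`date`, `changeset_id`, `commenter`, `changeset_owner`) is a hypothetical one chosen for illustration, not the schema used for the actual charts, and it assumes that a contributor "receives" a comment only when someone else writes it.

```python
from collections import defaultdict


def monthly_discussion_stats(comments):
    """Aggregate changeset comments per month (keyed by 'YYYY-MM')."""
    buckets = defaultdict(lambda: {
        "comments": 0,
        "changesets": set(),
        "commenters": set(),
        "recipients": set(),
    })
    for c in comments:
        month = c["date"][:7]  # 'YYYY-MM-DD' -> 'YYYY-MM'
        b = buckets[month]
        b["comments"] += 1
        b["changesets"].add(c["changeset_id"])
        b["commenters"].add(c["commenter"])
        # Assumption: replies by the changeset owner do not count as "received".
        if c["commenter"] != c["changeset_owner"]:
            b["recipients"].add(c["changeset_owner"])
    return {
        month: {
            "comments": b["comments"],
            "discussed_changesets": len(b["changesets"]),
            "commenting_contributors": len(b["commenters"]),
            "receiving_contributors": len(b["recipients"]),
        }
        for month, b in sorted(buckets.items())
    }


# Made-up sample records, for illustration only.
sample_comments = [
    {"date": "2018-01-03", "changeset_id": 1, "commenter": "alice", "changeset_owner": "bob"},
    {"date": "2018-01-04", "changeset_id": 1, "commenter": "bob", "changeset_owner": "bob"},
    {"date": "2018-01-20", "changeset_id": 2, "commenter": "alice", "changeset_owner": "carol"},
    {"date": "2018-02-01", "changeset_id": 3, "commenter": "dave", "changeset_owner": "bob"},
]
stats = monthly_discussion_stats(sample_comments)
```

Using sets per month keeps the contributor counts distinct, so a contributor who comments many times in one month is counted once.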

Adding Indicators to OSM Map Edits Assessment

Almost two years ago I published a web service that finds suspicious OpenStreetMap (OSM) map changes. You can use the service here and find some more information in previous blog posts. Changeset discussions in particular have turned out to be more or less the de facto standard for communication between contributors during map change reviews.

However, when I inspect map changes, I sometimes see new contributors using uncommon OSM tags. Therefore I think it could be useful to add another assessment parameter to the aforementioned suspicious OSM map changes page. The newly introduced indicator states the matching ratio between the contributed tags and the most popular OSM tags. This means that if the changeset contributor used many uncommon tags on the changed objects, the matching rate will be low. If the contributor applied many common (“popular”) tags, the matching rate will be high, approaching 100%. For the calculation I used Jochen Topf’s taginfo API to get commonly used OSM tags. An API description can be found here. Furthermore, I added the average age (in days) of modified and deleted objects. This indicator can be used to see whether the contributor edited objects that were mapped today (0 days) or that have already existed for a longer period of time, e.g. 1566 days. The values for the average version numbers are computed in a similar fashion.
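To make the two indicators more concrete, here is a minimal sketch of how they could be computed. It is not the code behind the web service: the function names are hypothetical, the set of popular tags is passed in directly instead of being fetched from taginfo, tags are compared as (key, value) pairs, and an object's age is assumed to be measured from a timestamp supplied by the caller.

```python
from datetime import datetime, timezone


def tag_matching_ratio(used_tags, popular_tags):
    """Percentage (0-100) of the changeset's tags found in the popular set."""
    used = list(used_tags)
    if not used:
        return None  # no tags in the changeset, nothing to compare
    hits = sum(1 for tag in used if tag in popular_tags)
    return 100.0 * hits / len(used)


def average_age_days(object_timestamps, now):
    """Mean age in days of the modified or deleted objects."""
    ages = [(now - ts).days for ts in object_timestamps]
    return sum(ages) / len(ages) if ages else None


# Made-up sample values, for illustration only.
popular = {("highway", "residential"), ("building", "yes"), ("natural", "tree")}
changeset_tags = [
    ("highway", "residential"),
    ("building", "yes"),
    ("building", "yes"),
    ("my_custom_key", "foo"),  # uncommon tag, lowers the ratio
]
ratio = tag_matching_ratio(changeset_tags, popular)

now = datetime(2018, 1, 11, tzinfo=timezone.utc)
timestamps = [
    datetime(2018, 1, 11, tzinfo=timezone.utc),  # mapped today: 0 days
    datetime(2018, 1, 1, tzinfo=timezone.utc),   # 10 days old
]
avg_age = average_age_days(timestamps, now)
```

In this sample three of the four tags are popular, so the ratio is 75%. The average version number could be computed in the same way as the average age, by averaging over version numbers instead of day counts.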

Public profiles on “How did you contribute to OSM?”

The web page How did you contribute to OpenStreetMap? (HDYC) provides detailed information about individual project members. Some time ago, the page was revised so that member profiles can only be accessed by users who are logged in with their OpenStreetMap (OSM) user account. This feature was implemented after a long and important discussion about “protecting user privacy in the OSM project”. The complete German discussion can be found here. However, I don’t want to continue the discussion here. I still hold the view that any information that is available about contributors should not be hidden in project data dumps, APIs or on webpages. In my opinion, information such as contributor names or ids and modification timestamps is essential for quality analysis and assessments that protect the project against e.g. vandalism or unintended map edits.

Processing compressed OpenStreetMap Data with Java

This blog post summarizes how you can write your own Java classes to process OpenStreetMap (OSM) pbf files. PBF is a compressed binary format, which is nowadays more or less the standard for reading and writing OSM data quickly. In the OSM world, many tools and programs have implemented this file format (you can find additional information here). Still, I think the following samples for reading and writing such compressed OSM data can be very helpful, in particular if someone has to create some sort of test data or has to read specific mapped objects of interest for her/his own project. The well-known Java tool Osmosis (a command line application for processing OSM data) provides several libraries that are the basis for this brief tutorial.

Step 1: Maven is the key – If your Java project is already managed by Maven, you can just add the following lines to your pom.xml. Maven then downloads the jar files required to process compressed OSM data and adds them to your project.

Review requests of OpenStreetMap contributors
– How you can assist! –

The latest version of the OpenStreetMap editor iD has a new feature: “Allow user to request feedback when saving”. This idea was mentioned in 2016 in a diary post by Joost Schouppe about “Building local mapping communities” (at that time: “#pleasereview”). The diary post also contains some other good thoughts and is definitely worth reading.

However, based on the newly implemented feature, any contributor can flag her/his changeset and ask for feedback. Now it’s your turn! How can you find and support those OSM’ers?

  • Step 1: Based on the “Find Suspicious OpenStreetMap Changesets” page you can search for flagged changesets, e.g. limited to your country only: Germany or UK.
  • Step 2: Leave a changeset comment where you e.g. welcome the contributor and (if necessary) give her/him some feedback about the map changes. You could also add some additional information, such as links to wiki pages of tags (map features), good mapping practices, the OSM forum, OSM help or mailing lists. Based on the changeset comment other contributors can see that the original contributor of this changeset already has been provided with some feedback.

Who is commenting?
An Overview about OSM Changeset Discussions

As mentioned in my previous blog post about detecting vandalism in OpenStreetMap (OSM) edits, it’s highly recommended that contributors use public changeset discussions when contacting other mappers regarding their edits. This feature was introduced at the end of 2014 and is used widely by contributors today. Each and every comment is listed publicly and every contributor can read the communication and, if necessary, add further comments or thoughts. In most cases where questions about a specific map edit come up, it is desirable that contributors take this route of communication instead of private messaging each other.

For my presentation at the German FOSSGIS & OpenStreetMap conference I created several statistics about the aforementioned changeset discussion feature. For this blog post I reran all analyses and created some new charts and statistics. Let’s start with the first image (above): It shows the number of commented or discussed changesets per month since the feature was introduced. The peak in January 2017 was caused by a revert involving several thousand changesets.

Detecting vandalism in OpenStreetMap – A case study

This blog post is a summary of my talk at the FOSSGIS & OpenStreetMap conference 2017 (German slides). I guess some of the content might be suitable for a research article, however, here we go:

Vandalism is (still) an omnipresent issue for any kind of open data project. Over the past few years the OpenStreetMap (OSM) project data has been integrated into a number of applications. In my opinion, this is one of the most important reasons why we have to bring our quality assurance to the next level. Do we really have a vandalism issue after all? Yes, we do. But first we should take a closer look at the different vandalism types.

Reviewing OpenStreetMap contributions 1.0 – Managed by changeset comments and discussions?

The OSM project still records around 650 new contributors each day (out of almost 5,000 newly registered members per day). Some countries (such as Belgium or Spain) already provide platforms to coordinate the introduction to OSM for new mappers. Others use special scripts or intense manual work to send newly registered contributors mails with useful information (Washington or the Netherlands). However, new contributors often make, as expected, beginner mistakes. Personally, I often detect unconnected ways, wrong tags or, in rare cases, fictitious data. Unfortunately, (new) members sometimes also delete existing map data, intentionally or unintentionally.

At the end of 2014, many people were anticipating the newly introduced changeset discussions feature. A few months later, I developed a page that finds the latest discussions around the world or in your country. By now, many OSM members use changeset discussions for commenting or questioning map edits of other members.


A comparative study between different OpenStreetMap contributor groups – Outline 2016

Over the past few years I have written several blog posts about the (non-) activity of newly registered OpenStreetMap (OSM) members (2015, 2014, 2013). Similarly to the previous posts, the following image shows the gap between the number of registered and the number of active OSM members. Although the project still shows millions of new registrations, “only” several hundred thousand of these registrants actually edited at least one object. Simon showed similar results in his yearly changeset studies.

[Image: 2016members]

The following image shows that the project still has some loyal contributors. More specifically, it shows the increase in monthly active members over the past few years and their consistent data contributions based on the first and latest changeset:

[Image: 2016months]

However, this time I would like to combine the current study with some additional research. I tried to identify three different OSM contributor groups, based on the hashtag in a contributor’s comment or the utilized editor, for the following analysis:

Unmapped Places of OpenStreetMap – 2016

Back in 2010 & 2011 I conducted several studies to detect underrepresented regions a.k.a. “unmapped” places in OpenStreetMap (OSM). More than five years later, some people asked if I could rerun the analysis. Based on the latest OSM planet dump file and Taginfo, almost 1 million places have been tagged as villages. Furthermore, around 59 million streets have a residential, unclassified or service highway value. My algorithm to find unmapped places works as follows:

  1. Use every place node of the OSM dataset which has a village-tag (place=village).
  2. Search in a radius of ca. 700 m for a street with one of the following highway-values: residential, unclassified or service.
  3. If no street can be found, mark the place as “unmapped”!
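The three steps above can be sketched in a few lines. This is an illustrative brute-force version under stated assumptions, not the original implementation: the function names and input tuples are hypothetical, the distance is measured to individual street nodes with the haversine formula, and a planet-wide run would need a spatial index instead of comparing every village with every street node.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters
STREET_VALUES = {"residential", "unclassified", "service"}


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 coordinates."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def find_unmapped_villages(villages, street_nodes, radius_m=700):
    """Return names of villages without a matching street within radius_m.

    villages:     iterable of (name, lat, lon) for place=village nodes
    street_nodes: iterable of (highway_value, lat, lon)
    """
    streets = [(lat, lon) for hw, lat, lon in street_nodes if hw in STREET_VALUES]
    unmapped = []
    for name, vlat, vlon in villages:
        has_street = any(
            haversine_m(vlat, vlon, slat, slon) <= radius_m
            for slat, slon in streets
        )
        if not has_street:
            unmapped.append(name)  # step 3: mark the place as "unmapped"
    return unmapped


# Made-up sample data, for illustration only.
villages = [("A", 50.0, 8.0), ("B", 51.0, 9.0)]
street_nodes = [
    ("residential", 50.003, 8.0),  # roughly 330 m from A
    ("primary", 51.0, 9.0),        # next to B, but not a counted highway value
    ("service", 51.02, 9.0),       # more than 2 km from B
]
result = find_unmapped_villages(villages, street_nodes)
```

In the sample, village A has a residential street within the radius, while B only has a primary road nearby and a service road too far away, so only B is reported.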

My results for the entire OSM planet can be found on the following webpage.
