Wait, someone did what?
Exploring Reverted Map Edits in OpenStreetMap

The OpenStreetMap (OSM) project has over 10 million registered members, with around 2 million user profiles having made at least one map contribution. However, a closer look reveals that there has been a slight decline in the number of active contributors over the last three years. Despite the extensive global mapping community, there are instances where individuals or automated bots disregard the consensus norms of the community when editing data. These situations arise due to disagreements regarding the appropriateness of certain tagging or features within the OSM database. To address these issues, a change rollback process, commonly referred to as reverting, is used to combat vandalism and correct ‘mistakes’ by restoring a previous version of the data.

Two years ago, I added additional statistics to the “How did you contribute to OSM?” page for quality assurance purposes. The numbers for each contributor profile were derived from an analysis of the full history OSM planet dump and changeset tags, including the specific editor used. While this pragmatic approach provides valuable insights, it’s important to acknowledge that the obtained numbers are estimations rather than exact figures. Furthermore, I received several inquiries regarding the implementation of the processing involved in identifying the displayed “reverted changes”.
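To make the idea of a "reverted change" more concrete, here is a minimal sketch of one way such edits could be flagged in an object's history. It is not the processing used for the HDYC page. It assumes that an edit counts as reverted when the very next version restores exactly the state that existed before it. The `ObjectVersion` class, its fields and the `find_reverted_versions` function are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ObjectVersion:
    """One version of an OSM node from a full-history dump (hypothetical model)."""
    version: int
    user: str
    lat: float
    lon: float
    tags: Tuple[Tuple[str, str], ...]  # sorted (key, value) pairs

    def state(self):
        # Only the mapped content matters, not who saved it.
        return (self.lat, self.lon, self.tags)


def find_reverted_versions(history: List[ObjectVersion]) -> List[int]:
    """Return version numbers whose edit was undone by the very next version.

    Assumption: version n+1 "reverts" version n when it restores exactly the
    state of version n-1 and version n had actually changed something.
    """
    ordered = sorted(history, key=lambda v: v.version)
    reverted = []
    for before, edit, after in zip(ordered, ordered[1:], ordered[2:]):
        if after.state() == before.state() and edit.state() != before.state():
            reverted.append(edit.version)
    return reverted


# Made-up example history: version 2 is undone by version 3.
history = [
    ObjectVersion(1, "alice", 52.52, 13.40, (("amenity", "cafe"),)),
    ObjectVersion(2, "mallory", 52.52, 13.40, (("amenity", "volcano"),)),
    ObjectVersion(3, "bob", 52.52, 13.40, (("amenity", "cafe"),)),
]
print(find_reverted_versions(history))  # prints [2]
```

A check like this only catches immediate, exact rollbacks. Partial reverts or reverts that skip several versions would need a looser comparison, which is one reason why such numbers remain estimations.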

Open Data from the Bundesamt für Kartographie und Geodäsie vs. Crowdsourced OpenStreetMap in Germany – A Comparison of Open Data

After almost 1,000 days of abstinence, (finally?) another blog post from me. Because of its subject and regional focus, this one was written in German. English version via Google translate?

Preamble – In autumn 2020, an Open Data working group was formed at FOSSGIS e.V. Through various joint activities of the working group and the Bundesamt für Kartographie und Geodäsie (BKG, the German Federal Agency for Cartography and Geodesy), such as a workshop, two datasets with the locations of state police stations and public health offices were released in early December 2020 for the “maintenance and extension of the OpenStreetMap database”. The BKG also offers further interesting “open data” geodata and web services, but their license terms do not allow the OpenStreetMap (OSM) project to use them.

An “official open” dataset from a federal agency? Fine, but how does it compare with collaboratively collected data such as OpenStreetMap? Can differences in quality be identified? Are the datasets perhaps on a par, are there serious differences, and what could everyone benefit from?

#100 – Thank you!

While I was working on my latest blog post, I realized that I had already written 100 posts over the past nine years. All posts have one thing in common: They are about the well-known and maybe never-ending OpenStreetMap project. From time to time, new questions or issues still emerge that someone has to tackle. This has always fascinated me about OSM. However, this particular number 100 is not about a specific subject, it’s just a tiny post to say thank you! Thank you for your continuous interest in reading, commenting on and of course sometimes criticizing my work. To me it’s still awesome to see that you, a few thousand people in total, use the tools and services that I implemented every day.

New metric for measuring the “qualitative nature” of OpenStreetMap activities @ How did you contribute?

Back in June we had a Twitter chat about potential new features for the “How did you contribute to OpenStreetMap” (HDYC) website. One suggestion (by J-Louis) was to “show more relevant information about skills, tagging system or the quality of contributions” of a project member. Overall I really like the following summary by Claudius: “HDYC started off with a strong focus on quantitative metrics and you expanded it lately a lot to reflect the qualitative nature of contributions. I think there’s value to show more about which area of data someone contributed: Auto/bike/railway/water infrastructure, amenities…”.

Additional insights about OSM changeset discussions: Who requests, receives and responds?

Last year I wrote two blog posts about the OpenStreetMap (OSM) feature that allows commenting on a contributor’s map changes within a changeset. The first blog post showed some general descriptive statistics about the number of created changeset discussions, the affected countries, the origin of the commenting contributors and their mapping reputation. The second post described a newly introduced feature that lets contributors flag their changeset so that their map edits can be reviewed. This blog post follows up on the topic with similar but updated research.

The first chart shows the number of created comments, the number of discussed changesets and the contributors involved over the last 15 months. The number of created comments and discussed changesets fluctuates over time, whereas the number of contributors who take part in changeset discussions stays fairly constant at around 1,500 per month. Around 3,200 contributors per month received a comment on at least one of their changesets.

Adding Indicators to OSM Map Edits Assessment

Almost two years ago I published a web service that finds suspicious OpenStreetMap (OSM) map changes. You can use the service here and find some more information in previous blog posts. Changeset discussions in particular have turned out to be more or less the de facto standard for communication between contributors during map change reviews.

However, when I inspect map changes, I sometimes see new contributors using uncommon OSM tags. Therefore I think it could be useful to add an additional assessment parameter to the aforementioned suspicious OSM map changes page. The newly introduced indicator states the matching ratio between the contributed tags and the most popular OSM tags. This means that if the changeset contributor used many uncommon tags on the edited objects, the matching rate will be low. If the contributor applied many common (“popular”) tags, the matching rate will be high, approaching 100%. For the calculation I used Jochen Topf’s taginfo API to get commonly used OSM tags. An API description can be found here. Furthermore, I added the average age (in days) of modified and deleted objects. This indicator shows whether the contributor edited objects that were mapped today (0 days) or that have existed for a longer period of time, e.g. 1566 days. The values for the average version numbers are computed in a similar fashion.
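The two indicators can be sketched roughly as follows. This is an illustration under stated assumptions, not the code behind the service: the set of popular tag keys is hard-coded here instead of being fetched from taginfo, the ratio is computed on tag keys only, and an object's age is taken as the number of whole days between its previous timestamp and the time of the edit. The function names `tag_matching_ratio` and `average_age_days` are hypothetical.

```python
from datetime import datetime, timezone
from typing import Iterable, List, Optional


def tag_matching_ratio(contributed_keys: List[str],
                       popular_keys: Iterable[str]) -> Optional[float]:
    """Percentage of contributed tag keys that appear in the popular set."""
    if not contributed_keys:
        return None  # no tags contributed, so no ratio can be stated
    popular = set(popular_keys)
    hits = sum(1 for key in contributed_keys if key in popular)
    return 100.0 * hits / len(contributed_keys)


def average_age_days(previous_timestamps: List[datetime],
                     edit_time: datetime) -> Optional[float]:
    """Mean age in whole days of the objects touched by an edit."""
    if not previous_timestamps:
        return None
    ages = [(edit_time - ts).days for ts in previous_timestamps]
    return sum(ages) / len(ages)


# Assumed stand-in for a list of commonly used keys (not real taginfo output).
popular = {"highway", "name", "building", "amenity"}
used = ["highway", "name", "building", "my_custom_key"]
print(tag_matching_ratio(used, popular))  # prints 75.0

edit_time = datetime(2024, 1, 11, 12, 0, tzinfo=timezone.utc)
touched = [
    datetime(2024, 1, 11, 12, 0, tzinfo=timezone.utc),  # mapped at edit time: 0 days
    datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),   # mapped ten days earlier
]
print(average_age_days(touched, edit_time))  # prints 5.0
```

Comparing keys only keeps the sketch short. A stricter variant could compare full key=value pairs, which would also catch unusual values on common keys.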

Public profiles on “How did you contribute to OSM?”

The web page How did you contribute to OpenStreetMap? (HDYC) provides detailed information about individual project members. Some time ago, the page was revised so that member profiles can only be accessed by users who are logged in with their OpenStreetMap (OSM) user account. This feature was implemented after a long and important discussion about “protecting user privacy in the OSM project”. The complete German discussion can be found here. However, I don’t want to continue the discussion here. I still hold the view that any information that is available about contributors should not be hidden in project data dumps, APIs or on webpages. In my opinion, information such as contributor names or ids and modification timestamps is essential for quality analysis and assessments that protect the project against e.g. vandalism or unintended map edits.

Processing compressed OpenStreetMap Data with Java

This blog post contains a summary of how you can write your own Java classes to process OpenStreetMap (OSM) pbf files. PBF (Protocolbuffer Binary Format) is a compact binary format, which is nowadays more or less the standard for reading and writing OSM data quickly. In the OSM world, many tools and programs have implemented this file format (you can find additional information here). Still, I think the following samples for reading and writing such compressed OSM data can be very helpful, in particular if someone has to create some sort of test data or has to read specific mapped objects of interest for her/his own project. The well-known Java tool Osmosis (a command line application for processing OSM data) provides several libraries that are the basis for this brief tutorial.

Step 1: Maven is the key – If your Java project is already managed by Maven, you can just add the following lines to your pom.xml. It downloads and adds the required jar-files that are needed to process compressed OSM data to your project.

Review requests of OpenStreetMap contributors
– How you can assist! –

The latest version of the OpenStreetMap editor iD has a new feature: “Allow user to request feedback when saving”. This idea was mentioned in 2016 in a diary post by Joost Schouppe about “Building local mapping communities” (at that time: “#pleasereview”). The blog post also contains some other good thoughts and is definitely worth reading.

However, based on the newly implemented feature, any contributor can flag her/his changeset and ask for feedback. Now it’s your turn! How can you find and support those OSM’ers?

  • Step 1: Based on the “Find Suspicious OpenStreetMap Changesets” page you can search for flagged changesets, e.g. limited to your country only: Germany or UK.
  • Step 2: Leave a changeset comment in which you, for example, welcome the contributor and (if necessary) give her/him some feedback about the map changes. You could also add some additional information, such as links to wiki pages of tags (map features), good mapping practices, the OSM forum, OSM help or mailing lists. The changeset comment also shows other contributors that the original contributor of this changeset has already been given some feedback.

Who is commenting?
An Overview about OSM Changeset Discussions

As mentioned in my previous blog post about detecting vandalism in OpenStreetMap (OSM) edits, it’s highly recommended that contributors use public changeset discussions when contacting other mappers regarding their edits. This feature was introduced at the end of 2014 and is used widely by contributors today. Each and every comment is listed publicly and every contributor can read the communication and, if necessary, add further comments or thoughts. In most cases where questions about a specific map edit come up, it is desirable that contributors take this route of communication instead of private messaging each other.

For my presentation at the German FOSSGIS & OpenStreetMap conference I created several statistics about the aforementioned changeset discussion feature. For this blog post I reran all analyses and created some new charts and statistics. Let’s start with the first image (above): It shows the number of commented or discussed changesets per month since the feature’s introduction. The peak in January 2017 is due to a revert involving several thousand changesets.