Wait, someone did what?
Exploring Reverted Map Edits in OpenStreetMap

The OpenStreetMap (OSM) project has over 10 million registered members, with around 2 million user profiles having made at least one map contribution. However, a closer look reveals that there has been a slight decline in the number of active contributors over the last three years. Despite the extensive global mapping community, there are instances where individuals or automated bots disregard the consensus norms of the community when editing data. These situations arise due to disagreements regarding the appropriateness of certain tagging or features within the OSM database. To address these issues, a change rollback process, commonly referred to as reverting, is used to combat vandalism and correct ‘mistakes’ by restoring a previous version of the data.

Two years ago, I added additional statistics to the “How did you contribute to OSM?” page for quality assurance purposes. The numbers for each contributor profile were derived from an analysis of the full history OSM planet dump and changeset tags, including the specific editor used. While this pragmatic approach provides valuable insights, it’s important to acknowledge that the obtained numbers are estimations rather than exact figures. Furthermore, I received several inquiries regarding the implementation of the processing involved in identifying the displayed “reverted changes”.

Over the past few weeks, I have developed an advanced processing pipeline. This involved revisiting the comprehensive OSM planet dump and examining the evolution of each entity (node, way, relation) in relation to its previous states. Specifically, an entity with a higher version number was identified as a revert if it had the same latitude/longitude coordinates (for nodes), tags (key-value pairs), and/or members (for relations) as a previous version. In simpler terms, if a mapper changed “X” to “Y” and another mapper subsequently altered it back to “X”, it would be counted as a revert.
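The rule above can be sketched in a few lines of Java. This is only a minimal illustration, not the actual pipeline: the `RevertDetector` class and the `EntityVersion` record are hypothetical names, the sketch assumes that coordinates, tags and members must all match for two versions to count as the same state, and it additionally assumes that a version which merely repeats its direct predecessor is not a revert.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Sketch: flags a version as a revert if it restores the state of an older version. */
final class RevertDetector {

    /** Hypothetical, simplified snapshot of one version of an OSM entity. */
    record EntityVersion(int version, Double lat, Double lon,
                         Map<String, String> tags, List<Long> members) {

        /** Two versions share a state if coordinates, tags and members are equal. */
        boolean sameStateAs(EntityVersion other) {
            return Objects.equals(lat, other.lat)
                    && Objects.equals(lon, other.lon)
                    && tags.equals(other.tags)
                    && members.equals(other.members);
        }
    }

    /**
     * Returns the version numbers that restore an earlier state.
     * The history must be sorted by ascending version number.
     */
    static List<Integer> findReverts(List<EntityVersion> history) {
        List<Integer> reverts = new ArrayList<>();
        // A revert needs at least one differing version in between, so start at index 2.
        for (int i = 2; i < history.size(); i++) {
            EntityVersion current = history.get(i);
            // Assumption: a version identical to its direct predecessor is a no-op, not a revert.
            if (current.sameStateAs(history.get(i - 1))) {
                continue;
            }
            // Look further back for an older version with the same state.
            for (int j = i - 2; j >= 0; j--) {
                if (current.sameStateAs(history.get(j))) {
                    reverts.add(current.version());
                    break;
                }
            }
        }
        return reverts;
    }
}
```

For a history “X”, “Y”, “X” the third version is reported, which corresponds to the example in the paragraph above. Nodes would carry coordinates and leave the member list empty, while ways and relations would leave the coordinates as `null`.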

The following graph illustrates the number of reverted map edits, changesets, and affected contributors per month. This visualisation offers some initial insights into the scope and impact of reverted changes. It’s important to note that these numbers don’t include any actions related to reverted data imports.

 

It is also necessary to look more closely at the specific entities that are counted as “reverted”. Are they primarily nodes with a few tags, or are they ways and relations with extensive mapping histories in active areas? These specific aspects, among others, will be explored in an upcoming blog post or possibly published as part of a scientific research study.

What are your thoughts? I think many of you might be curious to discover whether your own map entities have been reverted. Please feel free to leave a comment on my OSM diary page, where I have cross-posted this article.

Open Data from the Bundesamt für Kartographie und Geodäsie vs. Crowdsourced OpenStreetMap in Germany – A Comparison of Open Data

After almost 1,000 days of abstinence, (finally?) another blog post from me. Because of its subject and regional focus, this one was originally written in German.

Preamble – In autumn 2020, an Open Data working group was formed at FOSSGIS e.V. Through various joint activities of the working group and the Bundesamt für Kartographie und Geodäsie (BKG, the German Federal Agency for Cartography and Geodesy), such as a workshop, two datasets with the locations of state police stations (Landespolizei) and public health offices (Gesundheitsämter) were released in early December 2020 for the “maintenance and extension of the OpenStreetMap database”. The BKG also offers other interesting “Open Data” geodata and web services, but their licence terms do not allow the OpenStreetMap (OSM) project to use them.

An “official open” dataset from a federal agency? Fine, but how does it compare with collaboratively collected data such as OpenStreetMap? Can differences in quality be identified? Are the datasets perhaps on a par, are there serious differences, or what could everyone benefit from?

To answer at least some of these questions, the obvious step is a classic quality analysis of the two datasets. One interesting question: which dataset is the reference source? Is it the BKG dataset, or by now perhaps the OSM dataset? In nearly all quality studies I know of, the “official” dataset is taken as the reference, so the following analysis does the same.

What was the method? The two BKG datasets examined here were obtained via GitHub. The OSM elements for the comparison were extracted for Germany from a current planet file with osmium (many thanks to Jochen as maintainer of this super fast tool, and to its supporters). The actual quality analysis examined the following characteristics: completeness, logical consistency, positional accuracy, temporal accuracy and thematic accuracy. Various Java classes were used, most of which can be found on my GitHub.

What do the individual results of the comparison look like in detail? Let’s start with the completeness of the two datasets:

  • Number of Landespolizei objects from the BKG: 4,257
  • Number of amenity=police objects in OSM: 3,871

At first glance, the BKG dataset thus contains around 10% more locations than were mapped in OSM on 3 February 2022. The peculiarity lies in the OSM element type and tagging used, which led to deviations in the completeness results in the first version of this blog post.

Here the “official” BKG dataset has around 35% more objects than can quickly be found in OSM.

Logical consistency can be checked in several ways. In my example, the BKG and OSM datasets were each checked against themselves for the presence of attributes. That is: the BKG Landespolizei dataset has 11 attributes and the health offices dataset has 12. For the Landespolizei, all attributes except Telefax (73%) and E_Mail (52%) are filled in for at least 97% of the objects. For the BKG health offices, all attributes except Telefax (80%) and E_Mail (90%) are filled in for at least 99%. OSM looks different. Comparable properties, i.e. tags (key-value pairs), are present with a value on the police locations in OSM as follows: name (86%), addr:street/housenumber/postcode/city (approx. 63%), phone (27%) and fax (8%). The OSM health offices look similar: name (100%), addr:street/housenumber/postcode/city (approx. 78%), phone (14%) and fax (7%) are filled in.

To compare positional accuracy, a buffer with a radius of 500 m was placed around each BKG Landespolizei or health office location, and comparable objects in OSM were searched for within it. For 87% of the BKG Landespolizei locations, a police element exists in the OSM dataset within that radius. For the health offices, an OSM entry is found for 44%.
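The 500 m buffer search can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the analysis: the class `BufferMatch` and the record `Location` are hypothetical names, the distance is computed with the haversine formula on a spherical Earth, and every reference location is compared against every candidate without a spatial index.

```java
import java.util.List;

/** Sketch: share of reference locations that have a candidate within a given radius. */
final class BufferMatch {

    /** Mean Earth radius in metres (spherical approximation). */
    private static final double EARTH_RADIUS_M = 6_371_000.0;

    /** Hypothetical minimal location: a name plus WGS84 coordinates in degrees. */
    record Location(String name, double lat, double lon) {}

    /** Great-circle distance in metres between two locations (haversine formula). */
    static double distanceMeters(Location a, Location b) {
        double dLat = Math.toRadians(b.lat() - a.lat());
        double dLon = Math.toRadians(b.lon() - a.lon());
        double h = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(a.lat())) * Math.cos(Math.toRadians(b.lat()))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(h));
    }

    /** Fraction (0..1) of reference locations with at least one candidate inside the radius. */
    static double matchedShare(List<Location> reference, List<Location> candidates, double radiusM) {
        if (reference.isEmpty()) {
            return 0.0;
        }
        long matched = reference.stream()
                .filter(r -> candidates.stream().anyMatch(c -> distanceMeters(r, c) <= radiusM))
                .count();
        return (double) matched / reference.size();
    }
}
```

Calling `matchedShare(bkgLocations, osmLocations, 500.0)` would return a value such as 0.87 for a result like the police figure above. The brute-force comparison is quadratic, which is fine for a few thousand points but would need a spatial index for larger datasets.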

Thematic accuracy was checked with a minimalist approach only: the names of the objects linked through the positional accuracy check were compared with each other. Only 25% (health offices) and 32% (Landespolizei) of the names match exactly between the BKG and OSM datasets. This quality characteristic could, or should, be examined more thoroughly.

The BKG datasets were published in 2021. In OpenStreetMap, the timestamp of an element’s last change is usually used as the measure of currency, i.e. temporal accuracy.

Additional info: the contributors to the OSM project – In OpenStreetMap, at least 1,428 different members have worked on the police location data in total. For the health offices, at least 120 people edited or added to the elements in some form (position or attributes).

Short summary, or what is the point of this “comparison”? This blog post makes no claim to be correct or complete. It nevertheless shows that, besides quantity (see completeness), attention should apparently be paid above all to the attributes, i.e. the details contained in the individual OSM entries. Which approach has become established in OSM in the past? At least in Germany, and not only in my opinion, no more data imports should take place. Instead, as already successfully implemented and practised in some cities and countries, a kind of data reconciliation could be offered that lets interested and committed contributors compare the individual entries.

Released datasets such as those from the BKG are excellent for checking and/or extending the data collected by the OpenStreetMap project. And to have mentioned it here as well: not only collaboratively collected data but also official data can contain errors or deviations. For that reason, such data or information should, where possible, not be copied into OSM uncritically.

PS: This blog post does not claim to be a scientific study; it simply came about on a whim on a Sunday morning over an espresso. I hope it still offered a few interesting insights for you.

#100 – Thank you!

While I was working on my latest blog post, I realized that I had already written 100 posts over the past nine years. All posts have one thing in common: They are about the well-known and maybe never-ending OpenStreetMap project. From time to time there are still emerging questions or issues which must be tackled by someone. This has always fascinated me about OSM. However, this particular number 100 is not about a specific subject, it’s just a tiny post to say thank you! Thank you for your continuous interest in reading, commenting and of course sometimes criticizing my work. To me it’s still awesome to see that a few thousand of you use tools or services that I implemented on a daily basis.

It’s still incredible that many people (not all) spend their spare time contributing to the project, not only as spatial data contributors but also as software engineers, system admins or coordinators of workshops, conferences or mapping events or by just validating or reviewing the latest map changes. Some of my webpages wouldn’t be as successful without your feedback. So, thanks again! Finally, I would like to thank all the people whom I have met during the different meet-ups, such as FOSSGIS, OSM hack weekends etc. over the past couple of years. There have always been friendly, respectful and useful chats: It’s always a pleasure.

Thanks to maɪˈæmɪ Dennis.

New metric for measuring the “qualitative nature” of OpenStreetMap activities @ How did you contribute?

Back in June we had a Twitter chat about potential new features for the “How did you contribute to OpenStreetMap” (HDYC) website. One suggestion was to “show more relevant information about skills, tagging system or the quality of contributions” of a project member (by J-Louis). Overall I really like the following summary by Claudius: “HDYC started off with a strong focus on quantitative metrics and you expanded it lately a lot to reflect the qualitative nature of contributions. I think there’s value to show more about which area of data someone contributed: Auto/bike/railway/water infrastructure, amenities…”.

So I finally started searching in the OpenStreetMap (OSM) wiki for any feasible information about “groups of tags” or “tag categories”. Altogether, I couldn’t discover any solution that fits perfectly to determine the areas of data a mapper contributed in. However, later I got a hint from the JOSM developers to use the presets of the well-known and popular editor. You may ask, ‘What are presets?’ “Presets in JOSM are menu-driven shortcuts to tag common object types in OpenStreetMap. They provide you with a user friendly interface to edit one or more objects at a time, suggest additional keys and values you may wish to add to those objects, and most importantly, prevent you from having to enter keys and values by hand.” You can find many different presets at the aforementioned JOSM page. However, during my data processing I utilized the “default presets”. The XML file contains many combinations of popular or established tag combinations, which contributors use when they are mapping.

So far so good. As a first step I released a new version of “Find Suspicious OpenStreetMap Changesets”. It shows the utilized presets for each changeset. This can already indicate some quality aspects such as attribute (tag) accuracy or completeness. Now, after some weeks and some minor adjustments, I started to use this collected information about applied presets to expand the metrics of a mapper’s profile. The HDYC page now also lists which presets the mapper recently utilized during her/his contributions such as adding, modifying or removing map elements. I think this is a really useful next step towards better quality assurance, which the OSM project urgently needs.

Some technical details: The database behind the “Find Suspicious OpenStreetMap Changesets” webpage uses the augmented diff files of the Overpass API. The utilized “default” preset list of the JOSM editor can be found here (Internal Preset list). The entire processing tool was developed in Java and uses a Postgres database to store the results. For now, only presets used during the past 60 days of the contributor’s activity are shown.
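To illustrate how tags could be mapped to presets, here is a small sketch. It is not the actual processing tool: the class `PresetMatcher` is a hypothetical name, presets are reduced to a name plus a set of required key-value pairs, and the rule that a preset applies when all of its pairs are present on the element is an assumption. The real JOSM preset XML is considerably richer than this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Sketch: finds the presets whose required tags are all present on a map element. */
final class PresetMatcher {

    /**
     * @param elementTags tags of the edited element (key -> value)
     * @param presets     preset name -> required tags (assumed, simplified structure)
     * @return names of all presets whose required tags the element carries
     */
    static List<String> matchingPresets(Map<String, String> elementTags,
                                        Map<String, Map<String, String>> presets) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> preset : presets.entrySet()) {
            // Assumption: a preset applies if every required key has exactly the required value.
            boolean allPresent = preset.getValue().entrySet().stream()
                    .allMatch(tag -> tag.getValue().equals(elementTags.get(tag.getKey())));
            if (allPresent) {
                result.add(preset.getKey());
            }
        }
        return result;
    }
}
```

An element tagged `amenity=police` plus a name would match a preset that requires only `amenity=police`. Counting these matches per changeset or per mapper then gives the kind of preset list described above.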

However, thank you very much for all your feedback. Hope that it helps.

Thanks to maɪˈæmɪ Dennis.

Additional insights about OSM changeset discussions: Who requests, receives and responds?

Last year I wrote two blog posts about the OpenStreetMap (OSM) feature that allows commenting on contributor map changes within a changeset. The first blog post showed some general descriptive statistics about the number of created changeset discussions, affected countries, the origin of the commenting contributors or their mapping reputation. The second post described a newly introduced feature, where contributors can flag their changeset so that their map edits can be reviewed. This blog post follows up on this topic and presents some similar but updated research.

The first chart shows the number of created comments (discussed changesets) and the contributors involved over the last 15 months. The number of created comments and discussed changesets fluctuates over time, whereas the number of contributors who take part in changeset discussions stays consistent at around 1,500 per month. Around 3,200 contributors received a comment on at least one changeset’s map edits a month.

After publishing the aforementioned blog post, people were asking for some numbers that show the commented changeset grouped by the editing application that was utilized. The results show that these numbers stayed more or less the same with 2/3 of all commented changesets (almost 160,000) being edited by the iD editor. This is not very surprising since this particular editor is used by many OSM beginners during first edits. It’s also interesting to see whether the changeset author responded (also grouped by the OSM editor that was used). Overall only around 32,000 contributors responded to their changeset comment. You can find some additional charts about the comments per discussed changeset in the previous blog post. Again, the majority (around 71%) of the changeset discussions contain one comment only.

Since last August, contributors can mark their changeset with a flag for “review_requested”. After a few months I think it’s time for a first look at the numbers. The following charts display the number of requested reviews by contributors and their marked changesets. First of all, in almost every month around 7,000 contributors asked for at least one review. Overall almost 36,000 changesets have been marked for review each month. If we take a close look and filter changesets by hashtags, we can see that sometimes large numbers of the changesets are contributed by #HOTOSM or #MissingMaps members.

The following diagram shows probably the most disappointing results: The number of requested reviews that actually have been reviewed in the end. No matter if the changeset has the #HOTOSM or #MissingMaps tags or not, the relative value of reviewed changesets lies only between 6 and 18%. To be honest, I’m also a bit surprised that only a few of #HOTOSM or #MissingMaps changesets have been reviewed so far.

So, what do you think? Do you review contributions without commenting on the changesets? Do we need more attention here or is it just boring to look after changesets which are marked for review? I think it’s obvious that we need more contributors who review map changes or at least “document” their work. But can we handle this? Or do we need better tools?

Thanks to maɪˈæmɪ Dennis.

Adding Indicators to OSM Map Edits Assessment

Almost two years ago I published a web service that finds suspicious OpenStreetMap (OSM) map changes. You can use the service here and find some more information in previous blog posts. Changeset discussions in particular have turned out to be more or less the de facto standard for communication between contributors during map change reviews.

However, when I am inspecting map changes, I sometimes see new contributors using uncommon OSM tags. Therefore I think it could be useful to add an additional assessment parameter to the aforementioned suspicious OSM map changes page. The newly introduced indicator states the matching ratio between the contributed and the most popular OSM tags. This means that if the changeset contributor used many uncommon tags on the changed objects, the matching rate will be low. If the contributor applied many common (“popular”) tags, it results in a high matching rate, up to 100%. For the calculation I used Jochen Topf’s taginfo API to get commonly used OSM tags. An API description can be found here. Furthermore I added the average age (in days) of modified and deleted objects. This indicator can be used to see if the contributor edited objects which have been mapped today (0 days) or which have already existed for a longer period of time, e.g. 1566 days. The values for the average version numbers are computed in a similar fashion.
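Both indicators can be sketched in a few lines. This is a minimal illustration and not the service’s code: `ChangesetIndicators` is a hypothetical class, the set of popular tags is passed in directly instead of being fetched from taginfo, tags are represented as `key=value` strings, and “age” is assumed to mean the number of days between an object’s creation and the edit.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.List;
import java.util.Set;

/** Sketch of two changeset indicators: tag matching rate and average object age. */
final class ChangesetIndicators {

    /** Percentage (0 to 100) of contributed tags that are contained in the popular-tag set. */
    static double tagMatchingRate(List<String> contributedTags, Set<String> popularTags) {
        if (contributedTags.isEmpty()) {
            return 0.0;
        }
        long hits = contributedTags.stream().filter(popularTags::contains).count();
        return 100.0 * hits / contributedTags.size();
    }

    /**
     * Average age in days of the touched objects.
     * Assumption: age is measured from each object's creation date to the edit date.
     */
    static double averageAgeDays(List<LocalDate> createdOn, LocalDate editedOn) {
        return createdOn.stream()
                .mapToLong(created -> ChronoUnit.DAYS.between(created, editedOn))
                .average()
                .orElse(0.0);
    }
}
```

With two contributed tags of which one is in the popular set, `tagMatchingRate` yields 50%. The average version number mentioned above could be computed with the same averaging pattern.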

Last but not least, the number of the affected contributors of the changeset is calculated. If a contributor only changes objects on which she or he is the latest modifier, this number will be ‘0’. Otherwise the value represents the number of unique mappers whose contributions have been changed. I hope that overall the newly added indicators can be useful for identifying changesets which need a closer look. The suspicious OSM map changes website has also received some style updates. They should help to highlight the most important parameters. I also added the aggregation of the latest changesets for a specific contributor. Guess this could be really useful to see a “big picture” of the individual mapping activities.
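The count of affected contributors described above boils down to a distinct count. The following sketch assumes that for every changed object the user id of its previous modifier is known; the class name `AffectedContributors` and the method signature are hypothetical.

```java
import java.util.List;

/** Sketch: number of unique mappers whose objects were changed by a changeset. */
final class AffectedContributors {

    /**
     * @param authorId            user id of the changeset author
     * @param previousModifierIds user id of the previous modifier of each changed object
     * @return number of distinct previous modifiers other than the author
     */
    static int count(long authorId, List<Long> previousModifierIds) {
        return (int) previousModifierIds.stream()
                .filter(id -> id != authorId) // own objects do not count as "affected"
                .distinct()
                .count();
    }
}
```

If the author only touched objects that she or he last modified, the result is 0, as stated in the paragraph above.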

The aforementioned service is online here –> “Find Suspicious OpenStreetMap Changesets”

Thanks to maɪˈæmɪ Dennis.

Public profiles on “How did you contribute to OSM?”

The web page How did you contribute to OpenStreetMap? (HDYC) provides individual detailed information about project members. Some time ago, the page was revised so that member profiles can only be accessed when users are logged in with their OpenStreetMap (OSM) user account. This feature was implemented after a long and important discussion about “protecting user privacy in the OSM project”. The complete German discussion can be found here. However, I don’t want to continue the discussion here. I still hold that any information that is available about contributors should not be hidden in project data dumps, APIs or on webpages. In my opinion, information such as contributor names or ids and modification timestamps is essential for doing quality analysis and assessments to protect the project against e.g. vandalism or unintended map edits.

Anyway, after the last modification, which required the mentioned user login on HDYC, I got positive and also negative feedback. Most negative feedback concerned the fact that profiles are now hidden and not public anymore. But because contributors want to show their mapping efforts, I implemented a new feature that allows profiles to be accessed without a user login on HDYC. So, if you want anyone to be able to see your OSM profile, just add a link to your HDYC profile on your OSM profile page, similar to what you may already have done for your OSM-related accounts (see blog post). The tool-chain checks the profiles of every contributor who has been active within the last 24 hours.

Additionally, the HDYC web page got several small updates. The overall ranking has been switched to more meaningful recent country rankings. The “last modifier of” amounts have been temporarily removed/replaced by detailed numbers of created and modified way elements. The changeset table now also contains some really useful hints about the words and hashtags used in the changeset comments and how often they occur. This feature was requested by a German contributor, thanks “!i!”. Most of the displayed numbers should be updated on an hourly basis. Only the activity areas and information about changesets are updated just every 24 hours. Some numbers also contain links to further statistics such as detailed information about recent changesets, ranking lists of a country and commented or discussed changesets. Overall I tried to highlight further efforts and activities, such as changeset discussions, related accounts or roles, and not “only” raw mapping element amounts.

Thanks to maɪˈæmɪ Dennis.

Processing compressed OpenStreetMap Data with Java

This blog post contains a summary on how you can write your own Java classes to process OpenStreetMap (OSM) pbf files. PBF (Protocolbuffer Binary Format) is a compact binary format, which is nowadays more or less the standard for reading and writing OSM data quickly. In the OSM world, many tools and programs have implemented this file format (you can find additional information here). However, I think the following samples for reading and writing such compressed OSM data can be very helpful, in particular if someone has to create some sort of test data or has to read some specific mapped objects of interest for her/his own project. The well-known Java Osmosis tool (a command line application for processing OSM data) provides several libraries that are the basis for this brief tutorial.

Step 1: Maven is the key – If your Java project is already managed by Maven, you can just add the following lines to your pom.xml. It downloads and adds the required jar-files that are needed to process compressed OSM data to your project.

<dependency>
	<groupId>org.openstreetmap.osmosis</groupId>
	<artifactId>osmosis-pbf</artifactId>
	<version>0.46</version>
</dependency>

If you don’t use Maven, you can download the aforementioned dependencies here and add the jars as described here to your Eclipse build path.

Step 2: Implementing the Sink interface for reading OSM data – Now, after you added the required libraries to your project, you can create your own class to read compressed OSM data. The following MyOsmReader class is an easy example for that scenario: It reads an OSM pbf file and prints the ids of all ‘ways’ with a ‘highway’ key.

package org.neis_one.osm.examples;

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.util.Map;

import org.openstreetmap.osmosis.core.container.v0_6.EntityContainer;
import org.openstreetmap.osmosis.core.container.v0_6.NodeContainer;
import org.openstreetmap.osmosis.core.container.v0_6.RelationContainer;
import org.openstreetmap.osmosis.core.container.v0_6.WayContainer;
import org.openstreetmap.osmosis.core.domain.v0_6.Tag;
import org.openstreetmap.osmosis.core.domain.v0_6.Way;
import org.openstreetmap.osmosis.core.task.v0_6.Sink;

import crosby.binary.osmosis.OsmosisReader;

/**
 * Receives data from the Osmosis pipeline and prints ways which have the
 * 'highway' key.
 * 
 * @author pa5cal
 */
public class MyOsmReader implements Sink {

	@Override
	public void initialize(Map<String, Object> arg0) {
	}

	@Override
	public void process(EntityContainer entityContainer) {
		if (entityContainer instanceof NodeContainer) {
			// Nothing to do here
		} else if (entityContainer instanceof WayContainer) {
			Way myWay = ((WayContainer) entityContainer).getEntity();
			for (Tag myTag : myWay.getTags()) {
				// OSM keys are case-sensitive, so compare exactly.
				if ("highway".equals(myTag.getKey())) {
					System.out.println(" Woha, it's a highway: " + myWay.getId());
					break;
				}
			}
		} else if (entityContainer instanceof RelationContainer) {
			// Nothing to do here
		} else {
			System.out.println("Unknown Entity!");
		}
	}

	@Override
	public void complete() {
	}

	@Override
	public void close() {
	}

	public static void main(String[] args) throws FileNotFoundException {
		InputStream inputStream = new FileInputStream("/Path/To/Your/read.osm.pbf");
		OsmosisReader reader = new OsmosisReader(inputStream);
		reader.setSink(new MyOsmReader());
		reader.run();
	}
}

Step 3: Implementing the Source interface for writing OSM data – Similarly to Step 2, where an interface was implemented to read an OSM file, you will have to implement an interface to write data as well. This time the Osmosis core task Source interface is used. The following example class writes 10 Nodes to a new OSM pbf file.

package org.neis_one.osm.examples;

import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.util.Date;

import org.openstreetmap.osmosis.core.container.v0_6.NodeContainer;
import org.openstreetmap.osmosis.core.domain.v0_6.CommonEntityData;
import org.openstreetmap.osmosis.core.domain.v0_6.Node;
import org.openstreetmap.osmosis.core.domain.v0_6.OsmUser;
import org.openstreetmap.osmosis.core.task.v0_6.Sink;
import org.openstreetmap.osmosis.core.task.v0_6.Source;
import org.openstreetmap.osmosis.osmbinary.file.BlockOutputStream;

import crosby.binary.osmosis.OsmosisSerializer;

/**
 * Writes OSM data to the output task.
 * 
 * @author pa5cal
 */
public class MyOsmWriter implements Source {

	private Sink sink;

	@Override
	public void setSink(Sink sink) {
		this.sink = sink;
	}

	public void write() {
		for (int idx = 1; idx <= 10; idx++) {
			sink.process(new NodeContainer(new Node(createEntity(idx), 0, 0)));
		}
	}

	public void complete() {
		sink.complete();
	}

	private CommonEntityData createEntity(int idx) {
		return new CommonEntityData(idx, 1, new Date(), new OsmUser(idx, "User"), idx);
	}

	public static void main(String[] args) throws FileNotFoundException {
		OutputStream outputStream = new FileOutputStream("/Path/To/Your/write.osm.pbf");
		MyOsmWriter writer = new MyOsmWriter();
		writer.setSink(new OsmosisSerializer(new BlockOutputStream(outputStream)));
		writer.write();
		writer.complete();
	}
}

Step 4: That’s it – You should now be able to write your own classes that allow you to process compressed OSM data.

One last thing: When writing your own Java classes to process OSM data, you should always keep in mind that the PBF file has its own ordering of the OSM objects: Each file first contains Nodes, then Ways and at the end Relations. In addition, a Way element only contains references to Node ids. It doesn’t contain the coordinates or any tags of the actual nodes that are used for the Way. Thus, if you want a complete Way element with its geometry, you have two options: Store all Nodes in some kind of map or read the entire OSM file twice. Furthermore, if you’re interested in additional compression options when writing a compressed OSM file, you should have a look at this class.
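The first option, keeping all Nodes in a map, can be sketched as follows. To stay self-contained the sketch uses plain Java types instead of the Osmosis classes: `WayGeometryResolver` and `Coordinate` are hypothetical names. In a Sink like the one from Step 2 you would presumably call `addNode` for every Node entity and `resolve` with the node ids of each Way.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: caches node coordinates so that way geometries can be resolved later. */
final class WayGeometryResolver {

    /** Hypothetical minimal coordinate pair in degrees. */
    record Coordinate(double lat, double lon) {}

    private final Map<Long, Coordinate> nodeCache = new HashMap<>();

    /** Call for every node; in a sorted PBF file all nodes arrive before the first way. */
    void addNode(long id, double lat, double lon) {
        nodeCache.put(id, new Coordinate(lat, lon));
    }

    /** Resolves the node references of a way; references to unknown nodes are skipped. */
    List<Coordinate> resolve(List<Long> nodeRefs) {
        List<Coordinate> geometry = new ArrayList<>();
        for (long ref : nodeRefs) {
            Coordinate coordinate = nodeCache.get(ref);
            if (coordinate != null) {
                geometry.add(coordinate);
            }
        }
        return geometry;
    }
}
```

A plain `HashMap` is convenient for small extracts, but it can need a lot of memory for country-sized or larger files. In that case reading the file twice, and caching only the nodes that the ways of interest reference, may be the better trade-off.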

Thanks to maɪˈæmɪ Dennis.

Review requests of OpenStreetMap contributors
– How you can assist! –

The latest version of the OpenStreetMap editor iD has a new feature: “Allow user to request feedback when saving”. This idea was mentioned in a diary post by Joost Schouppe about “Building local mapping communities” (at that time: “#pleasereview”) in 2016. The blog post also contains some other good thoughts and is definitely worth reading.

However, based on the newly implemented feature, any contributor can flag her/his changeset and ask for feedback. Now it’s your turn! How can you find and support those OSM’ers?

  • Step 1: Based on the “Find Suspicious OpenStreetMap Changesets” page you can search for flagged changesets, e.g. limited to your country only: Germany or UK.
  • Step 2: Leave a changeset comment where you e.g. welcome the contributor and (if necessary) give her/him some feedback about the map changes. You could also add some additional information, such as links to wiki pages of tags (map features), good mapping practices, the OSM forum, OSM help or mailing lists. Based on the changeset comment other contributors can see that the original contributor of this changeset already has been provided with some feedback.
  • Step 3: Finally you could create & save a feed URL of your changeset’s search. That’s it.

Personally, I really like this new feature. It provides an easy way to search for contributors who are asking for feedback about their map edits. Thanks to all iD developers for implementing this idea. What do you think? Should I add an extra score to “How did you contribute to OpenStreetMap” where every answer to a changeset with requested feedback will be counted?

Some statistics? There you go: “OSM Changesets of the last 30 Days”

Thanks to maɪˈæmɪ Dennis.

Who is commenting?
An Overview about OSM Changeset Discussions

As mentioned in my previous blog post about detecting vandalism in OpenStreetMap (OSM) edits, it’s highly recommended that contributors use public changeset discussions when contacting other mappers regarding their edits. This feature was introduced at the end of 2014 and is used widely by contributors today. Each and every comment is listed publicly and every contributor can read the communication and, if necessary, add further comments or thoughts. In most cases where questions about a specific map edit come up, it is desirable that contributors take this route of communication instead of private messaging each other.

For my presentation at the German FOSSGIS & OpenStreetMap conference I created several statistics about the aforementioned changeset discussion feature. For this blog post I reran all analyses and created some new charts and statistics. Let’s start with the first image (above): It shows the number of commented or discussed changesets per month since the feature’s introduction. The peak in January 2017 is based on a revert with several thousand changesets.

In total, more than 92,000 changesets have been discussed in the past few years with around 151,000 comments. All comments were created by almost 14,000 different contributors. So far most changesets were commented in Germany, the United States, Russia and the UK, as you can see in the following images. This correlates to some extent, with the exception of Kazakhstan, with the number of active contributors for each country (see e.g. OSMstats for active contributors). As shown on the right side, many changesets (71%) only received one comment or discussion. This means, in most cases the commented changeset did not receive a response by the owner/contributor of the changeset.

Which changesets are discussed and who creates comments? I think it’s not surprising to see that most changesets by new contributors receive a comment. However, as the following charts show, there are also changesets by long-time contributors that have some discussions. It’s also quite interesting to see that all kinds of contributors (new and long-time) created discussions. I would have expected a trend towards contributors with a higher number of mapping days.

What is the origin of the contributor who created the comment? Again, not surprisingly, this correlates with the number of active OSM contributors per country as mentioned above. A contributor’s origin is determined by her/his main activity areas, which you can see on “How did you contribute to OpenStreetMap?”.

Some additional numbers about the text content of the changeset discussions: Roughly 22% of the changeset comments contain the word “revert”. On the other side, more than 17% include some sort of “Welcome”, “Willkommen”, “Hello”, “Ciao”, “Hola”, “Bonjour”, “nǐ hǎo!” or “привет!” text. The following image shows a word cloud of the most used words in the changeset discussions:

The last chart shows the cumulative numbers of changeset discussion contributors and comments. Almost 63% of all discussion comments have been created by around 2% of the contributors. However, I assume this looks very similar to other long tails of OpenStreetMap contribution charts. What do you think?

Want to see the latest OSM discussions in your area or country? Check this webpage.

Thanks to maɪˈæmɪ Dennis.