Over the past few months, I have been experimenting with various open-source language models. I often use Ollama for this, as it is easy to install and can be integrated well into custom workflows. As part of several quality analysis and quality assurance ideas around the OpenStreetMap (OSM) project, I wanted to explore whether large language models could also be helpful for analysing OSM Notes. OSM Notes allow contributors and users to report errors, missing information, or other issues in OSM. The vast majority of these notes are factual, constructive, and helpful. In a very small number of cases, however, notes may contain insulting, harmful, or otherwise problematic language. In other cases, notes may include personal information such as names or phone numbers. I was therefore interested in whether such cases could be detected automatically, how reliable such an approach might be, and where its limitations are.

Approach
For quite some time, I have been running various statistics and analyses around OSM Notes. Among other things, I try to assign notes to different categories. For this experiment, I added another internal category: “Potentially problematic language”. My internal pipeline checks new notes and marks them if the text may contain insulting, harmful, or otherwise problematic content. This marking is explicitly intended as an internal signal, not as a final assessment or moderation decision. Based on initial tests, I also set up a bot that reacted to selected notes in certain cases. All affected notes, meaning notes that were commented on or closed, were reviewed manually by me in parallel. The aim of the experiment was to better understand the possible use of LLMs in this context and to evaluate the results systematically.
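To make the marking step more concrete, here is a minimal sketch of how a single note could be checked with a locally running Ollama instance. It is not the actual pipeline: the model name, the prompt wording, and the helper names (`build_payload`, `parse_verdict`, `classify_note`) are assumptions for illustration. Only the `/api/generate` endpoint with `stream` and `format` set as shown is part of Ollama's documented REST API.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint
MODEL = "llama3.1"  # assumption: any model that has been pulled locally

# Hypothetical prompt; the wording used in the real pipeline is not published here.
PROMPT = (
    "You review OpenStreetMap notes. Decide whether the note text contains "
    "insulting, harmful, or otherwise problematic language. "
    'Answer only with JSON: {"flag": true or false, "reason": "<short reason>"}.\n\n'
    "Note text:\n"
)


def build_payload(note_text: str) -> dict:
    """Build the request body for a single, non-streamed JSON answer."""
    return {
        "model": MODEL,
        "prompt": PROMPT + note_text,
        "stream": False,
        "format": "json",
    }


def parse_verdict(raw: str) -> dict:
    """Parse the model's answer; anything unexpected is left for manual review."""
    unknown = {"flag": None, "reason": "unparseable model output"}
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return unknown
    if not isinstance(data, dict) or not isinstance(data.get("flag"), bool):
        return unknown
    return {"flag": data["flag"], "reason": str(data.get("reason", ""))}


def classify_note(note_text: str) -> dict:
    """Send one note to Ollama and return an internal signal, not a decision."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(note_text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return parse_verdict(body.get("response", ""))
```

A `flag` of `None` means the answer could not be interpreted, so such a note would go to manual review rather than being treated as either problematic or harmless.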

Experiences from the Experiment
The tests showed that automated detection can produce interesting results. In total, well over 500 notes with potentially problematic text content were detected. Many of the notes that were commented on or closed have since been hidden by the Data Working Group (DWG). At the same time, it became clear that intervening in a community process in this way is sensitive and should be communicated transparently. During the experiment, there was critical feedback and some misunderstandings about the purpose, functionality, and scope of my bot. Among other things, there was an impression that the bot could hide or censor notes. This is not the case: I am not an OSM administrator and I am not a member of the DWG. The bot therefore could not hide notes. It could only comment on or close notes, just like any regular OSM user.

Looking back, it would probably have been better to document the experiment more extensively in advance, for example in a blog post, on GitHub, or in the OSM Community Forum. Although the functionality was described on the bot account’s profile page, this was apparently not visible or transparent enough. Following the feedback, I stopped the public bot process some time ago.

Next Step
The internal analysis is still running. I do not want the results to be understood as moderation decisions, but rather as a data analysis and a possible support tool for quality assurance — similar to other analyses on my websites. For this reason, I am providing a webpage that lists notes which may contain insulting, harmful, or otherwise problematic content:

👉 https://resultmaps.neis-one.org/osm-notes-language-review

The list is explicitly intended as a signal and analysis tool. It does not claim to assess every individual case correctly or conclusively. Automated classifications can be wrong: there may be false positives, meaning notes that are incorrectly marked as problematic, and false negatives, meaning problematic notes that are not detected. It is also important to mention that OSM Notes with problematic content may be hidden by the DWG. Therefore, some notes listed in my analysis may no longer be publicly visible on the OSM website.
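One way to put numbers on false positives and false negatives is to compare the automated flag with a manual review of the same notes. The following is a small sketch under that assumption. The function name `review_metrics` and the sample data are made up for illustration and are not results from this experiment. Note that recall can only be estimated if unflagged notes are reviewed as well.

```python
def review_metrics(reviewed):
    """Summarise (flagged_by_model, problematic_per_reviewer) pairs of booleans."""
    tp = fp = fn = tn = 0
    for flagged, problematic in reviewed:
        if flagged and problematic:
            tp += 1  # correctly flagged
        elif flagged and not problematic:
            fp += 1  # false positive: flagged, but harmless
        elif not flagged and problematic:
            fn += 1  # false negative: problematic, but missed
        else:
            tn += 1  # correctly left alone
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall}


# Made-up review sample, purely illustrative.
sample = [(True, True), (True, True), (True, False), (False, True), (False, False)]
metrics = review_metrics(sample)
```

Precision describes how many flagged notes were really problematic, while recall describes how many problematic notes were found at all.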

Feedback on false positives, false negatives, and the general framing of this analysis is welcome.


Comments

2 responses to “Checking OpenStreetMap Notes for Potentially Problematic Language”

  1. Dennis Chen

    The content of this note (https://osm.org/note/5327225) is mis-translated as Mandarin; it is actually Taiwanese Taigi.
    The meaning: the restaurant's boss maliciously went bankrupt and ran away to China, leaving his creditors unpaid.

    FYR:
    https://en.wikipedia.org/wiki/Written_Hokkien#Chinese_characters

    1. Pascal Neis

      Thanks for the info and the additional reference, Dennis

      This is most likely caused by the automatic language detection on my side. Ollama seems to detect some languages very well on its own, probably depending on what the model was trained on, but for other languages or variants it can unfortunately fail.

      I will need to add a fallback or some kind of separate check for Chinese languages/scripts and related variants. I had already noticed similar issues before, but it is difficult for me to verify the translation properly since I do not speak the language myself.
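A separate check of this kind could, for example, look at the script of a note before trusting the detected language. The sketch below is only one possible approach, not the fallback that was actually implemented. It counts Han characters based on a few Unicode blocks and routes such notes to manual review. The function names and the 0.3 threshold are assumptions. A script check cannot tell Mandarin from Taigi (or from Japanese kanji); it only signals that the automatic language label should not be taken at face value.

```python
# Unicode blocks that contain Han characters (not exhaustive).
HAN_RANGES = (
    (0x3400, 0x4DBF),    # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),    # CJK Unified Ideographs
    (0xF900, 0xFAFF),    # CJK Compatibility Ideographs
    (0x20000, 0x2A6DF),  # CJK Unified Ideographs Extension B
)


def is_han(ch: str) -> bool:
    """Return True if the character lies in one of the Han blocks above."""
    code_point = ord(ch)
    return any(low <= code_point <= high for low, high in HAN_RANGES)


def han_ratio(text: str) -> float:
    """Share of Han characters among all letters in the text."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(is_han(ch) for ch in letters) / len(letters)


def needs_language_review(text: str, threshold: float = 0.3) -> bool:
    """Hypothetical rule: send notes with many Han characters to manual review."""
    return han_ratio(text) >= threshold
```

Notes for which `needs_language_review` returns `True` could then be listed without a machine translation, or with the translation clearly marked as unverified.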
