OpenStreetMap Foundation

Licensing Working Group

  OSM Database Rebuild

This is a shopping and wish list for rebuilding the OSM geo-database so that ODbL coverage is 100%. In its current version, it is just a collection point for a number of ongoing threads, intended to bring some of the participants together. Once it is in a more coherent format, we can move it to wiki.openstreetmap.org for wider participation. This document can be read publicly at

https://docs.google.com/document/pub?id=11nadDjFQwPXHSiJ_yRDjErDOLO-h6U18qak8mj1I9R4

Michael Collinson 2011-10-18

It is now time to kick off rebuilding the database to achieve 100% new CTs/ODbL coverage, so that the red squares in http://matt.dev.openstreetmap.org/treemap.png are either turned green or reduced in size and eventually eliminated. This is LWG’s current consensus strategy on how we will proceed:

Premise: Remapping is better than just removing non-ODbL data from the current version; removal should come last (Simon challenges this). So:

  1. Recruit a core volunteer technical team to help out … anyone can join and/or drop out (many of the tasks may be small but very important). Mike
  2. Focus initially on systems and minor changes that can:
     1. Send a second email to everyone who has not yet accepted or declined the new terms. Focus on providing information or links in key languages. Richard is working on this.
     2. Rebuild the OSM database per se. This does not have to be done in one go, but can proceed in a series of steps, starting with whatever has the most beneficial impact for the least amount of work.

Unanswered Legal/Ethical/Procedural Questions

These need to be resolved before technical tools are created.

Technical Wish List

Uncategorised:

Engineering WG - www.openstreetmap.org / API:

http://www.openstreetmap.org/browse/node/*/history

http://www.openstreetmap.org/browse/way/*/history

http://www.openstreetmap.org/browse/relation/*/history

Potlatch:

JOSM:

“licensechange” plugin http://wiki.openstreetmap.org/wiki/JOSM/Plugins/LicenseChange exists.

Merkaartor and other OSM Contributor tools

??

Explicit re-build tools

Specific Proposals

17/08/2011

Hi LWG folks,

Seeing that there is still no decision on the process of rebuilding the database to ODbL-only, and that people are still discussing simply setting all objects back to the edit before the first non-ODbL-compliant edit, I want to propose the following process, which I hope keeps almost everything that can be kept without legal problems (not dealing with splits and merges at the moment; that is quite complex, I fear). It is more or less based on the bachelor thesis by Jakob Altenstein (http://checkout.yourweb.de/thesis/Jakob_Altenstein_Thesis.pdf, in German):

Mark all changesets by agreeing users (and possibly users stating their data is in PD (I know you refused that, but hey, maybe think about it again :p), known bot accounts, etc.) as "good"

Mark all changesets with bot=yes as "good"

Mark all other changesets as "bad"
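
As a rough illustration of the three marking rules above, here is a minimal Python sketch. The `Changeset` class, its field names and the `extra_good_users` parameter (for PD declarers, known bot accounts, etc.) are made up for this example and do not correspond to any existing OSM tool or API.

```python
from dataclasses import dataclass, field


@dataclass
class Changeset:
    # Hypothetical minimal changeset record, for illustration only.
    id: int
    user: str
    tags: dict = field(default_factory=dict)


def is_good(cs, agreeing_users, extra_good_users=frozenset()):
    """Return True if the changeset counts as "good" under the rules above."""
    # Rule 1: changesets by agreeing users (optionally PD declarers, known bots).
    if cs.user in agreeing_users or cs.user in extra_good_users:
        return True
    # Rule 2: changesets tagged bot=yes. Rule 3: everything else is "bad".
    return cs.tags.get("bot") == "yes"
```

Whether PD declarers or known bot accounts belong in `extra_good_users` is exactly the policy question raised above; the sketch only leaves room for it.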

Nodes:

        ->  first changeset bad

            ->  node not member of a way or relation

                ->  delete node

        ->  set current state to empty (no tags, no coordinates)

        ->  for each changeset, starting with the first and in ascending order

            ->  changeset bad

                ->  keep previous tag state

                (->  do tag changes considered not protectable)*1

                ->  changed coordinates

                    ->  set new coordinates to be bad coordinates, keep potential good coordinates

            ->  changeset good

                ->  do all tag changes on the current tag state

                ->  changed coordinates

                    ->  no existing bad coordinates or new coordinates more than 1m*2 away

                        ->  delete bad coordinates if existing

                        ->  set new coordinates to be the good coordinates

                    ->  else

                        ->  set new coordinates to be the bad coordinates

        ->  no good coordinates

            ->  delete node

        ->  set future coordinates to be the good coordinates

        ->  node odbl-compliant and save
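
The node procedure above could be sketched in Python roughly as follows. This is only one reading of the steps, under some assumptions: `NodeVersion` and `rebuild_node` are hypothetical names, the "more than 1m away" test is taken to mean distance from the current bad coordinates, distance is computed with the haversine formula, and the optional "not protectable" tag changes (*1) are left out.

```python
import math
from dataclasses import dataclass, field


@dataclass
class NodeVersion:
    # Hypothetical record for one version of a node.
    good: bool          # True if the version's changeset was marked "good"
    lat: float
    lon: float
    tags: dict = field(default_factory=dict)


def distance_m(a, b):
    """Great-circle (haversine) distance in metres between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))


def rebuild_node(versions, is_member, threshold_m=1.0):
    """Return (lat, lon, tags) for the cleaned node, or None if it is deleted."""
    if not versions[0].good and not is_member:
        return None                       # first changeset bad, node unused
    tags, good, bad = {}, None, None      # empty clean state
    prev_tags, prev_pos = {}, None        # previous raw version, used for diffs
    for v in versions:
        pos = (v.lat, v.lon)
        moved = pos != prev_pos
        if v.good:
            # Replay this version's tag changes onto the clean tag state.
            for k in prev_tags.keys() - v.tags.keys():
                tags.pop(k, None)
            for k, val in v.tags.items():
                if prev_tags.get(k) != val:
                    tags[k] = val
            if moved:
                if bad is None or distance_m(pos, bad) > threshold_m:
                    good, bad = pos, None  # accept as good, drop bad coordinates
                else:
                    bad = pos              # too close to bad coordinates
        elif moved:
            bad = pos                      # bad move; keep any good coordinates
        prev_tags, prev_pos = v.tags, pos
    if good is None:
        return None                        # no good coordinates: delete node
    return good[0], good[1], tags
```

For example, a node created by a good changeset, moved and renamed by a bad one, and then nudged by a few centimetres in a good changeset keeps its original position and loses the name.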

Ways:

        ->  first changeset bad

            ->  delete way

        ->  set current state to be the state of the first changeset

        ->  for each changeset, starting with the second and in ascending order

            ->  changeset bad

                ->  keep previous tag state

                (->  do tag changes considered not protectable)*1

                ->  keep previous node state

            ->  changeset good

                ->  do all tag changes on the current tag state

                ->  do all node changes on the current node state

        ->  remove all deleted nodes

        ->  no nodes left

            ->  delete way

        ->  only one node left

            ->  keep way but set some kind of fixme attribute

        ->  way odbl-compliant and save
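
A possible Python sketch of the way procedure follows. It rests on assumptions that go beyond the steps above: `rebuild_way` and `apply_tag_diff` are hypothetical names, the `fixme` value is invented, and "do all node changes on the current node state" is simplified so that a good version's node order is used while any node that only bad versions introduced is dropped. Nodes that a bad version removed are not restored, since their position would be ambiguous.

```python
def apply_tag_diff(clean, prev, new):
    """Replay the tag changes between two raw versions onto the clean tags."""
    for k in prev.keys() - new.keys():
        clean.pop(k, None)
    for k, val in new.items():
        if prev.get(k) != val:
            clean[k] = val


def rebuild_way(versions, deleted_nodes):
    """versions: list of (good, tags, node_refs) in ascending order.

    Returns (tags, node_refs) for the cleaned way, or None if it is deleted.
    """
    first_good, first_tags, first_nodes = versions[0]
    if not first_good:
        return None                                  # first changeset bad
    tags, nodes = dict(first_tags), list(first_nodes)
    prev_tags, prev_nodes = first_tags, first_nodes
    for good, v_tags, v_nodes in versions[1:]:
        if good:
            apply_tag_diff(tags, prev_tags, v_tags)
            added = set(v_nodes) - set(prev_nodes)   # refs this version added
            kept = set(nodes) | added
            # Simplification (assumption): order comes from the good version.
            nodes = [n for n in v_nodes if n in kept]
        # Bad versions leave the clean tag and node state untouched.
        prev_tags, prev_nodes = v_tags, v_nodes
    nodes = [n for n in nodes if n not in deleted_nodes]
    if not nodes:
        return None                                  # no nodes left
    if len(set(nodes)) == 1:
        # Hypothetical marker; the proposal only asks for "some kind of fixme".
        tags["fixme"] = "way reduced to one node during ODbL rebuild"
    return tags, nodes
```

`deleted_nodes` is expected to be the set of node ids removed by the node pass, so the way pass has to run after it.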

Nodes:

        ->  no tags and not member of a way or relation

            ->  delete node

Relations:

        ->  first changeset bad

            ->  delete relation

        ->  set current state to be the state of the first changeset

        ->  for each changeset, starting with the second and in ascending order

            ->  changeset bad

                ->  keep previous tag state

                (->  do tag changes considered not protectable)*1

                ->  keep previous member state

            ->  changeset good

                ->  do all tag changes on the current tag state

                ->  do all member changes on the current member state

        ->  remove all deleted members

        ->  no members left

            ->  delete relation

        ->  relation odbl-compliant and save

*1 Some tag changes could be considered not protectable, e.g. hghway->highway, deleting created_by tags, etc.

*2 Threshold to prevent moving nodes minimally to rescue them, might be chosen differently

Items with subitems are conditions for the subitems to be done. Last step with relations might have to be done more than once as relations may have relations as members.
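
The repeated last step for relations can be pictured as a loop that runs until nothing changes any more. The Python sketch below is only an illustration: `prune_relations` and the `(type, id)` member tuples are assumptions made for this example, not an existing data model.

```python
def prune_relations(relations, deleted):
    """Remove deleted members and drop empty relations until nothing changes.

    relations: {relation_id: [(member_type, member_id), ...]}
    deleted:   set of (member_type, member_id) already deleted; extended in place.
    Returns the surviving relations with their remaining members.
    """
    relations = {rid: list(members) for rid, members in relations.items()}
    changed = True
    while changed:                      # repeat, as relations may nest
        changed = False
        for rid in list(relations):
            members = [m for m in relations[rid] if m not in deleted]
            if not members:
                # No members left: delete the relation and remember that,
                # so parent relations lose it on this or the next pass.
                del relations[rid]
                deleted.add(("relation", rid))
                changed = True
            else:
                relations[rid] = members
    return relations
```

With relation 3 containing relation 2, which in turn contains relation 1, deleting the only way in relation 1 removes relations 1 and 2 over successive passes and leaves relation 3 with whatever other members it has.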

Greetings,

errt