Finding a pub with MongoDB and OpenStreetMap

ConFoo

Montréal, Canada

February 27th, 2014



Derick Rethans
derickr

derick@mongodb.com—@derickr

http://derickrethans.nl/talks/mongo-osm-confoo14

  • The examples in this talk use data from OpenStreetMap
  • OpenStreetMap is a free worldwide map, created through crowdsourcing
  • The data is free to download and use under its open license
  • It's sort of a Wikipedia for map data

2d index vs. 2dsphere index

Example:


{
  loc : {
    type : "LineString" ,
    coordinates : [ [ -0.09, 51.49 ], [ 2.35, 48.86 ] ]
  }
}
  • GeoJSON is a format for encoding a variety of geographic data structures.
  • Also used by web mapping libraries such as OpenLayers and Leaflet and, through GDAL, by PostGIS.
  • Geometry types: Point, MultiPoint, LineString, MultiLineString, Polygon, MultiPolygon and GeometryCollection.
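
As a rough illustration of the GeoJSON shapes listed above, the following sketch builds a Point and a single-ring Polygon in plain JavaScript. The helper names (point, isClosedRing, polygon) are made up for this example and are not part of MongoDB or any library; the only rules relied on are that GeoJSON orders coordinates as longitude, latitude and that a polygon ring must be closed.

```javascript
// Hypothetical helpers for illustration only; not a MongoDB or library API.

// Build a GeoJSON Point. GeoJSON orders coordinates as [longitude, latitude].
function point(lon, lat) {
  return { type: "Point", coordinates: [lon, lat] };
}

// A polygon ring is closed when it has at least four positions and
// its first and last positions are identical.
function isClosedRing(ring) {
  if (ring.length < 4) return false;
  const first = ring[0];
  const last = ring[ring.length - 1];
  return first[0] === last[0] && first[1] === last[1];
}

// Build a GeoJSON Polygon from a single outer ring, rejecting open rings.
function polygon(outerRing) {
  if (!isClosedRing(outerRing)) {
    throw new Error("polygon ring must be closed and have at least 4 positions");
  }
  return { type: "Polygon", coordinates: [outerRing] };
}
```

Checking rings before inserting documents is a design choice for this sketch; it reports an open ring early with a clear message instead of leaving it to the database to reject.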

{
    "_id" : "n356277759",
    "ty" : NumberLong(1),
    "l" : {
        "type" : "Point",
        "coordinates" : [
            -0.1607376,
            51.5138662
        ]
    },
    "ts" : [
        "amenity=pub",
        "food=yes",
        "name=The Tyburn",
        "operator=wetherspoons",
        "real_ale=yes"
    ]
}

{
    "_id" : "w4373934",
    "ty" : NumberLong(2),
    "l" : {
        "type" : "LineString",
        "coordinates" : [
            [ -0.1286202, 51.50752 ],
            [ -0.1283857, 51.5075115 ],
            [ -0.1282389, 51.5075112 ],
            [ -0.1278265, 51.5075453 ]
        ]
    },
    "ts" : [
        "highway=primary",
        "lit=yes",
        "name=Trafalgar Square",
        "oneway=yes",
        "postal_code=WC2",
        "sidewalk=left"
    ]
}


{
   "_id" : "w161490892",
   "ty" : NumberLong(2),
   "l" : {
      "type" : "Polygon",
      "coordinates" : [
         [
            [ -0.1356425, 51.497671 ],
            [ -0.1355737, 51.4975662 ],
            [ -0.1354639, 51.4975815 ],
            [ -0.1354536, 51.497694 ],
            [ -0.1356425, 51.497671 ]
         ]
      ]
   },
   "ts" : [
      "amenity=pub",
      "building=yes",
      "name=The Albert",
      "real_ale=yes"
   ]
}
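
The sample documents above store OSM tags as "key=value" strings in the ts array. A minimal sketch of how such an array could be produced from a tag object follows. The function name concatTags is hypothetical, and the ordering by key is only inferred from the samples shown here, not taken from the import code.

```javascript
// Hypothetical helper: turn an OSM tag object into the "key=value"
// string array seen in the `ts` field of the sample documents.
// Sorting by key is an assumption based on how the samples are ordered.
function concatTags(tags) {
  return Object.keys(tags)
    .sort()
    .map((key) => key + "=" + tags[key]);
}

const albert = concatTags({
  name: "The Albert",
  amenity: "pub",
  real_ale: "yes",
  building: "yes",
});
```

Storing each tag as one string means a plain equality match such as ts: "amenity=pub" can test key and value together, which is how the queries in this talk use the field.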
  • We divide the surface of the earth into cells
  • Cells have a level, which defines the size of the cell
  • S2 provides 31 levels
  • The higher the level, the smaller the cell, and therefore the more cells are needed to cover the earth
  • By default, we use all levels between 500m on a side and 100km on a side
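
To get a feel for how level relates to cell size, here is a back-of-the-envelope sketch. It assumes the S2 layout of 6 top-level cells, each split into 4 at every further level, and a spherical earth with a mean radius of 6371 km. Real S2 cells vary in size and shape, so the numbers are averages only, and the function names are invented for this example.

```javascript
// Rough averages only; real S2 cells are not all the same size.
const EARTH_RADIUS_KM = 6371; // mean radius, an approximation

// Level 0 has 6 face cells; each further level splits every cell into 4.
function cellCount(level) {
  return 6 * Math.pow(4, level);
}

// Side length (km) of a square whose area equals the average cell area.
function averageCellSideKm(level) {
  const earthAreaKm2 = 4 * Math.PI * EARTH_RADIUS_KM * EARTH_RADIUS_KM;
  return Math.sqrt(earthAreaKm2 / cellCount(level));
}

// Print the average side length for every level from 0 to 30.
for (let level = 0; level <= 30; level++) {
  console.log(level, averageCellSideKm(level).toFixed(3) + " km");
}
```

Each level halves the average side length, which is why only a band of levels is worth indexing by default.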

amenity=toilets

highway=primary, name=Trafalgar Square

highway=pedestrian, name=Trafalgar Square, area=yes

$geoNear

  • find stuff near a point
  • index required

$geoWithin

  • find stuff within a polygon/circle
  • index not required

$geoIntersects

  • find stuff that intersects with other stuff
  • index not required
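
The three kinds of query can be expressed as small filter documents. The sketch below builds them as plain JavaScript objects so their shape is visible without a running server. The builder names (nearFilter, withinFilter, intersectsFilter) are made up for illustration; the operator names $near, $geoWithin, $geoIntersects, $geometry and $maxDistance are the ones used in this talk's queries.

```javascript
// Hypothetical builders; they only assemble filter documents.

// "Near a point": needs a geospatial index on the queried field.
// With a GeoJSON point, $maxDistance is given in metres.
function nearFilter(lon, lat, maxDistanceMetres) {
  return {
    $near: {
      $geometry: { type: "Point", coordinates: [lon, lat] },
      $maxDistance: maxDistanceMetres,
    },
  };
}

// "Within a shape": geometry is a GeoJSON Polygon (or MultiPolygon).
function withinFilter(geometry) {
  return { $geoWithin: { $geometry: geometry } };
}

// "Intersects a shape": geometry can be any GeoJSON geometry.
function intersectsFilter(geometry) {
  return { $geoIntersects: { $geometry: geometry } };
}

const pubsNearby = { ts: "amenity=pub", l: nearFilter(-0.1204, 51.5168, 500) };
```

In the shell, a filter such as pubsNearby would be passed to db.poiConcat.find(). The index that the near query relies on is typically created with db.poiConcat.createIndex({ l: "2dsphere" }), assuming the geometry lives in the l field as in the sample documents.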

db.poiConcat.find( {
  ts: "amenity=pub",
  l: {
    $near: {
      $geometry: {
        type: 'Point',
        coordinates: [ -0.1204, 51.5168 ]
      },
      $maxDistance: 500
    }
  }
} ).limit(5).pretty();

{
  "_id" : "n26848690",
  "l" : {
    "type" : "Point",
    "coordinates" : [ -0.119473, 51.516787 ]
  },
  "ts" : [
    "addr:housenumber=64-68",
    "addr:street=Kingsway",
    "amenity=pub",
    "name=The Shakespeare's Head",
    "wifi=free"
  ]
}
  • Starts from a point
  • Looks at concentric donuts from that point
  • Want to look at documents within the donut
  • If distance from query point to document is within the donut, output
  • Adapts size of the donut as needed
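
The donut idea above can be sketched in plain JavaScript over an in-memory array. This is a brute-force illustration of the search order only, not MongoDB's implementation (which works against the index rather than scanning every document). The function names, the 100 m starting radius and the doubling rule are all assumptions made for this example.

```javascript
const EARTH_RADIUS_M = 6371000; // mean earth radius, an approximation

function toRadians(deg) {
  return (deg * Math.PI) / 180;
}

// Haversine great-circle distance in metres between two [lon, lat] pairs.
function distanceMetres(a, b) {
  const dLat = toRadians(b[1] - a[1]);
  const dLon = toRadians(b[0] - a[0]);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(a[1])) * Math.cos(toRadians(b[1])) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(h));
}

// Walk outwards through concentric donuts [inner, outer) around `centre`,
// emitting documents nearest-first until `limit` or `maxDistance` is reached.
// Documents at exactly maxDistance are excluded in this sketch.
function nearSearch(docs, centre, maxDistance, limit, firstRadius = 100) {
  const results = [];
  let inner = 0;
  let outer = Math.min(firstRadius, maxDistance);
  while (results.length < limit && inner < maxDistance) {
    const ring = docs
      .map((doc) => ({ doc, dist: distanceMetres(centre, doc.l.coordinates) }))
      .filter((hit) => hit.dist >= inner && hit.dist < outer)
      .sort((x, y) => x.dist - y.dist);
    for (const hit of ring) {
      if (results.length < limit) results.push(hit);
    }
    // Adapt the donut: double its width if it was empty, otherwise keep it.
    const width = ring.length === 0 ? (outer - inner) * 2 : outer - inner;
    inner = outer;
    outer = Math.min(inner + width, maxDistance);
  }
  return results;
}
```

Because every donut lies entirely beyond the previous one, sorting within each donut is enough to produce a globally nearest-first result without sorting the whole collection.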

hydepark = db.poiConcat.findOne( {
  ts: { $all: [
    "name=Hyde Park", "leisure=park"
  ] }
} );

db.poiConcat.find( {
  l: { $geoWithin: {
    $geometry: hydepark.l
  } },
  ts: "amenity=cafe"
} );


{
  "_id" : "w19851241",
  "ty" : NumberLong(2),
  "l" : {
    "type" : "Polygon",
    "coordinates" : [
      [ [ -0.1549378, 51.508331 ], … [ -0.1549378, 51.508331 ] ]
    ]
  },
  "ts" : [
    "access=yes", "leisure=park", "name=Hyde Park",
    "wikipedia=http://en.wikipedia.org/wiki/Hyde_Park,_London"
  ]
}
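
To make the "within a polygon" test concrete, here is a small ray-casting sketch for a single ring. It treats longitude and latitude as flat x and y coordinates, which is only a reasonable approximation for small areas; a 2dsphere query uses spherical geometry instead. The function name pointInRing is made up for this example, and polygons with holes are not handled.

```javascript
// Planar ray-casting test: is [lon, lat] inside the closed ring?
// Illustrative approximation only; ignores holes and the earth's curvature.
function pointInRing(point, ring) {
  const [x, y] = point;
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    // Does the edge (j -> i) cross the horizontal ray going right from the point?
    const crosses =
      (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Outer ring of "The Albert" from the sample document shown earlier.
const albertRing = [
  [-0.1356425, 51.497671],
  [-0.1355737, 51.4975662],
  [-0.1354639, 51.4975815],
  [-0.1354536, 51.497694],
  [-0.1356425, 51.497671],
];
const insideAlbert = pointInRing([-0.13555, 51.49765], albertRing);
const outsideAlbert = pointInRing([-0.1204, 51.5168], albertRing);
```

A point is inside when a ray drawn from it crosses the ring's edges an odd number of times, which is what the toggling of inside tracks.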

building = db.poiConcat.findOne( {
  _id: "w30734457"
} );

db.poiConcat.find( {
  l: {
    $geoIntersects: {
      $geometry: building.l
    }
  },
  ts: {
    $exists: true
  }
} );


{ "_id" : "w5059478", "ts" : [ "branch=Charing Cross", "electrified=rail", "frequency=0", "gauge=1435", "history=Retrieved from v43", "layer=-3", "line=Northern", "loading_gauge=deep-tube", "name=Northern Line (Charing Cross Branch) Southbound", "network=London Underground", "oneway=yes", "railway=subway", "tunnel=yes", "voltage=630" ] }
{ "_id" : "w139389296", "ts" : [ "branch=Charing Cross", "electrified=rail", "frequency=10", "gauge=1435", "layer=-2", "line=Northern", "loading_gauge=deep-tube", "name=Northern Line (Charing Cross Branch)", "network=London Underground", "oneway=yes", "railway=subway", "tunnel=yes", "voltage=630" ] }
{ "_id" : "n595696911", "ts" : [ "disused=yes", "disused:amenity=bar", "name=Kudos", "toilets=yes", "toilets:access=customers" ] }
{ "_id" : "n595696974", "ts" : [ "amenity=cafe", "name=Costa", "wheelchair=yes" ] }
{ "_id" : "n653124873", "ts" : [ "addr:city=London", "addr:housenumber=441", "addr:street=Strand", "operator=Paperchase", "phone=020 7497 2797", "postal_code=WC2R 0QU", "shop=stationery", "website=http://www.paperchase.co.uk/london/strand/stry/wc2r0qr/" ] }
{ "_id" : "n1163880380", "ts" : [ "addr:housenumber=430", "addr:street=Strand", "name=Ryman", "phone=+44 20 7240 4408", "postal_code=WC2R 0QN", "shop=stationery", "website=http://www.ryman.co.uk/store-finder/branches/branch/?storeid=1037", "wheelchair=limited" ] }
{ "_id" : "n1571982051", "ts" : [ "name=Charing Cross", "railway=subway_entrance", "source:name=survey", "source:railway=survey" ] }
{ "_id" : "n1571982070", "ts" : [ "name=Charing Cross", "railway=subway_entrance", "source:name=survey", "source:railway=survey", "wheelchair=no" ] }
{ "_id" : "n2066862842", "ts" : [ "addr:housenumber=440", "addr:street=Strand", "amenity=bank", "name=Coutts & Co", "phone=+44 2077 531000", "postal_code=WC2R 0QS", "website=http://www.coutts.com/locations/london-strand/" ] }
{ "_id" : "w166702178", "ts" : [ "layer=-3", "line=Jubilee", "name=Jubilee Line (disused)", "network=London Underground", "oneway=yes", "railway=disused", "tunnel=yes" ] }
{ "_id" : "w166707824", "ts" : [ "layer=-2", "name=Northern Line Southbound", "railway=platform" ] }
{ "_id" : "w166707825", "ts" : [ "layer=-2", "name=Northern Line Northbound", "railway=platform" ] }
  • MongoDB's aggregation framework has a $geoNear pipeline operator
  • Also adds the distance as a new field to the document
  • Has to be the first operator in the pipeline
  • Only works if there is one 2dsphere index
  • Loop over all nodes, and:
  • Loop over all the ways, and:
node_cache: 45.1MB — poiConcat:  84.2MB

https://github.com/derickr/3angle
