Reference

pyrosm.data.get_data(dataset, update=False, directory=None)

Get the path to a PBF data file, and download the data if needed.

Parameters:
  • dataset (str) – The name of the dataset. Run pyrosm.data.available for all available options.

  • update (bool) – Whether the PBF file should be downloaded again (updated) if a dataset with the same name already exists in the temporary directory.

  • directory (str (optional)) – Path to a directory where the PBF data will be downloaded. Does not apply to the test datasets bundled with the package.

class pyrosm.pyrosm.OSM(filepath, bounding_box=None)

OpenStreetMap PBF reader object.

Parameters:
  • filepath (str) – Filepath to the input OSM dataset (*.osm.pbf).

  • bounding_box (list | shapely geometry) – Filters the OSM data spatially. Pass a bounding box either as a list [minx, miny, maxx, maxy] or as a Shapely Polygon/MultiPolygon or a closed LineString/LinearRing.
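
As a minimal sketch of the list form of bounding_box, the helper below checks the [minx, miny, maxx, maxy] ordering before the list is handed to OSM. The names validate_bbox and open_osm are hypothetical and not part of pyrosm. The pyrosm import is deferred into the wrapper, so the validation itself runs without pyrosm installed.

```python
def validate_bbox(bbox):
    """Return bbox as a list of floats after checking [minx, miny, maxx, maxy] ordering."""
    if len(bbox) != 4:
        raise ValueError("bounding box must have four values: [minx, miny, maxx, maxy]")
    minx, miny, maxx, maxy = (float(v) for v in bbox)
    if minx >= maxx or miny >= maxy:
        raise ValueError("expected minx < maxx and miny < maxy")
    return [minx, miny, maxx, maxy]


def open_osm(filepath, bbox=None):
    """Hypothetical wrapper: create a reader, optionally restricted to a bounding box."""
    from pyrosm import OSM  # deferred import; requires pyrosm to be installed

    if bbox is not None:
        bbox = validate_bbox(bbox)
    return OSM(filepath, bounding_box=bbox)
```

A swapped pair such as [25.0, 60.1, 24.9, 60.2] raises a ValueError here instead of silently producing an empty extract.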

get_boundaries(boundary_type='administrative', name=None, custom_filter=None, extra_attributes=None, timestamp=None)

Parses boundaries from OSM.

Parameters:
  • boundary_type (str) –

    The type of boundaries to parse. Possible values:
    • "administrative" (default)

    • "national_park"

    • "political"

    • "postal_code"

    • "protected_area"

    • "aboriginal_lands"

    • "maritime"

    • "lot"

    • "parcel"

    • "tract"

    • "marker"

    • "all"

  • name (str (optional)) – Name of the administrative area that will be searched for.

  • custom_filter (dict (optional)) – Additional filter for what kind of boundary to parse.

  • extra_attributes (list (optional)) – Additional OSM tag keys that will be converted into columns in the resulting GeoDataFrame.

  • timestamp (str | datetime | int) –

    If provided, the data as of the given moment in time is returned. The time should be given in UTC. Note: This functionality only works with OSH.PBF files, which can be downloaded manually, e.g. from Geofabrik (requires login with an OSM account).

    The logic: for each element, the closest version up to the given timestamp is selected. Elements can therefore be older than the given timestamp (the most up-to-date version is selected) but never newer; records with exactly the given timestamp are kept. If only a date is given, the time is taken as midnight at the start of that day, such as “2021-01-01 00:00:00”.
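
The selection rule for timestamp can be sketched in plain Python. This is an illustration of the documented logic only, not pyrosm's implementation. The record layout (dictionaries with "id", "version" and "timestamp" keys), the helper names, and the reading of an int as Unix seconds are all assumptions made for the example.

```python
from datetime import datetime, timezone


def to_utc(ts):
    """Normalize a str, datetime, or int (assumed Unix seconds) to an aware UTC datetime."""
    if isinstance(ts, datetime):
        return ts if ts.tzinfo else ts.replace(tzinfo=timezone.utc)
    if isinstance(ts, int):
        return datetime.fromtimestamp(ts, tz=timezone.utc)
    # A date-only string such as "2021-01-01" parses to midnight of that day.
    return datetime.fromisoformat(ts).replace(tzinfo=timezone.utc)


def select_versions(versions, timestamp):
    """Pick, per element id, the newest version whose timestamp is <= the cutoff."""
    cutoff = to_utc(timestamp)
    selected = {}
    for record in versions:
        when = to_utc(record["timestamp"])
        if when > cutoff:
            continue  # newer than the cutoff: never selected
        best = selected.get(record["id"])
        if best is None or when > to_utc(best["timestamp"]):
            selected[record["id"]] = record
    return selected


history = [
    {"id": 1, "version": 1, "timestamp": "2020-06-01"},
    {"id": 1, "version": 2, "timestamp": "2021-01-01 00:00:00"},
    {"id": 1, "version": 3, "timestamp": "2021-03-01"},
    {"id": 2, "version": 1, "timestamp": "2021-02-01"},
]
snapshot = select_versions(history, "2021-01-01")
```

With the cutoff "2021-01-01", element 1 resolves to version 2 (its timestamp equals the cutoff, so it is kept), and element 2 is absent because its only version is newer than the cutoff.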

See also

Take a look at OSM documentation for further details about the data:

https://wiki.openstreetmap.org/wiki/Key:boundary

get_buildings(custom_filter=None, extra_attributes=None, timestamp=None)

Parses buildings from OSM.

Parameters:
  • custom_filter (dict) –

    What kind of buildings to parse, see details below.

    You can opt in to specific elements by using custom_filter. To keep only specific buildings, such as 'residential' and 'retail', pass a Python dictionary in the following format:

    custom_filter={'building': ['residential', 'retail']}

  • extra_attributes (list (optional)) – Additional OSM tag keys that will be converted into columns in the resulting GeoDataFrame.

  • timestamp (str | datetime | int) –

    If provided, the data as of the given moment in time is returned. The time should be given in UTC. Note: This functionality only works with OSH.PBF files, which can be downloaded manually, e.g. from Geofabrik (requires login with an OSM account).

    The logic: for each element, the closest version up to the given timestamp is selected. Elements can therefore be older than the given timestamp (the most up-to-date version is selected) but never newer; records with exactly the given timestamp are kept. If only a date is given, the time is taken as midnight at the start of that day, such as “2021-01-01 00:00:00”.

See also

Take a look at the OSM documentation for further details about the data: https://wiki.openstreetmap.org/wiki/Key:building

get_data_by_custom_criteria(custom_filter, osm_keys_to_keep=None, filter_type='keep', tags_as_columns=None, keep_nodes=True, keep_ways=True, keep_relations=True, extra_attributes=None, timestamp=None)

Parses OSM data based on custom criteria.

Parameters:
  • custom_filter (dict (required)) – A custom filter that selects only specific elements from OpenStreetMap.

  • osm_keys_to_keep (str | list) – A filter to specify which OSM keys should be kept.

  • filter_type (str) – Either “keep” or “exclude”. Whether the filters should be used to keep or exclude the data from OSM.

  • tags_as_columns (list) – Which tags should be kept as columns in the resulting GeoDataFrame.

  • keep_nodes (bool) – Whether or not the nodes should be kept in the resulting GeoDataFrame if they are found.

  • keep_ways (bool) – Whether or not the ways should be kept in the resulting GeoDataFrame if they are found.

  • keep_relations (bool) – Whether or not the relations should be kept in the resulting GeoDataFrame if they are found.

  • extra_attributes (list (optional)) – Additional OSM tag keys that will be converted into columns in the resulting GeoDataFrame.

  • timestamp (str | datetime | int) –

    If provided, the data as of the given moment in time is returned. The time should be given in UTC. Note: This functionality only works with OSH.PBF files, which can be downloaded manually, e.g. from Geofabrik (requires login with an OSM account).

    The logic: for each element, the closest version up to the given timestamp is selected. Elements can therefore be older than the given timestamp (the most up-to-date version is selected) but never newer; records with exactly the given timestamp are kept. If only a date is given, the time is taken as midnight at the start of that day, such as “2021-01-01 00:00:00”.
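
The difference between filter_type “keep” and “exclude” can be illustrated with a small stand-alone sketch. It only mirrors the intent described above and is not how pyrosm filters data internally. The element layout (dictionaries with a "tags" mapping) and the function names are assumptions, and osm_keys_to_keep is left out for brevity.

```python
def _hits(tags, custom_filter):
    """True if any filter key is present with an accepted value (True accepts any value)."""
    return any(
        key in tags and (wanted is True or tags[key] in wanted)
        for key, wanted in custom_filter.items()
    )


def apply_filter(elements, custom_filter, filter_type="keep"):
    """Keep or drop the elements whose tags match custom_filter."""
    if filter_type not in ("keep", "exclude"):
        raise ValueError('filter_type must be "keep" or "exclude"')
    keep_matches = filter_type == "keep"
    return [e for e in elements if _hits(e["tags"], custom_filter) == keep_matches]


elements = [
    {"id": 1, "tags": {"amenity": "cafe"}},
    {"id": 2, "tags": {"amenity": "bank"}},
    {"id": 3, "tags": {"shop": "books"}},
]
cafes = apply_filter(elements, {"amenity": ["cafe"]})
not_cafes = apply_filter(elements, {"amenity": ["cafe"]}, filter_type="exclude")
```

In this sketch, “keep” returns only the cafe, while “exclude” returns every other element.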

get_landuse(custom_filter=None, extra_attributes=None, timestamp=None)

Parses landuse from OSM.

Parameters:
  • custom_filter (dict) –

    What kind of landuse to parse, see details below.

    You can opt in to specific elements by using custom_filter. To keep only specific landuse types, such as 'construction' and 'industrial', pass a Python dictionary in the following format:

    custom_filter={'landuse': ['construction', 'industrial']}

  • extra_attributes (list (optional)) – Additional OSM tag keys that will be converted into columns in the resulting GeoDataFrame.

  • timestamp (str | datetime | int) –

    If provided, the data as of the given moment in time is returned. The time should be given in UTC. Note: This functionality only works with OSH.PBF files, which can be downloaded manually, e.g. from Geofabrik (requires login with an OSM account).

    The logic: for each element, the closest version up to the given timestamp is selected. Elements can therefore be older than the given timestamp (the most up-to-date version is selected) but never newer; records with exactly the given timestamp are kept. If only a date is given, the time is taken as midnight at the start of that day, such as “2021-01-01 00:00:00”.

See also

Take a look at OSM documentation for further details about the data:

https://wiki.openstreetmap.org/wiki/Key:landuse

get_natural(custom_filter=None, extra_attributes=None, timestamp=None)

Parses natural features from OSM.

Parameters:
  • custom_filter (dict) –

    What kind of natural features to parse, see details below.

    You can opt in to specific elements by using custom_filter. To keep only specific natural features, such as 'wood' and 'tree', pass a Python dictionary in the following format:

    custom_filter={'natural': ['wood', 'tree']}

  • extra_attributes (list (optional)) – Additional OSM tag keys that will be converted into columns in the resulting GeoDataFrame.

  • timestamp (str | datetime | int) –

    If provided, the data as of the given moment in time is returned. The time should be given in UTC. Note: This functionality only works with OSH.PBF files, which can be downloaded manually, e.g. from Geofabrik (requires login with an OSM account).

    The logic: for each element, the closest version up to the given timestamp is selected. Elements can therefore be older than the given timestamp (the most up-to-date version is selected) but never newer; records with exactly the given timestamp are kept. If only a date is given, the time is taken as midnight at the start of that day, such as “2021-01-01 00:00:00”.

See also

Take a look at OSM documentation for further details about the data:

https://wiki.openstreetmap.org/wiki/Key:natural

get_network(network_type='walking', extra_attributes=None, nodes=False, timestamp=None)

Parses street networks from OSM for walking, driving, and cycling.

Parameters:
  • network_type (str) –

    What kind of network to parse. Possible values are:

    • 'walking'

    • 'cycling'

    • 'driving'

    • 'driving+service'

    • 'all'

  • extra_attributes (list (optional)) – Additional OSM tag keys that will be converted into columns in the resulting GeoDataFrame.

  • nodes (bool (default: False)) – If True, 1) the nodes associated with the network will be returned in addition to edges, and 2) every segment of a road constituting a way is parsed as a separate row (to enable full connectivity in the graph).

  • timestamp (str | datetime | int) –

    If provided, the data as of the given moment in time is returned. The time should be given in UTC. Note: This functionality only works with OSH.PBF files, which can be downloaded manually, e.g. from Geofabrik (requires login with an OSM account).

    The logic: for each element, the closest version up to the given timestamp is selected. Elements can therefore be older than the given timestamp (the most up-to-date version is selected) but never newer; records with exactly the given timestamp are kept. If only a date is given, the time is taken as midnight at the start of that day, such as “2021-01-01 00:00:00”.

Returns:
  • gdf_edges or (gdf_nodes, gdf_edges)

Return type:
  • geopandas.GeoDataFrame or tuple
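
Because the return shape depends on nodes, calling code may want to normalize it. The sketch below shows one way to do that. Both split_network_result and load_network are hypothetical helpers, not pyrosm functions, and the pyrosm import is deferred so the normalizing helper can be used on its own.

```python
def split_network_result(result):
    """Return (nodes, edges); nodes is None when only edges were returned."""
    if isinstance(result, tuple):
        nodes, edges = result
        return nodes, edges
    return None, result


def load_network(filepath, network_type="walking", nodes=False):
    """Hypothetical wrapper that always returns a (nodes, edges) pair."""
    from pyrosm import OSM  # deferred import; requires pyrosm to be installed

    osm = OSM(filepath)
    result = osm.get_network(network_type=network_type, nodes=nodes)
    return split_network_result(result)
```

With nodes=False the first item of the pair is None, so downstream code can use a single unpacking pattern for both cases.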

See also

Take a look at the OSM documentation for further details about the data: https://wiki.openstreetmap.org/wiki/Key:highway

get_pois(custom_filter=None, extra_attributes=None, timestamp=None)

Parses Points of Interest (POIs) from OSM.

Parameters:
  • custom_filter (dict) – An optional custom filter that selects only specific POIs from OpenStreetMap; see the Notes below.

  • extra_attributes (list (optional)) – Additional OSM tag keys that will be converted into columns in the resulting GeoDataFrame.

  • timestamp (str | datetime | int) –

    If provided, the data as of the given moment in time is returned. The time should be given in UTC. Note: This functionality only works with OSH.PBF files, which can be downloaded manually, e.g. from Geofabrik (requires login with an OSM account).

    The logic: for each element, the closest version up to the given timestamp is selected. Elements can therefore be older than the given timestamp (the most up-to-date version is selected) but never newer; records with exactly the given timestamp are kept. If only a date is given, the time is taken as midnight at the start of that day, such as “2021-01-01 00:00:00”.

Notes

By default, Pyrosm parses all OSM elements (points, lines and polygons) that are associated with the following keys:

  • amenity

  • shop

  • tourism

You can opt out of or opt in to specific elements by using custom_filter. To parse only elements associated with specific keys, such as amenities, you can specify:

custom_filter={"amenity": True}

You can also combine multiple filters. For instance, to parse all 'amenity' elements together with specific 'shop' elements, such as supermarkets and book stores, specify:

custom_filter={"amenity": True, "shop": ["supermarket", "books"]}
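
To make the filter semantics concrete, here is a small stand-alone sketch of how such a dictionary could be matched against an element's tags. It reads a value of True as "any value for this key" and a list as "one of these values", and treats multiple keys as a union. The function matches_filter is hypothetical and does not reflect pyrosm's internal implementation.

```python
def matches_filter(tags, custom_filter):
    """Return True if any filter key matches the element's tags."""
    for key, wanted in custom_filter.items():
        if key not in tags:
            continue
        # True accepts every value of the key; a list accepts only its members.
        if wanted is True or tags[key] in wanted:
            return True
    return False


poi_filter = {"amenity": True, "shop": ["supermarket", "books"]}
is_cafe_kept = matches_filter({"amenity": "cafe"}, poi_filter)
is_bakery_kept = matches_filter({"shop": "bakery"}, poi_filter)
```

Under these assumptions a cafe is kept because every amenity is accepted, while a bakery is dropped because 'bakery' is not among the listed shop values.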

See also

You can check the most typical OSM tags for different map features on the OSM Wiki: https://wiki.openstreetmap.org/wiki/Map_Features. It is also possible to get a quick look at the most typical OSM tags from the Pyrosm configuration:

>>> from pyrosm.config import Conf
>>> print("All available OSM keys", Conf.tags.available)
All available OSM keys ['aerialway', 'aeroway', 'amenity', 'building', 'craft',
'emergency', 'geological', 'highway', 'historic', 'landuse', 'leisure',
'natural', 'office', 'power', 'public_transport', 'railway', 'route',
'place', 'shop', 'tourism', 'waterway']
>>> print("Typical tags associated with tourism:", Conf.tags.tourism)
Typical tags associated with tourism: ['alpine_hut', 'apartment', 'aquarium',
'artwork', 'attraction', 'camp_pitch', 'camp_site', 'caravan_site', 'chalet',
'gallery', 'guest_house', 'hostel', 'hotel', 'information', 'motel', 'museum',
'picnic_site', 'theme_park', 'tourism', 'viewpoint', 'wilderness_hut', 'zoo']

static to_graph(nodes, edges, graph_type='igraph', direction='oneway', from_id_col='u', to_id_col='v', edge_id_col='id', node_id_col='id', force_bidirectional=False, network_type=None, retain_all=False, osmnx_compatible=True, pandana_weights=['length'])

Exports an OSM network to a routable graph. Supported output graph types are:

  • “igraph” (default),

  • “networkx”,

  • “pandana”

For walking and cycling, the output graph is bidirectional by default (i.e. travel along the street is allowed in both directions). For driving, one-way streets are taken into account by default, and travel is restricted according to the rules in the OSM data (based on the “oneway” attribute).

Parameters:
  • nodes (GeoDataFrame) – GeoDataFrame containing nodes of the road network. Note: Use osm.get_network(nodes=True) to retrieve both the nodes and edges.

  • edges (GeoDataFrame) – GeoDataFrame containing the edges of the road network.

  • graph_type (str) –

    Type of the output graph. Available graphs are:
    • "igraph": returns an igraph.Graph object.

    • "networkx": returns a networkx.MultiDiGraph object.

    • "pandana": returns a pandana.Network object.

  • direction (str) – Name of the column containing information about the allowed driving directions.

  • from_id_col (str) – Name for the column having the from-node-ids of edges.

  • to_id_col (str) – Name for the column having the to-node-ids of edges.

  • edge_id_col (str) – Name for the column having the unique id for edges.

  • node_id_col (str) – Name for the column having the unique id for nodes.

  • force_bidirectional (bool) – If True, all edges will be created as bidirectional (travel is allowed in both directions).

  • network_type (str (optional)) – Network type for the given data. Determines how the graph will be constructed. The network type is typically extracted automatically from the metadata of the edges/nodes GeoDataFrames. This parameter can be used if that metadata is not available for one reason or another. By default, a bidirectional graph is created for walking, cycling and all, and a directed graph for driving (i.e. one-way streets are taken into account). Possible values are: 'walking', 'cycling', 'driving', 'driving+service', 'all'.

  • retain_all (bool) – If True, return the entire graph even if it is not connected. Otherwise, retain only the connected edges.

  • osmnx_compatible (bool (default True)) – If True, modifies the edge and node attribute naming to be compatible with OSMnx (allows utilizing all OSMnx functionalities). NOTE: Only applicable with the “networkx” graph type.

  • pandana_weights (list) – Columns that are used as weights when exporting to Pandana graph. By default uses “length” column.
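
The interplay of direction, from_id_col, to_id_col and force_bidirectional can be sketched without any graph library. The function below is a simplified illustration, not the logic pyrosm uses. It assumes edges are plain dictionaries standing in for GeoDataFrame rows and that a one-way street carries the value "yes" in the direction column; other OSM values such as "-1" are ignored here.

```python
def directed_pairs(edges, direction="oneway", from_id_col="u", to_id_col="v",
                   force_bidirectional=False):
    """Expand edge rows into (from, to) pairs, adding the reverse pair where allowed."""
    pairs = []
    for edge in edges:
        u, v = edge[from_id_col], edge[to_id_col]
        pairs.append((u, v))
        # Assumption for this sketch: only the value "yes" marks a one-way street.
        one_way = edge.get(direction) == "yes"
        if force_bidirectional or not one_way:
            pairs.append((v, u))
    return pairs


rows = [
    {"u": 1, "v": 2, "oneway": "yes"},
    {"u": 2, "v": 3, "oneway": None},
]
driving_pairs = directed_pairs(rows)
forced_pairs = directed_pairs(rows, force_bidirectional=True)
```

In this sketch the one-way edge from 1 to 2 contributes a single pair, the untagged edge contributes both directions, and force_bidirectional=True adds the reverse pair for every edge.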