Reference
- pyrosm.data.get_data(dataset, update=False, directory=None)
Get the path to a PBF data file, and download the data if needed.
- Parameters:
dataset (str) – The name of the dataset. Run pyrosm.data.available for all available options.
update (bool) – Whether the PBF file should be downloaded/updated if a dataset with the same name already exists in the temp directory.
directory (str (optional)) – Path to a directory where the PBF data will be downloaded (does not apply to the test datasets bundled with the package).
- class pyrosm.pyrosm.OSM(filepath, bounding_box=None)
OpenStreetMap PBF reader object.
- Parameters:
filepath (str) – Filepath to the input OSM dataset (*.osm.pbf).
bounding_box (list | shapely geometry) – Optional spatial filter for the OSM data, given either as a list [minx, miny, maxx, maxy] or as a Shapely Polygon/MultiPolygon or a closed LineString/LinearRing.
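The list form of bounding_box can be derived from a handful of coordinates. The following is a minimal sketch in plain Python; the helper names bbox_from_points and validate_bbox and the sample coordinates are hypothetical, and the validation rule (min smaller than max on both axes) is an assumption rather than something Pyrosm is documented to enforce.

```python
def bbox_from_points(points):
    """Return [minx, miny, maxx, maxy] covering (x, y) = (lon, lat) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    if not xs:
        raise ValueError("at least one point is required")
    return [min(xs), min(ys), max(xs), max(ys)]


def validate_bbox(bbox):
    """Check the list form: four numbers, min < max on both axes (assumed rule)."""
    if len(bbox) != 4:
        raise ValueError("expected [minx, miny, maxx, maxy]")
    minx, miny, maxx, maxy = bbox
    if minx >= maxx or miny >= maxy:
        raise ValueError("min values must be smaller than max values")
    return list(bbox)


# Hypothetical (lon, lat) points; any coordinates would do.
corners = [(24.93, 60.16), (24.96, 60.18), (24.94, 60.17)]
bbox = validate_bbox(bbox_from_points(corners))
print(bbox)
# The resulting list could then be passed as OSM(filepath, bounding_box=bbox).
```

Keeping the order minx, miny, maxx, maxy explicit in a helper avoids the common mistake of swapping latitude and longitude when building the list by hand.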
- get_boundaries(boundary_type='administrative', name=None, custom_filter=None, extra_attributes=None, timestamp=None)
Parses boundaries from OSM.
- Parameters:
boundary_type (str) –
The type of boundaries to parse. Possible values:
"administrative" (default)
"national_park"
"political"
"postal_code"
"protected_area"
"aboriginal_lands"
"maritime"
"lot"
"parcel"
"tract"
"marker"
"all"
name (str (optional)) – Name of the administrative area that will be searched for.
custom_filter (dict (optional)) – Additional filter for what kind of boundary to parse.
extra_attributes (list (optional)) – Additional OSM tag keys that will be converted into columns in the resulting GeoDataFrame.
timestamp (str | datetime | int) –
If provided, data is returned as it was at the given moment in time. The time should be given in UTC. Note: This functionality works only with OSH.PBF files, which must be downloaded manually, e.g. from Geofabrik (requires logging in with an OSM account).
Selection logic: for each element, the latest version at or before the given timestamp is selected. Elements in the result can therefore be older than the given timestamp, but never newer (records with exactly the given timestamp are kept). If only a date is given, the time is taken as midnight of that day, e.g. “2021-01-01 00:00:00”.
See also
Take a look at OSM documentation for further details about the data:
- get_buildings(custom_filter=None, extra_attributes=None, timestamp=None)
Parses buildings from OSM.
- Parameters:
custom_filter (dict) –
What kind of buildings to parse, see details below.
You can opt in to specific elements by using ‘custom_filter’. To keep only specific buildings, such as ‘residential’ and ‘retail’, apply a custom filter, which is a Python dictionary with the following format:
custom_filter={'building': ['residential', 'retail']}
extra_attributes (list (optional)) – Additional OSM tag keys that will be converted into columns in the resulting GeoDataFrame.
timestamp (str | datetime | int) –
If provided, data is returned as it was at the given moment in time. The time should be given in UTC. Note: This functionality works only with OSH.PBF files, which must be downloaded manually, e.g. from Geofabrik (requires logging in with an OSM account).
Selection logic: for each element, the latest version at or before the given timestamp is selected. Elements in the result can therefore be older than the given timestamp, but never newer (records with exactly the given timestamp are kept). If only a date is given, the time is taken as midnight of that day, e.g. “2021-01-01 00:00:00”.
See also
Take a look at the OSM documentation for further details about the data: https://wiki.openstreetmap.org/wiki/Key:building
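The timestamp rule used by the parsing methods can be illustrated without any OSM file. The following is a minimal sketch of the documented selection rule only, not Pyrosm's implementation: the record layout (dicts with id, version and timestamp), the helper names, and the reading of an int as Unix seconds are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def to_utc(ts):
    """Normalize str | datetime | int to an aware UTC datetime.

    Assumption: an int is interpreted as Unix seconds.
    """
    if isinstance(ts, datetime):
        # Naive datetimes are taken to be UTC already.
        return ts.astimezone(timezone.utc) if ts.tzinfo else ts.replace(tzinfo=timezone.utc)
    if isinstance(ts, int):
        return datetime.fromtimestamp(ts, tz=timezone.utc)
    # A date-only string such as "2021-01-01" parses to midnight of that day.
    return to_utc(datetime.fromisoformat(ts))


def select_versions(versions, timestamp):
    """Keep, per element id, the newest version whose time is <= timestamp."""
    cutoff = to_utc(timestamp)
    selected = {}
    for v in versions:
        when = to_utc(v["timestamp"])
        if when > cutoff:
            continue  # newer than the requested moment: dropped
        best = selected.get(v["id"])
        if best is None or when > to_utc(best["timestamp"]):
            selected[v["id"]] = v
    return selected


# Hypothetical element history: three versions of element 1, one of element 2.
history = [
    {"id": 1, "version": 1, "timestamp": "2019-05-01 12:00:00"},
    {"id": 1, "version": 2, "timestamp": "2021-01-01 00:00:00"},
    {"id": 1, "version": 3, "timestamp": "2021-06-01 08:30:00"},
    {"id": 2, "version": 1, "timestamp": "2021-03-15 10:00:00"},
]
result = select_versions(history, "2021-01-01")
print({k: v["version"] for k, v in result.items()})  # {1: 2}
```

Element 1 resolves to version 2 because a record stamped exactly at the cutoff is kept, while element 2 is absent because its only version is newer than the cutoff.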
- get_data_by_custom_criteria(custom_filter, osm_keys_to_keep=None, filter_type='keep', tags_as_columns=None, keep_nodes=True, keep_ways=True, keep_relations=True, extra_attributes=None, timestamp=None)
Parse OSM data based on custom criteria.
- Parameters:
custom_filter (dict (required)) – A custom filter that selects only specific elements from OpenStreetMap.
osm_keys_to_keep (str | list) – A filter to specify which OSM keys should be kept.
filter_type (str) – Either “keep” or “exclude”: whether the filters are used to keep or to exclude the matching data from OSM.
tags_as_columns (list) – Which tags should be kept as columns in the resulting GeoDataFrame.
keep_nodes (bool) – Whether or not the nodes should be kept in the resulting GeoDataFrame if they are found.
keep_ways (bool) – Whether or not the ways should be kept in the resulting GeoDataFrame if they are found.
keep_relations (bool) – Whether or not the relations should be kept in the resulting GeoDataFrame if they are found.
extra_attributes (list (optional)) – Additional OSM tag keys that will be converted into columns in the resulting GeoDataFrame.
timestamp (str | datetime | int) –
If provided, data is returned as it was at the given moment in time. The time should be given in UTC. Note: This functionality works only with OSH.PBF files, which must be downloaded manually, e.g. from Geofabrik (requires logging in with an OSM account).
Selection logic: for each element, the latest version at or before the given timestamp is selected. Elements in the result can therefore be older than the given timestamp, but never newer (records with exactly the given timestamp are kept). If only a date is given, the time is taken as midnight of that day, e.g. “2021-01-01 00:00:00”.
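How custom_filter and filter_type interact can be sketched on plain dictionaries. This is an illustration of the documented meaning (“keep” retains matching elements, “exclude” drops them), not Pyrosm's own code; the functions matches and apply_filter, the element layout, and the convention that a value of True accepts any tag value for a key are assumptions for the example.

```python
def matches(tags, custom_filter):
    """True if any key in custom_filter matches the element's tags.

    A value of True accepts any tag value for that key; a list accepts
    only the listed values.
    """
    for key, wanted in custom_filter.items():
        if key not in tags:
            continue
        if wanted is True or tags[key] in wanted:
            return True
    return False


def apply_filter(elements, custom_filter, filter_type="keep"):
    """Keep or exclude the elements whose tags match custom_filter."""
    if filter_type not in ("keep", "exclude"):
        raise ValueError("filter_type must be 'keep' or 'exclude'")
    keep = filter_type == "keep"
    return [e for e in elements if matches(e["tags"], custom_filter) == keep]


# Hypothetical elements with their OSM tags.
elements = [
    {"id": 1, "tags": {"amenity": "cafe"}},
    {"id": 2, "tags": {"shop": "supermarket"}},
    {"id": 3, "tags": {"shop": "bakery"}},
    {"id": 4, "tags": {"highway": "residential"}},
]
custom_filter = {"amenity": True, "shop": ["supermarket", "books"]}

kept = [e["id"] for e in apply_filter(elements, custom_filter, "keep")]
excluded = [e["id"] for e in apply_filter(elements, custom_filter, "exclude")]
print(kept)      # [1, 2]
print(excluded)  # [3, 4]
```

The two results are complementary: every element ends up in exactly one of them, which is a quick sanity check when switching filter_type on real data.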
- get_landuse(custom_filter=None, extra_attributes=None, timestamp=None)
Parses landuse from OSM.
- Parameters:
custom_filter (dict) –
What kind of landuse to parse, see details below.
You can opt in to specific elements by using ‘custom_filter’. To keep only specific landuse types, such as ‘construction’ and ‘industrial’, apply a custom filter, which is a Python dictionary with the following format:
custom_filter={'landuse': ['construction', 'industrial']}
extra_attributes (list (optional)) – Additional OSM tag keys that will be converted into columns in the resulting GeoDataFrame.
timestamp (str | datetime | int) –
If provided, data is returned as it was at the given moment in time. The time should be given in UTC. Note: This functionality works only with OSH.PBF files, which must be downloaded manually, e.g. from Geofabrik (requires logging in with an OSM account).
Selection logic: for each element, the latest version at or before the given timestamp is selected. Elements in the result can therefore be older than the given timestamp, but never newer (records with exactly the given timestamp are kept). If only a date is given, the time is taken as midnight of that day, e.g. “2021-01-01 00:00:00”.
See also
Take a look at OSM documentation for further details about the data:
- get_natural(custom_filter=None, extra_attributes=None, timestamp=None)
Parses natural features from OSM.
- Parameters:
custom_filter (dict) –
What kind of natural features to parse, see details below.
You can opt in to specific elements by using ‘custom_filter’. To keep only specific natural features, such as ‘wood’ and ‘tree’, apply a custom filter, which is a Python dictionary with the following format:
custom_filter={'natural': ['wood', 'tree']}
extra_attributes (list (optional)) – Additional OSM tag keys that will be converted into columns in the resulting GeoDataFrame.
timestamp (str | datetime | int) –
If provided, data is returned as it was at the given moment in time. The time should be given in UTC. Note: This functionality works only with OSH.PBF files, which must be downloaded manually, e.g. from Geofabrik (requires logging in with an OSM account).
Selection logic: for each element, the latest version at or before the given timestamp is selected. Elements in the result can therefore be older than the given timestamp, but never newer (records with exactly the given timestamp are kept). If only a date is given, the time is taken as midnight of that day, e.g. “2021-01-01 00:00:00”.
See also
Take a look at OSM documentation for further details about the data:
- get_network(network_type='walking', extra_attributes=None, nodes=False, timestamp=None)
Parses street networks from OSM for walking, driving, and cycling.
- Parameters:
network_type (str) –
What kind of network to parse. Possible values are:
'walking'
'cycling'
'driving'
'driving+service'
'all'
extra_attributes (list (optional)) – Additional OSM tag keys that will be converted into columns in the resulting GeoDataFrame.
nodes (bool (default: False)) – If True, 1) the nodes associated with the network will be returned in addition to edges, and 2) every segment of a road constituting a way is parsed as a separate row (to enable full connectivity in the graph).
timestamp (str | datetime | int) –
If provided, data is returned as it was at the given moment in time. The time should be given in UTC. Note: This functionality works only with OSH.PBF files, which must be downloaded manually, e.g. from Geofabrik (requires logging in with an OSM account).
Selection logic: for each element, the latest version at or before the given timestamp is selected. Elements in the result can therefore be older than the given timestamp, but never newer (records with exactly the given timestamp are kept). If only a date is given, the time is taken as midnight of that day, e.g. “2021-01-01 00:00:00”.
- Returns:
gdf_edges or (gdf_nodes, gdf_edges)
- Return type:
geopandas.GeoDataFrame or tuple
See also
Take a look at the OSM documentation for further details about the data: https://wiki.openstreetmap.org/wiki/Key:highway
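Because get_network returns either a single GeoDataFrame of edges or a (gdf_nodes, gdf_edges) tuple depending on the nodes argument, calling code may want one uniform shape. The following is a small sketch of such a normalizing helper; unpack_network is a hypothetical name, and plain strings stand in for the GeoDataFrames so the snippet runs without Pyrosm installed.

```python
def unpack_network(result):
    """Return (nodes, edges); nodes is None when only edges were returned."""
    if isinstance(result, tuple):
        nodes, edges = result  # documented order: (gdf_nodes, gdf_edges)
        return nodes, edges
    return None, result


# Stand-ins for the two documented return shapes:
# osm.get_network(nodes=False) -> gdf_edges
# osm.get_network(nodes=True)  -> (gdf_nodes, gdf_edges)
edges_only = "edges"
nodes_and_edges = ("nodes", "edges")

print(unpack_network(edges_only))       # (None, 'edges')
print(unpack_network(nodes_and_edges))  # ('nodes', 'edges')
```

A helper like this keeps downstream code free of repeated type checks when the nodes flag is driven by configuration.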
- get_pois(custom_filter=None, extra_attributes=None, timestamp=None)
Parses Points of Interest (POIs) from OSM.
- Parameters:
custom_filter (dict) – An optional custom filter to filter only specific POIs from OpenStreetMap, see details below.
extra_attributes (list (optional)) – Additional OSM tag keys that will be converted into columns in the resulting GeoDataFrame.
timestamp (str | datetime | int) –
If provided, data is returned as it was at the given moment in time. The time should be given in UTC. Note: This functionality works only with OSH.PBF files, which must be downloaded manually, e.g. from Geofabrik (requires logging in with an OSM account).
Selection logic: for each element, the latest version at or before the given timestamp is selected. Elements in the result can therefore be older than the given timestamp, but never newer (records with exactly the given timestamp are kept). If only a date is given, the time is taken as midnight of that day, e.g. “2021-01-01 00:00:00”.
Notes
By default, Pyrosm parses all OSM elements (points, lines and polygons) that are associated with the following keys:
amenity
shop
tourism
You can opt out of or opt in to specific elements by using ‘custom_filter’. To parse only elements associated with specific tags, such as amenities, you can specify:
custom_filter={"amenity": True}
You can also combine multiple filters. For instance, you can parse all ‘amenity’ elements AND specific ‘shop’ elements, such as supermarkets and book stores, by specifying:
custom_filter={"amenity": True, "shop": ["supermarket", "books"]}
See also
You can check the most typical OSM tags for different map features from OSM Wiki https://wiki.openstreetmap.org/wiki/Map_Features. It is also possible to get a quick look at the most typical OSM tags from Pyrosm configuration:
>>> from pyrosm.config import Conf
>>> print("All available OSM keys", Conf.tags.available)
All available OSM keys ['aerialway', 'aeroway', 'amenity', 'building', 'craft', 'emergency', 'geological', 'highway', 'historic', 'landuse', 'leisure', 'natural', 'office', 'power', 'public_transport', 'railway', 'route', 'place', 'shop', 'tourism', 'waterway']
>>> print("Typical tags associated with tourism:", Conf.tags.tourism)
Typical tags associated with tourism: ['alpine_hut', 'apartment', 'aquarium', 'artwork', 'attraction', 'camp_pitch', 'camp_site', 'caravan_site', 'chalet', 'gallery', 'guest_house', 'hostel', 'hotel', 'information', 'motel', 'museum', 'picnic_site', 'theme_park', 'tourism', 'viewpoint', 'wilderness_hut', 'zoo']
- static to_graph(nodes, edges, graph_type='igraph', direction='oneway', from_id_col='u', to_id_col='v', edge_id_col='id', node_id_col='id', force_bidirectional=False, network_type=None, retain_all=False, osmnx_compatible=True, pandana_weights=['length'])
Export OSM network to routable graph. Supported output graph types are:
“igraph” (default),
“networkx”,
“pandana”
For walking and cycling, the output graph is bidirectional by default (i.e. travel along the street is allowed in both directions). For driving, one-way streets are taken into account by default, and travel is restricted according to the rules in the OSM data (based on the “oneway” attribute).
- Parameters:
nodes (GeoDataFrame) – GeoDataFrame containing nodes of the road network. Note: Use osm.get_network(nodes=True) to retrieve both the nodes and edges.
edges (GeoDataFrame) – GeoDataFrame containing the edges of the road network.
graph_type (str) –
Type of the output graph. Available graphs are:
"igraph" -> returns an igraph.Graph object.
"networkx" -> returns a networkx.MultiDiGraph object.
"pandana" -> returns a pandana.Network object.
direction (str) – Name of the column containing information about the allowed driving directions.
from_id_col (str) – Name for the column having the from-node-ids of edges.
to_id_col (str) – Name for the column having the to-node-ids of edges.
edge_id_col (str) – Name for the column having the unique id for edges.
node_id_col (str) – Name for the column having the unique id for nodes.
force_bidirectional (bool) – If True, all edges will be created as bidirectional (allowing travel in both directions).
network_type (str (optional)) – Network type for the given data. Determines how the graph will be constructed. The network type is typically extracted automatically from the metadata of the edges/nodes GeoDataFrames. This parameter can be used if that metadata is not available for one reason or another. By default, a bidirectional graph is created for walking, cycling and all, and a directed graph for driving (i.e. oneway streets are taken into account). Possible values are: ‘walking’, ‘cycling’, ‘driving’, ‘driving+service’, ‘all’.
retain_all (bool) – If True, return the entire graph even if it is not connected. Otherwise, retain only the connected edges.
osmnx_compatible (bool (default True)) – If True, modifies the edge and node attribute naming to be compatible with OSMnx (allows utilizing all OSMnx functionalities). NOTE: Only applicable with the “networkx” graph type.
pandana_weights (list) – Columns that are used as weights when exporting to Pandana graph. By default uses “length” column.
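The direction rules described for to_graph can be sketched on plain edge records. This is an illustration of the documented behaviour only, not the library's implementation: directed_pairs is a hypothetical helper, the edge rows are made up, and treating only the value "yes" as one-way is a simplifying assumption (real OSM data uses further values).

```python
def directed_pairs(edges, network_type="driving", direction="oneway",
                   from_id_col="u", to_id_col="v", force_bidirectional=False):
    """Expand edge rows into (from, to) pairs following the documented rules."""
    # Walking, cycling and 'all' are bidirectional by default.
    two_way_by_default = network_type in ("walking", "cycling", "all")
    pairs = []
    for row in edges:
        u, v = row[from_id_col], row[to_id_col]
        pairs.append((u, v))
        # Simplification: only "yes" marks a one-way edge.
        one_way = row.get(direction) == "yes"
        if force_bidirectional or two_way_by_default or not one_way:
            pairs.append((v, u))
    return pairs


# Hypothetical edges: one one-way street and one ordinary street.
edges = [
    {"id": 10, "u": 1, "v": 2, "oneway": "yes"},
    {"id": 11, "u": 2, "v": 3, "oneway": None},
]

print(directed_pairs(edges, network_type="driving"))
# [(1, 2), (2, 3), (3, 2)]
print(directed_pairs(edges, network_type="walking"))
# [(1, 2), (2, 1), (2, 3), (3, 2)]
```

With force_bidirectional=True the driving result matches the walking one, which mirrors the parameter description above.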