osmextract benchmark (companion to benchmarks.ipynb)

osmextract is an R package, so its parsing tasks run here, with an R kernel, rather than inside the Python notebook benchmarks.ipynb. This notebook fetches the Helsinki extract from BBBike using osmextract’s own oe_get(provider = "bbbike"); this is the same data the Python benchmark reads via get_data("helsinki"). It then runs the same two parsing tasks (buildings and roads) and writes osmextract_results.csv next to this notebook. benchmarks.ipynb loads that file and merges the osmextract rows into its results table and charts.

Run this notebook with an R kernel (see the installation snippet in benchmarks.ipynb). The comparison is kept fair in the same way as for the Python tools: only the oe_read() call is timed, and the reported time is the median over REPEATS runs. Setting force_vectortranslate = TRUE redoes the PBF → features conversion on every run instead of reading a cached GeoPackage; it is the osmextract analogue of QuackOSM’s ignore_cache.
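
The timing approach described above can be sketched in base R. This is a minimal illustration only, not the benchmark itself: time_median() and the sorting workload are hypothetical stand-ins for the timed oe_read() call.

```r
# Hypothetical helper: run `fn` `repeats` times and return the median elapsed seconds.
time_median <- function(fn, repeats = 3L) {
    times <- numeric(repeats)
    for (i in seq_len(repeats)) {
        # Only the call inside system.time() is measured; any setup stays outside.
        times[i] <- system.time(fn())[["elapsed"]]
    }
    median(times)
}

# Stand-in workload (an assumption; the notebook times oe_read() instead).
workload <- function() sort(runif(1e5))
med <- time_median(workload, repeats = 3L)
cat("median elapsed seconds:", med, "\n")
```

Reporting the median rather than the mean makes the result less sensitive to a single unusually slow run, for example one slowed down by a cold disk cache.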

# Same input as the Python benchmark (get_data("helsinki")): the latest Helsinki extract from
# BBBike. osmextract fetches it itself via oe_get(provider = "bbbike") -- one of its core
# capabilities. download_only + skip_vectortranslate just download the .osm.pbf and return its
# path, and the timed reads below re-parse that file.
suppressPackageStartupMessages(library(osmextract))
REPEATS <- 3L
options(timeout = max(600, getOption("timeout")))  # the ~62 MB download can exceed R's 60 s default

pbf <- oe_get("Helsinki", provider = "bbbike",
              download_only = TRUE, skip_vectortranslate = TRUE, quiet = FALSE)

cat("osmextract", as.character(packageVersion("osmextract")),
    "| sf", as.character(packageVersion("sf")), "\nPBF:", pbf, "\n")
The input place was matched with: Helsinki

Downloading the OSM extract:
  |======================================================================| 100%
File downloaded!
osmextract 0.6.0 | sf 1.1.0 
PBF: /Users/tenkanh2/Library/Application Support/org.R-project.R/R/osmextract/bbbike_Helsinki.osm.pbf 
# Time oe_read() of one PBF layer, filtered to a tag, and report the median + count.
bench_oe <- function(task, layer, tagkey) {
    q <- sprintf('SELECT * FROM "%s" WHERE %s IS NOT NULL', layer, tagkey)
    tryCatch({
        times <- numeric(REPEATS); n <- 0L
        for (i in seq_len(REPEATS)) {
            el <- system.time({
                x <- oe_read(pbf, layer = layer, query = q,
                             force_vectortranslate = TRUE, quiet = TRUE)
            })[["elapsed"]]
            times[i] <- el; n <- nrow(x)
        }
        data.frame(task = task, tool = "osmextract",
                   seconds = round(median(times), 2), features = n,
                   status = "ok", stringsAsFactors = FALSE)
    }, error = function(e) {
        data.frame(task = task, tool = "osmextract", seconds = NA_real_,
                   features = NA_integer_,
                   status = paste("error:", conditionMessage(e)),
                   stringsAsFactors = FALSE)
    })
}
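
To make the filter concrete, the snippet below prints the query strings that the sprintf() template in bench_oe() produces for the two tasks. build_query() is a hypothetical helper introduced only for this illustration; it needs no packages. Such a filter presumably only works when the tag key is exposed as a column of the chosen layer.

```r
# Hypothetical helper that mirrors the sprintf() template used in bench_oe().
build_query <- function(layer, tagkey) {
    sprintf('SELECT * FROM "%s" WHERE %s IS NOT NULL', layer, tagkey)
}

q_buildings <- build_query("multipolygons", "building")
q_roads     <- build_query("lines", "highway")
cat(q_buildings, q_roads, sep = "\n")
# SELECT * FROM "multipolygons" WHERE building IS NOT NULL
# SELECT * FROM "lines" WHERE highway IS NOT NULL
```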
# Buildings = the multipolygons layer; roads = the lines layer (GDAL's OSM driver).
results <- rbind(
    bench_oe("buildings", "multipolygons", "building"),
    bench_oe("roads",     "lines",         "highway")
)
print(results)
write.csv(results, "osmextract_results.csv", row.names = FALSE)
cat("\nWrote osmextract_results.csv -- now re-run the \"Results\" section of",
    "benchmarks.ipynb to include osmextract.\n")
Warning message in get_default_osmconf_ini():
“The package couldn't retrieve the osmconf.ini from GDAL installation. Defaulting to the one bundled in this package. Please raise a new issue at https://github.com/ropensci/osmextract/issues”
Warning message in CPL_gdalvectortranslate(source, destination, options, oo, doo, :
“GDAL Message 1: Non closed ring detected. To avoid accepting it, set the OGR_GEOMETRY_ACCEPT_UNCLOSED_RING configuration option to NO”
Warning message in get_default_osmconf_ini():
“The package couldn't retrieve the osmconf.ini from GDAL installation. Defaulting to the one bundled in this package. Please raise a new issue at https://github.com/ropensci/osmextract/issues”
Warning message in CPL_gdalvectortranslate(source, destination, options, oo, doo, :
“GDAL Message 1: Non closed ring detected. To avoid accepting it, set the OGR_GEOMETRY_ACCEPT_UNCLOSED_RING configuration option to NO”
Warning message in get_default_osmconf_ini():
“The package couldn't retrieve the osmconf.ini from GDAL installation. Defaulting to the one bundled in this package. Please raise a new issue at https://github.com/ropensci/osmextract/issues”
Warning message in CPL_gdalvectortranslate(source, destination, options, oo, doo, :
“GDAL Message 1: Non closed ring detected. To avoid accepting it, set the OGR_GEOMETRY_ACCEPT_UNCLOSED_RING configuration option to NO”
Warning message in get_default_osmconf_ini():
“The package couldn't retrieve the osmconf.ini from GDAL installation. Defaulting to the one bundled in this package. Please raise a new issue at https://github.com/ropensci/osmextract/issues”
Warning message in get_default_osmconf_ini():
“The package couldn't retrieve the osmconf.ini from GDAL installation. Defaulting to the one bundled in this package. Please raise a new issue at https://github.com/ropensci/osmextract/issues”
Warning message in get_default_osmconf_ini():
“The package couldn't retrieve the osmconf.ini from GDAL installation. Defaulting to the one bundled in this package. Please raise a new issue at https://github.com/ropensci/osmextract/issues”
       task       tool seconds features status
1 buildings osmextract    3.94   176843     ok
2     roads osmextract    3.11   296667     ok

Wrote osmextract_results.csv -- now re-run the "Results" section of benchmarks.ipynb to include osmextract.
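
Before re-running the Python notebook, it may be worth checking that the CSV round-trips cleanly. The sketch below rebuilds the two rows printed above and writes them to a temporary file (an assumption made for illustration, so the real osmextract_results.csv is not touched). It then reads the file back and checks the column names. Which columns benchmarks.ipynb actually relies on is not shown here, so the expected set is simply the one written by this notebook.

```r
# Rows copied from the results printed above.
results <- data.frame(
    task = c("buildings", "roads"),
    tool = "osmextract",
    seconds = c(3.94, 3.11),
    features = c(176843L, 296667L),
    status = "ok",
    stringsAsFactors = FALSE
)

# Write to a temporary path instead of the real results file.
path <- tempfile(fileext = ".csv")
write.csv(results, path, row.names = FALSE)

# Read the file back and verify the schema and the status column.
check <- read.csv(path, stringsAsFactors = FALSE)
expected_cols <- c("task", "tool", "seconds", "features", "status")
stopifnot(identical(names(check), expected_cols))
stopifnot(all(check$status == "ok"), nrow(check) == 2L)
print(check)
```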