New video: 5 Minute Tutorial - Connect to an OSM Vector Server and Harvest Data
Dimitri


6,276 post(s)
#13-Nov-20 12:03

There's a new, short video in the Videos page's 5 minute tutorial series.

When you want to use OpenStreetMap data, it's easy to fall into the rut of downloading an entire country's worth of PBF data from geofabrik.de and importing that, but another way to get OSM data is to use the Webserver: osm vector data source. If you just want coverage for a smaller area, like a town, this is a great way to get it.
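For anyone curious what "coverage for a smaller area" can look like at the request level, here is a minimal sketch that builds a bounding-box request for the public OpenStreetMap API (version 0.6). The thread does not say which endpoint Manifold's dataport actually uses, so treat the endpoint choice, the function name and the sample coordinates as assumptions for illustration only. The 0.25 square degree cap is the limit I understand the public API to advertise; check the server's capabilities before relying on it.

```python
from urllib.parse import urlencode

# Public OSM API endpoint for fetching all data inside a bounding box.
OSM_MAP_ENDPOINT = "https://api.openstreetmap.org/api/0.6/map"

# Assumed maximum request area in square degrees (verify against the server).
MAX_AREA_SQ_DEG = 0.25


def osm_bbox_url(left: float, bottom: float, right: float, top: float) -> str:
    """Build a bounding-box URL; coordinates are longitude/latitude in degrees."""
    if not (left < right and bottom < top):
        raise ValueError("expected left < right and bottom < top")
    area = (right - left) * (top - bottom)
    if area > MAX_AREA_SQ_DEG:
        raise ValueError(f"bounding box too large: {area:.4f} square degrees")
    bbox = f"{left},{bottom},{right},{top}"
    # Keep commas readable instead of percent-encoding them.
    return f"{OSM_MAP_ENDPOINT}?{urlencode({'bbox': bbox}, safe=',')}"


if __name__ == "__main__":
    # Arbitrary town-sized box, chosen only as an example.
    print(osm_bbox_url(147.3, -42.9, 147.35, -42.85))
```

The point of the sketch is only that a town-sized box is a small, bounded request, while a country-sized box is rejected up front.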

This is the video version of one of several new topics showing examples for data sources, such as working with the new ArcGIS REST vector dataport to grab vector data from ArcGIS feature servers.

RonHendrickson
270 post(s)
#13-Nov-20 16:01

Nice video, short and sweet and very handy to get smaller versions of datasets.

antoniocarlos

551 post(s)
#13-Nov-20 20:24

Good example! Is there a reason why copying a drawing pastes a table, from which you then have to create a drawing as a second step?


How soon?

Dimitri


6,276 post(s)
#14-Nov-20 01:27

The video doesn't copy a drawing. If it copied a drawing, then pasting into the main part of the project would paste a drawing.

The video copies selected objects, which are records in a table. Pasting those into a drawing will paste new objects into that drawing. Pasting them into a table pastes new records into the table. Pasting into the project pane gets you a table.

It might be a nice extension if a copy done from within a table pasted a table, while a copy done from a drawing pasted both a table and a drawing, with the existing style and coordinate system.

antoniocarlos

551 post(s)
#14-Nov-20 01:49

That is what I meant.


How soon?

dchall8
775 post(s)
#14-Nov-20 05:29

That is too much fun! Thanks for the example. Now I have to go see what the REST dataport has to offer.

I have to wonder, though, where does OSM get their data? Most of the buildings in my small town are misplaced to the southeast, but right in the middle of just about any location, there are buildings which are spot on. Some buildings are missing. Some are extra buildings. Some of the newer buildings are there and others are not. Looking at the timestamps versus accuracy, I found a building under construction adjacent to two buildings that are roughly 75 years old. The new building looks perfect while the two adjacent are displaced by 10+ feet, yet the timestamps for the three buildings are 1 month apart. The same user and usernumber are listed against all three.

This is just a curiosity. I'm not looking here for answers.

brwalker
9 post(s)
#16-Nov-20 04:29

I would like to use a workflow similar to the one described in the video for scraping subsets of OSM vector data. However, using a REST dataport, I cannot find a workflow that easily allows a small subset of data to be scraped to a local drawing.

I am using arcgisrest to link to https://services.thelist.tas.gov.au/arcgis/rest/services/Public/CadastreParcels/MapServer

My workflow is to

  1. Connect to above mapserver
  2. Add linked drawing to map
  3. Zoom in to my area of interest (about 100 cadastre parcels)
  4. Select these using mouse and Edit/Copy

At this point things get very slow. It would seem that M9 wants to download all 420K+ records from the mapserver to complete this action. Is there any way to speed this up?

Dimitri


6,276 post(s)
#16-Nov-20 11:04

See the Example: Vector Layers from an ArcGIS REST Feature Server topic for background, since it has a similar workflow.

Feature servers tend to be very slow. The way to bypass that slowness is to bite the bullet and to scrape the entire table and drawing from the server into a local table and drawing. You can then do all your selections and such locally without waiting around to see what the server does.

Working with data left on the server to do any sort of analytics, even something as simple as selection, can put slowness on top of slowness, because that asks the server to do more.

When you want to select just a subset of items, Manifold sends a query to the server to find objects matching the selection criterion, like where you clicked or the bounding box of a selection marquee. It's not just a matter of looking at what is in cache, because there might have been a change, or there might be some overlap, or the server might have choked and not sent everything (internally timed out) and so on.
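To make the idea of a bounding-box query concrete, here is a rough sketch of how such a request can be expressed against an ArcGIS REST layer's query operation. This is not necessarily what Manifold sends: the function name, the layer number 0, the sample coordinates and the assumption that the box is given in EPSG:4326 are all illustrative. The parameter names (geometry, geometryType, spatialRel, inSR, outFields, returnGeometry, f) are standard ArcGIS REST query parameters.

```python
from urllib.parse import urlencode


def arcgis_envelope_query_url(service_url: str, layer_id: int,
                              xmin: float, ymin: float,
                              xmax: float, ymax: float,
                              in_sr: int = 4326) -> str:
    """Build a layer query URL selecting features that intersect an envelope."""
    params = {
        "geometry": f"{xmin},{ymin},{xmax},{ymax}",   # envelope as xmin,ymin,xmax,ymax
        "geometryType": "esriGeometryEnvelope",
        "spatialRel": "esriSpatialRelIntersects",
        "inSR": in_sr,                                 # assumed coordinate system of the box
        "outFields": "*",                              # all attribute fields
        "returnGeometry": "true",
        "f": "json",
    }
    base = service_url.rstrip("/")
    # Keep commas and the asterisk readable instead of percent-encoding them.
    return f"{base}/{layer_id}/query?{urlencode(params, safe=',*')}"


if __name__ == "__main__":
    # Layer 0 and the box are hypothetical; check the service page for real layer ids.
    print(arcgis_envelope_query_url(
        "https://services.thelist.tas.gov.au/arcgis/rest/services/Public/CadastreParcels/MapServer",
        0, 147.3, -42.9, 147.35, -42.85))
```

Whether a request like this comes back quickly depends entirely on the server: it still has to run the spatial filter over the whole layer before returning the handful of matching records.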

Take a look at the log window, and you might see some surprising results. For example, I connected to the server in the Example topic above, zoomed into a smaller area, and ctrl-clicked one area to select it. That took a long time to select. I then chose Edit - Copy to copy it, and it took forever to get that copy of just one record... why? Because Manifold sent a query to the server for what might have been under that clicked location (could have been more than one area there...), and the server timed out or failed at the query.

Looking at the log window, I saw:

2020-11-16 11:41:28 *** (root)::[kansas]::[Current Kansas Field Production] (Query) The remote server returned an error: (504) Gateway Timeout.

2020-11-16 11:43:33 *** (root)::[kansas]::[Current Kansas Field Production] (Query) The remote server returned an error: (504) Gateway Timeout.

2020-11-16 11:43:54 *** (root)::[kansas] Error (400): Unable to complete operation. ["Unable to perform query. "]

2020-11-16 11:43:54 Render: [Map] (20.607 sec)

2020-11-16 11:45:35 *** (root)::[kansas]::[Current Kansas Field Production] (Query) The remote server returned an error: (504) Gateway Timeout.

2020-11-16 11:45:35 -- Copy: [kansas]::[Current Kansas Field Production Drawing] (1 record(s), 94.533 sec)

Errors like "(504) Gateway Timeout" are entirely server-side. Something within the ESRI server infrastructure timed out.
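If you script requests to such a server yourself, outside Manifold, one common way to cope with intermittent 504 responses is to retry with an increasing delay. The sketch below is only that, a sketch: the function name, the attempt count and the delays are arbitrary assumptions, and it says nothing about how Manifold handles timeouts internally. The caller supplies `fetch`, any zero-argument function that performs the request and raises `urllib.error.HTTPError` on failure.

```python
import time
import urllib.error


def fetch_with_retry(fetch, attempts: int = 3, base_delay: float = 2.0,
                     sleep=time.sleep):
    """Call fetch(), retrying only on HTTP 504 with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch()
        except urllib.error.HTTPError as err:
            # Re-raise anything that is not a gateway timeout, and the final failure.
            if err.code != 504 or attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, then 4x, ... before the next try.
            sleep(base_delay * 2 ** attempt)
```

Retrying does not make a slow server faster. It only helps when the timeout is occasional rather than the result of a query the server can never finish in time.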

The log window mentions a query because copying selected items issues a new query: things might have changed since the selection, so Manifold makes sure to get the latest version of each object from the server.

I don't know if it makes sense to have a mode where copies and such come only out of whatever is in local cache: one important reason to use servers is that they can provide real time data on what is in the layer (if that is not important, it's easier to just download the data without messing with a server). Looking just at cache is not getting real time data. But then again, doing a copy/paste and thereafter working locally is also not doing that.

brwalker
9 post(s)
#16-Nov-20 22:11

Thanks for the explanation, Dimitri. Re your "I don't know if it makes sense to have a mode where copies and such come only out of whatever is in local cache":

I think it makes a lot of sense to have a mode that allows copies to come from cache. It seems I have two options to obtain the small subset of data I require. First, I can scrape the whole layer, which means a very long download of 420K records to get the 100 or so I want. Second, the data supplier (government based) also supplies regionalised static data for download. This is much quicker to download, but the problem is that they cannot guarantee the download includes the most up-to-date information as at the download date.
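A back-of-envelope sketch of why the full scrape is so slow: ArcGIS REST services return query results in pages, capped by the service's maxRecordCount, so a client typically walks the layer with resultOffset and resultRecordCount. The cap of 1000 used below is an assumed value, not one read from this server; the real figure is listed in the service's metadata.

```python
import math


def requests_needed(record_count: int, max_record_count: int = 1000) -> int:
    """Number of paged requests needed to fetch record_count records."""
    if record_count < 0 or max_record_count <= 0:
        raise ValueError("record_count must be >= 0 and max_record_count > 0")
    return math.ceil(record_count / max_record_count)


def page_offsets(record_count: int, max_record_count: int = 1000) -> list[int]:
    """resultOffset values a client would step through, one per request."""
    return list(range(0, record_count, max_record_count))


if __name__ == "__main__":
    print(requests_needed(420_000))  # whole layer, under the assumed cap
    print(requests_needed(100))      # the subset actually wanted
```

Under that assumed cap, the whole layer takes 420 sequential requests, while the 100 parcels of interest would fit in a single one.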

The data I need is being used to prepare a static map product. It is accepted that the data might change after its date of production. The important thing is that the most up-to-date data is used for its production in the first place. Hence my desire to pull the data from the mapserver.

I find it very interesting that the video shows that connecting via an OSM dataport does allow small subsets of data to be scraped from a huge dataset, whereas the REST service does not. I would be interested to know whether this is a feature of the mapserver or of the way Manifold is requesting the data.

Manifold User Community Use Agreement Copyright (C) 2007-2019 Manifold Software Limited. All rights reserved.