Can Manifold substitute for SQL utilities
Forest
578 post(s)
#26-Apr-18 00:38

There are a whole bunch of SQL utilities that I have always wanted but being of limited funds have held off on. These include things like:

  • SQL Razor;
  • SQL Maestro; and
  • Navicat.

I am ms-access based and will be for some time, due to the time it will take to replace existing deep functionality. Therefore I need to shuffle info mainly from ms-access to other formats.

The problem that prompts this question is that I have spatial point data from iTracker and will soon have data from Survey123. The first thing I have noticed is that much of the data is in flat tables when it would be better represented as a relational hierarchy, and I am currently figuring out how to generate a set of related tables from flat tables. As much of the data is in csv or excel format, creating custom importers in ms-access is straightforward. Much of the logic required will be for ensuring that the same data is not imported twice. I think ms-access is still the best tool for this work.

At some point, I would like to shift the data to something more industrial. Postgres and SQL-Server come to mind. SQL-Server has the advantage of the upsizing wizard in ms-access, which does the data conversion. However, I would like to work with open source systems if possible. Manifold's own file system may also be a good recipient of such data. There is, however, one consideration that leads me away from a Manifold-based solution: I would like to create a system that allows data to be value-added after collection. Most of my data has geotagged photos, and it is common to revisit them and add notes, as photos record things other than the subject they were taken for. A photo of a bird on a riverbank also records the condition of the riverbank. With ms-access, it is not hard at all to drag and drop a photo into a form to create a new link and to add new attribute data. This is the functionality that I would like to recreate. To date, visual studio has defeated me, so I am looking at GeoDjango. Any thoughts?

Forest

Dimitri

5,036 post(s)
#26-Apr-18 06:36

What helps most is to say what, specifically, you would do with SQL Razor, SQL Maestro and Navicat that you cannot do now in SQL in 9. Even just a list of the top five things you would want is very helpful.

Most of my data has geotagged photos and it is common to revisit them and add notes as photos record things other than the subject they were taken for.

I've long been an advocate of enabling a new data type in tables, where each record could have a raster field (I'll call it a "photo" field for clarity in this post). You could then have a photo in each record instead of having an image per table with each record being a tile in that image. After all, if a single record can efficiently hold 2GB of coordinates for a geom, or a single tile can easily be 1024 x 1024 pixels, the storage architecture could easily handle a photo per record, even a "big" photo megabytes in size.

We've not done that to date because there is much work to be done to fully articulate what is already there. Much more important is filling out all the vector editing capabilities, all the basic raster editing and manipulation, merging in the super-fast LiDAR structures, and such. Only then could we, I think, peel off some time to do "photo per record."

The issue with that is not so much the infrastructure, although that certainly requires work; it is all the accessory facilities that people will instantly require to work with a new photo-per-record type. For example, you want the ability to browse, to display, to edit, and to batch load and unload from many different formats and storage (traverse folder hierarchies, harvest all the usual tags, maintain orientation, etc.). None of that is rocket science, but there is a lot of it.

You can also easily spend a lot of time on things not central to the Manifold community. For example, just yesterday I complained about the truly awful GUI for most audio players. You have Windows stuff insisting it wants to show and order my mp3 files by their track name and by tags that Windows thinks I should prioritize, not by the file names by which they are known across dozens of devices. I found myself complaining to a friend that I wished Manifold had an audio type so we could store an MP3 per record and then write a simple app to display and to manage those as desired.

That's nonsense, of course, but not complete nonsense as with web-based interfaces one can think of casting all sorts of data into geographic context. As you walk about Paris and you get near a point of interest your smart phone Tourist app starts telling you about the scene at which you point it, or as you are driving down highway 5 in the central valley of California, just wondering if you can hold out until the famous In-N-Out about halfway between the Bay area and LA, your tablet starts telling you the history of oil discoveries, what all those pumps are in the fields, a discussion of the San Andreas fault, or whatever.

Or, a utility crew is working at a site where there is a break in the water main and the supervisor calls up a video shot on cell phone by the crew that last dug up that spot.

If you have rich media associated with spatial coordinates, all that is possible, and even easy.

steveFitz

213 post(s)
#26-Apr-18 08:07

Forest,

The first thing I have noticed is that much of the data is in flat tables when it would be better represented as a relational hierarchy, and I am currently figuring out how to generate a set of related tables from flat tables. As much of the data is in csv or excel format, creating custom importers in ms-access is straightforward. Much of the logic required will be for ensuring that the same data is not imported twice. I think ms-access is still the best tool for this work.

As previously discovered, we have similar workflows. My main field data table is flat (the dbf part of a shapefile), with data being imported into Access and split into 2 tables via a collection of Update, Append and Insert queries. This approach of splitting the table data into 2 tables should work for any database, I would imagine. The queries check keys (usually a location site name and ID number compound key) and insert new records in the master table if they don't exist, then check another compound key that includes the date, inserting new inspection records and skipping records where the key already exists in the Access table. This gives an object (one tree in my case) to inspections (many annual inspections) relationship.

If you don't already have a handle on entity relationships then Google the term, read up, and start drawing diagrams of your ideal data collection set up. If you haven't already, then also read up on database normal form and try to apply some of those rules in your design. I think some of the software you mention would have ER diagram utilities to help create your db relationships.
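The key-checked import Steve describes translates to any SQL engine. Here is a minimal sketch in Python/SQLite; the table and column names (trees, inspections, site, tree_id) are invented for illustration, not the actual Access schema:

```python
import sqlite3

# Hypothetical schema: a master table keyed by (site, id) and an
# inspections table keyed by (site, id, date), as in Steve's workflow.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE trees (
        site TEXT NOT NULL,
        tree_id INTEGER NOT NULL,
        species TEXT,
        PRIMARY KEY (site, tree_id)                  -- compound key: site + ID
    );
    CREATE TABLE inspections (
        site TEXT NOT NULL,
        tree_id INTEGER NOT NULL,
        inspected_on TEXT NOT NULL,
        condition TEXT,
        PRIMARY KEY (site, tree_id, inspected_on),   -- same key plus the date
        FOREIGN KEY (site, tree_id) REFERENCES trees (site, tree_id)
    );
""")

def import_rows(rows):
    """Split flat rows into master and inspection tables, skipping any
    record whose compound key has already been imported."""
    for r in rows:
        con.execute(
            "INSERT OR IGNORE INTO trees (site, tree_id, species) "
            "VALUES (?, ?, ?)",
            (r["site"], r["tree_id"], r["species"]))
        con.execute(
            "INSERT OR IGNORE INTO inspections "
            "(site, tree_id, inspected_on, condition) VALUES (?, ?, ?, ?)",
            (r["site"], r["tree_id"], r["date"], r["condition"]))
    con.commit()

# The same flat record arriving twice, e.g. from two csv exports:
flat = [
    {"site": "Riverbank", "tree_id": 1, "species": "E. camaldulensis",
     "date": "2018-04-26", "condition": "good"},
    {"site": "Riverbank", "tree_id": 1, "species": "E. camaldulensis",
     "date": "2018-04-26", "condition": "good"},
]
import_rows(flat)   # second copy is silently skipped by the key check
```

`INSERT OR IGNORE` is SQLite's shorthand for "skip if the key already exists"; in Access the same effect comes from the key-checking Append queries described above.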

A second table (Photos.dbf in my case) allows many related photo names to be uploaded to Access also. Not sure if a related table can be set up with Survey 123 (?).

Most of my data has geotagged photos and it is common to revisit them and add notes as photos record things other than the subject they were taken for. A photo of a bird on a riverbank also records the condition of the riverbank.

I prefer to keep my photos in folders, with just a record of the path in the db. This avoids having to manage massive multi-GB dbs.

It sounds like you are describing the classic many-to-many relationship I mentioned in the data collection tool post previously. One photo could have one or many related records (e.g. [PHOTOS] (PhotoID, Date, Location) > [Wildlife] (Species, Notes), [Environment] (Notes) etc.) in tables, and each location record could have one or many photos. You would make as many tables as you think there are things (or entities) of interest in your photos. Related keys and foreign keys should make it all relate and work, as long as you have the key in each new record to relate each new photo or note.

Using my data collection tool (I won't mention its name again since I'm not an ESRI rep!) I can collect many-to-many relation data easily out in the field (one record - one or a billion related photos, or until I reach the 2GB shapefile limit!). I do admit, though, that my Access set up is not a true many-to-many relationship, as it doesn't use intermediate tables; it just repeats data (the photo number) and as such is not really efficient (it breaks normal form).
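For readers who want the true many-to-many version, an intermediate (junction) table removes the repeated data. A minimal Python/SQLite sketch, with all table and column names invented for the example:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE photos (
        photo_id INTEGER PRIMARY KEY,
        path TEXT,            -- path on disk, not a blob, keeping the db small
        taken_on TEXT,
        location TEXT
    );
    CREATE TABLE observations (
        obs_id INTEGER PRIMARY KEY,
        kind TEXT,            -- e.g. 'wildlife' or 'environment'
        notes TEXT
    );
    -- Junction table: each row links one photo to one observation, so
    -- nothing is repeated in either side table (normal form preserved).
    CREATE TABLE photo_observations (
        photo_id INTEGER REFERENCES photos (photo_id),
        obs_id   INTEGER REFERENCES observations (obs_id),
        PRIMARY KEY (photo_id, obs_id)
    );
""")

# One photo, revisited later to record two different things it captured.
con.execute("INSERT INTO photos VALUES (1, 'P1010001.jpg', '2018-04-26', 'riverbank')")
con.execute("INSERT INTO observations VALUES (1, 'wildlife', 'bird on bank')")
con.execute("INSERT INTO observations VALUES (2, 'environment', 'bank eroding')")
con.executemany("INSERT INTO photo_observations VALUES (?, ?)", [(1, 1), (1, 2)])

# All observations attached to photo 1, resolved through the junction table:
rows = con.execute("""
    SELECT o.kind, o.notes
    FROM observations o
    JOIN photo_observations j ON j.obs_id = o.obs_id
    WHERE j.photo_id = 1
""").fetchall()
```

The same junction-table pattern works unchanged in Access, Postgres or SQL Server; only the dialect details differ.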

Forest
578 post(s)
#27-Apr-18 01:08

@Dimitri,

I have built my own ms-access based information system. The backend is about 15 MB uncompressed. It stores attribute data on about 300 K photos and videos and has fully attributed records for about 10 000 photos. Broadly, it is quite similar to iNaturalist in data model and capabilities. My photo collection was also indexed by ACDSee, which seems to have a FoxPro engine. They put thumbnails as blobs into the database. The result is that their database is over 30 GB, which is a major disincentive to backing up the data. I have also had long periods (hours) when ACDSee has been reconciling its internal database with my hard drive. The last event was when I was cloning a hard drive and a windows update started running in the background; the two processes clashed and corrupted the hard drive. The filing system eventually rebuilt itself, but then ACDSee took hours to recover. My ms-access based solution survived all of these events. I do like the idea of what you said, but it has implications for backing up and for recovery from disasters.

@SteveFitz

I have also used a many-to-many data model. I was working on transmission line projects and developed an application that had a stack of five many-to-many relationships:

  • One land parcel -> many owners; one owner -> many land parcels.
  • One land parcel -> many transmission lines (near substations); one transmission line -> many land parcels.
  • (Via an intermediate table) one land parcel/transmission line -> many towers and spans; one tower and span ahead -> many land parcels.
  • One tower and span ahead -> many issues (river crossings, houses with kids, weak bridges with load limits, historical structures, etc.); one issue -> many towers and spans (crossing organic farms or endangered vegetation).

You can start to see how there is a tangle of many-to-many relationships. I now have a standard form design that makes all of this easy. It is ms-access based but could just as easily be done in visual studio. I backended the data to SQL Server but never rewrote the front end.
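The parcel/owner layer of such a stack illustrates how joins untangle it: a junction table holds the pairings, and a two-join query walks from one side to the other. An illustrative Python/SQLite sketch (all names and data invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE parcels (parcel_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE owners  (owner_id  INTEGER PRIMARY KEY, name TEXT);
    -- Junction table: many owners per parcel, many parcels per owner.
    CREATE TABLE parcel_owners (
        parcel_id INTEGER REFERENCES parcels (parcel_id),
        owner_id  INTEGER REFERENCES owners (owner_id),
        PRIMARY KEY (parcel_id, owner_id)
    );
""")

con.executemany("INSERT INTO parcels VALUES (?, ?)", [(1, "Lot 7"), (2, "Lot 8")])
con.executemany("INSERT INTO owners VALUES (?, ?)", [(1, "Smith"), (2, "Jones")])
# Smith owns both lots; Lot 7 has two owners.
con.executemany("INSERT INTO parcel_owners VALUES (?, ?)",
                [(1, 1), (1, 2), (2, 1)])

# All parcels owned by Smith, resolved through the junction table:
smith = [r[0] for r in con.execute("""
    SELECT p.name
    FROM parcels p
    JOIN parcel_owners po ON po.parcel_id = p.parcel_id
    JOIN owners o         ON o.owner_id   = po.owner_id
    WHERE o.name = 'Smith'
    ORDER BY p.name
""")]
```

Each further layer of the stack (parcels to lines, lines to towers, towers to issues) just adds another junction table and another pair of joins to the query.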

It should also be clear to other readers why flat GIS data tables would be very inefficient in such a situation.

steveFitz

213 post(s)
#27-Apr-18 08:44

Sounds like you got it wrapped up Forest. Good luck!

Forest
578 post(s)
#30-Apr-18 01:22

Hi Steve,

If I am working with point data that flows into a table in a database, I do have it wrapped up. And I am happy to share solutions. I certainly do not have it wrapped up with field data collection through GIS-based online data collection tools.

Thinking about the tools in front of me, the Arc-based tools Survey123 and Collector, they seem to offer only one key service that I would otherwise find hard to duplicate. That service is security, where different people have different roles with respect to editing the same data, e.g. a surveyor can locate an item and an engineer can assess its serviceability. Unfortunately, security is an issue where I work, as I handle culturally sensitive data. Deployment of data may also be easier with Arc-online, but I imagine I can do this in other ways. I will start a new thread for that.

Thanks for all of the ideas you have put in.

Manifold User Community Use Agreement Copyright (C) 2007-2017 Manifold Software Limited. All rights reserved.