
8,579 post(s)
#31-Oct-17 16:55

SHA256: 5e331b285731da19acbd166811ef45a9c0ff597b9e440c66dd67a28ad83c2e93

SHA256: 1ae4ebcdbcac0d66085a0d22d752d66aad123b5cd8fd6cd8fb93c7d81728b72c


(Fix) Resizing the map window correctly updates overlay for object shown in the Record pane.

(Fix) Deleting object shown in the Record pane clears the Record pane.

Exporting data with EPSG coordinate system exports coordinate system data as EPSG code without expanding it to the coordinate system definition.

State Plane coordinate systems are merged into a single roster supporting different representations for different dataports with many holes found and fixed.

GDAL dataport allows working with GDAL 2.0.x, 2.1.x or 2.2.x and automatically selects the latest version available with automatic adjustments for the call interface.
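To illustrate what "selects the latest version available" can mean in practice, here is a minimal Python sketch. It is not the dataport's actual logic: `SUPPORTED_SERIES`, `parse_version` and `pick_latest` are hypothetical names, and the list of installed versions is assumed to be supplied by the caller.

```python
from typing import Iterable, Optional, Tuple

# Series named in the note above: 2.0.x, 2.1.x, 2.2.x.
SUPPORTED_SERIES = {(2, 0), (2, 1), (2, 2)}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '2.1.3' into (2, 1, 3) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def pick_latest(installed: Iterable[str]) -> Optional[str]:
    """Return the newest installed version from a supported series, or None."""
    candidates = [v for v in installed if parse_version(v)[:2] in SUPPORTED_SERIES]
    return max(candidates, key=parse_version, default=None)


print(pick_latest(["2.0.3", "2.2.1", "2.1.4", "1.11.5"]))
```

Comparing parsed tuples rather than raw strings matters once a component reaches two digits, since "2.2.10" sorts before "2.2.9" as text.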

The command window can be saved as a query using Edit - Save as Query. Invoking the command creates a new query component, copies the current text from the command window, then selects the created query in the Project pane and puts the focus into the Project pane list so that pressing Enter opens the new component.

A virtual map created by opening a drawing / image / labels and dropping other components into the window from the Project pane can be saved as a persistent map using Edit - Save as Map.

The Component pane displays the number of records in a table if it is available. Tables that make the number of records available are: all tables in MAP files, tables in file data sources with some fields cached in .MAPCACHE files, tables in remote databases. (We are working to extend the list to cover file data sources without .MAPCACHE and some web data sources. The most significant type of tables where the number of records is not available is queries.)

Counter values like the number of records are printed with thousands separators.

The View - Reapply Filter / Order command is split into View - Filter - Reapply Filter (applies both filter and order) and View - Order - Reapply Order (applies just order).

The table window supports the new View - Filter - Filter Fetched Records Only command. The command switches between applying filters to the fetched portion of the table (checked, the default) and applying filters to the entire table (unchecked). In the latter mode, altering the filter or reapplying it refills the table. Switching between modes reapplies the filter.

Notes on the implementation:

  • If the table is smaller than the current record limit for the table window, the filter always operates on the already fetched records and the option does not play a role. (Whether the table is below the limit is detected by requesting the number of its records, so the table has to be able to determine it. If the table does not report the number of records in the Component pane, the optimization is turned off.)
  • Switching from applying the filter to the entire table to applying it to only the fetched records requires reapplying the filter a second or so after. We will likely do this automatically in the future.
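
The two filter modes and the small-table shortcut can be sketched roughly as follows. This is only a Python model of the behavior described above, not the table window's code: `RECORD_LIMIT`, `apply_filter` and the list-of-integers table are hypothetical, and the assumption that the whole-table mode refills the window up to the same limit is ours.

```python
from itertools import islice
from typing import Callable, Iterable, List, Optional

RECORD_LIMIT = 5  # stand-in for the table window's record limit (assumed value)


def apply_filter(
    table: Iterable[int],
    predicate: Callable[[int], bool],
    fetched_only: bool,
    record_count: Optional[int] = None,
) -> List[int]:
    """Return the records a table window would show under each filter mode.

    record_count is None when the table cannot report its size.
    """
    rows = iter(table)
    # The shortcut applies only when the table can report a count within the limit.
    small_table = record_count is not None and record_count <= RECORD_LIMIT
    if fetched_only or small_table:
        # Filter only the portion already fetched into the window.
        fetched = list(islice(rows, RECORD_LIMIT))
        return [r for r in fetched if predicate(r)]
    # Whole-table mode: scan everything and refill the window with the first matches.
    return list(islice((r for r in rows if predicate(r)), RECORD_LIMIT))


def is_even(r: int) -> bool:
    return r % 2 == 0


print(apply_filter(range(20), is_even, fetched_only=True))
print(apply_filter(range(20), is_even, fetched_only=False, record_count=20))
```

With the option checked, the filter sees only the first five records; unchecked, it scans all twenty and refills the window with matches.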

The selection box in the map window uses Shift to switch between adding to the selection (no Shift, the default) and subtracting from the selection (Shift). The current choice is indicated using the cursor.

The selection box in the map window uses Ctrl to switch between selecting objects touching the box (Ctrl, the default because the selection box starts with Ctrl-click) and selecting objects contained within the box (no Ctrl: after starting the selection box with Ctrl-click, release Ctrl). The current choice is indicated by the thickness of the box border (touching uses the standard border, contained uses a border that is twice as thick).
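
How the two modifier keys combine can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the map window's code: objects are reduced to bounding rectangles (real geometry tests would be more involved), and `touches`, `contained` and `update_selection` are hypothetical names.

```python
from typing import Dict, Set, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def touches(box: Rect, obj: Rect) -> bool:
    """True if the two rectangles overlap or share a boundary point."""
    return box[0] <= obj[2] and obj[0] <= box[2] and box[1] <= obj[3] and obj[1] <= box[3]


def contained(box: Rect, obj: Rect) -> bool:
    """True if obj lies entirely inside box."""
    return box[0] <= obj[0] and box[1] <= obj[1] and obj[2] <= box[2] and obj[3] <= box[3]


def update_selection(selection: Set[str], objects: Dict[str, Rect], box: Rect,
                     ctrl: bool, shift: bool) -> Set[str]:
    """Apply one selection-box gesture and return the new selection."""
    test = touches if ctrl else contained  # Ctrl held: touching; released: contained
    hits = {name for name, rect in objects.items() if test(box, rect)}
    return selection - hits if shift else selection | hits  # Shift subtracts


objects = {"a": (0, 0, 1, 1), "b": (2, 2, 5, 5), "c": (10, 10, 11, 11)}
box = (0, 0, 3, 3)
print(sorted(update_selection(set(), objects, box, ctrl=True, shift=False)))
print(sorted(update_selection(set(), objects, box, ctrl=False, shift=False)))
```

Object "b" only overlaps the box, so it is picked up in touching mode but not in contained mode.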

End of list.


8,760 post(s)
#01-Nov-17 03:28

This is great progress, a big improvement (in my opinion) on the issues that have been proving contentious in 163.6 and 163.7.

I think there is still room to make things more intuitive and useful in normal workflows--specific suggestions below.

Besides that...

The command window can be saved as a query using Edit - Save as Query. Invoking the command creates a new query component, copies the current text from the command window, then selects the created query in the Project pane and puts the focus into the Project pane list so that pressing Enter opens the new component.

What could be more intuitive than that? Great design.

Same with Edit - Save as Map, also brilliant.

As for having the number of records back, that makes more difference than I would have thought, mainly psychological I suppose, even massive tables sit on firm ground.

But the biggest deal is View - Filter - Filter Fetched Records Only. Many, many thanks for this option. It works really well. The implementation is perfect.

Some ideas for possible improvements in the same direction. Hopefully some of this is already planned (or something like it).

(1) An option for ordering, just like 'View > Filter > Filter fetched records only' for filtering. So 'View > Order > Order fetched records only'. Again defaulting to on.

With the option switched off, ordering would scan and filter the whole table, regardless of size, and then show the N highest- or lowest-ranked records from the whole set. (N the batch size, currently 50000.)

With a massive table, this would sometimes allow us to ask a silly question, and get a silly answer in return. But yes, let us exercise poor judgement from time to time--only, to mitigate its effects, please allow the "full fetch" (so to speak) to be cancelled at any time. On cancellation it would be nice to retain all records fetched so far, and for these to then be sorted.
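
To make the idea concrete, here is a minimal Python sketch of a cancellable full scan that keeps only the N largest values seen so far. It is an illustration of the request, not a proposal for how Manifold should implement it: `top_n_with_cancel` and the `cancelled` callback are hypothetical, and unlike the wording above it retains only the current top N rather than every record fetched.

```python
import heapq
from typing import Callable, Iterable, List


def top_n_with_cancel(records: Iterable[float], n: int,
                      cancelled: Callable[[], bool]) -> List[float]:
    """Scan records, keeping the n largest seen so far; stop early if cancelled."""
    heap: List[float] = []  # min-heap holding the current n largest values
    for value in records:
        if cancelled():
            break  # keep whatever has been scanned up to this point
        if len(heap) < n:
            heapq.heappush(heap, value)
        elif value > heap[0]:
            heapq.heapreplace(heap, value)  # drop the smallest, insert the new value
    return sorted(heap, reverse=True)


data = [5, 1, 9, 3, 7, 2, 8]
print(top_n_with_cancel(data, 3, lambda: False))
```

Memory stays proportional to N regardless of table size, and a cancelled scan still returns a sorted answer for the part that was read.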

Why do I think this is crucial? I would use it in almost every workflow, certainly for everything important.

Typically what I want to do, when interrogating a dataset--whether looking for errors on some criterion calculated in SQL and written to a field, or just when getting to know new data--is to choose a field, sort on it to see the largest values for that field in the whole dataset, then reverse the sort order to see the smallest values. In either case, I may see values outside expected magnitudes, or more large or small values than I would have expected; by eye I might notice rough natural breaks that will be useful in the next steps. Those two clicks, and only two screenfuls of records, can tell me an enormous amount about the data, often everything I immediately need to know. Often sorting on a second field will also be informative, but not always. In any case the ordering must be on the whole table. (With the infrastructure now built in 9 I would always do one or more sorts on just 50000 records first, of course. That is very powerful too, as an initial option.)

(By the way this is why comments in the thread for 163.6 which focussed on the impossibility of inspecting all records in a very large table, while true, went wide of the mark. Inspecting every record is seldom the point. If I have chosen my criterion or criteria wisely, then what I usually want to see is the range--provided it is the absolute range. A range within an arbitrary subset is not often so useful--to put the same thing another way.)

That is the first thing, for me essential. By comparison the others would be 'nice to have'.

(2) An option to adjust the batch size used when less than all records are returned. 50000 will not suit all users, all networks, all datasets, all workflows, etc.

(3) Possibly, for clicking on the 'fill record' (at the bottom, meaning 'there is more data') to do something. Because it looks as if it should do something? Well, that is no doubt a bad reason. And given something like (1) above, perhaps this would be unnecessary.

But FWIW, for native projects only, clicking on the fill record might do either of these things:

(a) Fetch all records (no batch limit)--at least until I press cancel. When all records have been scanned (or I cancel), start to apply applicable filters and ordering, if any, to the records that have been fetched; if no filtering or ordering is in force, just display the first screen of records (exactly as for builds before 163.6) and leave us to PgDn, Ctrl-End or whatever to navigate as best we can.

(b) Fetch another batch of records. There are problems with this, which Adam has covered. All the same, it might be possible to do something sensible and efficient here just for native data sources (which would be enough). To my mind it would not matter, if no ordering is in place, if the next batch(es) overlapped with previous batch(es) to some degree, or returned updated data for records previously fetched. That is SQL, that is live data.

But again, only (1) is a feature which I think is truly needed.


8,760 post(s)
#01-Nov-17 03:45

(Sorry for many typos. 'By eye' etc.)


8,579 post(s)
#01-Nov-17 07:00

Regarding ordering on the whole table, we are hesitant to implement this for two reasons:

1. We have something much better in mind, formed partially by input in the previous threads. We won't be able to do it in the next few weeks, however. We need a little more time to flesh out the concept and implement it.

2. Unlike filtering on the whole table, ordering on the whole table as part of the UI is a dead feature if you, say, have a better way to see the min / max. At least with filtering, whatever records you passed you processed and made some useful incremental progress. With ordering, whatever records you passed might or might not contain the min / max, and since we are only doing it to find the min / max... plus there is a huge post-processing step and the requirement to keep all of the intermediate data... better use SELECT Min(), Max() directly.

Maybe that (finding min / max for a field) is something that can be put into an add-in while we are doing 1.
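
For reference, the direct min / max query looks like this in a small self-contained Python example. It uses Python's built-in sqlite3 module rather than Manifold's query engine, and the `parcels` table and `area` field are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (id INTEGER PRIMARY KEY, area REAL)")
conn.executemany("INSERT INTO parcels (area) VALUES (?)",
                 [(12.5,), (0.3,), (980.0,), (47.1,)])

# A single aggregate query returns both extremes; no ordering step
# and no intermediate result set are required.
lo, hi = conn.execute("SELECT MIN(area), MAX(area) FROM parcels").fetchone()
print(lo, hi)
```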

We can do the other items. In the case of tables in MAP files we can fetch 'more' records, too.


8,760 post(s)
#01-Nov-17 07:44

1 sounds great, intriguing. Can't say more than that! (because don't know).

Re 2, in case we are at odds, the purpose (anyway as I meant it) is not just to find the single minimum or maximum record for the whole table (ordered by field x), but crucially, to find the set of M minimum or maximum records for the whole table (ordered by field x).

We don't need M = 50000 in this case, just as many records as we can realistically scroll through, several screenfuls, say 250 records? (More is OK but usually redundant, though if we can extend M when we need to then great.)

We do absolutely need this though, every day. That is: to be able to see the set of 250 or so minimum or maximum records, ordered by field x, for the whole table.

Well, this can be done in SQL using COLLECT, or SELECT using one thread, with ORDER and FETCH. That's not hard to use, and it's deeply powerful stuff.
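
A small sketch of the ORDER-plus-row-cap pattern, for anyone following along. It runs against Python's built-in sqlite3 module, where the cap is spelled LIMIT; the FETCH form mentioned above is not reproduced here. The `parcels` table, the `area` field and the tiny value of `M` are all made up for illustration.

```python
import sqlite3

M = 3  # number of extreme records to show; in practice something like 250

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (id INTEGER PRIMARY KEY, area REAL)")
conn.executemany("INSERT INTO parcels (area) VALUES (?)",
                 [(a,) for a in (12.5, 0.3, 980.0, 47.1, 5.0, 310.2)])

# The M largest and M smallest records over the whole table, ordered by area.
largest = conn.execute(
    "SELECT id, area FROM parcels ORDER BY area DESC LIMIT ?", (M,)).fetchall()
smallest = conn.execute(
    "SELECT id, area FROM parcels ORDER BY area ASC LIMIT ?", (M,)).fetchall()

print(largest)
print(smallest)
```

Both result sets come from a scan of the full table, which is the property that matters here: the range shown is the absolute range, not the range of an arbitrary subset.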

But there is also ordering of tables in the GUI--a really worthwhile thing too (even people who love SQL to an unhealthy degree can also love an interactive UI)--and getting a complete range from a full table scan is an essential part of making that intuitive and useful.

(Ordering an arbitrary subset of records is useful pro forma, but not really as analysis. It may be at least as much a trap as it is a useful thing.)

I'm glad about the nice-to-have items, thanks Adam.


8,760 post(s)
#01-Nov-17 08:01

I also meant to say this, the most important point: the actual means doesn't matter at all, only the purpose. Anything that can show us the M min./max. records for the entire table ordered by field x is great--especially if it hits other objectives as well.


6,319 post(s)
#01-Nov-17 09:58

Is Contents Pane hidden by default? I was puzzled for a moment.


8,579 post(s)
#01-Nov-17 10:30

No, it's behind the Project pane like usual, but not hidden. No changes in that area.


6,319 post(s)
#01-Nov-17 11:33

I must have hidden Contents Pane by accident when closing a project, and the next time MF started in that state. But that's OK.

Manifold User Community Use Agreement Copyright (C) 2007-2017 Manifold Software Limited. All rights reserved.