Radian Studio 9.0.162.x
adamw


10,447 post(s)
#13-Jul-17 17:48

9.0.162.x

Converted to a public update.

Changes for 9.0.162.4

Changes for 9.0.162.3

Changes for 9.0.162.2

Changes for 9.0.162.1

adamw


10,447 post(s)
#13-Jul-17 17:48

9.0.162.1

Changes

New query function: GeomSplitToConvex - takes an area geom and decomposes it into convex parts. The function uses a new algorithm, which is significantly faster than in Manifold 8. The function is also much more robust and can handle geometry that the previous algorithm couldn't. New transform: Decompose to Convex Parts.

New query function: GeomSplitToConvexPar - a parallel version of GeomSplitToConvex. The Transform dialog invokes the parallel function if parallel execution is allowed.
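For illustration, a minimal usage sketch patterned on (and simplified from) the query the transform generates; the drawing [Areas] and its [Geom] field are hypothetical names, and 0 requests automatic tolerance:

--SQL9

FUNCTION splitgeom(arg GEOM) TABLE AS

  (SELECT [Value] AS [Geom] FROM CALL GeomToBranches(GeomSplitToConvex(arg, 0)))

END;

-- one record per convex part of each area in the hypothetical drawing [Areas]

SELECT SPLIT CALL splitgeom([Geom]) FROM [Areas];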

Schemas for views on SQL Server, PostgreSQL and other databases are extended to include a spatial index on each geometry field. This allows drawings created on these views to use existing spatial indexes if the view preserves them. (The drawback is that if the view is designed in such a way that searching it using a spatial criteria is slow, then the drawing is slow. Since there is no reliable way to tell for a view whether or not searching it using a spatial criteria will be slow, we think this is a reasonable compromise.)

There is a new conversion layer for converting coordinate system data to PRJ. Exporting a SHP file writes coordinate system data as PRJ. (We are going to use the conversion in multiple other places.)

There is a new conversion layer for reading ERDAS coordinate systems. Importing or linking an ERS file or an ECW file recognizes many coordinate systems that were previously unrecognized. (The new code supports all ERDAS coordinate systems that we know of. We cross-verified our roster of coordinate systems with those used by various other products as well.)

(Fix) Exporting boolean values to a CSV file puts them in quotes.

(Fix) Exporting a table with binary / geometry / tile fields to a CSV file no longer sometimes mislabels (other) fields.

Exporting a table to a CSV file exports xN and UUID values.

Exporting text values to a CSV file replaces line breaks with spaces to make sure exported data can be imported by as many products as possible.

The SQLITE dataport automatically chooses between Spatialite and ESRI's STGEOMETRY extension based on spatial data in the database. (These extensions cannot co-exist and cannot handle each other's data.)

The SQLITE dataport supports adding geometry fields for ESRI's STGEOMETRY extension.

Change to script functions: the parameters in Application.CreatePointObj, Application.CreatePoint3Obj, Application.CreatePoint4Obj have been made optional, to allow COM clients to create point objects with default coordinates, similarly to .NET clients.

Labels for line objects are automatically rotated. (Unlike in Manifold 8, the rotation works for all styles; the style of a label does not get reset to the default if the rotation code does not support it. The new labeling code performs significantly faster than the old code. We are working to add antialiasing. We are also going to allow bending labels at individual letters; right now the label text has to fit into a single line segment, else it won't be displayed.)

Labels overlapping other labels are automatically skipped. (The new overlap resolution code performs significantly faster than similar code in Manifold 8, scaling to millions of labels. The current limitation is that overlaps are only resolved within the same labels component, not between components. We are going to remove this limitation.)

(Fix) The MySQL dataport no longer sometimes fails due to wrong cursor type.

The NC dataport (NetCDF) recognizes 'latitude' and 'longitude' variable names and interprets them similarly to 'lat' and 'lon'.

(Fix) Exporting a BIL file or a FLT file forces pixel scale values to be non-negative.

End of list.

tjhb
10,094 post(s)
#16-Jul-17 01:31

New query function: GeomSplitToConvex - takes an area geom and decomposes it into convex parts. The function uses a new algorithm, which is significantly faster than in Manifold 8. The function is also much more robust and can handle geometry that the previous algorithm couldn't. New transform: Decompose to Convex Parts.

It's astonishingly fast, and does seem robust. I've attached an example of geometry for which Decompose to Convex Parts and Decompose to Triangles make 8.0.30 crash. Radian 9.0.162.1 handles both transforms perfectly.

Questions:

(1)

Radian SQL functions GeomSplitToConvex[Par] and GeomTriangulate both take a tolerance parameter, but the parameter is not exposed in the corresponding transform dialogs. Is that intentional?

(2)

If we set up Decompose to Convex Parts under Edit > Transform, with Allow parallel execution checked, then press Edit Query, we get a splitgeom() function using GeomSplitToConvex, not GeomSplitToConvexPar. The calling INSERT INTO query does specify THREADS SystemCpuCount(), but I'm not sure that will be effective for INSERT INTO.

The same is true for Decompose to Triangles, using GeomTriangulate, not GeomTriangulatePar.

(On the other hand, Triangulate All and Triangulate All, Lines write queries using GeomTriangulatePar and GeomTriangulateLinesPar, respectively.)

(3)

Somewhat related: the Query Builder syntax for Thread configuration reads

Thread configuration

  • "batch": <batch>
  • "threads": <threads>

It isn't clear that the required syntax is a single-quoted JSON string, with surrounding {} and both arguments in quotes, like

'{"threads": "6", "batch": "32"}'

Double-clicking either of the specifiers copies it verbatim to the code pane (e.g. "batch": <batch>). Would it be possible for the result to be well-formed JSON?

We can also use ThreadConfig, not mentioned here but listed as a function. ThreadConfig() only allows specifying threads, not also batch--maybe that is coming.
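For reference, a sketch of the two ways to pass a thread configuration to a -Par function; the drawing [Areas], its [Geom] field and the parameter values are illustrative:

--SQL9

-- literal JSON configuration string

SELECT GeomSplitToConvexPar([Geom], 0, '{ "threads": "6", "batch": "32" }') FROM [Areas];

-- ThreadConfig builds a threads-only configuration

SELECT GeomSplitToConvexPar([Geom], 0, ThreadConfig(SystemCpuCount())) FROM [Areas];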

Attachments:
Decompose test.map

tjhb
10,094 post(s)
#16-Jul-17 05:39

Re (2), in a structure like

INSERT INTO (...)

SELECT ...

FROM ...

THREADS N

the THREADS directive formally governs both the SELECT and INSERT parts of the query. In practice, only one thread is used for the INSERT phase (INSERT operations are serialised). But this restriction does not limit the SELECT phase, which will use the N threads specified.

The two phases might overlap (I don't know), with INSERT writing records in one thread as they are produced by SELECT using N threads. The important thing is that INSERT does not create a bottleneck for SELECT.

So in the auto-generated code for Decompose to Convex Parts

-- SQL9

-- ...

FUNCTION splitgeom(arg GEOM) TABLE AS

  (SELECT [Value] AS [Geom] FROM CALL GeomToBranches(CASE WHEN GeomIsArea(arg) THEN GeomSplitToConvex(arg, 0) ELSE NULL END))

END;

PRAGMA ('progress.percentnext' = '100');

INSERT INTO [Contour areas south 1e-10 Table Decompose to Convex Parts] (

  [ID], [definition], [designated], [nat_form], [elevation],

  [Geom (I)])

SELECT

  [ID], [definition], [designated], [nat_form], [elevation],

  SPLIT CALL splitgeom([Geom (I)])

FROM [Contour areas south 1e-10]

THREADS SystemCpuCount();

the GeomSplitToConvex function is effectively executed using SystemCpuCount() threads, since the splitgeom() function is launched from that many threads simultaneously.

On the other hand, it does seem to be faster again to increase the thread count within the splitgeom() function as well, as for example

FUNCTION splitgeom(arg GEOM) TABLE AS

  (SELECT [Value] AS [Geom] FROM CALL GeomToBranches(CASE WHEN GeomIsArea(arg) THEN GeomSplitToConvexPar(arg, 0, '{ "threads": "6" }') ELSE NULL END))

as well as the THREADS directive on the calling query. The difference is not great though.

adamw


10,447 post(s)
#16-Jul-17 08:15

On the other hand, it does seem to be faster again to increase the thread count within the splitgeom() function as well, as for example

It's only faster if there are a couple of objects that are disproportionately large.

The only performance gained by using multiple threads in the splitter, in addition to using multiple threads in the main SELECT, is the performance lost when most of the threads in the main SELECT are done but one or two aren't, because they got stuck processing a huge object. When there are many objects, the threads that didn't get huge objects have other work to do, so no performance is lost. When there are enough big objects, all threads get them, and again little performance is lost. Etc.

tjhb
10,094 post(s)
#16-Jul-17 08:31

That is hugely helpful, thank you.

So when we have many objects, of which most are small or simple, but a significant number are disproportionately large and complex, it may be worth adding multiple threads per object as well as multiple threads per batch.

It looks like that is the case in my contour stack--but it's not a very real example.

adamw


10,447 post(s)
#16-Jul-17 08:57

So when we have many objects, of which most are small or simple, but a significant number are disproportionately large and complex, it may be worth adding multiple threads per object as well as multiple threads per batch.

If the number of large objects is significant, it is better to just set the batch size to 1 (THREADS ... BATCH 1). This will avoid cases of a couple of threads getting particularly unlucky with multiple big objects and other threads running out of things to do.

In extreme cases it makes sense to do the query in two parts - first process the small objects with threads allocated per statement, then process the large objects with threads allocated per function call. The small / large test could perhaps be just the number of coordinates.
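A sketch of that two-part approach, using the coordinate count as the small / large test; the drawing [Areas], the 10000 cutoff and the thread counts are illustrative, and GeomCoordCount is assumed for the test:

--SQL9

-- small objects: threads allocated per statement, default batching

SELECT GeomSplitToConvex([Geom], 0) FROM [Areas]

WHERE GeomCoordCount([Geom]) <= 10000

THREADS SystemCpuCount();

-- large objects: one object per batch, extra threads inside the function call

SELECT GeomSplitToConvexPar([Geom], 0, '{ "threads": "4" }') FROM [Areas]

WHERE GeomCoordCount([Geom]) > 10000

THREADS SystemCpuCount() BATCH 1;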

adamw


10,447 post(s)
#16-Jul-17 08:06

In order:

1 - That the Transform dialog does not include a control for specifying tolerance for functions like GeomSplitToConvex, GeomTriangulate and some others is intentional. We are changing our approach to tolerance a bit, both in the geometry code and in the UI. We want 99% of all uses of tolerance to be in the Normalize transforms. In all other cases we want tolerance to be 0 (automatic) as often as possible. This allows the data to be cleaner and the functions to be faster.

2 - The parallelism in Decompose to Convex Parts / Decompose to Triangles / other decompose transforms comes from processing multiple objects simultaneously. Having the processing of a single object spawn additional threads generally harms performance instead of helping it. There might be corner cases (i.e., a drawing that contains a single huge object); they are best handled by adjusting the text of the query manually.

3 - The threads / batch parameters don't have to be strings; they can be numbers (the JSON is well-formed either way, but perhaps you meant that using strings for these parameters looks strange - you don't have to):

'{ "threads": 6, "batch": 32 }'

We won't be extending ThreadConfig to specify the batch size; there are too many parameters coming to the thread configuration to include them all, so we'll stop at the number of threads, and everything else will have to be specified by editing the string (and we'll provide good defaults).

adamw


10,447 post(s)
#16-Jul-17 08:25

I think I got what you meant in item 3 - you are asking whether it would be better to make clicking the threads / batch items in the query builder generate a full JSON string, correct? If so, I am not sure this particular thing is a good idea, but we will think about what we can do to make it clearer that the thread config is a JSON string.

tjhb
10,094 post(s)
#16-Jul-17 08:28

Thanks Adam. I responded to point 2 here. (Might not be right.)

Point 1 looks like a substantial change, and we are seeing a snapshot partway through.

I got point 3 wrong (I must have had skewed JSON syntax when I tried numbers). The main point though is that the QB entry is currently not hugely helpful. You got it.

Extensions to ThreadConfig are exciting of course. I imagine at least some of the new parameters will be for GPGPU.

tjhb
10,094 post(s)
#16-Jul-17 08:09

As to speed and robustness, I've been testing Decompose to Convex Parts on a tough dataset in Radian and in Manifold 8.

The dataset is all of the closed contour lines in the South Island of New Zealand, rendered as areas. (They are ordered from ground up, overlapping systematically like a Tower of Hanoi, but this is not important for the test.)

Radian 9.0.162.1 times vary from 7 to 8 minutes, depending on threading options used.

Manifold 8.0.30 has just crashed after completing the normalization phase, on trying to decompose object 1 (the base contour, with the largest area), after 27 minutes.

adamw


10,447 post(s)
#16-Jul-17 08:18

Happy to hear this!

Thanks for the note.

tjhb
10,094 post(s)
#16-Jul-17 08:56

Input: 220,803 areas (coordinate counts from 4 to 841,654, often highly convoluted).

Output: 29,306,396 convex areas.

Incredibly fast.

adamw


10,447 post(s)
#26-Jul-17 18:20

9.0.162.2

Changes

Rendering

There is a new rendering engine that supports advanced rendering features like antialiasing and can offload a lot of rendering tasks to the GPU. The options dialog allows switching the rendering engine, choosing between:

  • Basic, software - rendering engine of 9.0.162.1 and below, uses software rendering, does not support antialiasing and other advanced features;
  • Advanced, software - new rendering engine, GPU is only used for the most basic tasks like those in 'Basic, software', supports antialiasing;
  • Advanced, hardware acceleration allowed - new rendering engine, GPU is used whenever it makes sense to use it, supports antialiasing.

The default choice is 'Advanced, hardware acceleration allowed'. The 'Advanced, software' choice is intended to be used to work around a temporary issue with the video driver, or for specialized scenarios like programmatic rendering or use under Terminal Services (Remote Desktop). Switching the rendering engine does not require restarting the application, but current rendering tasks will complete using the former settings.

Rendering antialiased text performs noticeably faster. (Applies to advanced rendering engine.)

Joins of antialiased lines are rendered with better quality. (Applies to advanced rendering engine.)

(Fix) Panning or zooming a map with one or more layers referring to an invalid component no longer sometimes leaves rendering artifacts in background.

Merging rendered data for map layers performs faster.

Resolving overlaps between many labels has been optimized to perform significantly faster.

Other

Exporting data to any format displays progress and allows canceling.

The MySQL dataport requires the version of the client library to be at least 5.1. Older client libraries cannot be used due to incompatible changes to the MySQL ABI.

Data sources for databases like SQL Server or Oracle can be refreshed by right-clicking the data source component in the Project pane and invoking Refresh. Refreshing a data source automatically rebuilds virtual components for drawings and images, virtual tables exposing coordinate system info, etc.

New script function: Database.CanMigrate - checks if the database supports migrating data from Enterprise storages.

New script function: Database.CanRefresh - checks if the database supports refreshing components.

New script function: Database.IsMigrateNeeded - checks if the database contains an Enterprise storage with changes that have not yet been migrated.

New script function: Database.IsReadOnly - checks if the database is read-only.

New script function: Database.Migrate - migrates an Enterprise storage contained in the database or changes to that storage, if it has already been migrated.

New script function: Database.Refresh - refreshes components in the database.

The SQLite dataport can read compressed geometry created by SpatiaLite without SpatiaLite.

The SQLite dataport can read geometry (without indexes) without any spatial extensions.

(Fix) The MIF dataport no longer sometimes misreads numeric data from MID files if the system locale is non-English.

(Fix) The MIF dataport no longer sometimes (rarely) misreads data due to an incorrect default value for the delimiter (comma instead of tab).

Connecting to an MDB / ACCDB / XLS or similar file automatically uses Access Database Engine for Office 2016, if it is available.

The GPKG dataport detects an attempt to insert a second geometry field into a table and fails it early (the operation cannot succeed).

(Fix) Inserting or deleting drawings in the GPKG dataport no longer sometimes fails to refresh the Project pane.

Dataports creating virtual components no longer override names of existing components. (This generally only happened when one was specifically naming components so as to run into the override.)

End of list.

API documentation is updated to include all changes.

tjhb
10,094 post(s)
#31-Jul-17 01:15

The new rendering engine makes a big difference. In particular, zooming in is much better. It is noticeably better under 'Advanced, hardware...' than under 'Advanced, software'.

I think there are two obvious areas for improvement: panning and zooming out.

For panning, it would be very good if the engine could try to predict which tiles may be needed for the next panned view, and render them during idle time. The current lag is a slap in the face.

For zooming out, the same applies (please predict). In addition, it seems there is no attempt to anti-alias initial zoomed-out views (or oversample, not sure of my terms here). Fine lines and area borders become intermittent, which is disconcerting.

(These comments are mostly for Manifold 9.)

Mike Pelletier

2,122 post(s)
#31-Jul-17 15:57

I'll add a bit to Tim's comments. I prefer the way Mfd 8 temporarily displays during the rendering time. It is more like Google (smooth) while Radian Studio is like ESRI (chunky). Great to see Mfd working on better labeling!

Dimitri


7,413 post(s)
#31-Jul-17 17:33

Great comments and 100% agree with Tim and Mike. The only question is in what order this gets done.

From the beginning we've always intended predictive pre-rendering. It's just one of those obvious things. If nothing else is going on and the user is just looking at the screen for eons of time (whole milliseconds...), we may as well render up a few screens and put them in memory where they can be rapidly displayed.

But, what's obvious and easy in static settings like Google or PhotoShop where the map basically is the same from one moment to the next is a very different deal in a fully parallel environment like Radian.

Consider pre-rendering for panning only, where you'd like to pre-render and cache the adjacent 8 screens (up/down/left/right/diagonals). Threads from a process the user has launched, a different program has launched (moving points about based on GPS feeds of moving trucks...), a remote server has launched, etc., can change all that. So you need a way to know if something has changed since the cached pre-render was made. Counting zoom, it's not going to be just 8 screens but maybe more like 48, and that can cover a *lot* of objects that could be different between when the cached image was rendered and when the display must happen.

Radian has smooth zoom, not quantized to discrete zoom steps, and not a computed/interpolated zoom based on static graphics objects like SVG uses (send static data to a client and then let the client render). Even if you just quantized it a bit and interpolated between those steps to give the appearance of smooth zoom, you're still no longer talking about 8 screens for an immediate pan; it's dozens of screens, and all of them have to be checked for changes caused by any thread running anywhere in the system.

There are plenty of ways to deal with all that, but it ends up being very, very many details that are all subject to change given different other optimizations in the rendering system. It can also end up involving a huge amount of memory to cache various options even if you get good at predicting which way the user will want to zoom or pan [Consider a map window with dozens of layers that is undocked and spread across three to six monitors... lots of pixels to compute dynamically].

The other approach is to invest effort into making rendering so fast that you don't need to think about predictive pre-rendering. It turns out there are many optimizations there, which is one reason Radian is as fast as it is, and there are very many more that can be done. Faster rendering from fundamentals, not tricks that switch in pre-rendered views from cache, is always good, whether it is used in real time or in background to produce even faster pre-rendered views for cache tricks.

To date there has been a bigger payday from focusing on rendering fast from fundamentals. When returns start diminishing on fundamentals, we can shift gears a bit to pre-rendering and other non-fundamental optimizations, for example, to allow smooth pans in 9, better appearance for zooms, etc. That will also allow such optimizations to take advantage of all the inside opportunities that steadily improving rendering from fundamentals offers as well.

So, bottom line: 100% agreement, and we are moving forward with a mix of work for steady improvement - both easy things like cutting out blinks when selecting and harder things like smooth pan/zoom, etc.

adamw


10,447 post(s)
#04-Aug-17 18:20

9.0.162.3

Changes

Small versions of the query builder in dialogs like Select and Transform hide items for aggregate functions.

Applying a filter in a query builder list keeps items below those that match the filter. (Setting filter to 'SELECT' will keep the items below 'SELECT ...'.)

Applying a component type filter in the Project pane changes the filter button to indicate that the list of components is filtered.

The component type filter in the Project pane includes an item for data sources.

The controls in the Project pane are slightly repositioned to avoid putting context menus into a monitor to the right of the Project pane, if there is one.

Scrolling a window using a scrollbar moves keyboard focus to the window.

(Fix) Deleting records from a btree index in a MAP file no longer sometimes breaks the index. (The failure was found during stress testing. It could only happen under very rare circumstances and should nearly always manifest itself with the 'Can't delete record' error.)

Tables in MAP files detect inconsistencies in index data and report broken indexes in the log. In addition, affected indexes are made read-only. Table data can be safely copied to another table. Deleting affected indexes is also safe, although not all space originally used by index data might be reclaimed.

Joins exposing btree indexes detect more cases where duplicates are made impossible by conditions applied on top of joins, and keep indexes unique instead of converting them into indexes with duplicates.

(Fix) Outer joins no longer sometimes misapply conditions in WHERE applying to individual tables.

(Fix) Outer joins no longer sometimes misapply constant conditions evaluating to FALSE in the scope of a join.

Joins and WHERE filters optimize the IN construct for fields used in btree indexes: <field> IN (<value>, <value>, ...). Any btree index type is allowed. A btree index with multiple fields will optimize IN with the first field.

(Fix) The IN construct used to match NULL against an empty table correctly returns FALSE instead of NULL.

(Fix) The IN construct used to match a non-NULL value against a list of values or a table that contains no matches but does contain NULLs correctly returns NULL instead of FALSE.

The IN construct used with tables can use any type of btree index and can use a btree index for part of matched values. (Previously, only a btree index on a single field with no duplicates or nulls could be used.)

The IN construct used to match a value against a list of values fails if the values in the list are of different types. (This is handled differently in different databases. Some allow conversions, including from string to number or vice versa.)
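A sketch of the optimizable form described in the items above; the table and field names are illustrative:

--SQL9

-- [a] is the first field of a btree index, so the IN list below can be matched through the index

CREATE TABLE t (a INT32, INDEX a_x BTREE (a));

SELECT * FROM t WHERE a IN (1, 3, 5);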

The recent file list is limited to 32 items.

(Fix) The table window no longer sometimes paints field names in the header in gray.

The GCDB dataport reports supported geocoding functions via the 'API' comments component.

(Fix) Database dataports correctly handle mfd_meta tables with fixed-width fields created by Manifold 8.

Database dataports automatically adjust object names in mfd_meta to include schema names, to help Manifold 8.

(Fix) Database dataports correctly handle schema names with backslashes (frequently found when schema names coincide with names of Windows users, as is common practice on SQL Server).

The dataport for personal ESRI geodatabases allows renaming or deleting drawings.

The dataport for Enterprise storages merges storages from different database schemas together.

End of list.

tjhb
10,094 post(s)
#05-Aug-17 00:31

The controls in the Project pane are slightly repositioned...

Comparing old and new, the new positioning actually looks better--there is slightly more space. The tiniest thing: could the spacing for the Layers pane be adjusted to match?

Love this:

Applying a component type filter in the Project pane changes the filter button to indicate that the list of components is filtered.

And this is really helpful, obvious now that you've done it:

Applying a filter in a query builder list keeps items below those that match the filter [in other words, keeps the subtree, if any, for each matched item].

adamw


10,447 post(s)
#05-Aug-17 07:18

The tiniest thing: could the spacing for the Layers pane be adjusted to match?

Yep, good idea.

tjhb
10,094 post(s)
#05-Aug-17 02:34

Something that could really help us to code efficiently and well: for every SQL operator and function that *can* potentially use an index (under some circumstances), write a log message, named for the operator or function, noting whether an index has in fact been used, and if so what kind of index (and its name).

That output would be too verbose for normal use, only worthwhile for development and testing, so perhaps there could be a $verbose$ directive to control it. And perhaps to control other things later--number of CPU cores, GPU execution units, effective parallelism at each node. Index usage would be a great start.

I expect there are already better ideas along these lines in-house (cf. execution plans in SQL Server).

It's stuff that the compiler and engine already know (among heaps of other things that are literally too complex for users to consider). Perhaps a question of distilling and displaying some of what is most useful.

adamw


10,447 post(s)
#05-Aug-17 07:24

Yes, we are planning to show query plans / add pragmas to influence them where it makes sense.

A related question is to what extent we should rewrite "unaccelerated" operations that are semantically the same as "accelerated" ones. We do quite a bit of this already, but with a seemingly infinite range of ways to express the same thing there is always something that we don't rewrite automatically, so we should perhaps put up a page of rules of thumb.

E.g., for the current engine:

1. Replace <field> IN (<value>) with <field>=<value>. (We don't rewrite this automatically and the second form allows the optimizer to reason about indexes better.)

2. Replace <field>=<value1> OR <field>=<value2> OR ... with <field> IN (<value1>, <value2>, ...).

Etc.
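Spelled out as a sketch (the table t and field a are illustrative):

--SQL9

-- rule 1: prefer the plain comparison over a one-element IN list such as a IN (5)

SELECT * FROM t WHERE a = 5;

-- rule 2: prefer an IN list over a chain of ORs such as a=1 OR a=2 OR a=3

SELECT * FROM t WHERE a IN (1, 2, 3);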

tjhb
10,094 post(s)
#05-Aug-17 08:29

That's great. No need to add it all at once in a beautiful GUI; it could be bit by bit.

On the related question, I think we need to know less about automatic optimizations--rules of thumb sounds about right. I doubt we would usually need detailed reporting on these--though maybe if a join was automatically changed from inner to outer, for example, then we would learn something useful from a note.

Another example (your 'etc') might I think be replacing

<field1> = <value1> AND <field2> = <value2> AND ...

with

(<field1>, <field2>, ...) = (<value1>, <value2>, ...)

though this would give only a modest speedup (as it does in Manifold 8) unless there were an index on at least one of <field1>, <field2>..., preferably on all of them in some order (then a dramatic speedup).

adamw


10,447 post(s)
#05-Aug-17 09:38

<field1>=<value1> AND <field2>=<value2> AND ... is currently best kept as is. The two forms that you mention are, of course, semantically equivalent, but since a lot more people use the first one than the second, we tend to look for the first form more.

The most important rule that has not been mentioned yet is perhaps:

Replace GeomDistance(<p>, <q>, <t>) <= <value> with GeomWithin(<p>, <q>, <value>, <t>).

...although this particular case occurs so often that we will likely rewrite it automatically.
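In query form, a sketch; the drawing [d], the reference point and the 100-unit distance are illustrative:

--SQL9

-- slower: computes the exact distance for every record

SELECT * FROM [d] WHERE GeomDistance([Geom], GeomMakePoint(VectorMakeX2(0, 0)), 0) <= 100;

-- faster: the form the optimizer looks for

SELECT * FROM [d] WHERE GeomWithin([Geom], GeomMakePoint(VectorMakeX2(0, 0)), 100, 0);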

tjhb
10,094 post(s)
#05-Aug-17 08:47

Would fields listed in GROUP BY ever be reordered to make best use of an index?

adamw


10,447 post(s)
#05-Aug-17 09:34

That's on the list of improvements for the future. They aren't reordered now, but they will be.
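Until then, a sketch of the form that can already line up with an index (names are illustrative):

--SQL9

-- assuming INDEX ab_x BTREE (a, b): GROUP BY fields listed in index order

SELECT a, b, Count(b) FROM t GROUP BY a, b;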

adamw


10,447 post(s)
#14-Aug-17 17:40

9.0.162.4

Changes

(Fix) The query engine no longer sometimes fails to compile queries with nested joins and complex conditions due to unresolved field names.

The query engine optimizes joins with '=' join conditions and indexed fields (btree) on both sides using merge scans. This provides significant performance benefits. Indexes on fields may allow duplicates or nulls; the join criteria may use only a subset of index fields (starting with the first). Comparison options on text fields have to coincide.

(Fix) The query engine no longer fails to use the spatial index for SELECT INTO and some other operations.

The query engine is better at optimizing query plans when there are multiple alternatives.

The query engine is significantly better at optimizing filter criteria in nested joins.

(Fix) WHERE filters and joins on indexed text fields no longer sometimes fail to account for comparison options such as NOCASE, and return less data than they should.

The type for a new index in CREATE / ALTER statements is no longer optional. (It was optional with a silent default of BTREE. We found that too confusing.)
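That is, the index type now has to be spelled out, as in this sketch (names illustrative):

--SQL9

-- BTREE must be named explicitly; there is no longer a silent default

CREATE TABLE t (a INT32, INDEX a_x BTREE (a));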

The ArcGIS REST dataport supports the EXPORT service, which is the default service supported by all ArcGIS REST servers. If the server implements the TILEINFO service, the dataport uses TILEINFO (because it is faster), otherwise it uses EXPORT.

The ArcGIS REST dataport exposes an image for each layer in a multi-layer server, in addition to the image for all layers (with visibility and styles being set to the defaults provided by the server).

The WMS, WMTS and similar dataports work around issues with old-style servers that do not properly decode space characters in URL parameters.

For a layer whose bounding box is not provided, the WMTS dataport tries to re-create it from an approximate bounding box in the default coordinate system, if one is provided.

(Fix) Attempting to export a TIFF file no longer fails to switch to BigTIFF if the image is bigger than 4 GB.

(Fix) Exporting a TIFF file no longer sometimes fails to recover from a failure and shuts down the application.

(Fix) Exporting a PNG file no longer sometimes fails to recover from a failure and shuts down the application.

(Fix) Exporting a BIL file with INT16U pixel values no longer fails.

The .PSV file extension is mapped to the CSV dataport. (Importing a .PSV file using the 'All Files' filter will automatically use CSV.)

Connecting to databases with lots of components performs faster.

Dragging and dropping components in the Project pane always moves data within the same data source and always copies data across different data sources. The '+' sign in the copy cursor has been made larger.

End of list.

adamw


10,447 post(s)
#14-Aug-17 17:56

The combined effect from changes in the query engine is big.

Illustrating just one change (merge scans), a few simple queries with synthetic data:

1 - setup, not measured:

--SQL9

CREATE TABLE t (a INT32, INDEX a_x BTREE (a));

INSERT INTO t (a) SELECT * FROM CALL ValueSequence(1, 100000, 1);

CREATE TABLE u (b INT32, INDEX b_x BTREE (b));

INSERT INTO u (b) SELECT * FROM CALL ValueSequence(50001, 150000, 1);

1 - query; the performance on a test system in 9.0.162.3 was 4.243 sec, in .162.4 it went to 0.532 sec:

--SQL9

SELECT * INTO p FROM t INNER JOIN u ON a=b;

2 - setup, not measured:

--SQL9

CREATE TABLE t (a INT32INDEX a_x BTREEDUP (a));

INSERT INTO t (a) SELECT value DIV 2 FROM CALL ValueSequence(1, 100000, 1);

CREATE TABLE u (b INT32INDEX b_x BTREEDUP (b));

INSERT INTO u (b) SELECT value DIV 2 FROM CALL ValueSequence(50001, 150000, 1);

2 - query, the performance went from 4.595 sec to 1.050 sec:

--SQL9

SELECT * INTO p FROM t INNER JOIN u ON a=b;

3 - setup, not measured:

--SQL9

CREATE TABLE t (a INT32, INDEX a_x BTREE (a));

INSERT INTO t (a) SELECT * FROM CALL ValueSequence(1, 100000, 1);

CREATE TABLE u (b INT32, INDEX b_x BTREE (b));

INSERT INTO u (b) SELECT * FROM CALL ValueSequence(50001, 150000, 1);

3 - query, the performance went from 5.118 sec to 0.856 sec:

--SQL9

SELECT * INTO p FROM t FULL OUTER JOIN u ON a=b;

The above queries also used to perform faster in Manifold 8 than in Radian / Viewer (mostly because Manifold 8 could load everything into memory, etc). They now perform better in Radian / Viewer, single thread vs single thread.

We tested a lot of real-world queries with similar joins as well and they all perform faster across the board. (We do have a couple of queries that perform faster in Manifold 8 than in Radian, all of them are very involved with several levels of joins, all of them perform faster when split into parts, and, most importantly, when they are split into parts Radian easily outperforms Manifold 8.)

tjhb
10,094 post(s)
#14-Aug-17 18:12

This is huge. Many thanks. I'll post some comparisons too.

danb

2,064 post(s)
#14-Aug-17 21:02

Great work with the tiff export and other improvements. The tiff export now works flawlessly here.


Landsystems Ltd ... Know your land | www.landsystems.co.nz

apo
171 post(s)
#15-Aug-17 14:32

Great,

an hour-long query I was working on came down to less than a minute with the improvements to IN and to indexes

Now I spend more time finding my errors than waiting ... great job

a.
