Radian Studio 9.0.162.x
adamw


6,911 post(s)
#13-Jul-17 17:48

9.0.162.1

manifold-9.0.162.1-x64.zip

SHA256: d4fa25938ccd163b624699c89eaf1f9650f4f83fe365682e3d788ae7402ecfb3

manifold-viewer-9.0.162.1-x64.zip

SHA256: c0f25a4ff2ecd0a53e05841e47bfdbe14b8e7e2d2d5c3e099d21e968a2cffb76

Changes for 9.0.162.1

adamw


6,911 post(s)
#13-Jul-17 17:48

9.0.162.1

Changes

New query function: GeomSplitToConvex - takes an area geom and decomposes it into convex parts. The function uses a new algorithm, which is significantly faster than in Manifold 8. The function is also much more robust and can handle geometry that the previous algorithm couldn't. New transform: Decompose to Convex Parts.

New query function: GeomSplitToConvexPar - a parallel version of GeomSplitToConvex. The Transform dialog invokes the parallel function if parallel execution is allowed.
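For illustration, a minimal SQL9 sketch of calling the new functions (the drawing [Areas], its geometry field [Geom] and the splitgeom name are hypothetical; 0 asks for automatic tolerance, and the third argument of the parallel variant is a JSON thread configuration, as shown later in this thread):

-- SQL9
-- sketch only: [Areas], [Geom] and splitgeom are hypothetical names
FUNCTION splitgeom(arg GEOM) TABLE AS
  (SELECT [Value] AS [Geom] FROM CALL GeomToBranches(GeomSplitToConvex(arg, 0)))
END;
-- one record per convex part of each area
SELECT [mfd_id], SPLIT CALL splitgeom([Geom]) FROM [Areas];
-- parallel variant, with a JSON thread configuration:
-- GeomSplitToConvexPar(arg, 0, '{ "threads": "4" }')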

Schemas for views on SQL Server, PostgreSQL and other databases are extended to include a spatial index on each geometry field. This allows drawings created on these views to use existing spatial indexes if the view preserves them. (The drawback is that if the view is designed in such a way that searching it using a spatial criterion is slow, then the drawing is slow. Since there is no reliable way to tell for a view whether or not searching it using a spatial criterion will be slow, we think this is a reasonable compromise.)

There is a new conversion layer for converting coordinate system data to PRJ. Exporting a SHP file writes coordinate system data as PRJ. (We are going to use the conversion in multiple other places.)

There is a new conversion layer for reading ERDAS coordinate systems. Importing or linking an ERS file or an ECW file recognizes many coordinate systems that were previously unrecognized. (The new code supports all ERDAS coordinate systems that we know of. We cross-verified our roster of coordinate systems with those used by various other products as well.)

(Fix) Exporting boolean values to a CSV file puts them in quotes.

(Fix) Exporting a table with binary / geometry / tile fields to a CSV file no longer sometimes mislabels (other) fields.

Exporting a table to a CSV file exports xN and UUID values.

Exporting text values to a CSV file replaces line breaks with spaces to make sure exported data can be imported by as many products as possible.

The SQLITE dataport automatically chooses between SpatiaLite and ESRI's STGEOMETRY extension based on spatial data in the database. (These extensions cannot co-exist and cannot handle each other's data.)

The SQLITE dataport supports adding geometry fields for ESRI's STGEOMETRY extension.

Change to script functions: the parameters in Application.CreatePointObj, Application.CreatePoint3Obj and Application.CreatePoint4Obj have been made optional, to allow COM clients to create point objects with default coordinates, similarly to .NET clients.

Labels for line objects are automatically rotated. (Unlike in Manifold 8, the rotation works for all styles; the style of a label does not get reset to the default if the rotation code does not support it. The new labeling code also performs significantly faster than the old code. We are working to add antialiasing. We are also going to allow bending labels at individual letters; right now the label text has to fit into a single line segment, otherwise it won't be displayed.)

Labels overlapping other labels are automatically skipped. (The new overlap resolution code performs significantly faster than similar code in Manifold 8, scaling to millions of labels. The current limitation is that overlaps are only resolved within the same labels component, not between components. We are going to remove this limitation.)

(Fix) The MySQL dataport no longer sometimes fails due to wrong cursor type.

The NC dataport (NetCDF) recognizes 'latitude' and 'longitude' variable names and interprets them similarly to 'lat' and 'lon'.

(Fix) Exporting a BIL file or a FLT file forces pixel scale values to be non-negative.

End of list.

tjhb

7,248 post(s)
#16-Jul-17 01:31

New query function: GeomSplitToConvex - takes an area geom and decomposes it into convex parts. The function uses a new algorithm, which is significantly faster than in Manifold 8. The function is also much more robust and can handle geometry that the previous algorithm couldn't. New transform: Decompose to Convex Parts.

It's astonishingly fast, and does seem robust. I've attached an example of geometry for which Decompose to Convex Parts and Decompose to Triangles make 8.0.30 crash. Radian 9.0.162.1 handles both transforms perfectly.

Questions:

(1)

Radian SQL functions GeomSplitToConvex[Par] and GeomTriangulate both take a tolerance parameter, but the parameter is not exposed in the corresponding transform dialogs. Is that intentional?

(2)

If we set up Decompose to Convex Parts under Edit > Transform, with Allow parallel execution checked, then press Edit Query, we get a splitgeom() function using GeomSplitToConvex, not GeomSplitToConvexPar. The calling INSERT INTO query does specify THREADS SystemCpuCount(), but I'm not sure that will be effective for INSERT INTO.

The same is true for Decompose to Triangles, using GeomTriangulate, not GeomTriangulatePar.

(On the other hand, Triangulate All and Triangulate All, Lines write queries using GeomTriangulatePar and GeomTriangulateLinesPar, respectively.)

(3)

Somewhat related: the Query Builder syntax for Thread configuration reads

Thread configuration

  • "batch": <batch>
  • "threads": <threads>

It isn't clear that the required syntax is a single-quoted JSON string, with surrounding {} and both arguments in quotes, like

'{"threads": "6", "batch": "32"}'

Double-clicking either of the specifiers copies it verbatim to the code pane (e.g. "batch": <batch>). Would it be possible for the result to be well-formed JSON?

We can also use ThreadConfig, not mentioned here but listed as a function. ThreadConfig() only allows specifying threads, not batch as well -- maybe that is coming.
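A sketch of both spellings, assuming ThreadConfig takes the thread count and returns the JSON configuration string (the function names and the [Geom] field are hypothetical):

-- SQL9
-- hand-written configuration: threads and batch size
FUNCTION splitpar(arg GEOM) TABLE AS
  (SELECT [Value] AS [Geom] FROM CALL GeomToBranches(
    GeomSplitToConvexPar(arg, 0, '{"threads": "6", "batch": "32"}')))
END;
-- ThreadConfig builds a threads-only configuration string
FUNCTION splitpar2(arg GEOM) TABLE AS
  (SELECT [Value] AS [Geom] FROM CALL GeomToBranches(
    GeomSplitToConvexPar(arg, 0, ThreadConfig(SystemCpuCount()))))
END;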

Attachments:
Decompose test.map

tjhb

7,248 post(s)
#16-Jul-17 05:39

Re (2), in a structure like

INSERT INTO (...)

SELECT ...

FROM ...

THREADS N

the THREADS directive formally governs both the SELECT and INSERT parts of the query. In practice, only one thread is used for the INSERT phase (INSERT operations are serialised). But this restriction does not limit the SELECT phase, which will use the N threads specified.

The two phases might overlap (I don't know), with INSERT writing records in one thread as they are produced by SELECT using N threads. The important thing is that INSERT does not create a bottleneck for SELECT.

So in the auto-generated code for Decompose to Convex Parts

-- SQL9

-- ...

FUNCTION splitgeom(arg GEOM) TABLE AS

  (SELECT [Value] AS [Geom] FROM CALL GeomToBranches(CASE WHEN GeomIsArea(arg) THEN GeomSplitToConvex(arg, 0) ELSE NULL END))

END;

PRAGMA ('progress.percentnext' = '100');

INSERT INTO [Contour areas south 1e-10 Table Decompose to Convex Parts] (

  [ID], [definition], [designated], [nat_form], [elevation],

  [Geom (I)]

) SELECT

  [ID], [definition], [designated], [nat_form], [elevation],

  SPLIT CALL splitgeom([Geom (I)])

FROM [Contour areas south 1e-10]

THREADS SystemCpuCount();

the GeomSplitToConvex function is effectively executed using SystemCpuCount() threads, since the splitgeom() function is launched from that many threads simultaneously.

On the other hand, it does seem to be faster again to increase the thread count within the splitgeom() function as well, as for example

FUNCTION splitgeom(arg GEOM) TABLE AS

  (SELECT [Value] AS [Geom] FROM CALL GeomToBranches(CASE WHEN GeomIsArea(arg) THEN GeomSplitToConvexPar(arg, 0, '{ "threads": "6" }') ELSE NULL END))

END;

as well as the THREADS directive on the calling query. The difference is not great though.

adamw


6,911 post(s)
#16-Jul-17 08:15

On the other hand, it does seem to be faster again to increase the thread count within the splitgeom() function as well, as for example

It's only faster if there are a couple of objects that are disproportionately large.

The only performance gained by using multiple threads in the splitter, in addition to using multiple threads in the main SELECT, is the performance otherwise lost when most of the threads in the main SELECT are done but one or two aren't, because they got stuck processing a huge object. When there are many objects, the threads that didn't get huge objects have other work to do, so no performance is lost. When there are enough big objects, all threads get them, and again little performance is lost. Etc.

tjhb

7,248 post(s)
#16-Jul-17 08:31

That is hugely helpful, thank you.

So when we have many objects, of which most are small or simple, but a significant number are disproportionately large and complex, it may be worth adding multiple threads per object as well as multiple threads per batch.

It looks like that is the case in my contour stack -- but it's not a very realistic example.

adamw


6,911 post(s)
#16-Jul-17 08:57

So when we have many objects, of which most are small or simple, but a significant number are disproportionately large and complex, it may be worth adding multiple threads per object as well as multiple threads per batch.

If the number of large objects is significant, it is better to just set the batch size to 1 (THREADS ... BATCH 1). This will avoid cases of a couple of threads getting particularly unlucky with multiple big objects and other threads running out of things to do.
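A sketch of the directive ([Source], [Target] and splitgeom are hypothetical names, following the auto-generated pattern above):

-- SQL9
FUNCTION splitgeom(arg GEOM) TABLE AS
  (SELECT [Value] AS [Geom] FROM CALL GeomToBranches(GeomSplitToConvex(arg, 0)))
END;
-- BATCH 1: each thread takes one object at a time, so a thread stuck
-- on a huge object does not also hold a queue of waiting objects
INSERT INTO [Target] ([Geom])
SELECT SPLIT CALL splitgeom([Geom])
FROM [Source]
THREADS SystemCpuCount() BATCH 1;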

In extreme cases it makes sense to do the query in two parts - first process the small objects with threads allocated per statement, then process the large objects with threads allocated per function call. The small / large test could perhaps be just the number of coordinates.
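A sketch of the two-part approach; GeomCoordCount and the 10000-coordinate threshold are assumptions, and the component names are hypothetical:

-- SQL9
FUNCTION splitgeom(arg GEOM) TABLE AS
  (SELECT [Value] AS [Geom] FROM CALL GeomToBranches(GeomSplitToConvex(arg, 0)))
END;
FUNCTION splitgeompar(arg GEOM) TABLE AS
  (SELECT [Value] AS [Geom] FROM CALL GeomToBranches(
    GeomSplitToConvexPar(arg, 0, ThreadConfig(SystemCpuCount()))))
END;
-- small objects: threads allocated per statement
INSERT INTO [Target] ([Geom])
SELECT SPLIT CALL splitgeom([Geom]) FROM [Source]
WHERE GeomCoordCount([Geom]) <= 10000
THREADS SystemCpuCount() BATCH 1;
-- large objects: threads allocated per function call
INSERT INTO [Target] ([Geom])
SELECT SPLIT CALL splitgeompar([Geom]) FROM [Source]
WHERE GeomCoordCount([Geom]) > 10000;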

adamw


6,911 post(s)
#16-Jul-17 08:06

In order:

1 - That the Transform dialog does not include a control for specifying tolerance for functions like GeomSplitToConvex, GeomTriangulate and some others is intentional. We are changing our approach to tolerance a bit, both in the geometry code and in the UI. We want 99% of all uses of tolerance to be in the Normalize transforms. In all other cases we want tolerance to be 0 (automatic) as often as possible. This allows the data to be cleaner and the functions to be faster.

2 - The parallelism in Decompose to Convex Parts / Decompose to Triangles / other decompose transforms comes from processing multiple objects simultaneously. Having the processing of a single object spawn additional threads generally harms performance instead of helping it. There might be corner cases (e.g., a drawing that contains a single huge object); these are best handled by adjusting the text of the query manually.

3 - The threads / batch parameters don't have to be strings; they can be numbers (the JSON is well-formed either way - perhaps you meant that using strings for these parameters looks strange, in which case you don't have to use them):

'{ "threads": 6, "batch": 32 }'

We won't be extending ThreadConfig to specify batch size. There are too many parameters coming to the thread configuration to include them all, so we'll stop at the number of threads; everything else will have to be specified by editing the string (and we'll provide good defaults).

adamw


6,911 post(s)
#16-Jul-17 08:25

I think I got what you meant in item 3 - you are asking whether it would be better to make clicking the items for threads / batch in the query builder generate a full JSON string, correct? If so, I am not sure this particular thing is a good idea, but we will think about what we can do to make it clearer that the thread config is a JSON string.

tjhb

7,248 post(s)
#16-Jul-17 08:28

Thanks Adam. I responded to point 2 here. (Might not be right.)

Point 1 looks like a substantial change, and we are seeing a snapshot partway through.

I got point 3 wrong (I must have had skewed JSON syntax when I tried numbers). The main point though is that the QB entry is currently not hugely helpful. You got it.

Extensions to ThreadConfig are exciting of course. I imagine at least some of the new parameters will be for GPGPU.

tjhb

7,248 post(s)
#16-Jul-17 08:09

As to speed and robustness, I've been testing Decompose to Convex Parts on a tough dataset in Radian and in Manifold 8.

The dataset is all of the closed contour lines in the South Island of New Zealand, rendered as areas. (They are ordered from ground up, overlapping systematically like a Tower of Hanoi, but this is not important for the test.)

Radian 9.0.162.1 times vary from 7 to 8 minutes, depending on threading options used.

Manifold 8.0.30 has just crashed after completing the normalization phase, on trying to decompose object 1 (the base contour, with the largest area), at 27 minutes.

adamw


6,911 post(s)
#16-Jul-17 08:18

Happy to hear this!

Thanks for the note.

tjhb

7,248 post(s)
#16-Jul-17 08:56

Input: 220803 areas (coord count 4 to 841654, often highly convoluted).

Output: 29306396 convex areas.

Incredibly fast.
