Home - Cutting Edge / All posts - Radian Studio 9.0.160.x
adamw


10,447 post(s)
#03-Apr-17 17:51

9.0.160.x

Converted to a public update.

Changes for 9.0.160.4

Changes for 9.0.160.3

Changes for 9.0.160.2

Changes for 9.0.160.1

adamw


10,447 post(s)
#03-Apr-17 17:52

9.0.160.1

Changes

There is a new SID dataport for MrSID image files. The dataport uses MrSID Raster SDK. The required DLLs are included in the installation packages. There are no additional installation pre-requisites. The dataport supports reading 8-bit raster data.

(Fix) Converting a geometry value with a coordinate system to GeoJSON using GeomJsonGeo no longer fails to escape the coordinate system string.

Pasting data into the Project pane tracks progress and can be canceled.

Adding a new geometry object in the map window reports errors if they occur. (For example, a drawing built on a GDB table will reject an object of the wrong type. Previously, the rejection was silent.)

(Fix) The DWG dataport no longer sometimes produces incorrect boundaries for hatch objects.

The GDB dataport preserves circular arcs, ellipsoidal arcs and cubic splines when writing geometry values. Curves that cannot be translated to GDB are linearized.

(Fix) The GDB dataport no longer sometimes misreads ellipsoidal arcs. (We think we also have a bug in rendering these curves - to be taken care of shortly.)

New script function: Application.CreateExpressionParser - creates an expression parser object that can be used to create expressions.

New script function: Database.CreateExpressionParser - creates an expression parser object that can be used to create expressions in the context of the database.

New script function: ExpressionParser.CreateExpression - takes expression text and parameters, and creates an expression object to evaluate it.

New script function: Expression.Evaluate - evaluates the expression for the specified parameter values.

New script property: Expression.Source - returns expression parameters.

New script property: Expression.Target - returns expression results.

New script property: Expression.Text - returns expression text.

(All added script functions and properties are described on the updated Manifold APIs web site. The added objects allow .NET / COM scripts to use query functions as if they were implemented in the object model.)

End of list.
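As a rough analogy in plain Python (not the Manifold API; all class and method names below are invented for illustration), the expression functions listed above follow a parse-once / evaluate-many pattern: an expression is compiled once with its parameter list, then evaluated repeatedly against different parameter values.

```python
class Expression:
    """Toy analog of the Expression object described above."""

    def __init__(self, text, parameters):
        self.text = text                              # analog of Expression.Text
        self.source = list(parameters)                # analog of Expression.Source
        self._code = compile(text, "<expr>", "eval")  # parse once, reuse many times

    def evaluate(self, values):
        # Evaluate for the specified parameter values (analog of Expression.Evaluate).
        scope = dict(zip(self.source, values))
        return eval(self._code, {"__builtins__": {}}, scope)

class ExpressionParser:
    """Toy analog of the expression parser object."""

    def create_expression(self, text, parameters):
        return Expression(text, parameters)

parser = ExpressionParser()
expr = parser.create_expression("x * y + 1", ["x", "y"])
print(expr.evaluate([3, 4]))   # 13
print(expr.evaluate([10, 2]))  # 21
```

The point of the pattern is that parsing cost is paid once, while evaluation with fresh parameter values is cheap.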

adamw


10,447 post(s)
#10-Apr-17 17:32

9.0.160.2

Changes

Dataports that use .MAPCACHE files allow writes to metadata. This allows formatting linked drawings and images, correcting their coordinate system in case it was missing or has not been read correctly, etc. Edits to metadata are saved in .MAPCACHE. Metadata synthesized by the dataport is provided as a read-only system MFD_META_SOURCE table.

(Fix) Attempting to open a data source which uses a .MAPCACHE file no longer sometimes makes the data source icon stuck in the 'discovering' state.

(Fix) Attempting to connect to a data source which uses a .MAPCACHE file via the ODBC driver no longer sometimes fails.

(Fix) Linearizing an ellipsoidal arc no longer sometimes creates an arc in the wrong direction.

(Fix) Writing a geometry value with an ellipsoidal arc to GDB correctly orients the arc.

The SID dataport supports reading pixel values of arbitrary type.

The SID dataport supports rendering intermediate image levels.

The SID dataport reads EPSG coordinate system info. (Some SID files have coordinate system info as WKT, the dataport currently does not read this due to a limitation in the SID SDK. We are considering using a workaround.)

The SID dataport supports reading multispectral images. Individual channels are split to different tile fields.

There is a new GDAL dataport for reading data through GDAL / OGR. The dataport uses GDAL DLLs already installed on the machine. The DLLs have to be either in one of the Radian folders or in the system path (preferred). The dataport currently supports reading vector data. GDAL drivers are treated as thread-unsafe by default.

The GDAL dataport reads coordinate system info.

(Fix) Prototypes for GeomNormalizeTopology, GeomIntersectLinesPair and several other geometry functions displayed in the query builder use clearer names for return values (<table> instead of <drawing>, etc).

(Fix) The ODBC driver no longer sometimes rejects writing geometry values from Manifold 8.

The ODBC driver flags special key fields such as the OBJECTID field in GDB tables as autogenerated for Manifold 8. This allows Manifold 8 to insert drawing objects into such tables. Editing or deleting the objects that have just been inserted requires refreshing the table (this is a limitation of Manifold 8).

The ODBC driver ignores values for computed fields provided by Manifold 8. This allows Manifold 8 to operate on tables with computed fields (their values are computed by the database, not set by client).

The SHP dataport no longer fails to return table data using the BTREE index on the ID field when used without a cache.

The New Data Source dialog classifies dataports for files like CSV as 'File: xxx' instead of 'Database File: xxx'. (We felt the real distinction is between 'Database:' - the user has to specify the server to open, 'File:' / 'Folder:' - the user has to specify the path to file or folder to open, and 'Web Server:' - the user has to specify the URL to open.)

End of list.

adamw


10,447 post(s)
#17-Apr-17 12:17

A quick heads-up: we expect the next build in a few days. The main focus is scripting.

tjhb
10,094 post(s)
#17-Apr-17 15:13

That's intriguing! Cool.

mdsumner


4,260 post(s)
#20-Apr-17 00:58

Indeed, I've been really slack on Radian but hoping to get back into it. Keen to see some examples of scripting to populate tiled tables.

Also R's odbc (based on nanodbc) and RODBC both still don't play, but I need to check all again based on recent updates. Any idea if nanodbc-based code should or shouldn't work?


https://github.com/mdsumner

adamw


10,447 post(s)
#20-Apr-17 12:02

We found why the ODBC driver fails to work with R's DBI.

It is a pretty small thing. We fixed it; the fix is going to come in the cutting edge build we are preparing (either today or tomorrow).

(The details: there is a parameter that we report as read-only. R blindly tries to set it to the exact same value we are using. We were failing the attempt to set it because the parameter is read-only, and that was the end of it for R. We adjusted the code to allow 'setting' the parameter to the same value it already has. There were no further issues after that.)
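The fix described above amounts to treating a redundant set of a read-only parameter as a no-op. A minimal sketch of the idea (hypothetical names, not the actual driver code):

```python
class ConnectionAttrs:
    """Toy model of driver connection attributes, one of which is read-only."""

    READ_ONLY = {"odbc_version"}

    def __init__(self):
        self._attrs = {"odbc_version": "3.0"}

    def get(self, name):
        return self._attrs[name]

    def set(self, name, value):
        if name in self.READ_ONLY:
            if value == self._attrs[name]:
                return  # no-op: the client re-asserted the current value (the DBI case)
            raise PermissionError(f"attribute {name!r} is read-only")
        self._attrs[name] = value

attrs = ConnectionAttrs()
attrs.set("odbc_version", "3.0")  # previously failed; now accepted as a no-op
```

Setting the attribute to a genuinely different value still fails, so the parameter remains effectively read-only.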

RODBC doesn't work and it shouldn't. It needs a patch, because what it is currently doing is against the spec in a way that we don't want to work around.

mdsumner


4,260 post(s)
#20-Apr-17 12:33

Great! I am very willing to assist on the R side to patch or lobby for changes, if needed - it's over my ken to identify these problems though. Thanks Adam


https://github.com/mdsumner

adamw


10,447 post(s)
#21-Apr-17 18:59

9.0.160.3

Changes

Web dataports that have been allowed to use cache (the default) allow writes to metadata. This allows formatting drawings and images exposed by these dataports, etc.

The GDAL dataport reads raster data.

The GDAL dataport reads palettes for indexed color data.

The GDAL dataport automatically recognizes "folder" data sets as classified by GDAL (GDB, TIGER/Line) and sets up the GDAL connection accordingly. (Third-party software using GDAL usually has separate dialogs for "file" and "folder" data sets. Our dataport attempts to distinguish between these two types of data sets seamlessly.)

The GDAL dataport exposes indexes on vector data.

The GDB dataport allows changing table schemas (to the extent supported by ESRI's GDB SDK).

The ODBC driver includes an adjustment that allows connections from R via DBI.

The object model underwent a series of architectural changes which improve its performance. (This build includes about two-thirds of the changes we have been planning to do, the next build will deliver the last third. Some of the most important changes are singled out in the notes below.)

Script services are removed. The Services pane is removed. (We used to need script services to hold critical data for running scripts. We no longer do. We were also planning on using script services to exchange data between scripts and between sessions of scripts. After changes to the object model we have better ways to do this when we need it.)

The object model eliminates locks on many of the key objects. This allows multiple threads of the same script to perform calls on the same object in parallel. (Previously, calls to the same object were usually serialized. Multiple scripts could make calls in parallel, but multiple threads of the same script could not, unless they used script services like our underlying code did - which was obviously too much to expect from a script; the code that had to be written was too complex.)

The object model optimizes transferring data via sequences.

The object model optimizes transferring data to and from expressions.

End of list.

adamw


10,447 post(s)
#21-Apr-17 19:02

A couple more words on scripts and performance.

The object model in Manifold 8 concentrated on ease of use. Since the internal data structures used by Manifold 8 itself were mostly single-threaded, the object model did not have to be multi-threaded and so it used a global lock. This meant that multiple calls into the object model could not proceed in parallel. If the user somehow managed to call the object model from multiple threads, these calls were then performed in sequence. The object model took one call at a time, making other threads wait in line. Many packages and libraries (e.g., the standard implementation of Python) do the exact same thing; global locks are pretty widespread, but they obviously are not a good fit for multiple threads.
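The difference can be sketched in Python (a toy model, not Manifold code): with one process-wide lock, every call serializes no matter which object it targets; with per-object locks, calls on distinct objects proceed in parallel.

```python
import threading
import time

GLOBAL_LOCK = threading.Lock()

class GlobalLockObject:
    """Every call takes the process-wide lock, as in the Manifold 8 model."""
    def call(self, hold):
        with GLOBAL_LOCK:
            time.sleep(hold)  # stand-in for work done while holding the lock

class PerObjectLockObject:
    """Each object has its own lock; calls on different objects overlap."""
    def __init__(self):
        self._lock = threading.Lock()
    def call(self, hold):
        with self._lock:
            time.sleep(hold)

def run(factory, nthreads=4, hold=0.05):
    """Time nthreads concurrent calls, one object per thread."""
    objs = [factory() for _ in range(nthreads)]
    threads = [threading.Thread(target=o.call, args=(hold,)) for o in objs]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

serialized = run(GlobalLockObject)   # roughly nthreads * hold
parallel = run(PerObjectLockObject)  # roughly hold
```

With four threads, the global-lock variant takes roughly four times as long as the per-object variant, which is exactly the "waiting in line" effect described above.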

In Radian, we reworked the internal data structures to be multi-threaded, and reworked the object model to be multi-threaded as well. Up until recently, however, objects in the same script were still synchronizing with each other by default, because - by default - they communicated with Radian through a shared channel. It was entirely possible to set up objects to communicate through different channels programmatically, but that had to be done explicitly. (Objects in different scripts didn't have that issue, but the most important scenario is obviously a single running script.) The batch of changes in this build allows script objects to communicate with Radian in a different way without using shared channels, removing the problem entirely. Not all objects have been adjusted to communicate without using shared channels yet, but many were, and we are going to adjust all of them in the next build.

In addition to allowing seamless multi-threading, we have been making gradual improvements to the object model over many builds now. These improvements were frequently too technical / niche to mention in the build notes, but their effect was growing.

To illustrate how far we all have come we are attaching the results of a set of tests that compare the performance of the object model between Manifold 8, Radian 9.0.160 (one test needs 9.0.160.2, the previous cutting edge build), and the current build.

Test data sets for both Manifold 8 and Radian are attached as well. Test scripts don't do anything profoundly clever, but they stress the system in ways characteristic of real analysis. Each file includes a comments component with instructions on how to perform the tests.

scripts-throughput.map

scripts-throughput-8.map

Individual notes on each test follow.

adamw


10,447 post(s)
#21-Apr-17 19:05

Test 1 - write numbers / strings

Create a table and write a number of records with numeric / string values.

We use the following format for the results: <product/build> - <time in milliseconds> (<number of processed records>) - <number of records processed per second>. Best number is shown in bold.

8.0.30 - 25600 ms (10k records) - 0.39k rec/s

9.0.160 - 4200 ms (500k records) - 119k rec/s

9.0.160 SQL - 3310 ms (500k records) - 151k rec/s

9.0.160.3 - 4200 ms (500k records) - 119k rec/s

First, Radian is clearly much faster at this than 8. The standard number of records we used for Radian was 500k and we had to reduce that number 50x to make the test for 8 manageable. The main reason for the performance difference is different internal data structures, which are way faster in Radian, with the difference between the object models coming second.

The performance in both builds of Radian is the same, changes to the object model in 9.0.160.3 were in other areas.

Inserting records using a script is slightly slower than doing the same using a query. This is fine for a simple script, although in general, given enough time and code, a script should outperform a query.

We don't include a SQL version of the test for 8, because the query engine in 8 lacks a function that would produce a virtual table filled with a sequence of numbers.

Test 2 - read numbers / strings

Read the table created by test 1, parse values.

8.0.30 - 2060 ms (10k records) - 4.9k rec/s

8.0.30 SQL - 41 ms (10k records) - 244k rec/s

9.0.160 - 2530 ms (500k records) - 198k rec/s

9.0.160 SQL - 2090 ms (500k records) - 239k rec/s

9.0.160.3 - 1820 ms (500k records) - 275k rec/s

This time the query engine in 8 can perform the test. Using SQL in 8 is clearly much faster than using scripts. For this particular query, the performance of the query engine in 8 is on par with that of the query engine in Radian. (In very simple scenarios like this one and / or on very small data sets, the query engine in 8 can even outperform Radian due to being less sophisticated! The scripts, however, cannot.)

In 9.0.160, reading records using a script continues to be slightly slower than doing the same using a query. But in 9.0.160.3, the script outperformed the query, thanks to the implemented changes.

Test 3 - write geoms

Create a table and fill it with geometry values.

8.0.30 - 9400 ms (1k records) - 0.1k rec/s

9.0.160 - 9900 ms (500k records) - 50.5k rec/s

9.0.160.3 - 9900 ms (500k records) - 50.5k rec/s

This test behaves more or less like test 1. 8 is again the slowest. The two builds of Radian have the same performance, because changes to the object model in 9.0.160.3 were in other areas. The test prepares data for the next test.

Test 4 - read geoms

Read the table created by test 3, parse geometry values.

8.0.30 - 2930 ms (1k records) - 0.34k rec/s

9.0.160 - 3430 ms (500k records) - 146k rec/s

9.0.160 SQL - 81000 ms (500k records) - 6.2k rec/s

9.0.160.3 - 2740 ms (500k records) - 182k rec/s

8 is slowest, obviously. The standard test for Radian is 500x the size of that for 8. That's the effect of the many accumulated changes.

The query engine is significantly slower than the script at this particular task - which was the purpose of the test. Some tasks are much better done using a script - if the query analyzed geometry using a script function, the performance would have been much better.

9.0.160.3 improved the performance of the script by about 25%. The effect comes from optimizations in sequences.

Test 5 - read geoms complex

Read a separate table of geometry values, transforming each value using a sequence of spatial functions.

Tests for Radian include both a single-threaded and a multi-threaded version.

8.0.30 - 18650 ms (1k records) - 53.6 rec/s

8.0.30 SQL - 17300 ms (1k records) - 57.8 rec/s

9.0.160.2 - 17400 ms (1k records) - 57.5 rec/s

9.0.160.2 (4x) - 18100 ms (1k records) - 55.2 rec/s

9.0.160 SQL - 17500 ms (1k records) - 57.1 rec/s

9.0.160 SQL (4x) - 5475 ms (1k records) - 182.6 rec/s

9.0.160.3 - 17400 ms (1k records) - 57.5 rec/s

9.0.160.3 (4x) - 5450 ms (1k records) - 183.5 rec/s

The performance of 8 is reasonable. The number of calls into the object is small and this helps a lot. The performance of the query engine is slightly better than the performance of the script.

The performance of 9.0.160.2 is about the same as that of 8. The vast majority of the time is spent in geometry functions which have been explicitly selected to be more or less the same for 8 and Radian (for now), so the same performance from a script is expected. Reworking the script to use multiple threads does nothing. In fact, the script slows down a bit due to double locking - first in a script and then in the object model. The single-threaded query performs about the same as the single-threaded script. Reworking the query to use multiple threads improves the performance by quite a bit - as expected.

The performance of the single-threaded script in 9.0.160.3 is the same as in 9.0.160.2. But adding multiple threads now provides an immediate performance boost - about the same as for the query. The performance boost from using 4 threads for both the query and the script is about 3.2 rather than 4, because there is some overhead. The overhead could be reduced by making the portion of the code shared between threads more sophisticated (this is easier to do for a script than for a query, due to the script having more control over what gets executed and how).
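The 3.2x figure can be checked directly from the reported times:

```python
# Speedup and parallel efficiency from the times reported above for test 5.
single_threaded_ms = 17400  # 9.0.160.3 script, 1 thread
four_threads_ms = 5450      # 9.0.160.3 script, 4 threads

speedup = single_threaded_ms / four_threads_ms
efficiency = speedup / 4    # fraction of the ideal 4x

print(round(speedup, 1))    # 3.2
print(round(efficiency, 2)) # 0.8
```

So roughly 80% of the ideal 4x is realized, with the remaining 20% lost to the shared-code overhead mentioned above.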

Test 6 - read pixels

Read all pixels in an image.

8.0.30 - 18020 ms (88.5k pixels) - 0.5k pix/s

8.0.30 SQL - 113 ms (88.5k pixels) - 783k pix/s

9.0.160 - 580 ms (29.1m pixels) - 50,172k pix/s

9.0.160 SQL - 19770 ms (29.1m pixels) - 1,472k pix/s

9.0.160.3 - 580 ms (29.1m pixels) - 50,172k pix/s

Radian outperforms 8 by a huge margin. The query engine in 8 is much faster than a script, but its throughput is very far from what is possible in Radian.

This is one more test which is performed much better using a script than a query. The performance of a pure SQL solution in Radian is still 2x faster than the performance of a similar query in 8, but the query really should have used a script function in the middle, because the script is more than 30 times faster.

There is no difference between the performance of the script in 9.0.160 and in 9.0.160.3.

Enjoy!

tjhb
10,094 post(s)
#21-Apr-17 22:27

Well, we knew it would be big! This is really huge. It's great to have such detailed testing notes and examples.

(By the way I'm really glad you came up with something simpler than services for sharing data between scripts. I was afraid I would have been too lazy to learn to understand that. Now you've made another kind of parallelism essentially free.)

tjhb
10,094 post(s)
#21-Apr-17 22:51

The examples are really valuable--especially showing so clearly how to run a scripting task using multiple threads. It will make a great template.

The comparison in test 4 between SQL and scripting may not be entirely fair (or more accurately, not the end of the story). That's because in [query 5 - read geoms complex 4 threads] the function

FUNCTION f(g GEOM) FLOAT64 AS GeomLength(GeomBuffer(GeomConvertToLine(GeomBuffer(g, 0.3, 0)), 0.1, 0), 0) END;

could be rewritten into 3 nodes so that each heavy geometric operation can have its own pool of threads. An SQL version written in that way may be significantly faster than the current scripting version. (I haven't tested.)

On the other hand, (a) I think you specifically wanted to measure 4 threads in scripting against 4 threads in SQL (which is obviously a valuable test); and (b) I think the scripting version could also be rewritten so that each worker thread launched its own pool of child workers with part of the task, and each child worker could delegate a further part to a pool of grandchild threads, much as in SQL (at least in logic). All the same, in that case, creating such a thread pipeline would remain easier to code (and tune) in SQL.

adamw


10,447 post(s)
#22-Apr-17 07:12

You are correct in both (a) and (b) - we wanted the comparison to be apples-to-apples and whatever threading scheme is used by a query can be implemented by a script. Obviously, as you say, it is much simpler to experiment with that in a query, a script has to be much more verbose. But there is a payback in that when it is clear what threads we want, a script can be much more flexible and pick up a lot of performance which the query engine cannot.

For example, if we do want separate threads for GeomBuffer / GeomConvertToLine / second GeomBuffer above - which I don't think is going to do much in this particular case, but let's say our test is such that splitting threads that way helps - a script can use a single pool of threads with each thread doing either part of the work depending on whatever data is ready. A query engine can not do this right now. (And if / when the query engine learns to do this, the script will pull some other trick.)
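The single-pool scheme described above - each thread picking up whichever stage has data ready - can be sketched like this (a toy Python model; the stage functions are stand-ins for heavy geometry operations, not Manifold code):

```python
import queue
import threading

tasks = queue.Queue()
results = []
results_lock = threading.Lock()

def stage_a(x):
    # Stand-in for the first heavy transform; hands its output back to the pool.
    tasks.put(("b", x + 1))

def stage_b(x):
    # Stand-in for the second heavy transform; produces a final result.
    with results_lock:
        results.append(x * 2)

STAGES = {"a": stage_a, "b": stage_b}

def worker():
    while True:
        stage, item = tasks.get()
        try:
            if stage != "stop":
                STAGES[stage](item)  # do whichever stage's work is ready
        finally:
            tasks.task_done()
        if stage == "stop":
            break

for x in range(8):
    tasks.put(("a", x))

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
tasks.join()  # waits until all a-tasks and the b-tasks they spawned are done
for _ in threads:
    tasks.put(("stop", None))
for t in threads:
    t.join()

print(sorted(results))  # [2, 4, 6, 8, 10, 12, 14, 16]
```

No thread is pinned to a stage: the same pool drains whatever work exists, which is the flexibility a fixed per-stage thread pool (the query engine's model) does not have.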

It's a healthy balance. Queries are faster to write, so you start solving every task with one or more queries. Then as you make progress, you offload parts of the logic which you want to optimize to scripts (in the extreme case rewriting queries into a script entirely).

tjhb
10,094 post(s)
#22-Apr-17 08:29

That is a really important post, thanks so much Adam.

Let me try drawing a mistaken set of principles.


Query execution will usually be better than script execution, iff per-thread data load will be either roughly constant, or monotonically increasing.

However, multi-threaded scripts can be just as good in those cases, with more coding work.

But, where per-thread data may be randomly variable, scripts have an inherent advantage that queries can (for now) never match.

adamw


10,447 post(s)
#22-Apr-17 08:42

This isn't wrong, but I would have said it simpler:

SQL is a high-level language and scripts are a lower-level language. Anything you can do in a query, you can do in a script (in Radian, the object model is thin enough not to be an obstacle). The query engine has many, many, many optimizations that would be hard to replicate in a script, but that's still theoretically possible. What's more important is that the query engine does not have other optimizations that can be implemented in a script easily. So, the best approach is to mix and match, picking the best of both worlds.

mdsumner


4,260 post(s)
#22-Apr-17 01:29

Sweet, here's my minimal code for getting started with R. For anyone reading the install notes, be sure to run the optional ODBC installation line/s from a console with elevated privileges. There's no feedback, but the action can be confirmed by seeing a new ODBC driver in the system, with "(Experimental)" in the name.

You can confirm the connection in R with something like this:

#R
library(DBI)
mapfile <- "C:\\data\\Radian\\odbc\\Drawing_9.0.map"
dstring <- sprintf("Driver={Manifold 9.0 Project Driver (Experimental) (*.map)};DBQ=%s;", mapfile)
con <- dbConnect(odbc::odbc(), .connection_string = dstring)
con
# <OdbcConnection>  Database: C:\data\Radian\odbc\Drawing_9.0.map
# Radian Studio 9.0 Universal Edition (build 9.0.160.3) Version: 9.00.0003

If you try to have two live connections to the same source (don't do that) you'll see an error like this:

#R
Error in odbc_connect(connection_string) :
  nanodbc.cpp:950: IM006: [Microsoft][ODBC Driver Manager] Driver's SQLSetConnectAttr failed

Make sure your R and DBI and odbc and other packages are up to date.

I'm using the R release candidate, with R version 3.4.0 RC (2017-04-20 r72569) with these package versions: DBI_0.6-1 odbc_1.0.1 compiler_3.4.0 tibble_1.3.0 Rcpp_0.12.10 blob_1.0.0

More soon, thanks!


https://github.com/mdsumner

adamw


10,447 post(s)
#22-Apr-17 07:20

You can have multiple connections to the same MAP file as long as they are read-only (the connection string above should include "...;ReadOnly=true;"). Read-write connections to MAP files are exclusive.

Connections to data sources other than MAP files behave according to the data source type. Connections to most file-based data sources are exclusive (sometimes even if the data source is opened in read-only mode), connections to web data sources are always shared, connections to databases are a mixed bag again - MDB / XLS and the likes can again be exclusive depending on connection options, SQL Server / PostgreSQL / Oracle and other databases are shared.

mdsumner


4,260 post(s)
#22-Apr-17 08:04

Oh duh, great :0 I'm so used to this being read-only .. time to unpick 15 years of habit.


https://github.com/mdsumner

mdsumner


4,260 post(s)
#29-Apr-17 13:27

Just a note that I was able to use the GDAL dataport to import successfully.

I checked in the Log Window first, which reported that it couldn't load the DLL it was looking for, and then I added C:\OSGeo4W64\bin to my path (for several years I've used the local shortcut from the OSGeo4W install, which doesn't require the system path to be set).

To check something that Radian cannot already read on its own, I am messing around with VRT, which works.


https://github.com/mdsumner

adamw


10,447 post(s)
#02-May-17 07:24

A heads-up: we expect the next build either today or tomorrow. This is going to be the final build in this series of builds. After the final testing, the series will go public.

adamw


10,447 post(s)
#03-May-17 17:58

9.0.160.4

Changes

(Fix) LIMIT is no longer a reserved keyword in queries.

New script function: Values.AddValueType - takes a name and a type, and adds a value with type info and no data. The function allows conveying type information to objects like expressions without using model values.

New script property: Value.Type - reads the value type.

New script function: Value.CreateCopy - creates a copy of a value.

New script function: Values.CreateCopy - creates a copy of a value set.

New script function: Values.CreateLocked - creates a copy of a value set with locked structure. Attempting to add or remove a value in a locked value set will throw an error. In return, using a locked value set to pass data to functions like Table.Insert performs faster.

Renamed script function: Expression.Source is renamed to Expression.GetSource. Each call returns a new Values object. This allows using the returned object for evaluation (using the shared object returned by Expression.Source only worked if the script never altered its structure and never used it from multiple threads; this was too error-prone).

Renamed script function: Expression.Target is renamed to Expression.GetTarget. Each call returns a new Values object, similarly to Expression.GetSource.

Script functions that return Values objects like Sequence.Fetch or Expression.Evaluate perform noticeably better and include optimizations for values with many items.

Attempting to access an item with an unknown name in a Values object or in one of the schema collections returns a null object instead of throwing an error.

New script property: Database.CanDesign - returns true if the database supports changes to table schemas and false otherwise. The property is mainly used to determine in advance whether creating a new table and altering its schema later will be possible.

New script property: Database.IsSaveAsNeeded - returns true if saving changes to the database will change its format (used to indicate when saving a MAP file will upgrade its format from Manifold 8 to Radian and make it unreadable in Manifold 8).

Script functions for Database and Table objects no longer reserve an opts parameter for extensibility. (Extensibility is now done differently.)

New script property: Schema.IndexField.IgnoreAccent - controls the use of accents for a text field in a btree index.

New script property: Schema.IndexField.IgnoreSymbols - controls the use of symbols for a text field in a btree index.

Reworked script function: Schema.DeleteItem is replaced with Schema.DeleteConstraint, Schema.DeleteField and Schema.DeleteIndex.

Launching a script reclaims memory in undisposed objects at the end. The change has little effect for well-written scripts, but makes a big difference for less well-written ones, where it helps conserve memory and pinpoint coding errors.

The 000 (S-57) dataport performs significantly faster.

The 000 dataport exposes labels and map.

The 000 dataport recognizes way more coordinate systems.

The SID dataport supports reading JPEG2000 files and has been made the default dataport for JPEG2000 files due to an issue in the current version of the ECW SDK. (We are having second thoughts about the switch, as the SID SDK also has issues with JPEG2000, just on different files.)

(Fix) The GDAL dataport no longer creates a separate image for each channel of BGR / BGRA data.

The GDAL dataport reports the driver used in the log.

(Fix) Altering the schema of a GDB table increments its version to update windows that show its data.

End of list.
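Values.CreateLocked above trades flexibility for speed. A rough Python analogy (not the Manifold API): fixing the record structure up front lets repeated inserts skip per-record structural work, at the cost of refusing new fields.

```python
from collections import namedtuple

# Structure is locked at creation, like a locked value set.
Record = namedtuple("Record", ["id", "name"])

rows = []

def insert(rec):
    # The sink can rely on every record having exactly the locked structure.
    rows.append(rec)

insert(Record(1, "a"))
insert(Record(2, "b"))

# Attempting to extend a locked record fails, by design:
try:
    Record(3, "c").extra = "x"
except AttributeError as e:
    print("rejected:", e)
```

The analogy is loose, but the trade-off is the same: a structure that cannot change is a structure the consumer never has to re-inspect.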

adamw


10,447 post(s)
#03-May-17 17:59

Updated test results

Updated test file, changes for 9.0.160.4 are marked in comments:

scripts-throughput-b.map

Here are the comparisons of 9.0.160.4 with 9.0.160.3; all test sizes are the same, so I am just providing times in milliseconds.

Test 1 - write numbers / strings

9.0.160.3 - 4200 ms

9.0.160.4 - 2600 ms, a significant improvement from using locked value sets

Test 2 - read numbers / strings

9.0.160.3 - 1820 ms

9.0.160.4 - 1600 ms, a minor improvement from optimizations in sequences

Test 3 - write geoms

9.0.160.3 - 9900 ms

9.0.160.4 - 8100 ms, a fair improvement from using locked value sets

Test 4 - read geoms

9.0.160.3 - 2740 ms

9.0.160.4 - 2350 ms, a minor improvement from optimizations in sequences, like in test 2

Test 5 - read geoms complex

Test 6 - read pixels

No changes, optimizations were in other areas

There are two new tests.

Test 7 - write many fields

9.0.160.3 - 7800 ms

9.0.160.4 - 3200 ms, a significant improvement from multiple sources

Test 8 - read many fields

9.0.160.3 - 10450 ms

9.0.160.4 - 1200 ms, a dramatic improvement from optimizations in value sets

Lastly, updates to the API doc are coming (very) soon.

adamw


10,447 post(s)
#10-May-17 17:33

The build went public.

There are tons of additions in the object model, you can see them in the final build notes.

We managed to squeeze some more performance, too - and where it matters the most: building geometry / tile values as well as accessing pixel data in tile values. I will try to post some performance numbers later.

We are also going to update the API doc tomorrow.

The scripts have never been this fast!

And then - next wave of cutting edge builds. This time the focus is going to be the UI, particularly for rasters.

adamw


10,447 post(s)
#11-May-17 10:20

Here is an illustration of some of the most important performance-related changes in the final build.

This is a fragment of code from test 6 (read pixels) in 9.0.160.4:

// C#
M.Tile tile = ...
M.Tile.PixelSet pixels = tile.Pixels;
for (int pixely = 0; pixely < tile.Height; ++pixely)
{
  for (int pixelx = 0; pixelx < tile.Width; ++pixelx)
  {
    object pixel = pixels[pixelx, pixely];
    if (pixel == null)
      continue;
    double height = (short)pixel;
    if (height >= 200)
      accum++;
  }
}

...and this is the same code in 9.0.161:

// C#
M.Tile tile = ...
M.Tile.PixelSet<short> pixels = (M.Tile.PixelSet<short>)tile.Pixels;
M.Tile.PixelSet<bool> pixelMissingMasks = tile.PixelMissingMasks;
for (int pixely = 0; pixely < tile.Height; ++pixely)
{
  for (int pixelx = 0; pixelx < tile.Width; ++pixelx)
  {
    if (pixelMissingMasks[pixelx, pixely])
      continue;
    double height = pixels[pixelx, pixely];
    if (height >= 200)
      accum++;
  }
}

Let's count the differences.

First, pixel values returned by the pixel set are no longer of the abstract 'object' type, which had to be cast individually to whatever type the values really are. Instead, you specify the pixel type as a parameter to the PixelSet type, which is now a generic. This avoids having to convert types later. (Note how the pixel value in the first fragment has to be cast to 'short' first, 'short' being the real type of pixels in that particular tile. You can't even cast the pixel value directly to 'double', which is what the code in the example wants: that cast would fail, because the value returned by the pixel set is a boxed 'short', not a 'double'. More importantly, you have to cast the value of each pixel in the tile to the same type over and over again. Note how the pixel value in the second fragment does not have to be cast to anything at all. The system already knows it is a 'short' from the PixelSet definition, and the conversion to 'double' is automatic.)

Second, invisible pixels are no longer represented by null values. Instead, there is a PixelMissingMasks property that returns a boolean value for each pixel (true - the pixel is missing = invisible, false - the pixel is not missing = visible). This allows pixel values to be returned as regular non-nullable value types without boxing, which also improves performance.

Also, this is not shown in the above fragments, but Point<T>, Point3<T>, Point4<T> values used to represent pixels in 2/3/4-channel tiles are now value types as well. The object model used to have a limitation in that all custom types added by Radian had to be reference types, that limitation has been removed and so primitive vectors are now value types. This improves performance again.

The end result of the changes for this particular example is a performance boost of about 30%:

Test 6 - read pixels

9.0.160.4 - 580 ms

9.0.161 - 450 ms

And one more performance difference of 9.0.161 compared to 9.0.160.4:

Test 3 - write geoms

9.0.160.4 - 8100 ms

9.0.161 - 6800 ms

The improvement comes from changes to the geom builder. Points becoming value types does not affect this test much, but it would if the script, for example, stored them in arrays.

Other tests show the same performance, some are marginally better than before, none are slower.
