Subscribe to this thread
Home - General / All posts - Clip transform not optimized
tjhb
10,094 post(s)
#23-Sep-18 02:56

(9.0.168.2.)

The Clip transform operates on two tables, clipping geoms in t1 with all area geoms in t2.

Its first step is currently to union all area geoms in t2 into one area.

That is not optimal if geoms in t1 tend to be small, and the geoms in t2 are collectively large or complex.

I think it would normally be better for the template to approach this geom-by-geom: for each geom in t1, union (only) all touching areas union in t2. Then clip.

(That seems also efficient if the situation is reversed: if geoms in t1 are large, and the areas in t2 are many and small.)

It might also save work if there were no reprojection where projections already match.

[Added.] There might be more to it. Clipping in 9.0.168.2 (at least) seems terribly slow compared to Manifold 8. I will attach a test project with timings in Manifiold 8 and 9.

adamw


10,447 post(s)
#24-Sep-18 16:21

We did the query this way on purpose - we have several optimizations in mind here which don't care about one of the areas in the clip potentially being big and we think that in the end this will outperform the constant searches for each object. We don't have these optimizations ready though, so if the operation without them ends up being terribly slow (it might on some types of data), well... maybe we should reconsider the query for the time being, or maybe we should bump the priority for the optimizations. Note though that if the clipping drawing already consists of a single complex object, it already is bottlenecked in the exact same place it bottlenecks now - a clip of a small object and a large object spending time proportional to the complexity of the large object - this is the right thing to fix (making the operation spend time proportional to the complexity of the result of the clip).

tjhb
10,094 post(s)
#25-Sep-18 04:14

Thanks very much Adam. Back soon with a specific comparison in Manifold 8 and 9, with the data I was working on above.

tjhb
10,094 post(s)
#26-Sep-18 01:17

An interim update.

Test data on Dropbox: for Manifold 8 (.7z, 7.6 MB), for Manifold 9 (.mxb, 9 MB).

Both files contain the same source geometry, in two drawings.

Both drawings are projected (NZTM/NZGD2000). In Manifold 8 they have the default location precision of 1e-6.

The 3 areas in [mainland] show the main 3 islands of New Zealand.

The 16204 areas in [island] include both (a) offshore islands (some large, some tiny) and (b) islands within waterbodies and rivers on the mainland.

(Why are (a) and (b) published as a single data class? You may well ask.)

The purpose of the current exercise is to split the islands into offshore islands and mainland islands.

I have tried to keep operations in 8 and 9 as close to identical as possible.

Here are the operations in 8:

open project "20180923 Mainland and islands R32 m8.map"

open map

    [All objects in mainland] Normalize metric

    [All objects in island] Normalize metric # unnecessary

duplicate [island] -> [Island 2]

add [Island 2] to map

    [All objects in Island 2] Normalize metric

    [All objects in Island 2] Clip with (intersect) [All objects in mainland]

duplicate [island] -> [Island 3]

add [Island 3] to map

    [All objects in Island 3] Normalize metric

    [All objects in Island 3] Clip with (intersect) [All objects in mainland]

Timings in 8:

Clip with (Intersect): 109.308 sec (1mn 49s)

Clip with (Subtract): 107.765 sec (1mn 48s)

In 9 I have used matching tolerance (1e-6). Earlier incomplete tests used 0 (automatic tolerance). I can retest fully at 0, time permitting, but the impression was much the same as with 1e-6.

Here are the operations in 9:

open project "20180923 Mainland and islands R32 m9.map"

open map

    [mainland]

        Contents > Template

            Geom

            Template

                Normalize

                    Geom: Geom

                    Tolerance: 1e-6

                +Allow parallel execution

            Update field

    [island]

        (same)

Project pane

    [island], [island Table]

        Copy

        Paste -> [island 2], [island Table 2]

        Paste -> [island 3], [island Table 3]

add [island 2] to map

add [island 3] to map

open map

    [island 2]

        Contents > Template

            Geom

            Template

                Normalize

                    Geom: Geom

                    Tolerance: 1e-6

                +Allow parallel execution

            Update field

    [island 3]

        (same)

    [island 2]

        Contents > Template

            Geom

            Template

                Clip

                    Clip with: mainland

                    Tolerance: 1e-6

                    +Keep inner part

                +Allow parallel execution

            Add component

            ...

That's as far as I've got in 9. After processing for 1h 15mn, we are up to object 2256 (of 16204), still on the first operation. CPU usage is a solid 100%.

I am not sure that I'll let it finish. (That's the reason for interim update.)

In both cases Manifold 8/9 has been the only application running except Notepad++ (for keeping notes), Firefox (just now, for getting Dropbox links and writing this post), and in the case of Manifold 9, I have kept one instance of Manifold 8 open (just for pasting occasional screenshots made via PrtScn).

[Screenshot added, after another 15mn processing.]

[9.0.168.2. Windows 10 64-bit, 32 GB RAM, 3x SSD, TEMP on SSD2.]

Attachments:
screenshot.png

tjhb
10,094 post(s)
#26-Sep-18 02:43

I posted that, then had lunch. The last bite of lunch said...

Why not try combining all of the small areas into one multi-part area, then clipping with the large areas? Maybe the inefficiency in clipping many small areas with a few large areas really comes down to the n-way join.

I'll give steps in a minute, to be consistent, but here are the timings. The first is with "Keep inner part" (intersect), the second without (subtract).

Transform (Clip): [island Table Merge Areas Drawing] (21.843 sec)

Transform (Clip): [island Table Merge Areas Drawing 2] (22.344 sec)

So I feel really dumb to be outsmarted by a flounder (and some lettuce), but I won't be the only one who runs into this trap, so I'm pleased that I did.

An example of trying to do in 9 what one would do in 8, in the way one would do it in 8. I know that is wrong, but I did exactly that.

tjhb
10,094 post(s)
#26-Sep-18 03:50

This approach is almost the exact opposite of what I initially thought the answer should be, in my first post in this thread, and looking at the data it's clear why that is the case.

Perhaps what is most efficient will mainly come down to the difference in scale between the objects to be clipped (call this set A) and the clipping area(s) (set B).

I'm going to try thinking aboiut it like this:

(1) First, if it is possible that objects from A will overlap more than one area from B, then the areas from B should normally* be merged either wholly (if potential overlaps are common) or in the groups that touch each object in A (if overlaps will be rare). *That is, assuming input objects in A should map to output objects 1:1.

(2) Then, if objects in A are generally much smaller than the area(s) in B, then set A should be merged* before clipping, and split afterwards. *If some or all attributes must be kept, then it will usually be better to group by those attributes, rather than merging the whole set.

(3) If the objects in A and area(s) in B are generally of similar size, then set A should not be merged.

(4) If the objects in A are generally much larger than the area(s) in B, then again set A should not be merged, but areas in B should definitely be merged, either wholly or in touching groups (as for (1), but with a stronger preference for touching groups in this case).

I am only thinking out loud, and any of these points may be completely wrong--just as my first post was.

Could the built-in clipping template(s) be made to adapt to this sort of factor in the data? That is probably a different question.

adamw


10,447 post(s)
#26-Sep-18 08:21

Thanks a lot for the notes.

Figuring out whether pre-joining the clip areas will make sense in terms of performance is computationally expensive, that's mostly a dead-end for the general case. We'll take a look at the numbers on our side and either revert the pre-joining until we have the optimizations I talked above, or bump the priority of these optimizations.

tjhb
10,094 post(s)
#27-Sep-18 04:44

Thanks Adam.

In the meantime, for anyone else who needs to clip many tiny areas with a few large areas, here is a workflow that works well.

[island drawing]

    Contents > Transform

        Geom

        Template: Merge areas

            Geom: Geom

        Options

            Table: [all islands merged]

            Component: [all islands merged drawing]

        +Allow parallel execution

        Add component

        -- Transform (Merge Areas): [island drawing] (0.094 sec)

[map]

    add [all islands merged drawing]

    [mainland drawing]

        Contents > Template

            Geom

            Template

                Normalize

                    Geom: Geom

                    Tolerance: 1e-6

                +Allow parallel execution

            Update field

            -- Transform (Normalize): [mainland drawing] (4.281 sec)

    [all islands merged drawing]

        Contents > Transform

            Geom

            Template: Normalize

                Geom: Geom

                Tolerance: 1e-6

            +Allow parallel execution

            Update field

            -- Transform (Normalize): [all islands merged drawing] (2.218 sec)

            Template: Clip

                Clip with: [mainland drawing]

                Tolerance: 1e-6

                +Keep inner part -- first time

            Options

                Table: [all islands mainland]

                Component: [all islands mainland drawing]

            +Allow parallel execution

            Add component

            -- Transform (Clip): [all islands merged drawing] (21.844 sec)

[all islands mainland drawing]

    Contents > Transform

        Template: Decompose to shapes

        Options

            Table: [island mainland]

            Component: [island mainland drawing]

        Add component

        -- Transform (Decompose to Shapes): [all islands mainland drawing] (0.844 sec)

[map]

    [all islands merged drawing]

        Contents > Transform

            Geom

            Template: Clip

                Clip with: [mainland drawing]

                Tolerance: 1e-6

                -Keep inner part -- second time

            Options

                Table: [all islands coastal]

                Component: [all islands coastal drawing]

            +Allow parallel execution

            Add component

            -- Transform (Clip): [all islands merged drawing] (21.875 sec)

[all islands coastal drawing]

    Contents > Transform

        Template: Decompose to shapes

        Options

            Table: [island coastal]

            Component: [island coastal drawing]

        Add component

        -- Transform (Decompose to Shapes): [all islands coastal drawing] (1.141 sec)

The total time of all transforms here is 52s, which compares well with the total Manifold 8 time of 217s or 3mn 37s. (And very much better that the 1h 30mn to get about 16% of the way through in Manifold 9 by taking a naive approach.)

All of the transforms can be combined into a single query (quite easy with the help of the Edit Query button in each transform) to make a one-click operation.

Very small points. Tolerance of 1e-6 (0.000001) is only used to match the Manifold 8 procedure exactly. "Automatic" tolerance (0), or something more meaningful for the data, would normally be better. (2) I renamed the source table and its drawing after posting test data in the Dropbox project. (3) In the steps given earlier for the Manifold 9 workflow that did not work well for this data, I mistakenly wrote "Contents > Template" when I meant "Contents > Transform".

Lastly, no apologies for splitting so many hairs here since the time saving is huge. Someone else might bang their head against the same thing in the meantime and might find this thread helpful--before it is all made completely irrelevant by new optimizations.

danb

2,064 post(s)
#27-Sep-18 21:59

Thanks for posting Tim. This will be very useful. I had exactly this task to do yesterday which I started in 9 then ended up deferring to 8. I need to do it again today so will try your suggested methodology.

Clips and the various topology overlays seem to be quite a large part of my day to day life and I am deferring both to 8 at the moment. I'm very pleased to hear from Adam however that both are owed a rework at some point along the line.


Landsystems Ltd ... Know your land | www.landsystems.co.nz

BoldItalicUnderlineStrikethroughSuperscriptSubscriptInsert Bullet ListInsert Numbered ListInsert ImageInsert LinkRemove LinkInsert Horizontal RuleInsert HeadingInsert SubheadingInsert TextInsert QuoteInsert CodeRemove Formatting
Insert Symbol SmileWinkGrinBlinkBlushOh, My!HuhSadAngryPh34rSleepCool
Link URL: Text Raw HTML




Post Reply Cancel
Manifold User Community Use Agreement Copyright (C) 2007-2021 Manifold Software Limited. All rights reserved.