Subscribe to this thread
Home - General / All posts - M9 Contour Query Comparison
atrushwo108 post(s)
#29-Nov-18 16:48

Hey Everyone,

I had a question regarding a couple of queries I wrote. The first one works fine; however, the second does not, and I do not understand why. The purpose of the queries is to relate closed depression contours to the smallest contour which it is contained by. I need this information to use the average end area method to calculate incremental volumes for depth/volume curves for depressions.


Query 1

UPDATE

(

SELECT

 [3].[MFD_ID],

 [3].[MINCON] AS [DESTINATION],

 [S_MFD_ID]

FROM

(

SELECT

 [O_MFD_ID],

 FIRST([S_MFD_ID]AS [S_MFD_ID]

FROM

(

SELECT 

 [O_MFD_ID],

 [S_MFD_ID] 

FROM 

CALL GeomOverlayContainedPar(

[Contours DRAWING] ([mfd_id][HEIGHT][Geom]),

[Contours DRAWING] ([mfd_id][HEIGHT][Geom]),

0,ThreadConfig(SystemCpuCount()))

WHERE [S_MFD_ID] <> [O_MFD_ID] AND ABS([O_HEIGHT]-[S_HEIGHT]) <= 0.1

ORDER BY GeomArea([S_GEOM], 0) ASC

)

GROUP BY [O_MFD_ID]

THREADS SystemCpuCount()

)

INNER JOIN [CONTOURS] AS [3] ON [O_MFD_ID] = [3].[MFD_ID]

THREADS SystemCpuCount()

)

SET [DESTINATION] = [S_MFD_ID]


Query 2

UPDATE

(

SELECT

 [O_MFD_ID],

 SPLIT (COLLECT [S_MFD_ID],[o_MINCON] ORDER BY GeomArea([S_GEOM], 0) ASC FETCH 1)

FROM

(

SELECT 

 [O_MFD_ID],

 [O_MINCON],

 [S_MFD_ID],

 [S_GEOM]

FROM 

CALL GeomOverlayContainedPar(

[Contours DRAWING] ([mfd_id][HEIGHT][Geom]),

[Contours DRAWING] ([mfd_id][HEIGHT][Geom][MINCON]),

0,ThreadConfig(SystemCpuCount()))

WHERE [S_MFD_ID] <> [O_MFD_ID] AND ABS([O_HEIGHT]-[S_HEIGHT]) <= 0.1

)

GROUP BY [O_MFD_ID]

THREADS SystemCpuCount()

)

SET [O_MINCON] = [S_MFD_ID]

atrushwo108 post(s)
#29-Nov-18 17:16

Example Map

Attachments:
Contours.map

tjhb
10,094 post(s)
#29-Nov-18 17:31

After the SPLIT in the second query, the BTREE index on [O_MFD_ID] becomes (is converted to) BTREEDUP.

That is necessary, because each row (and ID value) is listed as many times as the rows produced by COLLECT for its group.

The fact that this is exactly one (FETCH 1) in this special case (FETCH 1) could in principle be seen by the compiler, but is not. It just sees SPLIT by a table and acts accordingly.

So the target table no longer has any unique index for its records, and can't be addressed for UPDATE.

I think it is easily fixed but I'll check first.

atrushwo108 post(s)
#29-Nov-18 17:53

Thanks for the reply! Great explanation too. I'm glad to know that I'm not going crazy.

I run into numerous situations like this and ideally would like to circumvent the redundant join. Some examples are:

  1. Finding the name of the nearest object (the one with the shortest distance)
  2. Finding the month with the maximum precipitation

tjhb
10,094 post(s)
#29-Nov-18 22:48

I was partly wrong. Having a better look now (with testing).

tjhb
10,094 post(s)
#01-Dec-18 06:01

Sorry for delay in posting this, may not still be useful.

Reformatting is not criticism but a side-effect of reading. Your code was already good, I learned a few things from it.

--SQL9

UPDATE [CONTOURS] SET [MINCON] = NULL;

PRAGMA ('progress.percentnext' = '100');

UPDATE

    (

    SELECT

        t.[mfd_id], t.[MINCON],

        u.[s_mfd_id]

    FROM

        [Contours] AS t

        INNER JOIN

        (

        SELECT

            [o_mfd_id],

            (

                -- extra parens to cast table as scalar

                -- preserves logical 1:1 and BTREE

                (

                COLLECT [s_mfd_id]

                ORDER BY GeomArea([s_Geom], 0) ASC

                FETCH 1

                )

            )

        FROM

            (

            SELECT

                [o_mfd_id],

                [s_mfd_id][s_Geom]

            FROM 

                CALL GeomOverlayContainedPar(

                    [CONTOURS DRAWING] ([mfd_id][Height][Geom]),

                    [CONTOURS DRAWING] ([mfd_id][Height][Geom][MINCON]),

                    0,

                    ThreadConfig(SystemCpuCount())

                    )

            WHERE [s_mfd_id] <> [o_mfd_id]

            AND ABS([o_Height] - [s_Height]) <= 0.1

            )

        GROUP BY [o_mfd_id]

        ) AS u

    ON t.[mfd_id] = u.[o_mfd_id]

    )

SET [MINCON] = [s_mfd_id]

;

Sorry for lack of notes (except one), questions welcome. I've removed THREADS specifications where they can't help.

I run into numerous situations like this and ideally would like to circumvent the redundant join. Some examples...

If you post specific examples (with data and/or code), we could discuss what is redundant and what is not, how best to preserve helpful indexes, etc. That would be fun.

Attachments:
Query 4.sql

tjhb
10,094 post(s)
#01-Dec-18 06:20

P.s. the point where the original BTREE was (and is) lost (converted) was at the GeomOverlay..., naturally, since no 1:1 relation could be assumed.

For that reason we do need to join back here.

tjhb
10,094 post(s)
#01-Dec-18 20:13

One more point, important.

It's about your original Query 1.

Don't use FIRST in that way in 9, we can't, especially not with multiple threads.

In multi-threaded SQL, FIRST and LAST both effectively mean "ANY" (some unpredictable row).

You were exactly right to use COLLECT instead, in Query 2. Then its ORDER does retain rigour, reliable.

tjhb
10,094 post(s)
#02-Dec-18 03:06

(The query can be further improved.)

atrushwo108 post(s)
#03-Dec-18 15:00

Morning!

No worries on the delay and apologies for mine as well. I put this down for the weekend. I have a bit of flexibility on delivering this task so timing isn't driving me right now. I'm really just trying to understand better. As such, I had a few questions/comments.

If I understand correctly.... (Let me know if I don't get it)

  1. The GeomOverlay converts the BTREE to a BTREEDUP because of the 1:many relationship
  2. The Group by converts the BTREEDUP to a BTREE (otherwise the BTREE couldn't have been preserved in the scalar function)
  3. The Scalar function is as opposed to the vector split function. Use of the scalar function preserves the BTREE for use on the join.
  4. The join compares the original BTREE against the preserved BTREE for optimal performance.

My questions to you are....

  1. Do I understand things correctly?
  2. How might I speed this up? You suggested that this query could be improved. Right now, Query 1, 4 and 5 (included below) are virtually identical in performance. I've been testing them on a larger dataset. 4 is the better than 1 by about 0.5%, 5 is worse than 1 by about 10%.
  3. My only idea was to do the GeomOverlay on a BoundedBox version of the contours then add an extra GeomOverlay condition in the where statement to ensure the BoundedBoxes didn't pick up anything extra. However, this only slowed things down. I didn't attach this version.

Thanks for the tip on first/last too. This was from my Manifold 8 experience where I didn't have the collect statement.


Query 5

--CREATE TEMPORARY DATABASE

CREATE ROOT x;

USE ROOT x;

CREATE DATASOURCE d AS ROOT;

--LOAD CONTOURS INTO TEMPORARY DATABASE

SELECT [MFD_ID],[GEOM],[HEIGHT] INTO [TCONTOURS] FROM d::[CONTOURS];

ALTER TABLE [TCONTOURS] (ADD INDEX [MFD_ID_XX] BTREE ([MFD_ID]));

ALTER TABLE [TCONTOURS] (ADD INDEX [GEOM_XX] RTREE ([GEOM]));

ALTER TABLE [TCONTOURS] (ADD PROPERTY 'FieldCoordSystem.Geom' 'PROJCS["Calgary_3TM_WGS_1984_W114",GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",0.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",-114.0],PARAMETER["Scale_Factor",0.9999],PARAMETER["Latitude_Of_Origin",0.0],UNIT["Meter",1.0]]');

CREATE DRAWING [TCONTOURS Drawing] 

(

  PROPERTY 'Table' '[TCONTOURS]',

  PROPERTY 'FieldGeom' 'GEOM'

);

--CREATE LOOKUP TABLE

SELECT 

 [O_MFD_ID],

 ((COLLECT [S_MFD_ID] ORDER BY GeomArea([S_GEOM], 0) ASC FETCH 1))

INTO [TEMP]

FROM 

CALL GeomOverlayContainedPar(

[TContours DRAWING] ([mfd_id][HEIGHT][Geom]),

[TContours DRAWING] ([mfd_id][HEIGHT][Geom]),

0,ThreadConfig(SystemCpuCount()))

WHERE [S_MFD_ID] <> [O_MFD_ID] AND ABS([O_HEIGHT]-[S_HEIGHT]) < 0.1

GROUP BY [O_MFD_ID]

--ADD INDEX TO LOOKUP TABLE

ALTER TABLE [TEMP] (ADD INDEX [MFD_ID_XX] BTREE ([O_MFD_ID]));

--RELATE BACK

UPDATE

(

SELECT

 [1].[MFD_ID],

 [1].[MINCON] AS [DESTINATION],

 [2].[S_MFD_ID] AS [DATA]

FROM D::[CONTOURS] AS [1]

INNER JOIN [TEMP] AS [2] ON [1].[MFD_ID] = [2].[O_MFD_ID]

)

SET [DESTINATION] = [DATA]

Manifold User Community Use Agreement Copyright (C) 2007-2021 Manifold Software Limited. All rights reserved.