Subscribe to this thread
Home - General / All posts - neater coding / selecting triangles
yves61
438 post(s)
#15-Mar-15 13:52

I have this query which selects triangle areas , which coordinates - all 3 - lay on another area having more than 3 coordinates. Is there a better and neater coding to achieve the same goal - using the 3 coordinates of the triangles as a test criterium ?

OPTIONS COORDSYS("Drw" AS COMPONENT);

select * from 

    (

    select P1ID, P2ID, count(contains(P2ID,P1IDCoord)) as QTYCoord FROM

        (

        select P1.id as P1ID,  P1IDCoord,  P2.id as P2ID from Drw as P1, Drw as P2 

        where

        P1.id <>P2.id

        and

        CoordCount(P1.id)=3

        AND

        CoordCount(P2.id)>3

        Split by coords(P1.id) as P1IDCoord

        )

    where 

    contains(P2ID,P1IDCoord)

    Group By P1ID,P2ID

    )

where 

QTYCoord = 3

Attachments:
triangles.map

tjhb
10,094 post(s)
#15-Mar-15 19:40

[Made a mistake in my reading of your objective. It pays to actually look at the data! Back soon.]

tjhb
10,094 post(s)
#15-Mar-15 20:18

There's one bug, though its effect is masked:

count(contains(P2ID,P1IDCoord))

This is not a useful construct since it will always return the count of all rows. Contains returns Boolean, and COUNT() counts all values except NULLs, so every row where neither P2ID nor P1IDCoord is NULL will be counted, regardless of whether Contains() returns TRUE or FALSE for that row.

This is not interfering with the result in this case, since all rows where Contains() is FALSE are removed by the WHERE condition. (To see what COUNT() is really doing, try removing the WHERE Contains() clause.)

What you might want in another case like is something like

SUM(CAST(Contains(a, b) AS INTEGER))

though you don't need that here--you just need to count the contained coords.

Other than that, there's nothing wrong with your method. It can be simplified a bit though and I think this is what you are asking for pointers on.

In the join conditions, you don't need to test whether P1.ID <> P2.ID, since you are already testing that they have different numbers of coords (which entails that they are different objects).

On the other hand, there is a chance that it might be helpful to do so, if it allows the query engine to make better use of its index on ID. You can find that out by testing.

The WHERE Contains() filter can be more naturally expressed as a LEAVING filter directly on the result of SPLIT BY.

Similarly the final WHERE QTYCoord = 3 filter can be expressed as a HAVING filter on the result of GROUP BY.

OPTIONS CoordSys("Drw" AS COMPONENT);

SELECT

    [P1ID][P2ID],

    COUNT([P1IDCoord]AS [QTYCoord]

FROM

    (SELECT

        [P1].[ID] AS [P1ID][P1IDCoord],

        [P2].[ID] AS [P2ID]

    FROM

        [Drw] AS [P1]

        INNER JOIN

        [Drw] AS [P2]

    ON [P1].[ID] <> [P2].[ID] -- unnecessary but possibly helpful

    AND CoordCount([P1].[ID]) = 3

    AND CoordCount([P2].[ID]) > 3

    SPLIT BY Coords([P1].[ID]AS [P1IDCoord]

    LEAVING Contains([P2].[ID][P1IDCoord])

    )

GROUP BY [P1ID][P2ID]

HAVING COUNT([P1IDCoord]) = 3

;

Why maintain consistent capitalization, if the engine doesn't care? What style you choose is up to you, but in my opinion *some* consistent style helps build good code. Makes it easier to visualize a solution, to spot bugs, and later to maintain and repurpose code.

Why always use square brackets? Harder to justify that. Habit and perhaps a little bit more clarity on the page (or you might think: more clutter), plus one important practical benefit. If you want to change the name of a table, a column or an alias, everywhere in the code, you can be certain that replacing all instances of [name] does exactly that, even if you happen to use a name that contains an SQL token appearing elsewhere in the query.

Why use a fully expressed INNER JOIN rather than the comma/WHERE form? Because it makes you think: what kind of join do I need here? Do I need to apply the condition as a filter to both tables (INNER JOIN), just one table (LEFT or RIGHT), or neither table (FULL JOIN); or do I mean no condition at all (CROSS JOIN)? These are all different tools, and if you explicitly set them out on your mental desk, you'll use them. Making the right choice can mean getting the right result rather than an error (of course); but in cases where more than one kind of join is workable, choosing the best kind can make a large difference to speed. And when you come back later you can immediately see your thinking.

(Avoiding the comma/WHERE form of join also potentially leaves WHERE syntax for the specific purpose of filtering rows, which in turn helps keep joins and simple filters as separate concepts in your head.)

More important than any of that is working towards a style that works for you equally well on the page and in your head. For example, if you prefer lower case and minimal use of whitespace in your head, you'll probably prefer lower case and minimal whitespace on the page as well.

Attachments:
qrySelectTriangles.txt

tjhb
10,094 post(s)
#15-Mar-15 20:32

This is a simpler way again, to get the same result.

Not necessarily better (I like your method using SPLIT BY).

This uses multipoints. Care is required when using multipoint geoms in Manifold--always check that the result of a geometric test is what you would expect. The main gotcha is that Touches() only looks at the first coord of a multipoint. As Art has pointed out elsewhere, we can successfully use Distance() = 0 in that case instead, which does check all coords. Here Contains() checks all coords as well.

OPTIONS CoordSys("Drw" AS COMPONENT);

SELECT

    [P1].[ID] AS [P1ID],

    [P2].[ID] AS [P2ID]

FROM

    [Drw] AS [P1]

    INNER JOIN

    [Drw] AS [P2]

ON [P1].[ID] <> [P2].[ID] -- unnecessary but possibly helpful

AND CoordCount([P1].[ID]) = 3

AND CoordCount([P2].[ID]) > 3

AND Contains([P2].[ID], ConvertToPoint([P1].[ID]))

;

Attachments:
QrySelectTriangles 2.txt

adamw


10,447 post(s)
#16-Mar-15 07:11

OT: Without looking into the details of the conversion above (which I am 99% sure doesn't lose anything important), I have to say it is amazing how the original (a) SELECT calling SELECT with GROUP calling SELECT with SPLIT, was replaced by (b) SELECT with GROUP calling SELECT with SPLIT, and then by (c) SELECT with, um, a join.

yves61
438 post(s)
#16-Mar-15 07:25

A proof of how less is more ... by mastering the code and cutting the clutter.

Thank you. I feel I am learning a lot from you ... and in need of studying the code more.

Manifold User Community Use Agreement Copyright (C) 2007-2021 Manifold Software Limited. All rights reserved.