Subscribe to this thread
Home - General / All posts - How to Generate a Near Table in Manifold?
#15-Oct-20 17:33

Hi,

I am new in Manifold.

I have a set of 100 polygons that represent parcels.

For each parcel, I need to know the nearest parcel based on Euclidean Distance (centroid Origin to centroid Neighbor parcel).

How can I do this in Manifold? I need something similar to the Generate Near Table tool from ArcGIS.

Any suggestions?

Attachments:
argGIS_nearestTable.PNG

apo122 post(s)
#15-Oct-20 19:17

Hello,

I recently used a query to do that in M9 between a set of points and an object

SELECT [points].[mfd_id],

SPLIT (

COLLECT [objects].[mfd_id], GeomDistance([points].[Geom][objects].[Geom], 0) ORDER BY GeomDistance([points].[Geom][objects].[Geom], 0)  FETCH 1 )

FROM [points] , [objects]

GROUP BY [points].[mfd_id] 

This is using a global join between the two tables and might be interested in restricting a bit with a

INNER JOIN [objects]

ON GeomDistance([points].[Geom], [objects].[Geom], 0) < x

tjhb

9,452 post(s)
#15-Oct-20 19:18

Nice!

By the way (an aftertought) you can avoid the SPLIT, which is strictly pro-forma since you are using FETCH 1, by wrapping the COLLECT expression in an extra set of (round) brackets. This casts the result--a one-record, one-field table (per group)--as a scalar.

apo122 post(s)
#15-Oct-20 19:20

yes the split collect is smart as I love SQL simple and efficient

apo122 post(s)
#15-Oct-20 19:23

I just found that this approach is limited by the lack of the AS possibility to rename the calculated fields.

Sometimes M9 is returning "return" as field name and sometimes a combination of the source table name and field, what is annoying if you have to couple several queries

tjhb

9,452 post(s)
#15-Oct-20 19:26

Yes, results produced by a COLLECT expression cannot be aliased.

On the other hand, a wrap in an extra layer allows this, and is essentially free (~no overhead), since the compiler is brainy enough to recognise that it does no extra work.

It's just one of those things, soon used to it.

apo122 post(s)
#15-Oct-20 19:34

sure we can alias it on a extra layer but what is annoying for me is that it is injecting extra symbols or chars as this example

[REQUALSPE_STEP5.\u0001E_year]

from the table [REQUALSPE_STEP5] and the field [E_year]

tjhb

9,452 post(s)
#15-Oct-20 19:45

Huh? Can you post an example that does that?

apo122 post(s)
#15-Oct-20 19:51

Sure here it is encountered that yesterday

I just run the query and added the table in the query zone

what is more strange, to me, is that you can refer directly to the [E_year] field even if its name is composed

Attachments:
collect.mxb

tjhb

9,452 post(s)
#15-Oct-20 21:09

To me that looks like a bug. Can you report it?

rk
412 post(s)
#15-Oct-20 22:14

The extra symbol is actually already there in the name.

Attachments:
u0001.png

apo122 post(s)
#16-Oct-20 07:01

yes we agree that the symbol is injected by the COLLECT query.

I'll report it

tjhb

9,452 post(s)
#16-Oct-20 07:08

That’s what I thought I saw too. Why I have no idea.

On composed names though, I think these are, precisely, composed only when fields need to be shown to the world. Inside the compiler’s environment, fields are just pointers, names don’t matter, until they do.

apo122 post(s)
#16-Oct-20 07:12

Just reported it.

The second level of strangeness is that I can refer to this column using only E_year even if its name is [REQUALSPE_STEP5.\u0001E_year]. It means the engine is splitting the name by .\u0001 to finally consider on the second half of the field name.

Tested this using

SELECT [E_year] FROM [REQUALSPE_STEP6]

tjhb

9,452 post(s)
#16-Oct-20 07:14

(It doesn’t mean that. See belated comment above.)

apo122 post(s)
#16-Oct-20 07:22

If I understand you right the pointer is anything but a name, right? Then it doesn't explain to me how I can "reach" [REQUALSPE_STEP5.\u0001E_year] calling [E_year]. But what ever is the technical background it has to be more clear for me

tjhb

9,452 post(s)
#16-Oct-20 07:26

Yes, that seems fair. Quite right.

What I am saying in a nutshell, basically agreeing with you, is that there is room for improving SQL9 here.

We can’t do it ourselves...

tjhb

9,452 post(s)
#16-Oct-20 07:47

It’s not quite so simple, but also more simple.

You can refer to a field only by a non-ambiguous name. What is unambiguous changes according to context.

The query engine tries to assign non-ambiguous names to fields whenever and however they are called upon by syntax.

Unfortunately and fortunately, this means names can vary with query level.

When a table is made concrete by SELECT or SELECT INTO, ambiguities are again eliminated.

That’s my reading, but this is a really great conversation.

apo122 post(s)
#16-Oct-20 07:18

But then three logics are used for that

  1. the old Manifold custom to add a incremental number at the end of the name of a duplicated name in columns -> E_year becoming E_year2
  2. the new way to add the name of the table in front even if the column name

  • the result generic name for calculation which I understand but was before column

    For all that I would love having the AS available in order to be cleaner

  • tjhb

    9,452 post(s)
    #16-Oct-20 07:22

    I think you are onto something. Several conventions might be in collision here.

    And while I understand that aliases aren’t available for the results of COLLECT, I accept that without knowing why. You are right to push on that I think.

    apo122 post(s)
    #15-Oct-20 19:25

    hmm never got it to work reason why i stick on the SPLIT, could you rewrite my small query with this principle

    tjhb

    9,452 post(s)
    #15-Oct-20 19:28

    Sure. Give me a few minutes, I am feeding the dogs. (For which I use an iPad.)

    tjhb

    9,452 post(s)
    #15-Oct-20 21:07

    You are right, I had overlooked that the query returns two fields per record.

    We can do either

    SELECT [points].[mfd_id],

        (( -- cast single field as scalar

        COLLECT

        GeomDistance([points].[Geom][objects].[Geom], 0)  --> [result]

        ORDER BY GeomDistance([points].[Geom][objects].[Geom], 0)

        FETCH 1

        ))

    FROM [points][objects] -- CROSS JOIN

    GROUP BY [points].[mfd_id]

    ;

    or

    SELECT [points].[mfd_id],

        ((  -- cast single field as scalar

        COLLECT

        [objects].[mfd_id] --> [mfd_id 2]

        ORDER BY GeomDistance([points].[Geom][objects].[Geom], 0)

        FETCH 1

        ))

    FROM [points][objects] -- CROSS JOIN

    GROUP BY [points].[mfd_id]

    ;

    But if both result fields are needed, then you need SPLIT, as in yoiur query.

    #16-Oct-20 14:04

    Super helpful code apo, thanks!! It worked very fast on my sample.

    I am trying to select only the distances that are >0.

    I am using CASE, but I think I am missing something here?

    SELECT [points].[mfd_id],

    SPLIT (COLLECT [objects].[mfd_id],

    CASE GeomDistance([points].[Geom], [objects].[Geom], 0) > 0

    THEN ORDER BY GeomDistance([points].[Geom], [objects].[Geom], 0) FETCH 1 )

    FROM [points] , [objects]

    GROUP BY [points].[mfd_id]

    tjhb

    9,452 post(s)
    #16-Oct-20 16:31

    The syntax for CASE is CASE WHEN... THEN... END, all parts required. There can be multiple instances of WHEN...

    But this is not an instance when CASE is very useful.

    Here you could instead use a WHERE filter inside the COLLECT clause.

    But it would be better to use an INNER or OUTER JOIN to prevent touching matches in the first place. It would almost always be best to use a search radius as well. CROSS JOINs are usually a mistake, even when they work correctly.

    apo122 post(s)
    #17-Oct-20 11:40

    Tim is right the cross join is not the more elegant path to follow but I was focusing on the COLLECT. Anyway concerning the condition you Amintas want to inject I would add it in the JOIN, the query at the end being something like

     

    SELECT [points].[mfd_id]

    SPLIT ( 

      COLLECT [objects].[mfd_id]

              GeomDistance([points].[Geom][objects].[Geom], 0) 

              ORDER BY GeomDistance([points].[Geom][objects].[Geom], 0) 

              FETCH 1 ) 

    FROM [points]

    INNER JOIN [objects] 

    ON GeomDistance([points].[Geom][objects].[Geom], 0)>0

    GROUP BY [points].[mfd_id] 

    best

    a.

    Manifold User Community Use Agreement Copyright (C) 2007-2019 Manifold Software Limited. All rights reserved.