Subscribe to this thread
Home - General / All posts - M8 help for writting a SQL Query to merge couple of points in a cloud of points
Corentin47 post(s)
#30-May-18 17:12

Hello,

I have a set of 18000 points locating villages of Mali. Many of them are couple, very close (less than 1km) and with the same name. How can I merge these point in a single one.

Case 1 : same location, same name => I can just delete one of them

Case 2 : same name, very close. How can I create a new point in the middle and delete the old one? (but only if they are close enough, like less than 1km)?

I don't want to just merge points with the same name because some of them can be real different villages...

adamw


7,948 post(s)
#31-May-18 09:27

If you have a threshold distance below which points with the same name are close enough to be thought of as duplicates, then you can (a) take the X / Y coordinates for each point, (b) divide X / Y by threshold distance, then round to the nearest integer, (c) group points by name and rounded coordinates, then (d) take the first point in each group and delete the rest.

See example MXB.

The query:

--SQL9

 

-- create table for IDs to keep

CREATE TABLE [keep] ([mfd_id] INT64INDEX [mfd_id_x] BTREE ([mfd_id]));

 

-- collect duplicates by name / collapsed coordinates

INSERT INTO [keep] ([mfd_id])

SELECT First([mfd_id]FROM (

  SELECT [mfd_id][name],

    Round(VectorValue(GeomCoordXY([geom], 0), 0) / 50) AS [cx],

    Round(VectorValue(GeomCoordXY([geom], 0), 1) / 50) AS [cy]

  FROM [buildings]

GROUP BY [name][cx][cy];

 

-- show duplicates (used as an example)

SELECT * FROM [buildings] WHERE [mfd_id] NOT IN [keep] ([mfd_id]);

-- delete duplicates

--DELETE FROM [buildings] WHERE [mfd_id] NOT IN [keep] ([mfd_id]);

I keep IDs to keep in an intermediate table for simplicity.

Hope this helps.

Attachments:
remove-duplicates.mxb

Manifold User Community Use Agreement Copyright (C) 2007-2017 Manifold Software Limited. All rights reserved.