Convert dark colored pixels to an area drawing with Radian
Mike Pelletier

2,122 post(s)
#23-Mar-17 21:31

Is it possible in Radian to take an image, such as the attached, and convert shadows and other very dark colored pixels into a vector area drawing? It's possible in Mfd 8, but that requires converting the project in many time-consuming batches, so I'm hoping Radian has the tools.

Attachments:
Capture.JPG

tjhb
10,094 post(s)
#23-Mar-17 22:05

We could:

Start with TileToValuesXN. Then a self join on the resulting table to match each pixel P (ValueXN) with each neighbouring pixel Q (within radius R, or dX, dY) having similar colour (or e.g. Lightness or Value). Then group by each P and draw a new area combining cell P (pixel as area) with all matching neighbours Q (pixels as areas). Then union all those new areas in one group (giving a multipatch area, with overlaps resolved by the union). Lastly split using GeomToShapes (islands).

You could do the same in Manifold 8, but Radian would have its obvious benefits.

tjhb
10,094 post(s)
#24-Mar-17 00:00

There's a bug in that method. To avoid a "cascading tolerance" effect (where a smooth gradient, for example, could all be combined as one area) we would need to do this instead.

Start with TileToValuesXN. Filter pixels (ValueXN) in the resulting table within a colour range, or a lightness or value range. Then a self join on the filtered table, matching each pixel P with each neighbouring pixel Q (within radius R, or dX, dY). Then group by each P... (as before).

(That would also be quite a bit faster than the wrong approach.)
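As an aside, the filter-first pipeline can be sketched outside SQL. Here is a minimal Python stand-in (not Manifold code): threshold the pixels first, then flood-fill adjacent dark pixels into islands, which plays the role of unioning pixel areas and splitting with GeomToShapes. The function name and the tiny raster are invented for illustration.

```python
# Sketch of the corrected pipeline: threshold first, then merge
# adjacent dark pixels into islands (connected components).

def dark_islands(raster, cutoff):
    """Return sets of (x, y) pixels, one set per island of values < cutoff."""
    dark = {(x, y)
            for y, row in enumerate(raster)
            for x, v in enumerate(row)
            if v < cutoff}                      # the filter step
    islands = []
    while dark:
        seed = dark.pop()
        island, frontier = {seed}, [seed]
        while frontier:                         # flood fill = the union step
            x, y = frontier.pop()
            for n in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if n in dark:
                    dark.remove(n)
                    island.add(n)
                    frontier.append(n)
        islands.append(island)
    return islands

raster = [
    [200,  40,  40, 200],
    [200,  40, 200,  30],
    [200, 200, 200,  30],
]
print([sorted(i) for i in dark_islands(raster, 100)])
```

Filtering before grouping is what avoids the cascading-tolerance effect: a smooth gradient that straddles the cutoff is split by the threshold, not chained together.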

Mike Pelletier

2,122 post(s)
#24-Mar-17 18:34

Thanks Tim for the method and Adam for the hope that it will be easier in the future. Here's some SQL that gets the X and Y values plus a lightness value from the image attached above. I chose 100 for the lightness cutoff but was thinking of refining that once the rest of the query is working. Seems like this is on track, but it's probably not very elegant SQL.

Why match pixel to neighboring pixels? Could a rectangle be created around each pixel and then unioned from there?

Function f1() Table As
(select X, Y, value from
  (SELECT SPLIT Call TileToValuesX3([Tile])
   FROM [capture])
) End;

Function f2() Table As
(select X, Y,
  VectorValue(Value, 0) as B,
  VectorValue(Value, 1) as G,
  VectorValue(Value, 2) as R
 from call f1()
) End;

Function f3() Table As
-- http://stackoverflow.com/questions/596216/formula-to-determine-brightness-of-rgb-color
(Select X, Y, (0.2126*R + 0.7152*G + 0.0722*B) as Lightness
 from call f2()
) End;

Select X, Y, Lightness
 from call f3()
 where Lightness < 100
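The weighting in f3 is the Rec. 709 relative-luminance formula. As a stand-alone sanity check, here it is in plain Python (not Manifold code; the sample channel values are invented). Note the channel order from f2: Manifold tiles store BGR, so VectorValue(Value, 0) is blue, 1 is green, 2 is red.

```python
# Rec. 709 relative-luminance weighting, as used in f3 above.
# The weights 0.2126 + 0.7152 + 0.0722 sum to 1, so equal R=G=B
# channels give back that same value. Sample values are invented.

def lightness(b, g, r):
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

print(lightness(255, 255, 255))  # pure white
print(lightness(30, 30, 30))     # dark grey, well under a cutoff of 100
```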

Mike Pelletier

2,122 post(s)
#25-Mar-17 15:52

The attached map has a query that takes the above query and converts the selected pixels to points as a way to test the lightness formula, which, by the way, is failing miserably at this point. However, there is weird stuff going on with the query. I have changed the source image that the query uses, but it continues to use the original image somehow. This is apparent from the overall extent of the points and their lat/long. Perhaps this is caused by some caching issue. Changing the WHERE clause values does change the created points, though. What's going on here, please?

Attachments:
RS shadow to vector.map

tjhb
10,094 post(s)
#25-Mar-17 23:09

Mike,

(1)

The first problem is that function f1 is incorrect.

Function f1() Table As
(select X, Y, value from
  (SELECT SPLIT Call TileToValuesX3([Tile])
   FROM [image])
) End;

TileToValues returns the X and Y index for each pixel within the current tile.

Within each tile, the range of X and Y indexes is the same. So TileToValues will give the same sequence of X and Y (here [0, 127]) for every tile, followed by a distinct ValueX3 for the colour (BGR).

The effect is somewhat as if you were returning a stack of tiles one on top of the other; in other words, the set of pixels within each tile, but no reference for the position of each tile within the image.

Here you need the tile X and Y indexes, as well as the pixel indexes.

FUNCTION f1() TABLE AS
    (SELECT
        [tile X], [tile Y],
        [X] AS [pixel X], [Y] AS [pixel Y],
        [Value]
    FROM
        (SELECT
            [X] AS [tile X], [Y] AS [tile Y],
            SPLIT CALL TileToValuesX3([Tile])
        FROM [image]
        )
    )
END;

-- test:
EXECUTE CALL f1();

(2)

Next, how to turn the extracted values into points?

It is not as simple as in Manifold 8, where we can use X (I) and Y (I) (or Center X (I), Center Y (I)) for each pixel, and inherit the exact projection of the image including local offsets and scales.

What gets in the way of that is the same thing as in (1): the 'double' nature of the X and Y indexes, first of the tile within the image, then of the pixel within the tile.

I'll come back if I have any bright ideas on how to draw the centre point for each pixel (in the right projection). I think we will need to use at least one pair of ValueSequence functions for this.

(3)

To get lightness (which is different from brightness, though that's not crucial here), it would be simpler to use TileBgrHsl and just take the third value, Lightness. You could also use TileBgrHsi for Intensity or TileBgrHsv for Value.
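For what it's worth, HSL lightness is (max + min) / 2 of the channels, which really is different from the luma weighting used in f3. Python's stdlib colorsys (just an illustration, not the Manifold function) shows the gap for pure green:

```python
import colorsys

# HSL lightness is (max + min) / 2 of the RGB channels, which is not the
# same as the 0.2126/0.7152/0.0722 luma weighting. Pure green shows it:
r, g, b = 0.0, 1.0, 0.0
h, l, s = colorsys.rgb_to_hls(r, g, b)   # note HLS order in colorsys
luma = 0.2126 * r + 0.7152 * g + 0.0722 * b
print(l, luma)   # HSL lightness 0.5, luma about 0.72
```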

(4)

From your previous post:

Why match pixel to neighboring pixels? Could a rectangle be created around each pixel and then unioned from there?

You're right, there is no need to join neighbours first. That's a hangover from the previous approach, where it was necessary--but the approach was wrong. Thanks.

tjhb
10,094 post(s)
#25-Mar-17 23:31

Further to point (3): I think we ideally need some extra helper functions here to make life easier.

For example, perhaps a TileBounds function, returning a ValueX4 expressed in geographic coordinates (rather than in indexed space).

adamw


10,447 post(s)
#27-Mar-17 16:00

Why not use indexed space = pixel coordinates? If we create a point with coordinates of 0.5 : 0.5, it is going to appear in the center of the pixel with the coordinate 0 : 0.

The coordinate of a pixel in the image is TileX * TileSizeX + PixelX : TileY * TileSizeY + PixelY.

I am not saying we shouldn't have helper functions, just that they are pretty simple to do manually until we have them built-in.
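Adam's arithmetic can be sketched in a few lines of Python (an illustration, not Manifold code; the function name is invented, and the tile size of 128 matches the [0, 127] pixel range mentioned above):

```python
# Global pixel coordinate from tile index plus pixel index, per the
# formula above: image X = tile_x * tile_size_x + pixel_x (same for Y).
# Adding 0.5 gives the pixel's centre point in indexed space.

TILE_SIZE_X = TILE_SIZE_Y = 128   # tiles run [0, 127] per axis

def pixel_center(tile_x, tile_y, pixel_x, pixel_y):
    cx = tile_x * TILE_SIZE_X + pixel_x + 0.5
    cy = tile_y * TILE_SIZE_Y + pixel_y + 0.5
    return cx, cy

print(pixel_center(0, 0, 0, 0))   # centre of the pixel at 0 : 0
print(pixel_center(2, 1, 5, 7))
```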

tjhb
10,094 post(s)
#27-Mar-17 22:48

Adam--Mike also (below),

You're absolutely right, thanks for pointing out how simple that is.

No need for helper functions, it's fine, I was just missing the obvious.

Mike Pelletier

2,122 post(s)
#27-Mar-17 17:20

Thanks Tim for setting me straight on the tile pancake issue! I used your SQL and modified it along with the X and Y offset of the image to align the points correctly. I see Adam posted similar info.

Check out the attached map file. The lightness formula seems to be working well too. Now to figure out the fastest way to convert points to a smooth area and delete small areas. It will be interesting to see how well it works on a larger area.

Attachments:
RS shadow to vector.map

Mike Pelletier

2,122 post(s)
#27-Mar-17 19:18

The attached file has some minor improvements to the above for those interested in this project.

I tried creating small rectangles instead of points (it's an option in the query, using GeomMakeRect), but the result is huge rectangles for some unknown reason. Also, I'm not seeing an easy way to union the rectangles. Thanks for any help.

Attachments:
RS shadow to vector.map

Mike Pelletier

2,122 post(s)
#28-Mar-17 01:18

Whew... Here's a working project that creates an area vector of the dark pixels, lets you play with what is considered dark, and includes a way to automatically delete small areas. I'll see what it can do with larger files and report back. Any thoughts on improving the speed of the code would be great. Thanks.

Attachments:
RS shadow to vector.map

Mike Pelletier

2,122 post(s)
#29-Mar-17 01:28

Okay, so while the query runs on a small image, an image of about 4500x4500 didn't finish in 20 minutes. Only 4 CPU cores are active, with overall CPU usage hovering around 12%, and memory is around 10GB. My machine specs are in the image below.

I tried deleting the "Threads 1" in the query and changing it to "Threads 8" with no apparent change in performance or CPU usage. Canceling the query doesn't seem to be effective, and I resort to closing it in Task Manager.

Perhaps using the THREADS and BATCH commands in some slick way for the different functions in the query would help, or should I give up and just wait for Mfd to tackle raster-to-vector conversion in a future build?

Maybe there's a faster SQL way that groups points that are relatively adjacent and performs a convex hull to create areas, although I couldn't see how to do it.

Attachments:
Capture.JPG

tjhb
10,094 post(s)
#29-Mar-17 10:32

I'll try to apply some things learned from Adam to this tomorrow Mike (just hitting the sack now up at the farm).

The most important thing is to remember that every node (each query/subquery level, each function) should have some real work to do--sometimes at the expense of making the query logic less elegant or less readable. (This is substantially different from 8.)

If a node does something trivial then throwing more threads at it is just pushing on a string--yet it still must occupy a place in the pipeline, which every batch of data must push through. Any trivial stage is a potential blockage.

There's a particularly clever example of Adam's that presses this point in a really startling way--I'll find the bookmark and post a link in the morning too.

Mike Pelletier

2,122 post(s)
#29-Mar-17 16:46

Thanks Tim. That all makes sense, but it also feels a bit daunting to tell which nodes are too trivial, how to merge trivial bits of a query into something substantial, and how to optimize RAM usage in a complex query. It does feel like the query I wrote is horribly flawed in some way for anything but a very small image, so these issues must be dealt with head-on rather than assuming one can live with a query that just isn't optimized. Adam's warning is looming over me, but I suppose at a minimum this exercise taught me a lot.

tjhb
10,094 post(s)
#29-Mar-17 22:09

Here's the link to the example I mentioned, posted by Adam in September.

Because it is in a beta thread, I'm quoting part of it here. I've added references in [] and highlighted timings.

...

[M8] Another query in 8:

--!fullfetch
SELECT parks.id AS parkid, count(parcels.id) AS num
FROM parks, parcels
WHERE Touches(parks.id, parcels.id) AND
  Area(Buffer(parks.id, 200)) > 2 * Area(Buffer(parcels.id, 200))
GROUP BY parks.id

We take each park, find all parcels that touch it, then either leave or reject the parcel based on some convoluted criteria regarding its area compared to the area of the park (which is, again, to try and make operations take some time for performance comparisons).

The result: 9.230 sec (good).

[RS1] The straightforward attempt to port to RS again:

--SQL
SELECT p.mfd_id AS parkid, count(q.mfd_id) AS num
FROM [parks table] AS p, [parcels table] AS q
WHERE GeomTouches(p.[geom (i)], q.[geom (i)], 0) AND
  GeomArea(GeomBuffer(p.[geom (i)], 200, 0), 0) >
    2 * GeomArea(GeomBuffer(q.[geom (i)], 200, 0), 0)
GROUP BY p.mfd_id
THREADS 8 BATCH 4

This runs much better than 8 straight out of the box (multiple reasons): 3.250 sec. Threads, however, don't help.

[RS2] Second attempt, let's move computations into a subquery and leave GROUP outside:

--SQL
SELECT parkid, count(parcelid) AS num
FROM (
  SELECT p.mfd_id AS parkid, q.mfd_id AS parcelid
  FROM [parks table] AS p, [parcels table] AS q
  WHERE GeomTouches(p.[geom (i)], q.[geom (i)], 0) AND
    GeomArea(GeomBuffer(p.[geom (i)], 200, 0), 0) >
      2 * GeomArea(GeomBuffer(q.[geom (i)], 200, 0), 0)
  THREADS 8 BATCH 4
)
GROUP BY parkid

Nope, same 3.260 sec. Threads don't help.

[RS3] Third attempt, rework the inner subquery:

--SQL
SELECT parkid, count(parcelid) AS num
FROM (
  SELECT p.mfd_id AS parkid, q.mfd_id AS parcelid,
    GeomArea(GeomBuffer(p.[geom (i)], 200, 0), 0) >
      2 * GeomArea(GeomBuffer(q.[geom (i)], 200, 0), 0) AS valid
  FROM [parks table] AS p, [parcels table] AS q
  WHERE GeomTouches(p.[geom (i)], q.[geom (i)], 0)
  THREADS 8 BATCH 4
)
WHERE valid
GROUP BY parkid

Without threads, we have the same 3.260 sec (that's OK), but threads now help, the time with threads is 0.640 sec (!). Because we moved the expensive computations from WHERE into the SELECT list, where they can be accelerated.

...

See the changes made in RS2, but especially in RS3. This is really thinking outside the Rect (sorry, can't resist).

tjhb
10,094 post(s)
#29-Mar-17 22:33

(I hope this is right.)

The speedup from RS2 to RS3 is not because join conditions (here expressed in WHERE syntax; the same holds for ON syntax) cannot benefit from multiple threads assigned to their SELECT query. They can benefit--or at least, they don't interfere with multiple threading, if used (which statement is more accurate?)--provided they can use available indexes.

What slows down the join in RS2 is that as well as a condition that can use indexes (GeomTouches, using original geometry -> using existing RTREE indexes), the join condition also includes a condition that does not use indexes (GeomArea(GeomBuffer(...)) X2 -> fresh geometry, unindexed).

It's the lack of useful indexes that makes the second condition expensive. That slows down what would otherwise be a fast indexed join.

In RS3, moving the expensive condition to the SELECT list, and using multiple threads, leaves a fully optimized, indexed spatial join, then lots of horsepower for the expensive work (without indexes) once the join has been completed.
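The same shape of optimization can be mimicked outside SQL: keep the cheap, index-friendly join serial, then farm the expensive predicate out to a pool of workers on the already-reduced set, and only filter on its result afterwards. Here is a toy Python analogy (entirely my invention, not Manifold internals; both predicates are made up, and in real Python a CPU-bound predicate would want processes rather than threads because of the GIL, but the structure is the point):

```python
from concurrent.futures import ThreadPoolExecutor

# Toy analogy of RS2 -> RS3: cheap filter stays serial (like the indexed
# GeomTouches join), the expensive predicate runs in parallel on the
# reduced candidate set (like moving it into the SELECT list).

def cheap_filter(pairs):
    # stands in for the indexed spatial join
    return [(p, q) for p, q in pairs if abs(p - q) <= 1]

def expensive_valid(pair):
    # stands in for the GeomArea(GeomBuffer(...)) comparison
    p, q = pair
    return (p * p + q * q) % 3 == 0

pairs = [(p, q) for p in range(50) for q in range(50)]
candidates = cheap_filter(pairs)                         # fast, serial
with ThreadPoolExecutor(max_workers=8) as pool:
    flags = list(pool.map(expensive_valid, candidates))  # parallel stage
result = [c for c, ok in zip(candidates, flags) if ok]
print(len(candidates), len(result))
```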

tjhb
10,094 post(s)
#29-Mar-17 23:27

More succinctly:

A join condition that cannot use existing indexes (even in part) will force a SELECT query to be serialized into a single thread (no matter how many threads are allocated to it).

That's my inference. Again I hope it's right.

tjhb
10,094 post(s)
#30-Mar-17 09:25

"(even in part)" -> "(for all of its parts)". Sorry for my bad English.

adamw


10,447 post(s)
#24-Mar-17 07:47

The above is best done using a script function that analyzes tile values. We will provide examples in the docs; such functions are not at all difficult to write.

Performing the analysis on a pixel or coordinate level in SQL usually limits data size significantly - queries are fast, but the overhead of having to treat every pixel and coordinate as a record is high.

adamw


10,447 post(s)
#24-Mar-17 07:45

Raster-to-vector and vector-to-raster conversions are not yet in; we are going to add them, extended from Manifold 8. The functions are pretty computation-intensive and aren't something one could easily add using a script. I would use Manifold 8 for now.

Manifold User Community Use Agreement Copyright (C) 2007-2021 Manifold Software Limited. All rights reserved.