Subscribe to this thread
Home - General / All posts - Creating a mesh/grid using raster cell boundaries in M9
gjsa42 post(s)
#11-Jul-18 02:48

Have been searching for ways to create a regular grid as vector drawing, but nothing so far that specifically applies in M9.

The links I've found so far:

http://www.georeference.org/doc/view_grid.htm

http://www.georeference.org/Forum/t132034.8

The trace areas feature is fantastic, but sometimes I just want the pixels turned into individual polygons regardless of their raster value. The equivalent ESRI process would be Create Fishnet.

Dimitri

5,083 post(s)
#11-Jul-18 06:07

You could do that in SQL or a script today. Point and click tools for grids are coming.

Grids as virtual display overlays and automatically creating grids as vectors have been in the wishlist for 9 for a while. They're just not as high a priority as other things, at least not in terms of what users have asked for. [...oh, no! ... not that link again! :-) ]

In my personal work I rarely need to create a regular grid, but if I do I just use 8 to create the thing and then use it in 9.

I have to admit to being curious:

sometimes I just want the pixels turned into individual polygons regardless of their raster value.

That's a very small grid in most cases, if I understand your desire as being to create a grid so small each cell in the grid is one pixel size. Why do you want to do that? If it is for operations on rasters, perhaps there is an alternative workflow.

gjsa42 post(s)
#11-Jul-18 07:55

It is a strange use case, and probably not one that would be common to other users.

Basically I need to systematically edit a raster at the pixel level - the easiest way to do that is to do vector operations (Overlay, adjacent - for example) on a grid that contains the values of individual pixels as polygons or perhaps centroids (points).

The final outcome would be all pixels of the same value merged into areas, but only after this editing step is completed.

For the time being I can use ArcGIS or I could buy M8 by the sounds of it.

tjhb

8,348 post(s)
online
#11-Jul-18 08:14

The first part of this—creating the pixel-aligned grid—is actually pretty easy in Manifold 9 SQL. Easier and mire natural than in Manifold 8.

But how about the next step, the systematic editing? And other steps.

The more you can say about the workflow you have in mind, the better. E.g. what you would do in Arc, as explained to soneone who does not have Arc, and paying attention to the purpose at each step.

Whatever you need to do will be possible, either now or soon (now I think). But with overhead—which it would be best to try to judge in advance.

gjsa42 post(s)
#13-Jul-18 08:01

Is there an example of a pixel-aligned vector grid creation in Manifold 9 SQL you could post when you get a chance?

One follow-up question also:

I've decided that extracting raster values to said grid is far more efficient than creating a vector from the raster and trying to overlay that onto the grid. But, pretty sure M9 doesn't do raster value extractions to shapes yet? If I'm wrong, happy to be corrected.

SAGA 6 does raster value extractions fast - but the slow part is exporting/importing very large shapefiles, so the more I can do in M9 the better.

Dimitri

5,083 post(s)
#13-Jul-18 08:24

But, pretty sure M9 doesn't do raster value extractions to shapes yet?

Yes, it does.

See the Example: Trace Vector Areas from Raster Pixels topic for an example.

tjhb

8,348 post(s)
online
#13-Jul-18 10:05
Is there an example of a pixel-aligned vector grid creation in Manifold 9 SQL you could post when you get a chance?

No. You need to say what you need to do with the grid first, as I said.

One obvious reason for that: you have said you need a pixel-aligned grid, that is one thing. But I suspect you actually mean a grid of pixel cells along with pixel values, which is different.

So to avoid a wild goose chase...

gjsa42 post(s)
#16-Jul-18 14:35

A long series of SQL commands I am writing using M9 templates will follow soon when they are complete.

They all begin with some input layers: one of which is a vector grid of raster pixel values and also an empty vector grid with the same dimensions and alignment. If I could create the grid of pixel values direct from the raster the second empty grid is easy (just need to copy and nullify all the values).

I appreciate the need to fully understand what I am trying to achieve, and I will complete the picture soon.

For now just to reiterate that Trace Areas is great and I use it often, but in this case I do need the pixels vectorised individually and haven't figured out how in M9.

tjhb

8,348 post(s)
online
#16-Jul-18 20:39

That's definitely clear enough, thanks. I'll see if I can get it right. As you'll see it's not hard, lots of built-in tools to make it straightforward.

mdsumner


4,206 post(s)
#16-Jul-18 23:29

Interesting to me is the abstractability of this, the source is six numbers (offset, scale, dimension) and always the code to retrieve these seems harder work than the explicit polygon generation. Obviously there's a lot of pipeline potential, keeping things implicit, the data transfer is trivial with a single pixel index.


https://github.com/mdsumner

tjhb

8,348 post(s)
online
#17-Jul-18 07:17

Here's a first go.

A few things to note.

First, this requires Manifold 9 >= 9.0.167.6, released today.

Secondly, the source data type is hard coded (here INT32X3, e.g. standard RGB or RGB). This needs to be adjusted (in three places) for other data types. I think that's currently unavoidable (unless we wrap in a script).

Third, there is a Rect filter, commented out. So all pixels from all tiles are returned, including those falling outside the image Rect. Uncomment the filter to remove peripheral cells for pixels that don't show.

Lastly, I don't think there is anything here that would benefit from multiple threads, but I haven't tested that yet.

Execution time isn't bad. For an image 5761 x 8505px, I get a time of 16 minutes on this i7 laptop, producing 49,397,760 cells (pixel areas).

--SQL9

-- source/target vector type is hardcoded (see notes)

-- Rect filter inactive (see WHERE clause)

--

-- source table and image

VALUE @table TABLE = [BA32_GRIDLESS_GeoTifv1-06 Tiles];

VALUE @image TABLE = [BA32_GRIDLESS_GeoTifv1-06]-- NB image qua TABLE

--

-- image metadata

VALUE @field NVARCHAR = ComponentProperty(@image'FieldTile');

VALUE @dimXY INT32X2 = CAST(ComponentProperty(@table'FieldTileSize.' + @field) AS INT32X2);

VALUE @dimX INT32 = VectorValue(@dimXY, 0);

VALUE @dimY INT32 = VectorValue(@dimXY, 1);

VALUE @rect INT32X4 = CAST(ComponentProperty(@image'Rect'AS INT32X4);

VALUE @minX INT32 = CAST(VectorValue(@rect, 0) AS INT32);

VALUE @minY INT32 = CAST(VectorValue(@rect, 1) AS INT32);

VALUE @maxX INT32 = CAST(VectorValue(@rect, 2) AS INT32);

VALUE @maxY INT32 = CAST(VectorValue(@rect, 3) AS INT32);

VALUE @cs NVARCHAR = ComponentProperty(@table'FieldCoordSystem.' + @field);

--

-- target table

CREATE TABLE [Grid Table] (

    [tileX] INT32[tileY] INT32,

    [pixelX] INT32[pixelY] INT32,

    [imageX] INT32[imageY] INT32,

    [Cell] GEOM,

    [Value] UINT8X3,   -- source data type (restored)

    INDEX [Y_X_x] BTREE ([imageY][imageX]), -- same order as image

    INDEX [Cell_x] RTREE ([Cell]),

    PROPERTY 'FieldCoordSystem.Cell' @cs -- same as image

    );

-- target drawing

CREATE DRAWING [Grid] (

    PROPERTY 'FieldGeom' 'Cell',

    PROPERTY 'Table' '[Grid Table]'

    );

--

INSERT INTO [Grid table]

    (

    [tileX][tileY],

    [pixelX][pixelY]

    [imageX][imageY],

    [Cell],

    [Value]

    )

SELECT

    [tileX][tileY],

    [pixelX][pixelY]

    [imageX][imageY],

    GeomMakeRect(

        VectorMakeX4([imageX][imageY][imageX] + 1, [imageY] + 1)

        ) AS [Cell],

    --[Value] -- FLOAT64XN

    CAST([Value] AS INT32X3-- restore source data type

FROM

    (

    SELECT

        [tileX][tileY],

        [X] AS [pixelX][Y] AS [pixelY]

        [tileX] * @dimX + [X] AS [imageX],

        [tileY] * @dimY + [Y] AS [imageY],

        [Value]

    FROM

        (

        SELECT [X] AS [tileX][Y] AS [tileY],

        SPLIT CALL TileToValuesX3([Tile])

            --> X, Y, Value (always FLOAT64XN)

        FROM @table

        )

    )

-- exclude pixels outside image Rect

--WHERE [imageX] BETWEEN @minX AND @maxX

--AND [imageY] BETWEEN @minY AND @maxY

;

Attachments:
Extract cells and values (int32x3) forum.sql

tjhb

8,348 post(s)
online
#18-Jul-18 00:15

Small error above. In the line

CAST([Value] AS INT32X3-- restore source data type

the type should instead be UINT32X3 for BGR/RGB test data (and the query name should also be adjusted).

But see below.


More on this:

Secondly, the source data type is hard coded (here INT32X3, e.g. standard RGB or RGB). This needs to be adjusted (in three places) for other data types. I think that's currently unavoidable (unless we wrap in a script).

I think that is basically right, although we can reduce the number of places where this query needs to be edited for different source types from three places to two by using a SELECT INTO query rather than INSERT INTO. That leaves target data types to be inferred by the compiler (though NB still deterministically and in advance), rather than specifying types explicitly via CREATE TABLE.

We can also restore the source data type using CASTV rather than CAST

--CAST([Value] AS UINT32X3)

CASTV([Value] AS UINT32)

which is slightly neater. And CASTV handles scalar types sensibly, for example for a single-channel image, as well as handling vector types. CASTV might also be slightly faster than CAST for vector types? (Not much.)

However, the particular TileToValues*() function for the SPLIT must be chosen explicitly, to match the source data type (scalar or XN vector). As I understand it, the reason is that the compiler needs to know, before it sees any source data, the exact schema of the output table.

We couldn't for example do something like this

SELECT [X] AS [tileX][Y] AS [tileY],

SPLIT CALL

    CASE @type

        WHEN 'x2' THEN TileToValuesX2([Tile])

        WHEN 'x3' THEN TileToValuesX3([Tile])

        WHEN 'x4' THEN TileToValuesX4([Tile])

        ELSE TileToValues([Tile])

    END

FROM @table

;

That is ficticious syntax--and there would no doubt be better ways to express it if it were possible. But I don't think it is possible, as a matter of principle. Using a CASE expression as a "guard" over functions with inherently different data types like this would break the rule that the output schema must be exactly determinable in advance of data.

I'm not 100% sure of my reasoning (perhaps it is more subtle than that, even something to do with the handling of vector types in hardware), but that is how I read it.

It would be exactly the same if we moved the CASE expression to an SQL function. And also the same if we used a script function returning a table, since again return types must be known in advance, in this case by means of a model function.

If that is right then, to be fully generic in a case like this, we would need to use a script (not a query) to choose the SQL expression we need according to the source data type.

[One other thing that I think could work in principle, is to allow a @variable to contain a function pointer. The required function could be selected according to metadata, which is known at compile time (data properties). Then the compiler could know all output data types in advance of data. Perhaps this sounds more like a pragma.]

adamw


8,216 post(s)
#23-Jul-18 09:23

However, the particular TileToValues*() function for the SPLIT must be chosen explicitly, to match the source data type (scalar or XN vector). As I understand it, the reason is that the compiler needs to know, before it sees any source data, the exact schema of the output table.

Yes, exactly.

[One other thing that I think could work in principle, is to allow a @variable to contain a function pointer. The required function could be selected according to metadata, which is known at compile time (data properties). Then the compiler could know all output data types in advance of data. Perhaps this sounds more like a pragma.]

First, let me say that for the issue above - using the same query to handle rasters of varying number of channels - we think we have far better solutions that don't require any changes to the query engine at all.

With that out of the way and talking generally:

Setting @value to a function with the exact type of function parameters and return value determined at run time would have worked. We aren't planning to do this currently, but it would have worked.

There is something similar that we might do - allowing multiple functions with the same name but different parameter types = function name overloading. Then one could define a function like TileToValuesAny(t TILE, v FLOAT64) TABLE ... and an overload TileToValuesAny(t TILE, v FLOAT64X2) TABLE ..., etc, and then you'd be able to use the type of a query parameter to figure out which of the overloads to call. We could also have something like VALUE @x AUTO = <take pixel value out of tile>, but, frankly, we'd like to avoid syntax like that if possible (because it tends to spread and make everything far more complex than it has to be for the engine and sometimes for the user as well).

One can write a script function right now that would take an image and output a table of pixel values of the appropriate type. The result table of the function would have to have the same schema that the function returned without knowing the parameter values, but with a fixed name of the image and so only one real type of pixel values this would work and save some typing in the query.

tjhb

8,348 post(s)
online
#24-Jul-18 05:14

Thanks for these comments Adam, very helpful.

Would it be tricky to distinguish function name overloading from function redefinition?

I have tried to understand the last paragraph, and so far have "understood" it two different ways, probably both wrong. Any chance you could break the second sentence into 2 or 3? Sorry to ask.

Having said that, I think I understand well enough to try something out.

On which, I wonder if there is a more elegant/robust way to count to number of values in a scalar/vector type than e.g.

--SQL9

VALUE @table TABLE = ...

VALUE @image TABLE = ...

VALUE @field NVARCHAR = ComponentProperty(@image'FieldTile');

VALUE @type NVARCHAR = ComponentProperty(@table'FieldTileType.' + @field);

VALUE @channels UINT8 =

    CASE StringToUpperCase(StringSubstring(@type, StringLength(@type) - 2))

        WHEN 'X2' THEN 2

        WHEN 'X3' THEN 3

        WHEN 'X4' THEN 4

        ELSE 1

    END;

tjhb

8,348 post(s)
online
#24-Jul-18 05:39

Answering my own last question, this is more elegant and feels more robust, though it is not faster since it requires actual data access.

--SQL9

VALUE @table TABLE = ...

SELECT TileChannelCount(FIRST([Tile])) FROM @table;

tjhb

8,348 post(s)
online
#24-Jul-18 05:56

So (late edit)

--SQL9

VALUE @channels UINT8 =

    (SELECT CAST(TileChannelCount(FIRST([Tile])) AS UINT8FROM @table)

    ;

adamw


8,216 post(s)
#25-Jul-18 10:05

Here is what I meant by the last paragraph (apologies, I tend to be too terse):

Right now, we can create a script function that would, say, take the name of an image as a string, and return a table of pixel values. A script can determine the type of pixels used by the image with a specific name, so the function can return the table with, say, X INT32, Y INT32, and PIXELVALUE of whatever type the pixel value in the image is.

It might be tempting to use such a function, however, it cannot really work completely. When the query engine sees a call to such a function, it sees that the function returns a table and asks the script to describe the fields and indexes that the returned table is going to contain. The issue is that the script has to describe the fields knowing only that the input parameter is going to be a string (for the image name) without the actual string value (this happens because in runtime there might be multiple different calls with multiple different values). So the script will have to assume what the pixel type of the image is going to be. If it guesses wrong and subsequently returns a table with different field types, things are going to fail. However, the autodetection done by the function can still be somewhat useful if, say, there is only one image (don't have to even pass the name then) meaning that the pixel type is unknown to the author of the query, but it is fixed and the function is safe to use. Or when there are multiple images with the same pixel type, that is also unknown to the author of the query.

tjhb

8,348 post(s)
online
#25-Jul-18 22:46

Thanks Adam. I understand now.

I think we can avoid the script function having to guess, if the calling query passes a constant specifying the type known from metadata, consumed by both the model function and the runtime function. (For the latter, perhaps by calling the former, as shown in the excellent API examples.)

But I'm only partway through, and that is terse. I'll post back.

tjhb

8,348 post(s)
online
#18-Jul-18 00:35

Attachment corrected.

Attachments:
Extract cells and values (uint32x3) forum.sql

tjhb

8,348 post(s)
online
#18-Jul-18 01:37

Lastly a version using SELECT INTO (rather than INSERT INTO), so that the target value type can be left implicit. I also changed the BTREE from Y-X to a (new) mfd_id field, more conventional and more efficient.

Also this version is a bit faster than the original query, just under 13mn rather than 16mn on the same machine and same data.

-- Rect filter inactive (see WHERE clause)

--

-- source table and image

VALUE @table TABLE = [BA32_GRIDLESS_GeoTifv1-06 Tiles];

VALUE @image TABLE = [BA32_GRIDLESS_GeoTifv1-06]-- NB image qua TABLE

--

-- image metadata

VALUE @field NVARCHAR = ComponentProperty(@image, 'FieldTile');

VALUE @dimXY INT32X2 = CAST(ComponentProperty(@table'FieldTileSize.' + @field) AS INT32X2);

VALUE @dimX INT32 = VectorValue(@dimXY, 0);

VALUE @dimY INT32 = VectorValue(@dimXY, 1);

VALUE @rect INT32X4 = CAST(ComponentProperty(@image, 'Rect'AS INT32X4);

VALUE @minX INT32 = CAST(VectorValue(@rect, 0) AS INT32);

VALUE @minY INT32 = CAST(VectorValue(@rect, 1) AS INT32);

VALUE @maxX INT32 = CAST(VectorValue(@rect, 2) AS INT32);

VALUE @maxY INT32 = CAST(VectorValue(@rect, 3) AS INT32);

VALUE @cs NVARCHAR = ComponentProperty(@table'FieldCoordSystem.' + @field);

--

SELECT

    [tileX][tileY],

    [pixelX][pixelY]

    [imageX][imageY],

    GeomMakeRect(

        VectorMakeX4([imageX][imageY][imageX] + 1, [imageY] + 1)

        ) AS [Cell],

    --[Value] -- FLOAT64XN

    CASTV([Value] AS UINT32)

        -- restore source data type

        -- (XN implicit; also handles scalar types)

        AS [value]

INTO [Grid Table]

FROM

    (

    SELECT

        [tileX][tileY],

        [X] AS [pixelX][Y] AS [pixelY]

        [tileX] * @dimX + [X] AS [imageX],

        [tileY] * @dimY + [Y] AS [imageY],

        [Value]

    FROM

        (

        SELECT

            [X] AS [tileX][Y] AS [tileY],

        SPLIT CALL

            -- function must match source data type

--            TileToValues([Tile])

--            TileToValuesX2([Tile])

            TileToValuesX3([Tile])

--            TileToValuesX4([Tile])

            --> X, Y, Value (always FLOAT64XN)

        FROM @table

        )

    )

-- exclude pixels outside image Rect

--WHERE [imageX] BETWEEN @minX AND @maxX

--AND [imageY] BETWEEN @minY AND @maxY

;

--

-- adjust target table

ALTER TABLE [Grid Table] (

    ADD [mfd_id] INT64,

    ADD INDEX [mfd_id_x] BTREE ([mfd_id]),

    ADD INDEX [Cell_x] RTREE ([Cell]),

    ADD PROPERTY 'FieldCoordSystem.Cell' @cs -- same as image

    );

-- target drawing

CREATE DRAWING [Grid] (

    PROPERTY 'FieldGeom' 'Cell',

    PROPERTY 'Table' '[Grid Table]'

    );

Attachments:
Extract cells and values (uint32x3) b forum.sql

gjsa42 post(s)
#26-Jul-18 10:03

This is great. Just having difficulty getting the output to be aligned correctly.

I have attached an image (test.tif) in WGS84 and if you run the query posted above, the Grid Table vector output is rotated incorrectly.

I'm sure there's a simple fix but my brain right now can't find the correct part of the query to modify.

Attachments:
test.tif

gjsa42 post(s)
#26-Jul-18 10:31

Correction to my post above: I should not be conflating alignment and rotation: It is only a rotation issue.

adamw


8,216 post(s)
#26-Jul-18 11:13

Force the coordinate system of the image to use XY axes (open image, go to the Contents - Component pane, click the button next to the projection, select Repair Initial Coordinate System - Force XY Axes). Then generate vector data.

We should probably add an easy way for a query to do so automatically.

Manifold User Community Use Agreement Copyright (C) 2007-2017 Manifold Software Limited. All rights reserved.