Subscribe to this thread
Home - General / All posts - Radian - Sql sequence breaking unexpected at add columns query - But query works as individual query
tgazzard117 post(s)
#02-Aug-17 01:24

Hi this is probably a newbie error - admittedly my first real test of radian (long time listener, first time caller...!)

My problem

code

-- $manifold$

--$include$ Lib

--!fullfetch

PRAGMA ('progress.percent' = '1','gpgpu' = 'aggressive', 'gpgpu.fp'='32');

SELECT

 [Grid Block].[ID], [Fuel].[ID], [AngTab].[Ang],[Fuel].[Fuel 

load],[Fuel].[Width],[Fuel].[Surface Fuel Load],[Grid 

Block].[Height],[Fuel].[Height]as [fuelHgt], [Grid Block].[Height] - 

[Fuel].[Height] as [Hgtdif]

,GeomDistance([Grid Block].[Geom (I)],[Fuel].[Geom (I)],0) as [dis], 

[Fuel].[Fuel Type], [Fuel].[Aspect], [Fuel].[Slope], [Grid Block].[Aspect] as [Grid Block Aspect], --[Fuel].[Fuel Load],

--d.searchdis as [Searchdis], FDI as [FDI], GFDI as [GFDI], [Heath Veg Hgt] as [Heathveghgt], AVGWIND as [Avewind],

[Grid Block].[x] as [GridX],[Fuel].[x]as [FuelX]

,[Grid Block].[y] as [GridY],[Fuel].[y] as [FuelY]

into Hgt

from [Grid Block],[Fuel], AngTab--, alcoa_parameters as d

where GeomDistance([Grid Block].[Geom (I)], [Fuel].[Geom (I)], 0) < 150 and [Fuel].[Fuel Type]<> 'unclassifiable'

THREADS 8 Batch 10;

alter table hgt (add mfd_id int64, add index mfd_id_x btree (mfd_id));

alter table Hgt (rename [id 2] [Fuel ID]);

alter table Hgt (add searchdis int32, add FDI int32, add GFDI int32, add Heathveghgt int32, add avewind int32);

update hgt set searchdis = 150, fdi = 120, gfdi = 120, heathveghgt = 4, avewind = 45;

alter table Hgt (rename [aspect 2] [grid block aspect]);

alter table hgt (add avgslope float32,add ros float32, add diffx float32, add diffy float32);

text

This query sequence is part of a bigger sequence but it breaks at the last query above. But then when I run the query either on its own or as the first query in the remainder of sequence, it runs as it should.

Log file

> -- $manifold$

--$include$ Lib

--!fullfetch

PRAGMA ('progress.percent' = '1','gpgpu' = 'aggressive', 'gpgpu.fp'='32');

SELECT [Grid Block].[ID], [Fuel].[ID], [AngTab].[Ang],[Fuel].[Fuel load],[Fuel].[Width],[Fuel].[Surface Fuel Load],[Grid Block].[Height],[Fuel].[Height]as [fuelHgt], [Grid Block].[Height] - [Fuel].[Height] as [Hgtdif]

,GeomDistance([Grid Block].[Geom (I)],[Fuel].[Geom (I)],0) as [dis],

[Fuel].[Fuel Type], [Fuel].[Aspect], [Fuel].[Slope], [Grid Block].[Aspect] as [Grid Block Aspect], --[Fuel].[Fuel Load],

--d.searchdis as [Searchdis], FDI as [FDI], GFDI as [GFDI], [Heath Veg Hgt] as [Heathveghgt], AVGWIND as [Avewind],

[Grid Block].[x] as [GridX],[Fuel].[x]as [FuelX]

,[Grid Block].[y] as [GridY],[Fuel].[y] as [FuelY]

into Hgt

from [Grid Block],[Fuel], AngTab--, alcoa_parameters as d

where GeomDistance([Grid Block].[Geom (I)], [Fuel].[Geom (I)], 0) < 150 and [Fuel].[Fuel Type]<> 'unclassifiable'

THREADS 8 Batch 10;

alter table hgt (add mfd_id int64, add index mfd_id_x btree (mfd_id));

alter table Hgt (rename [id 2] [Fuel ID]);

alter table Hgt (add searchdis int32, add FDI int32, add GFDI int32, add Heathveghgt int32, add avewind int32);

update hgt set searchdis = 150, fdi = 120, gfdi = 120, heathveghgt = 4, avewind = 45;

alter table Hgt (rename [aspect 2] [grid block aspect]);

alter table hgt (add avgslope float32, add ros float32, add diffx float32, add diffy float32);

Invalid field.

Any thoughts on why this might break at this point?

I would upload a radian file if this would help.

Attachments:
sql_sequence.txt

tgazzard117 post(s)
#02-Aug-17 01:43

Should of said that I am using the following version of radian

Attachments:
ScreenHunter_348 Aug. 02 10.44.jpg

Dimitri

4,118 post(s)
#02-Aug-17 06:17

The latest version of Radian is 162.2 not 162.0

...it always pays to post reports only using the latest version. That avoids spinning wheels on anything that might have already been fixed or changed. That might not make a difference, but sometimes it does so may as well use the latest.

Note that it is easy to use the latest version: download the portable installation and launch from there. If the effect is the same you can report "checked on 162.2..."

tgazzard117 post(s)
#02-Aug-17 07:34

Fair enough......

checked on 162.2 - same result

adamw


7,129 post(s)
#02-Aug-17 07:46

It would have been helpful to have the MAP file, yes - even without the data, just with the structure of tables.

A couple of notes on the query:

1. I guess you already know this, but just in case you don't - the line with --!fullfetch does nothing. (It is good that it does nothing, too. You wouldn't want to use !fullfetch in production, it's just a tool to help gauge the execution time of queries like SELECT without turning them into action queries like SELECT INTO. Perhaps you included this line so that you can run just the !fullfetch fragment interactively via Alt-Enter.)

2. It is wasteful to first have SELECT INTO rename [Fuel].[ID] into [ID 2] and then fix that using ALTER TABLE. Just use an alias in SELECT INTO.

3. It is wasteful to be doing multiple ALTER TABLE statements adding a couple of fields each. Merge them together and add all fields in one step. Unless added fields belong to different pieces of logic.

Regarding the source of the error, you could try running each statement separately. I guess that the error is in the SELECT.

tgazzard117 post(s)
#02-Aug-17 08:07

Thanks for the tips

1) Yep, onto 1 - was using it for test purposes - alt-enter. In particularly still working out the gpu/cpu settings.

2) Yep onto 2. I will fix up further additions of this code.Manifold 8 automatically created this column heading when the join occurred. Radian created a different name.

3) I wasn't aware it would wasteful to do multiple alter table t add statements rather than having lots in the one statement.

Your right Adam - Nice pick up. Again this was a translation thing from manifold 8. I initially did the rename. But must of decided that an alias would be better... Very messy. Thanks for looking at it.

Cheers

Attachments:
temp_v9_for_upload.map

tjhb

7,387 post(s)
#02-Aug-17 10:30

I was holding my tongue on this, but I think it would be wrong not to say it.

tgazzard you are really shooting yourself in the foot by writing such scruffy amorphous SQL.

It doesn't do anyone else any favours either.

If you plan to write more than two or three queries in your life, then it will pay to invest some time in formatting, structure, and some consistent style. (Style can be anything that makes sense to you. It doesn't have to match anyone else's.)

If your batch of queries had had even a little bit of each of those, then you probably would have seen the error straight away, but even if you hadn't seen it, it would have been much easier for someone else to spot. This matters when you're asking for other people's time, also your own.

You can see why I was holding my tongue, but ir's important enough to say.

It's hard to write horrible SQL correctly. And why bother? It takes too long.

tgazzard117 post(s)
#02-Aug-17 12:07

Fair enough Tim. I agree that if I am asking for others to help that is reasonable.

On the other side of things - all things are a trade-off in terms of results vs style. And as you say style is in the eye of the beholder.

For me, in this instance the effort was in learning a new program.

The sql that was posted in this thread was about a 5th of what was in the entire sequence. And I needed to get to done today so I could run it over night and finish the job tomorrow.

Next time I post I will take some more time to style it up.

tgazzard117 post(s)
#03-Aug-17 05:01

Finished the job I was working on and wanted to add a couple of things before I move on.

This matters when you're asking for other people's time, also your own.

1) You comment above is a good point.

2) Style is often defined by the program you use and how you use it.

a) For example, I could of written the first query above like this below. But as far as I am aware there is no way to block out multiple rows (like you would in python with say '''.....'''. Happy to be corrected on this.

So instead I revert to having multiple columns identified on the one row as it easy to block them for testing purposes.

b) A second example would be in postgres it is much easier (for me) to write things in lowercase with no spaces.

PRAGMA ('progress.percent' = '1','gpgpu' = 'aggressive', 'gpgpu.fp'='32');

    SELECT 

        [a].[ID], 

        [b].[ID] as [Fuel ID], 

        [c].[Ang],

        [Fuel].[Fuel load],

        [b].[Width],

        [b].[Surface Fuel Load],

        [a].[Height],b.[Height] as [fuelHgt],

        [a].[Height] - b.[Height] as [Hgtdif],

        GeomDistance([a].[Geom (I)],[b].[Geom (I)],0) as [dis], 

        [b].[Fuel Type], 

        [b].[Aspect], 

        [b].[Slope], 

        [a].[Aspect] as [Grid Block Aspect],

        [a].[x] as [GridX],b.[x] as [FuelX],

        [a].[y] as [GridY],b.[y] as [FuelY]

    into Hgt

    from 

        [Grid Block] as a,

        [Fuel] as b, 

        [AngTab] as c

    where 

        GeomDistance([a].[Geom (I)], [b].[Geom (I)], 0) < 100 and 

        [Fuel].[Fuel Type] <> 'unclassifiable'

THREADS 8 Batch 10;

tgazzard117 post(s)
#03-Aug-17 05:27

Point 3

I had not used radian for any meaningful activity until the last couple of days. This was the first query in a sequence of about 40. And there were 5 new things that I had to work through to get one query to run. Lucky this was probably the hardest of them.

Even basic statements like alter table t (add c int32) has two new bits of syntax. Namely the brackets and the data type text

Attachments:
ScreenHunter_358 Aug. 03 14.28.jpg

adamw


7,129 post(s)
#03-Aug-17 06:55

Looking at this screen, maybe it makes sense to resolve ambiguous field names in the result table by adding names of producing tables, indeed. "ID 2" is worse than "Fuel ID" or, say, "ID, Fuel" to keep the field name first. Realistically, after you see such a name in the result table, you would still want to rename it, but it would be much clearer what table the field is coming from. Although there still would be "2"s and "3"s from joins which use the same table several times.

This probably isn't a big deal, but if we can make things clearer, why not.

tgazzard117 post(s)
#03-Aug-17 07:58

I agree that this could be considered poorer form on my part to not explicitly define an alias for two columns and we saw in the initial post it caused some (minor) grief. It didn't bug me to have to fix this.

adamw


7,129 post(s)
#03-Aug-17 08:02

No, no, no, I meant maybe it makes sense for our code to help a bit there, like Manifold 8 does. You would still want to assign aliases to conflicting field names manually, but the default generated field names will be clearer.

(I was just going through the items on your screen making mental notes - "yes, we do this for a reason", "yes, this difference exists, too, perhaps we should emphasize it in the doc better" - and that item with resolving conflicts between ambiguous field names was the one where I think Manifold 8 did it slightly better than we do now.)

tgazzard117 post(s)
#03-Aug-17 08:25

Cool. Anything that makes it a little easier. . I was wrong about the [geom (i)] not working. It does work with lower case which is great.

tgazzard117 post(s)
#03-Aug-17 05:36

Point 4 - last point.

Style evolves as you learn the language and become fluent in things. I am sure that as we all learn, our style improves - often just because your latter attempts are closer to the mark - so less trial and error.

So sometimes there is a legacy set of queries or some script that you know is working (and has been rigorously error checked) that you just stick with.

I have said my piece now .

Dimitri

4,118 post(s)
#03-Aug-17 07:19

So sometimes there is a legacy set of queries

Understood, and a fair comment, but perhaps it indicates a disconnect. Any time you move queries from one SQL environment, say, PostgreSQL, to a different environment, say, Oracle, you have to adapt those queries. The bigger the difference in the environments, the more work you must do to port queries. That's generally not a big intellectual deal since most SQLs are similar, but the mass of details that need tweaking in complex queries can all the same involve significant work to adapt.

Radian is a completely different product and a completely different technology than Release 8. It makes no pretensions at being a super-set of 8 or a variation on 8 or 1 to 1 compatibility or similarity with 8. It is as totally different from 8 as PostgreSQL or Oracle are from 8. That's a good thing because a bigger and more sophisticated Radian query engine opens the door to bigger and better ways of doing things that is possible at the more, say, Microsoft, level characteristic of 8.

Just as you must first learn Oracle to move queries from PostgreSQL to Oracle, and cannot accomplish a port by reasoning by analogy with how PostgreSQL does things, you must also first learn how Radian does things so you can do whatever alterations are necessary to queries from other systems to adapt them to Radian. That is best done as a fresh start.

On the plus side, that fresh start is a good thing in most cases because SQL in Radian is much closer to how SQL works in big, well-implemented, enterprise-class DBMS environments like Oracle or PostgreSQL. So getting your head around the Radian way is not a waste of time, but investing into a bigger picture way of doing things, bigger than is possible with 8.

I agree that it is often the case when a vendor issues a new product line that users of older or parallel products will often expect the new products to be similar, or that there will be automated means to translate legacy approaches into new approaches. That's certainly fair and something that Manifold strives to do where it doesn't conflict with other objectives. Release 8, for example, will in general be able to load .map files created by earlier Manifold GIS releases.

In the case of Radian the difference between how Radian does things and how those were done in 8 is sufficiently different that an automated translator would not be as effective as what should be a straightforward adaptation once people learn Radian. There is very good reason to believe that no reasonable level of investment in such a thing would result in an automated translator that would avoid the need to do some informed, manual adaptation in any event.

So, to date the emphasis has been on building a better query engine that can pull its weight in big-time enterprise settings, trying to provide core information on that new engine and betting that a well-implemented, fully-complete SQL in the big-time style of world class applications like Oracle or PostgreSQL will be a worthwhile enough payoff for people to make the effort to adapt 8 queries.

That's all subject to case-by-case decisions, of course, so whenever there is something that reasonably can be done to smooth a transition Manifold is eager to hear suggestions.

tgazzard117 post(s)
#03-Aug-17 08:13

No issue here Dimitri - There is trade-off between learning something new and hopefully some performance gains. I am happy to invest in this as I believe that radian will be helpful to my work now and into the future.

And there is enough that is similar in other sql implementations that the initial transition has been okay. In the end I got through 40 queries in about 3 hours..... Which was reasonable. And while I was working with these I was running the ones that I had translated across. The next time it will be go quicker....

tjhb

7,387 post(s)
#05-Aug-17 01:41

tgazzard, one more comment.

On the other side of things - all things are a trade-off in terms of results vs style.

I strongly disagree, I think there is no tradeoff. Results and style are not orthogonal.

Habitual style (indentation, capitalization...) is your best friend, alongside good commenting. It's like playing scales when learning the piano.If what is on the page (or screen) has a direct correspondance to what's in your head, then you are in charge of syntax (and errors), rather than syntax being in charge of you.

If form has rules (partly implemented in muscle memory), then your brain can devote itself to function (that's a big deal). If there are familiar patterns, then the eye can find anomalies quickly.

I'm very slowly getting to grip with formatting in Radian SQL. Every new concept or change of syntax needs a fresh look at style. At first I usually misunderstand how a new operator or function relates to others, and get the style 'wrong' (subjectively wrong, but wrong). As it becomes clearer I revise the brain wiring and also the layout. Good examples in my case are COLLECT and the new SPLIT. Everything is still more fluid than I would like, but making this kind of mistake is useful.

Concentration on style also tends to make code more succinct, and simpler. Thinking 'this can't be right, it looks much too complex' is often a useful judgement, and often right.

adamw


7,129 post(s)
#02-Aug-17 08:06

Actually, I think I know what the error is.

The SELECT INTO leaves one field named Aspect as Aspect and renames another to Grid Block Aspect. The 2nd ALTER TABLE with the RENAME tries to do the same and rename Aspect 2 to Grid Block Aspect. Aspect 2 does not exist, the field has already been renamed by SELECT INTO, and so things fail.

(Yeah, it would definitely help if the error message mentioned which field name was the culprit and included the line / position of at least the statement where this occurred. We will do it.)

adamw


7,129 post(s)
#02-Aug-17 08:10

Checked the file, and yes, that's it. Just remove the renaming of Aspect 2, it is already renamed. The rest runs fine.

tgazzard117 post(s)
#08-Aug-17 07:56

Still testing things on my end. Just interested in your thoughts on why Manifold 8 is significantly faster than radian in this particular example - more than twice as fast. I will post the radian file shortly. Below is the query that I tested on

sql

SELECT

        [a].[ID], 

        [b].[ID] as [Fuel ID], 

        [c].[Ang],

        [Fuel].[Fuel load],

        [b].[Width],

        [b].[Surface Fuel Load],

        [a].[Height],b.[Height] as [fuelHgt],

        [a].[Height] - b.[Height] as [Hgtdif],

        GeomDistance([a].[Geom (I)],[b].[Geom (I)],0) as [dis], 

        [b].[Fuel Type], 

        [b].[Aspect], 

        [b].[Slope], 

        [a].[Aspect] as [Grid Block Aspect],

        [a].[x] as [GridX],b.[x] as [FuelX],

        [a].[y] as [GridY],b.[y] as [FuelY]

    into Hgt

    from 

        [Grid Block] as a,

        [Fuel] as b, 

        [AngTab] as c

    where 

        GeomDistance([a].[Geom (I)], [b].[Geom (I)], 0) < 2 and 

        [Fuel].[Fuel Type] <> 'unclassifiable'

Attachments:
ScreenHunter_370 Aug. 08 16.54.jpg

adamw


7,129 post(s)
#08-Aug-17 07:59

Try replacing GeomDistance with GeomWithin. (See this post and maybe several posts above it.)

PS: The query is quite a pleasure to read and the structure is immediately clear.

tgazzard117 post(s)
#08-Aug-17 08:01

Hi Adam,

I saw your post as I was testing the other night and tried the GeomWithin - See the last 4 or 5 lines in the table.

Made a small difference. But still twice as fast in Manifold 8.

Apologies the queries in the files are still "not pretty"

Attached is the manifold 8 and radian versions of the files. Sorry I had to zip up the second file.

Attachments:
manifold8_test.map
temp_v9_test1.zip

adamw


7,129 post(s)
#08-Aug-17 08:10

Apologies for not looking.

Could you upload example data somewhere? There are ways to rewrite the query, but since there are multiple possible directions to explore, it would be useful to look at the data while doing that. In general, there are some scenarios where the Manifold 8 query engine is legitimately outperforming the Radian query engine, we are gradually improving the Radian query engine to cover them.

For this particular query, the GPGPU pragmas should make no difference.

adamw


7,129 post(s)
#08-Aug-17 08:27

Thanks for the files, we will take a look.

A few side notes following the columns in the spreadsheet screen. I realize you are just trying different things exploring the effect of each. Here's a view of how things are supposed to be:

1. fullfetch is irrelevant and should be removed as a column (it is turned off everywhere already like it should be, so that's a very minor point).

2. Index on Geom should always be there, the distance filter cannot be fast without it.

3. Index on Fuel Type is irrelevant, the <> operator cannot use it. If the operator was = or IN, it would have been different.

4. Currently, GeomDistance should always be GeomWithin, because the result of GeomDistance is then compared with a numeric value using <. The < could have been <= or even =. If it was >= / > / <> (as in, 'find all objects with distance higher than this limit', GeomDistance cannot be GeomWithin due to different semantics and cannot be optimized).

I will report back after taking a look at the data / queries (will take some time).

tgazzard117 post(s)
#08-Aug-17 08:34

Thanks Adam.

Point 3 is something new for me. I will keep that in mind.

Point 2 - I assumed this to be the case, but didn't see a noticeable difference here.

The reason for 10 queries doing the same thing.... Well, little differences magnify and I assume even out some system noise..... And this seemed like a reasonable way to test things.

adamw


7,129 post(s)
#12-Aug-17 13:42

We looked into this and managed to find a bug in the query engine that was preventing some of the scans from using spatial indexes. (The issue is the same as here.)

Execution times after the fix (I used Query in manifold8_test.map in Manifold 8 and a direct port of the same query into Radian with the call to Distance in WHERE switched to GeomWithin, the entire run of Sql_sequence should behave more or less similarly):

Manifold 8: 2.862 sec

Radian / Viewer: 0.953 sec

That's without making any adjustments, single thread vs single thread, the exact same semantic code.

We are going to deliver the fix in a few days in an update to Radian / Viewer.

tgazzard117 post(s)
#13-Aug-17 05:48

Thanks for reporting back. Cool that the improvement is so significant even without additional threads. Look forwarding to testing it when the next update comes out.

adamw


7,129 post(s)
#15-Aug-17 07:20

Try 9.0.162.4.

Manifold User Community Use Agreement Copyright (C) 2007-2017 Manifold Software Limited. All rights reserved.