Subscribe to this thread
Home - General / All posts - Postgresql 11.3 SQL query requires a wildcard to return the correct results
#16-Jul-19 20:48

Manifold 9 Edge Build 9.0.169.4(22-Jun-2019)

Datasource: Postgresql 11.3 with PostGIS 2.5

It is possible that I'm making a newbie mistake here, but I'm having trouble creating a query based on a table in my database. I hope someone here can set me on the right path.

I'm using the following SQL query, but it returns 0 records:

SELECT * FROM [BGS_DMS]::[tenement.global_master]

WHERE [subjurisdiction]='MB'

;

If I modify the query to the following syntax it returns the correct set of records:

SELECT * FROM [BGS_DMS]::[tenement.global_master]

WHERE [subjurisdiction] like 'MB%'

;

For reference, the field [subjurisdiction] is character (25) in postgresql and shows in the Manifold 9 schema as nvachar. It seems to me that the Manifold SQL engine sees the extra room in the field as trailing spaces, hence the success with a wildcard. I've tested the first query in pgAdmin, QGIS, and Access; these all return the correct data subset.

Is there something that I've set up incorrectly while setting up the connection in Manifold or in the database itself? Please let me know if I need to include additional information.

Also, is there a way to know the number of records that have been returned by a query in Manifold?

Cheers,

Chris

oeaulong

203 post(s)
#16-Jul-19 21:51

Is there a Trim() or Split_part() command you could run the exact match from the first example through to test this?

tjhb

8,805 post(s)
#17-Jul-19 01:59

Interesting question! I am out of my depth, looking at online references for the first time. From what I understand:

PostegreSQL has two relevant string datatypes: varchar and varlena. The varchar type is for fixed-length strings, varlena for variable-length character arrays.

Various posts on stackoverflow suggest that only varlena is used for the most recent versions of postgreSQL, even for fixed-length strings.

I imagine that length in varchar is controlled by a null delimiter (as in C).

Fixed length in varlena is instead given by a property and controlled by a constraint. C exports from varlena add a null delimiter after the known length.

Apparently, null characters cannot be stored as part of a string either in varchar or in varlena. (Seems strange in the latter case but perhaps understandable; not sure.)

To get to the point:

Perhaps Manifold is expecting fixed-length postgreSQL strings always to be null-delimited (as when coming from varchar), while sometimes, in recent versions, they are not (because coming from varlena). Perhaps it is not reading the internal length property in the latter case when it should. (That could explain why the equality test does not work for a shortened fixed-length string stored as varlena.)

For now, as a better workaround, I would follow oeaulong by using Manifold StringTrim(), probably with the WHITESPACE constant as the <trim> argument, as a somewhat more robust alternative to LIKE.

Or you could count backwards from the character preceding the last whitespace character, as varchar.bcTruelen() does in postgreSQL. That would be even more robust, preserving internal whitespace.

As I said, out of my depth, just trying to help a little bit.

Dimitri


5,491 post(s)
#17-Jul-19 09:27

PostegreSQL has two relevant string datatypes: varchar and varlena.

Also has a third type, character(n) or char(n) for short, which is what's being used and which pads with space characters.

Dimitri


5,491 post(s)
#17-Jul-19 09:22

To understand the difference between results returned by the two different WHERE qualifiers, ask yourself what does the [subjurisdiction] field contain?

Using the where qualifier

WHERE [subjurisdiction]='MB'

means that only those records that contain the two characters M and B in sequence, and only those two characters, will be selected. Anything else, including white space characters of any kind (spaces, newlines, unicode characters that are invisible, etc), that precedes or follows the MB sequence will not be equal to 'MB'.

You don't say what your [subjurisdiction] field contains, but we can infer what it contains based on your description of the data type as character(25) and then consulting the PostgreSQL documentation, for example, at https://www.postgresql.org/docs/9.1/datatype-character.html

That tells us... "Values of type character are physically padded with spaces to the specified width n, and are stored and displayed that way."

So, if you are looking at a character(25) field that seems to contain the characters MB, it actually does not contain only 2 characters, it contains 25 characters, the characters M and B followed by 23 space characters.

If you want to match that, you have to write:

WHERE [subjurisdiction]='MB                       '

...where you've put 23 space characters between the B and the concluding single ' quote. Spaces are real characters, just as real as, say, an underscore _ character. You don't expect 'MB' to be equal to 'M_B' or 'MB_', so you shouldn't expect 'MB' to be equal to 'M B' or to 'MB ' (space character between the B and the ').

The solution is to take advantage of Manifold's handy and ever-popular string manipulation SQL functions, such as StringTrim:

WHERE StringTrim([subjurisdiction], ' ') ='MB'

We use the above to trim any space characters (the space between the ' ' in the function arguments) before making the comparison to the string literal 'MB'.

I've tested the first query in pgAdmin, QGIS, and Access

Meaningless, since Access doesn't add space characters to data to pad them, QGIS has no database (so you were really using some other DBMS), and the PostgreSQL documentation says it explicitly ignores space characters used for padding.

That latter bit, by the way, is an uncharacteristically sloppy approach to data for Postgres, since either a literal is what you mean it to be or it isn't, with sharp minds not buying any of this "oh sometimes I mean it when I use a space character and sometimes I don't".

#17-Jul-19 18:35

Thank you for you detailed response Dimitri. When reading the Postgresql documentation I had missed the part, or not comprehended, about "blank padding" in the character (n) field type. I expect I'll look at modifying these fields to varchar (n) to avoid this problem.

As per, oeaulong's suggestion yesterday I had discovered and tried the StringTrim function. As you pointed out it achieved my goal for the query.

Thank you everyone who helped get me back on track here.

tjhb

8,805 post(s)
#18-Jul-19 05:11

Also has a third type, character(n) or char(n) for short, which is what's being used and which pads with space characters.

Thanks Dimitri. Not for the first time, I have missed the somewhat obvious.

...the PostgreSQL documentation says it explicitly ignores space characters used for padding.

That latter bit, by the way, is an uncharacteristically sloppy approach to data for Postgres, since either a literal is what you mean it to be or it isn't

Yes, wow, that is slack. Spaces are characters. Literals are literals. Equality is... well, sometimes there could perhaps be a reason to use a heuristic or an approximation, but only very very rarely. (A case in point is equality between geometry, in Manifold 8 and 9. Equality means numerical equality, with no tolerance.)

You are so much better at homework than I am! I try but, well, chapeau. :)

Manifold User Community Use Agreement Copyright (C) 2007-2017 Manifold Software Limited. All rights reserved.