error in Regular expressions?
KlausDE

6,300 post(s)
#03-Feb-18 20:16

I'm not on safe ground using regular expressions. So could someone please check whether the escape backslash works for them to flag a left parenthesis as a literal "(" and not the start of a capture group?

--SQL

StringRegexpMatches([text], '.*\(XYZ\).*', 'i')

doesn't match "This text with (XYZ) and more" in [text] in 9.0.165.

lionel

560 post(s)
#03-Feb-18 20:22

I tested StringRegexpMatches([text], '.*\(XYZ\).*', 'i') with "This text with (XYZ) and more" as the text

at https://www.regextester.com/

Using the online regexp editor, it is OK.

Attachments:
regexp.png



"Because my dad promised me" ( interstellar ) but blackhole don't exist

best hardware with no ads focus on quality features price like manifold see xiaomi

KlausDE

6,300 post(s)
#03-Feb-18 20:27

Sorry, the pattern in my first post does match, but it matches as a capture group. This doesn't match in Mfd9 SQL but should:

--SQL

StringRegexpMatches([text], '.*\(XY.*', 'i')

adamw


8,447 post(s)
#06-Feb-18 12:31

As of 9.0.165.1:

--SQL

> ? StringRegexpMatches('This text with (XYZ) and more', '.*\(XY.*', 'i')

boolean: false

> ? StringRegexpMatches('This text with (XYZ) and more', @'.*\(XY.*', 'i')

boolean: true
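
The two calls above differ only in how the pattern literal is written. A rough JavaScript analogy, offered as a sketch and not as a description of Manifold's parser: an ordinary string literal consumes one level of backslashes, while String.raw passes them through unchanged. String.raw stands in here for the @'...' form, on the assumption (suggested by the output above) that @ marks a literal taken verbatim.

```javascript
const text = 'This text with (XYZ) and more';

// Ordinary literal: the backslash must be doubled to survive string parsing.
const escaped = '.*\\(XY.*';

// Raw literal: backslashes are kept as typed (an analogy for @'...', by assumption).
const verbatim = String.raw`.*\(XY.*`;

// Both spellings hand the same pattern text to the regexp engine.
console.log(escaped === verbatim);                 // true
console.log(new RegExp(verbatim, 'i').test(text)); // true
```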

tjhb

8,657 post(s)
#03-Feb-18 20:31

Klaus,

I have hit this one too.

We need to use a double backslash in a RegExp inside SQL9: the first backslash escapes the second (for SQL), and the second escapes the special character (for RegExp).

[Link to beta thread, with comments by Adam.]
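
The two stages can be modelled in a few lines of JavaScript. This is only a sketch: unescapeLiteral and matchesAfterUnescape are hypothetical names, the first stage is an assumed model of a string-literal parser (not Manifold's actual one), and treating an invalid pattern as "no match" is likewise an assumption.

```javascript
// Stage 1 (assumed model): a string-literal parser that treats backslash as
// an escape and keeps only the character that follows it.
function unescapeLiteral(source) {
  return source.replace(/\\(.)/g, '$1');
}

// Stage 2: hand the unescaped text to a regexp engine.
// An invalid pattern is reported as "no match" (an assumption).
function matchesAfterUnescape(text, literalSource) {
  const pattern = unescapeLiteral(literalSource);
  try {
    return new RegExp(pattern, 'i').test(text);
  } catch (err) {
    return false;
  }
}

const text = 'This text with (XYZ) and more';

// String.raw keeps the backslashes exactly as they would be typed in the query.
console.log(unescapeLiteral(String.raw`.*\(XY.*`));             // .*(XY.*
console.log(unescapeLiteral(String.raw`.*\\(XY.*`));            // .*\(XY.*
console.log(matchesAfterUnescape(text, String.raw`.*\(XY.*`));  // false
console.log(matchesAfterUnescape(text, String.raw`.*\\(XY.*`)); // true
```

With a single backslash the regexp engine never sees an escape at all: it receives an unclosed group.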

KlausDE

6,300 post(s)
#03-Feb-18 20:38

uhhhh. It's buried deeper in the manual. A hint in the table would be nice.

Dimitri


5,359 post(s)
#04-Feb-18 07:56

It's buried deeper in the manual. A hint in the table would be nice.

With all due respect, it is highlighted with a boldfaced Important: word right at the beginning of the very first Example in the Regular Expressions topic. That's not "buried"! :-)

Regular expressions are very powerful but they are sophisticated and require significant investment in learning. I'm all for hints, but if the explicit commentary below is overlooked, could it be that a less explicit hint would also be overlooked? If you think it would help, by all means send in the text you recommend and where you think it should go and I'll get it added to the documentation.

The boldfacing is lost in the text quote below... see the topic in the user manual for the fully-formatted version:

Important: When using regular expressions to specify a Pattern in the Select panel Template tab, remember that we must enclose the regular expression in single quote characters, as in '(Carlos|Mario) .*' and also that we must escape each backslash with a preceding backslash. To simplify the examples shown the backslash characters are not doubled as they would be to escape each regexp backslash. For example, the US style phone numbers example

(\+\d)?\s*(\(\d+\))?\s*\d[\s\d-]*

Would be entered into the Pattern box as

'(\\+\\d)?\\s*(\\(\\d+\\))?\\s*\\d[\\s\\d-]*'
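
The doubling described in the manual is mechanical, so it can be scripted. A minimal sketch in JavaScript, assuming only what the quoted passage says (double each backslash, wrap in single quotes); toPatternLiteral is a hypothetical helper name, and single quotes inside the pattern are not handled.

```javascript
// Hypothetical helper: double every backslash and wrap the result in single
// quotes, as the manual's phone-number example does by hand.
// Single quotes inside the pattern are NOT handled here.
function toPatternLiteral(regexp) {
  return "'" + regexp.replace(/\\/g, '\\\\') + "'";
}

// String.raw keeps the regexp exactly as it appears in the manual.
const phone = String.raw`(\+\d)?\s*(\(\d+\))?\s*\d[\s\d-]*`;

console.log(toPatternLiteral(phone));
// '(\\+\\d)?\\s*(\\(\\d+\\))?\\s*\\d[\\s\\d-]*'
```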

tjhb

8,657 post(s)
#04-Feb-18 08:20

You are right Dimitri, it's thorough, but Klaus is right too.

There's a missed opportunity to alert us to the necessity to double-escape, in the first row of the table under the heading Using Regular Expressions, that is, in the Description for the \ character. I think that's all Klaus means.

It's important for two reasons. First that RegExp syntax is hard enough already (however powerful--it's a bit Faustian). Secondly that the double-escape is a (necessary) departure from standard syntax, and therefore from otherwise helpful examples provided for other implementations.

So it's really easy to get stuck.

[Added] E.g. the following would be good, added to the \ description in the table:

"Important: see the section 'Escaping special characters in Regular Expressions' below for the special treatment of the '\' escape character in strings within SQL."

Dimitri


5,359 post(s)
#04-Feb-18 14:34

I have still put in a change request as you suggest.

Note that the double \\ backslash is explicitly discussed already in three locations in that topic. In the table where you recommend the "Important:..." note, it is discussed in

The sequence \\ matches the backslash character "\"

... so I have my doubts that adding an Important: ... note to the effect of "please read this stuff..." is going to get much traction.

But it harms nothing and every little bit helps. Thanks for the text!

KlausDE

6,300 post(s)
#04-Feb-18 17:50

The problem is that the first row in the table feels like a complete answer to whatever made you search for help. So why bother with all the complicated rest?

tjhb

8,657 post(s)
#04-Feb-18 19:28

Yes the pointer/nudge will be good (thanks Dimitri).

Once you do get into the examples below it's extremely helpful stuff. The patient detail on capturing/non-capturing groups for example is excellent--something I always need a refresher on each time I use regular expressions (not often enough for me to be good at them). RegExp syntax can seem a bit spooky and the discussion here demystifies things nicely.

lionel

560 post(s)
#03-Feb-18 20:50

Attachments:
regex101.png



KlausDE

6,300 post(s)
#03-Feb-18 21:05

lionel, the ordinary syntax of regular expressions is a mystery to me because it can quickly become confusing. But here the issue is the double parsing by two interpreters, SQL and RegEx.

Missed that link, Tim. Thx

lionel

560 post(s)
#03-Feb-18 21:11

I have used regex for many years (with Dreamweaver), and going back through my old documentation, the best online tool is https://regex101.com/r/gWAPns/1/

This editor helps you understand:

-- every operator (alone or at the start): see EXPLANATION on the right side

-- the differences between languages: see FLAVOR on the left side. The syntax differs a little from one language to another. I think Perl was the first to have an efficient regex library?

-- the groups extracted by the regular expression: see MATCH INFORMATION on the right side

The operator for extracting a pattern, say motif1, is a pair of parentheses: (motif1)

These groups can then be referred to using the predefined variables $1, $2, ...

Hope this helps.

Attachments:
regexp101_group_content_with_quote.png



lionel

560 post(s)
#03-Feb-18 21:31

You have to write a use case for every case you need to handle: the input (see (1) below) and the result you want (see (4) below). Then test all your inputs (1) in a regexp editor. Once a single regexp passes every test case, you know you can use it in the real context.

(1) begin http://somesite.com/assets/Test1.gif

(4) end http://www.example.com/mockup/Test1.gif

So I want to keep the group that contains the image name, here xxxx.yy.

So I use this regex (2) .*/assets/(.*) which stores whatever it finds between the parentheses, here xxxx.yy or more specifically Test1.gif, in the variable $1.

Then I can use this variable in (3) http://www.example.com/mockup/$1, which simply replaces $1 with its value. In a way, (3) is a kind of template.
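
The steps (1) to (4) can be tried out in JavaScript, whose regexp flavor supports the same $1 notation in replacement templates. This is a sketch in that flavor only. The pattern is written with a leading .* because JavaScript rejects a pattern that begins with a bare *, and the slashes are escaped because the pattern is written as a regexp literal.

```javascript
const input = 'http://somesite.com/assets/Test1.gif';   // (1) begin
const pattern = /.*\/assets\/(.*)/;                      // (2) regex with one capture group
const template = 'http://www.example.com/mockup/$1';     // (3) template that uses the group

// The whole match is replaced by the template, with $1 filled in.
const output = input.replace(pattern, template);

console.log(input.match(pattern)[1]); // Test1.gif
console.log(output);                  // (4) http://www.example.com/mockup/Test1.gif
```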

At this point I can't really help. Can you describe what you want? Retrieve everything after ...(XY ?

Hope this helps. I'll try out the regexp behaviour in Manifold 9 ... "a little".

Attachments:
regexp_escape.png



lionel

560 post(s)
#03-Feb-18 21:48

A) jQuery examples

Here are some example regexps for specific kinds of data: email, credit card, date, file name, image, HTML tag, HTTP URL (protocol?) ....

https://www.sitepoint.com/jquery-basic-regex-selector-examples/

B) in manifold 9 regexp 123456

http://www.manifold.net/doc/mfd9/regular_expressions.htm

SQL => StringRegexp....()

Transform => the number and kind of items offered depend on the type of the field, which is why there are many chapters on Transform. Each chapter is specific to one type!!

Here is a short list: TextMatchesRegexp() TextMatches() TextRegexp(), number of RegexpMatches

I think the Transform tool should have a filter to reduce the number of items.



KlausDE

6,300 post(s)
#03-Feb-18 23:37

To cut a long story short. Tim answered my question:

Summary

We have to double the escape characters in Manifold!

The first backslash is consumed by the SQL parser and the second backslash is consumed by the RegEx parser.

--SQL9

StringRegexpMatches([text], '.*\\(XY.*', 'i')

-- matches 'some text with (XYZ) and more'

lionel

560 post(s)
#04-Feb-18 08:58

In the mfd9 documentation, under SQL -> SQL functions:

The \\ doesn't work in the regex101 editor, so it seems to be specific to Manifold!

SQL9 

     ? StringRegexpMatches('This text with (XYZ) and more', '.*\\(XY.*', 'i')

=> to test, run View -> New Command Window -> SQL (Ctrl Tilde)

=> copy the SQL statement into the command editor

=> the SQL function will return true in this case

=> the pattern is found, but nothing is stored in the group variables $1, $2, ...

.....

StringRegexpMatches(<string>, <regexp>, <flags>) : <boolean>

Given a string, a regular expression pattern and a flag to use case or to not use case, returns True if the string exactly matches the regular expression and 0 for False otherwise. The <flag> is not optional and must be either 'i' to ignore case or 'c' to use case. This function is used instead of "LIKEX" query operators found in some query engines.

? StringRegexpMatches('Netherlands', 'n.*', 'c')

Returns 0 (meaning False) since the string does not begin with a lower case n character.

? StringRegexpMatches('Netherlands', 'n.*', 'i')

Returns 1 (meaning True) since if case is ignored the upper case N at the first character position matches the regular expression 'n.*', that is, text which begins with the letter n or N and then has zero or more of any characters following.
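
The behaviour described in this documentation excerpt (a match against the whole string, with a mandatory 'i' or 'c' flag) can be imitated in JavaScript for experimenting outside Manifold. This is a sketch of the description only, not of Manifold's implementation; regexpMatches is a hypothetical name, and reading "exactly matches" as "the whole string must match" is an assumption.

```javascript
// Hypothetical stand-in for the documented behaviour: the whole string must
// match, and the flag is 'i' (ignore case) or 'c' (use case).
function regexpMatches(text, pattern, flag) {
  if (flag !== 'i' && flag !== 'c') {
    throw new Error("flag must be 'i' or 'c'");
  }
  // Anchor the pattern so that a partial match does not count.
  const anchored = new RegExp('^(?:' + pattern + ')$', flag === 'i' ? 'i' : '');
  return anchored.test(text);
}

console.log(regexpMatches('Netherlands', 'n.*', 'c'));   // false
console.log(regexpMatches('Netherlands', 'n.*', 'i'));   // true
console.log(regexpMatches('Netherlands', 'ether', 'i')); // false: only part of the string matches
```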



lionel

560 post(s)
#04-Feb-18 09:25

For context, it seems that the JavaScript regexp engine behaves the same way, for example when used in the Chrome console:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp

javascript

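let re = /\\/; // regexp literal: matches a single backslash

// or

re = new RegExp('\\\\'); // same regexp built from a string: each backslash is doubled again for the string literal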

javascript 

class RegExp1 extends RegExp {
  // Called by String.prototype.match() when this object is passed as the pattern.
  [Symbol.match](str) {
    const result = RegExp.prototype[Symbol.match].call(this, str);
    return result ? 'VALID' : 'INVALID';
  }
}

// '\\(' in the string literal reaches the regexp engine as '\(' (a literal parenthesis).
console.log('This text with (XYZ) and more'.match(new RegExp1('.*\\(XY.*')));

// expected output: "VALID"

Attachments:
javascript_litteral.png
javascript_regex.png



Manifold User Community Use Agreement Copyright (C) 2007-2017 Manifold Software Limited. All rights reserved.