Re: find-regexp-in-string: anything wrong?
Thanks Pascal for the nice recap.
As things stand, I’d say it is a bug in LW 7.x
Cheers
MA
> On Feb 23, 2016, at 21:16 , Pascal J. Bourguignon <pjb@informatimago.com> wrote:
>
> On 23/02/16 16:52, Antoniotti Marco wrote:
>> Aren’t you supposed to escape the parentheses in order to tell the regexp compiler that they are to be used for grouping?
> That depends.
>
> Each regexp matcher may implement its own syntax.
>
> Historically, the Unix regex(3) library functions have implemented TWO different syntaxes!
> Eventually, they were standardized by POSIX:
> IEEE Std 1003.2 (``POSIX.2''), sections 2.8 (Regular Expression Notation) and B.5 (C Binding for Regular Expression Matching).
> http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html
>
> The two syntaxes are the Basic Regular Expressions (BRE) and the Extended Regular Expressions (ERE).
>
> To make it short, in BRE, the special characters are: .[\*^$
> while in ERE, the special characters are: .[\()*+?{|^$
>
> The rules are complicated a little by context (for example, inside or outside a bracket expression), but in the usual context it means that bare ( and ) are grouping operators only in an ERE (or a derivative of ERE), where you have to escape them to match literal parentheses. In a BRE (or a derivative of BRE) it is the other way around: bare ( and ) are literal, and \( and \) do the grouping.
>
> Some libraries accept both ERE and BRE (possibly with extensions), selected by a flag, notably unix regex(3), which can be confusing.
>
> cl-ppcre uses Perl-compatible syntax, which follows the ERE convention for grouping:
>
> (scan "'([a-z])+'" " 'foo' ")
> 1
> 6
> #(4)
> #(5)
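
For readers without cl-ppcre at hand, the same four values can be reproduced with any matcher that follows the ERE grouping convention. A minimal sketch in Python (an assumption of this note, chosen only because its re module treats bare parentheses as grouping; it is not part of the original exchange):

```python
import re

# Same pattern and subject string as the cl-ppcre call above.
m = re.search(r"'([a-z])+'", " 'foo' ")

# Bounds of the whole match.
print(m.start(), m.end())    # 1 6
# Bounds of group 1; a repeated group keeps its last iteration.
print(m.start(1), m.end(1))  # 4 5
```

The register bounds are 4 and 5 rather than 2 and 5 because the + sits outside the group, so the group only remembers the final "o".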
>
>
>> In any case….
>>
>> CL-USER 78 > (find-regexp-in-string "'([a-z])+'" " 'foo'" :start 0)
>> NIL
>> NIL
>
> So it looks like this find-regexp-in-string is expecting a BRE-style syntax by default: ( is not special, and you have to use \( and \) for grouping and \+ for the repetition (\+ being a GNU extension rather than part of POSIX BRE), with each \ escaped in the Lisp string: "'\\([a-z]\\+\\)'"
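
To illustrate why the unescaped pattern finds nothing under a BRE-style reading, here is a minimal sketch in Python. The helper name bre_to_python is hypothetical, it only flips the three characters ( ) + and ignores bracket expressions, and it says nothing about how LispWorks actually implements find-regexp-in-string; it only models the escaping convention described above.

```python
import re

# Characters whose bare/escaped meaning is swapped between the BRE-style
# dialect discussed above and Python's ERE-like syntax.
FLIPPED = set("()+")

def bre_to_python(pattern: str) -> str:
    # Hypothetical helper: rewrite a BRE-style pattern into Python re syntax.
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # \( \) \+ are operators in the BRE-style dialect: emit them bare.
            # Any other escape sequence is passed through unchanged.
            out.append(nxt if nxt in FLIPPED else ch + nxt)
            i += 2
        elif ch in FLIPPED:
            # Bare ( ) + are literals in the BRE-style dialect: escape them.
            out.append("\\" + ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)

# The escaped form groups and repeats, so it matches 'foo'.
print(bre_to_python(r"'\([a-z]\+\)'"))  # '([a-z]+)'
print(re.search(bre_to_python(r"'\([a-z]\+\)'"), " 'foo'"))

# The unescaped form asks for literal ( ) + characters, so it finds nothing.
print(re.search(bre_to_python("'([a-z])+'"), " 'foo'"))  # None
```

Read this way, the original pattern is looking for a literal string such as '(x)+', which would explain the NIL NIL result.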
>
>
> Finally, notice that some regexp matchers are even more confusing because they mix elements of both ERE and BRE. For example, Emacs regexps look like BRE, since you have to use \( and \) for grouping, but the one-or-more repetition is written as a bare +, not \+.
> In Emacs, it would be:
>
> (string-match "'\\([a-z]+\\)'" " 'foo'") --> 1
>
>
>
> -- __Pascal J. Bourguignon__ http://www.informatimago.com/
>
> _______________________________________________
> Lisp Hug - the mailing list for LispWorks users
> lisp-hug@lispworks.com
> http://www.lispworks.com/support/lisp-hug.html
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY
Please check: http://cdac.lakecomoschool.org
Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me first.