How about making {n,m}+ as a possessive quantifier? #122

Open
k-takata opened this issue Jan 28, 2019 · 4 comments

@k-takata
Owner

Currently, {n,m}+ is a possessive quantifier only in the Java and Perl syntaxes.
How about making it a possessive quantifier in the default (Ruby) syntax as well?

k-takata added the spec label Jan 28, 2019

@nurse
Contributor

nurse commented Jan 28, 2019

It's hard to migrate because it breaks compatibility.

irb(main):001:0> /a{2,5}+b/
=> /a{2,5}+b/
irb(main):002:0> /a{2,5}+b/=~"aaaaaaaaaaaaaaaab"
=> 0
irb(main):003:0> $&
=> "aaaaaaaaaaaaaaaab"
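
To make the incompatibility concrete, here is a minimal sketch comparing the current reading of {n,m}+ with what a possessive reading would do. It assumes that an atomic group, (?>a{2,5}), is a fair stand-in for the proposed possessive a{2,5}+; the constant names and the 16-character subject are made up for illustration.

```ruby
# Assumption: the atomic group (?>...) emulates the proposed possessive
# meaning of {n,m}+. This is not what the default Ruby syntax does today.
SUBJECT = "a" * 16 + "b"

CURRENT_SYNTAX       = /a{2,5}+b/     # today: "+" repeats the interval, like (?:a{2,5})+b
POSSESSIVE_EMULATION = /(?>a{2,5})b/  # emulated possessive: at most 5 a's, no backtracking

puts CURRENT_SYNTAX.match(SUBJECT)[0]             # the whole subject, starting at 0
puts POSSESSIVE_EMULATION.match(SUBJECT).begin(0) # 11
puts POSSESSIVE_EMULATION.match(SUBJECT)[0]       # aaaaab

# A pattern that matches today but would stop matching under the possessive
# reading, because the quantifier can no longer give back its last "a".
MATCHES_TODAY  = /a{2,5}+a/.match("aaaaa")
FAILS_PROPOSED = /(?>a{2,5})a/.match("aaaaa")

p MATCHES_TODAY   # #<MatchData "aaaaa">
p FAILS_PROPOSED  # nil
```

Under these assumptions, the same pattern text would move the match start from 0 to 11 in the first case and turn a match into nil in the second, which is the kind of silent behavior change that makes the migration hard.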

@jaynetics

it would be nice for consistency and avoiding surprises.

maybe a possible route to migration could look like this?

  1. make it a SyntaxError, e.g. "possessive interval quantifiers are not enabled"
  2. everyone who is still updating will remove the effectless + at some point
  3. after a sufficiently long time, enable the feature
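
For step 2, projects would need a way to find affected patterns. The following is a rough sketch of such a check, assuming a hypothetical helper named interval_followed_by_plus? that is not part of Ruby or Onigmo. It only inspects the pattern source with another regexp, so it is deliberately naive: it does not understand character classes or a backslash-escaped opening brace and can report false positives there.

```ruby
# Hypothetical lint helper (not an existing Ruby or Onigmo API).
# Matches an interval quantifier such as {3}, {2,}, {,5} or {2,5}
# that is immediately followed by "+".
INTERVAL_PLUS = /\{(?:\d+,?\d*|,\d+)\}\+/

# Returns true when the regexp source contains an interval followed by "+".
# Naive by design: the source text is scanned, not parsed.
def interval_followed_by_plus?(regexp)
  regexp.source.match?(INTERVAL_PLUS)
end

[/a{2,5}+b/, /a{2,5}b/, /a++b/].each do |re|
  puts "#{re.inspect}: #{interval_followed_by_plus?(re)}"
end
# /a{2,5}+b/: true
# /a{2,5}b/: false
# /a++b/: false
```

A real migration aid would more likely live in the regexp parser itself, where it could emit a warning at compile time, much like the Perl deprecation cycle quoted later in this thread.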

@patch

patch commented Mar 31, 2019

Here’s an example of how Perl has deprecated regex syntax in the past in order to open the door to new syntax.

perl-5.16.0:

Unescaped literal "{" in regular expressions.

Starting with v5.20, it is planned to require a literal "{" to be escaped, for example by preceding it with a backslash. In v5.18, a deprecated warning message will be emitted for all such uses. This affects only patterns that are to match a literal "{". Other uses of this character, such as part of a quantifier or sequence as in those below, are completely unaffected:

/foo{3,5}/
/\p{Alphabetic}/
/\N{DIGIT ZERO}/

Removing this will permit extensions to Perl's pattern syntax and better error checking for existing syntax. See "Quantifiers" in perlre for an example.

perl-5.22.0:

A literal "{" should now be escaped in a pattern

If you want a literal left curly bracket (also called a left brace) in a regular expression pattern, you should now escape it by either preceding it with a backslash ("\{") or enclosing it within square brackets "[{]", or by using \Q; otherwise a deprecation warning will be raised. This was first announced as forthcoming in the v5.16 release; it will allow future extensions to the language to happen.

perl-5.26.0:

Unescaped literal "{" characters in regular expression patterns are no longer permissible

You have to now say something like "\{" or "[{]" to specify to match a LEFT CURLY BRACKET; otherwise, it is a fatal pattern compilation error. This change will allow future extensions to the language.

These have been deprecated since v5.16, with a deprecation message raised for some uses starting in v5.22. Unfortunately, the code added to raise the message was buggy and failed to warn in some cases where it should have. Therefore, enforcement of this ban for these cases is deferred until Perl 5.30, but the code has been fixed to raise a default-on deprecation message for them in the meantime.

Some uses of literal "{" occur in contexts where we do not foresee the meaning ever being anything but the literal, such as the very first character in the pattern, or after a "|" meaning alternation. Thus

qr/{fee|{fie/

matches either of the strings {fee or {fie. To avoid forcing unnecessary code changes, these uses do not need to be escaped, and no warning is raised about them, and there are no current plans to change this.

But it is always correct to escape "{", and the simple rule to remember is to always do so.

See "Unescaped left brace in regex is illegal here" in perldiag.

@patch

patch commented Apr 1, 2019

PCRE and ICU also support possessive {n,m}+


4 participants