    regexp/syntax: recognize category aliases like \p{Letter} · 930cf59b
    Russ Cox authored
    The Unicode specification defines aliases for some of the general
    category names. For example the category "L" has alias "Letter".
    
    The regexp package supports \p{L} but not \p{Letter}, because there
    was nothing in the Unicode tables that let regexp know about Letter.
    Now that package unicode provides CategoryAliases (see #70780),
    we can use it to provide \p{Letter} as well.
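
    As an illustration of what this enables for callers, the sketch below
    prefers the alias form and falls back to the short name when the
    toolchain predates alias support. The helper name compileLetterClass
    is hypothetical and not part of this change.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileLetterClass is a hypothetical helper, not part of package regexp.
// It tries the alias \p{Letter} first and falls back to \p{L} if the
// running toolchain rejects the alias.
func compileLetterClass() *regexp.Regexp {
	re, err := regexp.Compile(`^\p{Letter}+$`)
	if err != nil {
		// Older releases report an invalid character class here.
		re = regexp.MustCompile(`^\p{L}+$`)
	}
	return re
}

func main() {
	re := compileLetterClass()
	fmt.Println(re.MatchString("héllo"))  // letters only
	fmt.Println(re.MatchString("abc123")) // contains digits
}
```

    Either pattern matches the same set of runes, since Letter is an
    alias for the general category L.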
    
    This is the only feature missing from making package regexp suitable
    for use in a JSON-API Schema implementation. (The official test suite
    includes usage of aliases like \p{Letter} instead of \p{L}.)
    
    For better conformity with Unicode TR18, also accept case-insensitive
    matches for names and ignore underscores, hyphens, and spaces;
    and add Any, ASCII, and Assigned.
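
    The loose name matching described above can be sketched as a
    normalization step applied before lookup. This is only an
    illustration under that assumption, not the code in regexp/syntax;
    normalizeName is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeName is a hypothetical helper illustrating loose matching:
// it lowercases the name and drops underscores, hyphens, and spaces,
// so that differently spelled names compare equal.
func normalizeName(name string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(name) {
		switch r {
		case '_', '-', ' ':
			continue // skip ignorable separators
		}
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	names := []string{"Lowercase_Letter", "lowercase-letter", "LOWERCASE LETTER"}
	for _, name := range names {
		fmt.Println(normalizeName(name))
	}
}
```

    All three spellings normalize to the same key, so a single table
    lookup could serve every variant.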
    
    Fixes #70781.
    
    Change-Id: I50ff024d99255338fa8d92663881acb47f1e92a5
    Reviewed-on: https://go-review.googlesource.com/c/go/+/641377
    
    
    LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
    Reviewed-by: Alan Donovan <adonovan@google.com>