RegExp set notation: stage 1?
https://github.com/mathiasbynens/proposal-regexp-set-notation
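Add RegExp pattern syntax and semantics for these set operations: difference/subtraction (in A but not in B), intersection (in both A and B), and nested character classes.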
Note that union (in A or in B) is already supported in limited form (only within a single character class).
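As a point of reference, the limited form of union that exists today can be sketched as follows. This is only an illustration using current `u`-flag syntax; the variable name and sample characters are chosen here and are not part of the proposal.

```javascript
// Union inside a single character class, using syntax that already exists.
// The `u` flag enables \p{…} Unicode property escapes.
const khmerOrAsciiDigit = /^[\p{Script=Khmer}0-9]$/u;

console.log(khmerOrAsciiDigit.test('\u1780')); // KHMER LETTER KA → true
console.log(khmerOrAsciiDigit.test('7'));      // ASCII digit → true
console.log(khmerOrAsciiDigit.test('a'));      // neither → false
```

The class is a flat list of members: there is no way to group or combine sub-classes, which is the limitation the nested-class part of the proposal addresses.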
RegExp set notation
// Matching non-ASCII digits, to convert them to ASCII digits:
[\p{Decimal_Number}--[0-9]]
// → difference/subtraction + nested character class
// Matching spans of “word/identifier letters” of specific scripts:
[\p{Script=Khmer}&&[\p{Letter}\p{Mark}\p{Number}]]
// → intersection + nested character class
// Matching non-script-specific combining marks:
[\p{Nonspacing_Mark}&&[\p{Script=Inherited}\p{Script=Common}]]
// → intersection + nested character class
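For comparison, the three patterns above can be approximated in current `u`-flag syntax with lookaheads. The sketch below is a workaround, not the proposed semantics; it relies on each class matching exactly one code point, and the variable names and sample strings are illustrative assumptions.

```javascript
// Difference: \p{Decimal_Number} minus [0-9], emulated with a negative lookahead.
const nonAsciiDigit = /(?![0-9])\p{Decimal_Number}/gu;

// Intersection: \p{Script=Khmer} with [\p{Letter}\p{Mark}\p{Number}],
// emulated with a positive lookahead in front of the second class.
const khmerWordSpan =
  /(?:(?=\p{Script=Khmer})[\p{Letter}\p{Mark}\p{Number}])+/gu;

// Intersection: nonspacing marks whose script is Inherited or Common.
const genericMark =
  /(?=[\p{Script=Inherited}\p{Script=Common}])\p{Nonspacing_Mark}/u;

// Arabic-Indic digits 4 and 2 match; the ASCII digit 7 does not.
const digits = 'Room \u0664\u0662, floor 7'.match(nonAsciiDigit);
console.log(digits.length); // 2

// A Khmer word followed by Latin text: only the Khmer span is matched.
const khmerWord = '\u179F\u17BD\u179F\u17D2\u178F\u17B8';
const spans = `${khmerWord} hello`.match(khmerWordSpan);
console.log(spans.length === 1 && spans[0] === khmerWord); // true

// U+0301 COMBINING ACUTE ACCENT has Script=Inherited; U+17D2 is Khmer-specific.
console.log(genericMark.test('\u0301')); // true
console.log(genericMark.test('\u17D2')); // false
```

The lookahead versions are harder to read and to compose than the set notation, which is part of the motivation for dedicated syntax.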
/…\UnicodeSet{…}…/u
ICU, Java, Perl (experimental), Python regex module, .NET, XML Schema, Xerces, Ruby
language/implementation | union | subtraction | intersection | nested classes | symmetric difference |
… | ✅ | ✅ | ✅ | ✅ | ❌ |
… | ✅ | 🤷 * | ✅ | ✅ | ❌ |
… | ✅ | ✅ | ✅ | ✅ | ✅ |
… | ✅ | ✅ | ❌ | ✅ | ❌ |
… | ✅ | ✅ | ❌ | ✅ | ❌ |
… | ✅ | ✅ | ✅ | ✅ | ❌ |
Python regex | ✅ | ✅ | ✅ | ✅ | ✅ |
Ruby | ✅ | ❌ | ✅ | ❌ | ❌ |
ECMAScript prior to this proposal | ✅ | ❌ | ❌ | ❌ | ❌ |
ECMAScript with this proposal | ✅ | ✅ | ✅ | ✅ | ❌ |