aercolino / ando-php
A collection of PHP 5.2.4+ classes for web projects.
License: MIT License
I call lexical backreference one like
(?1) (numbered absolute)
(?-1) (numbered relative)
(?P>name) (named)
(?&name) (named)
Defined at: http://php.net/manual/en/regexp.reference.recursive.php
If the syntax for a recursive subpattern reference (either by number or by name) is used outside the parentheses to which it refers, it operates like a subroutine in a programming language. An earlier example pointed out that the pattern (sens|respons)e and \1ibility matches "sense and sensibility" and "response and responsibility", but not "sense and responsibility". If instead the pattern (sens|respons)e and (?1)ibility is used, it does match "sense and responsibility" as well as the other two strings. Such references must, however, follow the subpattern to which they refer.
I'm not sure about my name for this syntax (lexical backreference). I came up with it while looking at the above example of non-recursive usage. Lexical means that such a backreference represents a previous group as it was defined, not as it will be matched. In fact, the following regular expressions are equivalent (they match the same strings):
@(sens|respons)e and (?1)ibility@
@(sens|respons)e and (?:sens|respons)ibility@
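A minimal sketch of the difference, assuming only PHP's built-in preg_match(). The ^ and $ anchors and the variable names are additions for illustration and are not part of the manual's example.

```php
<?php
// \1 repeats the text that group 1 matched; (?1) re-applies group 1's pattern.
$backreference = '@^(sens|respons)e and \1ibility$@';
$subroutine    = '@^(sens|respons)e and (?1)ibility$@';
$expanded      = '@^(sens|respons)e and (?:sens|respons)ibility$@';

$subjects = array(
    'sense and sensibility',
    'response and responsibility',
    'sense and responsibility',
);

$results = array();
foreach ($subjects as $subject) {
    $results[$subject] = array(
        'backreference' => preg_match($backreference, $subject),
        'subroutine'    => preg_match($subroutine, $subject),
        'expanded'      => preg_match($expanded, $subject),
    );
    // Print one line per subject, e.g. "sense and sensibility: 1 1 1"
    echo $subject, ': ', implode(' ', $results[$subject]), "\n";
}
```

Only the mixed subject ("sense and responsibility") separates the forms: the plain backreference rejects it, while the lexical backreference and its expanded equivalent both accept it.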
Another interesting consequence of this naming convention is that recursive backreferences are self lexical backreferences.
For the time being, I'll stick to it.
Named groups are like
(?P<name>pattern)
(?<name>pattern)
(?'name'pattern)
Currently only the angle brackets type is supported.
Defined at: http://php.net/manual/en/regexp.reference.subpatterns.php
It is possible to name a subpattern using the syntax (?P<name>pattern). This subpattern will then be indexed in the matches array by its normal numeric position and also by name. PHP 5.2.2 introduced two alternative syntaxes (?<name>pattern) and (?'name'pattern).
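As a quick illustration of the three syntaxes, here is a sketch using plain preg_match(), not the Regex class. The subject string and the group names key and value are made up for the example.

```php
<?php
$subject  = 'width=42';
$patterns = array(
    '/(?P<key>\w+)=(?P<value>\d+)/',
    '/(?<key>\w+)=(?<value>\d+)/',
    "/(?'key'\\w+)=(?'value'\\d+)/",
);

$captures = array();
foreach ($patterns as $pattern) {
    preg_match($pattern, $subject, $matches);
    // Each named group appears twice: once by name and once by number.
    $captures[] = $matches;
}

print_r($captures[0]);
```

All three patterns should fill the matches array the same way, with 'key' and 1 both holding "width", and 'value' and 2 both holding "42".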
Templates like (a)$variable(b)\1\2 are not currently supported in the general case, meaning that they are only supported if $variable doesn't contain groups, which should be very infrequent.
So, I think this should be considered a bug, at the moment.
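The problem can be reproduced with a naive textual substitution. This is only a sketch: strtr() stands in for the interpolation step and is not how the Regex class does it, and the ^ and $ anchors are added for the test subjects.

```php
<?php
$template = '(a)$variable(b)\1\2';

// Naive textual interpolation (an assumption, for illustration only).
$plain   = '/^' . strtr($template, array('$variable' => 'x')) . '$/';
$grouped = '/^' . strtr($template, array('$variable' => '(x)')) . '$/';

// Without a group in the variable, \2 still refers to (b).
$plain_matches = preg_match($plain, 'axbab');

// With a group in the variable, (x) becomes group 2 and (b) becomes group 3,
// so \2 now repeats "x" instead of "b".
$grouped_matches_intended = preg_match($grouped, 'axbab');
$grouped_matches_shifted  = preg_match($grouped, 'axbax');

echo "$plain_matches $grouped_matches_intended $grouped_matches_shifted\n";
```

Supporting the general case would mean renumbering the template's backreferences according to how many groups each interpolated variable introduces.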
I call existential backreferences those that make up conditions, like
(?(1)then|else)
Defined at: http://php.net/manual/en/regexp.reference.conditional.php
There are two kinds of condition. If the text between the parentheses consists of a sequence of digits, then the condition is satisfied if the capturing subpattern of that number has previously matched.
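A small sketch of an existential backreference with both a then and an else branch, using plain preg_match(). The pattern and subjects are invented for the example: if the optional opening bracket was captured, a closing bracket is required, otherwise a semicolon is.

```php
<?php
// (?(1)>|;) means: if group 1 matched, require ">", else require ";".
$pattern = '/^(<)?\w+(?(1)>|;)$/';

$subjects = array('<tag>', 'tag;', '<tag;', 'tag>');

$results = array();
foreach ($subjects as $subject) {
    $results[$subject] = preg_match($pattern, $subject);
    echo $subject, ' => ', $results[$subject], "\n";
}
```

The first two subjects should match and the last two should not, because each of them ends with the branch that belongs to the other case.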
I call forward backreference one that appears before the group it refers to.
Defined at: http://php.net/manual/en/regexp.reference.back-references.php
However, if the decimal number following the backslash is less than 10, it is always taken as a back reference, and causes an error only if there are not that many capturing left parentheses in the entire pattern. In other words, the parentheses that are referenced need not be to the left of the reference for numbers less than 10. A "forward back reference" can make sense when a repetition is involved and the subpattern to the right has participated in an earlier iteration. See the section entitled "Backslash" above for further details of the handling of digits following a backslash.
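A sketch of a forward backreference that makes sense thanks to repetition. The pattern is an illustration, not taken from the quoted page: \2 appears before group 2, so it can only succeed once an earlier iteration has let group 2 capture something.

```php
<?php
// First iteration: \2 is unset, so only the (one) branch can match.
// Later iterations: \2two can match "one" followed by "two".
$pattern = '/^(\2two|(one))+$/';

$two_iterations = preg_match($pattern, 'oneonetwo', $matches);
$no_earlier_one = preg_match($pattern, 'onetwo');

echo "$two_iterations $no_earlier_one\n";
```

"oneonetwo" should match in two iterations ("one", then "onetwo"), while "onetwo" should not, since after "one" the remaining "two" fits neither branch.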
I call self backreference one that refers to the same group it appears in.
Defined at: http://php.net/manual/en/regexp.reference.back-references.php:
A back reference that occurs inside the parentheses to which it refers fails when the subpattern is first used, so, for example, (a\1) never matches. However, such references can be useful inside repeated subpatterns. For example, the pattern (a|b\1)+ matches any number of "a"s and also "aba", "ababba" etc. At each iteration of the subpattern, the back reference matches the character string corresponding to the previous iteration. In order for this to work, the pattern must be such that the first iteration does not need to match the back reference. This can be done using alternation, as in the example above, or by a quantifier with a minimum of zero.
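The manual's two patterns can be tried directly with preg_match(). In this sketch the ^ and $ anchors and the extra subjects are additions, so that partial matches don't hide the behaviour.

```php
<?php
$repeated = '/^(a|b\1)+$/';

$subjects = array('a', 'aaa', 'aba', 'ababba', 'ab', 'ba');

$results = array();
foreach ($subjects as $subject) {
    $results[$subject] = preg_match($repeated, $subject);
    echo $subject, ' => ', $results[$subject], "\n";
}

// Without repetition the self backreference has nothing to refer to yet.
$never = preg_match('/(a\1)/', 'aa');
```

In "ababba" the iterations match "a", "ba" and "bba": each b\1 repeats what the previous iteration captured.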
Regex::find_duplicate_numbers and Regex::explode_alternation both need balanced parentheses and they scan the subject for them one character at a time.
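For reference, a character-by-character balance check could look like the sketch below. The function name parentheses_are_balanced is hypothetical and this is not the code used by the Regex class. It assumes that only backslash escapes need special handling and deliberately ignores parentheses inside character classes such as [(].

```php
<?php
/**
 * Hypothetical helper (not part of ando-php): reports whether the
 * unescaped parentheses of a pattern are balanced.
 */
function parentheses_are_balanced($pattern)
{
    $depth  = 0;
    $length = strlen($pattern);
    for ($i = 0; $i < $length; $i++) {
        $char = $pattern[$i];
        if ($char === '\\') {
            $i++; // skip the escaped character
            continue;
        }
        if ($char === '(') {
            $depth++;
        } elseif ($char === ')') {
            $depth--;
            if ($depth < 0) {
                return false; // closing parenthesis without an opening one
            }
        }
    }
    return $depth === 0;
}

var_dump(parentheses_are_balanced('(a(b))c'));
var_dump(parentheses_are_balanced('(a(b)'));
```

A single pass like this is linear in the length of the subject, so the cost mostly comes from how many times the scan is repeated.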
I call partial interpolations those where not all variables of a template are interpolated at once.
I call recursive interpolations those where some variables of a template may themselves be templates with more variables, and so on. The interpolation provides a selection of the variables at any level (all variables with the same name are the same variable), and the substitution is carried on until all provided variables have been interpolated.
This feature will make it a no-brainer to interpolate variables starting from high-level templates and providing a flat list of variable definitions.
I wonder if it makes sense to preserve in the Regex all previously interpolated variables, so that when new variables are defined using templates with old variables, interpolations could take place immediately and automatically.
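A rough sketch of how partial and recursive interpolation could work together. Everything here is an assumption: the function interpolate_recursively is hypothetical and not part of the Regex class, variables are written $name as in the template examples, and strtr() is used for the substitution, so a defined name that is a prefix of an undefined one (like $year inside $yearly) would be replaced too.

```php
<?php
/**
 * Hypothetical helper: replaces $name placeholders from a flat list of
 * definitions, repeating until a pass changes nothing. Variables that are
 * not provided are left in place (partial interpolation).
 */
function interpolate_recursively($template, array $variables, $max_passes = 20)
{
    $replacements = array();
    foreach ($variables as $name => $value) {
        $replacements['$' . $name] = $value;
    }
    for ($pass = 0; $pass < $max_passes; $pass++) {
        $result = strtr($template, $replacements);
        if ($result === $template) {
            return $result; // nothing left to substitute
        }
        $template = $result;
    }
    throw new RuntimeException('Too many passes: circular definitions?');
}

$variables = array(
    'date'  => '$year-$month',
    'year'  => '\d{4}',
    'month' => '(?:0[1-9]|1[0-2])',
);

// Recursive: $date expands to $year-$month, which expands again.
$full = interpolate_recursively('^$date$', $variables);

// Partial: only $date is provided, so $year and $month survive.
$partial = interpolate_recursively('^$date$', array('date' => '$year-$month'));

echo $full, "\n", $partial, "\n";
```

The pass limit is there only to stop circular definitions from looping forever.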
Balanced parentheses are a requirement in Regex::count_matches(). Exceptions are currently thrown if the pattern to count contains duplicate numbers and its parentheses are not balanced. That means that all expressions should have balanced parentheses, even if they contain variables and even if they are contained in variables.
I don't think it's worth going through the pain of relaxing this requirement.
Comments are like
(?# ... )
# ... \n (only when PCRE_EXTENDED is set and '#' is neither escaped nor inside [])
Defined at: http://php.net/manual/en/regexp.reference.comments.php
The sequence (?# marks the start of a comment which continues up to the next closing parenthesis. Nested parentheses are not permitted. The characters that make up a comment play no part in the pattern matching at all.
If the PCRE_EXTENDED option is set, an unescaped # character outside a character class introduces a comment that continues up to the next newline character in the pattern.
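Both comment styles can be seen in this sketch, which uses plain preg_match() with the x modifier (PCRE_EXTENDED). The date-like pattern and the subjects are made up for the example.

```php
<?php
// (?# ... ) comments work with or without the x modifier.
$inline = '/^\d{4}(?# year)-\d{2}(?# month)$/';

// With x, whitespace is ignored and an unescaped # starts a comment.
$extended = '/
    ^ \d{4}   # year
    -
    \d{2} $   # month
/x';

$inline_ok   = preg_match($inline, '2014-07');
$extended_ok = preg_match($extended, '2014-07');

// An unescaped # swallows the rest of the line, so this pattern is just ^a.
$comment_eats_rest = preg_match('/^a#b$/x', 'ac');

// Escaped or inside a character class, # is an ordinary character.
$escaped_hash  = preg_match('/^a\#b$/x', 'a#b');
$hash_in_class = preg_match('/^a[#]b$/x', 'a#b');
```

The third case is the one that matters when stripping comments: an unescaped # outside a character class makes everything up to the next newline disappear from the pattern.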
Currently only one string template is supported by Regex::pattern_quoted_string(). That is to be considered just an example of a very interesting usage of the Regex class, as a reusable template repository.
Non-capturing groups are like
(?:a)
(?i)
(?i:a)
Defined at: http://php.net/manual/en/regexp.reference.subpatterns.php
The fact that plain parentheses fulfill two functions is not always helpful. There are often times when a grouping subpattern is required without a capturing requirement. If an opening parenthesis is followed by "?:", the subpattern does not do any capturing, and is not counted when computing the number of any subsequent capturing subpatterns. For example, if the string "the white queen" is matched against the pattern the ((?:red|white) (king|queen)) the captured substrings are "white queen" and "queen", and are numbered 1 and 2. The maximum number of captured substrings is 65535.
As a convenient shorthand, if any option settings are required at the start of a non-capturing subpattern, the option letters may appear between the "?" and the ":". Thus the two patterns
(?i:saturday|sunday)
(?:(?i)saturday|sunday)
match exactly the same set of strings. Because alternative branches are tried from left to right, and options are not reset until the end of the subpattern is reached, an option setting in one branch does affect subsequent branches, so the above patterns match "SUNDAY" as well as "Saturday".
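The manual's examples can be checked with plain preg_match(), as in this sketch. The ^ and $ anchors on the weekday patterns are additions.

```php
<?php
// (?:red|white) groups without capturing, so it gets no number.
preg_match('/the ((?:red|white) (king|queen))/', 'the white queen', $captured);
print_r($captured);

$shorthand = '/^(?i:saturday|sunday)$/';
$longhand  = '/^(?:(?i)saturday|sunday)$/';

$results = array();
foreach (array('SUNDAY', 'Saturday', 'monday') as $subject) {
    $results[$subject] = array(
        preg_match($shorthand, $subject),
        preg_match($longhand, $subject),
    );
}
```

Group numbering skips the non-capturing group, which is why "white queen" is group 1 and "queen" is group 2.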
As it stands, the Regex class (in particular, though the same goes for the other classes) is still pretty immature and only allows for a very limited set of features. It's a good time to start documenting organically what is allowed and what is not, also taking into account the many TODOs / issues.
I call g-backreference one like \g1 to \g99, or \g{1} to \g{99} (absolute), and \g-1 to \g-99, or \g{-1} to \g{-99} (relative).
Defined at: http://php.net/manual/en/regexp.reference.back-references.php
As of PHP 5.2.2, the \g escape sequence can be used for absolute and relative referencing of subpatterns. This escape sequence must be followed by an unsigned number or a negative number, optionally enclosed in braces. The sequences \1, \g1 and \g{1} are synonymous with one another. The use of this pattern with an unsigned number can help remove the ambiguity inherent when using digits following a backslash. The sequence helps to distinguish back references from octal characters and also makes it easier to have a back reference followed by a literal number, e.g. \g{2}1.
The use of the \g sequence with a negative number signifies a relative reference. For example, (foo)(bar)\g{-1} would match the sequence "foobarbar" and (foo)(bar)\g{-2} matches "foobarfoo". This can be useful in long patterns as an alternative to keeping track of the number of subpatterns in order to reference a specific previous subpattern.
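A short sketch exercising the absolute, relative and braced forms with plain preg_match(). The anchors and the "aa-aa" subject are additions to the manual's examples.

```php
<?php
// Relative references count backwards from the position of the reference.
$last_group  = preg_match('/^(foo)(bar)\g{-1}$/', 'foobarbar');
$first_group = preg_match('/^(foo)(bar)\g{-2}$/', 'foobarfoo');

// Braces allow a literal digit right after the reference.
$followed_by_digit = preg_match('/^(foo)(bar)\g{2}1$/', 'foobarbar1');

// \1, \g1 and \g{1} are synonymous.
$synonyms = array();
foreach (array('\1', '\g1', '\g{1}') as $reference) {
    $pattern = '/^(a+)-' . $reference . '$/';
    $synonyms[$reference] = array(
        preg_match($pattern, 'aa-aa'),
        preg_match($pattern, 'aa-a'),
    );
}
```

Relative references keep working when a group is added earlier in the pattern, which is what makes them interesting for templates.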
Named backreferences are like
(?P=name)
\k<name>
\k'name'
\k{name}
\g{name}
Defined at: http://php.net/manual/en/regexp.reference.back-references.php
Back references to the named subpatterns can be achieved by (?P=name) or, since PHP 5.2.2, also by \k<name> or \k'name'. Additionally PHP 5.2.4 added support for \k{name} and \g{name}.
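All five forms can be compared side by side, as in this sketch based on plain preg_match(). The group name word and the subjects are invented for the example.

```php
<?php
$references = array(
    '(?P=word)',
    '\k<word>',
    "\\k'word'",
    '\k{word}',
    '\g{word}',
);

$results = array();
foreach ($references as $reference) {
    // Every pattern asks for the same word twice, separated by a space.
    $pattern = '/^(?<word>\w+) ' . $reference . '$/';
    $results[$reference] = array(
        preg_match($pattern, 'hello hello'),
        preg_match($pattern, 'hello world'),
    );
}

print_r($results);
```

Each form should accept "hello hello" and reject "hello world", since they all refer to the same named group.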