
Last-modified: 2008-03-30 (Sun) 09:18:59

Regular Expressions That Can Be Used

This page explains the regular expressions that can be used. It is difficult to verify every behavior, and behavior may change when the library is updated. In the end, please confirm the behavior yourself.

See: Regular Expression Library

Basic Elements

\       Escape
        Turns the special meaning of a regular expression metacharacter on or off. A metacharacter that follows \ is treated as an ordinary character. In addition, \ combined with a letter gives that letter a special meaning.
|       Alternation
        Logical OR between patterns.
(...)   Group
        Groups the pattern.
[...]   Character set (character class)
        Defines a character class.

Character Set (character class)

The following can be specified inside [...].

...         [ABC] matches any one of A, B, or C.
^...        Negation
            [^ABC] matches any single character other than A, B, and C.
x-y         Range
            [A-Z] matches a single character from "A" to "Z".
[...]       ("oni") Character set within a character set
..&&..      ("oni") Intersection
[:xxxxx:]   ("oni") POSIX bracket
[:^xxxxx:]  ("oni") POSIX bracket (negated)

Items marked ("oni") are available only in bregonig.dll.
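As a rough illustration of the basic character set forms above, the following sketch uses Python's `re` module as a stand-in engine. That choice is an assumption made for illustration only; the Sakura-Editor itself uses Bregexp.dll or bregonig.dll, and the ("oni") items are not shown because Python's `re` does not support them in the same form.

```python
import re

# Python's re module is only a stand-in engine here (assumption).
text = "CAB-42"

any_of = re.findall(r"[ABC]", text)     # any one of A, B, or C
none_of = re.findall(r"[^ABC]", text)   # any one character other than A, B, C
in_range = re.findall(r"[A-Z]", text)   # one character from "A" to "Z"

print(any_of)    # ['C', 'A', 'B']
print(none_of)   # ['-', '4', '2']
print(in_range)  # ['C', 'A', 'B']
```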

Quantifiers

Minimal match   Maximal match   Meaning
(lazy)          (greedy)
*?              *               The preceding pattern repeated 0 or more times.
+?              +               The preceding pattern repeated 1 or more times.
??              ?               The preceding pattern appears 0 times or once.
{n}?            {n}             The preceding pattern repeated exactly n times.
{n,}?           {n,}            The preceding pattern repeated n or more times.
{n,m}?          {n,m}           The preceding pattern repeated at least n and at most m times.

Let regular expression A be [A-Z_]*PROC and regular expression B be [A-Z_]*?PROC, and apply them to the following string.

  • SAKURA_COLLBACKPROC_BREXP_PROC

The first match in the string is as follows.

  • For A : SAKURA_COLLBACKPROC_BREXP_PROC
  • For B : SAKURA_COLLBACKPROC
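The difference between A and B can be reproduced with a short sketch. Python's `re` module is used here as a stand-in engine, which is an assumption; for these two patterns its greedy and lazy quantifiers behave as described above.

```python
import re

subject = "SAKURA_COLLBACKPROC_BREXP_PROC"

# A: greedy, [A-Z_]* takes as much as possible before the final PROC
greedy = re.search(r"[A-Z_]*PROC", subject).group()
# B: lazy, [A-Z_]*? stops at the first place where PROC can follow
lazy = re.search(r"[A-Z_]*?PROC", subject).group()

print(greedy)  # SAKURA_COLLBACKPROC_BREXP_PROC
print(lazy)    # SAKURA_COLLBACKPROC
```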

Character

\t        Horizontal tab (HT, TAB)
\n        Line feed (LF)
\r        Carriage return (CR)
\b        Backspace (BS). Valid only inside [ ].
\f        Form feed (FF)
\a        Bell (BEL)
\e        Escape (ESC)
\ooo      Character code specified in octal. (o is 1 to 3 digits.)
\xHH      Character code specified in hexadecimal. (H is 1 to 2 digits.)
\x{HHHH}  ("oni") Extended hexadecimal notation. (H is 1 to 4 digits.)
\c[       Control character. (The character after \c, here [, selects the control character.)
\Q        ("oni") Disables regular expression operators (metacharacters) up to \E.
\E        ("oni") Ends the disabling of regular expression operators started by \Q.

Items marked ("oni") are available only in bregonig.dll.

Character Types

.                   Any single character except \n. Same as [^\n].
\w                  Word character. Same as [0-9A-Za-z_].
                    ("oni") Double-byte characters are also included.
\W                  Any character other than a word character
\s                  Whitespace character
\S                  Any character other than whitespace
\d                  Decimal digit
\D                  Any character other than a decimal digit
\p{property-name}   ("oni") Character property
\p{^property-name}
\P{property-name}   ("oni") Character property (negated)

Items marked ("oni") are available only in bregonig.dll.

Position (Anchor)

^     Beginning of line
$     End of line
\b    Word boundary
      Inside [ ] it means backspace.
\B    Anywhere other than a word boundary
\A    Beginning of string
\Z    End of string (or immediately before a line feed at the very end of the string)
\z    ("oni") End of string
\G    Position where matching started

Items marked ("oni") are available only in bregonig.dll.
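A few of these anchors can be tried out with the sketch below. It uses Python's `re` module as a stand-in engine (an assumption). Note one difference: in Python, ^ and $ match at line boundaries only when `re.MULTILINE` is given, and its \Z does not follow the definition above, so \Z is left out.

```python
import re

text = "cat concat\ncat"

# ^ matches at the beginning of every line (re.MULTILINE is required in Python)
line_starts = re.findall(r"^cat", text, re.MULTILINE)
# \b...\b matches "cat" only as a whole word, not inside "concat"
whole_words = re.findall(r"\bcat\b", text)
# \B matches where there is no word boundary, i.e. inside "concat"
inside_word = re.findall(r"\Bcat", text)
# \A matches only at the beginning of the whole string
string_start = re.findall(r"\Acat", text, re.MULTILINE)

print(len(line_starts))   # 2
print(len(whole_words))   # 2
print(len(inside_word))   # 1
print(len(string_start))  # 1
```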

Backreferences and Subexpression Calls

\n            Backreference by number
              Refers to the string matched by a "()" group with \n (n is an integer of 1 or more).
\k<name>
\k'name'      ("oni") Backreference by name
\k<name+n>
\k<name-n>
\k'name+n'
\k'name-n'    ("oni") Backreference with nest level
\g<name>
\g'name'      ("oni") Subexpression call by name
\g<n>
\g'n'         ("oni") Subexpression call by number

Items marked ("oni") are available only in bregonig.dll.
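The numbered backreference can be illustrated with a minimal sketch that finds doubled words. Python's `re` module serves as a stand-in engine (an assumption); the named forms such as \k<name> and the \g<...> calls are not shown because Python spells them differently or does not support them.

```python
import re

text = "this is is a test test"

# (\w+) captures a word; \1 requires the same text to appear again
doubled = re.findall(r"\b(\w+) \1\b", text)

print(doubled)  # ['is', 'test']
```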

Extended Groups

(?#...)                    Comment.
(expression)               Capturing group.
(?:expression)             Non-capturing group. (Grouping only.)
(?<name>expression)
(?'name'expression)        ("oni") Named capturing group.
(?=expression)             Lookahead
(?!expression)             Negative lookahead
(?<=expression)            ("oni") Lookbehind.
(?<!expression)            ("oni") Negative lookbehind.
(?>expression)             ("oni") Atomic group.
(?imsx)                    Isolated option.
                           i: Ignore case (case-insensitive matching).
                           m: Multi-line. (On by default in the Sakura-Editor.)
                           s: Single line. (. also matches \n.)
                           x: Extended form. (Whitespace is ignored, and everything from # onward is ignored.)
(?imsx-imsx)               ("oni") Isolated option. (In bregonig.dll, options can also be turned off.)
(?imsx-imsx:expression)    ("oni") Option applied to the expression.

Items marked ("oni") are available only in bregonig.dll.
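The sketch below shows lookahead, negative lookahead, a non-capturing group, and the i option in action. Python's `re` module is a stand-in engine here (an assumption), and the sample text is made up for illustration.

```python
import re

text = "Price: 100 yen, 250 USD"

# (?=...) lookahead: digits followed by " yen"; the suffix is not part of the match
yen_amounts = re.findall(r"\d+(?= yen)", text)
# (?!...) negative lookahead: whole numbers NOT followed by " yen"
other_amounts = re.findall(r"\b\d+\b(?! yen)", text)
# (?i) turns on case-insensitive matching; (?:...) groups without capturing
units = re.findall(r"(?i)(?:usd|yen)", text)

print(yen_amounts)    # ['100']
print(other_amounts)  # ['250']
print(units)          # ['yen', 'USD']
```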

References That Can Be Used in Replacement

$n             Reference by number.
               Refers to the string matched by a group with "$n" (n is an integer of 1 or more). In the Sakura-Editor, "\n" can be used instead of "$n".
${n}           ("oni") (Safe) reference by number.
               A digit can be written immediately after it.
$&             The entire matched string.
$+             ("oni") The last matched substring.
$+{name}
$-{name}[n]    ("oni") Reference by name. (Perl 5.10 compatible)
\k<name>
\k'name'       ("oni") Reference by name. (Oniguruma style)
${name}        ("oni") Reference by name. (Not recommended: it is a proprietary extension and a tentative specification.)

Items marked ("oni") are available only in bregonig.dll.
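The idea behind numbered references, and why a "safe" form such as ${n} is useful, can be sketched as follows. Python's `re` module is a stand-in engine (an assumption) and its replacement syntax differs: it writes \1 where this page writes $1, \g<1> plays a role comparable to ${1}, and \g<0> corresponds to $&.

```python
import re

# \1 and \2 refer to the first and second groups (like $1 and $2)
swapped = re.sub(r"(\w+)=(\w+)", r"\2=\1", "key=value")

# \g<1> is unambiguous when a digit follows directly (comparable to ${1}0);
# a plain \10 would be read as a reference to group 10
padded = re.sub(r"(\d+)", r"\g<1>0", "7 apples")

# \g<0> is the entire matched string (comparable to $&)
bracketed = re.sub(r"\w+", r"[\g<0>]", "a bc")

print(swapped)    # value=key
print(padded)     # 70 apples
print(bracketed)  # [a] [bc]
```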

Differences Caused by the Change to bregonig.dll

  • \w includes double-byte characters.
    \w includes double-byte characters in addition to [A-Za-z0-9_]. The behavior of \W, \b, and \B changes accordingly.
  • A [ written inside [ ] must be escaped.
    In exchange for being able to use nested character sets and intersection inside a character set, a [ written inside [ ] must now be written as \[.
  • \c\ behaves differently.
    In Bregexp.dll, \c\ means Ctrl+\, but in bregonig.dll the \ that follows \c is interpreted as an escape. To specify Ctrl+\, write \c\\. (Subject to change because it differs from the Perl specification.)
  • \ooo behaves slightly differently in replacement.
    It has been brought closer to the behavior of Perl.

Searching for Line Breaks

To search for a line break (CRLF) in the Sakura-Editor
Search with "\r\n".

To search for a line break (any of CR, LF, and CRLF)
Specify something like "[\r\n]+".
To search for a line break (CR, LF, CRLF, LFCR) and the end of the last line
Specify "$".
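The first two patterns can be compared with a small sketch. Python's `re` module is a stand-in engine here (an assumption), and the sample text mixes all three line-break styles on purpose. Because of the +, "[\r\n]+" treats a run of consecutive line breaks as a single match.

```python
import re

text = "one\r\ntwo\nthree\rfour"

crlf_only = re.findall(r"\r\n", text)     # CRLF only
any_break = re.findall(r"[\r\n]+", text)  # CR, LF, or CRLF
lines = re.split(r"[\r\n]+", text)        # split on any line break

print(crlf_only)  # ['\r\n']
print(any_break)  # ['\r\n', '\n', '\r']
print(lines)      # ['one', 'two', 'three', 'four']
```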


Hint: When a regular expression is used in search, replace, or Grep, there is no need to escape "/" or to enclose the pattern in "/".

More Information:
In search, replace, and Grep, the string passed to Bregexp is built as follows, where [0xFF] stands for \xff.
The string passed when searching is "m[0xFF]Pattern[0xFF]km".
The string passed when replacing is "s[0xFF]PatternBefore[0xFF]PatternAfter[0xFF]km".
In addition, "i" is appended at the end when upper and lower case are not distinguished.
(Even though the m option is set, a search that spans line breaks cannot be performed.)
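The string format described above can be sketched as two small helpers. The function names `build_search` and `build_replace` and the `extra_flags` parameter are hypothetical, introduced only for illustration; they are not part of the Sakura-Editor or of Bregexp. The sketch also shows why "/" needs no escaping: the delimiter is \xff, not "/".

```python
SEP = "\xff"  # the [0xFF] delimiter described above

def build_search(pattern: str, extra_flags: str = "") -> str:
    """Hypothetical helper: builds m[0xFF]Pattern[0xFF]km plus optional flags."""
    return f"m{SEP}{pattern}{SEP}km{extra_flags}"

def build_replace(before: str, after: str, extra_flags: str = "") -> str:
    """Hypothetical helper: builds s[0xFF]Before[0xFF]After[0xFF]km plus optional flags."""
    return f"s{SEP}{before}{SEP}{after}{SEP}km{extra_flags}"

# "/" appears unescaped because it is not the delimiter
print(repr(build_search("a/b")))             # 'mÿa/bÿkm'
print(repr(build_replace("a/b", "c", "i")))  # 'sÿa/bÿcÿkmi'
```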