Scripting Regex » History » Version 34
Per Amundsen, 01/16/2017 07:11 AM
{{>toc}}

h1. Regular Expressions

AdiIRC uses the .NET regular expression engine, while mIRC uses the PCRE engine. AdiIRC converts some PCRE constructs to their .NET equivalents, but others cannot be converted.

Exploring ways to use PCRE in .NET is on the TODO list.

Read more about .NET regular expressions:

http://regexhero.net/reference/
https://msdn.microsoft.com/en-us/library/hs600312%28v=vs.110%29.aspx
https://msdn.microsoft.com/en-us/library/az24scfc%28v=vs.110%29.aspx
http://www.regular-expressions.info/dotnet.html

h1. Modifiers

.NET does not use the /pattern/modifiers syntax, but AdiIRC tries to interpret the modifiers and removes them from the pattern.

/g /G - Enables global matching.
/i /I - Enables case-insensitive matching.
/S - Strips any [[Formatting_Text|control codes]] before matching ([[$hfind]] will ignore this).
/s - Enables single-line matching.
/m /M /c /C - Enables multi-line matching.
/x /X - Eliminates unescaped white space from the pattern.
/U - Enables non-greedy mode. (Tries to replace greedy quantifiers with non-greedy ones: + -> +?, * -> *?)
<notextile>/u - Enables UTF8 instead of ASCII regular expression.</notextile>

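As a rough illustration of the modifier handling described above, the following Python sketch splits a /pattern/modifiers string into its pattern and modifier letters and applies a naive version of the /U rewrite. It is not AdiIRC's actual implementation: the names @split_modifiers@ and @make_lazy@ are hypothetical, and @make_lazy@ deliberately ignores character classes and possessive quantifiers.

```python
import re

# Modifier letters taken from the list above.
KNOWN_MODIFIERS = set("gGiISsmMcCxXUu")


def split_modifiers(expr):
    """Split '/pattern/modifiers' into (pattern, modifiers).

    Returns (expr, '') unchanged when expr is not wrapped in slashes
    or when the trailing part contains unknown modifier letters.
    """
    if len(expr) >= 2 and expr.startswith("/"):
        end = expr.rfind("/")
        modifiers = expr[end + 1:]
        if end > 0 and all(ch in KNOWN_MODIFIERS for ch in modifiers):
            return expr[1:end], modifiers
    return expr, ""


def make_lazy(pattern):
    """Naive /U rewrite: '+' -> '+?' and '*' -> '*?'.

    Escaped characters are kept as they are, and quantifiers that are
    already followed by '?' are left alone.
    """
    def repl(match):
        if match.group(1):  # an escape such as \+ or \\
            return match.group(1)
        return match.group(2) + "?"

    return re.sub(r"(\\.)|([+*])(?!\?)", repl, pattern)


pattern, modifiers = split_modifiers("/a+b*/iU")
if "U" in modifiers:
    pattern = make_lazy(pattern)
print(pattern, modifiers)
```

With the sample input, the pattern becomes @a+?b*?@ and the modifiers are @iU@. Input without the surrounding slashes is returned untouched with an empty modifier string.
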
h1. Differences between .NET and PCRE

AdiIRC translates some patterns from PCRE into .NET patterns.

<notextile>(*UTF8)/(*UTF) -> Enables UTF8 instead of ASCII regular expression.</notextile>
<notextile>(?R) -> .*</notextile>
<notextile>(?2) -> .*</notextile>
<notextile>(?1) -> .*</notextile>
<notextile>++ -> +</notextile>
<notextile>[:alnum:] -> a-zA-Z0-9</notextile>
<notextile>[:alpha:] -> a-zA-Z</notextile>
<notextile>[:ascii:] -> \x00-\x7F</notextile>
<notextile>[:blank:] -> \s\t</notextile>
<notextile>[:cntrl:] -> \x00-\x1F\x7F</notextile>
<notextile>[:digit:] -> 0-9</notextile>
<notextile>[:graph:] -> \x21-\x7E</notextile>
<notextile>[:lower:] -> a-z</notextile>
<notextile>[:print:] -> \x20-\x7E</notextile>
<notextile>[:punct:] -> !"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~</notextile>
<notextile>[:space:] -> \s\t\r\n\v\f</notextile>
<notextile>[:upper:] -> A-Z</notextile>
<notextile>[:word:] -> A-Za-z0-9_</notextile>
<notextile>[:xdigit:] -> A-Fa-f0-9</notextile>
<notextile>\cc -> \x003</notextile>
<notextile>\co -> \x00F</notextile>
<notextile>\cb -> \x002</notextile>
<notextile>\x\{([A-Fa-f0-9]{1,4})\} -> \uXXXX</notextile>
<notextile>\Q \E tries to escape all characters in between</notextile>

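The table above amounts to a set of textual substitutions applied to the pattern before it reaches the .NET engine. The Python sketch below shows how two of them could be expressed, using a subset of the POSIX class mappings copied from the list. It only illustrates the idea and is not how AdiIRC implements the translation; the names @POSIX_CLASSES@, @translate_posix@ and @translate_hex_escapes@ are hypothetical, and the zero padding to four hex digits is an assumption based on .NET's @\u@ escape requiring exactly four digits.

```python
import re

# Subset of the POSIX class table above; values copied from the list.
POSIX_CLASSES = {
    "alnum": "a-zA-Z0-9",
    "alpha": "a-zA-Z",
    "digit": "0-9",
    "lower": "a-z",
    "upper": "A-Z",
    "word": "A-Za-z0-9_",
    "xdigit": "A-Fa-f0-9",
}


def translate_posix(pattern):
    """Replace [:name:] tokens with plain character ranges.

    Unknown class names are left untouched.
    """
    def repl(match):
        return POSIX_CLASSES.get(match.group(1), match.group(0))

    return re.sub(r"\[:([a-z]+):\]", repl, pattern)


def translate_hex_escapes(pattern):
    """Rewrite PCRE-style \\x{H..} escapes (1 to 4 hex digits) as \\uXXXX."""
    def repl(match):
        return "\\u" + match.group(1).zfill(4)

    return re.sub(r"\\x\{([A-Fa-f0-9]{1,4})\}", repl, pattern)


print(translate_posix("[[:alpha:][:digit:]]+"))
print(translate_hex_escapes("\\x{e9}"))
```

Because the POSIX tokens are replaced with bare ranges, they still need the enclosing brackets from the original pattern, so @[[:alpha:][:digit:]]+@ becomes @[a-zA-Z0-9]+@.
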
\K is not available in .NET, use (?<=abc)d instead.

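To show what the lookbehind replacement does, here is a small Python sketch. Python's @re@ module is used only as a stand-in because it accepts the same @(?<=...)@ syntax for fixed-width lookbehinds; the sample strings are made up for illustration.

```python
import re

# PCRE form:        abc\Kd      (requires "abc" first, reports only "d")
# Lookbehind form:  (?<=abc)d   (same reported match, no \K needed)
match = re.search(r"(?<=abc)d", "xxabcd")
print(match.group(0), match.start())

# Without the "abc" prefix there is no match at all.
no_match = re.search(r"(?<=abc)d", "abxd")
print(no_match)
```

In both forms only the @d@ is part of the reported match, while the @abc@ prefix must be present but is not consumed.
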
These are not available and have no .NET counterpart:

code (?{…})
recursive (R), (R1), (R&name)
define (DEFINE).

List of differences between .NET and PCRE: https://stackoverflow.com/questions/3417644/translate-perl-regular-expressions-to-net