Regular Expressions
Follow this lesson in Ullman Chapter 13. The scripts are located in the 13 directory.
If you aren’t familiar with using regular expressions, see my Shell Scripting section on Regular Expressions.
Have you used grep? If you have, you’re going to recognize this syntax. If you haven’t, see my Linux Fundamentals section on grep.
However, WATCH OUT if you’re a grep user! Some of the characters are different here!
Use it like this:
$pattern = "Flintstones";
$string = "Flintstones! Meet the Flintstones!";
ereg ($pattern, $string);
This makes a lot more sense when you think in terms of trying to find which member of an array matches a pattern, or which line of a file matches the pattern.
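Here is a minimal sketch of the "which member of an array matches" idea. One assumption to flag loudly: ereg() was deprecated in PHP 5.3 and removed in PHP 7, so this sketch uses preg_match() (listed in the function table) so that it runs on a current PHP. The sample array is made up for illustration and stands in for the lines of a file.

```php
<?php
// Hypothetical sample data: each element stands in for one line of a file.
$lines = [
    'Flintstones! Meet the Flintstones!',
    'Meet George Jetson',
    'They are the modern stone-age family',
];

// PCRE patterns need delimiters; slashes are used here.
$pattern = '/Flintstones/';

$matches = [];
foreach ($lines as $index => $line) {
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    if (preg_match($pattern, $line) === 1) {
        $matches[$index] = $line;
    }
}

// Show the matching elements, keyed by their position in the array.
print_r($matches);
```

Only the first element contains the pattern, so only it ends up in $matches. With a real file you could build $lines with file() instead of typing the array by hand.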
PHP’s POSIX Extended Functions
Function | Purpose | Syntax |
ereg() | Match a pattern in a string | ereg ('pattern', 'string'); |
eregi() | Same, case-insensitive | eregi ('pattern', 'string'); |
ereg_replace() | Match and replace a pattern in a string | ereg_replace ('pattern', 'replacement', 'string') |
eregi_replace() | Same, case-insensitive | eregi_replace ('pattern', 'replacement', 'string') |
split() | Split a string into an array at each match of the pattern, up to an optional limit on the number of pieces | split ('pattern', 'string' [, limit]) |
spliti() | Same, case-insensitive | spliti ('pattern', 'string' [, limit]) |
preg_match() | Match a pattern in a string using Perl-compatible (PCRE) syntax; the pattern must be wrapped in delimiters | preg_match ('/pattern/', 'string'); |
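The splitting and case-insensitive rows of the table can be sketched as follows. This is an illustration under an assumption, not the chapter's own script: because split(), spliti() and eregi() no longer exist in PHP 7 and later, the sketch uses preg_split() and preg_match() with the i modifier as stand-ins. The sample strings are made up.

```php
<?php
// Hypothetical sample string for illustration.
$cast = 'Fred, Wilma, Pebbles, Dino';

// Split at every comma (plus any white space after it).
$all = preg_split('/,\s*/', $cast);

// The optional third argument caps the number of pieces returned;
// the last piece holds whatever is left of the string.
$firstAndRest = preg_split('/,\s*/', $cast, 2);

// The i modifier after the closing delimiter makes matching
// case-insensitive, which is the job the "i" variants did.
$names = preg_split('/ and /i', 'Fred AND Wilma and Barney');
$found = preg_match('/flintstones/i', 'Meet the Flintstones');

print_r($all);
print_r($firstAndRest);
print_r($names);
var_dump($found);
```

Notice that the limit counts pieces, not splits: a limit of 2 gives you the first name and then the rest of the string untouched.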
Literals match themselves:
ereg ('Flintstones', 'Meet the Flintstones');
will return a true value (the integer 1).
Beyond literals, metacharacters have special meanings:
.  Matches any single character.
*  Matches zero or more instances of the immediately preceding character. Example: C* would match C, CC or CCC, not to mention an empty string!
+  Matches one or more instances of the immediately preceding character. Example: C+ would match C, CC or CCC.
?  Matches zero or one instance of the immediately preceding character. Example: C? would match C or an empty string.
( )  Group
|  “Or”: (mouse|cat|dog)
^  Represents the beginning of the string, so ^T matches any string starting with a T.
$  Represents the end of the string, so \.$ matches any string that ends with a period.
\  The escape character: it means to take the next character literally, so you can search for characters like * that have special meanings: \*
{x}  Exactly x occurrences of the preceding character or expression
{x,y}  Between x and y occurrences of the preceding character or expression
{x,}  At least x occurrences of the preceding character or expression
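The metacharacters above can be tried out with a small table of pattern, test string, and expected result. This is only a sketch: it assumes preg_match() as a runnable stand-in for ereg() (which is gone from PHP 7 and later), and all the test strings are made up. The ^ and $ anchors are added to most patterns so the whole string has to match, which makes the quantifiers easier to see.

```php
<?php
// Each row: pattern, subject string, whether we expect a match.
$examples = [
    ['/^T/',                'The Flintstones',      true],  // ^ anchors at the start
    ['/^T/',                'Meet the Flintstones', false],
    ['/\.$/',               'Yabba dabba doo.',     true],  // \. is a literal period, $ anchors at the end
    ['/^C*$/',              '',                     true],  // * allows zero occurrences
    ['/^C+$/',              '',                     false], // + needs at least one
    ['/^C?$/',              'CC',                   false], // ? allows at most one
    ['/^(mouse|cat|dog)$/', 'cat',                  true],  // grouping with "or"
    ['/^a{2,3}$/',          'aaaa',                 false], // between 2 and 3 only
    ['/^a{2,}$/',           'aaaa',                 true],  // at least 2
];

$results = [];
foreach ($examples as [$pattern, $subject, $expected]) {
    $matched = (preg_match($pattern, $subject) === 1);
    $results[] = $matched;
    printf("%-22s %-24s %s\n", $pattern, var_export($subject, true),
        $matched ? 'match' : 'no match');
}
```

Single-quoted PHP strings are used on purpose: inside them the backslash and the dollar sign reach the regular expression engine unchanged.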
A character class is simply a term for a list of literals grouped in square brackets. For instance:
[HhJ]ello matches lines containing hello or Hello or Jello.
Use the ^ character as the first character inside the square brackets to indicate negation:
[^a] is “NOT a”.
The $ character and the . character are NOT special inside character classes (inside [ ] characters); there they match a literal $ or a literal period.
Ranges of characters are also permitted:
[0-3] is the same as [0123]
[a-k] is the same as [abcdefghijk]
[A-C] is the same as [ABC]
[A-Ca-k] is the same as [ABCabcdefghijk]
[ \f\r\t\n\v] matches any white space
There are also some alternate forms:
[[:alpha:]] is the same as [a-zA-Z]
[[:upper:]] is the same as [A-Z]
[[:lower:]] is the same as [a-z]
[[:digit:]] is the same as [0-9]
[[:alnum:]] is the same as [0-9a-zA-Z]
[[:space:]] matches any white space
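The character class rules can be checked with a few one-line tests. As before, this is a sketch rather than the chapter's script: preg_match() stands in for ereg(), the helper name matches() is made up for this example, and so are the test strings. The bracketed forms such as [[:digit:]] are also accepted by the PCRE functions.

```php
<?php
// Small helper (hypothetical name) so each check reads as one line.
function matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

$checks = [
    'list hit'       => matches('/[HhJ]ello/', 'Jello'),        // J is in the list
    'list miss'      => matches('/[HhJ]ello/', 'Cello'),        // C is not
    'negation hit'   => matches('/^[^a]/', 'banana'),           // first char is NOT a
    'negation miss'  => matches('/^[^a]/', 'apple'),
    'range hit'      => matches('/^[A-Ca-k]+$/', 'Back'),       // every char is in A-C or a-k
    'range miss'     => matches('/^[A-Ca-k]+$/', 'Bam'),        // m is outside a-k
    'digit hit'      => matches('/^[[:digit:]]{3}$/', '123'),
    'digit miss'     => matches('/^[[:digit:]]{3}$/', '12a'),
    'literal $ hit'  => matches('/[.$]/', '$5'),                // . and $ are literal in [ ]
    'literal $ miss' => matches('/[.$]/', 'cost'),
];

var_dump($checks);
```

The last two checks show that . and $ lose their special meaning inside square brackets: the class matches only a real period or dollar sign.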
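To match and replace, do it like this:
$pattern = "Flintstones";
$replacement = "Jetsons";
$string = "Flintstones! Meet the Flintstones!";
$result = eregi_replace ($pattern, $replacement, $string);
eregi_replace() returns the new string and leaves $string unchanged, so assign the result to a variable.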
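On PHP 7 and later eregi_replace() is no longer available, so here is a minimal sketch of the same replacement using preg_replace(). Treat the swap as an assumption on my part rather than the chapter's method; the i modifier supplies the case-insensitive behavior.

```php
<?php
// Lowercase on purpose, to show that the i modifier ignores case.
$pattern = '/flintstones/i';
$replacement = 'Jetsons';
$string = 'Flintstones! Meet the Flintstones!';

// preg_replace() returns the new string; $string itself is not modified.
$result = preg_replace($pattern, $replacement, $string);

echo $result, "\n";
echo $string, "\n";
```

Both occurrences are replaced, because preg_replace() replaces every match unless you pass a limit as a fourth argument.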
Review Chapter 13 of Ullman.