Sunday, November 23, 2008

Regular Expression

Many of us have written applications that use regular expressions for different tasks, like validation, parsing and other related work. For example, in a1 of 207 we had a regular expression to read the user's input for the next move. After learning regular expressions in csc236, I can understand how that code works. Regular expressions are quite a powerful tool and are available in most programming languages. I believe that most of the time, people like us use this tool to search for specific words in a long paragraph.

In csc236, it seems that when we talk about regular expressions we only focus on binary strings, which contain only 0 and 1. The more interesting part of regular expressions is dealing with strings other than binary strings, like patterns that contain words and numbers, or even characters written with a backslash.
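To make the contrast concrete, here is a small sketch in Python (my choice for illustration, not the language of either course). The pattern names and sample strings are made up; the first pattern is the kind of binary-string expression we see in csc236, and the second mixes a word and a number.

```python
import re

# Binary strings that end in 1, written the textbook way: (0|1)*1
ends_in_one = re.compile(r"(0|1)*1")

# Hypothetical non-binary pattern: a lowercase word, a space, then a number
word_then_number = re.compile(r"[a-z]+ [0-9]+")

# fullmatch succeeds only if the whole string fits the pattern
print(ends_in_one.fullmatch("01101") is not None)
print(ends_in_one.fullmatch("0110") is not None)
print(word_then_number.fullmatch("move 12") is not None)
```

The first and third checks succeed, while "0110" is rejected because it ends in 0.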

Usually in a regular expression, ‘\’ is used as the escape character, to escape meta characters. Regular expressions also support a construct called ‘character classes’, which can be roughly taken as sets of characters. There are some pre-defined character classes like \d (digits), \w (word characters) and \s (whitespace), and if you need a more customized version you can define your own using the square brackets notation "[]", which represents a set that matches any one of the characters inside.
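Here is a rough sketch of those pieces using Python's re module. The sample text and variable names are invented for the example.

```python
import re

text = "Room 42, seat B7: price 3.50"

digits = re.findall(r"\d+", text)        # runs of digits
words = re.findall(r"\w+", text)         # runs of letters, digits, underscore
spaces = re.findall(r"\s", text)         # single whitespace characters
custom = re.findall(r"[A-Z][0-9]", text) # custom sets: one capital, one digit
price = re.findall(r"\d+\.\d+", text)    # \. escapes the dot meta character

print(digits)   # ['42', '7', '3', '50']
print(custom)   # ['B7']
print(price)    # ['3.50']
```

Without the backslash, the dot in the last pattern would match any character, not just a literal ".".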

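The world inside the square brackets is much different from the one outside. Inside the brackets, very few characters are meta: ‘^’ (only as the first character, where it negates the set), ‘-’ (between two characters, where it forms a range), and the closing ‘]’; even an opening bracket ‘[’, asterisk ‘*’ and plus sign ‘+’ are not considered meta inside []. Furthermore, in POSIX-style regular expressions [] has no escape sequence within it (some flavors, such as Python's re module, do accept a backslash there). So what do you do if you want to have ‘^’, ‘-’ and ‘]’ inside a character class?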
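One common answer is to rely on position: put ‘]’ first, keep ‘^’ away from the first spot, and put ‘-’ last. The sketch below tries this in Python's re module, which follows the same placement rules; other regex flavors may behave differently, and the names and sample string are made up.

```python
import re

sample = "a]b^c-d[e*f+"

# ']' first, '^' not first, '-' last: all three are taken literally
tricky = re.compile(r"[]^-]")

# Python's re also accepts backslash escapes inside the brackets
escaped = re.compile(r"[\]\^\-]")

# '*', '+' and '[' need no special treatment inside a set
plain = re.compile(r"[*+[]")

print(tricky.findall(sample))   # [']', '^', '-']
print(escaped.findall(sample))  # [']', '^', '-']
print(plain.findall(sample))    # ['[', '*', '+']
```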
