Python - Regular Expression (called REs, or regexes, or regex patterns)


Regular expression in Python

Python's re module provides regular expression matching operations similar to those found in Perl.


  • The backslash character ('\') indicates special forms or allows special characters to be used without invoking their special meaning.
  • For example, to match a literal backslash, one might have to write '\\\\' as the pattern string, because the regular expression must be \\, and each backslash must be expressed as \\ inside a regular Python string literal.
  • Backslashes are not handled in any special way in a string literal prefixed with 'r'. So r"\n" is a two-character string containing '\' and 'n', while "\n" is a one-character string containing a newline.
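The backslash rules above can be checked with a short sketch. The sample text `C:\temp` and the variable names are illustrative assumptions, not part of the original page.

```python
import re

# "\n" is one character (a newline); r"\n" is two characters: '\' and 'n'.
newline = "\n"
raw = r"\n"
print(len(newline), len(raw))

# To match a literal backslash, the regular expression itself must be \\.
pattern_plain = "\\\\"  # regular literal: four backslashes typed, two in the string
pattern_raw = r"\\"     # raw literal: two backslashes typed, two in the string
print(pattern_plain == pattern_raw)

# Hypothetical sample text containing a single backslash.
path = r"C:\temp"
match = re.search(pattern_raw, path)
print(match.start(), repr(match.group(0)))
```

Writing patterns as raw strings keeps the text of the literal identical to the regular expression, which is why it is the usual choice for anything containing a backslash.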



Web Log Parsing

Log - Apache Common Log Format

'^(\S+) (\S+) (\S+) \[([\w:/]+\s[+\-]\d{4})\] "(\S+) (\S+)\s*(\S*)" (\d{3}) (\S+)'

- - [24/Jul/1973:08:32:01 -0400] "GET /images/gerardnico.gif HTTP/1.0" 200 2564
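As a minimal sketch, the pattern can be applied with `re.search` and its nine groups mapped to named fields. The field names, the `parse_log_line` helper, and the host `127.0.0.1` are assumptions made for illustration. The pattern requires a leading host field, so a placeholder host is prepended to the sample line here.

```python
import re

# Pattern from the text, written as a raw string so backslashes stay literal.
APACHE_ACCESS_LOG_PATTERN = (
    r'^(\S+) (\S+) (\S+) \[([\w:/]+\s[+\-]\d{4})\] '
    r'"(\S+) (\S+)\s*(\S*)" (\d{3}) (\S+)'
)

# Hypothetical names for the nine capture groups, in order.
FIELDS = ("host", "client_id", "user_id", "timestamp",
          "method", "endpoint", "protocol", "status", "size")


def parse_log_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = re.search(APACHE_ACCESS_LOG_PATTERN, line)
    if m is None:
        return None
    return dict(zip(FIELDS, m.groups()))


# Sample line with an assumed placeholder host (127.0.0.1).
sample = ('127.0.0.1 - - [24/Jul/1973:08:32:01 -0400] '
          '"GET /images/gerardnico.gif HTTP/1.0" 200 2564')
record = parse_log_line(sample)
print(record)
```

All captured values are strings. Converting the status or size to integers is left to the caller, since the size field may be `-` when no bytes were sent.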

Word tokenization


Documentation / Reference
