About
Regular expression in the Oracle database context
The escape character for the pattern is backslash (\).
Articles Related
Functions
REGEXP_REPLACE (Extraction, Replace)
SELECT
REGEXP_REPLACE( '111.222.3333', '[[:digit:]]{3}\.[[:digit:]]{3}\.([[:digit:]]{4})', '\1' )
FROM
dual;
3333
REGEXP_INSTR
Search a string for a regular expression pattern and return the position of the substring.
- Nederland Postal Code : Expression return 1 if good, 0 in other case.
select REGEXP_INSTR(' 2343 LM ', '^([[:blank:]]*[0-9]{4}[[:blank:]]*[A-Z]{2}[[:blank:]]*)$') from dual
- Date verification
regexp_instr(StringDate,'^[0-9]{4}["01","02","03","04","05","06","07","08","09","10","11","12"]{2}[0-9]{2}$') <> 1
REGEXP_SUBSTR
apps/search/search.jsp search a string for a regular expression pattern and return the substring that match
- Return only the first text part of the description
select REGEXP_SUBSTR(description,'^[A-Z]+') from mytable
- Return the last Alphanumeric characters:
SELECT REGEXP_SUBSTR('12345^6a7b8v9','[[:alnum:]]+$') from dual
Return:
6a7b8v9
REGEXP_LIKE
apps/search/search.jsp searches a character column for a pattern. Use this function in the WHERE clause of a query to return rows matching a regular expression.
SELECT col FROM test WHERE REGEXP_LIKE(col,'r[[=e=]]sum[[=e=]]');
- Search a string that finish with a space
SELECT col FROM test WHERE REGEXP_LIKE(col,' $');
Character Range '[x-y]' in Regular Expressions and Diacritic
According to the POSIX standard, a range in a regular expression includes all collation elements between the start point and the end point of the range in the linguistic definition of the current locale. Therefore, ranges in regular expressions are meant to be linguistic ranges, not byte value ranges, because byte value ranges depend on the platform, and the end user should not be expected to know the ordering of the byte values of the characters. The semantics of the range expression must be independent of the character set. This implies that a range such as [a-d] includes all the letters between a and d plus all of those letters with diacritics, plus any special case collation element such as ch in Traditional Spanish that is sorted as one character.
Oracle interprets range expressions as specified by the NLS_SORT parameter to determine the collation elements covered by a given range. For example :
SQL> ALTER SESSION SET NLS_SORT = XWEST_EUROPEAN;
Session altered.
SQL> SELECT regexp_instr( 'Schröder', '^([A-Za-z]*)$') FROM dual;
REGEXP_INSTR('SCHRÖDER','^(A-ZA-Z]*)$')
-------------------------------------------------------------------
1
SQL> ALTER SESSION SET NLS_SORT = BINARY;
Session altered.
SQL> SELECT regexp_instr( 'Schröder', '^([A-Za-z]*)$') FROM dual;
REGEXP_INSTR('SCHRÖDER','^(A-ZA-Z]*)$')
-------------------------------------------------------------------
0
SQL> SELECT regexp_instr( 'Schröder', '^([[:upper:]]*[[:lower:]]*)$') FROM dual;
REGEXP_INSTR('SCHRÖDER','^([[:upper:]]*[[:lower:]]*)$')
-------------------------------------------------------------------
1