Oracle Database - Character Set

About

Oracle Database uses the database character set for:

  • Data stored in SQL character datatypes (CHAR, VARCHAR2, CLOB, and LONG).
  • Identifiers such as table names, column names, and PL/SQL variables.
  • Stored SQL and PL/SQL source code, including text literals embedded in this code.

Choose the database character set that matches the character set used by most clients connecting to the database.

Naming Convention

Naming convention for Oracle character set names:

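<region><number of bits used to represent a character><standard character set name>[S|C]

For example, WE8MSWIN1252 is the Western European 8-bit Microsoft Windows code page 1252. The optional S or C suffix distinguishes character sets that can be used only on the server (S) or only on the client (C).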
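As a rough illustration of this naming convention, the Python sketch below splits a character set name into its region, bit count, and standard name. The function name `split_charset_name` and the `REGIONS` table are hypothetical helpers introduced for this example, and the region table is only a small subset. The optional S|C suffix is intentionally left attached to the standard name, because a trailing letter can also be part of that name.

```python
import re

# Region prefixes for a few common names; an illustrative subset, not a complete list.
REGIONS = {
    "US": "United States",
    "WE": "Western European",
    "EE": "Eastern European",
    "JA": "Japanese",
    "AL": "All languages",
}

# <region><bits><standard name>; the optional S|C suffix is not split off here.
NAME_PATTERN = re.compile(r"([A-Z]+)(\d+)(.+)")


def split_charset_name(name):
    """Split an Oracle-style character set name into (region, bits, standard name)."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"not in <region><bits><name> form: {name!r}")
    region, bits, standard = match.groups()
    return region, int(bits), standard


if __name__ == "__main__":
    for name in ("WE8MSWIN1252", "EE8ISO8859P2", "AL32UTF8"):
        region, bits, standard = split_charset_name(name)
        print(name, "->", REGIONS.get(region, region), bits, standard)
```

Names that do not follow the pattern are rejected with a `ValueError` rather than guessed at.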

Character Set

AL32UTF8

Oracle recommends Unicode AL32UTF8 as the database character set. Unicode is the universal character set that supports most of the currently spoken languages of the world.

As AL32UTF8 is a multibyte character set, database operations on character data may be slightly slower when compared to single-byte database character sets, such as WE8MSWIN1252.

Storage space requirements for text in most languages that use characters outside of the ASCII repertoire are higher in AL32UTF8 than in the legacy character sets that support the language. Note that the increase in storage space concerns only character data and only data that is not in English. The universality and flexibility of Unicode usually outweigh these additional costs.
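The storage difference can be seen outside the database with a small Python sketch. It assumes that Python's `utf-8` codec is a reasonable stand-in for AL32UTF8 and `cp1252` for WE8MSWIN1252; the helper name `byte_lengths` is hypothetical. It only counts encoded bytes and says nothing about how Oracle lays out rows or blocks.

```python
# Assumption: Python codecs stand in for the Oracle character sets,
# "utf-8" for AL32UTF8 and "cp1252" for WE8MSWIN1252.
def byte_lengths(text):
    """Return the number of bytes needed to encode text in each codec."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "cp1252": len(text.encode("cp1252")),
    }


if __name__ == "__main__":
    # ASCII-only text has the same size in both; accented text grows in UTF-8.
    for sample in ("coffee", "café", "Größe"):
        print(sample, byte_lengths(sample))
```

ASCII-only text such as "coffee" takes the same number of bytes in both encodings, while each accented character costs one extra byte in UTF-8.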

Legacy character sets should be considered when compatibility, storage requirements, or text-processing performance is critical and the database will only ever support a single group of languages. In that case, select the character set used by most clients connecting to the database.

WE8MSWIN1252 (Windows)

For most languages, the default character set is one of the Microsoft Windows character sets, for example WE8MSWIN1252, even when the database is not installed on Windows.

This results from the assumption that most clients connecting to the database run under the Microsoft Windows operating system.

As the database should be able to store all characters coming from the clients, and Microsoft Windows character sets have a richer character repertoire than the corresponding ISO 8859 character sets, the Microsoft Windows character sets are usually the better choice.

For example, the EE8MSWIN1250 character set supports the Euro currency symbol and various smart quote characters, while the corresponding EE8ISO8859P2 character set does not support them. In any case, Oracle converts the data between the database character set and the client character sets, which are declared by the NLS_LANG settings.
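The repertoire difference can be checked with a short Python sketch. It assumes that Python's `cp1250` codec corresponds to EE8MSWIN1250 and `iso8859_2` to EE8ISO8859P2; the helper name `can_encode` is hypothetical. It tests the encodings themselves and does not reproduce Oracle's own conversion behaviour.

```python
# Assumption: Python codecs stand in for the Oracle character sets,
# "cp1250" for EE8MSWIN1250 and "iso8859_2" for EE8ISO8859P2.
def can_encode(char, codec):
    """Return True if the codec can represent the given character."""
    try:
        char.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    samples = (("euro sign", "\u20ac"), ("left smart quote", "\u201c"))
    for label, char in samples:
        print(
            label,
            "cp1250:", can_encode(char, "cp1250"),
            "iso8859_2:", can_encode(char, "iso8859_2"),
        )
```

Both characters encode in the Windows code page and fail in the ISO 8859-2 one.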

Installation Default

The default character set suggested or used by Oracle Universal Installer and Database Configuration Assistant in this release is based on the language configuration of the operating system.

How to

Get the list of Oracle character sets

select unique VALUE from V$NLS_VALID_VALUES where PARAMETER ='CHARACTERSET' and isdeprecated = 'FALSE';

See the documentation for the full description: Recommended Database Character set in the Oracle® Database Globalization Support Guide.

Get the ID, get the name

  • The NLS_CHARSET_ID function returns the character set ID number corresponding to a character set name.
select NLS_CHARSET_ID('WE8MSWIN1252') from dual;
  • The NLS_CHARSET_NAME function returns the name of the character set corresponding to a character set ID number.

Get the current database (national) character set

select * from nls_database_parameters where parameter like '%CHARACTERSET%';
PARAMETER                      VALUE
------------------------------ ------------
NLS_NCHAR_CHARACTERSET         AL16UTF16
NLS_CHARACTERSET               WE8ISO8859P1

Tools

Character set scanner

Character Set Scanner: See character set scanner utilities

Configuration

NLS_CHARACTERSET

NLS_CHARACTERSET is the localization (NLS) parameter that holds the character set (code page) of the Oracle Database.
