Oracle Database - Character Set

About

Oracle Database uses the database character set for:

Data stored in SQL character datatypes (CHAR, VARCHAR2, CLOB, and LONG).
Identifiers such as table names, column names, and PL/SQL variables.
Stored SQL and PL/SQL source code, including text literals embedded in this code.

When a legacy character set is chosen instead of Unicode, it should be the character set of most clients connecting to the database.

Naming Convention

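Oracle character set names follow the convention below. The optional S or C suffix marks a character set that can be used only on the server (S) or only on the client (C). For example, WE8MSWIN1252 is a Western European, 8-bit, Microsoft Windows code page 1252 character set.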

<region><number of bits used to represent a character><standard character set name>[S|C]
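As an illustration of the convention, here is a minimal sketch that splits a name into its region, bit count, and standard name. The helper `split_charset_name` and its regular expression are hypothetical, not part of any Oracle tool, and the sketch deliberately leaves the optional S/C suffix inside the standard name because a trailing S or C can also belong to the name itself.

```python
import re

# Hypothetical helper: split an Oracle character set name into the
# <region><bits><standard name> parts of the naming convention.
# The optional S/C suffix is not separated out, because a trailing S or C
# may simply belong to the standard name (JA16SJIS, for example).
_NAME_RE = re.compile(r"^([A-Z]+)(\d+)(.+)$")


def split_charset_name(name):
    """Return (region, bits, standard_name), or None if the name does not fit."""
    match = _NAME_RE.match(name)
    if match is None:
        return None  # the name does not follow the convention (UTF8, for example)
    region, bits, standard = match.groups()
    return region, int(bits), standard


for name in ("WE8MSWIN1252", "AL32UTF8", "EE8ISO8859P2", "UTF8"):
    print(name, "->", split_charset_name(name))
```

Not every Oracle name follows the convention, which is why the sketch returns None rather than raising an error for a name such as UTF8.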

Character Set

AL32UTF8

Oracle recommends Unicode AL32UTF8 as the database character set. Unicode is the universal character set that supports most of the currently spoken languages of the world.

As AL32UTF8 is a multibyte character set, database operations on character data may be slightly slower when compared to single-byte database character sets, such as WE8MSWIN1252.

Storage space requirements for text in most languages that use characters outside of the ASCII repertoire are higher in AL32UTF8 compared to legacy character sets supporting the language. Note that the increase in storage space concerns only character data and only data that is not in English. The universality and flexibility of Unicode usually outweigh these additional costs.
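The storage difference can be illustrated outside the database with a short sketch. It assumes that Python's "utf-8" and "cp1252" codecs are reasonable stand-ins for AL32UTF8 and WE8MSWIN1252. The helper `encoded_size` is hypothetical, and the sketch only counts encoded bytes, not actual Oracle block usage.

```python
# Assumption: Python codec names stand in for Oracle character sets here,
# "utf-8" for AL32UTF8 and "cp1252" for WE8MSWIN1252.
def encoded_size(text, codec):
    """Number of bytes needed to encode text with the given codec."""
    return len(text.encode(codec))


samples = ["invoice", "café", "10 €"]
for text in samples:
    utf8_bytes = encoded_size(text, "utf-8")
    win_bytes = encoded_size(text, "cp1252")
    print(f"{text!r}: utf-8={utf8_bytes} bytes, cp1252={win_bytes} bytes")
```

Pure ASCII text takes the same space in both encodings. Only the accented letter and the Euro sign take more bytes in UTF-8.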

Legacy character sets should be considered when compatibility, storage requirements, or performance of text processing is critical and the database will only ever support a single group of languages. The database character set to be selected in such a case is the character set of most clients connecting to this database.

WE8MSWIN1252 (Windows)

For most languages, the default character set is one of the Microsoft Windows character sets, for example WE8MSWIN1252, even when the database is not installed on Windows.

This results from the assumption that most clients connecting to the database run under the Microsoft Windows operating system.

As the database should be able to store all characters coming from the clients, and Microsoft Windows character sets have a richer character repertoire than the corresponding ISO 8859 character sets, the Microsoft Windows character sets are usually the better choice.

For example, the EE8MSWIN1250 character set supports the Euro currency symbol and various smart quote characters, while the corresponding EE8ISO8859P2 character set does not support them. In any case, Oracle converts the data between the database character set and the client character sets, which are declared by the NLS_LANG settings.
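The repertoire difference between the two Eastern European character sets can be checked with a small sketch. It assumes that Python's "cp1250" and "iso8859_2" codecs correspond to EE8MSWIN1250 and EE8ISO8859P2. The helper `can_encode` is hypothetical, and the sketch does not reproduce Oracle's own conversion behaviour.

```python
# Assumption: Python's "cp1250" and "iso8859_2" codecs stand in for
# Oracle's EE8MSWIN1250 and EE8ISO8859P2.
def can_encode(text, codec):
    """Return True if every character of text exists in the codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Euro sign, left and right smart double quotes, and the Polish letter l-stroke.
for char in ("\u20ac", "\u201c", "\u201d", "\u0142"):
    print(
        f"U+{ord(char):04X}:",
        "cp1250" if can_encode(char, "cp1250") else "-",
        "iso8859_2" if can_encode(char, "iso8859_2") else "-",
    )
```

The Euro sign and the smart quotes encode only with the Windows code page, while a regular Central European letter encodes with both.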

Installation Default

The default character set suggested or used by Oracle Universal Installer and Database Configuration Assistant in this release is based on the language configuration of the operating system.

How to

Get the list of Oracle character sets

select unique VALUE from V$NLS_VALID_VALUES where PARAMETER ='CHARACTERSET' and isdeprecated = 'FALSE';

For the full description, see Recommended Database Character Sets in the Oracle® Database Globalization Support Guide.

Get the ID or the name of a character set

The function NLS_CHARSET_ID returns the character set ID number corresponding to a character set name:

select NLS_CHARSET_ID('WE8MSWIN1252') from dual;

The function NLS_CHARSET_NAME returns the name of the character set corresponding to a specified character set ID number.

Get the current database (national) character set

select * from nls_database_parameters where parameter like '%CHARACTERSET%';

PARAMETER                      VALUE                                  
------------------------------ --------------
NLS_NCHAR_CHARACTERSET         AL16UTF16                                
NLS_CHARACTERSET               WE8ISO8859P1

Tools

Character set scanner

See the character set scanner utilities.

Configuration

NLS_CHARACTERSET

NLS_CHARACTERSET is the localization (NLS) parameter that holds the character set (code page) of the Oracle Database.

Documentation / Reference

Recommended ASCII Database Character Sets (SB: Single-Byte, MB: Multibyte, …)