Character - Conversion / Encoding translation

About

A string is a sequence of bytes that may represent characters. All the characters within a string share a common coding representation. In some cases, such as when the coding representations differ between the sending and receiving systems, it may be necessary to convert these characters to a different coding representation.

This process is known as character conversion. Character conversion, when required, is automatic, and when successful, is transparent to the application.
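As a minimal sketch of what such a conversion does under the hood, the bytes can be decoded from the sender's coding representation and re-encoded into the receiver's. The helper name `convert` and the sample string are illustrative assumptions, not part of any particular product:

```python
def convert(data: bytes, source_encoding: str, target_encoding: str) -> bytes:
    """Hypothetical helper: re-express the same characters in another encoding."""
    text = data.decode(source_encoding)   # bytes -> abstract characters
    return text.encode(target_encoding)   # characters -> bytes in the target encoding


# The sender stores "café" in Latin-1: the "é" is the single byte 0xE9.
latin1_bytes = "café".encode("latin-1")

# The receiver expects UTF-8, where "é" takes two bytes (0xC3 0xA9).
utf8_bytes = convert(latin1_bytes, "latin-1", "utf-8")

print(latin1_bytes)  # b'caf\xe9'
print(utf8_bytes)    # b'caf\xc3\xa9'
```

The characters are unchanged; only their byte representation differs, which is why a successful conversion is transparent to the application.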

Firefox Character Set

As a result of having many character encoding methods in use (and the need for backward compatibility with archived data), many computer programs have been developed to translate data between encoding schemes. Firefox 3, for example, exposes this in its View/Character Encoding sub-menu (shown here in Dutch).

Relaxing the code page constraint (or validation) means that this process need not be entirely successful, so you can end up losing data during the conversion.
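The difference between a strict and a relaxed conversion can be sketched with Python's codec error handlers. This only illustrates the general idea; it does not show how any specific tool implements relaxed validation, and the sample text is an assumption:

```python
text = "naïve €5"

# Strict conversion: a character with no mapping in the target makes it fail.
try:
    text.encode("ascii")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Relaxed conversion: unmappable characters are replaced, so the conversion
# "succeeds" but the original characters cannot be recovered.
lossy = text.encode("ascii", errors="replace")

print(strict_ok)  # False
print(lossy)      # b'na?ve ?5'
```

In the relaxed case, both "ï" and "€" are replaced by "?", which is exactly the kind of silent data loss described above.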

Mapping a Character Set in Different Code Pages

The following figure shows how a typical character set might map to different code points in two different code pages.

Even with the same encoding scheme, there are many different code pages, and the same code point can represent a different character in different code pages.
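Both directions of this mapping can be illustrated with a short sketch. The code pages chosen here (Windows-1252, IBM code page 437, ISO 8859-7) are only examples picked for illustration:

```python
# One code point (the byte 0xE4), decoded under three different code pages.
code_point = bytes([0xE4])
decoded = {cp: code_point.decode(cp) for cp in ("cp1252", "cp437", "iso8859_7")}
print(decoded)  # {'cp1252': 'ä', 'cp437': 'Σ', 'iso8859_7': 'δ'}

# One character ("ä"), encoded to different code points in two code pages.
encoded = {cp: "ä".encode(cp) for cp in ("cp1252", "cp437")}
print(encoded)  # {'cp1252': b'\xe4', 'cp437': b'\x84'}
```

Interpreting bytes with the wrong code page therefore produces no error, only the wrong characters.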

Furthermore, a byte in a character string does not necessarily represent a character from a single-byte character set (SBCS). Character strings are also used for mixed and bit data. Mixed data is a mixture of single-byte, double-byte, or multi-byte characters. Bit data (columns defined as FOR BIT DATA, or BLOBs, or binary strings) is not associated with any character set.
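A small sketch makes the distinction concrete. It uses UTF-8 as an example of mixed-width data and a hypothetical helper `is_valid_text` to show why bit data should not be run through character conversion:

```python
# Mixed data: characters of different byte widths in one string.
text = "aé日"
utf8 = text.encode("utf-8")
print(len(text), len(utf8))  # 3 6  (1 + 2 + 3 bytes)


def is_valid_text(data: bytes, encoding: str) -> bool:
    """Hypothetical helper: True if the bytes decode cleanly in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Bit data: arbitrary bytes with no associated character set.
raw = bytes([0xFF, 0xFE, 0x00, 0x80])
print(is_valid_text(utf8, "utf-8"))  # True
print(is_valid_text(raw, "utf-8"))   # False
```

Because the byte count and the character count differ for mixed data, a conversion has to work on characters rather than on individual bytes, and bit data should be passed through unchanged.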

Figure: Character set code pages
