Character - Conversion / Encoding translation


About

A string is a sequence of bytes that may represent characters. All the characters within a string share a common coding representation. In some cases, for example when the coding representations differ between the sending and receiving systems, these characters must be converted to a different coding representation.

This process is known as character conversion. Character conversion, when required, is automatic, and when successful, is transparent to the application.
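As a minimal sketch of what such a conversion does under the hood, the Python snippet below decodes bytes from the sender's encoding and re-encodes them for the receiver. The helper name `transcode` and the choice of cp1252 and UTF-8 are assumptions made for illustration; they are not taken from any particular product.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert bytes from the source encoding to the target encoding.

    Hypothetical helper: decode to abstract characters first,
    then encode those characters in the receiver's representation.
    """
    return data.decode(source).encode(target)


# The sending system stores "café" in Windows code page 1252 (one byte per character).
sent = "café".encode("cp1252")

# The receiving system expects UTF-8, where "é" takes two bytes.
received = transcode(sent, "cp1252", "utf-8")

print(sent)      # b'caf\xe9'
print(received)  # b'caf\xc3\xa9'
```

The characters are the same on both sides; only the byte representation changes, which is why a successful conversion is transparent to the application.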

Firefox Character Set

Because many character encoding methods are in use (and because archived data requires backward compatibility), many computer programs have been developed to translate data between encoding schemes. In Firefox 3, for example, the View/Character Encoding sub-menu lets the user choose the encoding (shown here in Dutch).

Relaxing the code page constraint (or validation) means that this process need not be entirely successful, so data can be lost during the conversion.
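The difference between a validated and a relaxed conversion can be sketched with Python's built-in error handlers. This is an analogy only: the function names `convert_strict` and `convert_relaxed` are hypothetical, and a real tool may substitute or drop characters differently.

```python
def convert_strict(text: str, target: str) -> bytes:
    """Validated conversion: fails if any character has no mapping in the target."""
    return text.encode(target)  # default errors="strict" raises UnicodeEncodeError


def convert_relaxed(text: str, target: str) -> bytes:
    """Relaxed conversion: unmappable characters are replaced by '?'."""
    return text.encode(target, errors="replace")


text = "naïve €5"

try:
    convert_strict(text, "ascii")
except UnicodeEncodeError as exc:
    print("strict conversion failed:", exc.reason)

# The relaxed conversion succeeds, but "ï" and "€" are lost for good.
print(convert_relaxed(text, "ascii"))  # b'na?ve ?5'
```

Once the replacement has happened, the original characters cannot be recovered from the converted data.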

Mapping a Character Set in Different Code Pages

The following figure shows how a typical character set might map to different code points in two different code pages.

Even with the same encoding scheme, there are many different code pages, and the same code point can represent a different character in different code pages.
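A short Python sketch makes this concrete. The three code pages below (latin-1, iso8859-15, cp437) are chosen only as an illustration, and the helper name is hypothetical.

```python
CODE_PAGES = ["latin-1", "iso8859-15", "cp437"]


def decode_in_code_pages(byte_value: int, code_pages: list) -> dict:
    """Interpret one byte (one code point) in several single-byte code pages."""
    return {cp: bytes([byte_value]).decode(cp) for cp in code_pages}


# The same code point, 0xA4, is a different character in each code page.
for code_page, char in decode_in_code_pages(0xA4, CODE_PAGES).items():
    print(f"0xA4 in {code_page}: {char}")

# Conversely, the same character sits at different code points.
print("€".encode("iso8859-15"))  # b'\xa4'
print("€".encode("cp1252"))      # b'\x80'
```

Here 0xA4 decodes to the currency sign in latin-1, the euro sign in iso8859-15, and "ñ" in cp437, so reading data with the wrong code page silently yields wrong characters rather than an error.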

Furthermore, a byte in a character string does not necessarily represent a character from a single-byte character set (SBCS). Character strings are also used for mixed and bit data. Mixed data is a mixture of single-byte, double-byte, or multi-byte characters. Bit data (columns defined as FOR BIT DATA, or BLOBs, or binary strings) is not associated with any character set.
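To illustrate mixed data, the following Python sketch counts the bytes each character occupies in a multi-byte encoding, and shows that arbitrary binary data may not decode as text at all. The helper names and the sample values are assumptions for illustration.

```python
from typing import List, Optional, Tuple


def bytes_per_character(text: str, encoding: str) -> List[Tuple[str, int]]:
    """Return each character with the number of bytes it occupies in the encoding."""
    return [(ch, len(ch.encode(encoding))) for ch in text]


def try_decode(data: bytes, encoding: str) -> Optional[str]:
    """Return the decoded text, or None when the bytes are not valid in the encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Mixed data: a single-byte and a double-byte character in the same Shift JIS string.
print(bytes_per_character("aあ", "shift_jis"))  # [('a', 1), ('あ', 2)]

# The same characters in UTF-8 occupy one and three bytes.
print(bytes_per_character("aあ", "utf-8"))      # [('a', 1), ('あ', 3)]

# Bit data: these bytes are not text, so decoding them as UTF-8 fails.
print(try_decode(b"\x89PNG", "utf-8"))          # None
```

Because byte count and character count diverge in mixed data, and bit data has no character set at all, a conversion routine should only be applied to data known to be character data.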
