SQL Engine - SQL Parser

Oracle Database Sql Processing

About

The parser in an SQL engine parse a Sql statement.

It's the first stage of SQL processing. This stage involves separating the pieces of a SQL statement into a SQL Tree where each node is a SQL token (such as Select,from, …)

The database parses a statement when instructed by the application, which means that only the application­, and not the database itself, can reduce the number of parses.

When an application issues a SQL statement, the application makes a parse call to the database to prepare the statement for execution. The parse call opens or creates a cursor, which is a handle for the session-specific private SQL area that holds a parsed SQL statement and other processing information.

During the parse call, the database performs the following checks:

The preceding checks identify the errors that can be found before statement execution. Some errors cannot be caught by parsing. For example, the database can encounter deadlocks or errors in data conversion only during statement execution.

Oracle Database Sql Processing

Check

Syntax

Lexical Analysis - Parser (Syntax analysis|Linter)

The engine must check each SQL statement for syntactic validity. A statement that breaks a rule for well-formed SQL syntax fails the check. For example, the following statement fails because the keyword FROM is misspelled as FORM:

SQL> SELECT * FORM employees;
SELECT * FORM employees
         *
ERROR at line 1:
ORA-00923: FROM keyword not found where expected

Semantic

Compiler - Semantics Analysis.

The semantics of a statement are its meaning. Thus, a semantic check determines whether a statement is meaningful, for example, whether the objects and columns in the statement exist. A syntactically correct statement can fail a semantic check, as shown in the following example of a query of a nonexistent table:

SQL> SELECT * FROM nonexistent_table;
SELECT * FROM nonexistent_table
              *
ERROR at line 1:
ORA-00942: table or view does not exist

Shared Pool

During the parse, the database performs a shared pool check to determine whether it can skip resource-intensive steps of statement processing. To this end, the database uses a hashing algorithm to generate a hash value for every SQL statement. The statement hash value is the SQL ID shown in VSQL.SQL_ID.

When a user submits a SQL statement, the database searches the shared SQL area to see if an existing parsed statement has the same hash value. The hash value of a SQL statement is distinct from the following values:

  • Memory address for the statement

Oracle Database uses the SQL ID to perform a keyed read in a lookup table. In this way, the database obtains possible memory addresses of the statement.

  • Hash value of an execution plan for the statement

A SQL statement can have multiple plans in the shared pool. Each plan has a different hash value. If the same SQL ID has multiple plan hash values, then the database knows that multiple plans exist for this SQL ID.

The figure below is a simplified representation of a shared pool check of an UPDATE statement in a dedicated server architecture. Comparison of hash value between the PGA and the SGA.

Oracle Database Shared Pool Check

Categories

Parse operations fall into the following categories, depending on the type of statement submitted and the result of the hash check:

Hard

If Oracle Database cannot reuse existing code, then it must build a new executable version of the application code. This operation is known as a hard parse, or a library cache miss. The database always perform a hard parse of DDL. During the hard parse, the database accesses the library cache and data dictionary cache numerous times to check the data dictionary. When the database accesses these areas, it uses a serialization device called a latch on required objects so that their definition does not change (see “Latches”). Latch contention increases statement execution time and decreases concurrency.

Soft

A soft parse is any parse that is not a hard parse. If the submitted statement is the same as a reusable SQL statement in memory, then the database reuses the existing code. This reuse of code is also called a library cache hit.

Soft parses can vary in the amount of work they perform. For example, configuring the session shared SQL area can sometimes reduce the amount of latching in the soft parses, making them “softer.”

In general, a soft parse is preferable to a hard parse because the database skips the optimization and row source generation steps, proceeding straight to execution.

Example

Identical syntax is not sufficient

If a check determines that a statement in the shared pool has the same hash value, then the database performs semantic and environment checks to determine whether the statements mean the same. Identical syntax is not sufficient. For example, suppose two different users log in to the database and issue the following SQL statements:

CREATE TABLE my_table ( some_col INTEGER );
SELECT * FROM my_table;

The SELECT statements for the two users are syntactically identical, but two separate schema objects are named my_table. This semantic difference means that the second statement cannot reuse the code for the first statement.

Environmental difference can force a hard parse

Even if two statements are semantically identical, an environmental difference can force a hard parse. In this case, the environment is the totality of session settings that can affect execution plan generation, such as the work area size or optimizer settings. Consider the following series of SQL statements executed by a single user:

ALTER SYSTEM FLUSH SHARED_POOL;
SELECT * FROM my_table;

ALTER SESSION SET OPTIMIZER_MODE=FIRST_ROWS;
SELECT * FROM my_table;

ALTER SESSION SET SQL_TRACE=TRUE;
SELECT * FROM my_table;

In the preceding example, the same SELECT statement is executed in three different optimizer environments. Consequently, the database creates three separate shared SQL areas for these statements and forces a hard parse of each statement.

Library

Specialized parser library for SQL

Calcite

JOOQ

JOOQ (Implementation Parse Query function) SQL parser is a hand-written, recursive descent parser. There are great advantages of this approach over formal grammar-based, generated parsers (e.g. by using ANTLR). These advantages are, among others:

  • They can be tuned easily for performance
  • They are very simple and easy to maintain
  • It's easy to implement corner cases of the grammar, which might require context (that's a big plus with SQL)
  • It's the easiest way to bind a grammar to an existing backing expression tree implementation (which is what jOOQ really is)





Discover More
Card Puncher Data Processing
Calcite - Sql Parser

The ''calcite SQL Parser is a LL(k) parser that build a Sql tree (SqlNode) The org/apache/calcite/sql/parser/SqlParserparserConfig parameters control the parse process. For example: identifiers...
Data System Architecture
Concurrency - Latches (System Lock)

Latches are like semaphores. Latches are used to guarantee physical consistency of data, while locks are used to assure logical consistency of data. Latches are simple, low-level system lock (serialization...
Card Puncher Data Processing
Oracle - Recursive call caused by Hard Parse

Hard Parse are made when insufficient informations are present in its cache (shared pool, ...) as : object being accessed, permissions, ... When Oracle retrieve this information, it made some “recursive...
Card Puncher Data Processing
PL/SQL - Cursor

A cursor in the context of Oracle. A cursor is a SQL datatype. A cursor is a pointer to a private SQL area that stores information processing a specific: SELECT or DML statement. The cursor data...
Oracle Database Sql Processing
SQL Engine - Hard Parse (Library cache miss)

During the sql parsing (which is a step of the sql processing), if the Database (for instance Oracle) cannot reuse existing code, then it must build a new executable version of the application code (ie...
Oracle Database Sql Processing
SQL Engine - SQL Tree

The SQL tree is the output of the SQL parser that represents a SQL statement in a parse tree where each node is a SQL Token (ie select,from, ...)
Oracle Database Sql Processing
SQL Engine - SQL processing (Sql Compiler)

In which way the database processes: DDL statements to create objects, DML to modify data, and queries to retrieve data. Depending on the statement, the database may omit some of these...
Oracle Database Sql Processing
SQL Engine - Soft parse (Library cache hit)

A soft parse is any SQL parse that is not a hard parse. If the submitted statement is the same as a reusable SQL statement in the shared pool, then Oracle Database reuses the existing code. This reuse...



Share this page:
Follow us:
Task Runner