About
You can use the BULK COLLECT clause with a SELECT INTO or FETCH statement to retrieve a set of rows into a collection (ie table of varray):
- of records.
- of a collections
Articles Related
Bulk Collect Into
Record
Fetch into a collection of records.
DECLARE
TYPE EmployeeSet IS TABLE OF employees%ROWTYPE;
underpaid EmployeeSet;
-- Holds set of rows from EMPLOYEES table.
CURSOR c1 IS SELECT first_name, last_name FROM employees;
TYPE NameSet IS TABLE OF c1%ROWTYPE;
some_names NameSet;
-- Holds set of partial rows from EMPLOYEES table.
BEGIN
-- With one query,
-- bring all relevant data into collection of records.
SELECT * BULK COLLECT INTO underpaid FROM employees
WHERE salary < 5000 ORDER BY salary DESC;
-- Process data by examining collection or passing it to
-- eparate procedure, instead of writing loop to FETCH each row.
DBMS_OUTPUT.PUT_LINE
(underpaid.COUNT || ' people make less than 5000.');
FOR i IN underpaid.FIRST .. underpaid.LAST
LOOP
DBMS_OUTPUT.PUT_LINE
(underpaid(i).last_name || ' makes ' || underpaid(i).salary);
END LOOP;
-- You can also bring in just some of the table columns.
-- Here you get the first and last names of 10 arbitrary employees.
SELECT first_name, last_name
BULK COLLECT INTO some_names
FROM employees
WHERE ROWNUM < 11;
FOR i IN some_names.FIRST .. some_names.LAST
LOOP
DBMS_OUTPUT.PUT_LINE
('Employee = ' || some_names(i).first_name
|| ' ' || some_names(i).last_name);
END LOOP;
END;
/
Collection
Fecth into a collection of a collection
DECLARE
TYPE EmpCurTyp IS REF CURSOR;
TYPE NumList IS TABLE OF NUMBER;
TYPE NameList IS TABLE OF VARCHAR2(25);
emp_cv EmpCurTyp;
empids NumList;
enames NameList;
sals NumList;
BEGIN
OPEN emp_cv FOR 'SELECT employee_id, last_name FROM employees';
FETCH emp_cv BULK COLLECT INTO empids, enames;
CLOSE emp_cv;
END;
/
Memory Impact
The tradeoff with BULK COLLECT, like so many other performance-enhancing features, is “run faster but consume more memory.”
Specifically, memory for collections is stored in the program global area (PGA), not the system global area (SGA). SGA memory is shared by all sessions connected to Oracle Database, but PGA memory is allocated for each session. Thus, if a program requires 5MB of memory to populate a collection and there are 100 simultaneous connections, that program causes the consumption of 500MB of PGA memory, in addition to the memory allocated to the SGA.
Fortunately, PL/SQL makes it easy for developers to control the amount of memory used in a BULK COLLECT operation by using the LIMIT clause.
Suppose I need to retrieve all the rows from the employees table and then perform some compensation analysis on each row. I can use BULK COLLECT as follows:
PROCEDURE process_all_rows
IS
TYPE employees_aat
IS TABLE OF employees%ROWTYPE
INDEX BY PLS_INTEGER;
l_employees employees_aat;
BEGIN
SELECT *
BULK COLLECT INTO l_employees
FROM employees;
FOR indx IN 1 .. l_employees.COUNT
LOOP
analyze_compensation
(l_employees(indx));
END LOOP;
END process_all_rows;
Very concise, elegant, and efficient code. If, however, my employees table contains tens of thousands of rows, each of which contains hundreds of columns, this program can cause excessive pga memory consumption.
Consequently, you should avoid this sort of “unlimited” use of BULK COLLECT. Instead, move the SELECT statement into an explicit cursor declaration and then use a simple loop to fetch many, but not all, rows from the table with each execution of the loop body, as shown in Listing 1.
Code Listing 1: Using BULK COLLECT with LIMIT clause
PROCEDURE process_all_rows (limit_in IN PLS_INTEGER DEFAULT 100)
IS
CURSOR employees_cur
IS
SELECT * FROM employees;
TYPE employees_aat IS TABLE OF employees_cur%ROWTYPE
INDEX BY PLS_INTEGER;
l_employees employees_aat;
BEGIN
OPEN employees_cur;
LOOP
FETCH employees_cur
BULK COLLECT INTO l_employees LIMIT limit_in;
FOR indx IN 1 .. l_employees.COUNT
LOOP
analyze_compensation (l_employees(indx));
END LOOP;
EXIT WHEN l_employees.COUNT < limit_in;
END LOOP;
CLOSE employees_cur;
END process_all_rows;
The process_all_rows procedure in Listing 1 requests that up to the value of limit_in rows be fetched at a time. PL/SQL will reuse the same limit_in elements in the collection each time the data is fetched and thus also reuse the same memory. Even if my table grows in size, the PGA consumption will remain stable.
How do you decide what number to use in the LIMIT clause? Theoretically, you will want to figure out how much memory you can afford to consume in the PGA and then adjust the limit to be as close to that amount as possible.
From tests I (and others) have performed, however, it appears that you will see roughly the same performance no matter what value you choose for the limit, as long as it is at least 25. The test_diff_limits.sql script, included with the sample code for this column, at otn.oracle.com/oramag/oracle/08-mar/o28plsql.zip, demonstrates this behavior, using the ALL_SOURCE data dictionary view on an Oracle Database 11g instance. Here are the results I saw (in hundredths of seconds) when fetching all the rows (a total of 470,000):
- Elapsed CPU time for limit of 1 = 1839
- Elapsed CPU time for limit of 5 = 716
- Elapsed CPU time for limit of 25 = 539
- Elapsed CPU time for limit of 50 = 545
- Elapsed CPU time for limit of 75 = 489
- Elapsed CPU time for limit of 100 = 490
- Elapsed CPU time for limit of 1000 = 501
- Elapsed CPU time for limit of 10000 = 478
- Elapsed CPU time for limit of 100000 = 527