Table migration #
Table definition are quite identical between both RDBMS, except that PostgreSQL
does not have global temporary tables. Each temporary table is private to its
own session. Data inserted into a temporary table can be automatically destroyed
at the transaction end (using ON COMMIT DELETE ROWS
clause) or only at the end
of the session (with default implicit clause ON COMMIT PRESERVE ROWS
). In addition,
PostgreSQL can also automatically drop the table when the transaction is finished
(ON COMMIT DROP
). A workaround exists and will be proposed later in this guide.
There’s no equivalent in PostgreSQL of the storage clause like INITTRANS
and
MAXEXTENTS
. PostgreSQL allocates storage on a dynamic basis. Only PCTFREE
,
which indicates a percentage of size left free in a block, has an equivalent
in fillfactor under PostgreSQL. However PCTUSED
has no equivalent and has
no sense regarding how PostgreSQL manages its storage.
References:
Virtual Columns #
To replace virtual columns, PostgreSQL supports generated columns, introduced
in version 12. A generated column is a special column which is always computed
from other columns. Only STORED
capability is implemented; a VIRTUAL
generated
column may be added at a future major release.
Basic example of a table with a virtal column with Oracle:
CREATE TABLE employees (
id NUMBER,
first_name VARCHAR2(10),
salary NUMBER(9,2),
commission NUMBER(3),
salary2 NUMBER GENERATED ALWAYS
AS (ROUND(salary*(1+commission/100),2)) VIRTUAL,
);
This could be ported in PostgreSQL with the following syntax:
CREATE TABLE employees (
id BIGINT,
first_name VARCHAR(10),
salary DOUBLE PRECISION,
commission INTEGER,
salary2 DOUBLE PRECISION GENERATED ALWAYS
AS (ROUND((salary*(1+commission/100))::numeric,2)) STORED
);
To emulate a non-stored data like VIRTUAL
works with Oracle, views are usually
the best solution.
CREATE TABLE employees (
id BIGINT,
first_name VARCHAR(10),
salary DOUBLE PRECISION,
commission INTEGER
);
CREATE VIEW virt_employees AS
SELECT id, first_name, salary, commission,
(ROUND((salary*(1+commission/100))::numeric,2)) salary2
FROM employees;
References:
Index-Organized Tables #
PostgreSQL has no direct equivalent of the IOT.
It’s possible to physically sort a table in the same order as one of its indexes
thanks to the CLUSTER
statement. However, each following write will be written
in any available space in the table, not taking the index’s order into account.
It’s possible to reduce this phenomenon by reducing the fillfactor to a smaller
value (than the default 100) to let space to the new versions of a record, as a
new version is put on the same page as the original one if free space is sufficient.
An Index Organized Table, or clustered table, can bring a large improvement on
the range scan type queries, where a consecutive range of records of a table are
accessed by a query. For instance, a query returning all the customers whose
name start with MAR (SELECT * FROM clients WHERE nom LIKE 'MAR%' )
will benefit
from an IOT. Reading will be faster as all records pointed by the index will be
in the same block, or in a few consecutive blocks.
CLUSTER
is thus a one-time operation, that should be performed again more or
less frequently, depending on the write rate in the table, and the data
maintenance window. One has to keep in mind that this operation is heavy and
requires an exclusive lock that will block all reads and writes during the
rebuild. Please note that no reindexation is necessary following a CLUSTER
, as
the indexes are already rebuilt during the operation.
The following example illustrates a significant performance improvement on a simple table containing 10 millions of records. The query runtime is divided by two after clustering it on the appropriate index.
Objects creation:
CREATE TABLE t6 (i SERIAL PRIMARY KEY, v INTEGER);
INSERT INTO t6 (v) SELECT round(random()*10000) FROM generate_series(1, 10000000);
CREATE INDEX idx_t6_v ON t6 (v);
The query runtime is consistent, as data is cached:
SELECT count(*) FROM t6 WHERE v BETWEEN 100 AND 1000;
-- count
-- --------
-- 902274
--
-- Time: 350,100 ms
This table is then clustered over the idx_t6_v
index:
CLUSTER t6 USING idx_t6_v;
The runtime has been divided by a factor of two:
SELECT count(*) FROM t6 WHERE v BETWEEN 100 AND 1000;
-- count
-- --------
-- 902274
--
-- Time: 148,755 ms
References:
External tables #
External tables make it possible to interact with CSV files from an Oracle database. This can be emulated through a Foreign Data Wrapper, as implemented by PostgreSQL in respect with the SQL/MED extension to SQL. This new feature makes it possible to declare Foreign Data Wrappers (FDW) which allow for remote access to objects outside of the database: another database, a CSV file, etc…
Given an external table’s definition for Oracle:
CREATE OR REPLACE DIRECTORY ext AS '/usr/tmp/';
CREATE TABLE ext_tab (
empno CHAR(4),
ename CHAR(20),
job CHAR(20),
deptno CHAR(2)
) ORGANIZATION EXTERNAL (
TYPE oracle_loader
DEFAULT DIRECTORY ext
ACCESS PARAMETERS (
RECORDS DELIMITED BY NEWLINE
BADFILE 'bad_%a_%p.bad'
LOGFILE 'log_%a_%p.log'
FIELDS TERMINATED BY ','
MISSING FIELD VALUES ARE NULL
REJECT ROWS WITH ALL NULL FIELDS
(empno, ename, job, deptno))
LOCATION ('demo1.dat')
)
PARALLEL
REJECT LIMIT 0
NOMONITORING;
It will be translate this way in PostgreSQL:
CREATE EXTENSION file_fdw;
CREATE SERVER ext FOREIGN DATA WRAPPER file_fdw;
CREATE FOREIGN TABLE ext_tab (
empno CHAR(4),
ename CHAR(20),
job CHAR(20),
deptno CHAR(2)
) SERVER ext OPTIONS(filename '/usr/tmp/demo1.dat', format 'csv',
delimiter ',');
If the sole pupose is loading data, please note that PostgreSQL’s COPY
statement
is an alternative to the creation of a foreign data wrapper. For example, to load
the example’s CSV file, one only has to run the following statement:
COPY ext_table FROM '/usr/tmp/demo1.csv' WITH CSV;
References:
Global temporary tables #
The Oracle GLOBAL TEMPORARY TABLE
object has no equivalent in PostgreSQL. As a
reminder, a global temporary table is a table whose structure is permanent, but
whose content is temporary, i.e. specific to each user session.
Several solutions exist to carry these objects.
Depending on usage, it may happen that a classic table can meet the need, with a
simple truncation of the table with TRUNCATE
command on first use. But this
prevents a use of the table by several users in parallel.
The pgtt extension, developed by Gilles Darold, emulates how global temporary tables work. But his use requires the extension library to be loaded once the user is logged in.
It is also possible to build your own temporary tables globally as follows:
- create a set of tables in a dedicated schema with the structure of global temporary tables to emulate; these tables will serve as a model for creating temporary tables;
- associate a comment to each table to indicate the mode of operation of the
temporary table during transaction COMMITs (
ON COMMIT PRESERVE ROWS
orON COMMIT DROP
); - create a procedure or function for initializing a temporary table with a pre-existing table as a template;
- in PL/pgSQL code, call this procedure or function at the first using the global temporary table.
CREATE SCHEMA IF NOT EXISTS gtt_schema;
CREATE TABLE gtt_schema.gtt_table_1 (
col1 INTEGER,
col2 TEXT
);
COMMENT ON TABLE gtt_schema.gtt_table_1 IS 'ON COMMIT PRESERVE ROWS';
CREATE OR REPLACE PROCEDURE prepare_temp_table(p_relname varchar) AS
$$
DECLARE
v_temp_schema varchar = 'gtt_schema';
v_temp_desc varchar;
BEGIN
-- Reading the comment associated with the table
v_temp_desc := pg_catalog.obj_description(
(format('%s.%s', v_temp_schema, p_relname))::regclass, 'pg_class'
);
-- Creating the temporary table
EXECUTE format(
'CREATE TEMP TABLE %2$s (LIKE %1$s.%2$s) %3$s',
v_temp_schema, p_relname,
CASE
WHEN v_temp_desc ~* 'delete' THEN 'ON COMMIT DELETE ROWS'
WHEN v_temp_desc ~* 'drop' THEN 'ON COMMIT DROP'
ELSE 'ON COMMIT PRESERVE ROWS'
END
);
-- If the temporary table already exists, empty it
EXCEPTION WHEN SQLSTATE '42P07' THEN
EXECUTE format('TRUNCATE %s', p_relname);
END;
$$
LANGUAGE plpgsql;
CALL prepare_temp_table('gtt_table_1');
References: