Porting SQL queries #
DML statements compatibility #
Most DML statements are compatible between Oracle and PostgreSQL. Note though
that the MERGE
statement doesn’t exist yet in PostgreSQL. Some restrictions
exist for some window functions and recursive Common Table Expressions (CTE).
Subquery aliases #
Oracle does not handle error messages while subqueries are unaliased unlike PostgreSQL. For instance, this query need to be rewritten:
SELECT <columns, ...>
FROM (SELECT <...>
)
WHERE <conditions>
as:
SELECT <columns, ...>
FROM (SELECT <...>
) sub1
WHERE <conditions>
Implicit conversions #
Many implicit conversion to and from a text type have been removed since PostgreSQL 8.3.
For instance, one cannot write this type of query:
CREATE TABLE depts ( number CHAR(2), dname VARCHAR(25) );
SELECT * FROM depts WHERE number BETWEEN 0 AND 42;
-- ERROR: operator does not exist: character >= INTEGER
-- LIGNE 1 : SELECT * FROM depts WHERE number BETWEEN 0 AND 42;
In order to run this query, one has to explicitly declare the conversion to be performed:
SELECT * FROM depts WHERE number::INTEGER BETWEEN 0 AND 42;
-- or (more SQL compliant)
SELECT * FROM depts WHERE CAST(id AS INTEGER) BETWEEN 0 AND 42;
With Oracle, this conversion is implicit (but will mask a potential performance problem as it will force a type conversion).
HAVING and GROUP BY clauses #
Even though Oracle’s documentation stipulates that the GROUP BY
clause precedes
the HAVING
clause, Oracle’s grammar permits the opposite. Those queries having
HAVING
before GROUP BY
will have to be corrected.
This query, for example:
SELECT * FROM test HAVING count(*) > 3 GROUP BY i;
will have to be converted this way to be run in PostgreSQL:
SELECT * FROM test GROUP BY i HAVING count(*) > 3;
References:
ROWID #
In very rare occasions, some SQL queries use Oracle’s ROWID
column, for instance
to deduplicate records. The ROWID
column is the physical location of a record in
a table. PostgreSQL’s equivalent is ctid
.
More specifically, Oracle’s ROWID
is a record’s logical address, in the
OOOOOO.FFF.BBBBBB.RRR
form, where O
is the object’s number, F
the file, B
the
block’s number, and R
the row’s number in this block. The format may vary if
the table is in a BIG FILE TABLESPACE
, but the principle stays the same.
PostgreSQL’s ctid
contains only the block number and the row’s number in this
block. There is no other localisation information. The ctid
is only unique
inside a table. Because of that, a query returning ctid
s from a partitioned table
may have duplicated. In this peculiar case, one can add the tableoid
column
(the table’s id in the catalog), which will distinguish between duplicates
coming from different partitions.
Another difference is that an update doesn’t change a record’s rowid in Oracle, while it does change the ctid in PostgreSQL.
This access method should be avoided as much as possible.
References:
Minus operator conversion #
The MINUS
operator has to be converted to EXCEPT
for PostgreSQL. The other
set operations UNION
, UNION ALL
, and INTERSECT
don’t need any conversion.
Thus, the following query returns all inventory products which have never been in a command. It can be written like this for Oracle:
SELECT product_id FROM inventories
MINUS
SELECT product_id FROM order_items
ORDER BY product_id;
It will be written this way for PostgreSQL:
SELECT product_id FROM inventories
EXCEPT
SELECT product_id FROM order_items
ORDER BY product_id;
References:
Window Functions #
The queries using window functions don’t usually need a lot of work.
Oracle proposes an ORDER SIBLINGS BY
clause. The SIBLINGS
keyword has no
equivalent and is anyway only used for hierarchy processing, with CONNECT BY
.
This kind of query has to be rewritten anyway.
The PARTITION BY
don’t need any adaptation, neither the windowing clause
(RANGE ...
or ROWS ...
).
Most of Oracle’s general usage window functions exist in PostgreSQL. Some functions still have no equivalent for the moment.
RATIO_TO_REPORT
will need to be rewritten (using a division on the sum on the window).LISTAGG
function will need to be rewritten usingarray_agg
andarray_to_string
.
Please note though that PostgreSQL’s extension capabilities make it possible to write new aggregate and window functions, making it possible to write the missing functions.
References:
Recursive CTE #
Oracle’s grammar makes no difference between a standard CTE and a recursive CTE,
while PostgreSQL does and requires a correct usage of WITH
and WITH RECURSIVE
. One will need to correct recursive CTE, which should include a
UNION ALL
and a reference to the common table expression itself.
The following query does a very simple recursion:
WITH recursion (a) AS (
SELECT 1 AS a
FROM dual
UNION ALL
SELECT a + 1
FROM recursion
WHERE a < 10
)
SELECT * FROM recursion;
Here it is, rewritten for PostgreSQL:
WITH RECURSIVE recursion (a) AS (
SELECT 1 AS a
UNION ALL
SELECT a + 1
FROM recursion
WHERE a < 10
)
SELECT * FROM recursion;
The WITH
clause also proposes some extensions in Oracle, to describe how to
perform the recursion (search_clause) and the loop detection (cycle_clause).
Cycle detection has already been tackled in “
Hierarchical querying " section of this document.
References:
MERGE #
MERGE
command allows to make insertions or updates of tables depending on
whether the rows already exist or not. The typical syntax for the statement
looks like this:
MERGE INTO destination_table
USING source_table ON (condition)
WHEN MATCHED THEN update_clause
WHEN NOT MATCHED THEN insert_clause;
But versions prior to PostgreSQL 15 do not support the syntax MERGE
from the
SQL standard. On the other hand, we can emulate this instruction with a INSERT
.
It is necessary to distinguish here several cases, according to the presence or
not of WHEN MATCHED
and WHEN NOT MATCHED
clauses.
If there is no WHEN NOT MATCHED
clause, a simple UPDATE
is required:
UPDATE destination_table
update_clause
USING source_table
WHERE join_condition;
If there is no WHEN MATCHED
clause, using an INSERT
on non existing rows is
more relevant:
INSERT INTO destination_table
SELECT ... FROM source_table
WHERE join_condition
ON CONFLICT DO NOTHING;
If the two clauses WHEN MATCHED
and WHEN NOT MATCHED
are present, the
translation becomes:
INSERT INTO destination_table
SELECT ... FROM source_table
WHERE join_condition
ON CONFLICT DO
UPDATE update_clause
Sometimes, on Oracle, a DELETE WHERE ...
is present after the UPDATE
of the
WHEN MATCHED
clause. In this case, we can add a simple query DELETE
before
or after the INSERT
. We can also set this DELETE
to inside the INSERT
, as
a common table expression (CTE).
References:
Hints management #
Oracle’s optimizer accepts hints, which make it possible for the DBA to force the optimizer into using a plan it considers isn’t the best. Those hints are expressed as commends and will be ignored by PostgreSQL, which has no hints.
Nevertheless, a query making use of a hint should have its execution plan analyzed carefully, to be sure that it will work with PostgreSQL.
The plan will be checked with EXPLAIN ANALYZE
, which displays both the
estimates of the optimizer and what really occurred during execution. One should
look for a large discrepancy between estimated and real selectivity, for each
node. This will indicate what the optimizer doesn’t understand in the query.
Often, this is only the consequence of too imprecise statistics. This can be
corrected in several way.
First, it’s possible to make statistics collection more thorough, by raising the
amount of sampled data. This is controlled by the default_statistics_target
parameter. This can be done globally or per column of each table. A higher value
will make statistics collection consume more resources, and will make query
planning longer, as more statistics will have to be analyzed to make a decision.
The default value of 100 is good for most tables and data sets, so one usually
only change statistics on the very few columns requiring it. This is done with
ALTER TABLE … ALTER COLUMN … SET STATISTICS …
. It’s also possible to
artificially force the number of distinct values statistic on a column with
ALTER TABLE … SET COLUMN … SET n_distinct = …
, as this statistic is quite
difficult to get right for the statistics collector.
Sometimes, a query rewrite is the solution: queries can be written in ways that prevent the optimizer from doing good estimations. This is as true for PostgreSQL as it is for Oracle.