Datatypes #
The Oracle RDBMS does not support every SQL standard data type natively. Some are
supported but converted into an equivalent internal type. For example, all native
integer types, like integer
being converted to NUMBER(38)
. Also, the VARCHAR2
type is specific to Oracle and differs from the standard on its way to handle
the NULL value. Additionally, many Oracle specific data types are named differently
in PostgreSQL.
type Oracle | type PostgreSQL |
---|---|
varchar2, nchar2, nvarchar2, nclob | varchar, text |
clob, long | varchar, text |
blob, raw, long raw | bytea |
number | numeric, integer, bigint, smallint, real, double precision |
date | date, timestamp |
binary float | real |
binary double | double precision |
Boolean type #
Oracle does not handle a proper boolean type. Depending on the application
developper, a CHAR
or a NUMBER
type is used to emulate the behavior. Ora2Pg
can handle the transformation of a boolean value to PostgreSQL.
Also, note that the lack of boolean value in Oracle can result in some pains with
ORM like Hibernate if the datatypes are not ported correctly. The column seen
configured as a boolean on the ORM side will be stored as a NUMBER
or CHAR
type
on the Oracle side. If no care has been taken to transform the boolean type and
values accordingly, the ORM will look for a boolean value in PostgreSQL and will
find something different, resulting in an error. More generally, when using ORM,
it is better to start from a database schema generated by the application and
convert the data accordingly.
SELECT true AND false;
-- -[ RECORD 1 ]
-- ?column? | f
SELECT true OR false;
-- -[ RECORD 1 ]
-- ?column? | t
Character types #
Oracle implements a specific datatype, VARCHAR2
. This type does not handle an
empty string and NULL
according the SQL standard, as an empty string and a NULL
are both seen as a NULL
value. Thus, in Oracle, when you compare a value with ''
,
this is equivalent to IS NULL
.
On the contrary, PostgreSQL provides an SQL standard conformant varchar datatype:
comparing to ''
is something distinct from IS NULL
.
There is another subtle difference regarding length limit, but it usually is not
a problem during porting. While Oracle’s length limit is by default expressed in
bytes (it can be changed with NLS_LENGTH_SEMANTICS), PostgreSQL length limit is
expressed in characters. Moreover, in PostgreSQL, one can use a varchar
datatype
without providing a limit, which is equivalent to PostgreSQL’s text
datatype.
In this case, the string will be limited to 1GB.
While porting applications, another subtle differences can appear while
comparing char
and varchar
datas. This is not clearly described in the SQL
standard and each RDBMS has its own interpretation: while comparing thoses
types, PostgreSQL converts each data from varchar
to char
. Some empty spaces
(padding) will be added on the right of varchar
string, in order to obtain
the same length as char(x)
.
For example:
SELECT 'foo'::CHAR(5)='foo '::VARCHAR(5);
-- ?column?
-- ----------
-- t
SELECT 'foo '::CHAR(5)='foo'::VARCHAR(5);
-- ?column?
-- ----------
-- t
SELECT 'foo '::VARCHAR(5)='foo'::VARCHAR(5);
-- ?column?
-- ----------
-- f
This feature can have undesirable effects on execution plans:
CREATE TABLE t1 (a VARCHAR(5));
INSERT INTO t1 SELECT generate_series(1,10000);
CREATE INDEX idx1 ON t1(a);
EXPLAIN SELECT * FROM t1 WHERE a='foo'::CHAR(5);
-- QUERY PLAN
-- ----------------------------------------------------
-- Seq Scan on t1 (cost=0.00..170.00 rows=1 width=4)
-- Filter: ((a)::bpchar = 'foo '::character(5))
This query can’t take advantage of the index on column a
: each value must be
converted to char
(in fact to bpchar
, which is the internal type for char
)
before being compared to the right argument.
This problem appears frequently when the query is embedded into a function, and
when the function parameters are of type char(x)
. For example, the following
function can be problematic:
CREATE OR REPLACE FUNCTION public.demo_char(p1 CHARACTER)
RETURNS CHARACTER VARYING
LANGUAGE plpgsql
AS $function$
DECLARE
v1 VARCHAR;
BEGIN
SELECT INTO v1 a FROM t1 WHERE a=p1;
RETURN v1;
END
$function$
In this function, the SELECT
statement will always do a sequential scan of
table t1
because of the implicit casting issue.
References:
Character encoding and collation #
In PostgreSQL, Server-side encoding is set at the database-level. A session
variable named client_encoding
can define the encoding at the client-level.
Its default value is set accordingly to the database encoding, but can be
redefined dynamically at session start with SET
, for example:
SET client_encoding = UTF8
.
Regarding collation handling, some improvements were made since PostgreSQL 8.4:
- collation global to the database cluster before release 8.4;
- collation per database since release 8.4;
- collation per column/index/SQL query since release 9.1.
References:
Temporal types #
Oracle supports several datatypes to manipulate temporal data:
DATE
, encodes both date and time, with a resolution of a second;TIMESTAMP
, encodes date and time but with a better resolution than what PostgreSQL can handle (nanosecond);TIMESTAMP WITH TIME ZONE
, same as above, with information about the time zone;INTERVAL
, can have 2 resolutions: year/month or days/seconds.
PostgreSQL, on the other side, implements the following datatypes:
date
, encodes a date, conforming to the SQL standard;timestamp
, encodes date and time, with a resolution up to 1 microsecond;time
, encodes time only, with a resolution up to 1 microsecond;interval
which offers a resolution up to 1 microsecond;time
andtimestamp
can optionally hold time zone information by addingwith time zone
keywords, like in Oracle.
The problem when porting data typed as DATE
in Oracle resides in knowning if it
holds only the date part or it also holds the time part.
SELECT ('1970-01-01'::DATE
+ '15 YEARS 3 MONTHS 2 DAYS 1 HOUR 23 MINUTES'::INTERVAL)
AS calc_interval;
-- -[ RECORD 1 ]-+--------------------
-- calc_interval | 1985-04-03 01:23:00
References:
Composite types #
All of the types that can be defined by a user are supported, but may require some adaptation. One could need to define input/output functions for these types, to manage selects and inserts on this peculiar data type. In most cases, the types are composite or array types that are completely supported by PostgreSQL.
The following composite type for Oracle:
CREATE OR REPLACE TYPE phone_t AS OBJECT (
a_code CHAR(3),
p_number CHAR(8)
);
and the same for PostgreSQL:
CREATE TYPE phone_t AS (
a_code CHAR(3),
p_number CHAR(8)
);
The following example using an array:
CREATE OR REPLACE TYPE phonelist AS VARRAY(50) OF phone_t;
will be translated to:
CREATE TYPE phonelist AS (phonelist phone_t[50]);