pgsql-4741

Version:

8.3.7

Bug Link:

http://postgresql.1045698.n5.nabble.com/BUG-4741-Domain-constraint-violation-with-trigger-td2126549.html

Symptom:

When we put a null into the not nullable domain in a trigger(A database trigger is procedural code that is automatically executed in response to certain events on a particular table or view in a database), postgresql does not throw an error.

A domain is essentially a data type with optional constraints , i.e. restrictions on the allowed set of values.

Failure type:

Wrong result (Accepting invalid input)

Is there any log message?

No

How it is diagnosed:

Reproduced!

How to reproduce:

-- Then we create a domain.

$ CREATE DOMAIN test_domain AS varchar DEFAULT '' NOT NULL;

CREATE DOMAIN

-- Next we create a table. Note that the column ‘text’ is of type ‘text_domain’, which has the constraint NOT NULL as above.

$ CREATE TABLE test(id integer not null, text test_domain);

CREATE TABLE

-- We first create a function that is used as a trigger. The trigger is executed on every insert into table test.

$ CREATE LANGUAGE plpgsql;

CREATE LANGUAGE

$  CREATE OR REPLACE FUNCTION test_function() RETURNS TRIGGER AS $function$

    BEGIN

        SELECT NULL INTO new.text;

        RETURN new;

    END;

    $function$ LANGUAGE plpgsql;

CREATE FUNCTION

$ CREATE TRIGGER test_trigger BEFORE INSERT ON test FOR EACH ROW EXECUTE

   PROCEDURE test_function();

CREATE TRIGGER

--- Now, as all the necessary data are created, the table with the same trigger AND a domain let a null value go into the

$ INSERT INTO test VALUES(1);

INSERT 0 1

-- Here, it should reject this input by issuing: ERROR:  null value in column "text" violates not-null constraint

$ SELECT * FROM test WHERE text IS null;

--  id | text

-- ----+------

--   1 |

-- (1 row)

-- However, if we do not create the domain, inserting a null value would throws an error

$ DROP TABLE IF EXISTS test;

DROP TABLE

$ CREATE TABLE test(id integer not null, text varchar default '' not null);

CREATE TABLE

$ CREATE TRIGGER test_trigger BEFORE INSERT ON test FOR EACH ROW EXECUTE

    PROCEDURE test_function();

CREATE TRIGGER

$ INSERT INTO test VALUES(1);

ERROR:  null value in column "text" violates not-null constraint

Root Cause:

The NOT-NULL constraint is mistakenly disabled by an premature optimization when processing the creation of ‘DOMAIN’.

In the buggy execution, the constraint on the column ‘text’ was expressed in the ‘domain’ declaration. In the code logic, this constraint will be casted to the table column when postgres is processing the ‘domain’ creation, by function ‘exec_simple_cast_value’. However, due to an immature optimization, postgres disabled this casting, thus later when function ‘ExecConstraints’ checks whether the inserted value conforms the constraint, it is not even aware of the NOT-NULL constraint.

The flow of a insert query with triggers execution is usually:

exec_simple_query(){...

   PortalRun(){...

      PortalRunMulti(){...

         ProcessQuery(){...

            ExecutorRun(){...

                ExecutePlan(){...

                   ExecInsert(){...

                        ExecBRInsertTriggers(){...

   ExecCallTriggerFunc(){...

      exec_stmt(){...

         exec_stmt_execsql(){...

    exec_assign_value(){...                  

        exec_simple_cast_value()

    …}

 …}

                              …}

                           …}

                        …}

ExecConstraints()

                   ...}

               …}

           …}

        …}

    …}

…}

ExecConstrains() will check if one of the attribute of slot (i.e column) has a constraint of notnull and if so whether the value conforms to it.

When the table has a domain, its second attribute(i.e (domain)NULL ) has no notnull constraint itself (otherwise the table should be defined as test(id integer not null, text test_domain not null)), the notnull constraint is defined within the domain. When parsing the domain creation, exec_simple_cast_value is supposed to cast the notnull constraint onto the table’s attribute. But exec_simple_cast_value() mistakenly supposed that it could bypass casting effort whenever the input value was NULL.  This prevents application of not-null domain constraints in the cases that use this function. So the patch is simple, remove the optimization of null input value checking, the call of exec_cast_value within exec_simple_cast_value() will do the checking for a domain.

/* ----------

 * exec_simple_cast_value                        Cast a value if required

 */

static Datum

exec_simple_cast_value(Datum value, Oid valtype,

                                           Oid reqtype, int32 reqtypmod,

                                           bool isnull)

{

-        if (!isnull) //should remove this optimization

-        {

                if (valtype != reqtype || reqtypmod != -1)

                {

                        … …

                        /* Actual casting */

                        value = exec_cast_value(... ...);

                }

-        }

        return value;

}

Datum

domain_in(PG_FUNCTION_ARGS)

{

        char           *string;

        ...

        if (PG_ARGISNULL(0))

                string = NULL;//string is NULL

        else

                string = PG_GETARG_CSTRING(0);

        if (PG_ARGISNULL(1))

                PG_RETURN_NULL();

        domainType = PG_GETARG_OID(1);

        ...

        /*

         * Do the necessary checks to ensure it's a valid domain value.

         */

        domain_check_input(value, (string == NULL), my_extra);

        if (string == NULL)

                PG_RETURN_NULL();

        else

                PG_RETURN_DATUM(value);

}

domain_check_input(Datum value, bool isnull, DomainIOData *my_extra)

{

        ExprContext *econtext = my_extra->econtext;

        ListCell   *l;

        foreach(l, my_extra->constraint_list)

        {

                DomainConstraintState *con = (DomainConstraintState *) lfirst(l);

                switch (con->constrainttype)

                {

                        case DOM_CONSTRAINT_NOTNULL:

                                if (isnull)

                                        ereport(ERROR, (errcode(ERRCODE_NOT_NULL_VIOLATION), errmsg("domain %s does not allow null values",format_type_be(my_extra->domain_type))));

//the correct version should print this error message.

                                break;

        ...

}

void

ExecConstraints(ResultRelInfo *resultRelInfo,

                                TupleTableSlot *slot, EState *estate)

{

        … ...

        if (constr->has_not_null)

        {

                

                        if (rel->rd_att->attrs[attrChk - 1]->attnotnull &&

                                slot_attisnull(slot, attrChk))

ereport(ERROR,                                (errcode(ERRCODE_NOT_NULL_VIOLATION),

errmsg("null value in column \"%s\" violates not-null constraint",

NameStr(rel->rd_att->attrs[attrChk - 1]->attname))));

                

        }

        …

}

=======

assume we know ExecStoreTuple is output point

ExecInsert(...)

{

        ...

        if (...)

        {

                HeapTuple        newtuple;

                newtuple = ExecBRInsertTriggers(estate, resultRelInfo, tuple);

                if (newtuple == NULL)        /* "do nothing" */

                        return;

                if (newtuple != tuple)        /* modified by Trigger(s) */

                {

                        TupleTableSlot *newslot = estate->es_trig_tuple_slot;

                        ...

                        ExecStoreTuple(newtuple, newslot, InvalidBuffer, false);

                        slot = newslot;

                        tuple = newtuple;

                }

                ...

        }

}

--------

ExecBRInsertTriggers(EState *estate...)

{

                …

                newtuple = ExecCallTriggerFunc(...);

                if (oldtuple != newtuple && oldtuple != trigtuple)

                        heap_freetuple(oldtuple);

                if (newtuple == NULL)

                        break;

        }

        return newtuple;

}

------

ExecCallTriggerFunc(...)

{

        …

        result = FunctionCallInvoke(&fcinfo);

        …

        return (HeapTuple) DatumGetPointer(result);

}

-------

Assume we know the function pointer will call

plpgsql_call_handler(PG_FUNCTION_ARGS)

{

        if (CALLED_AS_TRIGGER(fcinfo))

                retval = PointerGetDatum(plpgsql_exec_trigger(...));

}

plpgsql_exec_trigger(...)

{

        …

        rc = exec_stmt_block(&estate, func->action);

        …

        /* Copy tuple to upper executor memory */

        rettup = SPI_copytuple((HeapTuple) (estate.retval));

        …

        return rettup;

}

-----

exec_stmt_block

{

        …

        rc = exec_stmts(estate, block->body);

        ...

}

---

exec_stmts(PLpgSQL_execstate *estate, List *stmts)

{

        ...

        rc = exec_stmt(estate, stmt);

        ...

}

---

exec_stmt(...)

{

        ….

        case PLPGSQL_STMT_EXECSQL:

                rc = exec_stmt_execsql(estate, (PLpgSQL_stmt_execsql *) stmt);

                break;

        …

        return rc;

}

--------

exec_stmt_execsql(...)

{

        ...

        rc = SPI_execute_plan(expr->plan, values, nulls,

                                                  estate->readonly_func, tcount);

        …

        /* Put the first result row into the target */

        exec_move_row(estate, rec, row, tuptab->vals[0], tuptab->tupdesc);

        …

}

-------

exec_move_row(...)

{

        ...

        exec_assign_value(estate, (PLpgSQL_datum *) var,

                                                          value, valtype, &isnull);

}

exec_assign_value(PLpgSQL_execstate *estate,...)

{

        rec = (PLpgSQL_rec *) (estate->datums[recfield->recparentno]);

        values[fno] = exec_simple_cast_value(...)  <------last write

        ...

        newtup = heap_formtuple(rec->tupdesc, values, nulls);

        if (rec->freetup)

                        heap_freetuple(rec->tup);

        rec->tup = newtup;

        rec->freetup = true;

}

Additional Comments:

The developer says in the commit comment:

Since this function isn't meant for use in performance-critical paths anyway, this certainly seems like
another case of "
premature optimization is the root of all evil".

Can we anticipate error?

It’s an interesting case of ‘premature optimization. Developers should be advised of the trickiness and log.

But it’s hard to enforce this good practice.

How can it be automatically handled?

In this case, it would be very helpful if we know that the function exec_simple_cast_value

was called with variable ‘isnull’ equal to ‘true’. However, identify this automatically would be challenging.

This important variable value is used in condition.