[pgsql] 5084

pgsql-5084

Version:

8.3.7

Bug Link:

http://postgresql.1045698.n5.nabble.com/BUG-5084-Query-gives-different-number-of-rows-depending-on-ORDER-BY-td2131558.html

Symptom:

Query with `where X=X’ clause gives different result (number of rows) depending on whether there is an ‘ORDER BY’.

How it is diagnosed:

Reproduced!

How to reproduce:

------ buggy result: -----------

postgres=# select * from (values (1),(null)) v(k) where k = k order by k;

---

(2 rows) -- the second row above was a NULL.

The expression ‘(VALUES (1),(NULL)) v(k)’ is to create a single column table named ‘v’ and column name ‘k’ (both ‘v’ and ‘k’ are aliases). There are two rows from the table, with the value 1 and NULL.

------ correct result: -----------

postgres=# select * from (values (1),(null)) v(k) where k = k;

---

(1 row) // without order by there’s one row

There is an inconsistency between the above two results (number of rows). According to the specification of postgres, the 2nd one is correct, that a single NULL result should be filtered.

Next we show the query plan, which further explains the failure:

------ buggy result: -----------

postgres=# explain select * from (values (1),(null)) v(k) where k = k order by k;

QUERY PLAN

-------------------------------------------------------------------

Sort (cost=0.04..0.04 rows=2 width=4)

Sort Key: "*VALUES*".column1

-> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=4)

(3 rows)

------ correct result: -----------

postgres=# explain select * from (values (1),(null)) v(k) where k = k;

QUERY PLAN

-------------------------------------------------------------

Values Scan on "*VALUES*" (cost=0.00..0.03 rows=1 width=4)

Filter: (column1 = column1)

// we can see that this filter is dropped in the query plan above

(2 rows)

So from the query plan we can see, in the incorrect execution (with ‘order by’), postgres did not apply any filter: (column1 = column1). Actually, this filter is the key to filter out the NULL results.

Root Cause:

It is caused by premature optimization.The filter that checks the ‘WHERE’ constraint will filter out the NULL result. But this filter was not used in the buggy version due to a premature optimization -- when postgres detects the ‘WHERE’ clause is in the form of (column1 = column1), it will skip any filtering process.

Detail:

static void distribute_qual_to_rels(...) {

...

if (maybe_equivalence) {

/* Here, it’s an optimization call ‘process_equivalence’. The function tries to

* return true, causing ‘distribute_qual_to_rels’ to return without

* calling ‘distribute_restrictinfo_to_rels’ below. ‘distribute_restrictinfo_to_rels’

* is to actually use the WHERE clause to filter the result. In the buggy execution,

* process_equivalence returned true because it sees the WHERE clause was x=x,

* thinking there is no need to enforce it. */

if (process_equivalence(root, restrictinfo, below_outer_join))

return;

/* EC rejected it, so pass to distribute_restrictinfo_to_rels */

}

…

/* No EC special case applies, so push it into the clause lists */

/* distribute_restrictinfo_to_rels is to strictly enforce the WHERE clause.

* In our case ‘WHERE x=x’, This is essentially unoptimized handling of equality */

distribute_restrictinfo_to_rels(root, restrictinfo);

}

bool process_equivalence(PlannerInfo *root, RestrictInfo *restrictinfo, bool below_outer_join) {

… …

/* The patch logic: if the WHERE clause is X=X, then we should return false, meaning

* we cannot do any optimization, and force postgres to call

* distribute_restrictinfo_to_rels to use the WHERE clause filtering the result. */
         /*
+          * Reject clauses of the form X=X. These are not as redundant as they
+          * might seem at first glance: assuming the operator is strict, this is
+          * really an expensive way to write X IS NOT NULL. So we must not risk
+          * just losing the clause, which would be possible if there is already
+          * a single-element EquivalenceClass containing X. The case is not
+          * common enough to be worth contorting the EC machinery for, so just
+          * reject the clause and let it be processed as a normal restriction
+          * clause.
+          */
+         if (equal(item1, item2))
+                 return false;                        /* X=X is not a useful equivalence */
+

…

/* Various optimization cases so that postgres does not need to really forcing the

* WHERE clause to filter result....*/

… ...

if (ec1 && ec2)

{

/* If case 1, nothing to do, except add to sources */

//The “with order by” query falls into this if! It thinks because

// ec1 == ec2 (x=x), we can optimize!!! While it is hard to anticipate

// the bug, they should log here just in case this is wrong!

if (ec1 == ec2)

{

ec1->ec_sources = lappend(ec1->ec_sources, restrictinfo);

ec1->ec_below_outer_join |= below_outer_join;

/* mark the RI as usable with this pair of EMs */

/* NB: can't set left_ec/right_ec until merging is finished */

restrictinfo->left_em = em1;

restrictinfo->right_em = em2;

return true;

}

* Case 2: need to merge ec1 and ec2.

...

}

For the test case without ‘order by’, the postgres’ logic was eventually still executing distribute_restrictinfo_to_rels to enforce the WHERE filter.

Failure type:

Wrong result

Is there any log message?

Can ErrLog insert a log message?

No. The information sufficient to diagnose the failure: within process_equivalence, ec1 == ec2 (same as item1 == item2). It’s desirable to log at

if (list_length(ec->ec_members) <= 1)

continue;

, which is the diverting point between “with order by” and “without order by”. But there’s not generic pattern for this.

Additional Elaboration:

The WHERE clause will result in a filter in postgres, which is to check whether it satisfies the WHERE constraint. It will also check each value against NULL thus will discard those NULL results.

And this filter (column1 = column1) is to be applied on each row, which is to check whether it satisfies the WHERE constraint. It will also check each value against NULL thus will discard those NULL results. However, if there is ‘ORDER BY’, postgres will build a data structure ‘EquivalenceClass’ to sort on the ORDER BY column. However, when building this data structure, it applied an optimization, that it identified ‘column1 = column1’ is unnecessary, thus completely discard the checking against this constraint. This works fine in most cases, but when the return result is a single NULL, it will not be checked against any constraint at all before returning to users.

In function process_equivalence(equivclass.c),

1. Without ORDER BY, process_equivalence generates an equivalence class

that lists k twice. This is pretty bogus but it happens to produce the

desired results in the example at hand. (In some other cases you'll get

redundant clauses out, because the eclass machinery isn't expecting

this.)

2. With ORDER BY k, the code first creates a single-element equivalence

class containing k, because it needs that to represent the desired

pathkey. Then, process_equivalence finds that both sides of the k = k

clause are already known to be in the same eclass, so it concludes that

this is redundant information.

The original coding correctly noted that these aren't just redundancies
(they're effectively X IS NOT NULL, assuming = is strict). However, they got treated that way if X happened to be in a single-member EquivalenceClass already, which could happen if there was an ORDER BY X clause, for instance.
The simplest and most reliable solution seems to be to not try to process such clauses through the EquivalenceClass machinery; just throw them back for traditional processing.

bool

process_equivalence(PlannerInfo *root, RestrictInfo *restrictinfo,

bool below_outer_join)

{

Expr *clause = restrictinfo->clause;

Oid opno,item1_type,item2_type;

Expr *item1;

Expr *item2;

Relids item1_relids,item2_relids;

List *opfamilies;

EquivalenceClass *ec1,*ec2;

EquivalenceMember *em1, *em2;

ListCell *lc1;

/* Extract info from given clause */

Assert(is_opclause(clause));

opno = ((OpExpr *) clause)->opno;

item1 = (Expr *) get_leftop(clause);

item2 = (Expr *) get_rightop(clause);

item1_relids = restrictinfo->left_relids;

        item2_relids = restrictinfo->right_relids;
        /*
+          * Reject clauses of the form X=X. These are not as redundant as they
+          * might seem at first glance: assuming the operator is strict, this is
+          * really an expensive way to write X IS NOT NULL. So we must not risk
+          * just losing the clause, which would be possible if there is already
+          * a single-element EquivalenceClass containing X. The case is not
+          * common enough to be worth contorting the EC machinery for, so just
+          * reject the clause and let it be processed as a normal restriction
+          * clause.
+          */
+         if (equal(item1, item2))
+                 return false;                        /* X=X is not a useful equivalence */
+

...

* Sweep through the existing EquivalenceClasses looking for matches to

* item1 and item2. These are the possible outcomes:

* 1. We find both in the same EC. The equivalence is already known, so

* there's nothing to do.

* 2. We find both in different ECs. Merge the two ECs together.

* 3. We find just one. Add the other to its EC.

* 4. We find neither. Make a new, two-entry EC.

* Note: since all ECs are built through this process, it's impossible

* that we'd match an item in more than one existing EC. It is possible

* to match more than once within an EC, if someone fed us something silly

* like "WHERE X=X". (However, we can't simply discard such clauses,

* since they should fail when X is null; so we will build a 2-member EC

* to ensure the correct restriction clause gets generated. Hence there

* is no shortcut here for item1 and item2 equal.) ?

ec1 = ec2 = NULL;

em1 = em2 = NULL;

foreach(lc1, root->eq_classes)

{

...

}

/* Sweep finished, what did we find? */

if (ec1 && ec2)

{

/* If case 1, nothing to do, except add to sources */

if (ec1 == ec2) //The “with order by” query falls into this if!

//fix me:

//If we think this case is also an optimization, can we discard this

//and let it go to the case 2?

{

ec1->ec_sources = lappend(ec1->ec_sources, restrictinfo);

ec1->ec_below_outer_join |= below_outer_join;

/* mark the RI as usable with this pair of EMs */

/* NB: can't set left_ec/right_ec until merging is finished */

restrictinfo->left_em = em1;

restrictinfo->right_em = em2;

return true;

}

* Case 2: need to merge ec1 and ec2.

...

}

else if (ec1)

{

/* Case 3: add item2 to ec1 */

...

}

else if (ec2)

{

/* Case 3: add item1 to ec2 */

...

}

else

{ // The “without order by” query falls into this if!

/* Case 4: make a new, two-entry EC */

EquivalenceClass *ec = makeNode(EquivalenceClass);

ec->ec_opfamilies = opfamilies;

ec->ec_members = NIL;

ec->ec_sources = list_make1(restrictinfo);

ec->ec_derives = NIL;

ec->ec_relids = NULL;

ec->ec_has_const = false;

ec->ec_has_volatile = false;

ec->ec_below_outer_join = below_outer_join;

ec->ec_broken = false;

ec->ec_sortref = 0;

ec->ec_merged = NULL;

em1 = add_eq_member(ec, item1, item1_relids, false, item1_type);

em2 = add_eq_member(ec, item2, item2_relids, false, item2_type);

root->eq_classes = lappend(root->eq_classes, ec);

/* mark the RI as usable with this pair of EMs */

restrictinfo->left_em = em1;

restrictinfo->right_em = em2;

}

return true;

}

As for the two different cases in process_equivalence between “with order by” and “without order by”, they divert when they keep on to the following code:

/*Ding: where is this generate_base_implied_equalities called from ?*/

void generate_base_implied_equalities(PlannerInfo *root)

{

ListCell *lc;

Index rti;

foreach(lc, root->eq_classes)

{

EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc);

Assert(ec->ec_merged == NULL); /* else shouldn't be in list */

Assert(!ec->ec_broken); /* not yet anyway... */

/* Single-member ECs won't generate any deductions */

/*This is exactly the case for the “with order by” query!*/

if (list_length(ec->ec_members) <= 1)

continue;

if (ec->ec_has_const)

generate_base_implied_equalities_const(root, ec);

else /*The“without order by” query will fall here*/

generate_base_implied_equalities_no_const(root, ec);

/* Recover if we failed to generate required derived clauses */

if (ec->ec_broken)

generate_base_implied_equalities_broken(root, ec);

}

generate_base_implied_equalities_no_const

will also eventually call distribute_restrictinfo_to_rels, remember the patch’s execution, that’s why the patch now behaves like the “without order by”!