pgsql-5111

Version:

8.4.1

Bug Link:

http://postgresql.1045698.n5.nabble.com/BUG-5111-Segmentation-fault-if-to-tsvector-returns-empty-row-to-ts-stat-td2124993.html

Patch Link:

http://archives.postgresql.org/pgsql-committers/2009-10/msg00055.php

Symptom:

When the query passed to ts_stat yields only an empty tsvector (for example, to_tsvector of an empty string), the backend crashes with a segmentation fault.

How it is diagnosed:

Reproduced with the following query:

postgres=# SELECT * from ts_stat('SELECT to_tsvector(''simple'','''')');

server closed the connection unexpectedly

        This probably means the server terminated abnormally

        before or while processing the request.

The connection to the server was lost. Attempting reset: Failed.

Root Cause:

Brief:

When the input contains only an empty tsvector, the root node of the statistics tree (stat->root) is NULL, and ts_setup_firstcall dereferences it.

Detail:

pgsql/src/backend/utils/adt/tsvector_op.c

static void

ts_setup_firstcall(FunctionCallInfo fcinfo, FuncCallContext *funcctx,

                                   TSVectorStat *stat)

{

        TupleDesc        tupdesc;

        MemoryContext oldcontext;

        StatEntry  *node;

        funcctx->user_fctx = (void *) stat;

        oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);

        stat->stack = palloc0(sizeof(StatEntry *) * (stat->maxdepth + 1));

        stat->stackpos = 0;

        node = stat->root;

        /* find leftmost value */

+        if (node == NULL)

+                stat->stack[stat->stackpos] = NULL;

+        else

                for (;;)

                {

                        stat->stack[stat->stackpos] = node;

                        if (node->left)    /* <-- crash: node is NULL when the tree is empty */

                        {

                                stat->stackpos++;

                                node = node->left;

                        }

                        else

                                break;

                }

        ….

}
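
The guarded descent in the patch can be tried outside PostgreSQL with a small stand-alone sketch. Node and find_leftmost below are hypothetical names introduced only for illustration; they are not part of tsvector_op.c, and the sketch assumes the caller provides a stack large enough for the tree depth plus one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PostgreSQL's StatEntry; only the links matter here. */
typedef struct Node
{
    struct Node *left;
    struct Node *right;
    int          key;
} Node;

/*
 * Fill stack[] with the path from root down to its leftmost descendant and
 * return the index of the top entry.  For an empty tree, stack[0] is set to
 * NULL and 0 is returned, so the caller can see there is nothing to iterate.
 * The caller must supply a stack with room for (tree depth + 1) entries.
 */
static int
find_leftmost(Node *root, Node **stack)
{
    int pos = 0;

    assert(stack != NULL);

    if (root == NULL)
    {
        /* Guard: never dereference a NULL root. */
        stack[pos] = NULL;
        return pos;
    }

    for (;;)
    {
        stack[pos] = root;
        if (root->left == NULL)
            break;
        pos++;
        root = root->left;
    }
    return pos;
}
```

Calling find_leftmost(NULL, stack) returns 0 and leaves stack[0] as NULL instead of crashing, which mirrors what the added lines in the patch do for an empty statistics tree.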

Is there any log message?

Yes


Introduction:

A tsvector value is a sorted list of distinct lexemes:

postgres=# SELECT 'a fat cat sat on a mat and ate a fat rat'::tsvector;

                      tsvector

----------------------------------------------------

 'a' 'on' 'and' 'ate' 'cat' 'fat' 'mat' 'rat' 'sat'

The to_tsvector function goes further. It parses a textual document and returns a tsvector that lists the lexemes together with their positions in the document; it can also normalize words.

postgres=# SELECT to_tsvector('simple','a fat cat sat on a mat and ate a fat rat');

                                  to_tsvector                                  

-------------------------------------------------------------------------------

 'a':1,6,10 'and':8 'ate':9 'cat':3 'fat':2,11 'mat':7 'on':5 'rat':12 'sat':4

(1 row)

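ts_stat gathers statistics over a tsvector column. Its argument is the text of a query that returns a single tsvector column; for each word, ndoc is the number of documents (tsvectors) containing it and nentry is the total number of occurrences.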

postgres=# SELECT * from ts_stat('SELECT to_tsvector(''simple'',''a fat cat sat on a mat and ate a fat rat'')');

 word | ndoc | nentry

------+------+--------

 sat  |    1 |      1

 rat  |    1 |      1

 on   |    1 |      1

 mat  |    1 |      1

 fat  |    1 |      2

 cat  |    1 |      1

 ate  |    1 |      1

 and  |    1 |      1

 a    |    1 |      3

(9 rows)
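
The way ndoc and nentry accumulate can be sketched in a few lines of C. This is only an illustration under simplifying assumptions: the names Lexeme, WordStat, StatTable, add_document and find_word are hypothetical, and a fixed-size array with a linear scan replaces the tree that the real implementation keeps.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORDS 64    /* arbitrary cap chosen for this sketch */

/* One lexeme of a document and the number of positions it occupies. */
typedef struct
{
    const char *word;
    int         npos;
} Lexeme;

/* Per-word totals, named after the ts_stat output columns. */
typedef struct
{
    const char *word;
    int         ndoc;
    int         nentry;
} WordStat;

typedef struct
{
    WordStat rows[MAX_WORDS];
    int      nrows;
} StatTable;

/* Return the row for word, or NULL if the word has not been seen. */
static WordStat *
find_word(StatTable *t, const char *word)
{
    for (int j = 0; j < t->nrows; j++)
        if (strcmp(t->rows[j].word, word) == 0)
            return &t->rows[j];
    return NULL;
}

/*
 * Fold one document into the table.  Lexemes within a document are assumed
 * distinct, as in a tsvector.  An empty document (nlex == 0) adds nothing.
 */
static void
add_document(StatTable *t, const Lexeme *doc, size_t nlex)
{
    assert(t != NULL);

    for (size_t i = 0; i < nlex; i++)
    {
        WordStat *row = find_word(t, doc[i].word);

        if (row == NULL)
        {
            assert(t->nrows < MAX_WORDS);
            row = &t->rows[t->nrows++];
            row->word = doc[i].word;
            row->ndoc = 0;
            row->nentry = 0;
        }
        row->ndoc += 1;               /* one more document contains the word */
        row->nentry += doc[i].npos;   /* add its occurrences in this document */
    }
}
```

Feeding it the lexemes 'a' (3 positions) and 'fat' (2 positions) from the example gives ndoc = 1 for both, with nentry = 3 and 2 respectively, matching the rows above. An empty document leaves the table with no rows, which is the situation that triggered the crash.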