Xapian with Omega: works on FreeBSD (not on FreeNAS)

Install the required ports:

cd /usr/ports/www/xapian-omega

make install

cd /usr/ports/graphics/xpdf

make install

*** In the options dialog, uncheck X11 and DRAW

cd /usr/ports/textproc/catdoc

make install

*** In the options dialog, uncheck WORDVIEW

cd /usr/ports/archivers/unzip

make install

cd /usr/ports/archivers/gzip

make install

cd /usr/ports/textproc/antiword

make install

cd /usr/ports/textproc/unrtf

make install

cd /usr/ports/textproc/catdvi

make install
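
The eight ports above can also be built in one pass. This is only a minimal sketch, assuming the standard /usr/ports tree; the names PORTS, DRY_RUN and build_ports are made up for illustration. By default it only prints what it would run. The options dialogs for xpdf and catdoc will most likely still appear during a real build.

```shell
#!/bin/sh
# Ports from the steps above, in the same order.
PORTS="www/xapian-omega graphics/xpdf textproc/catdoc archivers/unzip
archivers/gzip textproc/antiword textproc/unrtf textproc/catdvi"

# Hypothetical switch: 1 = only print the commands, 0 = really build.
DRY_RUN=${DRY_RUN:-1}

build_ports() {
    for port in $PORTS; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "cd /usr/ports/$port && make install"
        else
            # Subshell keeps the cd local; stop at the first failed build.
            (cd "/usr/ports/$port" && make install) || return 1
        fi
    done
}

build_ports
```

Run it with DRY_RUN=0 in the environment to start the actual builds.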

======================

The builds take a long time, so there is plenty of time for coffee or tea. Meanwhile, edit the config files in another terminal:

ee /usr/local/etc/apache22/httpd.conf

        Change: ScriptAlias /cgi-bin/ "/usr/local/www/apache22/cgi-bin/"

        Into: ScriptAlias /cgi-bin/ "/usr/local/www/xapian-omega/cgi-bin/"

***** Change only apache22 to xapian-omega. Use plain straight quotes ("); typographic quotes (“ ”) are bad in the config.
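
The same change can be sketched with sed instead of the editor. This is an illustration, not a required step: the helper name fix_scriptalias is made up, and it reads the config on stdin and writes the result to stdout so the original file is never touched directly.

```shell
#!/bin/sh
# Rewrite the path only on ScriptAlias lines; every other line passes
# through unchanged. Reads httpd.conf text on stdin, writes to stdout.
fix_scriptalias() {
    sed '/^[[:space:]]*ScriptAlias/s|/usr/local/www/apache22/cgi-bin/|/usr/local/www/xapian-omega/cgi-bin/|'
}

# Demonstration on a single line with plain ASCII double quotes.
echo 'ScriptAlias /cgi-bin/ "/usr/local/www/apache22/cgi-bin/"' | fix_scriptalias
```

To use it on the real file, redirect /usr/local/etc/apache22/httpd.conf into the function, write the output to a temporary file, compare the two with diff, and copy the new file over only if the diff shows the one expected line.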

Create a new file:

ee /usr/local/etc/apache22/Includes/xapian.conf

Alias /something /path/to/something
<Directory "/path/to/something">
       Options Indexes
       AllowOverride None
       Order allow,deny
       Allow from all
</Directory>
<Directory "/usr/local/www/xapian-omega/cgi-bin/">
   AllowOverride None
   Options None
   Order allow,deny
   Allow from all
</Directory>

Create the holding directories:

mkdir -p /var/lib/omega/data/

mkdir -p /var/lib/omega/cdb/

Copy over the templates.

cp -rfv /usr/ports/www/xapian-omega/work/xapian-omega-<version number>/templates /var/lib/omega/

Tell Xapian-Omega where to look for the files.

Create the file:

ee /usr/local/www/xapian-omega/cgi-bin/omega.conf

# Directory containing Xapian databases:

database_dir /var/lib/omega/data

# Directory containing OmegaScript templates:

template_dir /var/lib/omega/templates

# Directory to write Omega logs to:

log_dir /var/log/omega

# Directory containing any cdb files for the $lookup OmegaScript command:

cdb_dir /var/lib/omega/cdb
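
omega.conf also names /var/log/omega as log_dir, which none of the steps above create. The sketch below creates it together with the data and cdb directories, on the assumption that Omega does not create the log directory itself. The function name make_omega_dirs and its optional prefix argument are made up so the idea can be tried in a scratch location first.

```shell
#!/bin/sh
# Create the directories that omega.conf points at.
# $1 is a hypothetical prefix for a trial run; leave it out to
# create the real paths under /.
make_omega_dirs() {
    root=${1:-}
    for dir in /var/lib/omega/data /var/lib/omega/cdb /var/log/omega; do
        mkdir -p "$root$dir" || return 1
    done
}

# Trial run in a temporary directory, then clean up.
scratch=$(mktemp -d)
make_omega_dirs "$scratch" && ls "$scratch/var/lib/omega" "$scratch/var/log"
rm -rf "$scratch"
```

The log directory probably also needs to be writable by the user Apache runs as (www on a default FreeBSD install, which is an assumption about this setup).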

Create a search page. I’ll just use index.html in Apache’s default DocumentRoot

ee /usr/local/www/apache22/data/index.html

<html>
<head>
<title>Intranet Search</title>
</head>
<body bgcolor="#ffffff">
<FORM NAME=P METHOD=GET
ACTION="/cgi-bin/omega" TARGET="_top">
<center>
<INPUT NAME=P VALUE="" SIZE=65>
<INPUT TYPE=SUBMIT VALUE="Search">
<hr>
<INPUT TYPE=radio NAME=DEFAULTOP VALUE=or > Match any word
<INPUT TYPE=radio NAME=DEFAULTOP VALUE=and CHECKED> Match all words
</center><br>
<INPUT TYPE=hidden NAME=DB VALUE="default">
<INPUT TYPE=hidden NAME=FMT VALUE="query">
<INPUT TYPE=hidden NAME=xDB VALUE="default">
<INPUT TYPE=hidden NAME=xFILTERS VALUE="--O">
</FORM>
<hr>
</body>
</html>
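
The form above sends a GET request to /cgi-bin/omega, so once an index exists the same search can be requested from a terminal. A minimal sketch: omega_url is a made-up helper, the address 192.0.2.10 is a placeholder, and the parameters P, DEFAULTOP, DB and FMT are the ones from the form.

```shell
#!/bin/sh
# Build the URL that the search form would request.
# Only spaces are encoded (as "+"); other special characters are not handled.
omega_url() {
    host=$1
    query=$(printf '%s' "$2" | tr ' ' '+')
    echo "http://$host/cgi-bin/omega?P=$query&DEFAULTOP=and&DB=default&FMT=query"
}

# 192.0.2.10 is a placeholder address, not a real server.
omega_url 192.0.2.10 "annual report"
```

The printed URL can be opened in a browser or passed to fetch -o - on FreeBSD to see the raw result page.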

Try the indexer by hand. Run:

/usr/local/bin/omindex --db /var/lib/omega/data/default --url /something /path/to/something --depth-limit=0

omindex command syntax

Provided by: xapian-omega_1.2.4-1_i386 

NAME
       omindex - Index static website data via the filesystem

SYNOPSIS
       omindex [OPTIONS] --db DATABASE [BASEDIR] DIRECTORY

DESCRIPTION
       omindex - Index static website data via the filesystem

OPTIONS
       -d, --duplicates
              set duplicate handling ('ignore' or 'replace')

       -p, --no-delete
              skip the deletion of documents corresponding to deleted files
              (--preserve-nonduplicates is a deprecated alias for --no-delete)

       -D, --db
              path to database to use

       -U, --url
              base url DIRECTORY represents (default: /)

       -M, --mime-type=EXT:TYPE
              map file extension EXT to MIME Content-Type TYPE
              (empty TYPE removes any MIME mapping for EXT)

       -F, --filter=TYPE:CMD
              process files with MIME Content-Type TYPE using command CMD,
              which should produce UTF-8 text on stdout, e.g.
              -Fapplication/octet-stream:'strings -n8'

       -l, --depth-limit=LIMIT
              set recursion limit (0 = unlimited)

       -f, --follow
              follow symbolic links

       -S, --spelling
              index data for spelling correction

       -v, --verbose
              show more information about what is happening

       --overwrite
              create the database anew (the default is to update if the
              database already exists)

       -s, --stemmer=LANG
              set the stemming language, the default is 'english'. Possible
              values: danish dutch english finnish french german german2
              hungarian italian kraaij_pohlmann lovins norwegian porter
              portuguese romanian russian spanish swedish turkish (pass 'none'
              to disable stemming)

       -h, --help
              display this help and exit

       -V, --version
              output version information and exit
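
To show how the options above combine, here is a small wrapper sketch. The names OMINDEX and reindex are made up. The database path assumes database_dir from omega.conf (/var/lib/omega/data) with the database name default, and /something and /path/to/something are the same placeholders as in the Apache config. By default it only prints the command.

```shell
#!/bin/sh
# OMINDEX defaults to "echo omindex", so the script only prints the command.
# Set OMINDEX=/usr/local/bin/omindex to run the real indexer.
OMINDEX=${OMINDEX:-echo omindex}

# reindex DATABASE BASEURL DIRECTORY
reindex() {
    db=$1 url=$2 dir=$3
    # $OMINDEX is left unquoted on purpose so "echo omindex" splits in two words.
    $OMINDEX --db "$db" --url "$url" --stemmer=english --verbose "$dir"
}

reindex /var/lib/omega/data/default /something /path/to/something
```

Putting all options before the directory argument matches the order in the SYNOPSIS line.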


When I tested this on my virtual FreeBSD machine, it was connected to a file server over NFS, and I built the index on an NFS folder.

Comparing the size of the original folder with the size of its database gives a ratio of about 0.063 (index size divided by source size).

I think that is very good: for 100 GB of data I get about 6 GB of index tables.
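
The ratio can be recomputed for any folder with a few lines of shell. A minimal sketch: size_ratio is a made-up helper, and the example figure 6.3 is simply 100 x 0.063, which the text above rounds to 6 GB.

```shell
#!/bin/sh
# Ratio of index size to source size, three decimal places.
# Both arguments must use the same unit (for example kilobytes from du -sk).
size_ratio() {
    awk -v src="$1" -v idx="$2" 'BEGIN { printf "%.3f\n", idx / src }'
}

# Round figures from the text: 100 GB of source, about 6.3 GB of index.
size_ratio 100 6.3
```

For real directories, the two arguments can come from du -sk on the source folder and on the database folder, taking the first field of each output with cut -f1.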

Now fire up your browser and validate the result by browsing to the IP address of the server. If that works too, the last step is to add the indexing command to crontab so that the index is refreshed automatically. In my case once a day is enough; the entry below refreshes the index at 1:15 AM every night.

Edit crontab (/etc/crontab):

15 1 * * * root /usr/local/bin/omindex --db /var/lib/omega/data/default --url /something /path/to/something --depth-limit=0 > /var/log/index.log
