Summary

These are instructions to set up and run the FIS GT.M threeen1f benchmark on an x86 GNU/Linux system.  A complete functional specification of the program can be found at http://ksbhaskar.blogspot.com/2010/06/3n1-nosqlkey-valueschema-freesche.html

Commands that you would type in are shown after a shell prompt: $ for an ordinary user and # for root.

Download and Install GT.M

You only need to do this once.

Download the latest release of GT.M for the x86 architecture from SourceForge (http://sourceforge.net/projects/fis-gtm).  For example, the latest release as of today (October 8, 2010) is V5.4-001.  If you have 64-bit hardware and a 64-bit OS, you can run either the 32-bit or the 64-bit GT.M.  To download GT.M to a temporary directory, you can use a command such as:

$ wget -P /tmp http://sourceforge.net/projects/fis-gtm/files/GT.M-amd64-Linux/V5.4-001/gtm_V54001_linux_x8664_pro.tar.gz
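If you script the download, a small helper can pick the tarball that matches the machine.  The sketch below is only an illustration: gtm_tarball_for is a hypothetical name, the two file names are the V5.4-001 names used in these instructions, and it simply picks the native build even though a 64-bit system can also run the 32-bit one.

```shell
# gtm_tarball_for MACHINE: hypothetical helper (not part of GT.M).
# Maps a machine name, as printed by `uname -m`, to a V5.4-001 tarball name.
gtm_tarball_for() {
    case "$1" in
        x86_64) echo "gtm_V54001_linux_x8664_pro.tar.gz" ;;
        i?86)   echo "gtm_V54001_linux_i686_pro.tar.gz" ;;
        *)      echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}
```

For example, tarball=$(gtm_tarball_for "$(uname -m)") stores the name for later use with tar.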

After downloading it, create a temporary directory and unpack it there.

$ mkdir /tmp/tmp ; cd /tmp/tmp

$ tar zxf ../gtm_V54001_linux_x8664_pro.tar.gz

As root, install GT.M.  Unicode support is optional.

# sh ./configure

                     GT.M Configuration Script

Copyright 2009, 2010 Fidelity National Information Services, Inc. Use of this

software is restricted by the provisions of your license agreement.

What account should own the files? (bin)

Should execution of GT.M be restricted to a group? (y or n) n

In what directory should GT.M be installed? /usr/lib/fis-gtm/V5.4-001_x86_64

Directory /usr/lib/fis-gtm/V5.4-001_x86_64 does not exist. Do you wish to create it as part of

this installation? (y or n) y

Installing GT.M....

Should unicode support be installed? (y or n) y

Should an ICU version other than the default be used? (y or n) y

Enter ICU version (at least ICU version 3.6 is required. Enter as <major-ver>.<minor-ver>): 4.2

All of the GT.M MUMPS routines are distributed with uppercase names.

You can create lowercase copies of these routines if you wish, but

to avoid problems with compatibility in the future, consider keeping

only the uppercase versions of the files.

Do you want uppercase and lowercase versions of the MUMPS routines? (y or n)n

Compiling all of the MUMPS routines. This may take a moment.

GTM>

%GDE-I-GDUSEDEFS, Using defaults for Global Directory

        /usr/lib/fis-gtm/V5.4-001_x86_64/gtmhelp.gld

GDE>

GDE>

GDE>

%GDE-I-VERIFY, Verification OK

%GDE-I-GDCREATE, Creating Global Directory file

        /usr/lib/fis-gtm/V5.4-001_x86_64/gtmhelp.gld

GTM>

%GDE-I-GDUSEDEFS, Using defaults for Global Directory

        /usr/lib/fis-gtm/V5.4-001_x86_64/gdehelp.gld

GDE>

GDE>

GDE>

%GDE-I-VERIFY, Verification OK

%GDE-I-GDCREATE, Creating Global Directory file

        /usr/lib/fis-gtm/V5.4-001_x86_64/gdehelp.gld

Installation completed. Would you like all the temporary files

removed from this directory? (y or n) y

#

You can delete the temporary directory and the downloaded GT.M distribution after installing GT.M.

Benchmarking

Environment Variables

The operation of GT.M is controlled by environment variables.  You need to ensure that the environment variables are set up correctly every time you run GT.M.  For example, if you plan to run the benchmark in /testarea1/benchmark, the following will set up the environment variables correctly:

$ export gtmdir=/testarea1/benchmark

$ source /usr/lib/fis-gtm/V5.4-001_x86_64/gtmprofile

%GDE-I-GDUSEDEFS, Using defaults for Global Directory

        /testarea1/benchmark/V5.4-001_x86_64/g/gtm.gld

GDE>

%GDE-I-EXECOM, Executing command file /usr/lib/fis-gtm/V5.4-001_x86_64/gdedefaults

GDE>

%GDE-I-VERIFY, Verification OK

%GDE-I-GDCREATE, Creating Global Directory file

        /testarea1/benchmark/V5.4-001_x86_64/g/gtm.gld

Created file /testarea1/benchmark/V5.4-001_x86_64/g/gtm.dat

%GTM-I-JNLCREATE, Journal file /testarea1/benchmark/V5.4-001_x86_64/g/gtm.mjl created for region DEFAULT with BEFORE_IMAGES

%GTM-I-JNLSTATE, Journaling state for region DEFAULT is now ON

$

If the database already exists, sourcing gtmprofile produces no output.

$ export gtmdir=/testarea1/benchmark

$ source /usr/lib/fis-gtm/V5.4-001_x86_64/gtmprofile

$
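Because a forgotten gtmprofile is an easy mistake to make, it can help to check the environment before each run.  The following is a minimal sketch: check_gtm_env is a hypothetical helper, gtmdir and gtmver are the variables used in these instructions, and gtm_dist and gtmgbldir are standard GT.M variables that are assumed here to be set by gtmprofile.

```shell
# check_gtm_env NAME...: hypothetical helper (not part of GT.M).
# Reports each named environment variable that is unset or empty and
# returns non-zero if at least one is missing.
check_gtm_env() {
    missing=0
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "not set: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A typical call would be check_gtm_env gtmdir gtmver gtm_dist gtmgbldir, run right after sourcing gtmprofile.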

Configure the Benchmark Directory

You only need to do this once.  If you want to configure other GT.M parameters to evaluate their impact on throughput, you can do so as appropriate.

$ mumps -run GDE

%GDE-I-LOADGD, Loading Global Directory file

        /testarea1/benchmark/V5.4-001_x86_64/g/gtm.gld

%GDE-I-VERIFY, Verification OK

GDE> change -segment DEFAULT -global_buffer_count=65536

GDE> exit

%GDE-I-VERIFY, Verification OK

%GDE-I-GDUPDATE, Updating Global Directory file

        /testarea1/benchmark/V5.4-001_x86_64/g/gtm.gld

$

While the steps above do not create a new database file, a new database file will be created when you start a run following the instructions below.

Download and Install the threeen1f program

You only need to do this once.  Download the current version of the threeen1f program from SourceForge.

$ wget -P /tmp http://sourceforge.net/projects/fis-gtm/files/Benchmarking/threeen1/threeen1f.tgz

After downloading it, unpack the program in the r/ subdirectory of the benchmark directory.

$ tar zxf /tmp/threeen1f.tgz -C $gtmdir/r

Create an input file

Each run needs an input file.  At a minimum, each line of the input file must contain a starting integer for the range (1 is recommended) and an ending integer.  You can run the benchmark on different input sizes by providing multiple lines.  Each line can optionally have two more parameters: the number of parallel worker processes (the program will use this number or twice the number of CPUs, whichever is greater) and the chunk or block size, which is the number of integers a worker process works on at a time (the program will use the smaller of this number or the range divided by the number of worker processes).  Values on a line must be separated by a single space; the program does no input checking.  For example:

$ cat $gtmdir/threeen.dat

1 100000 8 5000

1 1000000 8 50000

1 10000000 8 50000

1 20000000 8 50000

$
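Since the program does no input checking, generating the file avoids stray spaces and typos.  This is a minimal sketch: make_threeen_input is a hypothetical helper that always starts the range at 1 and uses the same worker count and block size on every line.

```shell
# make_threeen_input WORKERS BLOCKSIZE LAST...: hypothetical helper.
# Prints one input line per ending integer, with fields separated by a
# single space and every range starting at 1.
make_threeen_input() {
    workers=$1
    blocksize=$2
    shift 2
    for last in "$@"; do
        printf '1 %s %s %s\n' "$last" "$workers" "$blocksize"
    done
}

make_threeen_input 8 50000 1000000 10000000 20000000
```

Redirect the output to $gtmdir/threeen.dat to use it as the input file for a run.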

On a system with 2-4GB RAM, ranges smaller than 1 through 1,000,000 will be CPU limited.  Ranges of 1 through 5,000,000 to around 20,000,000 are likely to give results based on a blend of file system caching and disk IO.  Ranges of 1 through 40,000,000 and above are likely to primarily test disk IO.
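The rules for the two optional input parameters (worker count and block size) can be worked through in shell arithmetic.  This sketch only restates the rules as described in this document; the function names are hypothetical, and it assumes that the size of the range is last - first + 1 and that the division is an integer division, neither of which has been checked against the program itself.

```shell
# effective_workers REQUESTED NCPUS
# Prints the greater of the requested worker count and twice the CPU count.
effective_workers() {
    doubled=$(( $2 * 2 ))
    if [ "$1" -gt "$doubled" ]; then echo "$1"; else echo "$doubled"; fi
}

# effective_block REQUESTED FIRST LAST WORKERS
# Prints the smaller of the requested block size and range / workers
# (assumption: range = LAST - FIRST + 1, integer division).
effective_block() {
    per_worker=$(( ($3 - $2 + 1) / $4 ))
    if [ "$1" -lt "$per_worker" ]; then echo "$1"; else echo "$per_worker"; fi
}

effective_workers 2 4          # 2 requested, 4 CPUs: prints 8
effective_block 500 1 1000 8   # 500 requested, 1..1000, 8 workers: prints 125
```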

Cleaning the environment

Each run of the test ideally starts with a clean environment.

$ rm -f *.mj[oe] $gtmdir/$gtmver/g/gtm.{dat,mjl*}

Additionally, since the test will generate a lot of journal files, if you are concerned about running out of space in the filesystem, you can run the following command to delete prior generation journal files.

$ rm -f $gtmdir/$gtmver/g/gtm.mjl_*

Run the benchmark

$ mupip create

Created file /testarea1/benchmark/V5.4-001_x86_64/g/gtm.dat

$ mupip set -journal=before -region DEFAULT

%GTM-I-JNLCREATE, Journal file /testarea1/benchmark/V5.4-001_x86_64/g/gtm.mjl created for region DEFAULT with BEFORE_IMAGES

%GTM-I-JNLSTATE, Journaling state for region DEFAULT is now ON

$ mumps -run threeen1f <$gtmdir/threeen.dat

1 100,000 8 5,000 350 1,570,824,736 2 218,523 318,523 109,262 159,262

1 1,000,000 8 50,000 524 56,991,483,520 16 2,171,377 3,171,377 135,711 198,211
...

You can stop the test at any time from a shell with a killall mumps command.  If you want to tweak other database parameters, the most likely place to set them is after the mupip create, but this depends on which parameters you wish to set.
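For repeated runs, the cleaning and run steps can be wrapped in one function.  The sketch below uses only the commands shown above; run_cmd, run_benchmark, and the DRY_RUN variable are hypothetical names introduced for illustration.  It assumes gtmprofile has already been sourced, and with DRY_RUN=1 it prints the commands instead of running them.

```shell
# run_cmd CMD...: hypothetical helper.
# Prints the command when DRY_RUN=1, otherwise runs it.
run_cmd() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# run_benchmark [INPUT_FILE]: one clean run of the benchmark.
# Assumes gtmprofile has been sourced so that gtmdir and gtmver are set.
run_benchmark() {
    if [ -z "${gtmdir:-}" ] || [ -z "${gtmver:-}" ]; then
        echo "gtmdir or gtmver is not set; source gtmprofile first" >&2
        return 1
    fi
    input="${1:-$gtmdir/threeen.dat}"
    dbdir="$gtmdir/$gtmver/g"
    # Remove job output files, the old database, and all journal files.
    run_cmd rm -f ./*.mj[oe] "$dbdir/gtm.dat" "$dbdir"/gtm.mjl*
    run_cmd mupip create || return 1
    run_cmd mupip set -journal=before -region DEFAULT || return 1
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "mumps -run threeen1f <$input"
    else
        mumps -run threeen1f <"$input"
    fi
}
```

Running it once with DRY_RUN=1 is a cheap way to confirm the paths before any files are deleted.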

The output is as follows:

  1. The first number in the range.
  2. The last number in the range.
  3. The number of worker processes.  If it is reported in a form such as (2->8), it means that although the input specified two worker processes, the program chose to use 8 worker processes based on the number of CPUs.
  4. The size of the block of integers that a worker process works on at a time.  Again, it can have a form such as (500->125) to report the input specified and the actual number used.
  5. The number of steps in the longest 3n+1 sequence found.  This number of steps and the largest integer encountered are measures of program correctness and should depend solely on the first and last numbers in the range.
  6. The largest integer encountered during the calculation of 3n+1 sequences for that range of numbers.
  7. The elapsed (wall clock) time in seconds, with a resolution of one second.
  8. The number of updates performed in aggregate by all worker processes.
  9. The number of reads performed in aggregate by all worker processes.
  10. The update rate in updates per second.
  11. The read rate in reads per second.
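The rate fields can be cross-checked against the elapsed time and the aggregate counts.  The following is a minimal sketch, not part of the benchmark: summarize_result is a hypothetical helper, and rounding to the nearest integer is an assumption that happens to agree with the sample lines above.

```shell
# summarize_result: hypothetical helper.
# Reads result lines on stdin, strips the thousands separators, and
# recomputes the update and read rates from fields 7, 8, and 9.
summarize_result() {
    tr -d ',' | while read -r first last workers block steps largest elapsed updates reads rest; do
        if [ "$elapsed" -gt 0 ]; then
            # Integer division, rounded to nearest (assumed rounding rule).
            echo "updates/s=$(( (updates + elapsed / 2) / elapsed )) reads/s=$(( (reads + elapsed / 2) / elapsed ))"
        else
            echo "elapsed time is below the one-second resolution; rates undefined"
        fi
    done
}

echo "1 100,000 8 5,000 350 1,570,824,736 2 218,523 318,523 109,262 159,262" | summarize_result
```

Because elapsed time has a resolution of one second, rates recomputed for very short runs are coarse.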

atop (http://www.atoptool.nl), run as root, is a good program for monitoring the computer while the benchmark is running.