Hardware Architecture Document

Project / Subproject:
PLAN-JO No. 10030

Reference Number / Version:
D.HAD / 0.07

PLAN JO – Release 1

Hardware Architecture Document

Version 0.07

22/12/2006


Table of Contents

1        INTRODUCTION        

2        LOGICAL PERSPECTIVE OF THE SYSTEM        

3        SERVER CONFIGURATION        

3.1        Hardware Infrastructure        

3.1.1        Application Layer        

3.1.2        Content Layer        

3.1.3        Database Server Layer        

3.1.4        Shared Storage        

4        CORRELATION OF HARDWARE RESOURCES AND THROUGHPUT        

5        SOFTWARE & DIRECTORY STRUCTURE        


TABLE OF FIGURES

Figure 1: Logical separation of the PLAN-JO system        

Figure 2: The Documentum Repository        

Figure 3: Directory structure        


LIST OF TABLES

Table 1: User/Workload profile        

Table 2: Workload-Specific Criteria        

Table 3: Document profile        

Table 4: Platform profile        

Table 5: System output sizing        

Table 6:  Software components        


  1. Introduction

This document describes the proposed hardware architecture of the PLAN-JO system. The configuration relies on hardware already available at the Publications Office: a single physical server of the SUN Fire SFx800 family. The document also presents the logical perspective of the system and estimates the system’s throughput in relation to the hardware resources and the expected load.


  2. Logical Perspective of the System

The figure below illustrates an abstract view of the logical system distribution:

Figure 1: Logical separation of the PLAN-JO system

  3. Server Configuration

Using the hardware already available at OPOCE, the proposed configuration is to use a SUN Fire SFx800 server and to assign resources to each layer with the Solaris Resource Manager. The advantages of this configuration can be summarised as follows:

  3.1. Hardware Infrastructure

The layers that constitute the system must be identified before the available hardware resources are distributed between them. The first layer is the Application server, which provides the Webtop interface to the clients. It can also host any other enterprise components developed for PLAN-JO (e.g. EJBs, JMS for the DEMED integration, etc.). The next layer is the Content server, which provides resources for the DocBroker and mediates all Webtop requests to the Repository. The Repository conceptually consists of three distinct parts. The first is the file repository itself, which will be hosted on the Content server. The second is the metadata repository, which will be stored in the Database server. The third is the index store, which holds all the indexes built on the documents and used by the search capabilities of the system. A detailed illustration of the Repository is provided below:

Figure 2: The Documentum Repository

To provide the necessary resources for the above components, it is proposed to use two boards in a SUN Fire SF4800 or SF6800 enclosure. Each board will have the following characteristics:

It is proposed to separate the resources as presented below:

  3.1.1. Application Layer

The Application Layer will run the web tier (Webtop). The following resources should be allocated to it:

  3.1.2. Content Layer

The following resources should be allocated to the Content Layer:

•        4 UltraSPARC IV+ CPUs running at 1.5 GHz

•        4 x 8 MB Ecache

•        16 GB of RAM

  3.1.3. Database Server Layer

The following resources should be allocated to the Database Server Layer:

•        2 UltraSPARC IV+ CPUs running at 1.5 GHz

•        2 x 8 MB Ecache

•        8 GB of RAM

  3.1.4. Shared Storage

The existing optical EMC2 already available at the Publications Office’s premises will be used for storage.

Details regarding the throughput of the configuration, in conjunction with the system workload and the document profile, are provided in the next section.

  4. Correlation of Hardware Resources and Throughput

This section provides a preliminary estimate of the system’s throughput with respect to the available system resources.  All the data are calculated using “Documentum’s Sizing Tool”. The following table presents the expected user/workload profile:

User/Workload Profile

User Profile                          Web Publisher 5.3  Portal 5.3  Webtop 5.3 & Forms 5.3  DCM 5.3  Desktop 5.3  DAM 5.3
Heavy Users                           0                  0           100                     0        0            0
Light Users                           0                  0           50                      0        0            0
% Heavy Users Active                  0%                 0%          40%                     0%       0%           0%
% Light Users Active                  0%                 0%          10%                     0%       0%           0%
Heavy Users/Busy hour                 0                  0           40                      0        0            0
Light Users/Busy hour                 0                  0           5                       0        0            0
Total Users/Busy hour                 0                  0           45                      0        0            0
Estimated % Growth of Users Per Year  0%                 0%          10%                     0%       0%           0%
Level of Customization                None               None        None                    None     None         None
Workflow Intensive                    Yes                Yes         Yes                     Yes      Yes          Yes

Table 1: User/Workload profile
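The busy-hour rows of Table 1 can be reproduced from the user counts and the activity percentages. The following is a minimal sketch in Python, assuming that busy-hour users are simply the user count multiplied by the active percentage (the Sizing Tool's internal formula is not stated in this document; the variable names are illustrative only):

```python
# Figures from the "Webtop 5.3 & Forms 5.3" column of Table 1.
heavy_users = 100
light_users = 50
heavy_active_pct = 40  # % of heavy users active during the busy hour
light_active_pct = 10  # % of light users active during the busy hour

# Assumption: busy-hour users = population * active percentage.
heavy_busy = heavy_users * heavy_active_pct // 100
light_busy = light_users * light_active_pct // 100
total_busy = heavy_busy + light_busy

print(f"Heavy users/busy hour: {heavy_busy}")
print(f"Light users/busy hour: {light_busy}")
print(f"Total users/busy hour: {total_busy}")
```

Under this assumption the results match the table: 40 heavy users, 5 light users and 45 users in total during the busy hour.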

The next table presents the expected workload-specific criteria per component:

 

Workload-Specific Criteria

Component                       Criterion                                   Value
WDK/Webtop based Applications   Session Pooling Enabled                     Yes
WDK/Webtop based Applications   Classic or Streamline                       Classic
WDK/Webtop based Applications   Extended HTTP Timeout                       No
WDK/Webtop based Applications   Clustered App Server                        No
WDK/Webtop based Applications   Peak Fulltext Queries/min                   0
BPM                             Peak Manual Activities per min              300
BPM                             Automatic Activities per hour               600
BPM                             BPS messages per hour                       100
Web Publisher 5.2.5             Max Rendition Queueing time (secs)          15
Web Publisher 5.2.5             CIS Enabled?                                Yes
Web Publisher 5.2.5             Change Set processing during peak hours     Yes
Web Publisher 5.2.5             Index page regeneration during peak hours   Yes
Content Server                  Number of custom types                      20
Content Server                  Number of CS Instances per machine          1
Portal 5.2.5                    Operations per user per hour                60
Portal 5.2.5                    Components per page                         8

Table 2: Workload-Specific Criteria


Document Profile

Content Profile
Num of Original Source Documents: Yr 1          170,016
Estimated Average Size (kbytes)                 118
Avg. Versions per Document                      6
Average Additional Renditions                   1
Document or Media Transformation Services?      DTS
Custom Attribute size per Doc (kbytes)          1
Number of Custom Attributes                     15

Content Loading CPU Input
Loading days per year                           0
Num. of Docs/Day                                20
Content Input Window (hrs)                      24
Num. AutoWF Tasks per Doc                       9
Average Size (KB)                               170
Renditioning Priority                           ASAP
Full Text Indexing                              None
** Do not include these documents in the profile below.

Document Sizes (KB):

Format/Input      Average    Source Docs    % of   Request Media     New Docs   Avg. # of     Avg. Rend. Size   Average #     Content to be
Type              Size (KB)  in First Year  All    Transformation?   Per Year   Add'l Rend.   (% of Orig)       of Versions   FT Indexed
Word              100        154,560        91%    No                154,560    1             30%               6             Yes
TIFF              100        7,728          5%     No                7,728      1             50%               6             Yes
PDF               500        7,728          5%     No                7,728      1             0%                6             Yes
HTML/Web Pages    0          0              0%     No                -          0             40%               1             Yes
XML               0          0              0%     No                -          0             0%                1             No
Images            0          0              0%     No                -          0             20%               1             No
Contentless       0          0              0%     No                -          0             0%                1             No
MPEG              0          0              0%     No                -          0             15%               1             No
Total             700        170,016        100%   0                 170,016
Weighted Average  118                                                           1             27%               6.0

Table 3: Document profile
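The "Weighted Average" size and the "% of All" column of Table 3 can be checked with a few lines of arithmetic. The sketch below is in Python and takes the three formats with a non-zero document count (Word, TIFF and PDF, with 154,560, 7,728 and 7,728 source documents in the first year). It assumes the weighted average is weighted by document count, which is not stated explicitly in this document; the names used are illustrative.

```python
# (format, average size in KB, source documents in the first year) from Table 3.
formats = [
    ("Word", 100, 154_560),
    ("TIFF", 100, 7_728),
    ("PDF", 500, 7_728),
]

total_docs = sum(count for _, _, count in formats)

# Assumption: the average size is weighted by the number of documents per format.
weighted_avg_kb = sum(size * count for _, size, count in formats) / total_docs

# Share of each format in the first-year document population, rounded to whole percent.
shares = {name: round(100 * count / total_docs) for name, _, count in formats}

print(f"Total source documents: {total_docs}")
print(f"Weighted average size:  {weighted_avg_kb:.1f} KB")
print(f"Share per format (%):   {shares}")
```

Under this assumption the total is 170,016 documents and the weighted average is about 118 KB, in line with the table. The rounded shares (91%, 5%, 5%) add up to 101% because each value is rounded independently.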


Platform Profile Information

Years of Coverage for Hardware    3
High Availability Needs           none
Database Server Type              Oracle
JVM version                       1.4

Machine type                            CPUs per server   MHz    CPU type
Web-tier machines                       2                 1500   SPARC
Content Server machines                 2                 1500   SPARC
Index Agent/Server machines             2                 1500   SPARC
RDBMS machines                          N/A               1500   SPARC
CIS Server machines                     1                 1500   SPARC
Site Caching Services Target machines   1                 1500   SPARC
Document Transformation machines        1                 1500   SPARC
PDF Aqua Server machines                1                 1500   SPARC
Media Transformation Servers            1                 1500   SPARC

Table 4: Platform profile

Based on the above data, the following system sizing estimates are derived:

 

System Sizing Output

User Profile Summary

User population after 3 years                        182
Users/busy hour after 3 years                        54
Number of Documents from all sources after 3 years   510,048 (source)
                                                     6,120,600 (source + versions + rend.)

Estimated Hardware Resource Summary

Output                        CPUs   ***   Memory (MB)   Disk Space (MB)   Est. Disk IOs/sec
Content Server                2            2,304         441,492           3
Index Agent/Server            0            -             -                 0
WDK/App Server (Web)          1      ***   512                             6
RDBMS Server                  3            7,424         1,568             12
                                     6
Total for Servers             6            10,240        443,060           21

Document Transformation Svr   2
CIS Server                    0
Site Caching Serv. Target     0
PDF Aqua Server               0
Media Transformation Svr      0

Note: These estimates are NOT adjusted for High Availability.
Note: No single server machine should have fewer than 2 CPUs.

Hardware Deployment Options (note: not adjusted for HA needs)

Option          Configuration                                # of machines   CPUs/machine
Option #1       Host-based (Web + Content Serv. + FT + DB)   1               6
Option #2       Web Tier Server separate                     1               2
                Content Server/FT Index subsystem combined   1               2
                RDBMS separate                               1               3
Option #3       Web Tier separate                            1               2
                Content Server separate                      1               2
                Index Agent / Index Server (Fulltext)        1               0
                NAS device recommended for file sharing      1
                RDBMS separate                               1               3
Other Servers   Document Transformation Service PCs          2               1
                PDF Aqua Servers                             0               1
                CIS Servers                                  0               1
                Site Caching Services Targets                0               1
                Media Transformation Servers                 0               1

Notes

WARNING: Fulltext Index Server CPU demands are greater than single node configuration.

Table 5: System output sizing
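The three-year figures in the User Profile Summary can be approximated from the input tables. The sketch below is in Python and rests on assumptions that are not confirmed by this document: that the 10% yearly user growth is compounded twice over the three-year coverage (the first year being the baseline), and that the total object count is source documents multiplied by versions and by one original plus one additional rendition. These assumptions were chosen because they reproduce the reported values; the Sizing Tool's actual formulas may differ.

```python
from decimal import Decimal, ROUND_HALF_UP


def compound(base: int, growth_pct: int, periods: int) -> int:
    """Apply yearly percentage growth `periods` times and round half up."""
    value = Decimal(base) * (Decimal(1) + Decimal(growth_pct) / 100) ** periods
    return int(value.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# Inputs from Tables 1, 3 and 4.
total_users = 100 + 50        # heavy + light users
busy_hour_users = 45
growth_pct = 10               # estimated % growth of users per year
years_of_coverage = 3
docs_per_year = 170_016
versions_per_doc = 6
additional_renditions = 1

# Assumption: growth is compounded for the years after the first one.
users_after = compound(total_users, growth_pct, years_of_coverage - 1)
busy_after = compound(busy_hour_users, growth_pct, years_of_coverage - 1)

# Assumption: each version is stored as the original plus its additional renditions.
source_docs = docs_per_year * years_of_coverage
all_objects = source_docs * versions_per_doc * (1 + additional_renditions)

print(users_after, busy_after, source_docs, all_objects)
```

With these assumptions the sketch gives 182 users, 54 busy-hour users and 510,048 source documents, as in Table 5. The object count comes out at 6,120,576, slightly below the reported 6,120,600, which suggests the tool rounds its result.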

It is good sizing practice for the Index server to have four times the processing power of the Content server. However, full-text indexing is not required by the PLAN-JO specifications, so a reasonable estimate is for the Index server to have twice the CPU power of the Content server. Additionally, users will work for long periods on the documents they have checked out, so heavy database activity is not expected. It is therefore estimated that 2 CPUs should be allocated to the Database server layer, 4 CPUs to the layer of the Content and Index servers, and 2 CPUs to the Application server layer.
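To make the comparison between the proposed allocation and the Sizing Tool output explicit, the following Python sketch sets the CPU counts proposed above against the CPU estimates of Table 5 (Content Server 2, Index Agent/Server 0, WDK/App Server 1, RDBMS Server 3). The layer names and the grouping of the Content and Index servers into one layer are illustrative choices for this sketch.

```python
# CPU estimates from Table 5, grouped by the layers used in this document.
tool_estimate = {
    "application": 1,            # WDK/App Server (Web)
    "content_and_index": 2 + 0,  # Content Server + Index Agent/Server
    "database": 3,               # RDBMS Server
}

# CPU allocation proposed in this section.
proposed = {
    "application": 2,
    "content_and_index": 4,
    "database": 2,
}

# Positive values mean more CPUs than the tool estimate, negative values fewer.
headroom = {layer: proposed[layer] - tool_estimate[layer] for layer in proposed}

for layer, delta in headroom.items():
    print(f"{layer:<18} proposed={proposed[layer]} estimate={tool_estimate[layer]} delta={delta:+d}")

print(f"Total proposed: {sum(proposed.values())} CPUs, total estimated: {sum(tool_estimate.values())} CPUs")
```

The proposal uses 8 CPUs against an estimated 6. The only layer below the tool estimate is the database, which corresponds to the expectation stated above that long check-out periods will keep database activity low.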

  5. Software & Directory Structure

The following table provides a preliminary deployment matrix of the software identified above. The final software packages will be described in the Technical Specification document.

Node                 Java Application Server     Operating System   Documentum Components                  RDBMS
Application server   JMS BEA WebLogic 8.1 SP5    Solaris 10         Documentum Webtop, WDK,                -
                     (WebLogic 9.2 is not                           Business Process Services,
                     certified by Documentum)                       Documentum Administrator
                                                                    (Solaris, Version 5.3 SP3; this is
                                                                    the exact version for these products)
Content server       -                           Solaris 10         Documentum Content Server              -
                                                                    (Solaris-Oracle, Version 5.3 SP3)
DB server            -                           Solaris 10         -                                      Oracle 10g Release 2 (10.2.0.1)

Table 6:  Software components

Although the installation itself is beyond the scope of this document (details will be included in “D.RE1.001-IIN-Installation Instructions”), the following schematic shows the directory layout that will be used during the installation of the required software packages.


The following figure represents the organisation of the filesystem:

/applications
 |
 -- /planjo
     |
     -- /users
     |   |
     |   -- /system  (to be installed by the Publications Office)
     |   |   |
     |   |   -- /init.d      Start/stop scripts
     |   |   |
     |   |   -- /...
     |   |
     |   -- /bea     (link to /home/bea)
     |   |
     |   -- /oracle  Oracle binaries
     |   |
     |   -- /...     (if required - link to /home/...)
     |   |
     |   -- /planjo  (if required - link to /home/planjo)
     |
     -- /xchange
         |
         -- /DEMED
         |   |
         |   -- /in
         |
         -- ... other interfaces ...

(The names of the different filesystems are in bold)

 Figure 3: Directory structure
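As an illustration of the layout in Figure 3, the following Python sketch creates the named directories and links under a scratch root instead of the real filesystem. It is a sketch under stated assumptions, not the installation procedure: the function name `build_layout` is hypothetical, the elided `/...` entries and the other interfaces are deliberately left out, and the actual steps are those of the Installation Instructions document.

```python
import tempfile
from pathlib import Path


def build_layout(root: Path) -> Path:
    """Create the Figure 3 skeleton under `root` and return the planjo directory."""
    planjo = root / "applications" / "planjo"

    # Plain directories named in the figure.
    for sub in ("users/system/init.d", "users/oracle", "xchange/DEMED/in"):
        (planjo / sub).mkdir(parents=True, exist_ok=True)

    # Entries the figure describes as links to /home/<name>.
    for name in ("bea", "planjo"):
        link = planjo / "users" / name
        if not link.is_symlink():
            link.symlink_to(Path("/home") / name)

    return planjo


# Build the skeleton in a temporary directory so nothing real is touched.
scratch = Path(tempfile.mkdtemp())
planjo_dir = build_layout(scratch)

for path in sorted(planjo_dir.rglob("*")):
    print(path.relative_to(scratch))
```

Running the sketch twice on the same root is harmless, because existing directories and links are left in place.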

Issue Date:
22/12/2006

Document File Name:
D.HAD Hardware Architecture Document v0.07.doc
