Sun Hardware configuration

Freedom Services 2.0 Archive Validated


Thomas Brenneur / Zero-Knowledge / April-2000

Goal of this document:

The purpose of this document is to explore and propose some configuration parameters for the zkDB database server and its related databases on the Sun server. These numbers are derived from tests done on "Attack" in order to get the best performance out of the database server.

Most of the parameters available to configure the databases and the database server do not need to be changed, as the default values seem to suit our needs. The parameters with the most dramatic impact on performance are the database cache size, the distribution of the databases and logs across the available disks, and the configuration of zkDBSrv itself. For more information on performance, see zkdbsrv_perf.html.
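To make the mapping between those parameters and the Berkeley DB 3.x C API concrete, here is a minimal sketch (not the actual zkDB code) that opens an environment with an explicit cache size and with database files and log files placed on separate disks. The paths are placeholders, the 192Mb value is the 80K-user recommendation for Nsdb-1.1 in the table below, and the exact calls may differ slightly between 3.x releases; check the documentation of the installed version.

/*
 * Minimal sketch (not the zkDB code itself): open a Berkeley DB environment
 * with an explicit cache size and with database files and log files placed
 * on separate disks.  All paths are illustrative only.
 */
#include <stdio.h>
#include <db.h>

int main(void)
{
    DB_ENV *dbenv;
    int ret;

    if ((ret = db_env_create(&dbenv, 0)) != 0) {
        fprintf(stderr, "db_env_create: %s\n", db_strerror(ret));
        return 1;
    }

    /* 192Mb cache in one region (80K-user value for Nsdb-1.1, see below). */
    dbenv->set_cachesize(dbenv, 0, 192 * 1024 * 1024, 1);

    /* Hypothetical mount points: keep data and logs on different disks.   */
    dbenv->set_data_dir(dbenv, "/disk1/zkdb/data");
    dbenv->set_lg_dir(dbenv, "/disk2/zkdb/logs");

    /* Note: some 3.x releases take an extra configuration argument here.  */
    if ((ret = dbenv->open(dbenv, "/disk1/zkdb",
        DB_CREATE | DB_INIT_MPOOL | DB_INIT_LOCK | DB_INIT_LOG | DB_INIT_TXN,
        0)) != 0) {
        fprintf(stderr, "DB_ENV->open: %s\n", db_strerror(ret));
        dbenv->close(dbenv, 0);
        return 1;
    }

    dbenv->close(dbenv, 0);
    return 0;
}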

These parameters are not the final values. Some tests still need to be done in order to evaluate the behaviour of this system on a Sun server. Monitoring and tuning the system during those tests, and while it is used in production, will give more information and more reliable parameter values. Furthermore, the later addition of a RAID disk array will probably change the behaviour of the system, triggering the need to re-evaluate these parameters.

Current hardware configuration (node: attack.rndtest)

2x UltraSparc-II 450Mhz CPUs
1024 Mb of RAM
10/100 Ethernet
100/1000-SX Fiber Gigabit Ethernet (not currently running)
2x 18.2G UltraWide SCSI Disks.

Recommended database configuration:
 

Sun Server: Nym server and PKI databases
DB Name             Current size     Projected size    Projected size    80K users        250K users
                    in production    with 80k users    with 250k users   Cache Size       Cache Size
                    2000/04/26,      (3 nyms/user,     (3 nyms/user,     (Mb)             (Mb)
                    38802 nyms (Mb)  7.6Kb/nym) (Mb)   7.6Kb/nym) (Mb)
------------------  ---------------  ----------------  ----------------  ---------------  ---------------
Nsdb-1.1                    289.00           1781.25           5566.40            192.00           576.00
Nsdb-1.1-disabled             1.10              6.77             20.90              1.00             2.00
Nymblock                      0.16              1.00              3.04              1.00             1.00
Nymcache                      9.60             59.13            182.40              8.00            32.00
Spent-tokens                  4.20             25.87             79.80              4.00             8.00
Pubkey                      161.00            991.76           3059.00            128.00           320.00
Pubkey-disabled               0.22              1.35              4.18              1.00             1.00
Total                       465.28           2867.13           8915.72            335.00           940.00

Recommended configuration parameters on the Sun server *: the two cache-size columns give 10% of the database size in Mb, rounded to power-of-2 numbers (default = 1Mb). The Page size, Log file size and Log location cells of the original table are blank for every database, which means the default values should be kept (see the note below).


* For an explanation of the parameters and their default values, see the document Database server Configuration. If a cell is left blank, leave the configuration value at its default.

Cache notes:

The current physical memory of attack.rndtest is enough to handle the cache needs of the databases. The Sun that will be used in production will have 2 Gb of RAM, which seems to be enough to handle the needs of 250K users.

In addition, SunOS will always try to keep as many files as possible in physical memory. Those cached files can be libraries, programs, or any other files, such as the database files used by Berkeley DB. The files remain mapped in memory even if nobody accesses them; they are discarded only when the Sun needs memory to load another file or for a specific process. This feature is interesting because the cache size configured in Berkeley DB may not need to be as big as the calculated configuration value: the OS itself acts as a 'global cache' and handles that for us. More investigation would be required to know whether the Solaris caching mechanism is more efficient than the Berkeley DB caching mechanism.

I have found a note in the Berkeley DB (3.x) documentation saying that:

It is possible to specify caches to Berkeley DB that are large enough so that they cannot be allocated contiguously on some architectures, e.g., some releases of Solaris limit the amount of memory that may be allocated contiguously by a process.

Although I don't think this applies to the version of the OS currently installed on the Sun, I asked Sleepycat for more information. Here is their answer:

> "It is possible to specify caches to Berkeley DB that are large enough
> so that they cannot be allocated contiguously on some architectures,
> e.g., some releases of Solaris limit the amount of memory that may be
> allocated contiguously by a process."
>
> Can you tell me what are those releases of Solaris ?

No, I can't (and, for all I know, it may be a configuration parameter
that's settable based on the amount of physical memory in the machine).
Generally, in our experience, the limit is around 2.3GB.

Regards,
Sleepycat Software Support
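If that limit were ever reached, the cache does not have to be one contiguous region: the last argument of DB_ENV->set_cachesize() is the number of cache regions to create. A minimal sketch follows; the 1Gb total and the four regions are illustrative values, not recommendations.

#include <stdio.h>
#include <db.h>

/*
 * Sketch only: split a large Berkeley DB cache into several regions so that
 * no single contiguous allocation is required.  Sizes are illustrative.
 */
int main(void)
{
    DB_ENV *dbenv;
    int ret;

    if ((ret = db_env_create(&dbenv, 0)) != 0) {
        fprintf(stderr, "db_env_create: %s\n", db_strerror(ret));
        return 1;
    }
    /* gbytes = 1, bytes = 0, ncache = 4 -> four regions of roughly 256Mb. */
    if ((ret = dbenv->set_cachesize(dbenv, 1, 0, 4)) != 0)
        fprintf(stderr, "set_cachesize: %s\n", db_strerror(ret));

    dbenv->close(dbenv, 0);
    return 0;
}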

Recommended zkDB database server configuration parameters:

Parameter                     Default  Recommended  Note
----------------------------  -------  -----------  ----------------------------------------
logFile <filename>            NA       NA           Depends on the production configuration
  Specifies the server log file.
pidFile <filename>            NA       NA           Depends on the production configuration
  Specifies the server pid file.
accessFile <filename>         NA       NA           Depends on the production configuration
  Specifies the name of the file containing the IP addresses allowed to connect.
authKey <string>              NA       NA           Depends on the production configuration
  Specifies the key for packet authentication.
addressPath <address>         NA       NA           Depends on the production configuration
  Specifies the address path on which the server will listen.
listenQueueLength <length>    NA       32
  Maximum number of incoming pending connections.
maxPacketSize <kilo-Bytes>    NA       16kb
  Maximum size (in kB) of the packets that can be transmitted.
maxIdleTime <seconds>         NA       0
  Maximum number of seconds that a connection can remain idle; when the limit
  is exceeded, the connection is automatically closed.
maxConnections <connections>  NA       10
  Number of connections to handle (i.e. number of processes to fork).
processLife <connections>     NA       5000
  Maximum number of connections that a process should handle; after that, the
  process terminates and is replaced by a newly forked process.  The special
  value 0 means that processes are immortal.
tcpDelay                      NA       0
  Specifies that the TCP transmit delay should be used when sending packets.
verbose <level>               NA       1
  Logging verbosity level, an integer between 0 and 4, where 0 provides almost
  no logging and 4 logs all input.


To see all the configuration parameters and more information on this subject, see http://caesar.zks.net/devnet/cvs/libzkcs/doc/zkcs-conf.html.
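As an illustration only, a zkDBSrv configuration fragment using the recommended values from the table above might look like the following. The 'parameter value' syntax and the '#' comment style are assumptions on my part; the authoritative format is described in zkcs-conf.html, and the deployment-specific parameters (logFile, pidFile, accessFile, authKey, addressPath) are left out because they depend on the production configuration.

# Hypothetical zkdbsrv configuration fragment (syntax per zkcs-conf.html).
# Only parameters with a recommended value in the table above are shown;
# logFile, pidFile, accessFile, authKey and addressPath must be added for
# a real deployment.
listenQueueLength  32
maxPacketSize      16
maxIdleTime        0
maxConnections     10
processLife        5000
tcpDelay           0
verbose            1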

Recommended system configuration parameters:

The file system block size currently configured on the Sun is 8Kb. This is suitable for our needs: since the size of a nym is around 7.6Kb, this is a very good number for the nym database (note: by default, Berkeley DB configures its database page size to match the file system block size).

# mkfs -m /dev/dsk/c0t0d0s4
mkfs -F ufs -o nsect=248,ntrack=19,bsize=8192,fragsize=1024,cgsize=22,free=1,rps=167,nbpi=8238,opt=t,apc=0,gap=0,nrpos=8,maxcontig=16 /dev/dsk/c0t0d0s4 24577792
#
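If we ever need to force the page size rather than rely on this default, Berkeley DB allows it per database. A minimal sketch follows, assuming an already-opened environment handle (as in the earlier sketch); the "nsdb" file name is illustrative.

#include <db.h>

/*
 * Sketch only: open a database with an explicit 8Kb page size instead of
 * relying on the default derived from the file system block size.
 * "nsdb" is an illustrative file name; dbenv is an already-opened
 * Berkeley DB environment (see the earlier sketch).
 */
static int open_with_8k_pages(DB_ENV *dbenv, DB **dbpp)
{
    DB *dbp;
    int ret;

    if ((ret = db_create(&dbp, dbenv, 0)) != 0)
        return ret;
    dbp->set_pagesize(dbp, 8 * 1024);          /* match the 8Kb fs block */
    if ((ret = dbp->open(dbp, "nsdb", NULL, DB_BTREE, DB_CREATE, 0644)) != 0) {
        dbp->close(dbp, 0);
        return ret;
    }
    *dbpp = dbp;
    return 0;
}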

I don't see any need to change any system parameters right now. I think that the default kernel parameter values can handle our current needs; they may be tweaked later if we detect a tangible problem. The main performance gain can be achieved by optimizing our software and, mostly, by adding a RAID disk array (or multiple arrays, depending on the needs).

With the current memory size and the behaviour of the operating system, most of the databases will be loaded in memory (this will have to be revised for the 500K-and-more-users scenario). The file system will then have the responsibility of handling the 'write' load: writes to the different databases and writes to the database logs. In the current system architecture there are a lot more queries than there are updates; in a sense, the query load is heavy but the update load is normal. A RAID-5 array seems to be enough to give good performance (depending on the number of disks in the array and its overall performance) along with data reliability and security.

If necessary, the NIQS cache, the NYM cache and other similar databases could be transferred to a RAID-0 disk, since we do a lot more writes and queries on those databases than on the others, and since we expect very good performance from them but need less reliability (they are caches: the data is flushed and reconstructed from the original data very often). The RAID-0 would bring the performance without impacting the other databases located on the RAID-5.

This needs more investigation and tests in order to get more precise answers. We would need to know the transfer rates, response times, etc. required by the nym server (and the other servers) in order to compare them to the different RAID systems available for a Sun.

Note on memory: if we have enough memory to keep most of the active part of the databases (the part currently used by the connected users) in memory, then adding more CPUs would probably bring a good performance boost to the system.

Sun performance related links:

To get some information on Sun performance and configuration, you may take a look at these pages:

Performance Q&A

Sun performance information

RAID: What does it mean to me? An extract from this article follows:

Putting it all together

By now it should be clear that each of the various RAID organizations has clear advantages and disadvantages. Striping solves most of the performance problems associated with individual disks, but for configurations involving realistic dataset sizes, striping alone is far too subject to member disk failure to be of practical use. Mirroring provides reliability, and does so with a reasonable tradeoff in performance. Unfortunately, the sheer cost of mirroring often makes it unacceptable in the very situations that most require high reliability: very large configurations. RAID-5 provides reliability comparable to that of mirroring, combined with substantially lower capital expenditure. However, for some applications, primarily those which have any non-trivial amount of writing, RAID-5 exacts an impractically high cost in terms of performance. Fortunately, there is no reason why users can't mix or match different types of RAID volumes in a single configuration, and in fact this is usually the most appropriate strategy.

For example, consider a large DBMS system used to support both online transaction processing (OLTP) and decision support (DSS). Decision support is characterized by heavy sequential I/O, most of which is dominated by read-only activity; meanwhile, random access reigns with OLTP, which usually sustains a considerable mix of updates. In such a system, the OLTP tables that are heavily updated should clearly avoid RAID-5 -- particularly if the DBMS is configured to operate in disk units of 2 kilobytes. On the other hand, the primary tables that are heavily accessed by the DSS applications are likely to be read a lot, not written very much, and tend to be larger and thus much more expensive to mirror. For those tables, RAID-5 is the right solution. However, most DSS applications make extensive use of sort areas and temporary tables -- data areas that endure extensive writing; if this storage must be protected against disk failure, RAID-0+1 is probably the best option. However, since this storage is very transitory (the data on it is only valid during the execution of a sort or join), a higher-performance alternative is a simple RAID-0 stripe.