NI Servers Performance Test Plan (NOTES)

Copyright (c) Zero-Knowledge Systems Inc., 2000

 

 

About These Notes

 

This is a very small overview of the requirements for performance testing of the NI servers. It is no great piece of literature, but hopefully explains the minimum that should be done to know and understand the actual capacities of the servers. We encourage you to read the following notes before running your tests: they contain information that will aid your understanding of the components.

 

Prerequisites

 

The following hardware and software are required:

 

            Hardware:

 

- 1 machine for the NIQ and NIS servers
- At least 3 machines for the client test application

 

NOTE: This is the minimum number of machines needed to properly test each NI server in a production-like environment. It does not include other machines that may be running the core servers required for the Freedom system to run properly.

 

- Different connection types:
    - V.90 modem (56K)
    - Ethernet LAN card (100 Mbit/s)

 

Software:

 

- Latest version of the NIQS and NISS
- Latest version of the NIQSPERF

 

Introduction

 

            The NIQS acts as a daemon for the various Freedom entities requesting Freedom network information. The main client of the NIQ server is the Freedom client application. The load the Freedom client places on the NIQS is much larger than the load caused by the nodes and the core servers. The transactions of the Freedom client can be divided into three categories:

 

- Normal startup procedure
- Full update
- Other requests

 

Among these categories, the full update transactions are the most important ones to consider during testing. There is a linear relation between the number of users we are able to support and the number of full updates per second. The load the Freedom client places on the NIQS is likely to increase in the future for the following reasons:

 

- We are expecting to multiply the number of users
- We are expecting to increase the number of nodes

 

Both will increase the number of connections and the number of requests to the NIQS. To cope with this increase, a new command has been added to the NIQS to improve its performance. This command allows the client to send a single request instead of the multiple requests required by the previous version. This reduces the connection latency, allowing more users to connect to the NIQS at a given time. The objective of this performance test is to assess the new capacity of the NIQS. The Technical Details To Consider section below covers some important technical details that you should know before running your tests and take into account during testing.

 

            On the other hand, the NISS acts as a status-gathering daemon for the various Freedom entities in the Freedom network. It accepts incoming UDP stat packets and updates the state database using the information it receives. Things are simpler for the NISS, since we only need to figure out how many statistic updates per second it can process.

 

Technical Details To Consider

 

The NIQS uses a pre-forking server model. The parent process is responsible only for forking child processes; it does not serve any requests or service any network sockets. The child processes do the actual connection handling; each one serves multiple connections (one at a time) before dying. The parent spawns maxConnections children at startup and replaces children as they die. As you know, the maxConnections setting represents the limit on simultaneous connections the NIQS may handle. Varying the maxConnections setting should affect performance. The loss of performance is caused by the overhead of forking processes, the overhead of context switches between processes, and the memory overhead of having multiple processes. The single biggest hardware issue affecting public server performance is RAM. A public server, such as the NIQS, should never have to swap; swapping increases the latency of each request.

 

The optimal value of maxConnections should be determined by experimentation. The test application must be run with different values in order to find the maximum number of queries per second we can get.
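The experiment described above amounts to a simple sweep over candidate values. The following is only a sketch: `measure_qps` is a hypothetical hook standing in for one complete load-test run against a server restarted with the given maxConnections value, and neither it nor `find_best_max_connections` is part of the NI tools.

```python
from typing import Callable, Dict, Iterable, Tuple


def find_best_max_connections(
    candidates: Iterable[int],
    measure_qps: Callable[[int], float],
) -> Tuple[int, Dict[int, float]]:
    """Measure every candidate once and return the best setting.

    measure_qps is a hypothetical hook: it is expected to restart the
    server with the given maxConnections value, run the load test, and
    return the observed queries per second.
    """
    results = {value: measure_qps(value) for value in candidates}
    if not results:
        raise ValueError("no candidate values supplied")
    best = max(results, key=results.get)
    return best, results


if __name__ == "__main__":
    # Made-up throughput curve, used only to exercise the sweep logic.
    def fake_measure(max_connections: int) -> float:
        return 100.0 - abs(max_connections - 175) * 0.2

    best, table = find_best_max_connections([100, 150, 175, 200, 250], fake_measure)
    for value, qps in sorted(table.items()):
        print(f"maxConnections={value:4d}  qps={qps:6.1f}")
    print("best:", best)
```

Keeping the full result table, and not only the winner, makes it possible to see how flat the curve is around the optimum.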

 

If you get a lot of error messages about running out of file handles, you might want to raise the file-max limit. The value in file-max denotes the maximum number of file handles that the Linux kernel will allocate. The default value is 4096. To change it, just write the new number into the file:

 

# cat /proc/sys/fs/file-max

4096

# echo 8192 > /proc/sys/fs/file-max

# cat /proc/sys/fs/file-max

8192

 

The three values in file-nr denote the number of allocated file handles, the number of used file handles, and the maximum number of file handles. When the number of allocated file handles comes close to the maximum, but the number actually in use is far below it, you have encountered a peak in your usage of file handles and you don’t need to increase the maximum. Recording the number of used file handles during performance testing is a good idea: it will tell us how many file handles we need to run the NIQS properly in the production network.
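The check described above can be sketched as follows. The field meanings follow the description in these notes (allocated, used, maximum); newer kernels report the second field differently, so treat that interpretation as an assumption. The helper names and the 0.9 threshold are illustrative only.

```python
from typing import NamedTuple


class FileNr(NamedTuple):
    allocated: int
    used: int
    maximum: int


def parse_file_nr(text: str) -> FileNr:
    """Parse the three whitespace-separated numbers found in file-nr."""
    fields = text.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}")
    allocated, used, maximum = (int(field) for field in fields)
    return FileNr(allocated, used, maximum)


def read_file_nr(path: str = "/proc/sys/fs/file-nr") -> FileNr:
    """Read and parse file-nr from a Linux /proc file system."""
    with open(path) as handle:
        return parse_file_nr(handle.read())


def should_raise_file_max(nr: FileNr, threshold: float = 0.9) -> bool:
    """Suggest raising file-max only when handles in use approach the limit.

    threshold is an arbitrary cut-off chosen for this sketch.
    """
    return nr.used >= threshold * nr.maximum


if __name__ == "__main__":
    # Sample values, not measurements: many handles allocated, few in use.
    sample = parse_file_nr("4000 1200 4096")
    print(sample, "raise file-max:", should_raise_file_max(sample))
```

Calling `read_file_nr()` periodically during a test run and logging the `used` field gives the record of file handle usage recommended above.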

 

However, there is still a per process limit of open files, which unfortunately can’t be changed that easily. It is set to 1024 by default. To change this you have to edit the files limits.h and fs.h in the directory /usr/src/linux/include/linux. Change the definition of NR_OPEN and recompile the kernel.
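Before changing anything, it can help to see which per-process limit is currently in effect. This is a minimal sketch, assuming a Python interpreter is available on the server; it uses the standard `resource` module to read the soft and hard limits on open files for the calling process. It only inspects the limits and says nothing about the kernel-side NR_OPEN value discussed above.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limit on open files for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit: int) -> str:
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print("soft limit:", describe(soft))
    print("hard limit:", describe(hard))
```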

 

Related to process creation is process death induced by the processLife setting. By default this is 0, which means that there is no limit to the number of requests handled per child. If your configuration currently has this set to some very low number, such as 50, you may want to raise it significantly. Do not remove the limit altogether, though: keep it at 10000 or so because of memory leaks. A very low value for this parameter can drastically distort the benchmark results: if the machine is busy spawning children, it can't service requests. This is an important factor to consider during testing.
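To get a feel for how much forking a given processLife value causes, the replacement rate can be estimated with simple arithmetic. The sketch below assumes requests are spread evenly over the children; the request rate of 500 per second is a made-up figure, not a measurement.

```python
def forks_per_second(requests_per_second: float, process_life: int) -> float:
    """Average number of replacement forks per second.

    Each child dies after serving process_life requests, so on average one
    replacement fork is needed per process_life requests. A processLife of
    0 means children never retire, so no replacement forks occur.
    """
    if process_life < 0:
        raise ValueError("process_life must be >= 0")
    if process_life == 0:
        return 0.0
    return requests_per_second / process_life


if __name__ == "__main__":
    rate = 500.0  # hypothetical request rate, for illustration only
    for life in (50, 10000, 0):
        print(f"processLife={life:5d}  forks/s={forks_per_second(rate, life):.2f}")
```

At the same request rate, a processLife of 50 causes 200 times as many forks as a processLife of 10000.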

 

Another important factor to mention is the connection latency. The parameters that affect the connection latency are the following:

 

- Client access speed
- Internet and Freedom latencies
- The time it takes the NIQS to generate the response when the cache has expired

 

This is illustrated in the figure below:

 

 

Performance Parameters

 

The NIQ and NIS servers have different performance parameters that must be measured to determine the server’s performance level. These parameters are described in the table below:

 

Parameter: Description

“maxConnections”: The maxConnections setting defines the maximum number of simultaneous connections the NIQ server can process. This parameter is set to 20 for the production network. Since we are fairly sure that the NIQS is not CPU-bound, this value should be increased significantly.

“processLife”: The maximum number of requests handled per child. This setting should be set to a high constant value. Set the processLife setting to 10000 for testing.

Database size: We would like to understand the influence of the number of nodes in the network on the compression factor. Does the response length, in bytes, increase considerably when we add nodes to the network? This should be tested by varying the number of nodes entered in the database, say 50, 100, 200, 400, and 800 nodes. Preferably, the information for each node should consist of random values; otherwise you will get biased results.

Time To Live (TTL): The TTL value is the expiration time of the NIQS’s cache. The NIQ server uses a cache mechanism in order to increase the number of requests per second. This cache needs to be refreshed periodically. The refresh frequency is defined by the TTL setting.

Client access speed: To ascertain the server’s performance level, all tests should be performed with different connection types. This is one of the most important parameters to take into account to obtain pertinent results. If the hardware is not available, delays can be inserted between requests by setting the latency parameters of the NIQSPERF.

Internet and Freedom latencies: This represents the connection latency between the client and the NIQS. These parameters are the most difficult ones to evaluate. Once again, life is still being a bitch. The latency should be considered as a variable during performance evaluation.
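The idea of inserting delays to stand in for a slow link can be sketched as follows. How the NIQSPERF actually chooses a delay between its minimum and maximum latency settings is not described in these notes, so the uniform draw below is an assumption, and the function names are hypothetical. The bounds 110 and 850 are the minconnlat and maxconnlat values listed under Assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def pick_latency_ms(min_ms: int, max_ms: int, rng: random.Random) -> int:
    """Draw one simulated latency, uniformly between the two bounds."""
    if min_ms > max_ms:
        raise ValueError("min_ms must not exceed max_ms")
    return rng.randint(min_ms, max_ms)


def call_with_latency(
    request: Callable[[], T],
    min_ms: int,
    max_ms: int,
    rng: random.Random,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Wait for a simulated latency, then issue the request."""
    delay_ms = pick_latency_ms(min_ms, max_ms, rng)
    sleep(delay_ms / 1000.0)
    return request()


if __name__ == "__main__":
    rng = random.Random(2000)
    # The sleep function is replaced so the demo does not actually wait.
    result = call_with_latency(
        lambda: "response",
        110,
        850,
        rng,
        sleep=lambda seconds: print(f"would sleep {seconds:.3f} s"),
    )
    print(result)
```

Passing the random generator and the sleep function as arguments keeps a test run reproducible and lets the delay logic be checked without waiting.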

 

Tests Performed

 

As you have seen, many parameters influence the NIQS’s performance. This gives us over a hundred different test cases to consider.
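To see how quickly the combinations add up, the test cases can be enumerated mechanically. In this sketch only the node counts and the two connection types come from these notes; the maxConnections and TTL values are hypothetical placeholders to be replaced by the values actually chosen for testing.

```python
from itertools import product
from typing import Dict, List

PARAMETERS: Dict[str, List] = {
    "maxConnections": [20, 50, 100, 150, 200],     # hypothetical sweep values
    "nodes": [50, 100, 200, 400, 800],             # from the Database size entry
    "ttl_seconds": [30, 60, 120],                  # hypothetical TTL values
    "connection": ["V.90 modem", "Ethernet LAN"],  # from the prerequisites
}


def build_test_cases(parameters: Dict[str, List]) -> List[Dict]:
    """Return one dict per combination of parameter values."""
    names = list(parameters)
    value_lists = [parameters[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


if __name__ == "__main__":
    cases = build_test_cases(PARAMETERS)
    print("number of test cases:", len(cases))
    print("first case:", cases[0])
```

With these placeholder values the matrix has 5 x 5 x 3 x 2 = 150 cases, which is consistent with the estimate of over a hundred.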

TO CONTINUE …

 

Assumptions

 

Here are the assumptions we made to simulate a 56K connection:

 

Link speed (56 kbps):    20 ms / 100 bytes
3 hops, min.:            60 ms
3 hops, max.:            800 ms
AIP's overhead:          30 ms
Cryptography:            disabled

NOTE: TCP and IP headers are not taken into account.

 

From these assumptions we obtained the following latencies for the NIQSPERF’s configuration file.

 

Connection latency, in milliseconds

minconnlat = 110

maxconnlat = 850

 

KQD latencies, in milliseconds

minkqdlistlat = 130

maxkqdlistlat = 870

minkqdquerylat = 190

maxkqdquerylat = 930

 

Cache Data Version latency, in milliseconds

mincdvlat = 300

maxcdvlat = 1780

 

Client Info latency, in milliseconds

mincilat = 310

maxcilat = 1050

 

Token Info latency, in milliseconds

mintilat = 310

maxtilat = 1050

 

MxQuery latency, in milliseconds

minmxqlat = 810

maxmxqlat = 1650

 

Old Full Update latencies, in milliseconds

minfullupdlistlat = 110

maxfullupdlistlat = 850

minfullupdquerylat = 130

maxfullupdquerylat = 870
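The connection latency bounds are consistent with simply adding up the assumptions: hop delay, AIP overhead, and the transfer time of the payload. The sketch below reproduces that arithmetic. The additive model and the 100-byte payload are inferences, not something these notes state, and the other entries presumably involve larger payloads or additional round trips that this sketch does not try to reproduce.

```python
# Figures taken from the assumptions listed above.
MS_PER_100_BYTES = 20   # 56 kbps link
MIN_HOPS_MS = 60        # 3 hops, best case
MAX_HOPS_MS = 800       # 3 hops, worst case
AIP_OVERHEAD_MS = 30


def latency_ms(hops_ms: int, payload_bytes: int) -> float:
    """Sum hop delay, AIP overhead, and transfer time for the payload.

    Adding the three terms is an assumption of this sketch; the notes do
    not spell out how the configured values were computed.
    """
    transfer_ms = MS_PER_100_BYTES * payload_bytes / 100
    return hops_ms + AIP_OVERHEAD_MS + transfer_ms


if __name__ == "__main__":
    payload = 100  # bytes; assumed, not stated in the notes
    print("min:", latency_ms(MIN_HOPS_MS, payload))
    print("max:", latency_ms(MAX_HOPS_MS, payload))
```

With a 100-byte payload the function gives 110 and 850, matching minconnlat and maxconnlat. With 200 bytes it gives 130 and 870, which equal minkqdlistlat and maxkqdlistlat; whether those values were actually derived this way is not stated.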

 

NIQS’s Results (with MxQuery Enabled)

 

NIQS statistics for the last 5 minutes

                                             Run 1     Run 2
Total connections                            13786     18559
New children                                   100       150
CacheDataVersion request                     13672     18558
CacheDataVersion request [HIT]               13666     18549
ClientInfo requests                          13654     18605
ClientInfo requests [HIT]                    13648     18597
List requests                                13683     18559
ListSince0 requests                          13683     18559
ListSince0 requests [HIT]                    13679     18558
MxQueryDescription requests                  13640     18584
MxQueryDescriptionSince0 requests            13640     18584
MxQueryDescriptionSince0 requests [HIT]      13492     18424
QueryDescription requests                    13678     18553
QueryDescription requests [HIT]              13672     18547
TokenInfo requests                           13644     18597
TokenInfo requests [HIT]                     13638     18592
Simultaneous connections:
    Average                                   96.6       149
    Minimum                                      0       126
    Maximum                                    100       150
Commands/connection:
    Average                                      6         6
    Minimum                                      0         6
    Maximum                                      6         6
Time (sec)/connection:
    Average                                    2.4       2.4
    Minimum                                      0         2
    Maximum                                      7         8
File Descriptors                              4467      7545
Memory used                                112073K   117724K
Network Interface (bps):
    Received                                 72707    146349
                                            105508    135536
                                             83025    142179
    Sent                                    394430    435657
                                            253479    633067
                                            338294    421241


NIQS statistics for the last 5 minutes

                                             Run 3     Run 4
Total connections                            21146     18996
New children                                   175       200
CacheDataVersion request                     20765     18665
CacheDataVersion request [HIT]               20760     18660
ClientInfo requests                          20734     18639
ClientInfo requests [HIT]                    20729     18632
List requests                                20767     18698
ListSince0 requests                          20767     18698
ListSince0 requests [HIT]                    20765     18697
MxQueryDescription requests                  20668     18592
MxQueryDescriptionSince0 requests            20668     18592
MxQueryDescriptionSince0 requests [HIT]      20424     18490
QueryDescription requests                    20754     18678
QueryDescription requests [HIT]              20747     18672
TokenInfo requests                           20702     18610
TokenInfo requests [HIT]                     20695     18604
Simultaneous connections:
    Average                                    169     181.3
    Minimum                                     32         0
    Maximum                                    175       200
Commands/connection:
    Average                                    5.9       5.9
    Minimum                                      0         0
    Maximum                                      6         6
Time (sec)/connection:
    Average                                    2.4       2.4
    Minimum                                      0         0
    Maximum                                     11         6
File Descriptors                              7604      8610
Memory used                                117933K   123088K
Network Interface (bps):
    Received                                139614    157293
                                            152246    169469
                                            156085    161543
    Sent                                    539895    867212
                                            601189    754622
                                            619104    647632


NIQS statistics for the last 5 minutes

                                             Run 5
Total connections                            26606
New children                                   250
CacheDataVersion request                     26335
CacheDataVersion request [HIT]               26328
ClientInfo requests                          26306
ClientInfo requests [HIT]                    26301
List requests                                26321
ListSince0 requests                          26321
ListSince0 requests [HIT]                    26319
MxQueryDescription requests                  26254
MxQueryDescriptionSince0 requests            26254
MxQueryDescriptionSince0 requests [HIT]      25826
QueryDescription requests                    26329
QueryDescription requests [HIT]              26324
TokenInfo requests                           26308
TokenInfo requests [HIT]                     26301
Simultaneous connections:
    Average                                  233.9
    Minimum                                      0
    Maximum                                    250
Commands/connection:
    Average                                    5.9
    Minimum                                      0
    Maximum                                      6
Time (sec)/connection:
    Average                                    2.7
    Minimum                                      0
    Maximum                                     19
File Descriptors                             10662
Memory used                                138404K
Network Interface (bps):
    Received                                177187
                                            182620
                                            229673
    Sent                                    433139
                                            707452
                                            928623
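The totals reported above can be turned into rates for easier comparison between the five sets of statistics. The sketch below uses the total connections and CacheDataVersion figures from the tables, keyed by the "New children" value of each table, and assumes that "the last 5 minutes" means a window of exactly 300 seconds. The function names are illustrative.

```python
WINDOW_SECONDS = 5 * 60  # "last 5 minutes", assumed to be exactly 300 s

# New children -> (total connections, CacheDataVersion requests, CacheDataVersion hits)
RUNS = {
    100: (13786, 13672, 13666),
    150: (18559, 18558, 18549),
    175: (21146, 20765, 20760),
    200: (18996, 18665, 18660),
    250: (26606, 26335, 26328),
}


def connections_per_second(total_connections: int,
                           window_seconds: int = WINDOW_SECONDS) -> float:
    """Average connection rate over the reporting window."""
    return total_connections / window_seconds


def hit_ratio(hits: int, requests: int) -> float:
    """Fraction of requests answered from the cache."""
    if requests == 0:
        return 0.0
    return hits / requests


if __name__ == "__main__":
    for children, (connections, requests, hits) in sorted(RUNS.items()):
        rate = connections_per_second(connections)
        ratio = hit_ratio(hits, requests)
        print(f"children={children:3d}  conn/s={rate:6.2f}  cdv hit ratio={ratio:.4f}")
```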

 

 

 

 

 

 

 

First draft: Stéphane Rhéaume