Zero-Knowledge Database Configuration

$Id: zkdb.html,v 1.0 2000/07/27 08:46:36 sam Exp $

Revision | Comments
1.1 | Author(s): Francis L. (francisl@zks.net). Initial draft.
1.2 | Author(s): Sam Sanjabi (sam@zks.net). Scarecrow revision. Partitioning functionality added.

Current Maintainer: Sam Sanjabi

Contents

  1. Introduction

  2. Tutorial

  3. Configuration

  4. Future Plans

1 - Introduction

See also: Zero-Knowledge Database Server Protocol Specification

The Zero-Knowledge Database (zkDB) library provides a single interface for accessing both local and remote databases.

2 - Tutorial

To open a database, call zkDB_Open() with the filename of a database configuration file. You may think of this file as the database file itself, although you must ensure that the configuration is correct. These configuration files are now also read by ndbutil for its own variables, so each section may contain some variables that only ndbutil uses. The following examples label these variables with comments, although the labels are not required. A comment is any line that begins with a semicolon (;).
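The file layout described above (bracketed section names, Name = Value options, and semicolon comment lines) can be illustrated with a small reader. This is only a sketch in Python, not the zkConfig parser: the function name parse_zkdb_config is hypothetical, and it assumes that a semicolon starts a comment only at the beginning of a line and that a value runs to the end of the line, so that a value such as jK3=;Sa0 survives intact.

```python
def parse_zkdb_config(text):
    """Parse zkDB-style configuration text into {section: {option: value}}.

    Hypothetical helper for illustration; not part of the zkDB library.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (those beginning with ';').
        if not line or line.startswith(";"):
            continue
        # "[ main ]" opens a section; spaces inside the brackets are ignored.
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()
            sections[current] = {}
            continue
        if "=" not in line or current is None:
            raise ValueError(f"unexpected line: {raw!r}")
        # Split on the first '=' only, so values may contain '=' and ';'.
        name, value = line.split("=", 1)
        sections[current][name.strip()] = value.strip()
    return sections


sample = """\
; Foo Database Remote Access Configuration File
[ main ]
partitions = foo-1,foo-2

[ foo-1 ]
IsRemote = Yes
AuthKey = jK3=;Sa0
"""

config = parse_zkdb_config(sample)
print(config["main"]["partitions"].split(","))  # ['foo-1', 'foo-2']
print(config["foo-1"]["AuthKey"])               # jK3=;Sa0
```

Splitting on the first equals sign only is the important design choice here, because authentication keys in the examples contain both = and ; characters.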

2.1 Local Databases

A local database resides on the same machine as the process using it. The configuration file for opening a multi-partitioned local database will look something like this for a 2-partition case:

;
; Foo Database Local Access Configuration File
;

[ main ]

Database = foodb
Partitions = foo-1,foo-2
DefaultHomeDir = /freedom/db/foodb
DefaultTransactLogDir = /freedom/db/translog/foodb
Isolated_Partitions = Yes

;---Variables used by ndbutil only
Debug = Yes
LogFile = ndbutil.log
Prompt = No
Operation = recover

[ foo-1 ]

IsRemote = No
IsHashed = No
DoHashing = No
MaxLimit = l
PartitionType = HASH

;---Variables used by ndbutil only
recovery.Fatal = No

[ foo-2 ]

IsRemote = No
IsHashed = No
DoHashing = No
MinLimit = m
PartitionType = HASH

;---Variables used by ndbutil only
recovery.Fatal = No

Point your zkDB_Open() call at this config file.

2.2 Remote Databases

A remote database resides on a different machine from the process using it and is accessed through a server. The server is transparent to the user of the zkDB library: all you need to do is call zkDB_Open() with a valid configuration file. A configuration file for opening a remote 2-partition database will look something like this:

;
; Foo Database Remote Access Configuration File
;

[ main ]

partitions = foo-1,foo-2

[ foo-1 ]

IsRemote = Yes
IsHashed = No
MaxLimit = l
AddressPath = dbhomeip.freedom.net:51131
AuthKey = jK3=;Sa0
ConnectionTimeout = 1500

[ foo-2 ]
IsRemote = Yes
IsHashed = No
MinLimit = m
AddressPath = dbhomeip.freedom.net:51132
AuthKey = jk3=;Sa0
ConnectionTimeout = 1500

Point the zkDB_Open() function at this file. For each partition, there should also be a corresponding local configuration file on the machine where the partition is to be stored. If the port number, the AuthKey, and the partition names match between the remote file and the local file used by the database server (see section 2.3), then everything should go smoothly.
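The three things that must match (port number, AuthKey, and partition name) can be checked mechanically before starting a client. The following Python sketch is not part of zkDB: the function name is hypothetical, and it assumes the two files have already been read into plain dictionaries keyed by section and option name, spelled as in the examples of sections 2.2 and 2.3.

```python
def check_remote_against_server(partition, remote_section, server_config):
    """Return a list of mismatches between a client-side remote partition
    section and a server configuration (both given as plain dicts).

    Hypothetical helper for illustration; not part of the zkDB library.
    """
    problems = []
    command_server = server_config.get("CommandServer", {})

    # Compare ports only: the client names a host, the server binds ":port".
    client_port = remote_section.get("AddressPath", "").rpartition(":")[2]
    server_port = command_server.get("AddressPath", "").rpartition(":")[2]
    if client_port != server_port:
        problems.append(f"port mismatch: {client_port} vs {server_port}")

    # The key comparison is case-sensitive.
    if remote_section.get("AuthKey") != command_server.get("AuthKey"):
        problems.append("AuthKey mismatch")

    served = server_config.get("main", {}).get("partitions", "")
    if partition not in [p.strip() for p in served.split(",")]:
        problems.append(f"partition {partition} not served")
    return problems


server = {
    "CommandServer": {"AuthKey": "jK3=;Sa0", "AddressPath": ":51131"},
    "main": {"partitions": "foo-1"},
}
foo1 = {"AddressPath": "dbhomeip.freedom.net:51131", "AuthKey": "jK3=;Sa0"}
foo2 = {"AddressPath": "dbhomeip.freedom.net:51132", "AuthKey": "jk3=;Sa0"}

print(check_remote_against_server("foo-1", foo1, server))       # []
print(len(check_remote_against_server("foo-2", foo2, server)))  # 3
```

An empty list means the client section is consistent with that server file. Checking foo-2 against the foo-1 server reports all three mismatches, which is expected because each partition has its own server file.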

2.3 Database Server

The database server uses the same configuration file as a local database, with additional parameters in a CommandServer section. It looks like this:

;
; Foo Database Server Access Configuration File
;

[ CommandServer ]

AuthKey = jK3=;Sa0
AccessFile = /freedom/etc/db/access.conf
PidFile = /freedom/var/run/zkdbsrv-foodb.pid
LogFile = /freedom/var/log/zkdbsrv-foodb.log
MaxIdleTime = 0
MaxConnections = 100
AddressPath = :51131

[ main ]

Database = foodb
DefaultHomeDir = /freedom/db/foodb/
DefaultTransactLogDir = /freedom/db/translog/foodb/
partitions = foo-1
isolated_partitions = Yes

[ foo-1 ]

IsRemote = No
IsHashed = No
DoHashing = No
PartitionType = HASH

This is the file used by the database server that serves the first partition (foo-1) of the foodb database above. There should be one such file for each partition. The main and foo-1 sections are identical to those of a local database, since the server opens the database locally.

3 - Configuration - The complete list of options

3.1 Configuration options

The zkDB library uses the zkConfig framework for configuration, so both command-line and configuration-file options are available. The following table lists the zkDB configuration options. Column D indicates database options, column LP indicates local partition options, and column RP indicates remote partition options. The 1.2 column indicates whether the option is available in the Freedom 1.2 release. When an option is specified at both the database and the partition level, the partition option takes precedence.

Option name

D

LP

RP

1.2

Option Parameter

Default

Description

DefaultHomeDir

X



X

Directory

---

Home directory of the database partitions. Each partition will be located in its directory (under HomeDir/partition_name).

DefaultTransactLogDir

X



X

Directory

---

Transaction log directory. Same rule as for DefaultHomeDir.

HomeDir

X

X

Directory

---

Used to explicitly specify a partition directory instead of using the default database directory.

TransactLogDir

X

X

Directory

---

Used to explicitly specify a partition Transaction log directory instead of using the default directory.

Partitions

X

X

Partition list

---

List of partitions for the database

IsolatedPartitions

X

Boolean

Yes

Indicates if partitions are isolated or if they can overlap. Overlapping is not supported in Core 1.2 release.

IsRemote

X

X

X

Boolean

No

Indicates if the partition is local or remote.

IsHashed

X

X

X

Boolean

---

Used when partitioning is based on hashed keys to specify that keys passed to zkDB are already hashed. (Cannot be set if DoHashing is set.) The hash is currently just the ASCII value of the first character.

DoHashing

X

X

Boolean

No

Used when partitioning is based on hashed keys to specify that keys passed to zkDB should be hashed. (Cannot be set if IsHashed is set.) Not supported in the Freedom 1.2 release.

MinLimit

X

X

X

Integer

---

Beginning of the range served by a partition. If key hashing is performed, then a value between 1 and 256 should be specified; otherwise the real key values should be used.

MaxLimit

X

X

X

Integer

---

End of the range served by a partition. If key hashing is performed, then a value between 1 and 256 should be specified; otherwise the real key values should be used.

ReadOnly

X

X

X

Boolean

No

Indicates if partition is read only

AddressPath

X

X

String

---

Server address path (e.g. "zkdbsrv.zks.net:51130")

ConnectionTimeout

X

X

Time (seconds)

30

Maximum connection time for transaction

Authkey

X

X

Key

---

Packet authentication key (must be same as DB server)

PartitionType

X

X

HASH or BTREE

BTREE

Database type of the partition
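The relationship between the database-level defaults (DefaultHomeDir, DefaultTransactLogDir) and the partition-level overrides (HomeDir, TransactLogDir) can be sketched in Python as follows. This illustrates the rule as described in the table and is not zkDB code: the function name partition_dir is hypothetical, the sections are assumed to be plain dictionaries, and the default location is assumed to be the database default directory joined with the partition name.

```python
import posixpath


def partition_dir(main_section, partition_name, partition_section,
                  explicit_key, default_key):
    """Resolve a partition directory.

    Hypothetical helper for illustration; not part of the zkDB library.
    An explicit partition option wins; otherwise the partition lives in a
    subdirectory of the database default, named after the partition.
    """
    if explicit_key in partition_section:
        return partition_section[explicit_key]
    return posixpath.join(main_section[default_key], partition_name)


main = {
    "DefaultHomeDir": "/freedom/db/foodb",
    "DefaultTransactLogDir": "/freedom/db/translog/foodb",
}

# No override: falls back to DefaultHomeDir/partition_name.
home = partition_dir(main, "foo-1", {}, "HomeDir", "DefaultHomeDir")
print(home)  # /freedom/db/foodb/foo-1

# Explicit override (the path below is a made-up example value).
log = partition_dir(main, "foo-2", {"TransactLogDir": "/fast/translog/foo-2"},
                    "TransactLogDir", "DefaultTransactLogDir")
print(log)  # /fast/translog/foo-2
```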

3.2 Performance Options

The following table contains performance-related options, listed from the most impactful to the least. Note that these options apply only to local partition configuration.

Option name | Option Parameter | Default | Description
CacheSize | Integer | 1048576 | Partition cache size
Checkpoint.MinTime | Time (seconds) | 5 | Minimum time between two checkpoints
Checkpoint.MinSize | Size (kBytes) | 32 | Minimum size before doing checkpoints
EarlyLockRecognition | Boolean | No | Enables quick lock recognition. Prevents lock stagnation when a large number of processes concurrently perform update or fetch operations on the same record.
PageSize | Integer | 0 | Partition page size (0 means use system settings).
LogFileSize | Integer | 10485760 | Size of partition log file
LogFlash | Boolean | No | Force transactions to be logged to disk immediately (as opposed to letting the operating system handle this)
KeyLock | Boolean | Yes | Indicates whether key locking or page locking should be used.
MaxLockAttempts | Integer | --- | Maximum number of attempts for getting a page or key lock.

3.3 Configuration file

The database options should be specified in a configuration file using the following section names:

[ Main ]
Partitions = part-1,part-2,part-3
[ part-1 ]
...
[ part-2 ]
...
[ part-3 ]
...

The names part-1, part-2, and part-3 are user-defined and represent database partitions.
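Because the partition names are user-defined, a common mistake is to list a partition in the main section without giving it a section of its own. A minimal Python sketch of such a check follows. It is not part of zkDB: the function name is hypothetical, the configuration is assumed to be a plain dictionary of sections, and section and option names are matched case-insensitively only because this document writes both [ main ] and [ Main ], and both Partitions and partitions.

```python
def missing_partition_sections(config):
    """Return the partitions listed in the main section that have no
    section of their own.

    Hypothetical helper for illustration; not part of the zkDB library.
    """
    # Find the main section and its partition list, ignoring letter case.
    main = next((opts for name, opts in config.items()
                 if name.lower() == "main"), {})
    listed = next((value for key, value in main.items()
                   if key.lower() == "partitions"), "")
    names = [p.strip() for p in listed.split(",") if p.strip()]
    return [name for name in names if name not in config]


config = {
    "Main": {"Partitions": "part-1,part-2,part-3"},
    "part-1": {},
    "part-2": {},
}
print(missing_partition_sections(config))  # ['part-3']
```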

4 - Future Plans

Even though the zkDB option syntax allows for multiple partitions, only one partition is supported in the Freedom 1.1 release. It is also possible to configure partitions so that their data ranges overlap (specified with the Isolated_Partitions option); however, this is not supported in the 1.2 release.

It is possible to have zkDB perform hashing on keys and to partition the database using hashed keys. This provides an easy way of uniformly distributing data over the different partitions. When real keys are used (which can be strings, for example), an analysis of the key distribution is required in order to ensure a good distribution of data over the multiple partitions. When key hashing is performed, the keys are mapped to unique values that are generated in pseudo-random fashion to provide a new pseudo-random ordering of the keys. The partition limits are then specified using values between 1 and 256 (and not using hash key values).
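Range-based partition selection can be illustrated with a short Python sketch. It is not the zkDB implementation. It uses the simple hash mentioned in section 3.1 (the ASCII value of the first character of the key) and assumes that MinLimit and MaxLimit are inclusive bounds and that a missing limit leaves that side of the range open. The function names and the dictionary layout are hypothetical.

```python
def first_char_hash(key):
    """Hash a key as the character code of its first character."""
    if not key:
        raise ValueError("key must not be empty")
    return ord(key[0])


def select_partition(key, partitions):
    """Return the name of the partition whose range contains the key's hash.

    `partitions` maps a partition name to a (min_limit, max_limit) pair;
    None means the limit is not set. Returns None if no range matches.
    Hypothetical helper for illustration; not part of the zkDB library.
    """
    value = first_char_hash(key)
    for name, (low, high) in partitions.items():
        if (low is None or value >= low) and (high is None or value <= high):
            return name
    return None


# Mirrors the two-partition example: foo-1 ends at 'l', foo-2 starts at 'm'.
partitions = {
    "foo-1": (None, ord("l")),
    "foo-2": (ord("m"), None),
}
print(select_partition("alice", partitions))  # foo-1
print(select_partition("mike", partitions))   # foo-2
```

With isolated partitions the ranges do not overlap, so at most one partition matches and the iteration order does not affect the result.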


Copyright © 2000 Zero-Knowledge Systems Inc.
All rights reserved.