berkdb open

APIRef

berkdb open
	[-btree | -hash | -recno | -queue | -unknown]
	[-cachesize {gbytes bytes ncache}]
	[-create]
	[-delim delim]
	[-dup]
	[-dupsort]
	[-env env]
	[-errfile filename]
	[-excl]
	[-ffactor density]
	[-len len]
	[-mode mode]
	[-nelem size]
	[-pad pad]
	[-pagesize pagesize]
	[-rdonly]
	[-recnum]
	[-renumber]
	[-snapshot]
	[-source file]
	[-truncate]
	[-upgrade]
	[--]
	[file [database]]

Description

The berkdb open command opens, and optionally creates, a database. The returned database handle is bound to a Tcl command of the form dbN, where N is an integer starting at 0 (e.g., db0 and db1). It is through this Tcl command that the script accesses the database methods.

The options are as follows:

-btree
Open/create a database of type Btree. The Btree format is a representation of a sorted, balanced tree structure.

-hash
Open/create a database of type Hash. The Hash format is an extensible, dynamic hashing scheme.

-queue
Open/create a database of type Queue. The Queue format supports fast access to fixed-length records accessed sequentially or by logical record number.

-recno
Open/create a database of type Recno. The Recno format supports fixed- or variable-length records, accessed sequentially or by logical record number, and optionally retrieved from a flat text file.

-unknown
The database is of an unknown type, and must already exist.

-cachesize {gbytes bytes ncache}
Set the size of the database's shared memory buffer pool, i.e., the cache, to gbytes gigabytes plus bytes. The cache should be the size of the normal working data set of the application, with some small amount of additional memory for unusual situations. (Note, the working set is not the same as the number of simultaneously referenced pages, and should be quite a bit larger!)

The default cache size is 256KB, and may not be specified as less than 20KB. Any cache size less than 500MB is automatically increased by 25% to account for buffer pool overhead; cache sizes larger than 500MB are used as specified.
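The sizing rules above come down to a little arithmetic. The following is a minimal Python sketch of that arithmetic, not a call into Berkeley DB: the function name is hypothetical, 1KB is assumed to mean 1024 bytes, and a request of exactly 500MB is assumed to be used as specified, since the text does not cover that case.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

MIN_CACHE = 20 * KB        # smallest cache that may be specified
OVERHEAD_LIMIT = 500 * MB  # requests below this get 25% added


def effective_cache_bytes(gbytes, nbytes):
    """Return the cache size in bytes after the overhead adjustment (hypothetical helper)."""
    requested = gbytes * GB + nbytes
    if requested < MIN_CACHE:
        raise ValueError("cache size may not be less than 20KB")
    if requested < OVERHEAD_LIMIT:
        # Add one quarter of the request for buffer pool overhead.
        return requested + requested // 4
    # Assumption: exactly 500MB and above are used as specified.
    return requested


print(effective_cache_bytes(0, 256 * KB))  # the 256KB default
print(effective_cache_bytes(1, 0))         # a 1GB request
```

Under these assumptions the 256KB default grows to 320KB, while a 1GB request is left unchanged.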

It is possible to specify caches to Berkeley DB that are large enough so that they cannot be allocated contiguously on some architectures, e.g., some releases of Solaris limit the amount of memory that may be allocated contiguously by a process. If ncache is 0 or 1, the cache will be allocated contiguously in memory. If it is greater than 1, the cache will be broken up into ncache equally sized separate pieces of memory.

For information on tuning the Berkeley DB cache size, see Selecting a cache size.

As databases opened within Berkeley DB environments use the cache specified to the environment, it is an error to attempt to set a cache in a database created within an environment.

-create
Create any underlying files, as necessary. If the files do not already exist and the -create argument is not specified, the call will fail.

-delim delim
Set the delimiting byte used to mark the end of a record in the backing source file for the Recno access method.

This byte is used for variable-length records, if the -source argument file is specified. If the -source argument file is specified and no delimiting byte was specified, <newline> characters (i.e., ASCII 0x0a) are interpreted as end-of-record markers.
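As an illustration of how a delimiting byte divides a flat file into records, here is a small Python model. It is only a sketch: the function name is hypothetical, and treating a final record that lacks a trailing delimiter as a complete record is an assumption the text does not confirm.

```python
def split_records(source_bytes, delim=b"\n"):
    """Split a flat file's contents into variable-length records (illustrative model)."""
    if len(delim) != 1:
        raise ValueError("the delimiter is a single byte")
    records = source_bytes.split(delim)
    # A trailing delimiter ends the last record; it does not start an empty one.
    if records and records[-1] == b"":
        records.pop()
    return records


# Default delimiter: <newline> (ASCII 0x0a).
print(split_records(b"alpha\nbeta\ngamma\n"))
# A different delimiting byte, as -delim would select.
print(split_records(b"alpha:beta:gamma", delim=b":"))
```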

-dup
Permit duplicate data items in the tree, i.e., inserting a key/data pair whose key already exists in the tree will succeed. The ordering of duplicates in the tree is determined by the order of insertion, unless the ordering is otherwise specified by use of a cursor or a duplicate comparison function.

It is an error to specify both -dup and -recnum.

-dupsort
Sort duplicates within a set of data items. A default, lexical comparison will be used. Specifying that duplicates are to be sorted changes the behavior of the db put operation as well as the dbc put operation when the -keyfirst, -keylast and -current options are specified.
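The difference between insertion-ordered duplicates (-dup) and sorted duplicates (-dupsort) can be pictured with a toy Python model. The class and method names are hypothetical and this is not the Berkeley DB API. It assumes that a plain put appends to the duplicate set and that the default lexical comparison is a bytewise one.

```python
import bisect
from collections import defaultdict


class DupModel:
    """Toy model of duplicate handling; not the Berkeley DB API."""

    def __init__(self, dupsort=False):
        self.dupsort = dupsort
        self.items = defaultdict(list)

    def put(self, key, data):
        if self.dupsort:
            # Keep the duplicate set in lexical (bytewise) order.
            bisect.insort(self.items[key], data)
        else:
            # Plain duplicates stay in insertion order.
            self.items[key].append(data)

    def get_all(self, key):
        return list(self.items[key])


unsorted_db = DupModel()
sorted_db = DupModel(dupsort=True)
for value in (b"pear", b"apple", b"fig"):
    unsorted_db.put(b"fruit", value)
    sorted_db.put(b"fruit", value)

print(unsorted_db.get_all(b"fruit"))  # insertion order
print(sorted_db.get_all(b"fruit"))    # lexical order
```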

-env env
If no -env argument is given, the database is standalone, i.e., it is not part of any Berkeley DB environment.

If a -env argument is given, the database is created within the specified Berkeley DB environment. The database access methods automatically make calls to the other subsystems in Berkeley DB based on the enclosing environment. For example, if the environment has been configured to use locking, then the access methods will automatically acquire the correct locks when reading and writing pages of the database.

-errfile filename

When an error occurs in the Berkeley DB library, a Berkeley DB error or an error return value is returned by the function. In some cases, however, the errno value may be insufficient to completely describe the cause of the error, especially during initial application debugging.

The -errfile argument is used to enhance the mechanism for reporting error messages to the application by specifying a file to be used for displaying additional Berkeley DB error messages. In some cases, when an error occurs, Berkeley DB will output an additional error message to the specified file reference.

The error message will consist of a Tcl command name and a colon (":"), an error string, and a trailing <newline> character. If the database was opened in an environment the Tcl command name will be the environment name (e.g., env0), otherwise it will be the database command name (e.g., db0).

This error logging enhancement does not slow performance or significantly increase application size, and may be run during normal operation as well as during application debugging.

For database handles opened inside of Berkeley DB environments, specifying the -errfile argument affects the entire environment and is equivalent to specifying the same argument to the berkdb env command.

-excl
Return an error if the file already exists. Underlying filesystem primitives are used to implement this flag. For this reason it is only applicable to the physical database file and cannot be used to test if a database in a file already exists.

-ffactor density
Set the desired density within the hash table.

The density is an approximation of the number of keys allowed to accumulate in any one bucket.

-len len
For the Queue access method, specify that the records are of length len.

For the Recno access method, specify that the records are fixed-length, not byte delimited, and are of length len.

Any records added to the database that are less than len bytes long are automatically padded (see the -pad argument for more information).

Any attempt to insert records into the database that are greater than len bytes long will cause the call to fail immediately and return an error.
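The fixed-length rules above (pad short records, reject long ones) can be sketched in Python as follows. The helper name is hypothetical. The default pad byte is the <space> character given in the -pad description, and appending the padding at the end of the record is an assumption.

```python
def fit_record(data, length, pad=b" "):
    """Return data adjusted to exactly `length` bytes (illustrative model)."""
    if len(pad) != 1:
        raise ValueError("the pad character is a single byte")
    if len(data) > length:
        # Records longer than len cause the call to fail.
        raise ValueError("record is longer than %d bytes" % length)
    # Assumption: short records are padded on the right.
    return data.ljust(length, pad)


print(fit_record(b"abc", 8))            # padded with spaces
print(fit_record(b"abc", 8, pad=b"*"))  # padded with a custom byte
```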

-mode mode

On UNIX systems, or in IEEE/ANSI Std 1003.1 (POSIX) environments, all files created by the access methods are created with mode mode (as described in chmod(2)) and modified by the process' umask value at the time of creation (see umask(2)). The group ownership of created files is based on the system and directory defaults, and is not further specified by Berkeley DB. If mode is 0, files are created readable and writeable by both owner and group. On Windows systems, the mode argument is ignored.
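The interaction between mode and the umask is a bitwise operation, shown here as a short Python sketch. The function name is hypothetical. That the umask is also applied to the default used when mode is 0 is an assumption; the text states only that such files are readable and writable by owner and group.

```python
DEFAULT_MODE = 0o660  # readable and writable by owner and group


def created_file_mode(mode, umask):
    """Return the permission bits a newly created file would receive (illustrative model)."""
    requested = mode if mode != 0 else DEFAULT_MODE
    # The umask clears the permission bits it names.
    return requested & ~umask


print(oct(created_file_mode(0o644, 0o022)))
print(oct(created_file_mode(0o666, 0o027)))
print(oct(created_file_mode(0, 0o022)))
```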

-nelem size
Set an estimate of the final size of the hash table.

If not set or set too low, hash tables will still expand gracefully as keys are entered, although a slight performance degradation may be noticed.

-pad pad
Set the padding character for short, fixed-length records for the Queue and Recno access methods.

If no pad character is specified, <space> characters (i.e., ASCII 0x20) are used for padding.

-pagesize pagesize
Set the size of the pages used to hold items in the database, in bytes. The minimum page size is 512 bytes and the maximum page size is 64K bytes. If the page size is not explicitly set, one is selected based on the underlying filesystem I/O block size. The automatically selected size has a lower limit of 512 bytes and an upper limit of 16K bytes.
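The limits above can be summarized in a short Python sketch. Both function names are hypothetical, 1K is assumed to mean 1024 bytes, and the automatic selection is modeled as a simple clamp of the filesystem block size. Any further constraints the library applies are not modeled.

```python
MIN_PAGE = 512             # minimum page size in bytes
MAX_PAGE = 64 * 1024       # maximum explicitly set page size
MAX_AUTO_PAGE = 16 * 1024  # upper limit for the automatic choice


def check_page_size(pagesize):
    """Validate an explicitly requested page size (illustrative model)."""
    if not MIN_PAGE <= pagesize <= MAX_PAGE:
        raise ValueError("page size must be between 512 and 65536 bytes")
    return pagesize


def auto_page_size(io_block_size):
    """Model the automatic choice as the block size clamped to 512..16K."""
    return max(MIN_PAGE, min(io_block_size, MAX_AUTO_PAGE))


print(check_page_size(8192))
print(auto_page_size(4096))
print(auto_page_size(65536))
```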

For information on tuning the Berkeley DB page size, see Selecting a page size.

-rdonly
Open the database for reading only. Any attempt to modify items in the database will fail regardless of the actual permissions of any underlying files.

-recnum
Support retrieval from the Btree using record numbers.

Logical record numbers in Btree databases are mutable in the face of record insertion or deletion. See the -renumber argument for further discussion.

Maintaining record counts within a Btree introduces a serious point of contention, namely the page locations where the record counts are stored. In addition, the entire tree must be locked during both insertions and deletions, effectively single-threading the tree for those operations. Specifying -recnum can result in serious performance degradation for some applications and data sets.

It is an error to specify both -dup and -recnum.

-renumber
Specifying the -renumber argument causes the logical record numbers to be mutable, and change as records are added to and deleted from the database. For example, the deletion of record number 4 causes records numbered 5 and greater to be renumbered downward by 1. If a cursor was positioned to record number 4 before the deletion, it will reference the new record number 4, if any such record exists, after the deletion. If a cursor was positioned after record number 4 before the deletion, it will be shifted downward 1 logical record, continuing to reference the same record as it did before.

Using the db put or dbc put interfaces to create new records will cause the creation of multiple records if the record number is more than one greater than the largest record currently in the database. For example, creating record 28, when record 25 was previously the last record in the database, will create records 26 and 27 as well as 28.

If a created record is not at the end of the database, all records following the new record will be automatically renumbered upward by 1. For example, the creation of a new record numbered 8 causes records numbered 8 and greater to be renumbered upward by 1. If a cursor was positioned to record number 8 or greater before the insertion, it will be shifted upward 1 logical record, continuing to reference the same record as it did before.

For these reasons, concurrent access to a Recno database with the -renumber flag specified may be largely meaningless, although it is supported.
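The renumbering rules can be modeled with an ordinary Python list, which helps when reasoning about what a given record number refers to after an insertion or deletion. This is a toy model with hypothetical names, not the Berkeley DB API. It assumes that the records implicitly created to fill a gap are empty.

```python
class RenumberModel:
    """Toy model of a Recno database opened with -renumber; record numbers start at 1."""

    def __init__(self, records=()):
        self.records = list(records)

    def _check(self, recno, upper):
        if not 1 <= recno <= upper:
            raise KeyError(recno)

    def get(self, recno):
        self._check(recno, len(self.records))
        return self.records[recno - 1]

    def delete(self, recno):
        self._check(recno, len(self.records))
        # Records after recno are renumbered downward by 1.
        del self.records[recno - 1]

    def create(self, recno, data):
        last = len(self.records)
        if recno < 1:
            raise KeyError(recno)
        if recno > last:
            # Assumption: gap records are created empty.
            self.records.extend([b""] * (recno - last - 1))
            self.records.append(data)
        else:
            # Records at recno and above are renumbered upward by 1.
            self.records.insert(recno - 1, data)


db = RenumberModel(("r%d" % i).encode() for i in range(1, 26))
db.create(28, b"new")  # also creates records 26 and 27
db.delete(4)           # the old record 5 becomes record 4
print(len(db.records), db.get(4))
```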

-snapshot
This argument specifies that any specified -source file be read in its entirety when the database is opened. If this argument is not specified, the -source file may be read lazily.

-source file
Set the underlying source file for the Recno access method. The purpose of the -source file is to provide fast access and modification to databases that are normally stored as flat text files.

If the -source argument is given, it specifies an underlying flat text database file that is read to initialize a transient record number index. In the case of variable-length records, the records are separated as specified by -delim. For example, standard UNIX byte stream files can be interpreted as a sequence of variable-length records separated by <newline> characters.

In addition, when cached data would normally be written back to the underlying database file (e.g., the db close or db sync commands are called), the in-memory copy of the database will be written back to the -source file.

By default, the backing source file is read lazily, i.e., records are not read from the file until they are requested by the application. If multiple processes (not threads) are accessing a Recno database concurrently and either inserting or deleting records, the backing source file must be read in its entirety before more than a single process accesses the database, and only that process should specify the backing source argument as part of the berkdb open call. See the -snapshot argument for more information.

Reading and writing the backing source file specified by -source cannot be transactionally protected because it involves filesystem operations that are not part of the Berkeley DB transaction methodology. For this reason, if a temporary database is used to hold the records, i.e., no file argument was specified to the berkdb open call, it is possible to lose the contents of the -source file, e.g., if the system crashes at the right instant. If a file is used to hold the database, i.e., a file name was specified as the file argument to berkdb open, normal database recovery on that file can be used to prevent information loss, although it is still possible that the contents of the -source file will be lost if the system crashes.

The -source file must already exist (but may be zero-length) when berkdb open is called.

It is not an error to specify a read-only -source file when creating a database, nor is it an error to modify the resulting database. However, any attempt to write the changes to the backing source file using either the db close or db sync commands will fail. Specifying the -nosync argument to the db close command will stop it from attempting to write the changes to the backing file; instead, they will be silently discarded.

For all of the above reasons, the -source file is generally used to specify databases that are read-only for Berkeley DB applications, and that are either generated on the fly by software tools, or modified using a different mechanism, e.g., a text editor.

-truncate
Physically truncate the underlying file, discarding all previous databases it might have held. Underlying filesystem primitives are used to implement this flag. For this reason it is only applicable to the physical file and cannot be used to discard databases within a file.

The -truncate argument cannot be transaction protected, and it is an error to specify it in a transaction protected environment.

-upgrade
Upgrade the database represented by file, if necessary.

Note: Database upgrades are done in place and are destructive, e.g., if pages need to be allocated and no disk space is available, the database may be left corrupted. Backups should be made before databases are upgraded. See Upgrading databases for more information.

--
Mark the end of the command arguments.

file
The name of a single physical file on disk that will be used to back the database.

database
The database argument allows applications to have multiple databases inside of a single physical file. This is useful when the databases are both numerous and reasonably small, in order to avoid creating a large number of underlying files. It is an error to attempt to open a second database in a file that was not initially created using a database name.

The berkdb open command returns a database handle on success.

In the case of error, a Tcl error is thrown.

Copyright Sleepycat Software