Please send comments, corrections and angry letters to Quinlan Pfiffer.
OlegDB
General Overview ¶
OlegDB is a concurrent, pretty fast K/V hash-table with an Erlang frontend. It uses the Murmur3 hashing algorithm to hash and index keys. We chose Erlang for the server because it's functional, uses the actor model and the pattern matching is ridiculous.
Installation ¶
Installing OlegDB is pretty simple. You only need a POSIX compliant system, make, gcc/clang (that's all we've tested) and Erlang. You'll also need the source code for Oleg.
Once you have your fanciful medley of computer science tools, you're ready to dive into a lengthy and complex process of program compilation. Sound foreboding? Have no fear, people have been doing this for at least a quarter of a century.
I'm going to assume you've extracted the source tarball into a folder called ~/src/olegdb and that you haven't cd'd into it yet. Let's smash some electrons together:
$ cd ~/src/olegdb
$ make
$ sudo make install
If you really wanted to, you could specify a different installation directory by setting PREFIX. The default is /usr/local.
$ sudo make PREFIX=/usr/ install
Actually running OlegDB and getting it to do stuff after this point is trivial. If your installation prefix is in your PATH, you should just be able to run something like the following:
$ olegdb <data_directory>
...where <data_directory> is the place you want Oleg to store persistent data. Make it /dev/null if you want, I don't care. You can also specify IP/port information from the command line:
$ olegdb /tmp 1978 #Starts OlegDB listening on port 1978
$ olegdb /tmp 0.0.0.0 1337 #Starts OlegDB listening on the 0.0.0.0 IP, with port 1337
$ olegdb /tmp data.shithouse.tv 666 #Hostnames work too
Getting Started ¶
Communicating with OlegDB is done via a pretty simple REST interface. You POST to create/update records, GET to retrieve them, DELETE to delete, and HEAD to get back some information about them. Probably.
For example, to store the value Raphael into the named database turtles under the key red you could use something like the following:
$ curl -X POST -d 'Raphael' http://localhost:8080/turtles/red
Retrieving data is just as simple:
$ curl http://localhost:8080/turtles/red
Deleting keys can be done by using DELETE:
$ curl -X DELETE http://localhost:8080/turtles/red
You can also tell Oleg what the Content-Type is:
$ curl -X POST -H "Content-Type: text/html" -d '<p>Raphael</p>' http://localhost:8080/turtles/red
OlegDB supports lazy key expiration. You can specify an expiration date by setting the X-OlegDB-use-by header to a UTC POSIX timestamp:
$ curl -v -X POST \
-H "X-OlegDB-use-by: $(date +%s)" \
-H "Content-Type: application/json" \
-d '{turtle: "Johnny", age: 34}' http://localhost:8080/turtles/Johnny
> POST /turtles/Johnny HTTP/1.1
> User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0
> Host: localhost:8080
> Accept: */*
> X-OlegDB-use-by: 1394323192
> Content-Type: application/json
> Content-Length: 27
>
* upload completely sent off: 27out of 27 bytes
< HTTP/1.1 200 OK
< Server: OlegDB/fresh_cuts_n_jams
< Content-Type: text/plain
< Connection: close
< Content-Length: 7
<
無駄
$ curl -v http://localhost:8080/turtles/Johnny
> GET /turtles/Johnny HTTP/1.1
> User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0
> Host: localhost:8080
> Accept: */*
>
< HTTP/1.1 404 Not Found
< Status: 404 Not Found
< Server: OlegDB/fresh_cuts_n_jams
< Content-Length: 26
< Connection: close
< Content-Type: text/plain
<
These aren't your ghosts.
As you can hopefully tell, the POST succeeds and a 200 OK is returned. We used the shell command `date +%s`, which returns the current time as a POSIX timestamp, so the key expires the moment it is stored. Trying to access the key immediately afterwards results in a 404, because the key has already expired.
If you want to retrieve the expiration date of a key, you can do so by sending HEAD:
$ curl -v -X HEAD http://localhost:8080/turtles/Johnny
> HEAD /turtles/Johnny HTTP/1.1
> User-Agent: curl/7.35.0
> Host: localhost:8080
> Accept: */*
>
< HTTP/1.1 200 OK
* Server OlegDB/fresh_cuts_n_jams is not blacklisted
< Server: OlegDB/fresh_cuts_n_jams
< Content-Length: 0
< Content-Type: application/json
< Expires: 1395368972
<
What the hell is up with your responses? ¶
We have fun with our HTTP responses. Really all you need is the HTTP status code to see if something worked or not. 404 means not found, 200 means the operation completed successfully, 500 if something bad happened, etc.
liboleg
Macros ¶
VERSION ¶
#define VERSION "0.1.0"
The current version of OlegDB.
KEY_SIZE ¶
#define KEY_SIZE 250
The hardcoded upper bound for key lengths.
HASH_MALLOC ¶
#define HASH_MALLOC 65536
The size, in bytes, to allocate when initially creating the database. ol_bucket pointers are stored here.
PATH_LENGTH ¶
#define PATH_LENGTH 256
The maximum length of a database's path.
DB_NAME_SIZE ¶
#define DB_NAME_SIZE 64
Database maximum name length.
DEVILS_SEED ¶
#define DEVILS_SEED 666
The seed to feed into the murmur3 algorithm.
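OlegDB's own hashing code isn't reproduced in this document, so the following is only a sketch of what hashing a key with this seed might look like. It assumes the 32-bit x86 variant of Murmur3 (which would fit the uint32_t hash field of ol_bucket, but that is a guess) and assumes a slot is picked by taking the hash modulo the number of slots. All sketch_ names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEVILS_SEED 666

static uint32_t sketch_rotl32(uint32_t x, int r) {
    return (x << r) | (x >> (32 - r));
}

/* MurmurHash3 x86_32. Blocks are assembled byte by byte (little-endian order)
 * so the result does not depend on the host's endianness. */
static uint32_t sketch_murmur3_32(const char *key, size_t len, uint32_t seed) {
    const uint8_t *data = (const uint8_t *)key;
    const uint32_t c1 = 0xcc9e2d51u, c2 = 0x1b873593u;
    const size_t nblocks = len / 4;
    const uint8_t *tail = data + nblocks * 4;
    uint32_t h = seed;
    uint32_t k = 0;
    size_t i;

    /* Body: mix in each full 4-byte block. */
    for (i = 0; i < nblocks; i++) {
        const uint8_t *p = data + i * 4;
        k = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
            ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
        k *= c1; k = sketch_rotl32(k, 15); k *= c2;
        h ^= k; h = sketch_rotl32(h, 13); h = h * 5 + 0xe6546b64u;
    }

    /* Tail: mix in the remaining 1 to 3 bytes, if any. */
    k = 0;
    switch (len & 3) {
    case 3: k ^= (uint32_t)tail[2] << 16; /* fall through */
    case 2: k ^= (uint32_t)tail[1] << 8;  /* fall through */
    case 1: k ^= tail[0];
            k *= c1; k = sketch_rotl32(k, 15); k *= c2;
            h ^= k;
    }

    /* Finalization: force all bits of the hash to avalanche. */
    h ^= (uint32_t)len;
    h ^= h >> 16; h *= 0x85ebca6bu;
    h ^= h >> 13; h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Assumption: the slot index is simply hash modulo slot count. */
static size_t sketch_slot_for_key(const char *key, size_t klen, size_t slots) {
    assert(slots > 0);
    return sketch_murmur3_32(key, klen, DEVILS_SEED) % slots;
}
```

Calling sketch_slot_for_key("red", 3, slots) always yields the same index for the same key, which is the property a hashtable needs.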
Type Definitions ¶
ol_val ¶
typedef unsigned char *ol_val;
Typedef for the values that can be stored inside the database.
Enums ¶
ol_feature_flags ¶
typedef enum {
OL_F_APPENDONLY = 1 << 0,
OL_F_SEMIVOL = 1 << 1,
OL_F_REGDUMPS = 1 << 2
} ol_feature_flags;
Feature flags tell the database what it should be doing.
OL_F_APPENDONLY: Enable the append only log
OL_F_SEMIVOL: Tell servers that it's okay to fsync every once in a while
OL_F_REGDUMPS: Tell servers to snapshot the data using ol_save() regularly
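Because each flag is a distinct power of two, several of them can be packed into a single int, which is how the feature_set field of ol_database is described. Here is a minimal sketch of that bitmask handling. The sketch_ helpers are hypothetical: they only mirror the (int, int*) shape of the enable, disable and is_enabled function pointers, and liboleg's real implementations may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Copied from the enum above. */
typedef enum {
    OL_F_APPENDONLY = 1 << 0,
    OL_F_SEMIVOL    = 1 << 1,
    OL_F_REGDUMPS   = 1 << 2
} ol_feature_flags;

/* Hypothetical helper: switch a feature bit on. */
static void sketch_enable(int feature, int *feature_set) {
    assert(feature_set != NULL);
    *feature_set |= feature;
}

/* Hypothetical helper: switch a feature bit off, leaving the others alone. */
static void sketch_disable(int feature, int *feature_set) {
    assert(feature_set != NULL);
    *feature_set &= ~feature;
}

/* Hypothetical helper: report whether a feature bit is set. */
static bool sketch_is_enabled(int feature, int *feature_set) {
    assert(feature_set != NULL);
    return (*feature_set & feature) != 0;
}
```

Passing OL_F_APPENDONLY | OL_F_REGDUMPS as a features value would set two bits at once, which is what OR-ing the flags together amounts to.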
ol_state_flags ¶
typedef enum {
OL_S_STARTUP = 0,
OL_S_AOKAY = 1
} ol_state_flags;
State flags describe what state the database is currently in.
OL_S_STARTUP: The DB is starting, duh.
OL_S_AOKAY: The database is a-okay
Structures ¶
ol_bucket ¶
typedef struct ol_bucket {
char key[KEY_SIZE]; /* The key used to reference the data */
size_t klen;
char *content_type;
size_t ctype_size;
ol_val data_ptr;
size_t data_size;
uint32_t hash;
struct ol_bucket *next; /* The next ol_bucket in this chain, if any */
struct tm *expiration;
} ol_bucket;
This is the object stored in the database's hashtable. Contains references to value, key, etc.
key[KEY_SIZE]: The key used for this bucket.
klen: Length of the key.
*content_type: The content-type of this object. Defaults to "application/octet-stream".
ctype_size: Length of the string representing content-type.
data_ptr: Location of this key's value.
data_size: Length of the value in bytes.
hash: Hashed value of this key.
next: Collisions are resolved via linked list. This contains the pointer to the next object in the chain, or NULL.
expiration: The time when this key will expire, stored as a struct tm *.
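Since collisions are resolved with a linked list, looking up a key means walking the chain hanging off a slot until the key matches. A rough sketch of that walk follows. The sketch_bucket type is a trimmed-down stand-in for ol_bucket and sketch_find is a hypothetical name; the library's real lookup code is not shown in this document and may compare keys differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KEY_SIZE 250

/* Stand-in for ol_bucket with only the fields the lookup needs. */
typedef struct sketch_bucket {
    char key[KEY_SIZE];
    size_t klen;
    uint32_t hash;
    struct sketch_bucket *next; /* Next bucket in this chain, or NULL */
} sketch_bucket;

/* Walk the chain starting at head and return the bucket whose key matches
 * exactly, or NULL if no bucket in the chain holds that key. */
static sketch_bucket *sketch_find(sketch_bucket *head, const char *key, size_t klen) {
    sketch_bucket *b;
    assert(key != NULL);
    for (b = head; b != NULL; b = b->next) {
        /* Compare lengths first so "red" never matches "reddish". */
        if (b->klen == klen && memcmp(b->key, key, klen) == 0)
            return b;
    }
    return NULL;
}
```

An empty slot is just a NULL head, so the same function covers both the "nothing stored here" and the "stored further down the chain" cases.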
ol_database ¶
typedef struct ol_database {
void (*get_db_file_name)(struct ol_database *db,const char *p,char*);
void (*enable)(int, int*);
void (*disable)(int, int*);
bool (*is_enabled)(int, int*);
char name[DB_NAME_SIZE];
char path[PATH_LENGTH];
char *dump_file;
char *aol_file;
FILE *aolfd;
int feature_set;
short int state;
int rcrd_cnt;
int key_collisions;
time_t created;
size_t cur_ht_size;
ol_bucket **hashes;
} ol_database;
The object representing a database.
get_db_file_name: A function pointer that builds the path/name.db to reduce code duplication. Used for writing and reading of dump files.
enable: Helper function to enable a feature for the database instance passed in.
disable: Helper function to disable a database feature.
is_enabled: Helper function that checks whether or not a feature is enabled.
name: The name of the database.
path[PATH_LENGTH]: Path to the database's working directory.
dump_file: Path and filename of db dump.
aol_file: Path and filename of the append only log.
aolfd: Pointer of FILE type to append only log.
feature_set: Bitmask holding enabled/disabled status of various features. See ol_feature_flags.
state: Current state of the database. See ol_state_flags.
rcrd_cnt: Number of records in the database.
key_collisions: Number of key collisions this database has had since initialization.
created: Timestamp of when the database was initialized.
cur_ht_size: The current amount, in bytes, of space allocated for storing ol_bucket objects.
**hashes: The actual hashtable. Stores ol_bucket instances.
ol_meta ¶
typedef struct ol_meta {
time_t uptime;
} ol_meta;
Structure used to record meta-information about the database.
Functions ¶
ol_open ¶
ol_database *ol_open(char *path, char *name, int features);
Opens a database for use.
*path: The directory where the database will be stored.
*name: The name of the database. This is used to create the dumpfile, and keep track of the database.
features: Features to enable when the database is initialized, given as ol_feature_flags values OR'd together.
Returns: A new database object.
ol_close ¶
int ol_close(ol_database *database);
Closes a database cleanly, frees memory and makes sure everything is written.
*database: The database to close.
Returns: 0 on success, 1 if not everything could be freed.
ol_close_save ¶
int ol_close_save(ol_database *database);
Dumps and closes a database cleanly, frees memory and makes sure everything is written.
*database: The database to close.
Returns: 0 on success, 1 if not everything could be freed.
ol_unjar ¶
ol_val ol_unjar(ol_database *db, const char *key, size_t klen);
Unjar a value from the mayo. Calls ol_unjar_ds with a dsize of NULL.
*db: Database to retrieve value from.
*key: The key to use.
klen: The length of the key.
Returns: A pointer to an ol_val object, or NULL if the object was not found.
ol_unjar_ds ¶
ol_val ol_unjar_ds(ol_database *db, const char *key, size_t klen, size_t *dsize);
Unjar a value from the mayo. Makes dsize a reference to the size of the data returned.
*db: Database to retrieve value from.
*key: The key to use.
klen: The length of the key to use.
*dsize: Set to the size, in bytes, of the data returned.
Returns: A pointer to an ol_val object, or NULL if the object was not found.
ol_jar ¶
int ol_jar(ol_database *db, const char *key, size_t klen, unsigned char *value, size_t vsize);
Put a value into the mayo. It's easy to piss in a bucket, it's not easy to piss in 19 jars. Uses default content type.
*db: Database to store the value in.
*key: The key to use.
klen: The length of the key.
*value: The value to insert.
vsize: The size of the value in bytes.
Returns: 0 on success.
ol_jar_ct ¶
int ol_jar_ct(ol_database *db, const char *key, size_t klen, unsigned char *value, size_t vsize,
const char *content_type, const size_t content_type_size);
Put a value into the mayo. It's easy to piss in a bucket, it's not easy to piss in 19 jars. Allows you to specify content type.
*db: Database to store the value in.
*key: The key to use.
klen: The length of the key.
*value: The value to insert.
vsize: The size of the value in bytes.
*content_type: The content type to store, or really anything. Store your middle name if you want to.
content_type_size: The length of the content_type string.
Returns: 0 on success.
ol_content_type ¶
char *ol_content_type(ol_database *db, const char *key, size_t klen);
Retrieves the content type for a given key from the database.
*db: Database to retrieve value from.
*key: The key to use.
klen: The length of the key.
Returns: Stored content type, or NULL if it was not found.
ol_expiration_time ¶
struct tm *ol_expiration_time(ol_database *db, const char *key, size_t klen);
Retrieves the expiration time for a given key from the database.
*db: Database to retrieve value from.
*key: The key to use.
klen: The length of the key.
Returns: Stored struct tm * representing the time that this key will expire, or NULL if not found.
ol_scoop ¶
int ol_scoop(ol_database *db, const char *key, size_t klen);
Removes an object from the database. Get that crap out of the mayo jar.
*db: Database to remove the object from.
*key: The key to use.
klen: The length of the key.
Returns: 0 on success, 2 if the object wasn't found.
ol_uptime ¶
int ol_uptime(ol_database *db);
Gets the time, in seconds, that a database has been up.
*db: Database to get the uptime of.
Returns: Uptime in seconds since database initialization.
ol_spoil ¶
int ol_spoil(ol_database *db, const char *key, size_t klen, struct tm *expiration_date);
Sets the expiration value of a key. Will fail if no bucket under the chosen key exists.
*db: Database that holds the key.
*key: The key to use.
klen: The length of the key.
expiration_date: The UTC time to set the expiration to.
Returns: 0 upon success, -1 otherwise.
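Expiration dates travel through the API as struct tm, while the HTTP header carries a UTC POSIX timestamp, so a conversion has to happen somewhere. The sketch below shows one way to build a struct tm from a timestamp and to do a lazy "has this expired yet?" check at read time. Every sketch_ name is hypothetical, and treating a NULL expiration as "never expires" is an assumption, not documented behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Convert a UTC POSIX timestamp into a broken-down UTC time.
 * Returns false if the C library cannot represent it. */
static bool sketch_tm_from_timestamp(time_t ts, struct tm *out) {
    struct tm *utc;
    assert(out != NULL);
    utc = gmtime(&ts);   /* Returns a pointer to static storage, so copy it */
    if (utc == NULL)
        return false;
    *out = *utc;
    return true;
}

/* Compare two broken-down times field by field, most significant first.
 * Returns -1, 0 or 1 like strcmp. */
static int sketch_tm_cmp(const struct tm *a, const struct tm *b) {
    const int fa[] = { a->tm_year, a->tm_mon, a->tm_mday, a->tm_hour, a->tm_min, a->tm_sec };
    const int fb[] = { b->tm_year, b->tm_mon, b->tm_mday, b->tm_hour, b->tm_min, b->tm_sec };
    int i;
    for (i = 0; i < 6; i++) {
        if (fa[i] != fb[i])
            return fa[i] < fb[i] ? -1 : 1;
    }
    return 0;
}

/* Lazy expiry check: nothing is swept in the background, the key is simply
 * treated as gone once "now" has reached its expiration time. */
static bool sketch_is_expired(const struct tm *expiration, time_t now) {
    struct tm now_utc;
    if (expiration == NULL)          /* Assumption: no expiration set */
        return false;
    if (!sketch_tm_from_timestamp(now, &now_utc))
        return false;
    return sketch_tm_cmp(expiration, &now_utc) <= 0;
}
```

A struct tm filled in by sketch_tm_from_timestamp has the right shape to be passed as the expiration_date argument of ol_spoil.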
ol_ht_bucket_max ¶
int ol_ht_bucket_max(size_t ht_size);
Does some sizeof witchery to return the maximum current size of the database.
ht_size: The size you want to divide by sizeof(ol_bucket).
Returns: The maximum possible bucket slots for db.
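The "sizeof witchery" boils down to an integer division of the table size by the size of one slot. Here is a small sketch of that arithmetic. The helper name and its explicit slot_size parameter are hypothetical: the real function takes only ht_size and chooses the divisor itself.

```c
#include <assert.h>
#include <stddef.h>

#define HASH_MALLOC 65536

/* Hypothetical helper: how many slots of slot_size bytes fit into ht_size
 * bytes. Integer division means any leftover bytes are simply ignored. */
static size_t sketch_slot_count(size_t ht_size, size_t slot_size) {
    assert(slot_size > 0);
    return ht_size / slot_size;
}
```

With the initial HASH_MALLOC of 65536 bytes and a slot size of 8 bytes, that works out to 8192 slots.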