Functions¶
Construct a new TrailDB¶
tdb_cons_init¶
Create a new TrailDB constructor handle.
tdb_cons *tdb_cons_init(void)
Return NULL if memory allocation fails.
tdb_cons_open¶
Open a new TrailDB.
tdb_error tdb_cons_open(tdb_cons *cons,
const char *root,
const char **ofield_names,
uint64_t num_ofields)
cons
constructor handle as returned from tdb_cons_init.
root
path to new TrailDB.
ofield_names
names of fields, each name terminated by a zero byte.
num_ofields
number of fields.
Return 0 on success, an error code otherwise.
tdb_cons_close¶
Free a TrailDB constructor handle. Call this after tdb_cons_finalize().
void tdb_cons_close(tdb_cons *cons)
cons
TrailDB constructor handle.
tdb_cons_add¶
Add an event to TrailDB.
tdb_error tdb_cons_add(tdb_cons *cons,
const uint8_t uuid[16],
const uint64_t timestamp,
const char **values,
const uint64_t *value_lengths)
cons
TrailDB constructor handle.
uuid
16-byte UUID.
timestamp
integer timestamp, usually Unix time.
values
values of each field, as an array of pointers to byte strings. The order of values is the same as ofield_names in tdb_cons_open().
value_lengths
lengths of the byte strings in values.
Return 0 on success, an error code otherwise.
tdb_cons_append¶
Merge an existing TrailDB into this constructor. The existing TrailDB and the new TrailDB must have the same fields.
tdb_error tdb_cons_append(tdb_cons *cons, const tdb *db)
cons
TrailDB constructor handle.
db
an existing TrailDB to be merged.
Return 0 on success, an error code otherwise.
tdb_cons_set_opt¶
Set a constructor option.
tdb_error tdb_cons_set_opt(tdb_cons *cons,
tdb_opt_key key,
tdb_opt_value value);
Currently the supported options are:
- key TDB_OPT_CONS_OUTPUT_FORMAT
  - value TDB_OPT_CONS_OUTPUT_PACKAGE: create a one-file TrailDB (default).
  - value TDB_OPT_CONS_OUTPUT_DIR: do not package TrailDB, keep a directory.
- key TDB_OPT_CONS_NO_BIGRAMS
  - value 0: enable bigram-based size optimization at TrailDB finalization (default). This decreases the size of the resulting TrailDB at the cost of increased compression time.
  - value 1: disable bigram-based size optimization at TrailDB finalization.
Return 0 on success, an error code otherwise.
tdb_cons_get_opt¶
Get a constructor option.
tdb_error tdb_cons_get_opt(tdb_cons *cons,
tdb_opt_key key,
tdb_opt_value *value);
See tdb_cons_set_opt() for valid keys. Sets value to the current value of the key. Return 0 on success, an error code otherwise.
tdb_cons_finalize¶
Finalize TrailDB construction. Finalization takes care of compacting the events and creating a valid TrailDB file.
tdb_error tdb_cons_finalize(tdb_cons *cons)
cons
TrailDB constructor handle.
Return 0 on success, an error code otherwise.
Open a TrailDB and access metadata¶
tdb_init¶
Create a new TrailDB handle.
tdb *tdb_init(void)
Return NULL if memory allocation fails.
tdb_open¶
Open a TrailDB for reading.
tdb_error tdb_open(tdb *tdb, const char *root)
tdb
TrailDB handle returned by tdb_init().
root
path to TrailDB.
Return 0 on success, an error code otherwise.
tdb_close¶
Close a TrailDB.
void tdb_close(tdb *db)
db
TrailDB handle.
tdb_dontneed¶
Inform the operating system that this TrailDB does not need to be kept in memory.
void tdb_dontneed(tdb *db)
db
TrailDB handle.
tdb_willneed¶
Inform the operating system that this TrailDB will be accessed soon. Call this after tdb_dontneed() once the TrailDB is needed again.
void tdb_willneed(tdb *db)
db
TrailDB handle.
tdb_num_trails¶
Get the number of trails.
uint64_t tdb_num_trails(const tdb *db)
db
TrailDB handle.
tdb_num_events¶
Get the number of events.
uint64_t tdb_num_events(const tdb *db)
db
TrailDB handle.
tdb_num_fields¶
Get the number of fields.
uint64_t tdb_num_fields(const tdb *db)
db
TrailDB handle.
tdb_min_timestamp¶
Get the oldest timestamp.
uint64_t tdb_min_timestamp(const tdb *db)
db
TrailDB handle.
tdb_max_timestamp¶
Get the newest timestamp.
uint64_t tdb_max_timestamp(const tdb *db)
db
TrailDB handle.
tdb_version¶
Get the version.
uint64_t tdb_version(const tdb *db)
db
TrailDB handle.
tdb_error_str¶
Translate an error code to a string.
const char *tdb_error_str(tdb_error errcode)
Return a string description corresponding to the error code. The string is owned by TrailDB so the caller does not need to free it.
Setting Options¶
TrailDB supports cascading options. Top-level options set with tdb_set_opt() are inherited by all operations performed with the handle. You can override top-level options for individual trails using tdb_set_trail_opt(). Finally, you can override top-level and trail-level event filters for a single cursor with tdb_cursor_set_event_filter().
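The precedence described above can be pictured as a lookup that falls through from the most specific level to the least specific one. The following is only a sketch of that idea, not TrailDB code: the struct filter_levels and the function effective_filter are hypothetical names, and filters are represented as plain strings purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the three levels at which an event filter can
   be set. NULL means "not set at this level". Not a TrailDB type. */
struct filter_levels {
    const char *top_level;    /* as if set with tdb_set_opt() */
    const char *trail_level;  /* as if set with tdb_set_trail_opt() */
    const char *cursor_level; /* as if set with tdb_cursor_set_event_filter() */
};

/* Return the filter that wins: a cursor-level filter overrides a
   trail-level one, which in turn overrides the top-level one. */
static const char *effective_filter(const struct filter_levels *lv)
{
    assert(lv != NULL);
    if (lv->cursor_level)
        return lv->cursor_level;
    if (lv->trail_level)
        return lv->trail_level;
    return lv->top_level; /* may be NULL: no filter anywhere */
}
```

In this model, clearing a more specific level (setting it back to NULL) makes the next level up visible again, which mirrors disabling a filter with value.ptr = NULL.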
tdb_set_opt¶
Set a top-level option.
tdb_error tdb_set_opt(tdb *db,
tdb_opt_key key,
tdb_opt_value value);
Currently the supported options are:
- key TDB_OPT_ONLY_DIFF_ITEMS
  - value 0: cursors should return all items (default).
  - value 1: cursors should return mostly distinct items.
- key TDB_OPT_CURSOR_EVENT_BUFFER_SIZE
  - value: number of events. Sets the size of the cursor readahead buffer.
- key TDB_OPT_EVENT_FILTER
  - value: pointer to const struct tdb_event_filter* as returned by tdb_event_filter_new(). The filter is applied to all cursors that use this db handle at the next call to tdb_get_trail(). The event filter must stay alive for the lifetime of the db handle or until the filter is disabled by calling this function with value.ptr = NULL.
Return 0 on success, an error code otherwise.
tdb_get_opt¶
Get a top-level option.
tdb_error tdb_get_opt(tdb *db,
tdb_opt_key key,
tdb_opt_value *value);
See tdb_set_opt() for valid keys. Sets value to the current value of the key. Return 0 on success, an error code otherwise.
tdb_set_trail_opt¶
Set a trail-level option. These options override top-level options set with tdb_set_opt() for an individual trail at trail_id.
tdb_error tdb_set_trail_opt(tdb *db,
uint64_t trail_id,
tdb_opt_key key,
tdb_opt_value value);
Currently the supported options are:
- key TDB_OPT_EVENT_FILTER
  - value: pointer to const struct tdb_event_filter* as returned by tdb_event_filter_new(). The filter is applied to all cursors that use this db handle at the next call to tdb_get_trail(). The event filter must stay alive for the lifetime of the db handle or until the filter is disabled by calling this function with value.ptr = NULL.
Return 0 on success, an error code otherwise.
tdb_get_trail_opt¶
Get a trail-level option.
tdb_error tdb_get_trail_opt(tdb *db,
uint64_t trail_id,
tdb_opt_key key,
tdb_opt_value *value);
See tdb_set_trail_opt() for valid keys. Sets value to the current value of the key. Return 0 on success, an error code otherwise.
Working with items, fields and values¶
See TrailDB Data Model for a description of items, fields, and values.
For maximum performance, it is a good idea to use tdb_items as extensively as possible in your application when working with TrailDBs. Convert items to strings only when really needed.
tdb_item_field¶
Extract the field ID from an item.
tdb_field tdb_item_field(tdb_item item)
item
an item.
Return a field ID.
tdb_item_val¶
Extract the value ID from an item.
tdb_val tdb_item_val(tdb_item item)
item
an item.
Return a value ID.
tdb_make_item¶
Make an item given a field ID and a value ID.
tdb_item tdb_make_item(tdb_field field, tdb_val val)
field
field ID.
val
value ID.
Return a new item.
tdb_item_is32¶
Determine if an item can be safely cast to a 32-bit integer. You can use this function to help to conserve memory by casting items to 32-bit integers instead of default 64-bit items.
int tdb_item_is32(tdb_item item)
item
an item
Return non-zero if you can cast this item to 32-bit integer without loss of data.
tdb_lexicon_size¶
Get the number of distinct values in the given field.
uint64_t tdb_lexicon_size(const tdb *db, tdb_field field);
db
TrailDB handle.
field
field ID.
Returns the number of distinct values.
tdb_get_field¶
Get the field ID given a field name.
tdb_error tdb_get_field(tdb *db, const char *field_name, tdb_field *field)
db
TrailDB handle.
field_name
field name (zero-terminated string).
field
pointer to variable to store field ID in.
Return 0 on success, an error code otherwise (field not found).
tdb_get_field_name¶
Get the field name given a field ID.
const char *tdb_get_field_name(tdb *db, tdb_field field)
db
TrailDB handle.
field
field ID.
Return the field name or NULL if field ID is invalid. The string is owned by TrailDB so the caller does not need to free it.
tdb_get_item¶
Get the item corresponding to a value. Note that this is a relatively slow operation that may need to scan through all values in the field.
tdb_item tdb_get_item(tdb *db,
tdb_field field,
const char *value,
uint64_t value_length)
db
TrailDB handle.
field
field ID.
value
value byte string.
value_length
length of the value.
Return 0 if item was not found, a valid item otherwise.
tdb_get_value¶
Get the value corresponding to a field ID and value ID pair.
const char *tdb_get_value(tdb *db,
tdb_field field,
tdb_val val,
uint64_t *value_length)
db
TrailDB handle.
field
field ID.
val
value ID.
value_length
length of the returned byte string.
Return a byte string corresponding to the field-value pair or NULL if value was not found. The string is owned by TrailDB so the caller does not need to free it.
tdb_get_item_value¶
Get the value corresponding to an item. This is a shorthand version of tdb_get_value().
const char *tdb_get_item_value(tdb *db, tdb_item item, uint64_t *value_length)
db
TrailDB handle.
item
an item.
value_length
length of the returned byte string.
Return a byte string corresponding to the field-value pair or NULL if value was not found. The string is owned by TrailDB so the caller does not need to free it.
Working with UUIDs¶
Each trail has a user-defined 16-byte UUID and a sequential 64-bit trail ID associated with it.
tdb_get_uuid¶
Get the UUID given a trail ID. This is a fast O(1) operation.
const uint8_t *tdb_get_uuid(const tdb *db, uint64_t trail_id)
db
TrailDB handle.
trail_id
trail ID (an integer in the range 0 <= trail_id < tdb_num_trails()).
Return a raw 16-byte UUID or NULL if trail ID is invalid.
tdb_get_trail_id¶
Get the trail ID given a UUID. This is an O(log N) operation.
tdb_error tdb_get_trail_id(const tdb *db,
const uint8_t uuid[16],
uint64_t *trail_id)
db
TrailDB handle.
uuid
a raw 16-byte UUID.
trail_id
output pointer to the trail ID.
Return 0 if UUID was found, an error code otherwise.
tdb_uuid_raw¶
Translate a 32-byte hex-encoded UUID to a 16-byte UUID.
tdb_error tdb_uuid_raw(const uint8_t hexuuid[32], uint8_t uuid[16])
hexuuid
source 32-byte hex-encoded UUID.
uuid
destination 16-byte UUID.
Return 0 on success, an error code if hexuuid is not a valid hex-encoded string.
tdb_uuid_hex¶
Translate a 16-byte UUID to a 32-byte hex-encoded UUID.
void tdb_uuid_hex(const uint8_t uuid[16], uint8_t hexuuid[32])
uuid
source 16-byte UUID.
hexuuid
destination 32-byte hex-encoded UUID.
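To make the relationship between the two representations concrete, here is a minimal sketch of the conversion in both directions. The helpers uuid_to_hex, uuid_from_hex and uuid_roundtrip_ok are hypothetical names written for this illustration and are not the library's implementation. The sketch assumes lowercase output and, since the destination buffer is exactly 32 bytes, writes no terminating zero byte.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of TrailDB: encode 16 raw bytes as
   32 lowercase hex characters. No terminating zero byte is written. */
static void uuid_to_hex(const uint8_t uuid[16], uint8_t hexuuid[32])
{
    static const char digits[] = "0123456789abcdef";
    int i;
    assert(uuid != NULL && hexuuid != NULL);
    for (i = 0; i < 16; i++){
        hexuuid[i * 2] = (uint8_t)digits[uuid[i] >> 4];
        hexuuid[i * 2 + 1] = (uint8_t)digits[uuid[i] & 15];
    }
}

/* Map one hex character to its value, or -1 if it is not a hex digit. */
static int hex_digit(uint8_t c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: decode 32 hex characters into 16 raw bytes.
   Returns 0 on success, -1 if the input contains a non-hex character. */
static int uuid_from_hex(const uint8_t hexuuid[32], uint8_t uuid[16])
{
    int i;
    assert(uuid != NULL && hexuuid != NULL);
    for (i = 0; i < 16; i++){
        int hi = hex_digit(hexuuid[i * 2]);
        int lo = hex_digit(hexuuid[i * 2 + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        uuid[i] = (uint8_t)((hi << 4) | lo);
    }
    return 0;
}

/* Convenience check: encoding then decoding must give back the input. */
static int uuid_roundtrip_ok(const uint8_t uuid[16])
{
    uint8_t hex[32], back[16];
    uuid_to_hex(uuid, hex);
    return uuid_from_hex(hex, back) == 0 && memcmp(uuid, back, 16) == 0;
}
```

Because the hex form is a fixed-length byte array rather than a C string, print it with an explicit length (for example "%.32s") instead of relying on a terminator.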
Query events with cursors¶
tdb_cursor_new¶
Create a new cursor handle.
tdb_cursor *tdb_cursor_new(const tdb *db)
db
TrailDB handle.
Return NULL if memory allocation fails.
tdb_cursor_free¶
Free a cursor handle.
void tdb_cursor_free(tdb_cursor *cursor)
tdb_get_trail¶
Reset the cursor to the given trail ID.
tdb_error tdb_get_trail(tdb_cursor *cursor, uint64_t trail_id)
cursor
cursor handle.
trail_id
trail ID (an integer in the range 0 <= trail_id < tdb_num_trails()).
Return 0 on success, or an error code if the trail ID is invalid.
tdb_get_trail_length¶
Get the number of events remaining in this cursor.
uint64_t tdb_get_trail_length(tdb_cursor *cursor);
cursor
cursor handle.
Return the number of events in this cursor. Note that this function consumes the cursor. You need to reset it with tdb_get_trail() to get more events.
tdb_cursor_set_event_filter¶
Set an event filter for the cursor. See filter events for more information about event filters.
tdb_error tdb_cursor_set_event_filter(tdb_cursor *cursor,
const struct tdb_event_filter *filter);
cursor
cursor handle.
filter
filter handle.
Return 0 on success or an error if this cursor does not support event filtering (TDB_OPT_ONLY_DIFF_ITEMS is enabled).
Note that this function borrows filter, so the filter needs to stay alive as long as the cursor is being used. You can use the same filter in multiple cursors.
tdb_cursor_unset_event_filter¶
Remove an event filter from the cursor.
void tdb_cursor_unset_event_filter(tdb_cursor *cursor);
cursor
cursor handle.
tdb_cursor_next¶
Consume the next event from the cursor.
const tdb_event *tdb_cursor_next(tdb_cursor *cursor)
cursor
cursor handle.
Return an event struct or NULL if the cursor has no more events. The event structure is defined as follows:
typedef struct{
uint64_t timestamp;
uint64_t num_items;
const tdb_item items[0];
} tdb_event;
tdb_event represents one event in the trail. Each event has a timestamp and a number of field-value pairs, encoded as items.
tdb_cursor_peek¶
Return the next event from the cursor without consuming it.
const tdb_event *tdb_cursor_peek(tdb_cursor *cursor)
cursor
cursor handle.
See tdb_cursor_next for more details about tdb_event.
Join trails with multi-cursors¶
A multi-cursor merges multiple trails, each represented by a tdb_cursor, to produce a single merged trail that has its events sorted in ascending timestamp order. The trails can originate from a single TrailDB or multiple separate TrailDBs. In effect, a multi-cursor performs an efficient merge sort of the underlying trails on the fly.
You need to initialize all underlying tdb_cursors to point at the desired trails with tdb_get_trail as usual. Then, call tdb_multi_cursor_reset to reset the multi-cursor state. After this, you can iterate over the joined trail with tdb_multi_cursor_next, event by event, or get multiple joined events with a single call using tdb_multi_cursor_next_batch. You can repeat these steps for arbitrarily many trails using the same handles.
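The merge behaviour can be illustrated with a small self-contained sketch. It is not TrailDB code: struct toy_cursor and toy_multi_next are hypothetical stand-ins in which a cursor is just a sorted array of timestamps. The sketch picks the smallest head with a linear scan, and breaking ties in favour of the lowest cursor index is a choice made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cursor: a sorted array of timestamps
   plus a read position. Not a TrailDB type. */
struct toy_cursor {
    const uint64_t *timestamps;
    size_t len;
    size_t pos;
};

/* Pop the event with the smallest timestamp among all cursor heads.
   Stores the timestamp and the index of the cursor it came from
   (compare cursor_idx in tdb_multi_event). Returns 1 if an event was
   produced, 0 if all cursors are exhausted. */
static int toy_multi_next(struct toy_cursor *cursors, size_t num_cursors,
                          uint64_t *timestamp, size_t *cursor_idx)
{
    size_t i, best = num_cursors; /* num_cursors means "none found yet" */
    assert(cursors != NULL || num_cursors == 0);
    for (i = 0; i < num_cursors; i++){
        const struct toy_cursor *c = &cursors[i];
        if (c->pos == c->len)
            continue; /* this cursor has no more events */
        if (best == num_cursors ||
            c->timestamps[c->pos] <
                cursors[best].timestamps[cursors[best].pos])
            best = i;
    }
    if (best == num_cursors)
        return 0;
    *timestamp = cursors[best].timestamps[cursors[best].pos++];
    *cursor_idx = best;
    return 1;
}
```

A linear scan costs time proportional to the number of cursors for every event. With many cursors a priority queue keyed on the head timestamp would scale better; the scan is used here only to keep the sketch short.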
tdb_multi_cursor_new¶
Create a new multi-cursor handle.
tdb_multi_cursor *tdb_multi_cursor_new(tdb_cursor **cursors, uint64_t num_cursors)
cursors
a list of cursors to be merged.
num_cursors
number of cursors in cursors.
Return NULL if memory allocation fails.
tdb_multi_cursor_free¶
Free a multi-cursor handle.
void tdb_multi_cursor_free(tdb_multi_cursor *mcursor)
mcursor
a multi-cursor handle
tdb_multi_cursor_reset¶
Reset a multi-cursor handle to reflect the state of the underlying cursors. Call this function every time after tdb_get_trail.
void tdb_multi_cursor_reset(tdb_multi_cursor *mcursor);
mcursor
a multi-cursor handle
tdb_multi_cursor_next¶
Consume the next event, in the ascending timestamp order, from the underlying cursors.
const tdb_multi_event *tdb_multi_cursor_next(tdb_multi_cursor *mcursor)
mcursor
a multi-cursor handle
Return a multi event struct or NULL if the cursor has no more events. The multi event structure is defined as follows:
typedef struct{
const tdb *db;
const tdb_event *event;
uint64_t cursor_idx;
} tdb_multi_event;
db is a handle to the TrailDB that contains this event. See tdb_cursor_next for more details about tdb_event. Use the db handle to translate event->items to values. cursor_idx is an index into the array of cursors given to tdb_multi_cursor_new.
tdb_multi_cursor_next_batch¶
An optimized version of tdb_multi_cursor_next. Instead of returning a single event, this function returns an array of events with a single function call.
uint64_t tdb_multi_cursor_next_batch(tdb_multi_cursor *mcursor,
tdb_multi_event *events,
uint64_t max_events);
mcursor
a multi-cursor handle
events
a pre-allocated array of tdb_multi_event structs
max_events
size of the events array
Returns the number of events added to events, at most max_events. If the value returned is 0, all events have been exhausted. See tdb_multi_cursor_next for the definition of tdb_multi_event.
Note that the pointers in the events array are valid only until the next call to one of the multi-cursor functions. If you want to persist the underlying events, you should copy them to another data structure.
tdb_multi_cursor_peek¶
Return the next event, in the ascending timestamp order, from the underlying cursors without consuming it.
const tdb_multi_event *tdb_multi_cursor_peek(tdb_multi_cursor *mcursor);
mcursor
a multi-cursor handle
See tdb_multi_cursor_next for the definition of tdb_multi_event.
Filter events¶
An event filter is a boolean query over fields, expressed in conjunctive normal form.
Once assigned to a cursor, only the subset of events that match the query are returned. See technical overview for more information.
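Conjunctive normal form means the query is an AND of clauses, where each clause is an OR of (possibly negated) terms. The sketch below evaluates such a query against the items of one event. It is an illustration only: struct toy_term, struct toy_clause and toy_filter_matches are hypothetical names, items are plain integers, and time-range terms are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical term: matches when 'item' is present in the event, or
   when it is absent if is_negative is non-zero. Not a TrailDB type. */
struct toy_term {
    uint64_t item;
    int is_negative;
};

/* Hypothetical clause: its terms are joined with OR. */
struct toy_clause {
    const struct toy_term *terms;
    size_t num_terms;
};

static int event_has_item(const uint64_t *items, size_t num_items,
                          uint64_t item)
{
    size_t i;
    for (i = 0; i < num_items; i++)
        if (items[i] == item)
            return 1;
    return 0;
}

/* Evaluate (t OR t ...) AND (t OR ...) ... against one event's items.
   In this sketch an empty clause has no term that can match, so it
   rejects every event. */
static int toy_filter_matches(const struct toy_clause *clauses,
                              size_t num_clauses,
                              const uint64_t *items, size_t num_items)
{
    size_t c, t;
    assert(clauses != NULL || num_clauses == 0);
    for (c = 0; c < num_clauses; c++){
        int clause_ok = 0;
        for (t = 0; t < clauses[c].num_terms && !clause_ok; t++){
            const struct toy_term *term = &clauses[c].terms[t];
            int present = event_has_item(items, num_items, term->item);
            clause_ok = term->is_negative ? !present : present;
        }
        if (!clause_ok)
            return 0; /* one failed clause fails the whole AND */
    }
    return 1;
}
```

For example, the query "(item 10 OR item 11) AND NOT item 20" is two clauses: the first holds two positive terms, the second holds one negative term.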
tdb_event_filter_new¶
Create a new event filter handle.
struct tdb_event_filter *tdb_event_filter_new(void)
Return NULL if memory allocation fails.
tdb_event_filter_new_match_none¶
Create a new event filter handle that is optimized to match no events. Commonly used to create a view over a subset of trails.
struct tdb_event_filter *tdb_event_filter_new_match_none(void)
Return NULL if memory allocation fails.
tdb_event_filter_new_match_all¶
Create a new event filter handle that is optimized to match all events. Commonly used to create a view over a subset of trails.
struct tdb_event_filter *tdb_event_filter_new_match_all(void)
Return NULL if memory allocation fails.
tdb_event_filter_free¶
Free an event filter handle.
void tdb_event_filter_free(struct tdb_event_filter *filter)
tdb_event_filter_add_term¶
Add a term (item) to the query. This item is attached to the current clause with OR. You can make the item negative by setting is_negative to non-zero.
tdb_error tdb_event_filter_add_term(struct tdb_event_filter *filter,
tdb_item term,
int is_negative)
filter
filter handle.
term
an item to be included in the clause.
is_negative
non-zero if this item is negative.
Return 0 on success, an error code otherwise (out of memory).
tdb_event_filter_add_time_range¶
Add a time-range term to the query. This term is attached to the current clause with OR. Finds events with timestamp t such that start_time <= t < end_time.
tdb_error tdb_event_filter_add_time_range(struct tdb_event_filter *filter,
uint64_t start_time,
uint64_t end_time)
filter
filter handle.
start_time
(inclusive) start of time range.
end_time
(exclusive) end of time range.
Return 0 on success, an error code otherwise (out of memory or invalid time range).
tdb_event_filter_new_clause¶
Add a new clause in the query. The new clause is attached to the query with AND.
tdb_error tdb_event_filter_new_clause(struct tdb_event_filter *filter)
filter
filter handle.
Return 0 on success, an error code otherwise (out of memory).
tdb_event_filter_num_clauses¶
Get the number of clauses in this filter.
uint64_t tdb_event_filter_num_clauses(const struct tdb_event_filter *filter);
filter
filter handle.
Return the number of clauses. Note that a new filter has one clause by default, so the return value is always at least one.
tdb_event_filter_num_terms¶
Get the number of terms in a clause of this filter.
tdb_error tdb_event_filter_num_terms(const struct tdb_event_filter *filter,
uint64_t clause_index,
uint64_t *num_terms);
filter
filter handle.
clause_index
clause index: 0 <= clause_index < tdb_event_filter_num_clauses().
num_terms
returns the number of terms in the clause.
Returns 0 (TDB_ERR_OK) if the given clause exists, otherwise TDB_ERR_NO_SUCH_ITEM.
tdb_event_filter_get_term_type¶
Get the type of a term in a clause in this filter.
tdb_error tdb_event_filter_get_term_type(const struct tdb_event_filter *filter,
uint64_t clause_index,
uint64_t term_index,
tdb_event_filter_term_type *term_type);
filter
filter handle.
clause_index
clause index: 0 <= clause_index < tdb_event_filter_num_clauses().
term_index
term index: 0 <= term_index < tdb_event_filter_num_terms().
term_type
returns the term type.
If the term was found, the function returns 0 and term_type is either TDB_EVENT_FILTER_MATCH_TERM or TDB_EVENT_FILTER_TIME_RANGE_TERM. Otherwise, if the clause or term does not exist, the function returns TDB_ERR_NO_SUCH_ITEM and term_type is TDB_EVENT_FILTER_UNKNOWN_TERM.
tdb_event_filter_get_item¶
Get an item added to this filter.
tdb_error tdb_event_filter_get_item(const struct tdb_event_filter *filter,
uint64_t clause_index,
uint64_t item_index,
tdb_item *item,
int *is_negative)
filter
filter handle.
clause_index
clause index: 0 <= clause_index < tdb_event_filter_num_clauses().
item_index
item index in the clause: 0 <= item_index < tdb_event_filter_num_terms().
item
returned item.
is_negative
returns 1 if the item is negative, 0 otherwise, as set in tdb_event_filter_add_term.
Returns 0 if an item was found at this location and is a match term. If the clause or term does not exist, TDB_ERR_NO_SUCH_ITEM is returned. Note that empty clauses always return TDB_ERR_NO_SUCH_ITEM although the clauses themselves are valid. Lastly, if you try to call tdb_event_filter_get_item on a time-range term, then TDB_ERR_INCORRECT_TERM_TYPE is returned.
tdb_event_filter_get_time_range¶
Get a time-range term from a clause in this filter.
tdb_error tdb_event_filter_get_time_range(const struct tdb_event_filter *filter,
uint64_t clause_index,
uint64_t term_index,
uint64_t *start_time,
uint64_t *end_time)
filter
filter handle.
clause_index
clause index: 0 <= clause_index < tdb_event_filter_num_clauses().
term_index
term index in the clause: 0 <= term_index < tdb_event_filter_num_terms().
start_time
start time (inclusive) of the time range.
end_time
end time (exclusive) of the time range.
Returns 0 if a time-range term was found at this location. If the clause or term does not exist, TDB_ERR_NO_SUCH_ITEM is returned. Note that empty clauses always return TDB_ERR_NO_SUCH_ITEM although the clauses themselves are valid. Lastly, if you try to call tdb_event_filter_get_time_range on a match term, then TDB_ERR_INCORRECT_TERM_TYPE is returned.
Here is an example of how to deconstruct a filter back into clauses and terms. It assumes that filter is a valid filter handle:
uint64_t clause, term, num_terms;
for (clause = 0; clause < tdb_event_filter_num_clauses(filter); clause++){
    /* the number of terms is returned through an output parameter */
    if (tdb_event_filter_num_terms(filter, clause, &num_terms) != TDB_ERR_OK)
        break;
    for (term = 0; term < num_terms; term++){
        tdb_event_filter_term_type term_type;
        tdb_item item;
        uint64_t start_time, end_time;
        int is_negative;
        tdb_error ret;
        /* find out which getter is valid for this term */
        ret = tdb_event_filter_get_term_type(filter, clause, term, &term_type);
        if (ret != TDB_ERR_OK)
            continue;
        if (term_type == TDB_EVENT_FILTER_MATCH_TERM){
            ret = tdb_event_filter_get_item(filter, clause, term, &item, &is_negative);
            if (ret == TDB_ERR_OK){
                /* do something with 'item' at 'term' in 'clause' */
            }
        } else if (term_type == TDB_EVENT_FILTER_TIME_RANGE_TERM){
            ret = tdb_event_filter_get_time_range(filter, clause, term, &start_time, &end_time);
            if (ret == TDB_ERR_OK){
                /* do something with 'start_time' and 'end_time' at 'term' in 'clause' */
            }
        }
    }
}