Functions¶
Construct a new TrailDB¶
tdb_cons_init¶
Create a new TrailDB constructor handle.
tdb_cons *tdb_cons_init(void)
Return NULL if memory allocation fails.
tdb_cons_open¶
Open a new TrailDB.
tdb_error tdb_cons_open(tdb_cons *cons,
const char *root,
const char **ofield_names,
uint64_t num_ofields)
- cons: constructor handle as returned from tdb_cons_init.
- root: path to new TrailDB.
- ofield_names: names of fields, each name terminated by a zero byte.
- num_ofields: number of fields.
Return 0 on success, an error code otherwise.
tdb_cons_close¶
Free a TrailDB constructor handle. Call this after tdb_cons_finalize().
void tdb_cons_close(tdb_cons *cons)
- cons: TrailDB constructor handle.
tdb_cons_add¶
Add an event to TrailDB.
tdb_error tdb_cons_add(tdb_cons *cons,
const uint8_t uuid[16],
const uint64_t timestamp,
const char **values,
const uint64_t *value_lengths)
- cons: TrailDB constructor handle.
- uuid: 16-byte UUID.
- timestamp: integer timestamp. Usually Unix time.
- values: values of each field, as an array of pointers to byte strings. The order of values is the same as ofield_names in tdb_cons_open().
- value_lengths: lengths of byte strings in values.
Return 0 on success, an error code otherwise.
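Because values holds byte strings rather than C strings, every value needs an explicit length. When the values happen to be zero-terminated text, the lengths array can be derived with strlen. The helper name fill_value_lengths below is hypothetical and not part of TrailDB; this is only a minimal sketch of preparing the two parallel arrays.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of TrailDB): compute the length of each
   zero-terminated value so that values[i] and value_lengths[i] line up. */
static void fill_value_lengths(const char **values,
                               uint64_t num_fields,
                               uint64_t *value_lengths)
{
    uint64_t i;
    for (i = 0; i < num_fields; i++)
        value_lengths[i] = (uint64_t)strlen(values[i]);
}
```

The resulting arrays could then be passed as the last two arguments of tdb_cons_add(), in the same field order that was given to tdb_cons_open().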
tdb_cons_append¶
Merge an existing TrailDB into this constructor. The fields of the existing and the new TrailDB must be equal.
tdb_error tdb_cons_append(tdb_cons *cons, const tdb *db)
- cons: TrailDB constructor handle.
- db: an existing TrailDB to be merged.
Return 0 on success, an error code otherwise.
tdb_cons_set_opt¶
Set a constructor option.
tdb_error tdb_cons_set_opt(tdb_cons *cons,
tdb_opt_key key,
tdb_opt_value value);
Currently the supported options are:
- key TDB_OPT_CONS_OUTPUT_FORMAT
  - value TDB_OPT_CONS_OUTPUT_PACKAGE: create a one-file TrailDB (default).
  - value TDB_OPT_CONS_OUTPUT_DIR: do not package TrailDB, keep a directory.
- key TDB_OPT_CONS_NO_BIGRAMS
  - value 0: enable bigram-based size optimization at TrailDB finalization (default). This decreases the size of the resulting TrailDB at the cost of increased compression time.
  - value 1: disable bigram-based size optimization at TrailDB finalization.
Return 0 on success, an error code otherwise.
tdb_cons_get_opt¶
Get a constructor option.
tdb_error tdb_cons_get_opt(tdb_cons *cons,
tdb_opt_key key,
tdb_opt_value *value);
See tdb_cons_set_opt() for valid keys. Sets the value
to the current value of the key. Return 0 on success, an error code otherwise.
tdb_cons_finalize¶
Finalize TrailDB construction. Finalization takes care of compacting the events and creating a valid TrailDB file.
tdb_error tdb_cons_finalize(tdb_cons *cons)
- cons: TrailDB constructor handle.
Return 0 on success, an error code otherwise.
Open a TrailDB and access metadata¶
tdb_init¶
Create a new TrailDB handle.
tdb *tdb_init(void)
Return NULL if memory allocation fails.
tdb_open¶
Open a TrailDB for reading.
tdb_error tdb_open(tdb *tdb, const char *root)
- tdb: TrailDB handle returned by tdb_init().
- root: path to TrailDB.
Return 0 on success, an error code otherwise.
tdb_close¶
Close a TrailDB.
void tdb_close(tdb *db)
- db: TrailDB handle.
tdb_dontneed¶
Inform the operating system that this TrailDB does not need to be kept in memory.
void tdb_dontneed(tdb *db)
- db: TrailDB handle.
tdb_willneed¶
Inform the operating system that this TrailDB will be accessed soon. Call this after tdb_dontneed() once the TrailDB is needed again.
void tdb_willneed(tdb *db)
- db: TrailDB handle.
tdb_num_trails¶
Get the number of trails.
uint64_t tdb_num_trails(const tdb *db)
- db: TrailDB handle.
tdb_num_events¶
Get the number of events.
uint64_t tdb_num_events(const tdb *db)
- db: TrailDB handle.
tdb_num_fields¶
Get the number of fields.
uint64_t tdb_num_fields(const tdb *db)
- db: TrailDB handle.
tdb_min_timestamp¶
Get the oldest timestamp.
uint64_t tdb_min_timestamp(const tdb *db)
- db: TrailDB handle.
tdb_max_timestamp¶
Get the newest timestamp.
uint64_t tdb_max_timestamp(const tdb *db)
- db: TrailDB handle.
tdb_version¶
Get the version.
uint64_t tdb_version(const tdb *db)
- db: TrailDB handle.
tdb_error_str¶
Translate an error code to a string.
const char *tdb_error_str(tdb_error errcode)
Return a string description corresponding to the error code. The string is owned by TrailDB so the caller does not need to free it.
Setting Options¶
TrailDB supports cascading options. You can set top-level options with tdb_set_opt() which are inherited by all operations performed with the handle. You can override top-level options for individual trails using tdb_set_trail_opt(). Finally you can override top-level and trail-level event filters with tdb_cursor_set_event_filter().
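One way to picture the cascade for event filters is a lookup that prefers the most specific setting. The sketch below is a stand-alone model, not TrailDB's implementation: the struct and function names are hypothetical, and it assumes that a level without a filter is represented by NULL.

```c
#include <stddef.h>

/* Hypothetical model of the three levels at which a filter can be set. */
struct filter_levels {
    const void *top_level;    /* set with tdb_set_opt() */
    const void *trail_level;  /* set with tdb_set_trail_opt() */
    const void *cursor_level; /* set with tdb_cursor_set_event_filter() */
};

/* Return the filter that takes effect: the most specific non-NULL level. */
static const void *effective_filter(const struct filter_levels *levels)
{
    if (levels->cursor_level)
        return levels->cursor_level;
    if (levels->trail_level)
        return levels->trail_level;
    return levels->top_level;
}
```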
tdb_set_opt¶
Set a top-level option.
tdb_error tdb_set_opt(tdb *db,
tdb_opt_key key,
tdb_opt_value value);
Currently the supported options are:
- key TDB_OPT_ONLY_DIFF_ITEMS
  - value 0: cursors should return all items (default).
  - value 1: cursors should return mostly distinct items.
- key TDB_OPT_CURSOR_EVENT_BUFFER_SIZE
  - value number of events: set the size of the cursor readahead buffer.
- key TDB_OPT_EVENT_FILTER
  - value: pointer to const struct tdb_event_filter* as returned by tdb_event_filter_new(). The filter is applied to all cursors that use this db handle at the next call to tdb_get_trail(). The event filter must stay alive for the lifetime of the db handle or until the filter is disabled by calling this function with value.ptr = NULL.
Return 0 on success, an error code otherwise.
tdb_get_opt¶
Get a top-level option.
tdb_error tdb_get_opt(tdb *db,
tdb_opt_key key,
tdb_opt_value *value);
See tdb_set_opt() for valid keys. Sets the value
to the current value of the key. Return 0 on success, an error code otherwise.
tdb_set_trail_opt¶
Set a trail-level option. These options override top-level options set with
tdb_set_opt() for an individual trail at trail_id.
tdb_error tdb_set_trail_opt(tdb *db,
uint64_t trail_id,
tdb_opt_key key,
tdb_opt_value value);
Currently the supported options are:
- key TDB_OPT_EVENT_FILTER
  - value: pointer to const struct tdb_event_filter* as returned by tdb_event_filter_new(). The filter is applied to all cursors that use this db handle at the next call to tdb_get_trail(). The event filter must stay alive for the lifetime of the db handle or until the filter is disabled by calling this function with value.ptr = NULL.
Return 0 on success, an error code otherwise.
tdb_get_trail_opt¶
Get a trail-level option.
tdb_error tdb_get_trail_opt(tdb *db,
uint64_t trail_id,
tdb_opt_key key,
tdb_opt_value *value);
See tdb_set_trail_opt() for valid keys. Sets the value
to the current value of the key. Return 0 on success, an error code otherwise.
Working with items, fields and values¶
See TrailDB Data Model for a description of items, fields, and values.
For maximum performance, it is a good idea to use tdb_items as
extensively as possible in your application when working with TrailDBs.
Convert items to strings only when really needed.
tdb_item_field¶
Extract the field ID from an item.
tdb_field tdb_item_field(tdb_item item)
- item: an item.
Return a field ID.
tdb_item_val¶
Extract the value ID from an item.
tdb_val tdb_item_val(tdb_item item)
- item: an item.
Return a value ID.
tdb_make_item¶
Make an item given a field ID and a value ID.
tdb_item tdb_make_item(tdb_field field, tdb_val val)
- field: field ID.
- val: value ID.
Return a new item.
tdb_item_is32¶
Determine if an item can be safely cast to a 32-bit integer. You can use this function to help to conserve memory by casting items to 32-bit integers instead of default 64-bit items.
int tdb_item_is32(tdb_item item)
- item: an item.
Return non-zero if you can cast this item to 32-bit integer without loss of data.
tdb_lexicon_size¶
Get the number of distinct values in the given field.
uint64_t tdb_lexicon_size(const tdb *db, tdb_field field);
- db: TrailDB handle.
- field: field ID.
Returns the number of distinct values.
tdb_get_field¶
Get the field ID given a field name.
tdb_error tdb_get_field(tdb *db, const char *field_name, tdb_field *field)
- db: TrailDB handle.
- field_name: field name (zero-terminated string).
- field: pointer to variable to store field ID in.
Return 0 on success, an error code otherwise (field not found).
tdb_get_field_name¶
Get the field name given a field ID.
const char *tdb_get_field_name(tdb *db, tdb_field field)
- db: TrailDB handle.
- field: field ID.
Return the field name or NULL if field ID is invalid. The string is owned by TrailDB so the caller does not need to free it.
tdb_get_item¶
Get the item corresponding to a value. Note that this is a relatively slow operation that may need to scan through all values in the field.
tdb_item tdb_get_item(tdb *db,
tdb_field field,
const char *value,
uint64_t value_length)
- db: TrailDB handle.
- field: field ID.
- value: value byte string.
- value_length: length of the value.
Return 0 if item was not found, a valid item otherwise.
tdb_get_value¶
Get the value corresponding to a field ID and value ID pair.
const char *tdb_get_value(tdb *db,
tdb_field field,
tdb_val val,
uint64_t *value_length)
- db: TrailDB handle.
- field: field ID.
- val: value ID.
- value_length: length of the returned byte string.
Return a byte string corresponding to the field-value pair or NULL if value was not found. The string is owned by TrailDB so the caller does not need to free it.
tdb_get_item_value¶
Get the value corresponding to an item. This is a shorthand version of tdb_get_value().
const char *tdb_get_item_value(tdb *db, tdb_item item, uint64_t *value_length)
- db: TrailDB handle.
- item: an item.
- value_length: length of the returned byte string.
Return a byte string corresponding to the field-value pair or NULL if value was not found. The string is owned by TrailDB so the caller does not need to free it.
Working with UUIDs¶
Each trail has a user-defined 16-byte UUID and a sequential 64-bit trail ID associated to it.
tdb_get_uuid¶
Get the UUID given a trail ID. This is a fast O(1) operation.
const uint8_t *tdb_get_uuid(const tdb *db, uint64_t trail_id)
- db: TrailDB handle.
- trail_id: trail ID: 0 <= trail_id < tdb_num_trails().
Return a raw 16-byte UUID or NULL if trail ID is invalid.
tdb_get_trail_id¶
Get the trail ID given a UUID. This is an O(log N) operation.
tdb_error tdb_get_trail_id(const tdb *db,
const uint8_t uuid[16],
uint64_t *trail_id)
- db: TrailDB handle.
- uuid: a raw 16-byte UUID.
- trail_id: output pointer to the trail ID.
Return 0 if UUID was found, an error code otherwise.
tdb_uuid_raw¶
Translate a 32-byte hex-encoded UUID to a 16-byte UUID.
tdb_error tdb_uuid_raw(const uint8_t hexuuid[32], uint8_t uuid[16])
- hexuuid: source 32-byte hex-encoded UUID.
- uuid: destination 16-byte UUID.
Return 0 on success, an error code if hexuuid is not a valid
hex-encoded string.
tdb_uuid_hex¶
Translate a 16-byte UUID to a 32-byte hex-encoded UUID.
void tdb_uuid_hex(const uint8_t uuid[16], uint8_t hexuuid[32])
- uuid: source 16-byte UUID.
- hexuuid: destination 32-byte hex-encoded UUID.
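To make the two representations concrete, here is a stand-alone sketch of the conversion in both directions. It does not call TrailDB: the function names uuid_to_hex, uuid_from_hex and hex_value are hypothetical, and the choice of lowercase digits on output is an assumption of this sketch rather than documented behavior of tdb_uuid_hex().

```c
#include <stdint.h>

/* Stand-alone sketch: encode 16 raw bytes as 32 hex characters.
   Lowercase digits are an assumption of this sketch. The output is not
   zero-terminated because the destination holds exactly 32 bytes. */
static void uuid_to_hex(const uint8_t uuid[16], uint8_t hexuuid[32])
{
    static const char digits[] = "0123456789abcdef";
    int i;
    for (i = 0; i < 16; i++){
        hexuuid[2 * i] = (uint8_t)digits[uuid[i] >> 4];
        hexuuid[2 * i + 1] = (uint8_t)digits[uuid[i] & 15];
    }
}

/* Map one hex character to its value, or -1 if it is not a hex digit. */
static int hex_value(uint8_t c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode 32 hex characters into 16 raw bytes.
   Returns 0 on success, -1 if the input contains a non-hex character. */
static int uuid_from_hex(const uint8_t hexuuid[32], uint8_t uuid[16])
{
    int i;
    for (i = 0; i < 16; i++){
        int hi = hex_value(hexuuid[2 * i]);
        int lo = hex_value(hexuuid[2 * i + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        uuid[i] = (uint8_t)((hi << 4) | lo);
    }
    return 0;
}
```

Since neither buffer is zero-terminated, a caller that wants to print the hex form needs to copy it into a 33-byte buffer and add the terminator itself.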
Query events with cursors¶
tdb_cursor_new¶
Create a new cursor handle.
tdb_cursor *tdb_cursor_new(const tdb *db)
- db: TrailDB handle.
Return NULL if memory allocation fails.
tdb_cursor_free¶
Free a cursor handle.
void tdb_cursor_free(tdb_cursor *cursor)
tdb_get_trail¶
Reset the cursor to the given trail ID.
tdb_error tdb_get_trail(tdb_cursor *cursor, uint64_t trail_id)
- cursor: cursor handle.
- trail_id: trail ID: 0 <= trail_id < tdb_num_trails().
Return 0 on success, or an error code if the trail ID is invalid.
tdb_get_trail_length¶
Get the number of events remaining in this cursor.
uint64_t tdb_get_trail_length(tdb_cursor *cursor);
- cursor: cursor handle.
Return the number of events in this cursor. Note that this function consumes the cursor. You need to reset it with tdb_get_trail() to get more events.
tdb_cursor_set_event_filter¶
Set an event filter for the cursor. See filter events for more information about event filters.
tdb_error tdb_cursor_set_event_filter(tdb_cursor *cursor,
const struct tdb_event_filter *filter);
- cursor: cursor handle.
- filter: filter handle.
Return 0 on success or an error if this cursor does not support event
filtering (TDB_OPT_ONLY_DIFF_ITEMS is enabled).
Note that this function borrows filter so it needs to stay alive as long
as the cursor is being used. You can use the same filter in multiple cursors.
tdb_cursor_unset_event_filter¶
Remove an event filter from the cursor.
void tdb_cursor_unset_event_filter(tdb_cursor *cursor);
- cursor: cursor handle.
tdb_cursor_next¶
Consume the next event from the cursor.
const tdb_event *tdb_cursor_next(tdb_cursor *cursor)
- cursor: cursor handle.
Return an event struct or NULL if the cursor has no more events. The event structure is defined as follows:
typedef struct{
uint64_t timestamp;
uint64_t num_items;
const tdb_item items[0];
} tdb_event;
tdb_event represents one event in the trail. Each event has a timestamp,
and a number of field-value pairs, encoded as items.
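The trailing items array means that an event is a header followed directly by num_items items. The sketch below mirrors that layout with a local struct so that it compiles without TrailDB; the names sketch_event, sketch_event_new and sketch_event_has_item are hypothetical, and items are modeled as plain 64-bit integers.

```c
#include <stdint.h>
#include <stdlib.h>

/* Local mirror of the layout shown above, for illustration only.
   Real code would use tdb_event and tdb_item from the TrailDB header. */
typedef struct {
    uint64_t timestamp;
    uint64_t num_items;
    uint64_t items[];   /* C99 flexible array member */
} sketch_event;

/* Allocate an event with room for num_items trailing items. */
static sketch_event *sketch_event_new(uint64_t timestamp,
                                      const uint64_t *items,
                                      uint64_t num_items)
{
    sketch_event *ev =
        (sketch_event *)malloc(sizeof(*ev) + num_items * sizeof(uint64_t));
    uint64_t i;
    if (!ev)
        return NULL;
    ev->timestamp = timestamp;
    ev->num_items = num_items;
    for (i = 0; i < num_items; i++)
        ev->items[i] = items[i];
    return ev;
}

/* Return 1 if the event carries the given item, 0 otherwise. */
static int sketch_event_has_item(const sketch_event *ev, uint64_t item)
{
    uint64_t i;
    for (i = 0; i < ev->num_items; i++)
        if (ev->items[i] == item)
            return 1;
    return 0;
}
```

The same loop over num_items applies to an event returned by tdb_cursor_next(), except that such an event is owned by the cursor and is not allocated or freed by the caller.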
tdb_cursor_peek¶
Return the next event from the cursor without consuming it.
const tdb_event *tdb_cursor_peek(tdb_cursor *cursor)
- cursor: cursor handle.
See tdb_cursor_next for more details about tdb_event.
Join trails with multi-cursors¶
A multi-cursor merges multiple trails represented by tdb_cursor
together to produce a single merged trail that has its events sorted in
the ascending timestamp order. The trails can originate from a single
TrailDB or multiple separate TrailDBs. In effect, a multi-cursor performs
efficient merge sort of the underlying trails on the fly.
You need to initialize all underlying tdb_cursors to point at the
desired trails with tdb_get_trail as usual. Then,
call tdb_multi_cursor_reset to reset the
multi-cursor state. After this, you can iterate over the joined
trail with tdb_multi_cursor_next, event
by event, or get multiple joined events with a single call using
tdb_multi_cursor_next_batch. You can
repeat these steps for arbitrarily many trails using the same handles.
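The merge described above can be modeled without TrailDB as a k-way merge over sorted timestamp arrays. This is a simplified sketch with hypothetical names (sketch_cursor, sketch_merge_next); it picks the earliest pending event with a linear scan, which is easy to follow but is not meant to describe how TrailDB implements the merge internally.

```c
#include <stdint.h>

/* Hypothetical stand-in for a cursor: a sorted array of timestamps
   and a read position. */
typedef struct {
    const uint64_t *timestamps;
    uint64_t length;
    uint64_t pos;
} sketch_cursor;

/* Pop the earliest pending timestamp across all cursors.
   Stores the timestamp and the index of the cursor it came from.
   Returns 1 if an event was produced, 0 when all cursors are exhausted. */
static int sketch_merge_next(sketch_cursor *cursors, uint64_t num_cursors,
                             uint64_t *timestamp, uint64_t *cursor_idx)
{
    uint64_t i, best = num_cursors; /* num_cursors means "none found yet" */
    for (i = 0; i < num_cursors; i++){
        sketch_cursor *c = &cursors[i];
        if (c->pos == c->length)
            continue; /* this cursor has no more events */
        if (best == num_cursors ||
            c->timestamps[c->pos] <
                cursors[best].timestamps[cursors[best].pos])
            best = i;
    }
    if (best == num_cursors)
        return 0;
    *timestamp = cursors[best].timestamps[cursors[best].pos++];
    *cursor_idx = best;
    return 1;
}
```

The cursor_idx output plays the same role as the cursor_idx field of tdb_multi_event: it tells the caller which underlying cursor produced the event. With many cursors, a priority queue would avoid rescanning every cursor on each call.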
tdb_multi_cursor_new¶
Create a new multi-cursor handle.
tdb_multi_cursor *tdb_multi_cursor_new(tdb_cursor **cursors, uint64_t num_cursors)
- cursors: a list of cursors to be merged.
- num_cursors: number of cursors in cursors.
Return NULL if memory allocation fails.
tdb_multi_cursor_free¶
Free a multi-cursor handle.
void tdb_multi_cursor_free(tdb_multi_cursor *mcursor)
- mcursor: a multi-cursor handle.
tdb_multi_cursor_reset¶
Reset a multi-cursor handle to reflect the state of the underlying cursors. Call this function every time after tdb_get_trail.
void tdb_multi_cursor_reset(tdb_multi_cursor *mcursor);
- mcursor: a multi-cursor handle.
tdb_multi_cursor_next¶
Consume the next event, in the ascending timestamp order, from the underlying cursors.
const tdb_multi_event *tdb_multi_cursor_next(tdb_multi_cursor *mcursor)
- mcursor: a multi-cursor handle.
Return a multi event struct or NULL if the cursor has no more events. The multi event structure is defined as follows:
typedef struct{
const tdb *db;
const tdb_event *event;
uint64_t cursor_idx;
} tdb_multi_event;
db is a TrailDB handle to the TrailDB that contains this event.
See tdb_cursor_next for more details about
tdb_event. Use the db handle to translate event->items to
values. The cursor_idx index points to the array of cursors given in
tdb_multi_cursor_new.
tdb_multi_cursor_next_batch¶
An optimized version of tdb_multi_cursor_next. Instead of returning a single event, this function returns an array of events with a single function call.
uint64_t tdb_multi_cursor_next_batch(tdb_multi_cursor *mcursor,
tdb_multi_event *events,
uint64_t max_events);
- mcursor: a multi-cursor handle.
- events: a pre-allocated array of tdb_multi_event structs.
- max_events: size of the events array.
Returns the number of events added to events, at most max_events.
If the value returned is 0, all events have been exhausted. See
tdb_multi_cursor_next for the definition of
tdb_multi_event.
Note that the pointers in the events array are valid only until the
next call to one of the multi-cursor functions. If you want to persist
the underlying events, you should copy them to another data structure.
tdb_multi_cursor_peek¶
Return the next event, in the ascending timestamp order, from the underlying cursors without consuming it.
const tdb_multi_event *tdb_multi_cursor_peek(tdb_multi_cursor *mcursor);
- mcursor: a multi-cursor handle.
See tdb_multi_cursor_next for the definition
of tdb_multi_event.
Filter events¶
An event filter is a boolean query over fields, expressed in conjunctive normal form.
Once assigned to a cursor, only the subset of events that match the query are returned. See technical overview for more information.
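Conjunctive normal form means that the query is an AND of clauses, where each clause is an OR of possibly negated terms. The following stand-alone sketch evaluates such a query against the items of one event. All names in it are hypothetical and it does not use the TrailDB filter handle. In this sketch an empty clause never matches; the text above does not say how TrailDB treats empty clauses.

```c
#include <stdint.h>

/* Hypothetical model of a filter term: an item plus a negation flag. */
typedef struct {
    uint64_t item;
    int is_negative;
} sketch_term;

/* Hypothetical model of a clause: terms joined with OR. */
typedef struct {
    const sketch_term *terms;
    uint64_t num_terms;
} sketch_clause;

/* Return 1 if item occurs among the event's items, 0 otherwise. */
static int has_item(const uint64_t *items, uint64_t num_items, uint64_t item)
{
    uint64_t i;
    for (i = 0; i < num_items; i++)
        if (items[i] == item)
            return 1;
    return 0;
}

/* A clause matches if at least one of its terms is satisfied.
   A positive term needs the item present, a negative term needs it absent. */
static int clause_matches(const sketch_clause *clause,
                          const uint64_t *items, uint64_t num_items)
{
    uint64_t i;
    for (i = 0; i < clause->num_terms; i++){
        int present = has_item(items, num_items, clause->terms[i].item);
        int negated = clause->terms[i].is_negative != 0;
        if (present != negated)
            return 1;
    }
    return 0;
}

/* The query matches if every clause matches (clauses are joined with AND). */
static int query_matches(const sketch_clause *clauses, uint64_t num_clauses,
                         const uint64_t *items, uint64_t num_items)
{
    uint64_t i;
    for (i = 0; i < num_clauses; i++)
        if (!clause_matches(&clauses[i], items, num_items))
            return 0;
    return 1;
}
```

For example, the query (A OR B) AND (NOT C) is two clauses: the first holds the positive terms A and B, the second holds the single negative term C.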
tdb_event_filter_new¶
Create a new event filter handle.
struct tdb_event_filter *tdb_event_filter_new(void)
Return NULL if memory allocation fails.
tdb_event_filter_new_match_none¶
Create a new event filter handle that is optimized to match no events. Commonly used to create a view over a subset of trails.
struct tdb_event_filter *tdb_event_filter_new_match_none(void)
Return NULL if memory allocation fails.
tdb_event_filter_new_match_all¶
Create a new event filter handle that is optimized to match all events. Commonly used to create a view over a subset of trails.
struct tdb_event_filter *tdb_event_filter_new_match_all(void)
Return NULL if memory allocation fails.
tdb_event_filter_free¶
Free an event filter handle.
void tdb_event_filter_free(struct tdb_event_filter *filter)
tdb_event_filter_add_term¶
Add a term (item) in the query. This item is attached to the
current clause with OR. You can make the item negative by setting
is_negative to non-zero.
tdb_error tdb_event_filter_add_term(struct tdb_event_filter *filter,
tdb_item term,
int is_negative)
- filter: filter handle.
- term: an item to be included in the clause.
- is_negative: is this item negative?
Return 0 on success, an error code otherwise (out of memory).
tdb_event_filter_add_time_range¶
Add a time-range term to the query. This item is attached to the
current clause with OR. Finds events with timestamp t such that
start_time <= t < end_time.
tdb_error tdb_event_filter_add_time_range(struct tdb_event_filter *filter,
uint64_t start_time,
uint64_t end_time)
- filter: filter handle.
- start_time: (inclusive) start of time range.
- end_time: (exclusive) end of time range.
Return 0 on success, an error code otherwise (out of memory or invalid time range).
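The half-open interval can be written as a one-line predicate. The function name in_time_range is hypothetical; the sketch only restates the condition start_time <= t < end_time given above.

```c
#include <stdint.h>

/* Half-open interval test: start_time is included, end_time is not. */
static int in_time_range(uint64_t t, uint64_t start_time, uint64_t end_time)
{
    return start_time <= t && t < end_time;
}
```

Because the end is exclusive, adjacent ranges such as 10 to 20 and 20 to 30 never both match an event with timestamp 20.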
tdb_event_filter_new_clause¶
Add a new clause in the query. The new clause is attached to the query with AND.
tdb_error tdb_event_filter_new_clause(struct tdb_event_filter *filter)
- filter: filter handle.
Return 0 on success, an error code otherwise (out of memory).
tdb_event_filter_num_clauses¶
Get the number of clauses in this filter.
uint64_t tdb_event_filter_num_clauses(const struct tdb_event_filter *filter);
- filter: filter handle.
Return the number of clauses. Note that a new filter has one clause by default, so the return value is always at least one.
tdb_event_filter_num_terms¶
Get the number of terms in a clause of this filter.
tdb_error tdb_event_filter_num_terms(const struct tdb_event_filter *filter,
uint64_t clause_index,
uint64_t *num_terms);
- filter: filter handle.
- clause_index: clause index: 0 <= clause_index < tdb_event_filter_num_clauses().
- num_terms: returns the number of terms in the clause.
Returns 0 (TDB_ERR_OK) if the given clause exists, otherwise TDB_ERR_NO_SUCH_ITEM.
tdb_event_filter_get_term_type¶
Get the type of a term in a clause in this filter.
tdb_error tdb_event_filter_get_term_type(const struct tdb_event_filter *filter,
uint64_t clause_index,
uint64_t term_index,
tdb_event_filter_term_type *term_type);
- filter: filter handle.
- clause_index: clause index: 0 <= clause_index < tdb_event_filter_num_clauses().
- term_index: term index: 0 <= term_index < tdb_event_filter_num_terms().
- term_type: returns the term type.
If the term was found, the function returns 0 and term_type is set to either
TDB_EVENT_FILTER_MATCH_TERM or TDB_EVENT_FILTER_TIME_RANGE_TERM. Otherwise, if the clause
or term does not exist, the function returns TDB_ERR_NO_SUCH_ITEM and
term_type is set to TDB_EVENT_FILTER_UNKNOWN_TERM.
tdb_event_filter_get_item¶
Get an item added to this filter.
tdb_error tdb_event_filter_get_item(const struct tdb_event_filter *filter,
uint64_t clause_index,
uint64_t item_index,
tdb_item *item,
int *is_negative)
- filter: filter handle.
- clause_index: clause index: 0 <= clause_index < tdb_event_filter_num_clauses().
- item_index: item index in the clause: 0 <= item_index < tdb_event_filter_num_terms().
- item: returned item.
- is_negative: set to 1 if the item is negative, 0 otherwise, as set in tdb_event_filter_add_term.
Returns 0 if an item was found at this location and is a match term. If the clause or term
does not exist, TDB_ERR_NO_SUCH_ITEM is returned. Note that empty clauses always return
TDB_ERR_NO_SUCH_ITEM although the clauses themselves are valid. Lastly, if you try to call
tdb_event_filter_get_item on a time-range term, then TDB_ERR_INCORRECT_TERM_TYPE is
returned.
tdb_event_filter_get_time_range¶
Get a time-range term from a clause in this filter.
tdb_error tdb_event_filter_get_time_range(const struct tdb_event_filter *filter,
uint64_t clause_index,
uint64_t term_index,
uint64_t *start_time,
uint64_t *end_time)
- filter: filter handle.
- clause_index: clause index: 0 <= clause_index < tdb_event_filter_num_clauses().
- term_index: term index in the clause: 0 <= term_index < tdb_event_filter_num_terms().
- start_time: start time (inclusive) of the time range.
- end_time: end time (exclusive) of the time range.
Returns 0 if a time-range term was found at this location. If the clause or term does not exist,
TDB_ERR_NO_SUCH_ITEM is returned. Note that empty clauses always return
TDB_ERR_NO_SUCH_ITEM although the clauses themselves are valid. Lastly, if you try to call
tdb_event_filter_get_time_range on a match term, then TDB_ERR_INCORRECT_TERM_TYPE is
returned.
Here is an example of how to deconstruct a filter back into clauses and terms:
uint64_t clause, term, num_terms;
for (clause = 0; clause < tdb_event_filter_num_clauses(filter); clause++){
    uint64_t start_time, end_time;
    tdb_item item;
    tdb_event_filter_term_type term_type;
    int is_negative;
    tdb_error ret;
    /* the number of terms is returned through the output argument */
    if (tdb_event_filter_num_terms(filter, clause, &num_terms) != TDB_ERR_OK)
        continue;
    for (term = 0; term < num_terms; term++){
        /* the term type is returned through the output argument */
        ret = tdb_event_filter_get_term_type(filter, clause, term, &term_type);
        if (ret != TDB_ERR_OK)
            continue;
        if (term_type == TDB_EVENT_FILTER_MATCH_TERM){
            ret = tdb_event_filter_get_item(filter, clause, term, &item, &is_negative);
            if (ret == TDB_ERR_OK){
                /* do something with 'item' at 'term' in 'clause' */
            }
        } else if (term_type == TDB_EVENT_FILTER_TIME_RANGE_TERM){
            ret = tdb_event_filter_get_time_range(filter, clause, term, &start_time, &end_time);
            if (ret == TDB_ERR_OK){
                /* do something with 'start_time' and 'end_time' at 'term' in 'clause' */
            }
        }
    }
}