fsa file format

header256 bytes
symbol tablesize bytes
state tablesize*4 bytes
data tabledata_size bytes
perfect hahs table (optional)size*4 bytes

header

fieldoffsetsizedescrption
magic04 (uint32)Magic number (0x79832469)
version44 (uint32)Version number of the fsa library used for building this fsa
checksum84 (uint32)Checksum
size124 (uint32)Size of fsa (cells)
start164 (uint32)Start state
data_size204 (uint32)Size of data (bytes)
data_type244 (uint32)Type of data items (0=variable size, 1=fixed size)
fixed_data_size284 (uint32)Data item size if fixed
has_perfect_hash324 (uint32)Indicator for perfect hash (0=no, 1=yes)
serial364 (uint32)Serial number
reserved40216 (54*uint32)Reserved (pads size to 256 bytes)

symbol table and state table

The symbol table and state table contain the transitions of the automaton, each 1-byte entry in the symbol table corresponds to an uint32 entry in the state table. For each state, a list of at most 254 transistions is stored, as the symbol set is 8-bit characters, with 0x00 and 0xff reserved. Each state id is in fact an offset into these tables. For a given state state, there exists a valid transition for symbol sym if the symbol table contains sym at offset state+sym. 0x00 means the cell is empty, 0xff is a special symbol meaning that the given state is a final state, anything else means invalid transition (i.e. the cell is in use by some other state). For valid transitions, the corresponding entry in the state table yields the next state. For 0xff transitions, the state table entry contains the offset of the date item within the data store.

data store

The data store contains the data items for the final states. The 'new state' entry of a final state transition in the state table (corresponding to the special final state symbol 0xff) contains the data store offset of the data item corresponding to that final state. If fixed size items are used, each item takes fixed_data_size bytes as defined in the header. Variable size items take 4 bytes (uint32 item_size) plus item_size bytes. The size of the data store is given in the header.

perfect hash table

The perfect hash table has one uint32 entry for each transition in the symbol/state table, thus the size of the perfect hash table equals the size of the state table. The perfect hash value for a final state is calculated by adding all values in this table for the transitions along the path from the start state to the final state.