Storage and Retrieval

Log structure storage engine

For write, append the new data(e.g. key, value pair) into the file.

For read, get the last occurrence from the file.

Hash Index

Maintain a in memory hash table. Key is the index key (e.g. the key if data is key, value pairs). Value is the offset of the value in file. So each time we want to read the data, we could get the file offset from hash table and find the starting offset where the latest data is stored.

Log file segment and compaction

If there are too much data cannot fit into single log file, we could break the log file into segments, and perform the compaction (throw away some duplicate keys). Each compaction makes the segment smaller, so that we could merge small segments into a new log file. Merging could happen in background, so the old segments could still serve the read, for writes it will be appended to new segment. Until the merging is done, old segments could be safely deleted.

Each segment has its own Hash Index Table

Things to consider

  • File format: Binary format would be better. (encode the length of a string in bytes, followed by the raw string)

  • Deleting records: Append a deletion record (tombstone).

  • Crash recovery: Store a snapshot of the in memory hash index table on disk

  • Partially written data: Include a checksum

  • Concurrency control: single writer(to keep the sequence), and multiple reader

Pros of append-only against updating old value

  • Appending operation is faster than random write

  • Crash recovery is much easier, since the log file has all the previous records

  • Merging keeps the log file tidy

Cons of hash index

  • Hash index table must fit in memory

  • Range queries are inefficient

page-oriented storage engine

Last updated