This page describes the HotShot store and retrieval operations.

Overview

Both store and retrieval operations are designed to take advantage of redundancy. If the storage cluster is deployed with the suggested minimum topology configuration (N=4, M=3, m=m'=2), the result is a fault-tolerant storage mechanism.

Top-level operations are invoked through the client API. Internally, they follow essentially the same protocol:
  1. Client posts Store or Retrieve message to the global input queue.
  2. A storage node polls the queue and receives the message
  3. The node uses the entity ID (sent in the message) and the list of known nodes with their hash values to determine which nodes are responsible for handling the message
    1. This will return M nodes
  4. The node (we will call it the 'coordinator node') forwards the message to the selected nodes (the 'responsible nodes')
    1. It also updates StoreCount and StoreIndex properties on the Storage entity
  5. Responsible nodes store or retrieve the message, and post the result to the coordinator node
  6. Coordinator node correlates the result messages
    1. If all M responsible nodes have returned, the message is considered processed
    2. If at least m responsible nodes have returned, the message is also considered processed. Coordinator waits until the timeout, and then returns success
    3. If fewer than m responsible nodes signal success before the timeout elapses, coordinator node returns failure
  7. After correlation has finished, Coordinator posts result message to the global output queue
  8. Client receives the result message and returns to caller
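
Steps 3 and 6 of the protocol above can be sketched as follows. This is only an illustration under stated assumptions, not the HotShot implementation: the page does not say how node hash values are mapped to responsible nodes, so the ring walk used here, as well as the names stable_hash, select_responsible_nodes and correlate, are hypothetical. The sketch also assumes that the coordinator counts successful responses.

```python
import hashlib
from bisect import bisect_left
from typing import Dict, List


def stable_hash(value: str) -> int:
    """Deterministic hash (hypothetical choice; the real hash function is not documented)."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


def select_responsible_nodes(entity_id: str, node_hashes: Dict[str, int], m_total: int) -> List[str]:
    """Step 3 (assumed scheme): walk the hash ring from the entity's hash and take M nodes."""
    ring = sorted(node_hashes.items(), key=lambda item: (item[1], item[0]))
    positions = [node_hash for _, node_hash in ring]
    start = bisect_left(positions, stable_hash(entity_id))
    count = min(m_total, len(ring))
    return [ring[(start + i) % len(ring)][0] for i in range(count)]


def correlate(successes: int, m_total: int, m_min: int, timed_out: bool) -> str:
    """Step 6: decide the outcome from the number of successful responses."""
    if successes >= m_total:
        return "success"   # 6.1: all M responsible nodes answered
    if not timed_out:
        return "pending"   # keep waiting for more responses
    # 6.2 / 6.3: after the timeout, at least m successes are required
    return "success" if successes >= m_min else "failure"


# Suggested minimum topology: N=4 nodes, M=3 responsible nodes, m=2
nodes = {name: stable_hash(name) for name in ("node-0", "node-1", "node-2", "node-3")}
responsible = select_responsible_nodes("entity-42", nodes, m_total=3)
outcome = correlate(successes=2, m_total=3, m_min=2, timed_out=True)
```

With the suggested topology, one of the three responsible nodes can fail and the coordinator still reports success once the timeout elapses.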

Store operation handling

The handling of the Store operation differs from the protocol above only in timestamping:
  • Client updates the Ticks property on the Storage entity before posting the message to the queue (it is the current timestamp)
  • Responsible nodes compare the Ticks value with the stored value. If the stored Ticks is greater, the node returns success but does not update the entity.
  • TO BE CONSIDERED: what if they return failure instead, and client-side retry logic must be applied to ensure consistency?
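
The Ticks comparison on a responsible node might look like the sketch below. It models the behaviour currently described (a stale write reports success without changing the stored entity), not the alternative under consideration. The StorageEntity fields other than Ticks, the in-memory dictionary and the handle_store name are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class StorageEntity:
    entity_id: str   # hypothetical field name
    ticks: int       # timestamp set by the client before posting
    payload: bytes   # hypothetical field name


def handle_store(store: Dict[str, StorageEntity], incoming: StorageEntity) -> bool:
    """Store the entity unless a newer version (greater Ticks) is already held."""
    existing = store.get(incoming.entity_id)
    if existing is not None and existing.ticks > incoming.ticks:
        return True  # stale write: report success, keep the newer stored entity
    store[incoming.entity_id] = incoming
    return True


local_store: Dict[str, StorageEntity] = {}
handle_store(local_store, StorageEntity("a", ticks=10, payload=b"new"))
stale_result = handle_store(local_store, StorageEntity("a", ticks=5, payload=b"old"))
```

Here the second call returns success, but the stored entity keeps the payload written with the greater Ticks value.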

Retrieve operation handling

In the retrieve operation, the Ticks property is used to resolve separate versions of the entity. Multiple storage nodes may return different versions of the entity, so these must be consolidated before returning to the client.
  • Only responses that contain the latest entity version (maximum Ticks value) are returned
  • If multiple responses contain the same entity version and value, they are merged into one entity version
  • If multiple responses contain the same entity version but different payloads, every distinct version is returned
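
The three consolidation rules above can be sketched as a single function. This is a minimal illustration that assumes each response can be reduced to a (Ticks, payload) pair; the consolidate name and that response shape are hypothetical.

```python
from typing import List, Tuple

Response = Tuple[int, bytes]  # (Ticks, payload), an assumed response shape


def consolidate(responses: List[Response]) -> List[Response]:
    """Keep only the latest version and merge identical responses into one."""
    if not responses:
        return []
    latest = max(ticks for ticks, _ in responses)
    merged: List[Response] = []
    for ticks, payload in responses:
        # Drop older versions; keep one copy of each distinct payload
        if ticks == latest and (ticks, payload) not in merged:
            merged.append((ticks, payload))
    return merged


agreeing = consolidate([(1, b"a"), (2, b"b"), (2, b"b")])
conflicting = consolidate([(2, b"x"), (2, b"y"), (1, b"z")])
```

In the first call the two matching responses collapse into one entity version; in the second, both conflicting payloads with the maximum Ticks value are returned to the client.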

Last edited Jun 19, 2013 at 6:50 PM by delzubu, version 3
