HotShot storage node instances are responsible for storing entities. Although a single-node deployment is sufficient for all operations, a multi-node environment is preferred for redundancy and automatic failover handling.

Multiple storage nodes form a storage cluster, in which each node is identified by its hash key. Entity storage is based on this unique hash key, as detailed on the entity storage protocol page.

When a new node joins the cluster, it receives a list of the nodes already in it, and in return publishes its own information to all existing nodes. Nodes communicate within the cluster, sharing health-check information. This way, node failures can be detected and appropriate measures taken to maintain data consistency.
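The join handshake described above can be sketched as follows. This is an illustrative model only; the class and method names are assumptions, not the actual HotShot API:

```python
# Hypothetical sketch of the join handshake: a new node receives the
# peer list from a seed node, then publishes itself to every peer.
class Node:
    def __init__(self, hash_key):
        self.hash_key = hash_key
        self.peers = {}  # hash_key -> Node

    def join(self, seed):
        """Join the cluster via any existing node."""
        # 1. Receive the list of nodes already in the cluster.
        self.peers = dict(seed.peers)
        self.peers[seed.hash_key] = seed
        # 2. Publish our own information to all existing nodes.
        for peer in list(self.peers.values()):
            peer.peers[self.hash_key] = self

a = Node(100)
b = Node(420)
b.join(a)
assert 420 in a.peers and 100 in b.peers
```

In a real deployment the peer list would be exchanged over the network and fed into the health-checking exchange; here both steps are collapsed into direct dictionary updates.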

In this documentation, we use the following notation:

Variable  Description
N         Number of nodes in the cluster. The system is operable with N=1, but for all protocols to work properly, a minimum cluster size of N=4 can be assumed.
n         Zero-based index of the current node. This index reflects the position of the node within the list of nodes ordered by hash key (0 <= n < N).
hn        Hash key of the current (n-th) node. The hash key is computed during the node startup sequence and does not change for the lifetime of the node.


Unless specifically noted, it can be assumed that N and n refer only to the active nodes in the cluster; in some special cases, however, N and n may also include inactive nodes.

Node hashing

Each node has a hash key (hn), which is unique within the system (in the current implementation the hash key is an integer in [0..999], but this may change in upcoming versions). The key is calculated as part of the node startup sequence, ensuring that every fully operable node has its hash key initialized.

Nodes can be ordered by hash key, so in the following we can assume that

for nodes n1 < n2 < ... < nN
hn1 < hn2 < ... < hnN

The node hash keys form a circular ring, so we can always calculate the next node from a given hash value (in the current implementation, mod 1000).
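Finding the next node on the ring can be sketched as below; the function name and the sample hash keys are illustrative assumptions:

```python
HASH_SPACE = 1000  # current implementation: hash keys are in [0..999]

def successor(node_hashes, h):
    """Return the next node hash on the circular ring after h."""
    ring = sorted(node_hashes)
    for nh in ring:
        if nh > h:
            return nh
    return ring[0]  # past the largest key, wrap around to the start

# With nodes at 120, 450 and 900, the node after 900 wraps to 120.
assert successor([450, 120, 900], 900) == 120
assert successor([450, 120, 900], 200) == 450
```

The wrap-around in the last line is what makes the ordered list of hash keys behave as a ring rather than a flat sequence.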

Redundant data storage

Submitted data contains a key, which can be transformed into a hash value. Given this key (k), we can decide which node should store the entity: it will be the first node whose hash value is >= the key's hash value.
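The owner-selection rule above can be sketched as follows, using the same illustrative hash-key values as before (the function name is an assumption):

```python
def owner(node_hashes, key_hash):
    """First node whose hash is >= the key's hash; wraps if none is."""
    ring = sorted(node_hashes)
    for nh in ring:
        if nh >= key_hash:
            return nh
    return ring[0]  # key hashes past the last node wrap to the first

# A key hashing to 450 lands exactly on the node with hash 450;
# a key hashing to 901 is past every node and wraps to node 120.
assert owner([120, 450, 900], 450) == 450
assert owner([120, 450, 900], 901) == 120
```

Note the difference from the ring-successor function: ownership uses >= (a key may land exactly on a node's hash), while the next node on the ring is strictly greater.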

Variable  Description
M         Desired redundancy: the number of nodes on which a given data entry is replicated. Assuming the minimal architecture of N=4, the desired value is M=3.
m         Minimum acceptable redundancy when storing an entity: the number of nodes to which the entity must be written for a store operation to succeed. For architectural writeup purposes, m=2.
m'        Minimum acceptable redundancy when retrieving an entity: the number of nodes from which the entity must be returned for a retrieve operation to succeed. For architectural writeup purposes, m'=2.


In addition to this node, we submit the store and retrieve requests to the next M-1 nodes (M >= 1), and expect at least m of the store operations to succeed (1 <= m <= M < N). If m > 1, this results in redundant storage of data within the node cluster. Likewise, a retrieve operation is sent to M' nodes and is expected to return at least m' results, so redundancy is also checked when returning data (m and m' need not be equal).

Last edited Jun 20, 2013 at 12:16 PM by delzubu, version 7
