K-Safety (Replicas), Data Safety, and Node Dependencies

by yororing 2024. 6. 12.

00 Overview

  • This post summarizes K-safety, data safety, and node dependencies
  • These concepts matter when evaluating Vertica's high availability and recovery features

01 What Is K-Safety

1. Definition of K-Safety

  • K-safety is a measure of fault tolerance in your database cluster
  • The value K is the number of replicas of the data that exist in the DB
  • K-safety level
    • (whether the data is kept in duplicate or in triplicate)
  • To achieve high availability in Vertica, a segmented projection should have at least one buddy projection, and an unsegmented projection should be replicated on all nodes. In Vertica, K-safety enforces these projection requirements

2. K-Safety and Buddy Projections

  • In a DB with K-safety = 1, the auto projections that Vertica creates have a buddy projection
    • This replica (the buddy projection) lets another node stand in for a failed node, so the DB keeps running and data integrity is preserved
    • If a custom projection was created without the KSAFE 1 keyword, the projection does not have a buddy projection, and it is marked unsafe until the buddy projection is explicitly created
  • all unsafe projections are restricted as follows:
    • not up to date
    • cannot be refreshed
    • cannot receive data loaded into a table
    • are not used for answering queries

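3. K-Safety Values

  • Possible K values in Vertica: 0, 1, 2
  • In a DB with K = 1, the DB keeps running normally even if one node goes down
    • More generally, the DB can keep running as long as at least one other node in the cluster holds a copy of the down node's data
  • K = 0 applies to clusters of 1 or 2 nodes; an HA configuration is not possible
  • K = 1 requires at least 3 nodes and K = 2 requires at least 5 nodes; an HA configuration is possible

  • How to check the DB's K-safety value: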
=> SELECT GET_DESIGN_KSAFE();
get_design_ksafe
------------------
                1
(1 row)

=> SELECT CURRENT_FAULT_TOLERANCE
    FROM SYSTEM;
current_fault_tolerance
-------------------------
                       1
(1 row)
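
The node-count requirements listed above (1 or 2 nodes for K = 0, at least 3 for K = 1, at least 5 for K = 2) can be sketched as a small helper. This is only an illustration: the function names are hypothetical, and the 2K + 1 formula is an assumption chosen because it reproduces the minimums stated in this post.

```python
SUPPORTED_K = (0, 1, 2)  # K values listed in this post


def min_nodes_for_ksafety(k: int) -> int:
    """Smallest cluster size for a K-safety level (assumes 2K + 1)."""
    if k not in SUPPORTED_K:
        raise ValueError("K-safety must be 0, 1, or 2")
    return 2 * k + 1


def max_ksafety_for_cluster(node_count: int) -> int:
    """Highest K-safety level a cluster of this size could be designed for."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    # Invert 2K + 1 and cap at the highest supported K.
    return min(max(SUPPORTED_K), (node_count - 1) // 2)


if __name__ == "__main__":
    for k in SUPPORTED_K:
        print(f"K={k}: at least {min_nodes_for_ksafety(k)} node(s)")
    for n in (1, 2, 3, 4, 5, 8):
        print(f"{n} node(s): K can be at most {max_ksafety_for_cluster(n)}")
```

A 1- or 2-node cluster maps to K = 0 here, which matches the note that HA is not possible at that size.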

02 Data Safety and Node Dependencies

1. Understanding Data Safety and Node Dependencies

  • Unsegmented projections have a copy of the data on every node in the cluster
    • for a DB with a K-safety = 1, every segmented projection has a buddy projection that contains a copy of the data
  • in segmented projections, the data is split into segments based on the segmentation expression specified in the projection design
    • Data segments that belong to a segmented projection and the buddy projections are laid out on nodes with offset
  • nodes that hold a data segment and a copy are considered buddy nodes
    • buddy nodes store either a projection segment or its copy
    • these nodes share a dependency
  • the following figure shows node dependencies
    • Node1 and Node2 are buddy nodes and share dependency for data segment S1
    • Node2 and Node3 are buddy nodes and share dependency for data segment S2, and so forth

  • you can use the following query to get a list of node dependencies in your DB:
=> SELECT dependency_id, min(node_name ) AS node_x, max(node_name) AS node_y
    FROM vs_node_dependencies
    JOIN nodes
    ON node_oid = node_id
    GROUP BY 1
    HAVING count(*) = 2
    ORDER BY 1;
dependency_id |        node_x        |        node_y
--------------+----------------------+----------------------
0             |        node1         |        node3
1             |        node2         |        node4
2             |        node3         |        node4
3             |        node2         |        node5
4             |        node1         |        node5
(5 rows)
  • you can also get node dependencies by running the following command:
=> SELECT GET_NODE_DEPENDENCIES ();
get_node_dependencies
-----------------------------------------------------------------------------
Deps:
00110 - cnt: 2
01001 - cnt: 2
01100 - cnt: 2
10001 - cnt: 2
10010 - cnt: 2
11111 - cnt: 9

00001 - name: Node 2
00010 - name: Node 1
00100 - name: Node 3
01000 - name: Node 4
10000 - name: Node 5
(1 row)
  • in Vertica versions prior to 8.0.1, the get_node_dependencies API did not provide dependency bits to node mapping
  • if you are running an earlier version of Vertica and reading the node dependency bits, the rightmost bit represents the permanent node with the lowest node_id and the leftmost bit represents the permanent node with the highest node_id
  • in Vertica 8.0.1, you can run the verbose version of get_node_dependencies to see the node names next to the bit pattern:
=> SELECT get_node_dependencies_verbose ();
  • you can run the following statement to get the node names in ascending order of node_id:
=> SELECT node_name FROM nodes ORDER BY node_id;
node_name
-------------------
node2
node1
node3
node4
node5
(5 rows)
  • a bit value of 1 means that the node representing the bit is the buddy of another node with a bit value of 1
  • e.g., in dependency bits 00110, a node with the second lowest node_id is the buddy node for a node with the third lowest node_id
  • the cnt value represents the number of projection buddy pairs in the DB
  • a dependency with all bits set to 1 represents an unsegmented projection, and the cnt value gives the number of unsegmented projections in the DB
  • if you drew a graph with the nodes as vertices and the dependencies between the nodes as edges, the graph would be a ring in which each vertex has two edges to maintain a K-safety of 1
  • the following figure shows the node dependency ring:

  • if all the nodes in the ring had three edges, that would represent a DB with a K-safety of 2
  • if only some nodes had more than two edges, that ring would indicate that rebalance activity is incomplete
  • you can run the following statement to recompute node dependencies
=> SELECT RECOMPUTE_NODE_DEPENDENCIES();
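
To make the bit-pattern reading concrete, here is a minimal Python sketch that decodes the sample GET_NODE_DEPENDENCIES output from this post into buddy pairs and then checks that every node has exactly two edges, as expected for K-safety = 1. The function names are hypothetical, the data is copied from the sample outputs above, and only two-node patterns are treated as ring edges (an assumption that fits this sample).

```python
from collections import defaultdict

# Node names in ascending node_id order (sample output of
# SELECT node_name FROM nodes ORDER BY node_id).
NODES_BY_ID = ["node2", "node1", "node3", "node4", "node5"]

# Bit pattern -> cnt, copied from the sample GET_NODE_DEPENDENCIES output.
DEPENDENCIES = {
    "00110": 2, "01001": 2, "01100": 2,
    "10001": 2, "10010": 2, "11111": 9,
}


def decode(bits, nodes_by_id):
    """Return the nodes whose bit is 1; the rightmost bit is the lowest node_id."""
    return [nodes_by_id[i] for i, b in enumerate(reversed(bits)) if b == "1"]


def build_ring(dependencies, nodes_by_id):
    """Map each node to its buddy nodes, using only two-node patterns."""
    neighbors = defaultdict(set)
    for bits in dependencies:
        members = decode(bits, nodes_by_id)
        if len(members) != 2:
            continue  # e.g. all bits set: unsegmented projections, not an edge
        a, b = members
        neighbors[a].add(b)
        neighbors[b].add(a)
    return dict(neighbors)


if __name__ == "__main__":
    ring = build_ring(DEPENDENCIES, NODES_BY_ID)
    for node in sorted(ring):
        print(node, "->", sorted(ring[node]))
    # Two edges per vertex is the shape described for K-safety = 1.
    print("two edges per node:", all(len(v) == 2 for v in ring.values()))
```

The decoded pairs line up with the node_x / node_y rows returned by the vs_node_dependencies query earlier in this section.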

2. High Availability와 Data Safety

  • a Vertica DB with a K-safety = 1 will remain UP and fully functional if more than half the nodes in the DB are UP and no two adjacent nodes in the dependency ring are down (this feature is known as data safety)
  • e.g., the DB remains UP if node1 and node4 go DOWN, b/c they are not adjacent nodes in the dependency ring
  • the DB will perform an unsafe SHUTDOWN if node1 and node5 go DOWN at the same time
  • this happens b/c the segment of data on those nodes is no longer accessible
  • when you have a node DOWN in your cluster, you can run the following query to get a list of critical nodes
  • the results table is useful when DB administrators are planning scheduled maintenance on more than one node:
=> SELECT * FROM critical_nodes;
  • for best results, Vertica engineers recommend the following:
    • use a K-safety of 1 rather than a K-safety of 2. Although a K-safety of 2 lets two adjacent nodes in the dependency ring go DOWN while the system remains UP, the chances of that occurrence are slim. Additionally, a K-safety of 2 uses 50% more system resources, which could slow system performance
    • make your DB rack aware by defining fault groups for large clusters that span more than two racks
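
The data safety rule for K-safety = 1 (more than half the nodes UP, and no two adjacent nodes in the dependency ring DOWN) can be modeled in a few lines of Python. This is a rough sketch of the rule as stated in this post, not how Vertica itself evaluates it: the function names are hypothetical, and the ring edges are taken from the sample dependency query output in the previous section.

```python
# Buddy pairs from the sample vs_node_dependencies query output.
RING_EDGES = [
    ("node1", "node3"), ("node2", "node4"), ("node3", "node4"),
    ("node2", "node5"), ("node1", "node5"),
]
ALL_NODES = frozenset(n for edge in RING_EDGES for n in edge)


def is_database_up(down, nodes=ALL_NODES, edges=RING_EDGES):
    """Apply the two conditions: quorum, and no adjacent pair both DOWN."""
    down = set(down) & nodes
    up_count = len(nodes) - len(down)
    has_quorum = up_count * 2 > len(nodes)  # strictly more than half UP
    adjacent_down = any(a in down and b in down for a, b in edges)
    return has_quorum and not adjacent_down


def critical_nodes(down, nodes=ALL_NODES, edges=RING_EDGES):
    """UP nodes whose loss would take the DB down (a model, not the system table)."""
    down = set(down) & nodes
    return sorted(
        n for n in nodes - down
        if not is_database_up(down | {n}, nodes, edges)
    )


if __name__ == "__main__":
    print(is_database_up({"node1", "node4"}))  # not adjacent in the ring
    print(is_database_up({"node1", "node5"}))  # adjacent in the ring
    print(critical_nodes({"node1"}))           # node1's buddies
```

With node1 DOWN, the sketch flags node1's two buddy nodes as critical, which is the kind of information to check before taking a second node offline for maintenance.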

3. DB Recovery and Data Safety

  • when you start Vertica, you need more than half the nodes to form a quorum
  • no two adjacent nodes in the dependency ring can be left out

 

References

  1. https://www.vertica.com/kb/KSafetyBestPractices/Content/BestPractices/KSafetyBestPractices.htm
  2. https://x2wizard.github.io/vertica_architecture/Vertica_architecture_1020/