启动和停止子群集
借助子群集,可以根据需要方便地启动和停止一组节点。您可以使用 admintools 命令行或 Vertica 元函数启动和停止它们。
启动子群集
要启动子群集,请使用 restart_subcluster
工具:
$ adminTools -t restart_subcluster -h
Usage: restart_subcluster [options]
Options:
-h, --help show this help message and exit
-d DB, --database=DB Name of database whose subcluster is to be restarted
-c SCNAME, --subcluster=SCNAME
Name of subcluster to be restarted
-p DBPASSWORD, --password=DBPASSWORD
Database password in single quotes
--timeout=NONINTERACTIVE_TIMEOUT
set a timeout (in seconds) to wait for actions to
complete ('never') will wait forever (implicitly sets
-i)
-i, --noprompts do not stop and wait for user input(default false).
Setting this implies a timeout of 20 min.
-F, --force Force the nodes in the subcluster to start and auto
recover if necessary
此示例启动名为 analytics_cluster 的子群集:
$ adminTools -t restart_subcluster -c analytics_cluster \
-d verticadb -p password
*** Restarting subcluster for database verticadb ***
Restarting host [10.11.12.192] with catalog [v_verticadb_node0006_catalog]
Restarting host [10.11.12.181] with catalog [v_verticadb_node0004_catalog]
Restarting host [10.11.12.205] with catalog [v_verticadb_node0005_catalog]
Issuing multi-node restart
Starting nodes:
v_verticadb_node0004 (10.11.12.181)
v_verticadb_node0005 (10.11.12.205)
v_verticadb_node0006 (10.11.12.192)
Starting Vertica on all nodes. Please wait, databases with a large
catalog may take a while to initialize.
Node Status: v_verticadb_node0002: (UP) v_verticadb_node0004: (DOWN)
v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
Node Status: v_verticadb_node0002: (UP) v_verticadb_node0004: (DOWN)
v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
Node Status: v_verticadb_node0002: (UP) v_verticadb_node0004: (DOWN)
v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
Node Status: v_verticadb_node0002: (UP) v_verticadb_node0004: (DOWN)
v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
Node Status: v_verticadb_node0002: (UP) v_verticadb_node0004: (UP)
v_verticadb_node0005: (UP) v_verticadb_node0006: (UP)
Communal storage detected: syncing catalog
Restart Subcluster result: 1
停止子群集
您可以使用 stop_subcluster
admintools 命令或 SHUTDOWN_WITH_DRAIN 和 SHUTDOWN_SUBCLUSTER 函数停止子群集。
admintools
重要
stop_subcluster
如果存在连接到子群集的活动用户会话,则不会发出警告。此行为与停止单个节点相同。在停止子群集之前,确认没有用户连接到它。
要停止子群集,请使用 stop_subcluster
工具:
$ adminTools -t stop_subcluster -h
Usage: stop_subcluster [options]
Options:
-h, --help show this help message and exit
-d DB, --database=DB Name of database whose subcluster is to be stopped
-c SCNAME, --subcluster=SCNAME
Name of subcluster to be stopped
-p DBPASSWORD, --password=DBPASSWORD
Database password in single quotes
--timeout=NONINTERACTIVE_TIMEOUT
set a timeout (in seconds) to wait for actions to
complete ('never') will wait forever (implicitly sets
-i)
-i, --noprompts do not stop and wait for user input(default false).
Setting this implies a timeout of 20 min.
此示例停止名为 analytics_cluster 的子群集:
$ adminTools -t stop_subcluster -c analytics_cluster -d verticadb -p password
*** Forcing subcluster shutdown ***
Verifying subcluster 'analytics_cluster'
Node 'v_verticadb_node0004' will shutdown
Node 'v_verticadb_node0005' will shutdown
Node 'v_verticadb_node0006' will shutdown
Shutdown subcluster command sent to the database
正常关闭
如果您希望在关闭子群集之前清空子群集的客户端连接,则可以使用 SHUTDOWN_WITH_DRAIN 函数正常关闭子群集。该函数首先将指定子群集中的所有节点标记为正在清空。现有用户会话中的工作继续在正在清空的节点上进行,但节点拒绝新的客户端连接,并被排除在负载均衡操作之外。dbadmin 仍然可以连接到正在清空的节点。有关客户端连接清空的详细信息,请参阅排空客户端连接。
要运行 SHUTDOWN_WITH_DRAIN 函数,您必须指定超时值。该函数的行为取决于超时值的正负:
- 正值:节点清空,直到所有现有连接关闭或该函数达到超时值设置的运行时限制为止。一旦满足这些条件之一,该函数就会向子群集发送一条关闭消息并返回。
- 零:该函数立即关闭子群集中所有活动用户会话,然后关闭子群集并返回。
- 负值:该函数将子群集节点标记为正在清空,等待关闭子群集,直到所有活动用户会话断开连接为止。
正在清空的子群集中所有节点关闭之后,其节点自动重置为非清空状态。
以下示例演示如何使用正超时值给予活动用户会话时间,以在关闭子群集之前完成其工作:
=> SELECT node_name, subcluster_name, is_draining, count_client_user_sessions, oldest_session_user FROM draining_status ORDER BY 1;
node_name | subcluster_name | is_draining | count_client_user_sessions | oldest_session_user
----------------------+--------------------+-------------+----------------------------+---------------------
v_verticadb_node0001 | default_subcluster | f | 0 |
v_verticadb_node0002 | default_subcluster | f | 0 |
v_verticadb_node0003 | default_subcluster | f | 0 |
v_verticadb_node0004 | analytics | f | 1 | analyst
v_verticadb_node0005 | analytics | f | 0 |
v_verticadb_node0006 | analytics | f | 0 |
(6 rows)
=> SELECT SHUTDOWN_WITH_DRAIN('analytics', 300);
NOTICE 0: Draining has started on subcluster (analytics)
NOTICE 0: Begin shutdown of subcluster (analytics)
SHUTDOWN_WITH_DRAIN
--------------------------------------------------------------------------------------------------------------------
Set subcluster (analytics) to draining state
Waited for 3 nodes to drain
Shutdown message sent to subcluster (analytics)
(1 row)
您可以查询 NODES 系统表以确认子群集已关闭:
=> SELECT subcluster_name, node_name, node_state FROM nodes;
subcluster_name | node_name | node_state
--------------------+----------------------+------------
default_subcluster | v_verticadb_node0001 | UP
default_subcluster | v_verticadb_node0002 | UP
default_subcluster | v_verticadb_node0003 | UP
analytics | v_verticadb_node0004 | DOWN
analytics | v_verticadb_node0005 | DOWN
analytics | v_verticadb_node0006 | DOWN
(6 rows)
如果要查看有关清空和关闭事件的详细信息(例如所有用户会话是否在超时之前完成工作),可以查询 dc_draining_events 表。在本例中,当函数达到超时的时候,子群集仍有一个活动用户会话:
=> SELECT event_type, event_type_name, event_description, event_result, event_result_name FROM dc_draining_events;
event_type | event_type_name | event_description | event_result | event_result_name
------------+------------------------------+---------------------------------------------------------------------+--------------+-------------------
0 | START_DRAIN_SUBCLUSTER | START_DRAIN for SHUTDOWN of subcluster (analytics) | 0 | SUCCESS
2 | START_WAIT_FOR_NODE_DRAIN | Wait timeout is 300 seconds | 4 | INFORMATIONAL
4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 0 seconds | 4 | INFORMATIONAL
4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 60 seconds | 4 | INFORMATIONAL
4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 120 seconds | 4 | INFORMATIONAL
4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 125 seconds | 4 | INFORMATIONAL
4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 180 seconds | 4 | INFORMATIONAL
4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 240 seconds | 4 | INFORMATIONAL
4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 250 seconds | 4 | INFORMATIONAL
4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 300 seconds | 4 | INFORMATIONAL
3 | END_WAIT_FOR_NODE_DRAIN | Wait for drain ended with 1 sessions remaining | 2 | TIMEOUT
5 | BEGIN_SHUTDOWN_AFTER_DRAIN | Staring shutdown of subcluster (analytics) following drain | 4 | INFORMATIONAL
(12 rows)
重新启动子群集后,可以查询 DRAINING_STATUS 系统表以确认节点已将其清空状态重置为非清空:
=> SELECT node_name, subcluster_name, is_draining, count_client_user_sessions, oldest_session_user FROM draining_status ORDER BY 1;
node_name | subcluster_name | is_draining | count_client_user_sessions | oldest_session_user
----------------------+--------------------+-------------+----------------------------+---------------------
v_verticadb_node0001 | default_subcluster | f | 0 |
v_verticadb_node0002 | default_subcluster | f | 0 |
v_verticadb_node0003 | default_subcluster | f | 0 |
v_verticadb_node0004 | analytics | f | 0 |
v_verticadb_node0005 | analytics | f | 0 |
v_verticadb_node0006 | analytics | f | 0 |
(6 rows)
立即关闭
如果想要立即关闭子群集,则可以调用 SHUTDOWN_SUBCLUSTER。以下示例立即关闭名为 analytics 的子群集,而不检查任何活动客户端连接:
=> SELECT SHUTDOWN_SUBCLUSTER('analytics');
SHUTDOWN_SUBCLUSTER
---------------------
Subcluster shutdown
(1 row)