Replacing nodes

If you have a K-Safe database, you can replace nodes, as necessary, without bringing the system down. For example, you might want to replace an existing node if you:

  • Need to repair an existing host system that no longer functions and restore it to the cluster

  • Want to exchange an existing host system for another more powerful system

The process you use to replace a node depends on whether you are replacing the node with:

  • A host that uses the same name and IP address

  • A host that uses a different name and IP address

  • An active standby node

Prerequisites

  • Configure the replacement hosts for Vertica. See Before you Install Vertica.

  • Read the Important Tips sections under Adding hosts to a cluster and Removing hosts from a cluster.

  • Ensure that the database administrator user exists on the new host and is configured identically to the existing hosts. Vertica sets up passwordless SSH as needed.

  • Ensure that the directories for Catalog Path, Data Path, and any storage locations that were added to the database when you created it exist or are mounted correctly on the new host, and that the database administrator user has read and write access to them. Also ensure that there is sufficient disk space.

  • Follow the best practice procedure below for introducing the failed hardware back into the cluster to avoid spurious full-node rebuilds.
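
The directory prerequisite above can be checked with a short script before the new host joins the cluster. The following is a minimal sketch, not a Vertica tool: the function name `check_vertica_dirs` and the example paths are hypothetical, and it assumes you run it on the new host as the database administrator user.

```shell
# check_vertica_dirs DIR...
# Prints one line per problem and returns non-zero if any directory
# is missing or is not readable and writable by the current user.
check_vertica_dirs() {
    local status=0 dir
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
            status=1
        elif [ ! -r "$dir" ] || [ ! -w "$dir" ]; then
            echo "no read/write access: $dir"
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_vertica_dirs /home/dbadmin/catalog /home/dbadmin/data` (hypothetical paths) reports any directory that needs attention. Free space is not covered by this sketch; `df -h` on the same paths shows it.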

Best practice for restoring failed hardware

Following this procedure prevents Vertica from misdiagnosing a missing disk or bad mounts as data corruption, which would result in a time-consuming, full-node recovery.

If a server fails due to hardware issues, for example a bad disk or a failed controller, upon repairing the hardware:

  1. Reboot the machine into runlevel 1, which is a root and console-only mode.

    Runlevel 1 prevents network connectivity and keeps Vertica from attempting to reconnect to the cluster.

  2. In runlevel 1, validate that the hardware has been repaired, the controllers are online, and any RAID recovery is able to proceed.

  3. Once the hardware is confirmed consistent, only then reboot to runlevel 3 or higher.

At this point, the network activates, and Vertica rejoins the cluster and automatically recovers any missing data. Note that, on a single-node database, if any files that were associated with a projection have been deleted or corrupted, Vertica will delete all files associated with that projection, which could result in data loss.
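
Part of the validation in step 2 is confirming that the file systems Vertica depends on are actually mounted. The sketch below is one possible check, assuming the `mountpoint` utility (from util-linux) is available in runlevel 1. The function name `all_mounted` and the example paths are hypothetical.

```shell
# all_mounted PATH...
# Returns 0 only if every PATH is currently a mount point;
# prints one line for each path that is not.
all_mounted() {
    local status=0 path
    for path in "$@"; do
        if ! mountpoint -q "$path" 2>/dev/null; then
            echo "not mounted: $path"
            status=1
        fi
    done
    return "$status"
}
```

For example, `all_mounted /vertica/data /vertica/catalog && echo "mounts look consistent"` (hypothetical paths) prints the confirmation only when both paths are mounted. It does not replace checking the controller and RAID status with the hardware vendor's tools.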

1 - Replacing a host using the same name and IP address

If a host of an existing Vertica database is removed, you can replace it while the database is running.

You can replace the host with a new host that matches the old host in the following characteristics:

  • Name

  • IP address

  • Operating system

  • The OS administrator user

  • Directory location

Replacing the host while your database is running prevents system downtime. Before replacing a host, back up your database. See Backing up and restoring the database for more information.

Replace a host using the same characteristics as follows:

  1. Run install_vertica from a functioning host using the --rpm or --deb parameter:

    $ /opt/vertica/sbin/install_vertica --rpm rpm_package
    

    For more information see Installing using the command line.

  2. Use Administration Tools from an existing node to restart the new host. See Restart Vertica on a node.

The node automatically joins the database and recovers its data by querying the other nodes in the database. It then transitions to an UP state.
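
To watch for the transition to UP from a script, one option is to poll the NODES system table. This is a rough sketch under several assumptions: `vsql` is on the PATH and can connect without prompting, the function names `node_state` and `wait_for_up` are hypothetical, and the node name in the usage example is made up.

```shell
# node_state NODE_NAME
# Prints the node's current state (for example UP or RECOVERING).
node_state() {
    vsql -At -c "SELECT node_state FROM nodes WHERE node_name = '$1';"
}

# wait_for_up NODE_NAME [TRIES] [DELAY_SECONDS]
# Polls until the node reports UP. Returns 1 if it never does.
wait_for_up() {
    local node=$1 tries=${2:-60} delay=${3:-10} state i=0
    while [ "$i" -lt "$tries" ]; do
        state=$(node_state "$node") || state=""
        echo "$node: ${state:-unknown}"
        if [ "$state" = "UP" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, `wait_for_up v_vmart_node0003 30 20` checks a hypothetical node every 20 seconds, up to 30 times. Recovery time depends on how much data the node must rebuild, so choose the limits accordingly.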

2 - Replacing a failed node using a node with a different IP address

Replacing a failed node with a host system that has a different IP address from the original consists of the following steps:

  1. Back up the database.

    Vertica recommends that you back up the database before you perform this significant operation because it entails creating new projections, deleting old projections, and reloading data.

  2. Add the new host to the cluster. See Adding hosts to a cluster.

  3. If Vertica is still running on the node being replaced, use the Stop Vertica on Host option in the Administration Tools to stop it.

  4. Use the Administration Tools to replace the original host with the new host. If you are using more than one database, replace the original host in all the databases in which it is used. See Replacing Hosts.

  5. Use the procedure in Distributing Configuration Files to the New Host to transfer metadata to the new host.

  6. Remove the original host from the cluster.

  7. Use the Administration Tools to restart Vertica on the host. On the Main Menu, select Restart Vertica on Host, and click OK. See Starting the database for more information.

Once you have completed this process, the replacement node automatically recovers the data that was stored in the original node by querying other nodes within the database.
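
The two cluster-membership steps (adding the new host and removing the original one) are typically performed with the `update_vertica` script. The sketch below only prints the commands rather than running them. It assumes the script lives in `/opt/vertica/sbin` alongside `install_vertica`, that it is run with root privileges, and that `--add-hosts`, `--remove-hosts`, and `--rpm` behave as described in Adding hosts to a cluster and Removing hosts from a cluster. The helper names and host names are hypothetical.

```shell
# Helpers that print (but do not run) the cluster-membership commands.
UPDATE_VERTICA=/opt/vertica/sbin/update_vertica

# add_host_cmd NEW_HOST RPM_PACKAGE
add_host_cmd() {
    echo "sudo $UPDATE_VERTICA --add-hosts $1 --rpm $2"
}

# remove_host_cmd ORIGINAL_HOST
remove_host_cmd() {
    echo "sudo $UPDATE_VERTICA --remove-hosts $1"
}
```

For example, `add_host_cmd host05 rpm_package` prints the command for step 2, and `remove_host_cmd host03` prints the command for step 6. Review the printed command against your installed version before running it.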

3 - Replacing a functioning node using a different name and IP address

Replacing a node with a host system that has a different IP address and host name from the original consists of the following general steps:

  1. Back up the database.

    Vertica recommends that you back up the database before you perform this significant operation because it entails creating new projections, deleting old projections, and reloading data.

  2. Add the replacement host to the cluster.

    At this point, both the original host that you want to remove and the new replacement host are members of the cluster.

  3. Use the Administration Tools to Stop Vertica on Host on the host being replaced.

  4. Use the Administration Tools to replace the original host with the new host. If you are using more than one database, replace the original host in all the databases in which it is used. See Replacing Hosts.

  5. Remove the original host from the cluster.

  6. Restart Vertica on the host.

Once you have completed this process, the replacement node automatically recovers the data that was stored in the original node by querying the other nodes within the database. It then transitions to an UP state.

4 - Using the administration tools to replace nodes

If you are replacing a node with a host that uses a different name and IP address, use the Administration Tools to replace the original host with the new host. Alternatively, you can use the Management Console to replace a node.

Replace the original host with a new host using the administration tools

To replace the original host with a new host using the Administration Tools:

  1. Back up the database. See Backing up and restoring the database.

  2. From a node that is up and is not going to be replaced, open the Administration Tools.

  3. On the Main Menu, select View Database Cluster State to verify that the database is running. If it’s not running, use the Start Database command on the Main Menu to restart it.

  4. On the Main Menu, select Advanced Menu.

  5. In the Advanced Menu, select Stop Vertica on Host.

  6. Select the host you want to replace, and then click OK to stop the node.

  7. When prompted if you want to stop the host, select Yes.

  8. In the Advanced Menu, select Cluster Management, and then click OK.

  9. In the Cluster Management menu, select Replace Host, and then click OK.

  10. Select the database that contains the host you want to replace, and then click OK.

    A list of all the hosts that are currently in use is displayed.

  11. Select the host you want to replace, and then click OK.

  12. Select the host you want to use as the replacement, and then click OK.

  13. When prompted, enter the password for the database, and then click OK.

  14. When prompted, click Yes to confirm that you want to replace the host.

  15. When prompted that the host was successfully replaced, click OK.

  16. In the Main Menu, select View Database Cluster State to verify that all the hosts are running. You might need to start Vertica on the host you just replaced. Use Restart Vertica on Host.

    The node enters a RECOVERING state.
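
The menu steps above have rough command-line counterparts in `admintools -t`. The sketch below prints a plan of those commands without running anything. The tool names (`view_cluster`, `stop_host`, `db_replace_node`, `restart_node`) and their flags are assumptions based on the Administration Tools command-line interface and can differ between versions, so confirm each one with `admintools -t <tool> --help` first. The function name, database name, and host names are hypothetical, and some tools may also prompt for the database password.

```shell
# replace_host_plan DB ORIGINAL_HOST NEW_HOST
# Prints the assumed admintools commands in the order of the menu steps.
replace_host_plan() {
    local db=$1 old=$2 new=$3
    echo "admintools -t view_cluster -d $db"                    # step 3: verify state
    echo "admintools -t stop_host -s $old"                      # steps 5-7: stop host
    echo "admintools -t db_replace_node -d $db -o $old -n $new" # steps 8-15: replace
    echo "admintools -t restart_node -s $new -d $db"            # step 16: restart
    echo "admintools -t view_cluster -d $db"                    # step 16: verify state
}
```

For example, `replace_host_plan vmart host03 host05` prints five commands that you can review and run one at a time from a node that is up and is not being replaced.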