<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>OpenText Analytics Database 26.2.x – Using HDFS storage locations</title>
    <link>/en/hadoop-integration/using-hdfs-storage-locations/</link>
    <description>Recent content in Using HDFS storage locations on OpenText Analytics Database 26.2.x</description>
    <generator>Hugo -- gohugo.io</generator>
    
	  <atom:link href="/en/hadoop-integration/using-hdfs-storage-locations/index.xml" rel="self" type="application/rss+xml" />
    
    
      
        
      
    
    
    <item>
      <title>Hadoop-Integration: Hadoop configuration for backup and restore</title>
      <link>/en/hadoop-integration/using-hdfs-storage-locations/hadoop-config-backup-and-restore/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/hadoop-integration/using-hdfs-storage-locations/hadoop-config-backup-and-restore/</guid>
      <description>
        
        
        &lt;p&gt;If your OpenText™ Analytics Database cluster uses storage locations on HDFS, and you want to be able to back up and restore those storage locations using &lt;code&gt;vbr&lt;/code&gt;, you must enable snapshotting in HDFS.&lt;/p&gt;
&lt;p&gt;The database backup script uses HDFS&#39;s snapshotting feature to create a backup of HDFS storage locations. A directory must allow snapshotting before HDFS can take a snapshot. Only a Hadoop superuser can enable snapshotting on a directory. The database can enable snapshotting automatically if the database administrator is also a Hadoop superuser.&lt;/p&gt;
&lt;p&gt;If HDFS is unsecured, the following instructions apply to the database administrator account, usually dbadmin. If HDFS uses Kerberos security, the following instructions apply to the principal stored in the database keytab file, usually vertica. The instructions below use the term &amp;quot;database account&amp;quot; to refer to this user.&lt;/p&gt;
&lt;p&gt;We recommend that you make the database administrator or principal a Hadoop superuser. If you are not able to do so, you must enable snapshotting on the directory before configuring it for use by the database.&lt;/p&gt;
&lt;p&gt;The steps you need to take to make the database administrator account a superuser depend on the distribution of Hadoop you are using. Consult your Hadoop distribution&#39;s documentation for details.&lt;/p&gt;
&lt;h2 id=&#34;manually-enabling-snapshotting-for-a-directory&#34;&gt;Manually enabling snapshotting for a directory&lt;/h2&gt;
&lt;p&gt;If you cannot grant superuser status to the database account, you can instead enable snapshotting of each directory manually. Use the following command:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ hdfs dfsadmin -allowSnapshot &lt;span class=&#34;code-variable&#34;&gt;path&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Issue this command for each node&#39;s storage location directory. Remember to do this each time you add a new node to your cluster.&lt;/p&gt;
&lt;p&gt;Nested snapshottable directories are not allowed, so you cannot enable snapshotting for a parent directory to automatically enable it for child directories. You must enable it for each individual directory.&lt;/p&gt;
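&lt;p&gt;For example, if each database node stores its data in its own HDFS directory, you can enable snapshotting for all of them in one pass. This is a sketch; the directory paths below are illustrative, so substitute the paths you configured:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ for dir in /user/dbadmin/v_vmart_node0001 \
             /user/dbadmin/v_vmart_node0002 \
             /user/dbadmin/v_vmart_node0003; do
      hdfs dfsadmin -allowSnapshot "$dir"
  done
&lt;/code&gt;&lt;/pre&gt;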
&lt;h2 id=&#34;additional-requirements-for-kerberos&#34;&gt;Additional requirements for Kerberos&lt;/h2&gt;
&lt;p&gt;If HDFS uses Kerberos, then in addition to granting the keytab principal access, you must give the database access to certain Hadoop configuration files. See &lt;a href=&#34;../../../en/admin/backup-and-restore/requirements-backing-up-and-restoring-hdfs-storage-locations/#Configur2&#34;&gt;Configuring Kerberos&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;testing-the-database-accounts-ability-to-make-hdfs-directories-snapshottable&#34;&gt;Testing the database account&#39;s ability to make HDFS directories snapshottable&lt;/h2&gt;
&lt;p&gt;After making the database account a Hadoop superuser, verify that the account can set directories snapshottable:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Log into the Hadoop cluster as the database account (dbadmin by default).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Determine a location in HDFS where the database administrator can create a directory. The &lt;code&gt;/tmp&lt;/code&gt; directory is usually available. Create a test HDFS directory using the command:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ hdfs dfs -mkdir &lt;span class=&#34;code-variable&#34;&gt;/path/testdir&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Make the test directory snapshottable using the command:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ hdfs dfsadmin -allowSnapshot &lt;span class=&#34;code-variable&#34;&gt;/path/testdir&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The following example demonstrates creating an HDFS directory and making it snapshottable:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ hdfs dfs -mkdir /tmp/snaptest
$ hdfs dfsadmin -allowSnapshot /tmp/snaptest
Allowing snaphot on /tmp/snaptest succeeded
&lt;/code&gt;&lt;/pre&gt;
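&lt;p&gt;After verifying that the command succeeds, you can remove the test directory. Assuming you took no snapshots of it, first make it non-snapshottable, then delete it:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ hdfs dfsadmin -disallowSnapshot /tmp/snaptest
$ hdfs dfs -rm -r /tmp/snaptest
&lt;/code&gt;&lt;/pre&gt;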
      </description>
    </item>
    
    <item>
      <title>Hadoop-Integration: Removing HDFS storage locations</title>
      <link>/en/hadoop-integration/using-hdfs-storage-locations/removing-hdfs-storage-locations/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/hadoop-integration/using-hdfs-storage-locations/removing-hdfs-storage-locations/</guid>
      <description>
        
        
&lt;p&gt;The steps to remove an HDFS storage location are similar to those for removing standard storage locations:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Remove any existing data from the HDFS storage location by using &lt;a href=&#34;../../../en/sql-reference/functions/management-functions/storage-functions/set-object-storage-policy/#&#34;&gt;SET_OBJECT_STORAGE_POLICY&lt;/a&gt; to change each object&#39;s storage location. Alternatively, you can use &lt;a href=&#34;../../../en/sql-reference/functions/management-functions/storage-functions/clear-object-storage-policy/#&#34;&gt;CLEAR_OBJECT_STORAGE_POLICY&lt;/a&gt;. Because the Tuple Mover runs infrequently, set the &lt;em&gt;&lt;code&gt;enforce-storage-move&lt;/code&gt;&lt;/em&gt; parameter to &lt;code&gt;true&lt;/code&gt; to apply the change immediately.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Retire the location on each host that has the storage location defined using &lt;a href=&#34;../../../en/sql-reference/functions/management-functions/storage-functions/retire-location/#&#34;&gt;RETIRE_LOCATION&lt;/a&gt;. Set &lt;em&gt;&lt;code&gt;enforce-storage-move&lt;/code&gt;&lt;/em&gt; to &lt;code&gt;true&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Drop the location on each node using &lt;a href=&#34;../../../en/sql-reference/functions/management-functions/storage-functions/drop-location/#&#34;&gt;DROP_LOCATION&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Optionally remove the snapshots and files from the HDFS directory for the storage location.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Perform a full database backup.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
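&lt;p&gt;As a sketch of steps 2 and 3 for a single node, the function calls look like the following. The location path and node name are illustrative; repeat the calls for each node that defines the location:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;SELECT RETIRE_LOCATION(&#39;webhdfs://hadoopNS/user/dbadmin&#39;, &#39;v_vmart_node0001&#39;, true);
SELECT DROP_LOCATION(&#39;webhdfs://hadoopNS/user/dbadmin&#39;, &#39;v_vmart_node0001&#39;);
&lt;/code&gt;&lt;/pre&gt;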
&lt;p&gt;For more information about changing storage policies, changing usage, retiring locations, and dropping locations, see &lt;a href=&#34;../../../en/admin/managing-storage-locations/#&#34;&gt;Managing storage locations&lt;/a&gt;.

&lt;div class=&#34;admonition important&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Important&lt;/h4&gt;
If you have backed up the data in the HDFS storage location you are removing, you must perform a full database backup after you remove the location. Otherwise, if you restore the database from a backup made before you removed the location, the removed location&#39;s data is restored with it.
&lt;/div&gt;&lt;/p&gt;
&lt;h2 id=&#34;removing-storage-location-files-from-hdfs&#34;&gt;Removing storage location files from HDFS&lt;/h2&gt;
&lt;p&gt;Dropping an HDFS storage location does not automatically clean the HDFS directory that stored the location&#39;s files. Any snapshots of the data files created when backing up the location are also not deleted. These files consume disk space on HDFS and also prevent the directory from being reused as an HDFS storage location. OpenText™ Analytics Database cannot create a storage location in a directory that contains existing files or subdirectories.&lt;/p&gt;
&lt;p&gt;To delete the files, log into the Hadoop cluster and use HDFS commands, or use another HDFS file management tool.&lt;/p&gt;
&lt;h3 id=&#34;removing-backup-snapshots&#34;&gt;Removing backup snapshots&lt;/h3&gt;
&lt;p&gt;HDFS returns an error if you attempt to remove a directory that has snapshots:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ hdfs dfs -rm -r -f -skipTrash /user/dbadmin/v_vmart_node0001
rm: The directory /user/dbadmin/v_vmart_node0001 cannot be deleted since
/user/dbadmin/v_vmart_node0001 is snapshottable and already has snapshots
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The database backup script creates snapshots of HDFS storage locations as part of the backup process. If you made backups of your HDFS storage location, you must delete the snapshots before removing the directories.&lt;/p&gt;
&lt;p&gt;HDFS stores snapshots in a subdirectory named &lt;code&gt;.snapshot&lt;/code&gt;. You can list the snapshots in the directory using the standard HDFS &lt;code&gt;ls&lt;/code&gt; command:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ hdfs dfs -ls /user/dbadmin/v_vmart_node0001/.snapshot
Found 1 items
drwxrwx---   - dbadmin supergroup          0 2014-09-02 10:13 /user/dbadmin/v_vmart_node0001/.snapshot/s20140902-101358.629
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;To remove snapshots, use the command:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ hdfs dfs -deleteSnapshot &lt;span class=&#34;code-variable&#34;&gt;directory&lt;/span&gt; &lt;span class=&#34;code-variable&#34;&gt;snapshotname&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The following example demonstrates the command to delete the snapshot shown in the previous example:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ hdfs dfs -deleteSnapshot /user/dbadmin/v_vmart_node0001 s20140902-101358.629
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;You must delete each snapshot from the directory for each host in the cluster. After you have deleted the snapshots, you can delete the directories in the storage location.

&lt;div class=&#34;admonition important&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Important&lt;/h4&gt;
Each snapshot&#39;s name is based on a timestamp down to the millisecond. Nodes independently create their own snapshots. They do not synchronize snapshot creation, so their snapshot names differ. You must list each node&#39;s snapshot directory to learn the names of the snapshots it contains.
&lt;/div&gt;&lt;/p&gt;
&lt;p&gt;See Apache&#39;s &lt;a href=&#34;http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html&#34;&gt;HDFS Snapshot documentation&lt;/a&gt; for more information about managing and removing snapshots.&lt;/p&gt;
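&lt;p&gt;Because each node&#39;s snapshot names differ, a shell loop such as the following sketch can delete all of them. The directory paths are illustrative; substitute the storage location directories for your nodes:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ for dir in /user/dbadmin/v_vmart_node0001 \
             /user/dbadmin/v_vmart_node0002 \
             /user/dbadmin/v_vmart_node0003; do
      # List the snapshot paths (skip the &#34;Found N items&#34; header line)
      for snap in $(hdfs dfs -ls "$dir/.snapshot" | awk &#39;NR != 1 {print $NF}&#39;); do
          hdfs dfs -deleteSnapshot "$dir" "$(basename "$snap")"
      done
  done
&lt;/code&gt;&lt;/pre&gt;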
&lt;h3 id=&#34;removing-the-storage-location-directories&#34;&gt;Removing the storage location directories&lt;/h3&gt;
&lt;p&gt;You can remove the directories that held the storage location&#39;s data by either of the following methods:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Use an HDFS file manager to delete directories. See your Hadoop distribution&#39;s documentation to determine if it provides a file manager.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Log into the Hadoop Name Node using the database administrator&#39;s account and use HDFS&#39;s &lt;code&gt;rm -r&lt;/code&gt; command to delete the directories. See Apache&#39;s &lt;a href=&#34;http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html&#34;&gt;File System Shell Guide&lt;/a&gt; for more information.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The following example uses the HDFS &lt;code&gt;rm -r&lt;/code&gt; command from the Linux command line to delete the directories left behind in the HDFS storage location directory &lt;code&gt;/user/dbadmin&lt;/code&gt;. It uses the &lt;code&gt;-skipTrash&lt;/code&gt; flag to force the immediate deletion of the files:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ hdfs dfs -ls /user/dbadmin
Found 3 items
drwxrwx---   - dbadmin supergroup          0 2014-08-29 15:11 /user/dbadmin/v_vmart_node0001
drwxrwx---   - dbadmin supergroup          0 2014-08-29 15:11 /user/dbadmin/v_vmart_node0002
drwxrwx---   - dbadmin supergroup          0 2014-08-29 15:11 /user/dbadmin/v_vmart_node0003

$ hdfs dfs -rm -r -skipTrash /user/dbadmin/*
Deleted /user/dbadmin/v_vmart_node0001
Deleted /user/dbadmin/v_vmart_node0002
Deleted /user/dbadmin/v_vmart_node0003
&lt;/code&gt;&lt;/pre&gt;
      </description>
    </item>
    
  </channel>
</rss>
