<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>OpenText Analytics Database 26.2.x – Accessing kerberized HDFS data</title>
    <link>/en/hadoop-integration/accessing-kerberized-hdfs-data/</link>
    <description>Recent content in Accessing kerberized HDFS data on OpenText Analytics Database 26.2.x</description>
    <generator>Hugo -- gohugo.io</generator>
    
	  <atom:link href="/en/hadoop-integration/accessing-kerberized-hdfs-data/index.xml" rel="self" type="application/rss+xml" />
    
    
      
        
      
    
    
    <item>
      <title>Hadoop-Integration: Using Kerberos with OpenText™ Analytics Database</title>
      <link>/en/hadoop-integration/accessing-kerberized-hdfs-data/using-kerberos-with/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/hadoop-integration/accessing-kerberized-hdfs-data/using-kerberos-with/</guid>
      <description>
        
        
        &lt;p&gt;If you use Kerberos for your OpenText™ Analytics Database cluster and your principals have access to HDFS, then you can configure the database to use the same credentials for HDFS.&lt;/p&gt;
&lt;p&gt;The database authenticates with Hadoop in two ways that require different configurations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;User Authentication&lt;/strong&gt;—On behalf of the user, by passing along the user&#39;s existing Kerberos credentials. This method is also called user impersonation. Actions performed on behalf of particular users, like executing queries, generally use user authentication.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Database Authentication&lt;/strong&gt;—On behalf of system processes that access ROS data or the catalog, by using a special Kerberos credential stored in a keytab file.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

The database and Hadoop must use the same Kerberos server or servers (KDCs).

&lt;/div&gt;
&lt;p&gt;The database can interact with more than one Kerberos realm. To configure multiple realms, see &lt;a href=&#34;../../../en/security-and-authentication/client-authentication/kerberos-authentication/#MultiRealmSupport&#34;&gt;Multi-realm Support&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The database attempts to automatically refresh Hadoop tokens before they expire. See &lt;a href=&#34;../../../en/hadoop-integration/accessing-kerberized-hdfs-data/token-expiration/#&#34;&gt;Token expiration&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;user-authentication&#34;&gt;User authentication&lt;/h2&gt;
&lt;p&gt;To use the database with Kerberos and Hadoop, the client user first authenticates with one of the Kerberos servers (Key Distribution Center, or KDC) being used by the Hadoop cluster. A user might run &lt;code&gt;kinit&lt;/code&gt; or sign in to Active Directory, for example.&lt;/p&gt;
&lt;p&gt;A user who authenticates to a Kerberos server receives a Kerberos ticket. At the beginning of a client session, the database automatically retrieves this ticket. The database then uses this ticket to get a Hadoop token, which Hadoop uses to grant access. The database uses this token to access HDFS, such as when executing a query on behalf of the user. When the token expires, the database automatically renews it, also renewing the Kerberos ticket if necessary.&lt;/p&gt;
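&lt;p&gt;For example, a client user might obtain a ticket with &lt;code&gt;kinit&lt;/code&gt; before starting a database session, then verify it with &lt;code&gt;klist&lt;/code&gt; (the principal and realm shown are illustrative):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ kinit alice@EXAMPLE.COM
$ klist
&lt;/code&gt;&lt;/pre&gt;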
&lt;p&gt;The user must have been granted permission to access the relevant files in HDFS. This permission is checked the first time the database reads HDFS data.&lt;/p&gt;
&lt;p&gt;The database can use multiple KDCs serving multiple Kerberos realms, if proper cross-realm trust has been set up between realms.&lt;/p&gt;
&lt;p&gt;&lt;a name=&#34;VerticaAuthentication&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&#34;opentexttrade-analytics-database-authentication&#34;&gt;OpenText™ Analytics Database authentication&lt;/h2&gt;
&lt;p&gt;Automatic processes, such as the Tuple Mover or the processes that access Eon Mode communal storage, do not log in the way users do. Instead, the database uses a special identity (principal) stored in a keytab file on every database node. (This approach is also used for database clusters that use Kerberos but do not use Hadoop.) After you configure the keytab file, the database uses the principal residing there to automatically obtain and maintain a Kerberos ticket, much as in the client scenario. In this case, the client does not interact with Kerberos.&lt;/p&gt;
&lt;p&gt;Each database node uses its own principal; it is common to incorporate the name of the node into the principal name. You can either create one keytab per node, containing only that node&#39;s principal, or you can create a single keytab containing all the principals and distribute the file to all nodes. Either way, the node uses its principal to get a Kerberos ticket and then uses that ticket to get a Hadoop token.&lt;/p&gt;
&lt;p&gt;When creating HDFS storage locations, the database uses the principal in the keytab file, not the principal of the user issuing the CREATE LOCATION statement. The HCatalog Connector sometimes uses the principal in the keytab file, depending on how Hive authenticates users.&lt;/p&gt;
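&lt;p&gt;For example, a storage location created with a statement like the following is authenticated using the keytab principal, no matter which user runs it (the nameservice, path, and label are illustrative):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; CREATE LOCATION &amp;#39;webhdfs://hadoopNS/vertica/colddata&amp;#39; ALL NODES SHARED
   USAGE &amp;#39;data&amp;#39; LABEL &amp;#39;coldstorage&amp;#39;;
&lt;/code&gt;&lt;/pre&gt;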
&lt;h2 id=&#34;configuring-users-and-the-keytab-file&#34;&gt;Configuring users and the keytab file&lt;/h2&gt;
&lt;p&gt;If you have not already configured Kerberos authentication for the database, follow the instructions in &lt;a href=&#34;../../../en/security-and-authentication/client-authentication/kerberos-authentication/configure-kerberos-authentication/#&#34;&gt;Configure OpenText™ Analytics Database for Kerberos authentication&lt;/a&gt;. Of particular importance for Hadoop integration:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Create one Kerberos principal per node.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Place the keytab files in the same location on each database node and set the configuration parameter &lt;a href=&#34;../../../en/sql-reference/config-parameters/kerberos-parameters/&#34;&gt;KerberosKeytabFile&lt;/a&gt; to that location.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Set KerberosServiceName to the name of the principal. (See &lt;a href=&#34;../../../en/security-and-authentication/client-authentication/kerberos-authentication/configure-kerberos-authentication/inform-about-kerberos-principal/#&#34;&gt;Inform OpenText™ Analytics Database about the Kerberos principal&lt;/a&gt;.)&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
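&lt;p&gt;For example, assuming a shared keytab file installed at the same path on every node (the keytab path and service name below are placeholders for your own values):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; ALTER DATABASE exampledb SET KerberosKeytabFile = &amp;#39;/etc/vertica/vertica.keytab&amp;#39;;
=&amp;gt; ALTER DATABASE exampledb SET KerberosServiceName = &amp;#39;vertica&amp;#39;;
&lt;/code&gt;&lt;/pre&gt;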
&lt;p&gt;If you are using the HCatalog Connector, follow the additional steps in &lt;a href=&#34;../../../en/hadoop-integration/using-hcatalog-connector/configuring-security/#&#34;&gt;Configuring security&lt;/a&gt; in the HCatalog Connector documentation.&lt;/p&gt;
&lt;p&gt;If you are using HDFS storage locations, give all node principals read and write permission to the HDFS directory you will use as a storage location.&lt;/p&gt;
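&lt;p&gt;For example, if your node principals all map to a single Hadoop user (shown here as &lt;code&gt;dbadmin&lt;/code&gt;; the user name and path are illustrative), you could grant access with HDFS ACLs:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;$ hdfs dfs -mkdir -p /vertica/colddata
$ hdfs dfs -setfacl -R -m user:dbadmin:rwx /vertica/colddata
&lt;/code&gt;&lt;/pre&gt;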

      </description>
    </item>
    
    <item>
      <title>Hadoop-Integration: Proxy users and delegation tokens</title>
      <link>/en/hadoop-integration/accessing-kerberized-hdfs-data/proxy-users-and-delegation-tokens/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/hadoop-integration/accessing-kerberized-hdfs-data/proxy-users-and-delegation-tokens/</guid>
      <description>
        
        
        &lt;p&gt;An alternative to granting HDFS access to individual OpenText™ Analytics Database users is to use delegation tokens, either directly or with a proxy user. In this configuration, the database accesses HDFS on behalf of some other (Hadoop) user. The Hadoop users need not be database users at all.&lt;/p&gt;
&lt;p&gt;In OpenText™ Analytics Database, you can either specify the name of the Hadoop user to act on behalf of (doAs), or you can directly use a Kerberos delegation token that you obtain from HDFS (Bring Your Own Delegation Token). In the doAs case, the database obtains a delegation token for that user, so both approaches ultimately use delegation tokens to access files in HDFS.&lt;/p&gt;
&lt;p&gt;Use the &lt;a href=&#34;../../../en/sql-reference/config-parameters/hadoop-parameters/#HadoopImpersonationConfig&#34;&gt;HadoopImpersonationConfig&lt;/a&gt; session parameter to specify a user or delegation token to use for HDFS access. Each session can use a different user and can use either doAs or a delegation token. The value of HadoopImpersonationConfig is a set of JSON objects.&lt;/p&gt;
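&lt;p&gt;For example, a session could act on behalf of a Hadoop user for a particular nameservice by setting the parameter to a JSON array (the nameservice and user name are illustrative):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; ALTER SESSION SET HadoopImpersonationConfig = &amp;#39;[{&amp;#34;nameservice&amp;#34;:&amp;#34;hadoopNS&amp;#34;, &amp;#34;doAs&amp;#34;:&amp;#34;alice&amp;#34;}]&amp;#39;;
&lt;/code&gt;&lt;/pre&gt;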
&lt;p&gt;To use delegation tokens of either type (more specifically, when HadoopImpersonationConfig is set), you must access HDFS through WebHDFS.&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Hadoop-Integration: Token expiration</title>
      <link>/en/hadoop-integration/accessing-kerberized-hdfs-data/token-expiration/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/hadoop-integration/accessing-kerberized-hdfs-data/token-expiration/</guid>
      <description>
        
        
        &lt;p&gt;OpenText™ Analytics Database uses Hadoop tokens when using Kerberos tickets (&lt;a href=&#34;../../../en/hadoop-integration/accessing-kerberized-hdfs-data/using-kerberos-with/#&#34;&gt;Using Kerberos with OpenText™ Analytics Database&lt;/a&gt;) or doAs (&lt;a href=&#34;../../../en/hadoop-integration/accessing-kerberized-hdfs-data/proxy-users-and-delegation-tokens/user-impersonation-doas/#&#34;&gt;User impersonation (doAs)&lt;/a&gt;). The database attempts to automatically refresh Hadoop tokens before they expire, but you can also set a minimum refresh frequency if you prefer. Use the HadoopFSTokenRefreshFrequency configuration parameter to specify the frequency in seconds:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; ALTER DATABASE exampledb SET HadoopFSTokenRefreshFrequency = &amp;#39;86400&amp;#39;;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;If the current age of the token is greater than the value specified in this parameter, the database refreshes the token before accessing data stored in HDFS.&lt;/p&gt;
&lt;p&gt;The database does not refresh delegation tokens (&lt;a href=&#34;../../../en/hadoop-integration/accessing-kerberized-hdfs-data/proxy-users-and-delegation-tokens/bring-your-own-delegation-token/#&#34;&gt;Bring your own delegation token&lt;/a&gt;).&lt;/p&gt;

      </description>
    </item>
    
  </channel>
</rss>
