TLS/SSL encryption with Kafka

You can use TLS/SSL encryption between Vertica, your scheduler, and Kafka. This encryption prevents others from accessing the data that is sent between Kafka and Vertica. It can also verify the identity of all parties involved in data streaming, so no impostor can pose as your Vertica cluster or a Kafka broker.

Some common cases where you want to use SSL encryption between Vertica and Kafka are:

  • Your Vertica database and Kafka communicate over an insecure network. For example, suppose your Kafka cluster is located in a cloud service and your Vertica cluster is within your internal network. In this case, any data you read from Kafka travels over an insecure connection across the Internet.

  • You are required by security policies, laws, or other requirements to encrypt all of your network traffic.

For more information about TLS/SSL encryption in Vertica, see TLS protocol.

Using TLS/SSL between the scheduler and Vertica

The scheduler connects to Vertica the same way other client applications do. There are two ways you can configure Vertica to use SSL/TLS authentication and encryption with clients:

  • If Vertica is configured to use SSL/TLS server authentication, you can choose to have your scheduler confirm the identity of the Vertica server.

  • If Vertica is configured to use mutual SSL/TLS authentication, you can configure your scheduler to identify itself to Vertica as well as have it verify the identity of the Vertica server. Depending on your database's configuration, the Vertica server may require your scheduler to use TLS when connecting. See Client authentication with TLS for more information.

For information on encrypted client connections with Vertica, refer to TLS protocol.

The scheduler runs on a Java Virtual Machine (JVM) and uses JDBC to connect to Vertica. It acts like any other JDBC client when connecting to Vertica. To use TLS/SSL encryption for the scheduler's connection to Vertica, use the Java keystore and truststore mechanism to hold the keys and certificates the scheduler uses to identify itself and Vertica.

  • The keystore contains your scheduler's private encryption key and its certificate (public key).

  • The truststore contains CAs that you trust. If you enable authentication, the scheduler uses these CAs to verify the identity of the Vertica cluster it connects to. If one of the CAs in the truststore was used to sign the server's certificate, then the scheduler knows it can trust the identity of the Vertica server.

You can pass options to the JVM that executes the scheduler through the Linux environment variable named VKCONFIG_JVM_OPTS. You add parameters to this variable to alter the scheduler's JDBC settings (such as the truststore and keystore for the scheduler's JDBC connection). See Step 2: Set the VKCONFIG_JVM_OPTS Environment Variable for an example.

You can also use the --jdbc-url scheduler option to alter the JDBC configuration. See Common vkconfig script options for more information about the scheduler options, and JDBC connection properties for more information about the connection properties you can alter.
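
For example, a minimal sketch that combines both mechanisms (the truststore path, host, database, credentials, and configuration file name are placeholders):

$ export VKCONFIG_JVM_OPTS="-Djavax.net.ssl.trustStore=/home/dbadmin/SSL/scheduler.truststore.jks"
$ vkconfig launch --conf myscheduler.conf \
          --jdbc-url "jdbc:vertica://VerticaHost:5433/databaseName?user=username&TLSmode=require"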

Using TLS/SSL between Vertica and Kafka

You can stream data from Kafka into Vertica two ways: manually using a COPY statement and the KafkaSource UD source function, or automatically using the scheduler.

To directly copy data from Kafka via an SSL connection, you set session variables containing an SSL key and certificate. When KafkaSource finds that you have set these variables, it uses the key and certificate to create a secure connection to Kafka. See Kafka TLS/SSL Example Part 4: Loading Data Directly From Kafka for details.

When automatically streaming data from Kafka to Vertica, you configure the scheduler the same way you do to use an SSL connection to Vertica. When the scheduler executes COPY statements to load data from Kafka, it uses its own keystore and truststore to create an SSL connection to Kafka.

To use an SSL connection when producing data from Vertica to Kafka, you set the same session variables you use when directly streaming data from Kafka via an SSL connection. The KafkaExport function uses these variables to establish a secure connection to Kafka.
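
For example, assuming the session variables described above are already set, KafkaExport needs no additional TLS arguments. This sketch (the broker, topic, table t, and column a are placeholders) writes one VARCHAR column to the broker's SSL port, passing NULL for the partition and key arguments:

=> SELECT KafkaExport(NULL, NULL, a
          USING PARAMETERS brokers='kafka01.mycompany.com:9093',
                           topic='test')
          OVER (PARTITION BEST) FROM t;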

See the Apache Kafka documentation for more information about using SSL/TLS authentication with Kafka.

1 - Planning TLS/SSL encryption between Vertica and Kafka

Some things to consider before you begin configuring TLS/SSL:

  • Which connections between the scheduler, Vertica, and Kafka need to be encrypted? In some cases, you may only need to enable encryption between Vertica and Kafka. This scenario is common when Vertica and the Kafka cluster are on different networks. For example, suppose Kafka is hosted in a cloud service and Vertica is hosted in your internal network. In this case, any data you read from Kafka travels over an unsecured connection across the Internet. However, if Vertica and the scheduler are both in your local network, you may decide that configuring them to use SSL/TLS is unnecessary. In other cases, you will want all parts of the system to be encrypted. For example, you want to encrypt all traffic when Kafka, Vertica, and the scheduler are all hosted in a cloud provider whose network may not be secure.

  • Which connections between the scheduler, Vertica, and Kafka require trust? You can opt to have any of these connections fail if one system cannot verify the identity of another. See Verifying Identities below.

  • Which CAs will you be using to sign each certificate? The simplest way to configure the scheduler, Kafka, and Vertica is to use the same CA to sign all of the certificates you will use when setting up TLS/SSL. Using the same root CA to sign the certificates requires you to be able to edit the configuration of Kafka, Vertica, and the scheduler. If you cannot use the same CA to sign all certificates, all truststores must contain the entire chain of CAs used to sign the certificates, all the way up to the root CA. Including the entire chain of trust ensures that the systems can verify each other's identities (see the sketch after this list).
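
For example, if an intermediate CA signed your certificates, each truststore must contain both the intermediate and the root CA certificates. A sketch with keytool (file names and aliases are placeholders):

$ keytool -keystore kafka.truststore.jks -alias CARoot -import -file root.crt
$ keytool -keystore kafka.truststore.jks -alias CAIntermediate -import -file intermediate.crt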

Verifying identities

Your primary challenge when configuring TLS/SSL encryption between Vertica, the scheduler, and Kafka is making sure the scheduler, Kafka, and Vertica can all verify each other's identity. The most common problem when setting up TLS/SSL encryption is a remote system being unable to verify the authenticity of another system's certificate. The best way to prevent this problem is to ensure that all systems have their certificates signed by a CA that all of the systems explicitly trust.

When a system attempts to start an encrypted connection with another system, it sends its certificate to the remote system. This certificate can be signed by one or more certificate authorities (CAs) that help identify the system making the connection. These signatures form a "chain of trust." A certificate is signed by a CA. That CA, in turn, could have been signed by another CA, and so forth. Often, the chain ends with a CA (referred to as the root CA) from a well-known commercial provider of certificates, such as Comodo SSL or DigiCert, whose certificates are trusted by default on many platforms such as operating systems and web browsers.

If the remote system finds a CA in the chain that it trusts, it verifies the identity of the system making the connection, and the connection can continue. If the remote system cannot find the signature of a CA it trusts, it may block the connection, depending on its configuration. Systems can be configured to only allow connections from systems whose identity has been verified.
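
You can check a chain of trust from the command line before distributing certificates. For example, this openssl check (file names are placeholders) confirms that server.crt was signed by root.crt:

$ openssl verify -CAfile root.crt server.crt
server.crt: OK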

2 - Configuring your scheduler for TLS connections

The scheduler can use TLS for two different connections: the one it makes to Vertica, and the connection it creates when running COPY statements to retrieve data from Kafka. Because the scheduler is a Java application, you supply the TLS key and the certificate used to sign it in a keystore. You also supply a truststore that contains the certificates that the scheduler should trust. Both the connection to Vertica and the connection to Kafka can use the same keystore and truststore. You can also choose to use separate keystores and truststores for these two connections by setting different JDBC settings for the scheduler. See JDBC connection properties for a list of these settings.
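
For example, assuming a Vertica JDBC driver version that supports the KeyStorePath and TrustStorePath connection properties, you could give the Vertica connection its own keystore and truststore through the scheduler's JDBC URL, reserving the javax.net.ssl system properties for the Kafka connection. A sketch of the configuration-file line (host, database, credentials, and paths are placeholders):

jdbc-url=jdbc:vertica://VerticaHost:portNumber/databaseName?user=username&TLSmode=require&KeyStorePath=/path/to/vertica.keystore.jks&KeyStorePassword=keystore_password&TrustStorePath=/path/to/vertica.truststore.jks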

See Kafka TLS/SSL Example Part 5: Configure the Scheduler for detailed steps on configuring your scheduler to use SSL.

Note that if the Kafka server's ssl.client.auth parameter is set to none or requested, you do not need to create a keystore.

3 - Using TLS/SSL when directly loading data from Kafka

You can manually load data from Kafka using the COPY statement and the KafkaSource user-defined load function (see Manually consume data from Kafka). To have KafkaSource open a secure connection to Kafka, you must supply it with an SSL key and other information.

When starting, the KafkaSource function checks if several user session variables are defined. These variables contain the SSL key, the certificate used to sign the key, and other information that the function needs to create the SSL connection. See Kafka user-defined session parameters for a list of these variables. If KafkaSource finds these variables are defined, it uses them to create an SSL connection to Kafka.

See Kafka TLS/SSL Example Part 4: Loading Data Directly From Kafka for a step-by-step guide on configuring and using an SSL connection when directly copying data from Kafka.

These variables are also used by the KafkaExport function to establish a secure connection to Kafka when exporting data.
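
A condensed sketch of the pattern (key and certificate values abbreviated; the example linked above walks through a complete session):

=> ALTER SESSION SET UDPARAMETER kafka_SSL_Certificate='-----BEGIN CERTIFICATE-----...';
=> ALTER SESSION SET UDPARAMETER kafka_SSL_PrivateKey_secret='-----BEGIN PRIVATE KEY-----...';
=> ALTER SESSION SET UDPARAMETER kafka_SSL_PrivateKeyPassword_secret='key_password';
=> ALTER SESSION SET UDPARAMETER kafka_SSL_CA='-----BEGIN CERTIFICATE-----...';
=> ALTER SESSION SET UDPARAMETER kafka_Enable_SSL=1;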

4 - Configure Kafka for TLS/SSL

This page covers procedures for configuring TLS connections between Vertica, Kafka, and the scheduler.

Note that the following example configures TLS for a Kafka server where ssl.client.auth=required, which requires the following:

  • kafka_SSL_Certificate

  • kafka_SSL_PrivateKey_secret

  • kafka_SSL_PrivateKeyPassword_secret

  • A keystore for the Scheduler

If your configuration uses ssl.client.auth=none or ssl.client.auth=requested, these parameters and the scheduler keystore are optional.

Creating certificates for Vertica and clients

The CA certificate in this example is self-signed. In a production environment, you should instead use a trusted CA.

This example uses the same self-signed root CA to sign all of the certificates used by the scheduler, Kafka brokers, and Vertica. If you cannot use the same CA to sign the keys for all of these systems, make sure you include the entire chain of trust in your keystores.

For more information, see Generating TLS certificates and keys.

  1. Generate a private key, root.key.

    $ openssl genrsa -out root.key
    Generating RSA private key, 2048 bit long modulus
    ..............................................................................
    ............................+++
    ...............+++
    e is 65537 (0x10001)
    
  2. Generate a self-signed CA certificate.

    $ openssl req -new -x509 -key root.key -out root.crt
    You are about to be asked to enter information that will be incorporated
    into your certificate request.
    What you are about to enter is what is called a Distinguished Name or a DN.
    There are quite a few fields but you can leave some blank
    For some fields there will be a default value,
    If you enter '.', the field will be left blank.
    -----
    Country Name (2 letter code) [AU]:US
    State or Province Name (full name) [Some-State]:MA
    Locality Name (eg, city) []:Cambridge
    Organization Name (eg, company) [Internet Widgits Pty Ltd]:My Company
    Organizational Unit Name (eg, section) []:
    Common Name (e.g. server FQDN or YOUR name) []:*.mycompany.com
    Email Address []:myemail@mycompany.com
    
  3. Restrict root.key to owner read/write permissions. For root.crt, grant the owner read/write permissions and all other users read permission.

    
    $ ls
    root.crt  root.key
    $ chmod 600 root.key
    $ chmod 644 root.crt
    
  4. Generate the server private key, server.key.

    $ openssl genrsa -out server.key
    Generating RSA private key, 2048 bit long modulus
    ....................................................................+++
    ......................................+++
    e is 65537 (0x10001)
    
  5. Create a certificate signing request (CSR) to submit to your CA. Be sure to set the "Common Name" field to a wildcard (asterisk) so the certificate is accepted for all Vertica nodes in the cluster:

    $ openssl req -new -key server.key -out server_reqout.txt
    You are about to be asked to enter information that will be incorporated
    into your certificate request.
    What you are about to enter is what is called a Distinguished Name or a DN.
    There are quite a few fields but you can leave some blank
    For some fields there will be a default value,
    If you enter '.', the field will be left blank.
    -----
    Country Name (2 letter code) [AU]:US
    State or Province Name (full name) [Some-State]:MA
    Locality Name (eg, city) []:Cambridge
    Organization Name (eg, company) [Internet Widgits Pty Ltd]:My Company
    Organizational Unit Name (eg, section) []:
    Common Name (e.g. server FQDN or YOUR name) []:*.mycompany.com
    Email Address []:myemail@mycompany.com
    
    Please enter the following 'extra' attributes
    to be sent with your certificate request
    A challenge password []: server_key_password
    An optional company name []:
    
  6. Sign the server certificate with your CA. This creates the server certificate server.crt.

    $ openssl x509 -req -in server_reqout.txt -days 3650 -sha1 -CAcreateserial -CA root.crt \
        -CAkey root.key -out server.crt
        Signature ok
        subject=/C=US/ST=MA/L=Cambridge/O=My Company/CN=*.mycompany.com/emailAddress=myemail@mycompany.com
        Getting CA Private Key
    
  7. Set the appropriate permissions for the key and certificate.

    $ chmod 600 server.key
    $ chmod 644 server.crt
    

Create a client key and certificate (mutual mode only)

In mutual mode, clients and servers verify each other's certificates before establishing a connection. The following procedure creates a client key and certificate to present to Vertica. The certificate must be signed by a CA that Vertica trusts.

The steps for this are identical to those above for creating a server key and certificate for Vertica.

$ openssl genrsa -out client.key
Generating RSA private key, 2048 bit long modulus
................................................................+++
..............................+++
e is 65537 (0x10001)

$ openssl req -new -key client.key -out client_reqout.txt
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:MA
Locality Name (eg, city) []:Cambridge
Organization Name (eg, company) [Internet Widgits Pty Ltd]:My Company
Organizational Unit Name (eg, section) []:
Common Name (e.g. server FQDN or YOUR name) []:*.mycompany.com
Email Address []:myemail@mycompany.com

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []: server_key_password
An optional company name []:

$ openssl x509 -req -in client_reqout.txt -days 3650 -sha1 -CAcreateserial -CA root.crt \
  -CAkey root.key -out client.crt
Signature ok
subject=/C=US/ST=MA/L=Cambridge/O=My Company/CN=*.mycompany.com/emailAddress=myemail@mycompany.com
Getting CA Private Key

$ chmod 600 client.key
$ chmod 644 client.crt

Set up mutual mode client-server TLS

Configure Vertica for mutual mode

The following keys and certificates must be imported and then distributed to the nodes in your Vertica cluster with the TLS Configuration for mutual mode:

  • root.key

  • root.crt

  • server.key

  • server.crt

You can view existing keys and certificates by querying CRYPTOGRAPHIC_KEYS and CERTIFICATES.
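
For example, a minimal check of what is already loaded (both tables contain additional columns beyond name):

=> SELECT name FROM CRYPTOGRAPHIC_KEYS;
=> SELECT name FROM CERTIFICATES;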

  1. Import the server and root keys and certificates into Vertica with CREATE KEY and CREATE CERTIFICATE. See Generating TLS certificates and keys for details.

    => CREATE KEY imported_key TYPE 'RSA' AS '-----BEGIN PRIVATE KEY-----...-----END PRIVATE KEY-----';
    => CREATE CA CERTIFICATE imported_ca AS '-----BEGIN CERTIFICATE-----...-----END CERTIFICATE-----';
    => CREATE CERTIFICATE imported_cert AS '-----BEGIN CERTIFICATE-----...-----END CERTIFICATE-----';
    

    In this example, \set is used to retrieve the contents of root.crt, server.key, and server.crt.

    => \set ca_cert ''''`cat root.crt`''''
    => \set serv_key ''''`cat server.key`''''
    => \set serv_cert ''''`cat server.crt`''''
    
    => CREATE CA CERTIFICATE root_ca AS :ca_cert;
    CREATE CERTIFICATE
    => CREATE KEY server_key TYPE 'RSA' AS :serv_key;
    CREATE KEY
    => CREATE CERTIFICATE server_cert AS :serv_cert;
    CREATE CERTIFICATE
    
  2. Follow the steps for Mutual Mode in Configuring client-server TLS to set the proper TLSMODE and TLS Configuration parameters, as in the sketch below.
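
For example, a sketch of step 2 using the certificates imported above (exact syntax can vary by Vertica version; the TRY_VERIFY mode asks clients for a certificate and verifies it when one is presented):

=> ALTER TLS CONFIGURATION server CERTIFICATE server_cert ADD CA CERTIFICATES root_ca TLSMODE 'TRY_VERIFY';
ALTER TLS CONFIGURATION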

Configure a client for mutual mode

Clients must have their private key, certificate, and CA certificate. The certificate will be presented to Vertica when establishing a connection, and the CA certificate will be used to verify the server certificate from Vertica.

This example configures the vsql client for mutual mode.

  1. Create a .vsql directory in the user's home directory.

    $ mkdir ~/.vsql
    
  2. Copy client.key, client.crt, and root.crt to the ~/.vsql directory.

    $ cp client.key client.crt root.crt ~/.vsql
    
  3. Log into Vertica with vsql and query the SESSIONS system table to verify that the connection is using mutual mode:

    $ vsql
    Password: user-password
    Welcome to vsql, the Vertica Analytic Database interactive terminal.
    
    Type:  \h or \? for help with vsql commands
           \g or terminate with semicolon to execute query
           \q to quit
    
    SSL connection (cipher: DHE-RSA-AES256-GCM-SHA384, bits: 256, protocol: TLSv1.2)
    
    => select user_name,ssl_state from sessions;
     user_name | ssl_state
    -----------+-----------
     dbadmin   | Mutual
    (1 row)
    

Configure Kafka for TLS

Configure the Kafka brokers

This procedure configures Kafka to use TLS with client connections. You can also configure Kafka to use TLS to communicate between brokers. However, inter-broker TLS has no impact on establishing an encrypted connection between Vertica and Kafka.

  1. Create a truststore file for all of your Kafka brokers, importing your CA certificate. This example uses the self-signed root.crt created above.

    $ keytool -keystore kafka.truststore.jks -alias CARoot -import \
                   -file root.crt
    Enter keystore password: some_password
    Re-enter new password: some_password
    Owner: EMAILADDRESS=myemail@mycompany.com, CN=*.mycompany.com, O=MyCompany, L=Cambridge, ST=MA, C=US
    Issuer: EMAILADDRESS=myemail@mycompany.com, CN=*.mycompany.com, O=MyCompany, L=Cambridge, ST=MA, C=US
    Serial number: c3f02e87707d01aa
    Valid from: Fri Mar 22 13:37:37 EDT 2019 until: Sun Apr 21 13:37:37 EDT 2019
    Certificate fingerprints:
             MD5:  73:B1:87:87:7B:FE:F1:6E:94:55:FD:AF:5D:D0:C3:0C
             SHA1: C0:69:1C:93:54:21:87:C7:03:93:FE:39:45:66:DE:22:18:7E:CD:94
             SHA256: 23:03:BB:B7:10:12:50:D9:C5:D0:B7:58:97:41:1E:0F:25:A0:DB:
                     D0:1E:7D:F9:6E:60:8F:79:A6:1C:3F:DD:D5
    Signature algorithm name: SHA256withRSA
    Subject Public Key Algorithm: 2048-bit RSA key
    Version: 3
    
    Extensions:
    
    #1: ObjectId: 2.5.29.35 Criticality=false
    AuthorityKeyIdentifier [
    KeyIdentifier [
    0000: 50 69 11 64 45 E9 CC C5   09 EE 26 B5 3E 71 39 7C  Pi.dE.....&.>q9.
    0010: E5 3D 78 16                                        .=x.
    ]
    ]
    
    #2: ObjectId: 2.5.29.19 Criticality=false
    BasicConstraints:[
      CA:true
      PathLen:2147483647
    ]
    
    #3: ObjectId: 2.5.29.14 Criticality=false
    SubjectKeyIdentifier [
    KeyIdentifier [
    0000: 50 69 11 64 45 E9 CC C5   09 EE 26 B5 3E 71 39 7C  Pi.dE.....&.>q9.
    0010: E5 3D 78 16                                        .=x.
    ]
    ]
    
    Trust this certificate? [no]:  yes
    Certificate was added to keystore
    
  2. Create a keystore file for the Kafka broker named kafka01. Each broker's keystore should be unique.

    The keytool command adds a Subject Alternative Name (SAN) used as a fallback when establishing a TLS connection. Use your Kafka broker's fully-qualified domain name (FQDN) as the value for the SAN and for the "What is your first and last name?" prompt.

    In this example, the FQDN is kafka01.mycompany.com. The alias for keytool is set to localhost, so local connections to the broker use TLS.

    $ keytool -keystore kafka01.keystore.jks -alias localhost -validity 365 -genkey -keyalg RSA \
          -ext SAN=DNS:kafka01.mycompany.com
    Enter keystore password: some_password
    Re-enter new password: some_password
    What is your first and last name?
      [Unknown]:  kafka01.mycompany.com
    What is the name of your organizational unit?
      [Unknown]:
    What is the name of your organization?
      [Unknown]: MyCompany
    What is the name of your City or Locality?
      [Unknown]:  Cambridge
    What is the name of your State or Province?
      [Unknown]:  MA
    What is the two-letter country code for this unit?
      [Unknown]:  US
    Is CN=Database Admin, OU=MyCompany, O=Unknown, L=Cambridge, ST=MA, C=US correct?
      [no]:  yes
    
    Enter key password for <localhost>
            (RETURN if same as keystore password):
    
  3. Export the Kafka broker's certificate. In this example, the certificate is exported as kafka01.unsigned.crt.

    $ keytool -keystore kafka01.keystore.jks -alias localhost \
                    -certreq -file kafka01.unsigned.crt
     Enter keystore password: some_password
    
  4. Sign the broker's certificate with the CA certificate.

    $ openssl x509 -req -CA root.crt -CAkey root.key -in kafka01.unsigned.crt \
                 -out kafka01.signed.crt -days 365 -CAcreateserial
    Signature ok
    subject=/C=US/ST=MA/L=Cambridge/O=Unknown/OU=MyCompany/CN=Database Admin
    Getting CA Private Key
    
  5. Import the CA certificate into the broker's keystore.

    $ keytool -keystore kafka01.keystore.jks -alias CARoot -import -file root.crt
    Enter keystore password: some_password
    Owner: EMAILADDRESS=myemail@mycompany.com, CN=*.mycompany.com, O=MyCompany, L=Cambridge, ST=MA, C=US
    Issuer: EMAILADDRESS=myemail@mycompany.com, CN=*.mycompany.com, O=MyCompany, L=Cambridge, ST=MA, C=US
    Serial number: c3f02e87707d01aa
    Valid from: Fri Mar 22 13:37:37 EDT 2019 until: Sun Apr 21 13:37:37 EDT 2019
    Certificate fingerprints:
             MD5:  73:B1:87:87:7B:FE:F1:6E:94:55:FD:AF:5D:D0:C3:0C
             SHA1: C0:69:1C:93:54:21:87:C7:03:93:FE:39:45:66:DE:22:18:7E:CD:94
             SHA256: 23:03:BB:B7:10:12:50:D9:C5:D0:B7:58:97:41:1E:0F:25:A0:DB:D0:1E:7D:F9:6E:60:8F:79:A6:1C:3F:DD:D5
    Signature algorithm name: SHA256withRSA
    Subject Public Key Algorithm: 2048-bit RSA key
    Version: 3
    
    Extensions:
    
    #1: ObjectId: 2.5.29.35 Criticality=false
    AuthorityKeyIdentifier [
    KeyIdentifier [
    0000: 50 69 11 64 45 E9 CC C5   09 EE 26 B5 3E 71 39 7C  Pi.dE.....&.>q9.
    0010: E5 3D 78 16                                        .=x.
    ]
    ]
    
    #2: ObjectId: 2.5.29.19 Criticality=false
    BasicConstraints:[
      CA:true
      PathLen:2147483647
    ]
    
    #3: ObjectId: 2.5.29.14 Criticality=false
    SubjectKeyIdentifier [
    KeyIdentifier [
    0000: 50 69 11 64 45 E9 CC C5   09 EE 26 B5 3E 71 39 7C  Pi.dE.....&.>q9.
    0010: E5 3D 78 16                                        .=x.
    ]
    ]
    
    Trust this certificate? [no]:  yes
    Certificate was added to keystore
    
  6. Import the signed Kafka broker certificate into the keystore.

    $ keytool -keystore kafka01.keystore.jks -alias localhost \
                    -import -file kafka01.signed.crt
    Enter keystore password: some_password
    Owner: CN=Database Admin, OU=MyCompany, O=Unknown, L=Cambridge, ST=MA, C=US
    Issuer: EMAILADDRESS=myemail@mycompany.com, CN=*.mycompany.com, O=MyCompany, L=Cambridge, ST=MA, C=US
    Serial number: b4bba9a1828ecaaf
    Valid from: Tue Mar 26 12:26:34 EDT 2019 until: Wed Mar 25 12:26:34 EDT 2020
    Certificate fingerprints:
                MD5:  17:EA:3E:15:B4:15:E9:93:67:EE:59:C0:4F:D1:4C:01
                SHA1: D5:35:B7:F7:44:7C:D6:B4:56:6F:38:2D:CD:3A:16:44:19:C1:06:B7
                SHA256: 25:8C:46:03:60:A7:4C:10:A8:12:8E:EA:4A:FA:42:1D:A8:C5:FB:65:81:74:CB:46:FD:B1:33:64:F2:A3:46:B0
    Signature algorithm name: SHA256withRSA
    Subject Public Key Algorithm: 2048-bit RSA key
    Version: 1
    Trust this certificate? [no]:  yes
    Certificate was added to keystore
    
  7. If you are not logged into the Kafka broker for which you prepared the keystore, copy the truststore and keystore to it using scp. If you have already decided where to store the keystore and truststore files in the broker's filesystem, you can directly copy them to their final destination. This example just copies them to the root user's home directory temporarily. The next step moves them into their final location.

    $ scp kafka.truststore.jks kafka01.keystore.jks root@kafka01.mycompany.com:
    root@kafka01.mycompany.com's password: root_password
    kafka.truststore.jks                              100% 1048     1.0KB/s   00:00
    kafka01.keystore.jks                              100% 3277     3.2KB/s   00:00
    
  8. Repeat steps 2 through 7 for the remaining Kafka brokers.

Allow Kafka to read the keystore and truststore

If you did not copy the truststore and keystore to a directory where Kafka can read them in the previous step, you must copy them to a final location on the broker. You must also allow the user account you use to run Kafka to read these files. The easiest way to ensure the user's access is to give this user ownership of these files.

In this example, Kafka is run by the Linux user kafka. If you use another user to run Kafka, be sure to set the permissions on the truststore and keystore files appropriately.

  1. Log into the Kafka broker as root.

  2. Copy the truststore and keystore to a directory where Kafka can access them. There is no set location for these files: you can choose a directory under /etc, or some other location where configuration files are usually stored. This example copies them from root's home directory to Kafka's configuration directory named /opt/kafka/config/. In your own system, this configuration directory may be in a different location depending on how you installed Kafka.

    ~# cd /opt/kafka/config/
    /opt/kafka/config# cp /root/kafka01.keystore.jks /root/kafka.truststore.jks .
    
  3. If you aren't logged in as the user account that runs Kafka, change the ownership of the truststore and keystore files. This example changes the ownership from root (the user currently logged in) to the kafka user:

    /opt/kafka/config# ls -l
    total 80
    ...
    -rw-r--r-- 1 kafka nogroup 1221 Feb 21  2018 consumer.properties
    -rw------- 1 root  root    3277 Mar 27 08:03 kafka01.keystore.jks
    -rw-r--r-- 1 root  root    1048 Mar 27 08:03 kafka.truststore.jks
    -rw-r--r-- 1 kafka nogroup 4727 Feb 21  2018 log4j.properties
    ...
    /opt/kafka/config# chown kafka kafka01.keystore.jks kafka.truststore.jks
    /opt/kafka/config# ls -l
    total 80
    ...
    -rw-r--r-- 1 kafka nogroup 1221 Feb 21  2018 consumer.properties
    -rw------- 1 kafka root    3277 Mar 27 08:03 kafka01.keystore.jks
    -rw-r--r-- 1 kafka root    1048 Mar 27 08:03 kafka.truststore.jks
    -rw-r--r-- 1 kafka nogroup 4727 Feb 21  2018 log4j.properties
    ...
    
  4. Repeat steps 1 through 3 for the remaining Kafka brokers.

Configure Kafka to use TLS

With the truststore and keystore in place, your next step is to edit Kafka's server.properties configuration file to tell Kafka to use TLS/SSL encryption. This file is usually stored in the Kafka config directory. The location of this directory depends on how you installed Kafka. In this example, the file is located in /opt/kafka/config.

When editing the files, be sure you do not change their ownership. The best way to ensure Linux does not change the files' ownership is to use su to become the user account that runs Kafka, assuming you are not already logged in as that user:

/opt/kafka/config# su -s /bin/bash kafka

The server.properties file contains Kafka broker settings in a property=value format. To configure the Kafka broker to use SSL, alter or add the following property settings:

listeners
Host names and ports on which the Kafka broker listens. If you are not using SSL for connections between brokers, you must supply both a PLAINTEXT and an SSL option. For example:

listeners=PLAINTEXT://hostname:9092,SSL://hostname:9093

ssl.keystore.location
Absolute path to the keystore file.
ssl.keystore.password
Password for the keystore file.
ssl.key.password
Password for the Kafka broker's key in the keystore. You can make this password different than the keystore password if you choose.
ssl.truststore.location
Location of the truststore file.
ssl.truststore.password
Password to access the truststore.
ssl.enabled.protocols
TLS/SSL protocols that Kafka allows clients to use.
ssl.client.auth
Specifies whether SSL client authentication is required or optional. The most secure value for this setting is required, which requires clients to verify their identity to the broker.

This example configures Kafka to verify client identities via SSL authentication. It does not use SSL to communicate with other brokers, so the server.properties file defines both SSL and PLAINTEXT listener ports. It does not supply a host name for the listener ports, which tells Kafka to listen on the default network interface.

The lines added to the kafka01 broker's copy of server.properties for this configuration are:

listeners=PLAINTEXT://:9092,SSL://:9093
ssl.keystore.location=/opt/kafka/config/kafka01.keystore.jks
ssl.keystore.password=vertica
ssl.key.password=vertica
ssl.truststore.location=/opt/kafka/config/kafka.truststore.jks
ssl.truststore.password=vertica
ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1
ssl.client.auth=required

You must make these changes to the server.properties file on all of your brokers.

After making your changes to your broker's server.properties files, restart Kafka. How you restart Kafka depends on your installation:

  • If Kafka is running as part of a Hadoop cluster, you can usually restart it from within whatever interface you use to control Hadoop (such as Ambari).

  • If you installed Kafka directly, you can restart it either by directly running the kafka-server-stop.sh and kafka-server-start.sh scripts or via the Linux system's service control commands (such as systemctl), as shown in the example below. You must restart each broker.
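
For example, on a direct installation in /opt/kafka, restarting a single broker from the shell might look like this:

/opt/kafka# bin/kafka-server-stop.sh
/opt/kafka# bin/kafka-server-start.sh -daemon config/server.properties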

Test the configuration

If you have not configured client authentication, you can quickly test whether Kafka can access its keystore by running the command:

$ openssl s_client -debug -connect broker_host_name:9093 -tls1

If Kafka is able to access its keystore, this command will output a dump of the broker's certificate (exit with CTRL+C):

# openssl s_client -debug -connect kafka01.mycompany.com:9093 -tls1
CONNECTED(00000003)
write to 0xa4e4f0 [0xa58023] (197 bytes => 197 (0xC5))
0000 - 16 03 01 00 c0 01 00 00-bc 03 01 76 85 ed f0 fe   ...........v....
0010 - 60 60 7e 78 9d d4 a8 f7-e6 aa 5c 80 b9 a7 37 61   ``~x......\...7a
0020 - 8e 04 ac 03 6d 52 86 f5-84 4b 5c 00 00 62 c0 14   ....mR...K\..b..
0030 - c0 0a 00 39 00 38 00 37-00 36 00 88 00 87 00 86   ...9.8.7.6......
0040 - 00 85 c0 0f c0 05 00 35-00 84 c0 13 c0 09 00 33   .......5.......3
0050 - 00 32 00 31 00 30 00 9a-00 99 00 98 00 97 00 45   .2.1.0.........E
0060 - 00 44 00 43 00 42 c0 0e-c0 04 00 2f 00 96 00 41   .D.C.B...../...A
0070 - c0 11 c0 07 c0 0c c0 02-00 05 00 04 c0 12 c0 08   ................
0080 - 00 16 00 13 00 10 00 0d-c0 0d c0 03 00 0a 00 ff   ................
0090 - 01 00 00 31 00 0b 00 04-03 00 01 02 00 0a 00 1c   ...1............
00a0 - 00 1a 00 17 00 19 00 1c-00 1b 00 18 00 1a 00 16   ................
00b0 - 00 0e 00 0d 00 0b 00 0c-00 09 00 0a 00 23 00 00   .............#..
00c0 - 00 0f 00 01 01                                    .....
read from 0xa4e4f0 [0xa53ad3] (5 bytes => 5 (0x5))
0000 - 16 03 01 08 fc                                    .....
             . . .

The above method is not conclusive, however; it only tells you if Kafka is able to find its keystore.

The best test of whether Kafka is able to accept TLS connections is to configure the command-line Kafka producer and consumer. In order to configure these tools, you must first create a client keystore. These steps are identical to creating a broker keystore.

  1. Create the client keystore:

    keytool -keystore client.keystore.jks -alias localhost -validity 365 -genkey -keyalg RSA -ext SAN=DNS:fqdn_of_client_system
    
  2. Respond to the "What is your first and last name?" with the FQDN of the system you will use to run the producer and/or consumer. Answer the rest of the prompts with the details of your organization.

  3. Export the client certificate so it can be signed:

    keytool -keystore client.keystore.jks -alias localhost -certreq -file client.unsigned.cert
    
  4. Sign the client certificate with the root CA:

    openssl x509 -req -CA root.crt -CAkey root.key -in client.unsigned.cert -out client.signed.cert \
            -days 365 -CAcreateserial
    
  5. Add the root CA to keystore:

    keytool -keystore client.keystore.jks -alias CARoot -import -file root.crt
    
  6. Add the signed client certificate to the keystore:

    keytool -keystore client.keystore.jks -alias localhost -import -file client.signed.cert
    
  7. Copy the keystore to a location where you will use it. For example, you could choose to copy it to the same directory where you copied the keystore for the Kafka broker. If you choose to copy it to some other location, or intend to use some other user to run the command-line clients, be sure to add a copy of the truststore file you created for the brokers. Clients can reuse this truststore file for authenticating the Kafka brokers because the same CA is used to sign all of the certificates. Also set the file's ownership and permissions accordingly.

Next, you must create a properties file (similar to the broker's server.properties file) that configures the command-line clients to use TLS. For a client running on the Kafka broker named kafka01, your configuration file could look like this:

security.protocol=SSL
ssl.truststore.location=/opt/kafka/config/kafka.truststore.jks
ssl.truststore.password=truststore_password
ssl.keystore.location=/opt/kafka/config/client.keystore.jks
ssl.keystore.password=keystore_password
ssl.key.password=key_password
ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1
ssl.client.auth=required

This property file assumes the keystore file is located in the Kafka configuration directory.

Finally, you can run the command line producer or consumer to ensure they can connect and process data. You supply these clients the properties file you just created. The following example assumes you stored the properties file in the Kafka configuration directory, and that Kafka is installed in /opt/kafka:

~# cd /opt/kafka


/opt/kafka# bin/kafka-console-producer.sh --broker-list kafka01.mycompany.com:9093  \
                                          --topic test --producer.config config/client.properties
>test
>test again
>More testing. These messages seem to be getting through!
^D
/opt/kafka# bin/kafka-console-consumer.sh --bootstrap-server kafka01.mycompany.com:9093  --topic test \
                                          --consumer.config config/client.properties --from-beginning
test
test again
More testing. These messages seem to be getting through!
^C
Processed a total of 3 messages

Loading data from Kafka

After you configure Kafka to accept TLS connections, verify that you can directly load data from it into Vertica. You should perform this step even if you plan to create a scheduler to automatically stream data.

You can choose to create a separate key and certificate for directly loading data from Kafka. This example re-uses the key and certificate created for the Vertica server in part 2 of this example.

You directly load data from Kafka by using the KafkaSource data source function with the COPY statement (see Manually consume data from Kafka). The KafkaSource function creates the connection to Kafka, so it needs a key, certificate, and related passwords to create an encrypted connection. You pass this information via session parameters. See Kafka user-defined session parameters for a list of these parameters.

The easiest way to get the key and certificate into the parameters is to first read them into vsql variables. You get their contents by using backquotes to read the files via the Linux shell. Then you set the session parameters from the variables. Before setting the session parameters, increase the MaxSessionUDParameterSize session parameter so the session variables have enough storage space for the key and the certificates. They can be larger than the default size limit for session variables (1000 bytes).

The following example reads the server key and certificate and the root CA from a directory named /home/dbadmin/SSL. Because the server's key password is not saved in a file, the example sets it in a Linux environment variable named KVERTICA_PASS before running vsql. The example sets MaxSessionUDParameterSize to 100000 before setting the session variables. Finally, it enables TLS for the Kafka connection and streams data from the topic named test.

$ export KVERTICA_PASS=server_key_password
$ vsql
Password:
Welcome to vsql, the Vertica Analytic Database interactive terminal.

Type:  \h or \? for help with vsql commands
       \g or terminate with semicolon to execute query
       \q to quit

SSL connection (cipher: DHE-RSA-AES256-GCM-SHA384, bits: 256, protocol: TLSv1.2)

=> \set cert '\''`cat /home/dbadmin/SSL/server.crt`'\''
=> \set pkey '\''`cat /home/dbadmin/SSL/server.key`'\''
=> \set ca '\''`cat /home/dbadmin/SSL/root.crt`'\''
=> \set pass '\''`echo $KVERTICA_PASS`'\''
=> alter session set MaxSessionUDParameterSize=100000;
ALTER SESSION
=> ALTER SESSION SET UDPARAMETER kafka_SSL_Certificate=:cert;
ALTER SESSION
=> ALTER SESSION SET UDPARAMETER kafka_SSL_PrivateKey_secret=:pkey;
ALTER SESSION
=> ALTER SESSION SET UDPARAMETER kafka_SSL_PrivateKeyPassword_secret=:pass;
ALTER SESSION
=> ALTER SESSION SET UDPARAMETER kafka_SSL_CA=:ca;
ALTER SESSION
=> ALTER SESSION SET UDPARAMETER kafka_Enable_SSL=1;
ALTER SESSION
=> CREATE TABLE t (a VARCHAR);
CREATE TABLE
=> COPY t SOURCE KafkaSource(brokers='kafka01.mycompany.com:9093',
                             stream='test|0|-2', stop_on_eof=true,
                             duration=interval '5 seconds')
          PARSER KafkaParser();
 Rows Loaded
-------------
           3
(1 row)

=> SELECT * FROM t;
                            a
---------------------------------------------------------
 test again
 More testing. These messages seem to be getting through!
 test

(3 rows)

Configure the scheduler

The final piece of the configuration is to set up the scheduler to use SSL when communicating with Kafka (and optionally with Vertica). When the scheduler runs a COPY command to get data from Kafka, it uses its own key and certificate to authenticate with Kafka. If you choose to have the scheduler use TLS/SSL to connect to Vertica, it can reuse the same keystore and truststore to make this connection.

Create a truststore and keystore for the scheduler

Because the scheduler is a separate component, it must have its own key and certificate. The scheduler runs in Java and uses the JDBC interface to connect to Vertica. Therefore, you must create a keystore (when ssl.client.auth=required) and a truststore for it to use when making a TLS-encrypted connection to Vertica.

Keep in mind that creating a keystore is optional if your Kafka server sets ssl.client.auth to none or requested.

This process is similar to creating the truststores and keystores for Kafka brokers. The main difference is using the -dname option for keytool to set the Common Name (CN) for the key to a domain wildcard. Using this setting allows the key and certificate to match any host in the network. This option is especially useful if you run multiple schedulers on different servers to provide redundancy. The schedulers can use the same key and certificate, no matter which server in your domain they run on.

  1. Create a truststore file for the scheduler. Add the CA certificate that you used to sign the certificates of the Kafka cluster and the Vertica cluster. If you are using more than one CA to sign your certificates, add all of the CAs you used.

    $ keytool -keystore scheduler.truststore.jks -alias CARoot -import \
                   -file root.crt
    Enter keystore password: some_password
    Re-enter new password: some_password
    Owner: EMAILADDRESS=myemail@mycompany.com, CN=*.mycompany.com, O=MyCompany, L=Cambridge, ST=MA, C=US
    Issuer: EMAILADDRESS=myemail@mycompany.com, CN=*.mycompany.com, O=MyCompany, L=Cambridge, ST=MA, C=US
    Serial number: c3f02e87707d01aa
    Valid from: Fri Mar 22 13:37:37 EDT 2019 until: Sun Apr 21 13:37:37 EDT 2019
    Certificate fingerprints:
             MD5:  73:B1:87:87:7B:FE:F1:6E:94:55:FD:AF:5D:D0:C3:0C
             SHA1: C0:69:1C:93:54:21:87:C7:03:93:FE:39:45:66:DE:22:18:7E:CD:94
             SHA256: 23:03:BB:B7:10:12:50:D9:C5:D0:B7:58:97:41:1E:0F:25:A0:DB:
                     D0:1E:7D:F9:6E:60:8F:79:A6:1C:3F:DD:D5
    Signature algorithm name: SHA256withRSA
    Subject Public Key Algorithm: 2048-bit RSA key
    Version: 3
    
    Extensions:
    
    #1: ObjectId: 2.5.29.35 Criticality=false
    AuthorityKeyIdentifier [
    KeyIdentifier [
    0000: 50 69 11 64 45 E9 CC C5   09 EE 26 B5 3E 71 39 7C  Pi.dE.....&.>q9.
    0010: E5 3D 78 16                                        .=x.
    ]
    ]
    
    #2: ObjectId: 2.5.29.19 Criticality=false
    BasicConstraints:[
      CA:true
      PathLen:2147483647
    ]
    
    #3: ObjectId: 2.5.29.14 Criticality=false
    SubjectKeyIdentifier [
    KeyIdentifier [
    0000: 50 69 11 64 45 E9 CC C5   09 EE 26 B5 3E 71 39 7C  Pi.dE.....&.>q9.
    0010: E5 3D 78 16                                        .=x.
    ]
    ]
    
    Trust this certificate? [no]:  yes
    Certificate was added to keystore
    
  2. Initialize the keystore, passing it a wildcard host name as the Common Name. The alias parameter in this command is important, as you use it later to identify the key the scheduler must use when creating SSL connections:

    keytool -keystore scheduler.keystore.jks -alias vsched -validity 365 -genkey \
            -keyalg RSA  -dname CN=*.mycompany.com
    
  3. Export the scheduler's key so you can sign it with the root CA:

    $ keytool -keystore scheduler.keystore.jks -alias vsched -certreq \
            -file scheduler.unsigned.cert
    
  4. Sign the scheduler key with the root CA:

    $ openssl x509 -req -CA root.crt -CAkey root.key -in scheduler.unsigned.cert \
            -out scheduler.signed.cert -days 365 -CAcreateserial
    
  5. Import the signed certificate back into the keystore, using the same alias you assigned to the key:

    $ keytool -keystore scheduler.keystore.jks -alias vsched -import -file scheduler.signed.cert
    

Set environment variable VKCONFIG_JVM_OPTS

You must pass several settings to the JDBC interface of the Java Virtual Machine (JVM) that runs the scheduler. These settings tell the JDBC driver where to find the keystore and truststore, as well as the key's password. The easiest way to pass in these settings is to set a Linux environment variable named VKCONFIG_JVM_OPTS. As it starts, the scheduler checks this environment variable and passes any properties defined in it to the JVM.

The properties that you need to set are:

  • javax.net.ssl.keyStore: The absolute path to the keystore file to use.

  • javax.net.ssl.keyStorePassword: The password for the scheduler's key.

  • javax.net.ssl.trustStore: The absolute path to the truststore file.

The Linux command line to set the environment variable is:

export VKCONFIG_JVM_OPTS="$VKCONFIG_JVM_OPTS -Djavax.net.ssl.trustStore=/path/to/truststore \
                          -Djavax.net.ssl.keyStore=/path/to/keystore \
                          -Djavax.net.ssl.keyStorePassword=keystore_password"

For example, suppose the scheduler's truststore and keystore are located in the directory /home/dbadmin/SSL. Then you could use the following command to set the VKCONFIG_JVM_OPTS variable:

$ export VKCONFIG_JVM_OPTS="$VKCONFIG_JVM_OPTS \
                           -Djavax.net.ssl.trustStore=/home/dbadmin/SSL/scheduler.truststore.jks \
                           -Djavax.net.ssl.keyStore=/home/dbadmin/SSL/scheduler.keystore.jks \
                           -Djavax.net.ssl.keyStorePassword=key_password"

To ensure that this variable is always set, add the command to the ~/.bashrc or other startup file of the user account that runs the scheduler.

If you require TLS on the JDBC connection to Vertica, add TLSmode=require to the JDBC URL that the scheduler uses. The easiest way to add this is to use the scheduler's --jdbc-url option. Assuming that you use a configuration file for your scheduler, you can add this line to it:

--jdbc-url=jdbc:vertica://VerticaHost:portNumber/databaseName?user=username&password=password&TLSmode=require

For more information about using the JDBC with Vertica, see Java.

Enable TLS in the scheduler configuration

Lastly, enable TLS. Every time you run vkconfig, you must pass it the following options:

--enable-ssl
true, to enable the scheduler to use SSL when connecting to Kafka.
--ssl-ca-alias
Alias for the CA you used to sign your Kafka broker's keys. This must match the value you supplied to the -alias argument of the keytool command to import the CA into the truststore.
--ssl-key-alias
Alias assigned to the scheduler key. This value must match the -alias you supplied to the keytool command when creating the scheduler's keystore.
--ssl-key-password
Password for the scheduler key.

See Common vkconfig script options for details of these options. For convenience and security, add these options to a configuration file that you pass to vkconfig. Otherwise, you run the risk of exposing the key password via the process list, which other users on the same system can view. See Configuration File Format for more information on setting up a configuration file.

Add the following to the scheduler configuration file to allow it to use the keystore and truststore and enable TLS when connecting to Vertica:

enable-ssl=true
ssl-ca-alias=CARoot
ssl-key-alias=vsched
ssl-key-password=vertica
jdbc-url=jdbc:vertica://VerticaHost:portNumber/databaseName?user=username&password=password&TLSmode=require

Start the scheduler

Once you have configured the scheduler to use SSL, start it and verify that it can load data. For example, to start the scheduler with a configuration file named weblog.conf, use the command:

$ nohup vkconfig launch --conf weblog.conf >/dev/null 2>&1 &

5 - Troubleshooting Kafka TLS/SSL connection issues

After configuring Vertica, Kafka, and your scheduler to use TLS/SSL authentication and encryption, you may encounter issues with data streaming. This section explains some of the more common errors you may encounter and how to troubleshoot them.

Errors when launching the scheduler

You may see errors like this when launching the scheduler:

$ vkconfig launch --conf weblog.conf
java.sql.SQLNonTransientException: com.vertica.solutions.kafka.exception.ConfigurationException:
       No keystore system property found: null
    at com.vertica.solutions.kafka.util.SQLUtilities.getConnection(SQLUtilities.java:181)
    at com.vertica.solutions.kafka.cli.CLIUtil.assertDBConnectionWorks(CLIUtil.java:40)
    at com.vertica.solutions.kafka.Launcher.run(Launcher.java:135)
    at com.vertica.solutions.kafka.Launcher.main(Launcher.java:263)
Caused by: com.vertica.solutions.kafka.exception.ConfigurationException: No keystore system property found: null
    at com.vertica.solutions.kafka.security.KeyStoreUtil.loadStore(KeyStoreUtil.java:77)
    at com.vertica.solutions.kafka.security.KeyStoreUtil.<init>(KeyStoreUtil.java:42)
    at com.vertica.solutions.kafka.util.SQLUtilities.getConnection(SQLUtilities.java:179)
    ... 3 more

The scheduler throws these errors when it cannot locate or read the keystore or truststore files. To resolve this issue:

  • Verify you have set the VKCONFIG_JVM_OPTS Linux environment variable. Without this variable, the scheduler will not know where to find the truststore and keystore to use when creating TLS/SSL connections. See Step 2: Set the VKCONFIG_JVM_OPTS Environment Variable for more information.

  • Verify that the keystore and truststore files are located in the path you set in the VKCONFIG_JVM_OPTS environment variable.

  • Verify that the user account that runs the scheduler has read access to the truststore and keystore files.

  • Verify that the key password you provide in the scheduler configuration is correct. Note that you must supply the password for the key, not the keystore. The sketch below shows some quick checks for these items.
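
For example, you can quickly check the environment variable, the file locations and permissions, and the keystore password from the shell (paths are placeholders):

$ echo $VKCONFIG_JVM_OPTS        # confirm the variable is set and lists the correct paths
$ ls -l /home/dbadmin/SSL/scheduler.truststore.jks /home/dbadmin/SSL/scheduler.keystore.jks
$ keytool -list -keystore /home/dbadmin/SSL/scheduler.keystore.jks   # prompts for the keystore password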

Another possible error message is a failure to set up a TLS Keystore:

Exception in thread "main" java.sql.SQLRecoverableException: [Vertica][VJDBC](100024) IOException while communicating with server: java.io.IOException: Failed to create an SSLSocketFactory when setting up TLS: keystore not found.
        at com.vertica.io.ProtocolStream.logAndConvertToNetworkException(Unknown Source)
        at com.vertica.io.ProtocolStream.enableSSL(Unknown Source)
        at com.vertica.io.ProtocolStream.initSession(Unknown Source)
        at com.vertica.core.VConnection.tryConnect(Unknown Source)
        at com.vertica.core.VConnection.connect(Unknown Source)
        . . .

This error can be caused by using a keystore or truststore file in a format other than JKS without supplying the correct file extension. If the scheduler does not recognize the file extension of your keystore or truststore file name, it assumes the file is in JKS format. If the file isn't in this format, the scheduler exits with the error message shown above. To correct this error, rename the keystore and truststore files to use the correct file extension. For example, if your files are in PKCS #12 format, change their file extension to .p12 or .pfx.

Data does not load

If you find that the scheduler is not loading data into your database, first query the stream_microbatch_history table to determine whether the scheduler is executing microbatches, and if so, what their results are. A faulty TLS/SSL configuration usually results in a status of NETWORK_ISSUE:

=> SELECT frame_start, end_reason, end_reason_message FROM weblog_sched.stream_microbatch_history;
       frame_start       |  end_reason   | end_reason_message
-------------------------+---------------+--------------------
 2019-04-05 11:35:18.365 | NETWORK_ISSUE |
 2019-04-05 11:35:38.462 | NETWORK_ISSUE |

If you suspect an SSL issue, you can verify that Vertica is establishing a connection to Kafka by looking at Kafka's server.log file. Failed SSL connection attempts appear in this log similar to the following example:

java.io.IOException: Unexpected status returned by SSLEngine.wrap, expected
        CLOSED, received OK. Will not send close message to peer.
        at org.apache.kafka.common.network.SslTransportLayer.close(SslTransportLayer.java:172)
        at org.apache.kafka.common.utils.Utils.closeAll(Utils.java:703)
        at org.apache.kafka.common.network.KafkaChannel.close(KafkaChannel.java:61)
        at org.apache.kafka.common.network.Selector.doClose(Selector.java:739)
        at org.apache.kafka.common.network.Selector.close(Selector.java:727)
        at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:520)
        at org.apache.kafka.common.network.Selector.poll(Selector.java:412)
        at kafka.network.Processor.poll(SocketServer.scala:551)
        at kafka.network.Processor.run(SocketServer.scala:468)
        at java.lang.Thread.run(Thread.java:748)

If you do not see errors of this sort, you likely have a network problem between Kafka and Vertica. If you do see these errors, consider the following debugging steps:

  • Verify that the configuration of your Kafka cluster is uniform. For example, you may see connection errors if some Kafka nodes are set to require client authentication and others aren't.

  • Verify that the Common Names (CN) in the certificates and keys match the host name of the system.

  • Verify that the Kafka cluster is accepting connections on the ports and host names you specify in the server.properties file's listeners property. For example, suppose you use IP addresses in this setting, but use host names when defining the cluster in the scheduler's configuration. Then Kafka may reject the connection attempt by Vertica or Vertica may reject the Kafka node's identity.

  • If you are using client authentication in Kafka, try turning it off to see if the scheduler can connect. If disabling authentication allows the scheduler to stream data, then you can isolate the problem to client authentication. In this case, review the certificates and CAs of both the Kafka cluster and the scheduler. Ensure that the truststores include all of the CAs used to sign the key, up to and including the root CA.

Avro schema registry and KafkaAvroParser

At minimum, the KafkaAvroParser requires the following parameters to create a TLS connection between the Avro Schema Registry and Vertica:

  • schema_registry_url with the https scheme

  • schema_registry_ssl_ca_path

If TLS access fails, determine which of the following TLS schema registry information Vertica requires:

  • Certificate Authority (CA)

  • TLS server certificate

  • TLS key

Provide only the necessary TLS schema registry information with KafkaAvroParser parameters. The TLS information must be accessible in the filesystem of each Vertica node that processes Avro data.

The following example shows how to pass these parameters to KafkaAvroParser:

KafkaAvroParser(
    schema_registry_url='https://localhost:8081',
    schema_registry_ssl_ca_path='path/to/certificate-authority',
    schema_registry_ssl_cert_path='path/to/tls-server-certificate',
    schema_registry_ssl_key_path='path/to/private-tls-key',
    schema_registry_ssl_key_password_path='path/to/private-tls-key-password'
)
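
For example, a sketch of these parameters in a complete load (the broker, registry host, topic, and target table are placeholders; only the CA path is passed because the registry in this sketch does not require client authentication):

=> COPY avro_table SOURCE KafkaSource(brokers='kafka01.mycompany.com:9093',
                                      stream='avro_topic|0|-2', stop_on_eof=true)
          PARSER KafkaAvroParser(
              schema_registry_url='https://schemaregistry.mycompany.com:8081',
              schema_registry_ssl_ca_path='/home/dbadmin/SSL/root.crt');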