
OpenText Core Analytics Database

This topic provides an overview of OTCAD.

OpenText™ Core Analytics Database (OTCAD) is a managed database service that delivers a unified, high-performance advanced analytics and machine learning platform in the cloud. The database is automatically provisioned in an OpenText AWS account, and OpenText manages the database hardware and environment. OTCAD is optimized for efficiency and query performance, ensuring you meet your service level agreements within your budget.

OTCAD enhances the speed and scalability of a database without the need to manage hardware. OTCAD is a data warehouse and data lake platform that enables organizations to deploy databases securely. To get access to OTCAD, contact your sales representative or contact OpenText.

OTCAD UI tour

You can view and manage the provisioned database. After you log in to the OTCAD application, the home page appears. This page displays the More options (three-dot) menu and other SQL editor options, as shown in this image:

OTCAD user interface

Option Description
a More options: Opens a menu with access to the SQL editor, Settings, Data pipelines, and the Integrations page.
b Stop: Displays the option to stop the database.
c History view: Displays previously run queries.
d Favorite view: Displays the bookmarked SQL queries.
e Help: Opens the OpenText™ Analytics Database user documentation.
f View profile: Displays the user profile and the option to sign out of the application.

More options

When you log in to the application, the More options button is displayed with these options:

  1. SQL editor: Select this option to work with the SQL editor.
  2. Settings: Select this option to configure the database settings.
  3. Data pipelines: Select this option to work with data pipelines.
  4. Integrations: Select this option to display the Integrations page.

Database connection options

The Home page displays the option to stop an active database.

  • To stop OTCAD, select Stop.
  • To restart an inactive database, select Start.

SQL Editor

Use the OTCAD SQL editor to write, edit, and execute SQL statements in a query view. You can write and execute analytical queries without using any external tools.

You can use the SQL editor to:

  • gain valuable insights from your data without needing expertise to set up database connections.
  • write, execute, and save queries that you need to repeat.
  • view sample data returned from your queries.
  • validate the execution of SQL queries.
  • view data about the execution of SQL queries.
  • export query results to a spreadsheet.
  • execute SQL commands for OTCAD operations, such as creating schemas and tables (see the example after this list).
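
The following is a minimal sketch of the kind of statements you can run in the SQL editor; the schema, table, and column names are hypothetical examples:

    -- Create a schema and a table for order data (names are examples)
    CREATE SCHEMA sales;
    CREATE TABLE sales.orders (
        order_id   INT,
        region     VARCHAR(50),
        amount     NUMERIC(10,2),
        ordered_at TIMESTAMP
    );

    -- An analytical query: total revenue per region
    SELECT region, SUM(amount) AS total_revenue
    FROM sales.orders
    GROUP BY region
    ORDER BY total_revenue DESC;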

Working with SQL editor

After logging in to the OTCAD application, the home page appears.

To launch the SQL editor, select SQL editor in the More options (three-dot) menu. The Schema view appears with a list of available schemas and tables. You can either select a table or create an SQL query.

Create an SQL query

You create SQL queries either in the Home page or the Schema view.

  1. Open a new query in one of the following ways:

    • Select Create query in the Schema view.
    • Select Create new query in the Home page.
  2. Enter your query in the SQL editor.

  3. Select Run.

  4. Enter the name of the query at the top.

  5. To view the schema details of a table, select View Schema. The table columns display along with the datatypes.

  6. To filter the SQL query results, select the filter icon. You can add a filter, search the results by the column name and the match type, and then select Apply.

  7. To save the SQL query results, do the following:
    a. Select Export CSV.
    b. Enter the file name and select Export.

    You can export a maximum of 10,000 records. To export data that exceeds 10,000 records, modify the SQL query and export again.
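
    For example, a minimal sketch of paging through a larger result set with LIMIT and OFFSET so that each export stays within the cap; the table name is a hypothetical example:

        -- First batch of up to 10,000 rows
        SELECT * FROM sales.orders ORDER BY order_id LIMIT 10000 OFFSET 0;

        -- Next batch of up to 10,000 rows
        SELECT * FROM sales.orders ORDER BY order_id LIMIT 10000 OFFSET 10000;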

History view

The History view tab stores previously run queries. When you run a query, it appears in the Query History list. Only queries from the last two weeks are displayed.

To run a SQL query:

  1. Select the SQL query.
  2. Select Run.

To remove a SQL query:

  1. Select the SQL query.
  2. Select Remove.

Favorite view

You can bookmark SQL queries and run them again later in the Favorite view tab. Saved SQL queries display as a list of favorites next to the History view tab.

To bookmark a SQL query as a favorite:

  1. Select the SQL query.
  2. Select Run.
  3. Select Favorite.
  4. Enter a name and select Add.

To run a SQL query from the Favorite view:

  1. Select the SQL query.
  2. Select Run. After successful execution of the query, the results are displayed in the SQL editor.

To rename a favorite SQL query:

  1. Select the SQL query.
  2. Select the More options icon.
  3. Select Rename.
  4. Enter the new name and select Update.

To remove a favorite SQL query:

  1. Select the query.
  2. Select the More options icon.
  3. Select Remove.
  4. When prompted for confirmation, select Remove.

1 - OpenText Core Analytics Database Settings Overview

This topic provides an overview of database configuration settings in OTCAD.

You can configure the default database configuration settings in OpenText™ Core Analytics Database (OTCAD).

  1. Log in to OTCAD.
  2. Select Settings from More options. The Settings Overview page opens. In this page, you can configure the following OTCAD settings:
    • Connection data: You can configure the default connection data settings in OTCAD.
    • CIDR: CIDR is fundamental to subnetting, the process of dividing a large network into smaller, more manageable subnetworks. By using different prefix lengths, you can create subnets with varying numbers of host addresses, enabling efficient use of IP address space.
    • Scaling: Database scaling is the ability of a database to obtain or reduce resources depending on the system's load, establishing a direct correlation between the allocation of resources and their demand. Scaling ensures consistency in the database server quality while maximizing resource usage.
  3. Select Connection Data.
  4. Copy this information to connect your database with other applications or clients:
    • Host DNS: The DNS name of the database host. Clients use this name to resolve and connect to your database.
    • Host Port: The host port is a network port on a computer or device that is used to establish a connection with another device or service, often for communication or data transfer.
    • Database Name: The unique name of the database.
    • Refresh Token: The refresh token is a special type of token used in authentication flows, particularly in OAuth 2.0, to obtain new access tokens when the current ones expire.
    • Access Token: An access token is a security credential that allows an application or user to access protected resources on a server. It acts as a digital key, verifying the user's or application's identity and authorization level, and is used to grant access without requiring the user to repeatedly provide their credentials.
  5. Select Close to close the Connection Details page.
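
A minimal sketch of how these values typically plug into a JDBC-style connection URL, assuming a Vertica-compatible client driver; the exact format depends on the client you use:

    jdbc:vertica://<Host DNS>:<Host Port>/<Database Name>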

1.1 - Connection Data Settings

This topic provides an overview of connection data settings in OTCAD.

You can configure the default connection data settings in OTCAD. You can manage IP addresses with Classless Inter-Domain Routing (CIDR) and scale your plan to suit your requirements.

1.2 - CIDR

This topic provides an overview of managing IP addresses with Classless Inter-Domain Routing (CIDR) in OTCAD.

CIDR

Cloud infrastructure engineers use CIDR notation to manage IP address ranges for cloud-based networks. IP addresses enable resources in your Virtual Private Cloud (VPC) to communicate with each other and with resources over the Internet. CIDR notation is a way to represent an IP address and its network mask.

OpenText supports ranges of IP addresses using Classless Inter-Domain Routing (CIDR) notation. For example, 192.168.1.0/24 represents all IPv4 addresses in the range of 192.168.1.0 to 192.168.1.255.
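
To see why: a /24 prefix fixes the first 24 of the 32 bits in an IPv4 address, leaving 32 - 24 = 8 host bits, so the block contains 2^8 = 256 addresses (192.168.1.0 through 192.168.1.255). A /16 block would likewise contain 2^16 = 65,536 addresses.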

The format of these addresses is as follows:

  • An individual IPv4 address is 32 bits, with 4 groups of up to 3 decimal digits. For example, 10.0.1.0.
  • An individual IPv6 address is 128 bits, with 8 groups of 4 hexadecimal digits. For example, 2001:0db8:85a3:0000:0000:8a2e:0370:7334.

Public IPv4 and IPv6 addresses

All subnets have an attribute that determines whether a network interface created in the subnet automatically receives a public IPv4 address. Therefore, when you launch an instance into a subnet that has this attribute enabled, a public IP address is assigned to the primary network interface that's created for the instance. You can control whether your instance receives a public IP address by doing the following:

  • Modifying the public IP addressing attribute of your subnet.
  • Enabling or disabling the public IP addressing feature during instance launch, which overrides the subnet's public IP addressing attribute.
  • Unassigning a public IP address from your instance after launch by managing the IP addresses associated with a network interface.

Public IPv6 addresses are IPv6 addresses that can be configured to remain private or configured to be reachable over the Internet.

Private IPv4 and IPv6 addresses

Private IPv4 addresses are not reachable over the Internet and can be used for communication between the instances in your VPC. When you launch an instance into a VPC, a primary private IP address from the IPv4 address range of the subnet is assigned to the primary network interface (for example, eth0) of the instance. Each instance is also given a private (internal) DNS hostname that resolves to the private IP address of the instance.

You cannot share private IPv6 addresses on the Internet. You can use a private IPv6 address if you want your private networks to support IPv6 and you have no intention of routing traffic from these addresses to the Internet. You can connect to the Internet from a resource that has a private IPv6 address, but you must route traffic through a resource in another subnet that has a public IPv6 address.

Configure CIDR settings

To configure CIDR settings:

  1. In the Settings Overview page, select View CIDR. The CIDR page displays.
  2. Select Private.
  3. To add an IP address, select Add IP.
  4. From the Value drop-down list, select either IPv4 or IPv6 depending on the IP address you want to add. Then do one of the following:
    a. Enter a valid IPv4 or IPv6 address.
    b. Copy a valid IPv4 or IPv6 address from another source and select Paste.
  5. Save the IP address.

1.3 - Scaling

This topic provides an overview of scaling your database in OTCAD.

Scaling

Workload scaling enhances application performance and resource efficiency in dynamic environments. Scaling ensures that applications can handle varying workloads without performance degradation. OpenText provides the ability to automatically scale resources up or down based on real-time metrics, allowing organizations to respond quickly to changes in traffic. This elasticity not only improves user experience but also optimizes resource utilization, helping to minimize costs associated with underused or overprovisioned resources.

You can add compute to your database to improve the performance of complex analytic queries that run for a long time. Adding compute helps these queries run faster.

Configure the compute of your database to suit your query requirements and workload performance needs.

To manage compute for your database:

  1. In the Settings Overview page, select Scaling.
  2. Select View Scaling. The horizontal database scaling options are displayed. You can scale the database either manually or automatically. Do one of the following:
    • To manually scale the database:
      a. Select Manual.
      b. Move the slider horizontally to scale the database to the required size: small, small-medium, medium, medium-large, or large.
      c. Select Apply changes.
      d. In the View scaling log area, click the Audit log and Scaling log tabs to view the log details.
    • To automatically scale the database:
      a. Select Auto. Select this option only if you need to automatically scale the database based on defined parameters.
      b. From the Scale up to drop-down list, select either Large or Medium depending on the size of your database.
      c. Use the CPU usage toggle button to scale out the database when the CPU usage exceeds a value in the range 1% to 100%.
      d. Use the Memory usage toggle button to scale out the database when the memory usage exceeds a value in the range 1% to 100%.
      e. Select Apply changes.
      f. In the View scaling log area, click the Audit log and Scaling log tabs to view the log details.

    The Audit log provides information about the following:
    • Date & time
    • Event
    • Performed by

    The Scaling log provides information about the following:
    • Date & time
    • Scale event
    • Compute (units)
    • Triggered
    • Performed by

2 - Overview of Data Pipelines

This topic provides an overview of data pipelines in OTCAD.

Data pipelines allow you to define where, when, and how you need to load data into OpenText™ Analytics Database with minimal manual intervention.

2.1 - Data Pipelines

This topic provides an overview of data pipelines in OTCAD.

Data pipeline

A data pipeline, also known as a data loader, is a declarative, automated way to continuously process data from external sources such as Amazon S3. Data pipelines allow you to define where, when, and how to load data into OTCAD with minimal manual intervention.
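
A minimal sketch of the kind of load statement a data pipeline automates, assuming the Vertica-style COPY syntax of OpenText™ Analytics Database; the table name, bucket path, and parameters are hypothetical examples:

    -- Load CSV files from an S3 bucket into a destination table,
    -- skipping the header row in each file
    COPY sales.orders
    FROM 's3://my-bucket/orders/*.csv'
    DELIMITER ','
    SKIP 1;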

Data pipelines UI tour

You can view and manage data pipelines. After logging in to the OTCAD application, you land on the home page. Select the More options button, and then select Data pipelines. The Data pipelines page appears as shown in this image:

OTCAD user interface

This page displays the following information in the Overview card:

  • Total Pipelines - Displays the total number of pipelines configured in the system.
  • Failed Execution - Select the Failed Execution drop-down list and the period for which you need to view the pipelines that failed to execute.
  • Active Pipelines - Displays the total number of pipelines that are in the Active status.

The Pipelines area displays the following information:

  • Pipeline name - The name of the data pipeline.
  • Created by - The user ID of the person who created the data pipeline.
  • Data source - The source location of the files that contain the data to be loaded.
  • Schema - The tables, columns (fields), data types, and relationships among different tables in the database.
  • Destination table - The database table where data is written after it has been processed or transformed from a source table or other data source.
  • Last run on - The timestamp at which the data pipeline was last run.
  • Last run status - Indicates the status of the data pipeline when it was last run.
  • Pipeline status - Indicates the present status of the data pipeline.

Option Description
a Create a pipeline: Select this option to create a data pipeline.
b Filter by: Select this option to filter data by different criteria.
c Search pipeline name or schema: Search for a data pipeline based on the pipeline name or schema.
d View details: Select this option to view the pipeline details.
e Edit pipeline: Select this option to edit the data pipeline.
f Clone pipeline: Select this option to clone a data pipeline.
g Remove pipeline: Select this option to delete a data pipeline.

Create a data pipeline

To create a data pipeline, do the following:

  1. In the Data Pipelines page, select +Create a pipeline. The Create a pipeline page displays.
  2. In the Pipeline name field, enter the name of the pipeline.
  3. In the Access key ID field, enter your AWS account access key ID.
  4. In the Secret access key field, enter your AWS account secret access key.
  5. From the Region drop-down list, select the region (geography) of the S3 bucket where the files are present.
  6. In the S3 Bucket/File/Folder path field, enter the S3 bucket name or the folder path where the files are present.
  7. Select the Data is encrypted option if the data in the S3 bucket is encrypted, and specify the following parameters:
    • Select either AWS Key Management Service Key or Customer managed keys.
    • Select the Encryption key ID.
  8. Select Next.
  9. In the Retry limit field, specify the number of times the system retries a failed file load.
  10. In the Parameters field, specify the copy parameters. For more information, see Parameters.
  11. Select Next.
  12. From the Destination table drop-down list, select the destination table to which you need to load the data.
  13. Select Next.
  14. Specify the schedule at which the data pipeline needs to run. Do one of the following:
    • Select Schedule and configure the schedule:
      • From the date pickers, select the Start date and End date.
      • In the Repeat every field, specify the interval at which the data pipeline needs to run.
      • From the Unit drop-down list, select the minute, hour, day, week, or month at which the data pipeline needs to run.
      • Select the On day radio button and specify the day on which the data pipeline needs to run.
      • Select the On radio button and specify the exact day and month on which the data pipeline needs to run.
      • Select the Trigger when something is added option to run the data pipeline when a file is added to the S3 bucket, and enter the SQS credentials in the Access key ID, Secret access key, and Resource URL fields.
    • Select Execute once.
  15. Select Finish. The data pipeline is created and displayed in the Data Pipelines page.

Filter a data pipeline

You can filter data pipelines based on certain criteria. Filter data pipelines in one of these ways:

  1. A date range on which the data pipelines last ran.
    • Select the Filter icon in the Data Pipelines page.
    • Expand the Last run range drop-down list.
    • In the date picker, select a date range (start date and end date) for which you need to view the data pipelines. For example, select a range from 1 May to 31 May to view data pipelines that last ran in May.
  2. Schema
    • Select the Filter icon in the Data Pipelines page.
    • Expand the Schema drop-down list.
    • Select the required schema.
  3. Data source
    • Select the Filter icon in the Data Pipelines page.
    • Expand the Data source drop-down list.
    • Select the data source or data sources for which you wish to view the data pipeline.
  4. Pipeline status
    • Select the Filter icon in the Data Pipelines page.
    • Expand the Pipeline status drop-down list.
    • Select Active to view the data pipelines that are active. A data pipeline is in the Active status when the schedule end date is a future date and one or more ingestions are yet to complete.
    • Select Inactive to view the data pipelines that are inactive. A data pipeline is in the Inactive status either when the end date is in the past or when no ingestions remain to complete.
  5. ID of the person who created the pipeline
    • Select the Filter icon in the Data Pipelines page.
    • Expand the Created by drop-down list.
    • Select the user ID of the person who created the data pipeline.

Search a data pipeline

All data pipelines are displayed by default. You can search data pipelines using specific search criteria. To search data pipelines, do the following:

  1. In the Data pipelines page, select Search pipeline name or schema.
  2. Enter one of the following search criteria:
    • Pipeline name
    • Owner of the data pipeline (Created by)
    • Schema
    • Destination table
  3. You can sort the data pipelines by the pipeline name or the date on which they were last run. To sort the data pipelines in ascending or descending order, select the Sort icon for the Pipeline name or Last run on column in the Data Pipelines page.

View pipeline details

You can view pipeline details in the Data pipelines page. To view the details of a data pipeline, do the following:

  1. In the Data Pipelines page, mouse over the Pipeline name column and select the View details icon for a data pipeline. You can view the following details of the data pipeline in this page:
    • Overview - Displays the owner (user ID) of the data pipeline.
    • Configurations - Displays the source and destination paths of the data pipeline. In the Configurations card, click the Edit icon to edit the data source and destination paths of the data pipeline. For more information, see Creating a data pipeline.
    • Total instances - Displays the total number of jobs. Select the drop-down list and view information about the instances for the last 24 hours, last week, last month, last 6 months, last one year, or all time.
    • To view the reasons for the failure of a job, mouse over an instance with job status Failed and select the View error logs icon that appears for a file name. The error log provides information about the row that failed execution, reason for failure, and rejected data. You can troubleshoot the job with this information and ensure that this job is successfully executed.
    • Incident Overview - Displays information about the data pipeline job in a pie chart. Select the drop-down list and view information about the jobs for the last 24 hours, last week, last month, last 6 months, last one year, or all time.
  2. Select Execute pipeline to execute the selected data pipeline. Executing a data pipeline loads all files that have not already been loaded and that have not reached the retry limit. Executing the data pipeline commits the transaction.
  3. Select Edit pipeline to edit the data pipeline. All the details about the data pipeline are populated, except the Access key ID and Secret access key.
  4. In the Access key ID field, enter your AWS account access key ID.
  5. In the Secret access key field, enter your AWS account secret access key.
    For more information about editing a data pipeline, see Creating a data pipeline.

Edit a data pipeline

After creating a data pipeline, you can edit the details to suit your requirements. To edit a data pipeline, do the following:

  1. In the Data Pipelines page, mouse over the Pipeline name column and select the Edit pipeline icon for a data pipeline. All the details about the data pipeline are populated, except the Access key ID and Secret access key.
  2. In the Access key ID field, enter your AWS account access key ID.
  3. In the Secret access key field, enter your AWS account secret access key.
    For more information about editing a data pipeline, see Creating a data pipeline.

Clone a data pipeline

You can create a clone or replica of an existing data pipeline. The configurations of the existing data pipeline are copied to the cloned data pipeline. You can edit these configuration settings in the cloned data pipeline.

  1. In the Data Pipelines page, mouse over the Pipeline name column and select the Clone pipeline icon for a data pipeline.
  2. In the Confirmation dialog, select Confirm. The Create a pipeline page displays.
  3. In the Pipeline name field, enter the name of the pipeline.
  4. In the Access key ID field, enter your AWS account access key ID.
  5. In the Secret access key field, enter your AWS account secret access key. Information in all other fields is pre-populated. You can edit this information.

Remove a data pipeline

You can delete a data pipeline that is no longer in use or required.

  1. In the Data Pipelines page, mouse over the Pipeline name column and select the Remove pipeline icon for a data pipeline.
  2. In the Confirmation dialog, select Remove.

3 - Overview of Integrations

This topic provides an overview of third party integrations in OTCAD.

OTCAD integrates with third-party applications.

3.1 - Integrations

This topic provides an overview of OTCAD integration with third party applications.

OTCAD is a cloud solution that enhances the speed and scalability of OpenText™ Analytics Database without the need to manage hardware. Use OTCAD to build databases securely and scale them to large amounts of data in the cloud.

Integrations

When you log in to the application, the More options button is displayed.

  1. Expand More options.
  2. Select Integrations. The Integrations page displays. OTCAD supports machine learning with VerticaPy for data science and AI-driven data science applications, along with other third-party applications listed on this page.
  3. For each third-party application, select View Docs to view the Connection guide.