<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>OpenText Analytics Database 26.2.x – Overview of Data Pipelines</title>
    <link>/en/ot-cad/data-pipelines/</link>
    <description>Recent content in Overview of Data Pipelines on OpenText Analytics Database 26.2.x</description>
    <generator>Hugo -- gohugo.io</generator>
    
	  <atom:link href="/en/ot-cad/data-pipelines/index.xml" rel="self" type="application/rss+xml" />
    
    
      
        
      
    
    
    <item>
      <title>Ot-Cad: Data Pipelines</title>
      <link>/en/ot-cad/data-pipelines/data-pipeline/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/ot-cad/data-pipelines/data-pipeline/</guid>
      <description>
        
        
        &lt;h2 id=&#34;data-pipeline&#34;&gt;Data pipeline&lt;/h2&gt;
&lt;p&gt;A data pipeline, also known as a data loader, is a declarative, automated way to continuously process data from external sources such as Amazon S3. Data pipelines let you define where, when, and how to load data into OTCAD with minimal manual intervention.&lt;/p&gt;
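&lt;p&gt;Conceptually, a data pipeline wraps a COPY statement in a loader object that remembers which files it has already loaded. The following is a minimal sketch, assuming the CREATE DATA LOADER and EXECUTE DATA LOADER statements of the Vertica-based engine underlying OTCAD; the loader name, table name, and S3 path are placeholders:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;-- Declare where and how to load; the loader tracks loaded files and
-- skips them on subsequent executions.
CREATE DATA LOADER sales_loader
    RETRY LIMIT 3
    AS COPY sales FROM &#39;s3://my-bucket/sales/*.csv&#39; DELIMITER &#39;,&#39;;

-- Each execution loads only files that are new since the last run.
EXECUTE DATA LOADER sales_loader;
&lt;/code&gt;&lt;/pre&gt;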
&lt;h3 id=&#34;data-pipelines-ui-tour&#34;&gt;Data pipelines UI tour&lt;/h3&gt;
&lt;p&gt;You can view and manage data pipelines. After logging in to the OTCAD application, you land on the home page. Select the &lt;strong&gt;More options&lt;/strong&gt; button, and then select &lt;strong&gt;Data Pipelines&lt;/strong&gt;. The Data pipelines page appears as shown in this image:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;../../../images/ot-cad/data-pipeline.png&#34; alt=&#34;OTCAD user interface&#34;&gt;&lt;/p&gt;
&lt;p&gt;This page displays the following information in the Overview card:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Total pipelines - The &lt;strong&gt;Total Pipelines&lt;/strong&gt; card displays the total number of data pipelines configured in the system.&lt;/li&gt;
&lt;li&gt;Failed execution - The &lt;strong&gt;Failed Execution&lt;/strong&gt; card displays the total number of data pipelines that failed to execute.&lt;/li&gt;
&lt;li&gt;Active pipelines - The &lt;strong&gt;Active Pipelines&lt;/strong&gt; card displays the total number of data pipelines that are in the &lt;strong&gt;Active&lt;/strong&gt; status.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The Pipelines area displays the following information:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Pipeline name - The name of the data pipeline.&lt;/li&gt;
&lt;li&gt;Created by - The user ID of the person who created the data pipeline.&lt;/li&gt;
&lt;li&gt;Data source - The source location of the files that contain the data to be loaded.&lt;/li&gt;
&lt;li&gt;Schema - The tables, columns (fields), data types, and relationships among different tables in the database.&lt;/li&gt;
&lt;li&gt;Destination table - The database table where data is written after it has been processed or transformed from a source table or other data source.&lt;/li&gt;
&lt;li&gt;Last runtime - The timestamp at which the data pipeline was last run.&lt;/li&gt;
&lt;li&gt;Last run status - Indicates the status of the data pipeline when it was last run.&lt;/li&gt;
&lt;li&gt;Pipeline status - Indicates the present status of the data pipeline.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The following options are available on the Data pipelines page:&lt;/p&gt;
&lt;table class=&#34;table table-bordered&#34;&gt;
&lt;tr&gt;
&lt;th&gt;Option&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;a&lt;/td&gt;
&lt;td&gt;
&lt;p&gt;&lt;strong&gt;Duration&lt;/strong&gt;: From this list, select the duration for which you need to view the data pipelines. You can choose any one of these durations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Last 24 hours&lt;/li&gt;
&lt;li&gt;Last week&lt;/li&gt;
&lt;li&gt;Last month&lt;/li&gt;
&lt;li&gt;Last 6 months&lt;/li&gt;
&lt;li&gt;Last year&lt;/li&gt;
&lt;li&gt;All time&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;b&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Filter by&lt;/strong&gt;: Select this option to filter data pipelines by different criteria.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;c&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Search pipeline name or schema&lt;/strong&gt;: Search for a data pipeline based on the pipeline name or schema.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;d&lt;/td&gt;
&lt;td&gt;
&lt;p&gt;&lt;strong&gt;Actions&lt;/strong&gt;: Select the options in this column to perform one of these operations on a data pipeline:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;View details of the data pipeline&lt;/li&gt;
&lt;li&gt;Edit the data pipeline&lt;/li&gt;
&lt;li&gt;Clone the data pipeline&lt;/li&gt;
&lt;li&gt;Pause schedule&lt;/li&gt;
&lt;li&gt;Delete&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;
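&lt;p&gt;If you prefer SQL, comparable pipeline information can typically be inspected from the database catalog. A minimal sketch, assuming the &lt;code&gt;data_loaders&lt;/code&gt; system table of the Vertica-based engine underlying OTCAD is exposed:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;-- List configured pipelines (data loaders), including their schema,
-- destination table, and COPY definition.
SELECT * FROM data_loaders;
&lt;/code&gt;&lt;/pre&gt;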

&lt;h2 id=&#34;create-a-data-pipeline&#34;&gt;Create a data pipeline&lt;/h2&gt;
&lt;p&gt;You can create a data pipeline using the AWS object store, the Kafka data source, or a file upload. 

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

When creating data pipelines, ensure that multiple database objects do not have the same name.

&lt;/div&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#Datapip&#34;&gt;Create a data pipeline using AWS object store&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#Datapip-kafka&#34;&gt;Create a data pipeline using the Kafka data source&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#Datapip-upload&#34;&gt;Create a data pipeline by uploading a file&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a name=&#34;Datapip&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3 id=&#34;create-a-data-pipeline-using-aws-object-store&#34;&gt;Create a data pipeline using AWS object store&lt;/h3&gt;
&lt;p&gt;AWS object store, primarily Amazon Simple Storage Service (S3), is a highly scalable, durable, and cost-effective cloud storage service for unstructured data. It stores data as objects in flat containers called buckets.&lt;/p&gt;
&lt;p&gt;To create a data pipeline using AWS object store, do the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;In the Data pipelines page, select &lt;strong&gt;+Create a pipeline&lt;/strong&gt;.
The Create a pipeline page is displayed.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;Pipeline name&lt;/strong&gt; field, enter the name of the pipeline.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Choose &lt;strong&gt;AWS object store&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;Access key ID&lt;/strong&gt; field, enter your AWS account access key ID.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;Secret access key&lt;/strong&gt; field, enter your AWS account secret access key.

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

Provide valid AWS credentials for the &lt;strong&gt;Access key ID&lt;/strong&gt; and &lt;strong&gt;Secret access key&lt;/strong&gt;. You cannot create a data pipeline with invalid AWS credentials.

&lt;/div&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;From the &lt;strong&gt;Region&lt;/strong&gt; list, select the region (geography) of the S3 bucket where the files are present.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;S3 Bucket/File/Folder path&lt;/strong&gt; field, enter the bucket name or the folder path where the files are present.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Select the &lt;strong&gt;Data is encrypted&lt;/strong&gt; option to specify the following parameters:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Select either &lt;strong&gt;AWS Key Management Service Key&lt;/strong&gt; or &lt;strong&gt;Customer managed keys&lt;/strong&gt; if you wish to encrypt and load data into the S3 bucket.&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;Encryption key ID&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Select &lt;strong&gt;Next&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;Retry limit&lt;/strong&gt; field, specify the number of times the system should attempt to retry a failed file load.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;Parameters&lt;/strong&gt; field, specify the copy parameters. For more information, see &lt;a href=&#34;../../../en/sql-reference/statements/copy/parameters/#&#34;&gt;Parameters&lt;/a&gt;. A combined sketch of these settings appears after these steps.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Select &lt;strong&gt;Next&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;From the &lt;strong&gt;Destination table&lt;/strong&gt; list, select the destination table to which you need to load the data.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Select &lt;strong&gt;Next&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Specify the schedule at which the data pipeline needs to run. Do one of the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Select &lt;strong&gt;Schedule&lt;/strong&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;From the date pickers, select the &lt;strong&gt;Start date&lt;/strong&gt; and &lt;strong&gt;End date&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;In the &lt;strong&gt;Repeat every&lt;/strong&gt; field, specify the interval at which the data pipeline needs to run.&lt;/li&gt;
&lt;li&gt;From the &lt;strong&gt;Unit&lt;/strong&gt; list, select the unit of the interval: minute, hour, day, week, or month.&lt;/li&gt;
&lt;li&gt;Select the &lt;strong&gt;On day&lt;/strong&gt; option and specify the day on which the data pipeline needs to run.&lt;/li&gt;
&lt;li&gt;Select the &lt;strong&gt;On&lt;/strong&gt; option and specify the exact day and month on which the data pipeline needs to run.&lt;/li&gt;
&lt;li&gt;Select the option &lt;strong&gt;Trigger when something is added&lt;/strong&gt; to run the data pipeline when a file is added to the S3 bucket.&lt;/li&gt;
&lt;li&gt;Enter the SQS credentials in the &lt;strong&gt;Access key ID&lt;/strong&gt;, &lt;strong&gt;Secret access key&lt;/strong&gt;, and &lt;strong&gt;Resource URL&lt;/strong&gt; fields.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Or&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Select &lt;strong&gt;Execute once&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Select &lt;strong&gt;Finish&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The data pipeline is created and displayed in the Data pipelines page.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
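&lt;p&gt;The wizard steps above map roughly onto a single loader definition. The following is a hedged sketch, assuming the AWSAuth and AWSRegion session parameters and Vertica-style CREATE DATA LOADER syntax; all credential values, paths, and names are placeholders:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;-- Credentials and region (steps 4-6); never hard-code real credentials.
ALTER SESSION SET AWSAuth = &#39;my_access_key_id:my_secret_access_key&#39;;
ALTER SESSION SET AWSRegion = &#39;us-east-1&#39;;

-- Retry limit (step 10), copy parameters (step 11), and destination
-- table (step 13) combined into one loader definition.
CREATE DATA LOADER sales_loader
    RETRY LIMIT 5
    AS COPY sales FROM &#39;s3://my-bucket/sales/*.csv&#39; DELIMITER &#39;,&#39;;
&lt;/code&gt;&lt;/pre&gt;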
&lt;p&gt;&lt;a name=&#34;Datapip-kafka&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3 id=&#34;create-a-data-pipeline-using-kafka-data-source&#34;&gt;Create a data pipeline using the Kafka data source&lt;/h3&gt;
&lt;p&gt;Define data pipelines to ingest real-time data through &lt;a href=&#34;https://kafka.apache.org/&#34;&gt;Apache Kafka&lt;/a&gt;, an open-source distributed real-time streaming platform, by leveraging Kafka topics for efficient streaming and processing. A Kafka topic is a category or feed name for streams of data, similar to a table in a database. Consumers read data from Kafka topics, and topics are organized into partitions for parallel processing. Each message in a topic is a record with a key, value, and timestamp. Kafka topics are logs in which messages are ordered by an offset within each partition.&lt;/p&gt;
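&lt;p&gt;For context, the ingestion that a Kafka pipeline performs is comparable to a COPY from a Kafka stream. The following is a minimal sketch, assuming the KafkaSource and KafkaJSONParser functions of the Vertica-based engine underlying OTCAD; the broker address, topic, and table name are placeholders:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;-- Read topic &#39;web_hits&#39;, partition 0, from offset -2 (the beginning),
-- and parse each message as JSON into the destination table.
COPY web_hits
    SOURCE KafkaSource(stream=&#39;web_hits|0|-2&#39;,
                       brokers=&#39;10.20.41.15:9092&#39;,
                       stop_on_eof=true)
    PARSER KafkaJSONParser();
&lt;/code&gt;&lt;/pre&gt;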
&lt;p&gt;To create a data pipeline using the Kafka data source, do the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;In the Data pipelines page, select &lt;strong&gt;+Create a pipeline&lt;/strong&gt;.
The Create a pipeline page is displayed.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;Pipeline name&lt;/strong&gt; field, enter the name of the pipeline.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Choose &lt;strong&gt;Kafka&lt;/strong&gt;.

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

Create a destination table from the SQL editor before creating the data pipeline.


&lt;/div&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;Bootstrap servers&lt;/strong&gt; field, enter the initial list of Kafka broker addresses that a Kafka client uses to connect to the Kafka cluster.

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

You can use only bootstrap servers that are whitelisted by OpenText. For assistance, contact the technical support team.


&lt;/div&gt;

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

For multi-cluster configurations, provide the addresses as a comma-separated list. For example, 10.20.41.15:9092,10.20.41.16:9092.


&lt;/div&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Do one of the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Select &lt;strong&gt;SSL&lt;/strong&gt; if your Kafka broker is configured with SSL.&lt;/p&gt;
&lt;p&gt;SSL is short for Secure Sockets Layer. It is a protocol that creates an encrypted link between a web server and a web browser to ensure that all data transmitted between them is confidential.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;Client key&lt;/strong&gt; field, enter the encrypted client key generated for the SSL certificate.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;Password&lt;/strong&gt; field, enter the password associated with the encrypted client key.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;Client certificate&lt;/strong&gt; field, enter the certificate used to authenticate the client with the Kafka broker.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;CA certificate&lt;/strong&gt; field, enter the certificate authority (CA) certificate used to validate the Kafka broker.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Or&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Select &lt;strong&gt;SASL&lt;/strong&gt; if your Kafka broker is authenticated with Simple Authentication and Security Layer (SASL).&lt;/p&gt;
&lt;p&gt;SASL is a Kafka framework for authenticating clients and brokers, which can be used with or without TLS/SSL encryption. It allows Kafka to support different authentication mechanisms.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;SASL mechanism&lt;/strong&gt; list, select one of the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Plain&lt;/strong&gt; - A simple username/password authentication mechanism used with TLS for encryption to implement secure authentication.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SCRAM-SHA-256&lt;/strong&gt; - A secure password-based authentication method that uses a challenge-response protocol and the SHA-256 hashing algorithm to verify user credentials without sending the password in plain text over the network.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SCRAM-SHA-512&lt;/strong&gt; - A secure authentication mechanism that uses the SHA-512 cryptographic hash function to verify a user&#39;s credentials in a &amp;quot;challenge-response&amp;quot; format, which prevents the password from being sent directly over the network.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;Username&lt;/strong&gt; field, enter a valid username.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;Password&lt;/strong&gt; field, enter the password for SASL authentication.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Select &lt;strong&gt;Proceed&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the Define configuration area, specify the configuration settings for the data source.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;In the &lt;strong&gt;Topic&lt;/strong&gt; field, enter the Kafka topic. A Kafka topic is a logical grouping of messages, split into multiple partitions for parallelism.&lt;/li&gt;
&lt;li&gt;In the &lt;strong&gt;Partition&lt;/strong&gt; box, type or select the number of partitions for the Kafka topic. A partition is a sequence of records within a topic, stored on brokers and consumed independently.&lt;/li&gt;
&lt;li&gt;In the &lt;strong&gt;Start offset&lt;/strong&gt; box, type or select the offset at which reading starts for each partition. The offset is a unique identifier for each record within a partition, used to track consumer progress.&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;+Add topic&lt;/strong&gt; to add more topics and partitions.&lt;/li&gt;
&lt;li&gt;Select one of the available parser options depending on the message type in the topic (a sketch of the registry settings appears after this list):
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;AVRO&lt;/strong&gt; - An Avro schema registry is a crucial component in systems that utilize Apache Kafka and Avro for data serialization. Its primary purpose is to centralize the management and evolution of Avro schemas, providing a robust mechanism for ensuring data compatibility and governance in streaming data pipelines.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;URL&lt;/strong&gt; field, enter the schema registry URL.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;Subject&lt;/strong&gt; field, enter the subject information from the schema registry.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;Version&lt;/strong&gt; field, enter the version of the schema registry.&lt;/p&gt;
&lt;p&gt;Or&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;External Schema&lt;/strong&gt; field, enter the schema of the AVRO message. Ensure that the schema is in JSON format.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;JSON&lt;/strong&gt; - JSON schema registry for Kafka provides a mechanism to define, manage, and enforce the structure of JSON data being produced to and consumed from Kafka topics. This ensures data consistency and compatibility, especially in distributed systems where multiple applications interact with the same data streams.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Kafka&lt;/strong&gt; - A Kafka schema registry is an external service that acts as a central repository for managing and validating schemas for data in a Kafka cluster. It allows producers to register schemas and consumers to retrieve them, ensuring data consistency and compatibility as schemas evolve over time.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
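&lt;p&gt;The registry fields above map onto parser parameters. The following is a hedged sketch, assuming the KafkaAvroParser function and its schema_registry_url, schema_registry_subject, and schema_registry_version parameters from the Vertica-based engine underlying OTCAD; the URL, subject, and version values are placeholders:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;-- AVRO messages whose schema lives in a schema registry (the URL,
-- Subject, and Version fields above).
COPY web_hits
    SOURCE KafkaSource(stream=&#39;web_hits|0|-2&#39;,
                       brokers=&#39;10.20.41.15:9092&#39;,
                       stop_on_eof=true)
    PARSER KafkaAvroParser(
        schema_registry_url=&#39;http://registry.example.com:8081&#39;,
        schema_registry_subject=&#39;web_hits-value&#39;,
        schema_registry_version=&#39;1&#39;);
&lt;/code&gt;&lt;/pre&gt;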
&lt;ol start=&#34;8&#34;&gt;
&lt;li&gt;
&lt;p&gt;Select &lt;strong&gt;Proceed&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;From the &lt;strong&gt;Destination table&lt;/strong&gt; list, select the destination table to which you need to load the data.

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

New data is added to the end of the selected table without changing existing records.


&lt;/div&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Select &lt;strong&gt;Proceed&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Specify the schedule at which the data pipeline needs to run. Do one of the following:&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Select &lt;strong&gt;Schedule&lt;/strong&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;From the date pickers, select the &lt;strong&gt;Start date&lt;/strong&gt; and &lt;strong&gt;End date&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;In the &lt;strong&gt;Repeat every&lt;/strong&gt; field, specify the interval at which the data pipeline needs to run.&lt;/li&gt;
&lt;li&gt;From the &lt;strong&gt;Unit&lt;/strong&gt; list, select the unit of the interval: minute, hour, day, week, or month.&lt;/li&gt;
&lt;li&gt;Select the &lt;strong&gt;On day&lt;/strong&gt; option and specify the day on which the data pipeline needs to run.&lt;/li&gt;
&lt;li&gt;Select the &lt;strong&gt;On&lt;/strong&gt; option and specify the exact day of the week on which the data pipeline needs to run.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Or&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Select &lt;strong&gt;Execute once&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;ol start=&#34;12&#34;&gt;
&lt;li&gt;
&lt;p&gt;Select &lt;strong&gt;Finish&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;The data pipeline is created and displayed in the Data pipelines page.&lt;/p&gt;

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

The data pipeline is created successfully only if the SQL query executes successfully. If the SQL query execution fails, the reason for the failure appears in a message. Fix the SQL query to ensure successful query execution and data pipeline creation.


&lt;/div&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;a name=&#34;Datapip-upload&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3 id=&#34;create-a-data-pipeline-by-uploading-a-file&#34;&gt;Create a data pipeline by uploading a file&lt;/h3&gt;
&lt;p&gt;You can create a data pipeline by uploading data from a file in your system.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;In the Data pipelines page, select &lt;strong&gt;+Create a pipeline&lt;/strong&gt;.
The Create a pipeline page is displayed.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;Pipeline name&lt;/strong&gt; field, enter the name of the pipeline.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Choose &lt;strong&gt;Upload file&lt;/strong&gt;.
Ensure that a destination table is created and available before you proceed. If no destination table exists, create one from the SQL editor, and then continue creating the data pipeline.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Click &lt;strong&gt;Browse&lt;/strong&gt; and choose the file that you need to upload. You can upload only files of type .csv, .tsv, or .parquet. Ensure that the file size is less than 1 GB.&lt;/p&gt;

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

All uploaded code or content is the sole responsibility of the uploading party. OpenText does not review, endorse, or accept any liability for uploaded content. Uploads are made at the uploading party&#39;s own risk.


&lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;From the &lt;strong&gt;File type&lt;/strong&gt; list, select the type of file you are uploading. Files can be of type &lt;code&gt;.csv&lt;/code&gt;, &lt;code&gt;.tsv&lt;/code&gt;, or &lt;code&gt;.parquet&lt;/code&gt;. A combined sketch of these parser options appears after these steps. For information about these file types, see:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;../../../en/sql-reference/statements/copy/parsers/fcsvparser/#&#34;&gt;FCSVPARSER&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;../../../en/sql-reference/statements/copy/parsers/delimited/#&#34;&gt;DELIMITED&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;../../../en/data-load/data-formats/parquet-data/#&#34;&gt;Parquet data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;Is header present&lt;/strong&gt; area, choose &lt;strong&gt;Yes&lt;/strong&gt; only if the file you upload has a header. Otherwise, choose &lt;strong&gt;No&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;If the file has a header row, it is displayed in the &lt;strong&gt;Preview&lt;/strong&gt; area along with the first 5 rows of data. If no header row is present in the file, the columns are displayed with default header names such as ucol0, ucol1, and so on.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;From the &lt;strong&gt;Delimiter&lt;/strong&gt; list, select the character or sequence of characters to define boundaries between separate, independent data elements in the file. Select one of the following values:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;,&lt;/strong&gt; - comma delimiter&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;\t&lt;/strong&gt; - tab delimiter&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;|&lt;/strong&gt; - pipe delimiter&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;custom&lt;/strong&gt; - Enter a single ASCII character in the range E&#39;\000&#39; to E&#39;\177&#39;.

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

The fields &lt;strong&gt;Is header present&lt;/strong&gt; and &lt;strong&gt;Delimiter&lt;/strong&gt; are displayed only for files of type &lt;code&gt;.csv&lt;/code&gt; or &lt;code&gt;.tsv&lt;/code&gt;. These fields are not displayed if the file is of type &lt;code&gt;.parquet&lt;/code&gt;.

&lt;/div&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Click &lt;strong&gt;Proceed&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;From the &lt;strong&gt;Destination table&lt;/strong&gt; list, select a destination table to which the data needs to be loaded. 

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

If the destination table has fewer columns than the source table, a message &amp;quot;There is a mismatch in source and destination columns&amp;quot; is displayed. However, if the destination table has more columns than the source table, the destination and source tables are mapped successfully. The extra columns in the destination table are filled with NULL values.

&lt;/div&gt;
In the &lt;strong&gt;Map columns&lt;/strong&gt; area, the source columns and destination columns are displayed.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;In the &lt;strong&gt;Destination columns&lt;/strong&gt; area, expand the list and select the required destination column. You can either choose to retain the default destination column or select a different one based on your requirement.&lt;/p&gt;

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

If the destination column has a different data type than the source column and the system cannot coerce the data, the system inserts NULL into the column or returns an error. For example, when a source value of &amp;quot;ABC&amp;quot; (string) is mapped to a destination column of type INTEGER, coercion fails.


&lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Click &lt;strong&gt;Create a pipeline&lt;/strong&gt;.  &lt;br&gt;
The data pipeline is created and displayed in the Data pipelines page.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
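&lt;p&gt;The file-type and delimiter choices above correspond to COPY parser options. The following is a minimal sketch, assuming the FCSVPARSER and DELIMITED parsers and the Parquet support linked from step 5; the file paths and table name are placeholders:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;-- .csv with a header row (FCSVPARSER)
COPY sales FROM LOCAL &#39;/tmp/sales.csv&#39; PARSER fcsvparser(header=&#39;true&#39;);

-- .tsv using the default DELIMITED parser with a tab delimiter
COPY sales FROM LOCAL &#39;/tmp/sales.tsv&#39; DELIMITER E&#39;\t&#39;;

-- .parquet
COPY sales FROM LOCAL &#39;/tmp/sales.parquet&#39; PARQUET;
&lt;/code&gt;&lt;/pre&gt;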
&lt;h3 id=&#34;filter-a-data-pipeline&#34;&gt;Filter a data pipeline&lt;/h3&gt;
&lt;p&gt;You can filter data pipelines based on certain criteria. Filter data pipelines in one of these ways:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;A date range on which the data pipelines last ran.
&lt;ul&gt;
&lt;li&gt;Select the &lt;strong&gt;Filter&lt;/strong&gt; icon in the Data Pipelines page.&lt;/li&gt;
&lt;li&gt;Expand the &lt;strong&gt;Last run range&lt;/strong&gt; drop-down list.&lt;/li&gt;
&lt;li&gt;In the date picker, select a date range (start date and end date) for which you need to view the data pipelines. For example, select a range from 1 May to 31 May to view data pipelines that last ran in the month of May.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Schema
&lt;ul&gt;
&lt;li&gt;Select the &lt;strong&gt;Filter&lt;/strong&gt; icon in the Data Pipelines page.&lt;/li&gt;
&lt;li&gt;Expand the &lt;strong&gt;Schema&lt;/strong&gt; drop-down list.&lt;/li&gt;
&lt;li&gt;Select the required schema.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Data source
&lt;ul&gt;
&lt;li&gt;Select the &lt;strong&gt;Filter&lt;/strong&gt; icon in the Data Pipelines page.&lt;/li&gt;
&lt;li&gt;Expand the &lt;strong&gt;Data source&lt;/strong&gt; drop-down list.&lt;/li&gt;
&lt;li&gt;Select the data source or data sources for which you wish to view the data pipeline.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Pipeline status
&lt;ul&gt;
&lt;li&gt;Select the &lt;strong&gt;Filter&lt;/strong&gt; icon in the Data Pipelines page.&lt;/li&gt;
&lt;li&gt;Expand the &lt;strong&gt;Pipeline status&lt;/strong&gt; drop-down list.&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;Active&lt;/strong&gt; to view the data pipelines that are active. 
A data pipeline is in the &lt;code&gt;Active&lt;/code&gt; status when the schedule end date is in the future and one or more ingestions are yet to complete.&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;Inactive&lt;/strong&gt; to view the data pipelines that are inactive.
A data pipeline is in the &lt;code&gt;Inactive&lt;/code&gt; status either when the end date has passed or when no ingestions remain to complete.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;ID of the person who created the pipeline
&lt;ul&gt;
&lt;li&gt;Select the &lt;strong&gt;Filter&lt;/strong&gt; icon in the Data Pipelines page.&lt;/li&gt;
&lt;li&gt;Expand the &lt;strong&gt;Created by&lt;/strong&gt; drop-down list.&lt;/li&gt;
&lt;li&gt;Select the user ID of the person who created the data pipeline.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;search-a-data-pipeline&#34;&gt;Search for a data pipeline&lt;/h3&gt;
&lt;p&gt;All data pipelines are displayed by default. You can search data pipelines using specific search criteria. To search data pipelines, do the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;In the Data pipelines page, select the &lt;strong&gt;Search pipeline name or schema&lt;/strong&gt; field.&lt;/li&gt;
&lt;li&gt;Enter one of the following search criteria:
&lt;ul&gt;
&lt;li&gt;Pipeline name&lt;/li&gt;
&lt;li&gt;Owner of the data pipeline (Created by)&lt;/li&gt;
&lt;li&gt;Schema&lt;/li&gt;
&lt;li&gt;Destination table&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;You can sort the data pipelines by pipeline name or by the date on which they last ran. To sort in ascending or descending order, select the &lt;strong&gt;Sort&lt;/strong&gt; icon for the &lt;strong&gt;Pipeline name&lt;/strong&gt; or &lt;strong&gt;Last run on&lt;/strong&gt; column in the Data Pipelines page.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;view-pipeline-details&#34;&gt;View pipeline details&lt;/h3&gt;
&lt;p&gt;You can view pipeline details in the Data pipelines page. To view the details of a data pipeline, do the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;In the Data pipeline page, mouse over the &lt;strong&gt;Pipeline name&lt;/strong&gt; column and select the &lt;strong&gt;+View details&lt;/strong&gt; icon for a data pipeline.
You can view the following details of the data pipeline on this page:
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Overview&lt;/strong&gt; - Displays the owner (user ID) of the data pipeline.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Configurations&lt;/strong&gt; - Displays the source and destination paths of the data pipeline. In the &lt;strong&gt;Configurations&lt;/strong&gt; card, click the &lt;strong&gt;Edit&lt;/strong&gt; icon to edit the data source and destination paths of the data pipeline. 
For more information, see &lt;a href=&#34;#Datapip&#34;&gt;Creating a data pipeline&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Total instances&lt;/strong&gt; - Displays the total number of jobs. Select the drop-down list and view information about the instances for the last 24 hours, last week, last month, last 6 months, last one year, or all time.&lt;/li&gt;
&lt;li&gt;To view the reasons for the failure of a job, mouse over an instance with job status &lt;strong&gt;Failed&lt;/strong&gt; and select the &lt;strong&gt;View error logs&lt;/strong&gt; icon that appears for a file name.
The error log provides information about the row that failed execution, the reason for the failure, and the rejected data. You can troubleshoot the job with this information and ensure that the job executes successfully.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Incident Overview&lt;/strong&gt; - Displays information about the data pipeline job in a pie chart. Select the drop-down list and view information about the jobs for the last 24 hours, last week, last month, last 6 months, last one year, or all time.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;Execute pipeline&lt;/strong&gt; to execute the selected data pipeline.
Executing a data pipeline loads all files that have not already been loaded and that have not reached the retry limit. Executing the data pipeline commits the transaction. A sketch of this operation appears after these steps.&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;Edit pipeline&lt;/strong&gt; to edit the data pipeline.&lt;/li&gt;
&lt;li&gt;In the &lt;strong&gt;Access key ID&lt;/strong&gt; field, enter your AWS account access key ID.&lt;/li&gt;
&lt;li&gt;In the &lt;strong&gt;Secret access key&lt;/strong&gt; field, enter your AWS account secret access key.
All the details about the data pipeline are populated, except the &lt;strong&gt;Access key ID&lt;/strong&gt; and &lt;strong&gt;Secret access key&lt;/strong&gt;.&lt;br&gt;

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

Provide valid AWS credentials for the &lt;strong&gt;Access key ID&lt;/strong&gt; and &lt;strong&gt;Secret access key&lt;/strong&gt;. You cannot edit a data pipeline with invalid AWS credentials.


&lt;/div&gt;
For more information about editing a data pipeline, see &lt;a href=&#34;#Datapip&#34;&gt;Creating a data pipeline&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;
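&lt;p&gt;Executing a pipeline from this page is comparable to running the loader manually. The following is a hedged sketch, assuming the EXECUTE DATA LOADER statement and a DATA_LOADER_EVENTS monitoring table in the Vertica-based engine underlying OTCAD; the loader name is a placeholder:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;-- Load all files that are not yet loaded and not past the retry
-- limit; the execution commits the transaction.
EXECUTE DATA LOADER sales_loader;

-- Inspect per-file outcomes, including failures, for troubleshooting.
SELECT * FROM data_loader_events;
&lt;/code&gt;&lt;/pre&gt;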
&lt;h3 id=&#34;edit-a-data-pipeline&#34;&gt;Edit a data pipeline&lt;/h3&gt;
&lt;p&gt;After creating a data pipeline, you can edit the details to suit your requirements. To edit a data pipeline, do the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;In the Data Pipelines page, mouse over the &lt;strong&gt;Pipeline name&lt;/strong&gt; column and select the &lt;strong&gt;+Edit pipeline&lt;/strong&gt; icon for a data pipeline.&lt;/li&gt;
&lt;li&gt;In the &lt;strong&gt;Access key ID&lt;/strong&gt; field, enter your AWS account access key ID.&lt;/li&gt;
&lt;li&gt;In the &lt;strong&gt;Secret access key&lt;/strong&gt; field, enter your AWS account secret access key.
All the details about the data pipeline are populated, except the &lt;strong&gt;Access key ID&lt;/strong&gt; and &lt;strong&gt;Secret access key&lt;/strong&gt;.&lt;br&gt;

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

Provide valid AWS credentials for the &lt;strong&gt;Access key ID&lt;/strong&gt; and &lt;strong&gt;Secret access key&lt;/strong&gt;. You cannot edit a data pipeline with invalid AWS credentials.


&lt;/div&gt;
For more information about editing a data pipeline, see &lt;a href=&#34;#Datapip&#34;&gt;Creating a data pipeline&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;clone-a-data-pipeline&#34;&gt;Clone a data pipeline&lt;/h3&gt;
&lt;p&gt;You can create a clone or replica of an existing data pipeline. The configurations of the existing data pipeline are copied to the cloned data pipeline. You can edit these configuration settings in the cloned data pipeline.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;In the Data Pipelines page, mouse over the &lt;strong&gt;Pipeline name&lt;/strong&gt; column and select the &lt;strong&gt;Clone pipeline&lt;/strong&gt; icon for a data pipeline.&lt;/li&gt;
&lt;li&gt;In the Confirmation dialog, select &lt;strong&gt;Confirm&lt;/strong&gt;.
The Create a pipeline page is displayed.&lt;/li&gt;
&lt;li&gt;In the &lt;strong&gt;Pipeline name&lt;/strong&gt; field, enter the name of the pipeline.&lt;/li&gt;
&lt;li&gt;In the &lt;strong&gt;Access key ID&lt;/strong&gt; field, enter your AWS account access key ID.&lt;/li&gt;
&lt;li&gt;In the &lt;strong&gt;Secret access key&lt;/strong&gt; field, enter your AWS account secret access key.

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

Provide valid AWS credentials for the &lt;strong&gt;Access key ID&lt;/strong&gt; and &lt;strong&gt;Secret access key&lt;/strong&gt;. You cannot clone a data pipeline with invalid AWS credentials.


&lt;/div&gt;
Information in all other fields is pre-populated. You can edit this information.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;pause-data-ingestion&#34;&gt;Pause data ingestion&lt;/h3&gt;
&lt;p&gt;You can pause the ingestion of data into a pipeline.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;In the Data pipelines page, click &lt;strong&gt;⋮&lt;/strong&gt; in the &lt;strong&gt;Actions&lt;/strong&gt; column.&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;+Pause schedule&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;In the Confirmation dialog, click &lt;strong&gt;Pause&lt;/strong&gt;.
The &lt;strong&gt;Pipeline status&lt;/strong&gt; changes to &lt;strong&gt;Paused&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;resume-data-ingestion&#34;&gt;Resume data ingestion&lt;/h3&gt;
&lt;p&gt;You can resume the ingestion of data into a pipeline.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;In the Data pipelines page, select the pipeline for which data ingestion is paused.&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;⋮&lt;/strong&gt; in the &lt;strong&gt;Actions&lt;/strong&gt; column.&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Resume schedule&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;In the Confirmation dialog, click &lt;strong&gt;Resume&lt;/strong&gt;.
A message &amp;quot;Data ingestion resumed for data pipeline&amp;quot; is displayed.&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id=&#34;delete-a-data-pipeline&#34;&gt;Delete a data pipeline&lt;/h3&gt;
&lt;p&gt;You can delete a data pipeline that is no longer in use or required.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;In the Data pipelines page, click &lt;strong&gt;⋮&lt;/strong&gt; in the &lt;strong&gt;Actions&lt;/strong&gt; column.&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;+Delete&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;In the Confirmation dialog, click &lt;strong&gt;Delete&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;
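&lt;p&gt;Deleting a pipeline is comparable to dropping the underlying loader object. A minimal sketch, assuming Vertica-style DROP DATA LOADER; the loader name is a placeholder:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;-- Remove the loader definition; the destination table and any data
-- already loaded into it remain.
DROP DATA LOADER sales_loader;
&lt;/code&gt;&lt;/pre&gt;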

      </description>
    </item>
    
  </channel>
</rss>
