1 - Preparing the installation environment
Before installing OpenText Analytics Database, you must configure your environment.
Before installing OpenText™ Analytics Database, you must configure your environment.
To run OpenText™ Analytics Database Enterprise on-premises, follow the numbered instructions below.
To run the OpenText™ Analytics Database in a Virtual Machine instead, see OpenText Analytics Database community edition (CE).
-
Copy the installation file to your home directory. The example shows an rpm
file for CentOS/RHEL, but you may have a deb
file for Debian.
$ scp vertica-version.RHEL8.x86_64.rpm /~
-
Identify the IP address of the current node.
$ ipconfig -a
If the ipconfig command is not found in your path, you can try running it directly using the paths /sbin/ipconfig
or /usr/sbin/ipconfig
. If neither of those work, use the ip command:
$ ip a
Note
The previous commands can return multiple addresses. For example, if your system is configured to use both IPv4 and IPv6 addressing, the commands will list two addresses, one for each address family. In this case, you must determine which address you want to use.
-
Ensure your packages are up to date. Run the command based on your distribution.
On CentOS and RedHat:
$ sudo yum update -y
On openSUSE:
$ sudo zypper up
On Debian and Ubuntu:
$ sudo apt-get update && sudo apt-get upgrade
-
Set swappiness to 0 (recommended).
$ sudo systemctl vm.swappiness=0
-
Verify that SELinux is running in permissive mode or is disabled.
$ sudo setenforce 0
-
Disable the system firewall.
$ sudo systemctl mask firewalld
$ sudo systemctl disable firewalld
$ sudo systemctl stop firewalld
-
Install OpenText Analytics Database.
4 - Loading sample data
OpenText Analytics Database offers several solutions for loading files with structured and unstructred data, and from several formats.
OpenText™ Analytics Database offers several solutions for loading files with structured and unstructred data, and from several formats.
Creating a sample data file
Create a sample CSV file called cities.csv
with the following contents and save it to /home/dbadmin/cities.csv.
City,State,Zip,Population
Boston,MA,02108,694583
Chicago,IL,60601,2705994
Seattle,WA,98101,744955
Dallas,TX,75201,1345047
New York,NY,10001,8398748
Loading structured data from a file
-
Run vsql.
$ vsql
-
Create the cities table.
=> CREATE TABLE cities (
city varchar(20),
state char(2),
zip int,
population int
);
-
Use the COPY statement to load the data from the cities.csv
file. The following command logs exceptions and rejections in the home directory.
=> COPY cities FROM LOCAL '/home/dbadmin/cities.csv' DELIMITER ',' NULL '' EXCEPTIONS '/home/dbadmin/cities_exceptions.log'
REJECTED DATA '/home/dbadmin/cities_rejections.log';
-
Review the rejections log for what data was excluded. Here, the header was excluded.
$ cat /home/dbadmin/cities_rejections.log
City,State,Zip,Population
-
Review the exceptions for details on the error. In this case, the header failed the database's integer data type verification.
$ cat /home/dbadmin/cities_exceptions.log
COPY: Input record 1 has been rejected (Invalid integer format 'Zip' for column 3 (zip)).
Please see /home/dbadmin/cities_rejections.log, record 1 for the rejected record. This record was record 1 from cities.csv
-
To fix this, add SKIP 1 to the original COPY statement. This excludes the first row.
=> COPY cities FROM LOCAL '/home/dbadmin/cities.csv' DELIMITER, 'NULL'
EXCEPTIONS '/home/dbadmin/cities_exceptions.log'
REJECTED DATA '/home/dbadmin/cities_rejections.log' SKIP 1;
Loading unstructured data with flex tables
To load data from another source, the database uses Flex tables. Flex tables simplify data loading by allowing you to load unstructured or "semi-structured" data without having to create a schema or column definitions.
Supported formats include:
-
Avro Data
-
CEF
-
CSV
-
Delimited
-
JSON
-
Create a table called cities_flex. Notice how it does not include column names or data types.
=> CREATE FLEXIBLE TABLE cities_flex();
-
Load the CSV file into the table.
=> COPY cities_flex FROM '/source/cities.csv' PARSER FDELIMITEDPARSER (delimiter=',');
-
Query the cities_flex table, specifying the column names from the original CSV file.
=> SELECT city, state FROM cities_flex;
5 - Monitoring the database
This page includes a collection of general-purpose SQL statements useful for monitoring your database.
This page includes a collection of general-purpose SQL statements useful for monitoring your database.
Check disk space
Check disk space used by tables.
=> SELECT projection_schema, anchor_table_name, to_char(sum(used_bytes)/1024/1024/1024,'999,999.99')
as disk_space_used_gb FROM
projection_storage
GROUP by projection_schema, anchor_table_name ORDER by
disk_space_used_gb desc limit 50;
Check total disk space used.
=> SELECT to_char(sum(used_bytes)/1024/1024/1024,'999,999.99') AS gb FROM projection_storage;
Check the amount of free disk space.
=> SELECT to_char(sum(disk_space_free_mb)/1024,'999,999,999') AS
disk_space_free_gb, to_char(sum(disk_space_used_mb)/1024,'999,999,999') AS
disk_space_used_gb FROM disk_storage;
Adjust data types
Change the Zip and Population columns from VARCHAR to INT.
=> UPDATE cities_flex_keys set data_type_guess='int' WHERE key_name='Zip';
=> UPDATE cities_flex_keys set data_type_guess='int' WHERE key_name='Population';
=> COMMIT;
Refresh the cities_flex_view with the new data types
=> SELECT build_flextable_view('cities_flex');
Materialize the flex table
Materialize the flex table and all columns into a persistent database table.
=> CREATE TABLE cities AS SELECT * from cities_flex_view;
View user information.
=> SELECT user_name, is_super_user, resource_pool, memory_cap_kb, temp_space_cap_kb, run_time_cap FROM users;
Identify users.
=> SELECT * FROM user_sessions;
View queries by user.
=> SELECT * FROM query_profiles WHERE user_name ILIKE '%dbadmin%';
View roles.
=> SELECT * FROM roles;
View resource pool assignments.
=> SELECT user_name, resource_pool FROM users;
View table information.
=> SELECT table_name, is_flextable, is_temp_table, is_system_table, count(*) FROM tables GROUP by 1,2,3,4;
View projection information.
=> SELECT is_segmented, is_aggregate_projection, has_statistics, is_super_projection, count(*) FROM projections GROUP by 1,2,3,4,5;
View update information.
=> SELECT substr(query, 0, instr(query, '')+1) count(*) FROM (SELECT transaction_id, statement_id, upper(query::varchar(30000)) as query FROM query_profiles
WHERE regexp_like(query,'^\s*update\s','i')) sq GROUP BY 1 ORDER BY 1;
View active events.
=> SELECT * FROM active_events WHERE event_problem_description NOT ILIKE '%state to UP';
View backups.
=> SELECT * FROM database_backups;
View disk storage.
=> SELECT node_name, storage_path, storage_usage, storage_status, disk_space_free_percent FROM disk_storage;
View long-running queries
=> SELECT query_duration_us/1000000/60 AS query_duration_mins, table_name, user_name, processed_row_count AS rows_processed, substr(query,0,70) FROM query_profiles
ORDER BY query_duration_us DESCLIMIT 250;
View sizes and counts of Read Optimized Store (ROS) containers.
=> SELECT node_name, projection_name, sum(ros_count), sum(ros_used_bytes) FROM projection_storage GROUP BY 1,2 HAVING sum(ros_count) >= 50
ORDER BY 3 DESC LIMIT 250;
View license consumption.
=> SELECT GET_COMPLIANCE_STATUS();
View how the database complies with your license.
=> SELECT AUDIT('');
Audit the database to check if it complies with raw storage allowance of your license.
=> SELECT AUDIT_LICENSE_SIZE;
Compare storage size of database the database and your license.
=> SELECT /*+(license_utilization)*/
audit_start_timestamp,
database_size_bytes / (1024^3) AS database_size_gb,
license_size_bytes / (1024^3) AS license_size_gb, usage_percent
FROM v_catalog.license_audits ORDER BYaudit_start_timestamp DESC LIMIT 30;