<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>OpenText Analytics Database 26.2.x – Collecting database statistics</title>
    <link>/en/admin/collecting-db-statistics/</link>
    <description>Recent content in Collecting database statistics on OpenText Analytics Database 26.2.x</description>
    <generator>Hugo -- gohugo.io</generator>
    
	  <atom:link href="/en/admin/collecting-db-statistics/index.xml" rel="self" type="application/rss+xml" />
    
    
      
        
      
    
    
    <item>
      <title>Admin: Collecting table statistics</title>
      <link>/en/admin/collecting-db-statistics/collecting-table-statistics/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/admin/collecting-db-statistics/collecting-table-statistics/</guid>
      <description>
        
        
        &lt;p&gt;&lt;a href=&#34;../../../en/sql-reference/functions/performance-analysis-functions/statistics-management-functions/analyze-statistics/#&#34;&gt;ANALYZE_STATISTICS&lt;/a&gt; collects and aggregates data samples and storage information from all nodes that store projections of the target tables.&lt;/p&gt;
&lt;p&gt;You can set the scope of the collection at several levels:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;#Analyze&#34;&gt;Database&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;#Analyze2&#34;&gt;Table&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;#Analyze3&#34;&gt;Table columns&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;ANALYZE_STATISTICS can also control the size of the data sample that it collects.&lt;/p&gt;
&lt;p&gt;&lt;a name=&#34;Analyze&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&#34;analyze-all-database-tables&#34;&gt;Analyze all database tables&lt;/h2&gt;
&lt;p&gt;If ANALYZE_STATISTICS specifies no table, it collects statistics for all database tables and their projections. For example:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT ANALYZE_STATISTICS (&amp;#39;&amp;#39;);
 ANALYZE_STATISTICS
--------------------
                  0
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;a name=&#34;Analyze2&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&#34;analyze-a-single-table&#34;&gt;Analyze a single table&lt;/h2&gt;
&lt;p&gt;You can compute statistics on a single table as follows:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT ANALYZE_STATISTICS (&amp;#39;public.store_orders_fact&amp;#39;);
 ANALYZE_STATISTICS
--------------------
                  0
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;When you query system table &lt;a href=&#34;../../../en/sql-reference/system-tables/v-catalog-schema/projection-columns/#&#34;&gt;PROJECTION_COLUMNS&lt;/a&gt;, it confirms that statistics have been collected on all table columns for all projections of &lt;code&gt;store_orders_fact&lt;/code&gt;:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT projection_name, statistics_type, table_column_name,statistics_updated_timestamp
    FROM projection_columns WHERE projection_name ilike &amp;#39;store_orders_fact%&amp;#39; AND table_schema=&amp;#39;public&amp;#39;;
   projection_name    | statistics_type | table_column_name | statistics_updated_timestamp
----------------------+-----------------+-------------------+-------------------------------
 store_orders_fact_b0 | FULL            | product_key       | 2019-04-04 18:06:55.747329-04
 store_orders_fact_b0 | FULL            | product_version   | 2019-04-04 18:06:55.747329-04
 store_orders_fact_b0 | FULL            | store_key         | 2019-04-04 18:06:55.747329-04
 store_orders_fact_b0 | FULL            | vendor_key        | 2019-04-04 18:06:55.747329-04
 store_orders_fact_b0 | FULL            | employee_key      | 2019-04-04 18:06:55.747329-04
 store_orders_fact_b0 | FULL            | order_number      | 2019-04-04 18:06:55.747329-04
 store_orders_fact_b0 | FULL            | date_ordered      | 2019-04-04 18:06:55.747329-04
 store_orders_fact_b0 | FULL            | date_shipped      | 2019-04-04 18:06:55.747329-04
 store_orders_fact_b0 | FULL            | quantity_ordered  | 2019-04-04 18:06:55.747329-04
 store_orders_fact_b0 | FULL            | shipper_name      | 2019-04-04 18:06:55.747329-04
 store_orders_fact_b1 | FULL            | product_key       | 2019-04-04 18:06:55.747329-04
 store_orders_fact_b1 | FULL            | product_version   | 2019-04-04 18:06:55.747329-04
...
(20 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;a name=&#34;Analyze3&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&#34;analyze-table-columns&#34;&gt;Analyze table columns&lt;/h2&gt;
&lt;p&gt;Within a table, you can narrow the scope of analysis to a subset of its columns. Doing so can save significant processing overhead for large tables that contain many columns. It is especially useful for tables that you frequently query on specific columns.

&lt;div class=&#34;admonition important&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Important&lt;/h4&gt;
If you collect statistics on specific columns, be sure to include all columns that you are likely to query. If a query includes other columns in that table, the query optimizer regards the statistics as incomplete for that query and ignores them in its plan.
&lt;/div&gt;&lt;/p&gt;
&lt;p&gt;For example, instead of collecting statistics on all columns in &lt;code&gt;store_orders_fact&lt;/code&gt;, you can select only those columns that are frequently queried: &lt;code&gt;product_key&lt;/code&gt;, &lt;code&gt;product_version&lt;/code&gt;, &lt;code&gt;order_number&lt;/code&gt;, and &lt;code&gt;quantity_ordered&lt;/code&gt;:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT DROP_STATISTICS(&amp;#39;public.store_orders_fact&amp;#39;);
=&amp;gt; SELECT ANALYZE_STATISTICS (&amp;#39;public.store_orders_fact&amp;#39;, &amp;#39;product_key, product_version, order_number, quantity_ordered&amp;#39;);
 ANALYZE_STATISTICS
--------------------
                  0
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;If you query &lt;code&gt;PROJECTION_COLUMNS&lt;/code&gt; again, it returns the following results:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT projection_name, statistics_type, table_column_name,statistics_updated_timestamp
    FROM projection_columns WHERE projection_name ilike &amp;#39;store_orders_fact%&amp;#39; AND table_schema=&amp;#39;public&amp;#39;;
   projection_name    | statistics_type | table_column_name | statistics_updated_timestamp
----------------------+-----------------+-------------------+------------------------------
 store_orders_fact_b0 | FULL            | product_key       | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b0 | FULL            | product_version   | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b0 | ROWCOUNT        | store_key         | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b0 | ROWCOUNT        | vendor_key        | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b0 | ROWCOUNT        | employee_key      | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b0 | FULL            | order_number      | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b0 | ROWCOUNT        | date_ordered      | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b0 | ROWCOUNT        | date_shipped      | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b0 | FULL            | quantity_ordered  | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b0 | ROWCOUNT        | shipper_name      | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b1 | FULL            | product_key       | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b1 | FULL            | product_version   | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b1 | ROWCOUNT        | store_key         | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b1 | ROWCOUNT        | vendor_key        | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b1 | ROWCOUNT        | employee_key      | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b1 | FULL            | order_number      | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b1 | ROWCOUNT        | date_ordered      | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b1 | ROWCOUNT        | date_shipped      | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b1 | FULL            | quantity_ordered  | 2019-04-04 18:09:40.05452-04
 store_orders_fact_b1 | ROWCOUNT        | shipper_name      | 2019-04-04 18:09:40.05452-04
(20 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;In this case, &lt;code&gt;statistics_type&lt;/code&gt; is set to &lt;code&gt;FULL&lt;/code&gt; only for those columns on which you ran ANALYZE_STATISTICS. For the remaining table columns, it is set to &lt;code&gt;ROWCOUNT&lt;/code&gt;, indicating that only &lt;a href=&#34;../../../en/admin/collecting-db-statistics/analyzing-row-counts/&#34;&gt;row statistics&lt;/a&gt; were collected for them.

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

ANALYZE_STATISTICS always invokes ANALYZE_ROW_COUNT on all table columns, even if ANALYZE_STATISTICS specifies a subset of those columns.

&lt;/div&gt;&lt;/p&gt;
&lt;h2 id=&#34;data-collection-percentage&#34;&gt;Data collection percentage&lt;/h2&gt;
&lt;p&gt;By default, OpenText™ Analytics Database collects a fixed 10-percent sample of statistical data from disk. Specifying a percentage of data to read from disk gives you more control over deciding between sample accuracy and speed.&lt;/p&gt;
&lt;p&gt;The percentage of data you collect affects collection time and accuracy:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;A smaller percentage is faster but returns a smaller data sample, which might compromise histogram accuracy.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;A larger percentage reads more data off disk. Data collection is slower, but &lt;em&gt;a larger data sample enables greater histogram accuracy.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For example:&lt;/p&gt;
&lt;p&gt;Collect data on all projections for &lt;code&gt;shipping_dimension&lt;/code&gt; from 20 percent of the disk:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT ANALYZE_STATISTICS (&amp;#39;shipping_dimension&amp;#39;, 20);
 ANALYZE_STATISTICS
-------------------
                 0
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Collect data from the entire disk by setting the &lt;code&gt;percent&lt;/code&gt; parameter to 100:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT ANALYZE_STATISTICS (&amp;#39;shipping_dimension&amp;#39;, &amp;#39;shipping_key&amp;#39;, 100);
 ANALYZE_STATISTICS
--------------------
                  0
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;sampling-size&#34;&gt;Sampling size&lt;/h2&gt;
&lt;p&gt;ANALYZE_STATISTICS constructs a column histogram from a set of rows that it randomly selects from all collected data. Regardless of the percentage setting, the function always creates a statistical sample that contains up to (approximately) the smaller of:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;2&lt;sup&gt;17&lt;/sup&gt; (131,072) rows&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Number of rows that fit in 1 GB of memory&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If a column has fewer rows than the maximum sample size, ANALYZE_STATISTICS reads all rows from disk and analyzes the entire column.

&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

The data collected in a sample range does not indicate how data should be distributed.

&lt;/div&gt;&lt;/p&gt;
&lt;p&gt;The following table shows how ANALYZE_STATISTICS, when set to different percentages, obtains a statistical sample from a given column:

&lt;table class=&#34;table table-bordered&#34;&gt;
&lt;tr&gt;
&lt;th class=&#34;hcenter&#34;&gt;Number of column rows&lt;/th&gt;
&lt;th class=&#34;hcenter&#34;&gt;%&lt;/th&gt;
&lt;th class=&#34;hcenter&#34;&gt;Number of rows read&lt;/th&gt;
&lt;th class=&#34;hcenter&#34;&gt;Number of sampled rows&lt;/th&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&#34;hright&#34;&gt;&lt;code&gt;&amp;lt;=&lt;/code&gt;&lt;em&gt;&lt;code&gt;max-sample-size&lt;/code&gt;&lt;/em&gt;&lt;/td&gt;
&lt;td class=&#34;hright&#34;&gt;20&lt;/td&gt;
&lt;td class=&#34;hright&#34;&gt;All&lt;/td&gt;
&lt;td class=&#34;hright&#34;&gt;All&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&#34;hright&#34;&gt;400K&lt;/td&gt;
&lt;td class=&#34;hright&#34;&gt;10&lt;/td&gt;
&lt;td class=&#34;hright&#34;&gt;&lt;em&gt;&lt;code&gt;max-sample-size&lt;/code&gt;&lt;/em&gt;&lt;/td&gt;
&lt;td class=&#34;hright&#34;&gt;&lt;em&gt;&lt;code&gt;max-sample-size&lt;/code&gt;&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td class=&#34;hright&#34;&gt;4000K&lt;/td&gt;
&lt;td class=&#34;hright&#34;&gt;10&lt;/td&gt;
&lt;td class=&#34;hright&#34;&gt;400K&lt;/td&gt;
&lt;td class=&#34;hright&#34;&gt;&lt;em&gt;&lt;code&gt;max-sample-size&lt;/code&gt;&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;&lt;/table&gt;


&lt;div class=&#34;alert admonition note&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Note&lt;/h4&gt;

When a column specified for ANALYZE_STATISTICS is first in a projection&#39;s sort order, the function reads all data from disk to avoid a biased sample.

&lt;/div&gt;&lt;/p&gt;

      </description>
    </item>
    
    <item>
      <title>Admin: Collecting partition statistics</title>
      <link>/en/admin/collecting-db-statistics/collecting-partition-statistics/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/admin/collecting-db-statistics/collecting-partition-statistics/</guid>
      <description>
        
        
        &lt;p&gt;&lt;a href=&#34;../../../en/sql-reference/functions/performance-analysis-functions/statistics-management-functions/analyze-statistics-partition/#&#34;&gt;ANALYZE_STATISTICS_PARTITION&lt;/a&gt; collects and aggregates data samples and storage information for a range of partitions in the specified table. OpenText™ Analytics Database writes the collected statistics to the database catalog.&lt;/p&gt;
&lt;p&gt;For example, the following table stores sales data and is partitioned by order dates:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;CREATE TABLE public.store_orders_fact
(
    product_key int,
    product_version int,
    store_key int,
    vendor_key int,
    employee_key int,
    order_number int,
    date_ordered date NOT NULL,
    date_shipped date NOT NULL,
    quantity_ordered int,
    shipper_name varchar(32)
);

ALTER TABLE public.store_orders_fact PARTITION BY date_ordered::DATE GROUP BY CALENDAR_HIERARCHY_DAY(date_ordered::DATE, 2, 2) REORGANIZE;
ALTER TABLE public.store_orders_fact ADD CONSTRAINT fk_store_orders_product FOREIGN KEY (product_key, product_version) references public.product_dimension (product_key, product_version);
ALTER TABLE public.store_orders_fact ADD CONSTRAINT fk_store_orders_vendor FOREIGN KEY (vendor_key) references public.vendor_dimension (vendor_key);
ALTER TABLE public.store_orders_fact ADD CONSTRAINT fk_store_orders_employee FOREIGN KEY (employee_key) references public.employee_dimension (employee_key);
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;At the end of each business day you might call ANALYZE_STATISTICS_PARTITION and collect statistics on all data of the latest (today&#39;s) partition:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT ANALYZE_STATISTICS_PARTITION(&amp;#39;public.store_orders_fact&amp;#39;, CURRENT_DATE::VARCHAR(10), CURRENT_DATE::VARCHAR(10));
 ANALYZE_STATISTICS_PARTITION
------------------------------
                            0
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The function produces a set of fresh statistics for the most recent partition in &lt;code&gt;public.store_orders_fact&lt;/code&gt;. If you query this table each morning on yesterday&#39;s sales, the optimizer uses these statistics to generate an optimized query plan:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; EXPLAIN SELECT COUNT(*) FROM public.store_orders_fact WHERE date_ordered = CURRENT_DATE-1;

                                           QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 QUERY PLAN DESCRIPTION:
 ------------------------------

 EXPLAIN SELECT COUNT(*) FROM public.store_orders_fact WHERE date_ordered = CURRENT_DATE-1;

 Access Path:
 +-GROUPBY NOTHING [Cost: 2, Rows: 1] (PATH ID: 1)
 |  Aggregates: count(*)
 |  Execute on: All Nodes
 | +---&amp;gt; STORAGE ACCESS for store_orders_fact [Cost: 1, Rows: 222(PARTITION-LEVEL STATISTICS)] (PATH ID: 2)
 | |      Projection: public.store_orders_fact_v1_b1
 | |      Filter: (store_orders_fact.date_ordered = &amp;#39;2019-04-01&amp;#39;::date)
 | |      Execute on: All Nodes
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;narrowing-the-collection-scope&#34;&gt;Narrowing the collection scope&lt;/h2&gt;
&lt;p&gt;Like &lt;a href=&#34;../../../en/sql-reference/functions/performance-analysis-functions/statistics-management-functions/analyze-statistics/#&#34;&gt;ANALYZE_STATISTICS&lt;/a&gt;, ANALYZE_STATISTICS_PARTITION lets you narrow the scope of analysis to a &lt;a href=&#34;../../../en/admin/collecting-db-statistics/collecting-table-statistics/#Analyze3&#34;&gt;subset of a table&#39;s columns&lt;/a&gt;. You can also control the size of the data sample that it collects. For details on these options, see &lt;a href=&#34;../../../en/admin/collecting-db-statistics/collecting-table-statistics/#&#34;&gt;Collecting table statistics&lt;/a&gt;.&lt;/p&gt;
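&lt;p&gt;For example, assuming the optional column-list and percent arguments of ANALYZE_STATISTICS_PARTITION follow the same conventions as ANALYZE_STATISTICS, a hypothetical call might restrict collection for the latest partition of &lt;code&gt;store_orders_fact&lt;/code&gt; to two columns and a 20-percent sample:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT ANALYZE_STATISTICS_PARTITION(&amp;#39;public.store_orders_fact&amp;#39;, CURRENT_DATE::VARCHAR(10), CURRENT_DATE::VARCHAR(10), &amp;#39;product_key, order_number&amp;#39;, 20);
&lt;/code&gt;&lt;/pre&gt;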
&lt;h2 id=&#34;collecting-statistics-on-multiple-partition-ranges&#34;&gt;Collecting statistics on multiple partition ranges&lt;/h2&gt;
&lt;p&gt;If you specify multiple partitions, they must be continuous. Different collections of statistics can overlap. For example, the following table t1 is partitioned on column &lt;code&gt;c1&lt;/code&gt;:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT export_tables(&amp;#39;&amp;#39;,&amp;#39;t1&amp;#39;);
                                          export_tables
-------------------------------------------------------------------------------------------------
CREATE TABLE public.t1
(
    a int,
    b int,
    c1 int NOT NULL
)
PARTITION BY (t1.c1);


=&amp;gt; SELECT * FROM t1 ORDER BY c1;
 a  | b  | c1
----+----+----
  1 |  2 |  3
  4 |  5 |  6
  7 |  8 |  9
 10 | 11 | 12
(4 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Given this dataset, you can call ANALYZE_STATISTICS_PARTITION on &lt;code&gt;t1&lt;/code&gt; twice. The successive calls collect statistics for two overlapping ranges of partition keys, 3 through 9 and 6 through 12:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT drop_statistics_partition(&amp;#39;t1&amp;#39;, &amp;#39;&amp;#39;, &amp;#39;&amp;#39;);
 drop_statistics_partition
---------------------------
                         0
(1 row)

=&amp;gt; SELECT analyze_statistics_partition(&amp;#39;t1&amp;#39;, &amp;#39;3&amp;#39;, &amp;#39;9&amp;#39;);
 analyze_statistics_partition
------------------------------
                            0
(1 row)

=&amp;gt; SELECT analyze_statistics_partition(&amp;#39;t1&amp;#39;, &amp;#39;6&amp;#39;, &amp;#39;12&amp;#39;);
 analyze_statistics_partition
------------------------------
                            0
(1 row)

=&amp;gt; SELECT table_name, min_partition_key, max_partition_key, row_count FROM table_statistics WHERE table_name = &amp;#39;t1&amp;#39;;
 table_name | min_partition_key | max_partition_key | row_count
------------+-------------------+-------------------+-----------
 t1         | 3                 | 9                 |         3
 t1         | 6                 | 12                |         3
(2 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;If two statistics collections overlap, the database stores only the most recent statistics for each partition range. Thus, given the previous example, the database uses only statistics from the second collection for partition keys 6 through 9.&lt;/p&gt;
&lt;p&gt;Statistics that are collected for a given range of partition keys always supersede statistics that were previously collected for a subset of that range. For example, given a call to ANALYZE_STATISTICS_PARTITION that specifies partition keys 3 through 12, the collected statistics are a superset of the two sets of statistics collected earlier, so it supersedes both:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;
=&amp;gt; SELECT analyze_statistics_partition(&amp;#39;t1&amp;#39;, &amp;#39;3&amp;#39;, &amp;#39;12&amp;#39;);
 analyze_statistics_partition
------------------------------
                            0
(1 row)

=&amp;gt; SELECT table_name, min_partition_key, max_partition_key, row_count FROM table_statistics WHERE table_name = &amp;#39;t1&amp;#39;;
 table_name | min_partition_key | max_partition_key | row_count
------------+-------------------+-------------------+-----------
 t1         | 3                 | 12                |         4
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Finally, ANALYZE_STATISTICS_PARTITION collects statistics on partition keys 3 through 6. This collection is a subset of the previous collection, so the database retains both sets and uses the latest statistics from each:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;
=&amp;gt; SELECT analyze_statistics_partition(&amp;#39;t1&amp;#39;, &amp;#39;3&amp;#39;, &amp;#39;6&amp;#39;);
 analyze_statistics_partition
------------------------------
                            0
(1 row)

=&amp;gt; SELECT table_name, min_partition_key, max_partition_key, row_count FROM table_statistics WHERE table_name = &amp;#39;t1&amp;#39;;
 table_name | min_partition_key | max_partition_key | row_count
------------+-------------------+-------------------+-----------
 t1         | 3                 | 12                |         4
 t1         | 3                 | 6                 |         2
(2 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;a name=&#34;Supporte&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&#34;supported-datetime-functions&#34;&gt;Supported date/time functions&lt;/h2&gt;
&lt;p&gt;ANALYZE_STATISTICS_PARTITION can collect partition-level statistics on tables where the partition expression specifies one of the following date/time functions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/sql-reference/functions/data-type-specific-functions/datetime-functions/date/#&#34;&gt;DATE&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/sql-reference/functions/data-type-specific-functions/datetime-functions/date-part/#&#34;&gt;DATE_PART&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/sql-reference/functions/data-type-specific-functions/datetime-functions/date-trunc/#&#34;&gt;DATE_TRUNC&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/sql-reference/functions/data-type-specific-functions/datetime-functions/day/#&#34;&gt;DAY&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/sql-reference/functions/data-type-specific-functions/datetime-functions/dayofmonth/#&#34;&gt;DAYOFMONTH&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/sql-reference/functions/data-type-specific-functions/datetime-functions/dayofyear/#&#34;&gt;DAYOFYEAR&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/sql-reference/functions/data-type-specific-functions/datetime-functions/days/#&#34;&gt;DAYS&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/sql-reference/functions/data-type-specific-functions/datetime-functions/extract/#&#34;&gt;EXTRACT&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/sql-reference/functions/data-type-specific-functions/datetime-functions/hour/#&#34;&gt;HOUR&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/sql-reference/functions/data-type-specific-functions/datetime-functions/minute/#&#34;&gt;MINUTE&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/sql-reference/functions/data-type-specific-functions/datetime-functions/month/#&#34;&gt;MONTH&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/sql-reference/functions/data-type-specific-functions/datetime-functions/quarter/#&#34;&gt;QUARTER&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/sql-reference/functions/data-type-specific-functions/datetime-functions/week/#&#34;&gt;WEEK&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/sql-reference/functions/data-type-specific-functions/datetime-functions/week-iso/#&#34;&gt;WEEK_ISO&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/sql-reference/functions/data-type-specific-functions/datetime-functions/year/#&#34;&gt;YEAR&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href=&#34;../../../en/sql-reference/functions/data-type-specific-functions/datetime-functions/year-iso/#&#34;&gt;YEAR_ISO&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
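&lt;p&gt;For example, a hypothetical table &lt;code&gt;public.orders&lt;/code&gt; partitioned with &lt;code&gt;PARTITION BY YEAR(order_date)&lt;/code&gt; yields integer partition keys, so you could collect statistics for the 2018 and 2019 partitions as follows:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT ANALYZE_STATISTICS_PARTITION(&amp;#39;public.orders&amp;#39;, &amp;#39;2018&amp;#39;, &amp;#39;2019&amp;#39;);
&lt;/code&gt;&lt;/pre&gt;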
&lt;h2 id=&#34;requirements-and-restrictions&#34;&gt;Requirements and restrictions&lt;/h2&gt;
&lt;p&gt;The following requirements and restrictions apply to ANALYZE_STATISTICS_PARTITION:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The table must be partitioned and cannot contain unpartitioned data.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The table partition expression must specify a single column. The following expressions are supported:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Expressions that specify only the column—that is, partition on all column values. For example:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;PARTITION BY ship_date GROUP BY CALENDAR_HIERARCHY_DAY(ship_date, 2, 2)
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If the column is a &lt;a href=&#34;../../../en/sql-reference/data-types/datetime-data-types/date/#&#34;&gt;DATE&lt;/a&gt; or &lt;a href=&#34;../../../en/sql-reference/data-types/datetime-data-types/timestamptimestamptz/#&#34;&gt;TIMESTAMP/TIMESTAMPTZ&lt;/a&gt;, the partition expression can specify a &lt;a href=&#34;../../../en/admin/collecting-db-statistics/collecting-partition-statistics/#Supporte&#34;&gt;supported date/time function&lt;/a&gt; that returns that column or any portion of it, such as month or year. For example, the following partition expression specifies to partition on the year portion of column &lt;code&gt;order_date&lt;/code&gt;:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;PARTITION BY YEAR(order_date)
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Expressions that perform addition or subtraction on the column. For example:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;PARTITION BY YEAR(order_date) -1
&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The table partition expression cannot coerce the specified column to another data type.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The database collects no statistics from the following projections:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Live aggregate and Top-K projections&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Projections that are defined to include an SQL function within an expression&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;


      </description>
    </item>
    
    <item>
      <title>Admin: Analyzing row counts</title>
      <link>/en/admin/collecting-db-statistics/analyzing-row-counts/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/admin/collecting-db-statistics/analyzing-row-counts/</guid>
      <description>
        
        
        &lt;p&gt;OpenText™ Analytics Database lets you obtain row counts for projections and for external tables, through &lt;a href=&#34;../../../en/sql-reference/functions/management-functions/storage-functions/do-tm-task/&#34;&gt;ANALYZE_ROW_COUNT&lt;/a&gt; and &lt;a href=&#34;../../../en/sql-reference/functions/performance-analysis-functions/statistics-management-functions/analyze-external-row-count/#&#34;&gt;ANALYZE_EXTERNAL_ROW_COUNT&lt;/a&gt;, respectively.&lt;/p&gt;
&lt;h2 id=&#34;projection-row-count&#34;&gt;Projection row count&lt;/h2&gt;
&lt;p&gt;ANALYZE_ROW_COUNT is a lightweight operation that collects a minimal set of statistics and aggregate row counts for a projection, and saves them in the database catalog. In many cases, this data satisfies optimizer requirements for producing optimal query plans. The operation is invoked on the following occasions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;At the time intervals specified by configuration parameter &lt;a href=&#34;../../../en/sql-reference/config-parameters/general-parameters/#AnalyzeRowCountInterval&#34;&gt;AnalyzeRowCountInterval&lt;/a&gt;—by default, once a day.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;During loads, the database updates the catalog with the current aggregate row count data for a given table when the percentage of difference between the last-recorded aggregate projection row count and current row count exceeds the setting in configuration parameter &lt;a href=&#34;../../../en/sql-reference/config-parameters/projection-parameters/#ARCCommitPercentage&#34;&gt;ARCCommitPercentage&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;On calls to meta-functions &lt;a href=&#34;../../../en/sql-reference/functions/performance-analysis-functions/statistics-management-functions/analyze-statistics/#&#34;&gt;ANALYZE_STATISTICS&lt;/a&gt; and &lt;a href=&#34;../../../en/sql-reference/functions/performance-analysis-functions/statistics-management-functions/analyze-statistics-partition/#&#34;&gt;ANALYZE_STATISTICS_PARTITION&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
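&lt;p&gt;For example, assuming ARCCommitPercentage is set like other database-level configuration parameters shown in this guide, you might lower it so that aggregate row counts are committed more frequently during loads (the value here is hypothetical):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; ALTER DATABASE DEFAULT SET ARCCommitPercentage = 3;
ALTER DATABASE
&lt;/code&gt;&lt;/pre&gt;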
&lt;p&gt;You can explicitly invoke ANALYZE_ROW_COUNT through calls to &lt;a href=&#34;../../../en/sql-reference/functions/management-functions/storage-functions/do-tm-task/#&#34;&gt;DO_TM_TASK&lt;/a&gt;. For example:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT DO_TM_TASK(&amp;#39;analyze_row_count&amp;#39;, &amp;#39;store_orders_fact_b0&amp;#39;);
                                              do_tm_task
------------------------------------------------------------------------------------------------------
 Task: row count analyze
(Table: public.store_orders_fact) (Projection: public.store_orders_fact_b0)

(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;You can change the intervals when the database regularly collects row-level statistics by setting configuration parameter AnalyzeRowCountInterval. For example, you can change the collection interval to 1 hour (3600 seconds):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; ALTER DATABASE DEFAULT SET AnalyzeRowCountInterval = 3600;
ALTER DATABASE
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;external-table-row-count&#34;&gt;External table row count&lt;/h2&gt;
&lt;p&gt;&lt;a href=&#34;../../../en/sql-reference/functions/performance-analysis-functions/statistics-management-functions/analyze-external-row-count/#&#34;&gt;ANALYZE_EXTERNAL_ROW_COUNT&lt;/a&gt; calculates the exact number of rows in an external table. The optimizer uses this count when it plans queries that access external tables. It is especially useful when an external table participates in a join: the row count enables the optimizer to identify the smaller table to use as the inner input of the join, which facilitates better query performance.&lt;/p&gt;
&lt;p&gt;The following query calculates the exact number of rows in the external table &lt;code&gt;loader_rejects&lt;/code&gt;:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT ANALYZE_EXTERNAL_ROW_COUNT(&amp;#39;loader_rejects&amp;#39;);
 ANALYZE_EXTERNAL_ROW_COUNT
----------------------------
                 0
&lt;/code&gt;&lt;/pre&gt;
      </description>
    </item>
    
    <item>
      <title>Admin: Canceling statistics collection</title>
      <link>/en/admin/collecting-db-statistics/canceling-statistics-collection/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/admin/collecting-db-statistics/canceling-statistics-collection/</guid>
      <description>
        
        
&lt;p&gt;To cancel statistics collection while it is in progress, press CTRL-C in &lt;a class=&#34;glosslink&#34; href=&#34;../../../en/glossary/vsql/&#34; title=&#34;For more information, see Installing the vsql Client and the more general topic, Using vsql.&#34;&gt;vsql&lt;/a&gt; or call the &lt;a href=&#34;../../../en/sql-reference/functions/management-functions/session-functions/interrupt-statement/&#34;&gt;INTERRUPT_STATEMENT()&lt;/a&gt; function.&lt;/p&gt;
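&lt;p&gt;For example, you can interrupt a long-running ANALYZE_STATISTICS call from another session by passing its session ID and statement ID to INTERRUPT_STATEMENT. The session ID shown here is hypothetical; query the SESSIONS system table for actual values:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT INTERRUPT_STATEMENT(&amp;#39;v_vmart_node0001-12345:0x1a2b&amp;#39;, 1);
&lt;/code&gt;&lt;/pre&gt;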
&lt;p&gt;To remove statistics for a given table or statistics type, call the &lt;a href=&#34;../../../en/sql-reference/functions/performance-analysis-functions/statistics-management-functions/drop-statistics/&#34;&gt;DROP_STATISTICS()&lt;/a&gt; function.

&lt;div class=&#34;admonition caution&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Caution&lt;/h4&gt;

After you drop statistics, it can be time consuming to regenerate them.

&lt;/div&gt;&lt;/p&gt;
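&lt;p&gt;For example, the following call (a sketch; substitute your own table name) removes statistics for all columns of table &lt;code&gt;public.trades&lt;/code&gt;:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT DROP_STATISTICS(&amp;#39;public.trades&amp;#39;);
&lt;/code&gt;&lt;/pre&gt;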

      </description>
    </item>
    
    <item>
      <title>Admin: Getting data on table statistics</title>
      <link>/en/admin/collecting-db-statistics/getting-data-on-table-statistics/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/admin/collecting-db-statistics/getting-data-on-table-statistics/</guid>
      <description>
        
        
        &lt;p&gt;OpenText™ Analytics Database provides information about statistics for a given table and its columns and partitions in two ways:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The query optimizer notifies you about the availability of statistics to process a given query.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;System table 
&lt;code&gt;&lt;a href=&#34;../../../en/sql-reference/system-tables/v-catalog-schema/projection-columns/#&#34;&gt;PROJECTION_COLUMNS&lt;/a&gt;&lt;/code&gt; shows what types of statistics are available for the table columns, and when they were last updated.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;query-evaluation&#34;&gt;Query evaluation&lt;/h2&gt;
&lt;p&gt;During predicate selectivity estimation, the query optimizer can identify when histograms are not available or are out of date. If the value in the predicate is outside the histogram&#39;s maximum range, the statistics are stale. If no histograms are available, then no statistics are available to the plan.&lt;/p&gt;
&lt;p&gt;When the optimizer detects stale or no statistics, such as when it encounters a column predicate for which it has no histogram, the optimizer performs the following actions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Displays and logs a message that you should run 
&lt;code&gt;&lt;a href=&#34;../../../en/sql-reference/functions/performance-analysis-functions/statistics-management-functions/analyze-statistics/#&#34;&gt;ANALYZE_STATISTICS&lt;/a&gt;&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Annotates 
&lt;code&gt;&lt;a href=&#34;../../../en/sql-reference/statements/explain/#&#34;&gt;EXPLAIN&lt;/a&gt;&lt;/code&gt;-generated query plans with a statistics entry.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Ignores stale statistics when it generates a query plan. The optimizer uses other considerations to create a query plan, such as FK-PK constraints.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For example, the following query plan fragment shows no statistics (histograms unavailable):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;| | +-- Outer -&amp;gt; STORAGE ACCESS for fact [Cost: 604, Rows: 10K (&lt;span class=&#34;code-input&#34;&gt;NO STATISTICS&lt;/span&gt;)]
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The following query plan fragment shows that the predicate falls outside the histogram range:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;| | +-- Outer -&amp;gt; STORAGE ACCESS for fact [Cost: 35, Rows: 1 (&lt;span class=&#34;code-input&#34;&gt;PREDICATE VALUE OUT-OF-RANGE&lt;/span&gt;)]
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;a name=&#34;Statisti&#34;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&#34;statistics-data-in-projection_columns&#34;&gt;Statistics data in PROJECTION_COLUMNS&lt;/h2&gt;
&lt;p&gt;Two columns in system table 
&lt;code&gt;&lt;a href=&#34;../../../en/sql-reference/system-tables/v-catalog-schema/projection-columns/#&#34;&gt;PROJECTION_COLUMNS&lt;/a&gt;&lt;/code&gt; show the status of each table column&#39;s statistics, as follows:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;STATISTICS_TYPE&lt;/code&gt; returns the type of statistics that are available for this column, one of the following: &lt;code&gt;NONE&lt;/code&gt;, &lt;code&gt;ROWCOUNT&lt;/code&gt;, or &lt;code&gt;FULL&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;STATISTICS_UPDATED_TIMESTAMP&lt;/code&gt; returns the last time statistics were collected for this column.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For example, the following sample schema defines a table named &lt;code&gt;trades&lt;/code&gt;, which groups the highly correlated columns &lt;code&gt;bid&lt;/code&gt; and &lt;code&gt;ask&lt;/code&gt; and stores the &lt;code&gt;stock&lt;/code&gt; column separately:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; CREATE TABLE trades (stock CHAR(5), bid INT, ask INT);
=&amp;gt; CREATE PROJECTION trades_p (
     stock ENCODING RLE, GROUPED(bid ENCODING DELTAVAL, ask))
     AS (SELECT * FROM trades) ORDER BY stock, bid;
=&amp;gt; INSERT INTO trades VALUES(&amp;#39;acme&amp;#39;, 10, 20);
=&amp;gt; COMMIT;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Query the &lt;code&gt;PROJECTION_COLUMNS&lt;/code&gt; table for table &lt;code&gt;trades&lt;/code&gt;:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT table_name AS table, projection_name AS projection, table_column_name AS column, statistics_type, statistics_updated_timestamp AS last_updated
     FROM projection_columns WHERE table_name = &amp;#39;trades&amp;#39;;
 table  | projection  | column | statistics_type | last_updated
--------+-------------+--------+-----------------+--------------
 trades | trades_p_b0 | stock  | NONE            |
 trades | trades_p_b0 | bid    | NONE            |
 trades | trades_p_b0 | ask    | NONE            |
 trades | trades_p_b1 | stock  | NONE            |
 trades | trades_p_b1 | bid    | NONE            |
 trades | trades_p_b1 | ask    | NONE            |
(6 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The &lt;code&gt;statistics_type&lt;/code&gt; column returns &lt;code&gt;NONE&lt;/code&gt; for all columns in the &lt;code&gt;trades&lt;/code&gt; table, while &lt;code&gt;statistics_updated_timestamp&lt;/code&gt; is empty because statistics have not yet been collected on this table.&lt;/p&gt;
&lt;p&gt;Now, run 
&lt;code&gt;&lt;a href=&#34;../../../en/sql-reference/functions/performance-analysis-functions/statistics-management-functions/analyze-statistics/#&#34;&gt;ANALYZE_STATISTICS&lt;/a&gt;&lt;/code&gt; on the &lt;code&gt;stock&lt;/code&gt; column:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT ANALYZE_STATISTICS (&amp;#39;public.trades&amp;#39;, &amp;#39;stock&amp;#39;);
 ANALYZE_STATISTICS
--------------------
                  0
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Now, when you query &lt;code&gt;PROJECTION_COLUMNS&lt;/code&gt;, it returns the following results:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT table_name AS table, projection_name AS projection, table_column_name AS column, statistics_type, statistics_updated_timestamp AS last_updated
     FROM projection_columns WHERE table_name = &amp;#39;trades&amp;#39;;
 table  | projection  | column | statistics_type |         last_updated
--------+-------------+--------+-----------------+-------------------------------
 trades | trades_p_b0 | stock  | FULL            | 2019-04-03 12:00:12.231564-04
 trades | trades_p_b0 | bid    | ROWCOUNT        | 2019-04-03 12:00:12.231564-04
 trades | trades_p_b0 | ask    | ROWCOUNT        | 2019-04-03 12:00:12.231564-04
 trades | trades_p_b1 | stock  | FULL            | 2019-04-03 12:00:12.231564-04
 trades | trades_p_b1 | bid    | ROWCOUNT        | 2019-04-03 12:00:12.231564-04
 trades | trades_p_b1 | ask    | ROWCOUNT        | 2019-04-03 12:00:12.231564-04
(6 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This time, the query results contain several changes:&lt;/p&gt;
&lt;table class=&#34;table table-bordered&#34;&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;statistics_type&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Set to &lt;code&gt;FULL&lt;/code&gt; for the &lt;code&gt;stock&lt;/code&gt; column, confirming that full statistics were run on this column.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Set to &lt;code&gt;ROWCOUNT&lt;/code&gt; for the &lt;code&gt;bid&lt;/code&gt; and &lt;code&gt;ask&lt;/code&gt; columns, confirming that &lt;code&gt;ANALYZE_STATISTICS&lt;/code&gt; always invokes &lt;code&gt;ANALYZE_ROW_COUNT&lt;/code&gt; on all table columns, even if &lt;code&gt;ANALYZE_STATISTICS&lt;/code&gt; specifies a subset of those columns.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;statistics_updated_timestamp&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Set to the same timestamp for all columns, confirming that statistics (either full or row count) were updated on all.&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

      </description>
    </item>
    
    <item>
      <title>Admin: Best practices for statistics collection</title>
      <link>/en/admin/collecting-db-statistics/best-practices-statistics-collection/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/admin/collecting-db-statistics/best-practices-statistics-collection/</guid>
      <description>
        
        
&lt;p&gt;You should call ANALYZE_STATISTICS or ANALYZE_STATISTICS_PARTITION when one or more of the following conditions are true:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Data is bulk loaded for the first time.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;A new projection is refreshed.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The number of rows changes significantly.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;A new column is added to the table.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Column minimum/maximum values change significantly.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;New primary key values with referential integrity constraints are added. Re-analyze both the primary key and foreign key tables.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Table size notably changes relative to other tables it is joined to—for example, a table that was 50 times larger than another table is now only five times larger.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;A notable deviation in data distribution necessitates recalculating histograms—for example, an event causes abnormally high levels of trading for a particular stock.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The database is inactive for an extended period of time.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
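&lt;p&gt;For example, after bulk loading a table for the first time, you might collect full statistics on it (the table name here is illustrative):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT ANALYZE_STATISTICS(&amp;#39;public.store_orders_fact&amp;#39;);
&lt;/code&gt;&lt;/pre&gt;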
&lt;h2 id=&#34;overhead-considerations&#34;&gt;Overhead considerations&lt;/h2&gt;
&lt;p&gt;Running ANALYZE_STATISTICS is an efficient but potentially long-running operation. You can run it concurrently with queries and loads in a production environment. However, the function can incur considerable overhead on system resources (CPU and memory), at the expense of queries and load operations. To minimize overhead, consider calling ANALYZE_STATISTICS_PARTITION on those partitions that are subject to significant activity—typically, the most recently loaded partitions, including the table&#39;s active partition. You can further narrow the scope of both functions by specifying a subset of the table columns—generally, those that are queried most often.&lt;/p&gt;
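&lt;p&gt;For example, assuming a table partitioned by date, the following sketch analyzes statistics only for the most recently loaded date range (the table name and range are illustrative):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT ANALYZE_STATISTICS_PARTITION(&amp;#39;public.store_orders_fact&amp;#39;, &amp;#39;2025-01-01&amp;#39;, &amp;#39;2025-01-31&amp;#39;);
&lt;/code&gt;&lt;/pre&gt;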
&lt;h2 id=&#34;related-tools&#34;&gt;Related tools&lt;/h2&gt;
&lt;p&gt;You can diagnose and resolve many statistics-related issues by calling &lt;a href=&#34;../../../en/sql-reference/functions/performance-analysis-functions/workload-management-functions/analyze-workload/&#34;&gt;ANALYZE_WORKLOAD&lt;/a&gt;, which returns tuning recommendations. If you update statistics and find that a query still performs poorly, run it through the Database Designer and choose &lt;a href=&#34;../../../en/admin/configuring-db/creating-db-design/general-design-settings/#Incremen&#34;&gt;incremental&lt;/a&gt; as the design type.&lt;/p&gt;
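&lt;p&gt;For example, the following sketch requests tuning recommendations scoped to a single table (the table name is illustrative):&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT ANALYZE_WORKLOAD(&amp;#39;public.trades&amp;#39;);
&lt;/code&gt;&lt;/pre&gt;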

      </description>
    </item>
    
  </channel>
</rss>
