<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>OpenText Analytics Database 26.2.x – Text search functions</title>
    <link>/en/sql-reference/functions/match-and-search-functions/text-search-functions/</link>
    <description>Recent content in Text search functions on OpenText Analytics Database 26.2.x</description>
    <generator>Hugo -- gohugo.io</generator>
    
	  <atom:link href="/en/sql-reference/functions/match-and-search-functions/text-search-functions/index.xml" rel="self" type="application/rss+xml" />
    
    
      
        
      
    
    
    <item>
      <title>Sql-Reference: DELETE_TOKENIZER_CONFIG_FILE</title>
      <link>/en/sql-reference/functions/match-and-search-functions/text-search-functions/delete-tokenizer-config-file/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/sql-reference/functions/match-and-search-functions/text-search-functions/delete-tokenizer-config-file/</guid>
      <description>
        
        
        &lt;p&gt;Deletes a tokenizer configuration file.&lt;/p&gt;
&lt;h2 id=&#34;syntax&#34;&gt;Syntax&lt;/h2&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;SELECT v_txtindex.DELETE_TOKENIZER_CONFIG_FILE (USING PARAMETERS proc_oid=&amp;#39;&lt;span class=&#34;code-variable&#34;&gt;proc_oid&lt;/span&gt;&amp;#39;, confirm={true | false });
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;parameters&#34;&gt;Parameters&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;code&gt;confirm = [true | false]&lt;/code&gt;&lt;/dt&gt;
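&lt;dd&gt;Boolean flag. Indicates whether the configuration file should be removed even if the tokenizer is still in use, that is, when its &lt;code&gt;used&lt;/code&gt; parameter is &lt;code&gt;True&lt;/code&gt;.
&lt;p&gt;&lt;code&gt;True&lt;/code&gt;: Force deletion of the configuration file even when the tokenizer&#39;s &lt;code&gt;used&lt;/code&gt; parameter is &lt;code&gt;True&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;False&lt;/code&gt;: Delete the configuration file only if the tokenizer&#39;s &lt;code&gt;used&lt;/code&gt; parameter is &lt;code&gt;False&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Default:&lt;/strong&gt; &lt;code&gt;False&lt;/code&gt;&lt;/p&gt;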
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;proc_oid&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;A unique identifier assigned to a tokenizer when it is created. Users must query the system table vs_procedures to get the proc_oid for a given tokenizer name. See &lt;a href=&#34;../../../../../en/admin/using-text-search/stemmers-and-tokenizers/configuring-tokenizer/#&#34;&gt;Configuring a tokenizer&lt;/a&gt; for more information.&lt;/dd&gt;
&lt;/dl&gt;
&lt;h2 id=&#34;examples&#34;&gt;Examples&lt;/h2&gt;
&lt;p&gt;The following example shows how you can use DELETE_TOKENIZER_CONFIG_FILE to delete the tokenizer configuration file:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT v_txtindex.DELETE_TOKENIZER_CONFIG_FILE (USING PARAMETERS proc_oid=&amp;#39;45035996274126984&amp;#39;);
 DELETE_TOKENIZER_CONFIG_FILE
------------------------------
 t
(1 row)
&lt;/code&gt;&lt;/pre&gt;
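&lt;p&gt;The following sketch looks up a tokenizer&#39;s &lt;code&gt;proc_oid&lt;/code&gt; and then forces deletion of its configuration file with &lt;code&gt;confirm=true&lt;/code&gt;. The tokenizer name &lt;code&gt;fooTokenizer&lt;/code&gt; is hypothetical, and the &lt;code&gt;procedure_name&lt;/code&gt; column of &lt;code&gt;vs_procedures&lt;/code&gt; is an assumption. Substitute the &lt;code&gt;proc_oid&lt;/code&gt; that the lookup returns on your system. Output is omitted:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- Assumption: vs_procedures exposes procedure_name alongside proc_oid.
-- fooTokenizer is a hypothetical tokenizer name.
=&amp;gt; SELECT proc_oid FROM vs_procedures WHERE procedure_name = &amp;#39;fooTokenizer&amp;#39;;

-- Force deletion even if the tokenizer is still marked as used.
=&amp;gt; SELECT v_txtindex.DELETE_TOKENIZER_CONFIG_FILE (USING PARAMETERS proc_oid=&amp;#39;45035996274126984&amp;#39;, confirm=true);
&lt;/code&gt;&lt;/pre&gt;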
      </description>
    </item>
    
    <item>
      <title>Sql-Reference: GET_TOKENIZER_PARAMETER</title>
      <link>/en/sql-reference/functions/match-and-search-functions/text-search-functions/get-tokenizer-parameter/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/sql-reference/functions/match-and-search-functions/text-search-functions/get-tokenizer-parameter/</guid>
      <description>
        
        
        &lt;p&gt;Returns the value of the specified configuration parameter for a given tokenizer.&lt;/p&gt;
&lt;h2 id=&#34;syntax&#34;&gt;Syntax&lt;/h2&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;SELECT v_txtindex.GET_TOKENIZER_PARAMETER(&lt;span class=&#34;code-variable&#34;&gt;parameter_name&lt;/span&gt; USING PARAMETERS proc_oid=&amp;#39;&lt;span class=&#34;code-variable&#34;&gt;proc_oid&lt;/span&gt;&amp;#39;);
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;parameters&#34;&gt;Parameters&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;code&gt;parameter_name&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;Name of the parameter to be returned.
&lt;p&gt;One of the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;stopWordsCaseInsensitive&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;minorSeparators&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;majorSeparators&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;minLength&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;maxLength&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;ngramsSize&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;used&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;proc_oid&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;A unique identifier assigned to a tokenizer when it is created. Users must query the system table vs_procedures to get the proc_oid for a given tokenizer name. See &lt;a href=&#34;../../../../../en/admin/using-text-search/stemmers-and-tokenizers/configuring-tokenizer/#&#34;&gt;Configuring a tokenizer&lt;/a&gt; for more information.&lt;/dd&gt;
&lt;/dl&gt;
&lt;h2 id=&#34;examples&#34;&gt;Examples&lt;/h2&gt;
&lt;p&gt;The following examples show how you can use GET_TOKENIZER_PARAMETER.&lt;/p&gt;
&lt;p&gt;Return the stop words used in a tokenizer:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT v_txtindex.GET_TOKENIZER_PARAMETER(&amp;#39;stopwordscaseinsensitive&amp;#39; USING PARAMETERS proc_oid=&amp;#39;45035996274126984&amp;#39;);
 getTokenizerParameter
-----------------------
 devil,TODAY,the,fox
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Return the major separators used in a tokenizer:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT v_txtindex.GET_TOKENIZER_PARAMETER(&amp;#39;majorseparators&amp;#39; USING PARAMETERS proc_oid=&amp;#39;45035996274126984&amp;#39;);
 getTokenizerParameter
-----------------------
 {}()&amp;amp;[]
(1 row)
&lt;/code&gt;&lt;/pre&gt;
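&lt;p&gt;As a further sketch, the same call pattern should apply to the other parameter names listed above. The following statements check whether the tokenizer configuration is locked and read its minimum token length. They reuse the &lt;code&gt;proc_oid&lt;/code&gt; from the previous examples and, like those examples, pass the parameter name in lowercase. Output is omitted because it depends on your configuration:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- Check whether the configuration has been locked (used set to True).
=&amp;gt; SELECT v_txtindex.GET_TOKENIZER_PARAMETER(&amp;#39;used&amp;#39; USING PARAMETERS proc_oid=&amp;#39;45035996274126984&amp;#39;);

-- Read the minimum token length.
=&amp;gt; SELECT v_txtindex.GET_TOKENIZER_PARAMETER(&amp;#39;minlength&amp;#39; USING PARAMETERS proc_oid=&amp;#39;45035996274126984&amp;#39;);
&lt;/code&gt;&lt;/pre&gt;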
      </description>
    </item>
    
    <item>
      <title>Sql-Reference: READ_CONFIG_FILE</title>
      <link>/en/sql-reference/functions/match-and-search-functions/text-search-functions/read-config-file/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/sql-reference/functions/match-and-search-functions/text-search-functions/read-config-file/</guid>
      <description>
        
        
        &lt;p&gt;Reads and returns the key-value pairs of all the parameters of a given tokenizer.&lt;/p&gt;
&lt;p&gt;You must use the OVER() clause with this function.&lt;/p&gt;
&lt;h2 id=&#34;syntax&#34;&gt;Syntax&lt;/h2&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;SELECT v_txtindex.READ_CONFIG_FILE(USING PARAMETERS proc_oid=&amp;#39;&lt;span class=&#34;code-variable&#34;&gt;proc_oid&lt;/span&gt;&amp;#39;) OVER ()
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;parameters&#34;&gt;Parameters&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;code&gt;proc_oid&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;A unique identifier assigned to a tokenizer when it is created. Users must query the system table vs_procedures to get the proc_oid for a given tokenizer name. See &lt;a href=&#34;../../../../../en/admin/using-text-search/stemmers-and-tokenizers/configuring-tokenizer/#&#34;&gt;Configuring a tokenizer&lt;/a&gt; for more information.&lt;/dd&gt;
&lt;/dl&gt;
&lt;h2 id=&#34;examples&#34;&gt;Examples&lt;/h2&gt;
&lt;p&gt;The following example shows how you can use READ_CONFIG_FILE to return the parameters associated with a tokenizer:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT v_txtindex.READ_CONFIG_FILE(USING PARAMETERS proc_oid=&amp;#39;45035996274126984&amp;#39;) OVER();
                config_key | config_value
 --------------------------+---------------------
  majorseparators          | {}()&amp;amp;[]
  stopwordscaseinsensitive | devil,TODAY,the,fox
(2 rows)
&lt;/code&gt;&lt;/pre&gt;
      </description>
    </item>
    
    <item>
      <title>Sql-Reference: SET_TOKENIZER_PARAMETER</title>
      <link>/en/sql-reference/functions/match-and-search-functions/text-search-functions/set-tokenizer-parameter/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/en/sql-reference/functions/match-and-search-functions/text-search-functions/set-tokenizer-parameter/</guid>
      <description>
        
        
        &lt;p&gt;Configures the tokenizer parameters.

&lt;div class=&#34;admonition important&#34; role=&#34;alert&#34;&gt;
&lt;h4 class=&#34;admonition-head&#34;&gt;Important&lt;/h4&gt;
The characters &lt;code&gt;\n&lt;/code&gt;, &lt;code&gt;\t&lt;/code&gt;, and &lt;code&gt;\r&lt;/code&gt; must be entered either as Unicode using OpenText™ Analytics Database notation, for example &lt;code&gt;U&amp;amp;&amp;#39;\000D&amp;#39;&lt;/code&gt;, or using the database escaping notation, for example &lt;code&gt;E&amp;#39;\r&amp;#39;&lt;/code&gt;. Otherwise, each is taken literally as two separate characters, for example &lt;code&gt;&amp;quot;\&amp;quot;&lt;/code&gt; and &lt;code&gt;&amp;quot;r&amp;quot;&lt;/code&gt;.
&lt;/div&gt;&lt;/p&gt;
&lt;h2 id=&#34;syntax&#34;&gt;Syntax&lt;/h2&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;SELECT v_txtindex.SET_TOKENIZER_PARAMETER (&lt;span class=&#34;code-variable&#34;&gt;parameter_name&lt;/span&gt;, &lt;span class=&#34;code-variable&#34;&gt;parameter_value&lt;/span&gt; USING PARAMETERS proc_oid=&amp;#39;&lt;span class=&#34;code-variable&#34;&gt;proc_oid&lt;/span&gt;&amp;#39;)
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id=&#34;parameters&#34;&gt;Parameters&lt;/h2&gt;
&lt;dl&gt;
&lt;dt&gt;&lt;code&gt;parameter_name&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;Name of the parameter to be configured.
&lt;p&gt;Use one of the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;stopwordsCaseInsensitive&lt;/code&gt;: List of stop words. All the tokens that belong to the list are ignored. The database supports separators and stop words up to the first 256 Unicode characters.&lt;/p&gt;
&lt;p&gt;To define a stop word that contains a comma or a backslash, escape that character with a backslash.&lt;br /&gt;For example: &lt;code&gt;&amp;quot;Dear Jack\,&amp;quot; &amp;quot;Dear Jack\\&amp;quot;&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Default:&lt;/strong&gt; &lt;code&gt;&#39;&#39;&lt;/code&gt; (empty list)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;majorSeparators&lt;/code&gt;: List of major separators. Enclose in quotes with no spaces between.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Default:&lt;/strong&gt; &lt;code&gt;E&#39; []&amp;lt;&amp;gt;(){}|!;,&#39;&#39;&amp;quot;*&amp;amp;?+\r\n\t&#39;&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;minorSeparators&lt;/code&gt;: List of minor separators. Enclose in quotes with no spaces between.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Default:&lt;/strong&gt; &lt;code&gt;E&#39;/:=@.-$#%\\_&#39;&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;minLength&lt;/code&gt;: Minimum length a token can have. Type Integer. Must be greater than 0.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Default:&lt;/strong&gt; &lt;code&gt;&#39;2&#39;&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;maxLength&lt;/code&gt;: Maximum length a token can be. Type Integer. Cannot be greater than 1024 bytes. For information about increasing the token size, see &lt;a href=&#34;../../../../../en/sql-reference/config-parameters/text-search-parameters/#&#34;&gt;Text search parameters&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Default:&lt;/strong&gt; &lt;code&gt;&#39;128&#39;&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;ngramsSize&lt;/code&gt;: Integer value greater than zero. Use only with ngram tokenizers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Default:&lt;/strong&gt; &lt;code&gt;&#39;3&#39;&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;used&lt;/code&gt;: Indicates when a tokenizer configuration cannot be changed. Type Boolean. After you set &lt;code&gt;used&lt;/code&gt; to &lt;code&gt;True&lt;/code&gt;, any calls to SET_TOKENIZER_PARAMETER fail.&lt;/p&gt;
&lt;p&gt;You must set the parameter &lt;code&gt;used&lt;/code&gt; to &lt;code&gt;True&lt;/code&gt; before using the configured tokenizer. Doing so prevents the configuration from being modified after being used to create a text index.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Default:&lt;/strong&gt; &lt;code&gt;False&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;parameter_value&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;The value of a configuration parameter.
&lt;p&gt;To disable &lt;code&gt;minorSeparators&lt;/code&gt; or &lt;code&gt;stopWordsCaseInsensitive&lt;/code&gt;, set the value to &lt;code&gt;&#39;&#39;&lt;/code&gt;.&lt;/p&gt;
&lt;/dd&gt;
&lt;dt&gt;&lt;code&gt;proc_oid&lt;/code&gt;&lt;/dt&gt;
&lt;dd&gt;A unique identifier assigned to a tokenizer when it is created. Users must query the system table vs_procedures to get the proc_oid for a given tokenizer name. See &lt;a href=&#34;../../../../../en/admin/using-text-search/stemmers-and-tokenizers/configuring-tokenizer/#&#34;&gt;Configuring a tokenizer&lt;/a&gt; for more information.&lt;/dd&gt;
&lt;/dl&gt;
&lt;h2 id=&#34;examples&#34;&gt;Examples&lt;/h2&gt;
&lt;p&gt;The following examples show how you can use SET_TOKENIZER_PARAMETER to configure stop words and separators.&lt;/p&gt;
&lt;p&gt;Configure the stop words of a tokenizer:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT v_txtindex.SET_TOKENIZER_PARAMETER(&amp;#39;stopwordsCaseInsensitive&amp;#39;, &amp;#39;devil,TODAY,the,fox&amp;#39; USING PARAMETERS proc_oid=&amp;#39;45035996274126984&amp;#39;);
 SET_TOKENIZER_PARAMETER
-------------------------
 t
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Configure the major separators of a tokenizer:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;=&amp;gt; SELECT v_txtindex.SET_TOKENIZER_PARAMETER(&amp;#39;majorSeparators&amp;#39;,E&amp;#39;{}()&amp;amp;[]&amp;#39; USING PARAMETERS proc_oid=&amp;#39;45035996274126984&amp;#39;);
 SET_TOKENIZER_PARAMETER
-------------------------
 t
(1 row)
&lt;/code&gt;&lt;/pre&gt;
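&lt;p&gt;The following sketches cover the remaining cases described above: disabling minor separators with an empty string, including a tab among the major separators by using the &lt;code&gt;E&amp;#39;...&amp;#39;&lt;/code&gt; escaping notation, and locking the configuration by setting &lt;code&gt;used&lt;/code&gt;. Passing the Boolean as the string &lt;code&gt;&amp;#39;true&amp;#39;&lt;/code&gt; is an assumption based on the quoted defaults in the parameter list. Output is omitted:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;-- Disable minor separators by setting an empty value.
=&amp;gt; SELECT v_txtindex.SET_TOKENIZER_PARAMETER(&amp;#39;minorSeparators&amp;#39;, &amp;#39;&amp;#39; USING PARAMETERS proc_oid=&amp;#39;45035996274126984&amp;#39;);

-- Add a tab to the major separators; E&amp;#39;\t&amp;#39; keeps it from being read as a backslash followed by t.
=&amp;gt; SELECT v_txtindex.SET_TOKENIZER_PARAMETER(&amp;#39;majorSeparators&amp;#39;, E&amp;#39;{}()&amp;amp;[]\t&amp;#39; USING PARAMETERS proc_oid=&amp;#39;45035996274126984&amp;#39;);

-- Lock the configuration before creating a text index with this tokenizer.
-- Assumption: the Boolean value is passed as a quoted string.
=&amp;gt; SELECT v_txtindex.SET_TOKENIZER_PARAMETER(&amp;#39;used&amp;#39;, &amp;#39;true&amp;#39; USING PARAMETERS proc_oid=&amp;#39;45035996274126984&amp;#39;);
&lt;/code&gt;&lt;/pre&gt;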
      </description>
    </item>
    
  </channel>
</rss>
