The following topics describe techniques that help you manage query execution between your client and your Vertica database.
This is the multi-page printable view of this section. Click here to print.
Managing query execution between the client and Vertica
1 - ResultBufferSize
By default, Vertica uses the ResultBufferSize parameter to determine the maximum size (in bytes) of a result set that a client can retrieve from a server. When ResultBufferSize is enabled, Vertica sends rows of data directly to the client making the query. The number of rows returned to the client at each fetch of data depends on the size (in bytes) of the ResultBufferSize parameter.
Sometimes, the size of the result set requested by the client is greater than what the ResultBufferSize parameter allows. In such cases, Vertica retrieves only a portion of the result set at a time. Each fetch of data returns the amount of data equal to the size set by the ResultBufferSize parameter. Ultimately, as the client iterates over the individual fetches of data, the entire result set is returned.
Benefits of ResultBufferSize
If you are concerned with the effect of your queries on network latency, ResultBufferSize may provide an advantage over MARS. MARS requires that the client wait until all rows of data are written to the server before the client can retrieve the data. This delay may cause latency issues for your network while waiting for the results to be stored.
In addition, MARS requires that you send two separate requests to return rows of data. The first request performs the query execution which stores the result set on the server. The second request retrieves the data rows that are stored on the server. With ResultBufferSize, you only need to send one request. This request both executes and retrieves the data rows of interest.
Query execution with ResultBufferSize
The following graphic shows how Vertica returns rows of data from a database to the client with ResultBufferSize enabled:
The query execution performs the following steps:
-
The client sends a query, such as a SELECT statement, to the server. In the preceding graphic, the first query is named Query 1.
-
The server receives the client's request and begins to send both a description of the result set and the requested rows of data back to the client.
-
After all possible rows are returned to the client, the execution is complete. The size of the data set returned equals either that of the data that was requested or the maximum amount of data that ResultBufferSize parameter can retrieve. If the ResultBufferSize maximum size is not yet reached, Vertica can execute Query 2.
The server can accept Query 2 and perform the same steps that it did for Query 1. If the results for Query 1 had reached the maximum ResultBufferSize allowable, Vertica could not execute Query 2 until the client freed the results from Query 1.
After Query 2 runs, you cannot view the results you retrieved for Query 1, unless you execute Query 1 again.
Setting an unlimited buffer size
Setting ResultBufferSize to 0 tells the client driver to use an unlimited result set buffer. With this setting, the client library allocates as much memory as it needs to read the entire result set of a query. You may choose to set ResultBufferSize to 0 of you want to simulate having multiple active queries over a single database connection at the same time. With an unlimited buffer size, your client can run a query and have its entire result set stored in memory. This ends the first query, so your client can execute a second query before it fully processes the results of the first query.
A drawback of this method is that your query may consume too much memory if your queries return large result sets. This over-allocation of memory can result in the operating system terminating your client. Due to this risk, consider using multiple database connections instead of trying to reuse a single connection for multiple queries. The overhead of multiple database connections is small compared to the overall amount of resources required to process a large data set.
2 - Multiple active result sets (MARS)
You can only enable MARS when you connect to Vertica using a JDBC client connection. MARS allows the execution of multiple queries on a single connection. While ResultBufferSize sends the results of a query directly to the client, MARS stores the results first on the server. Once query execution has finished and all of the results have been stored, you can make a retrieval request to the server to have rows returned to the client.
MARS is set at the session level and must be enabled for every new session. When MARS is enabled, ResultBufferSize is disabled. No error is returned, however the ResultBufferSize parameter is ignored.
Benefits of MARS
In comparison with ResultBufferSize, MARS enables you to store multiple result sets from different queries at the same time. You can also send new queries before all of the results of a previous result set have been returned to the client. This allows applications to decouple query execution from result retrieval so that, on a single connection, you can process different results at the same time.
When you enable ResultBufferSize, you must wait until all result sets have been returned to the client before a new query can be executed.
Another benefit of MARS is that it allows you to free up query resources faster than ResultBufferSize allows. While a query is running, resources are held by that query session. When ResultBufferSize is enabled, a client that is performing slowly might read a single row of a result set and then have to stop to retrieve the next row. This prevents the query from finishing quickly and, therefore, prevents the resources used from being freed up for other applications. With MARS, the speed of the client is irrelevant to the reading of rows. As soon as the results are written to the MARS storage, the resources are freed and the speed at which the client retrieves rows no longer matters.
Query execution with MARS
The following graphic demonstrates how multiple queries to the server are handled when MARS is enabled:
Query 1:
-
Query 1 is sent to the server.
-
Query 1's row description and the status of its result set are returned to the client. However, no results are returned to the client at this time.
-
Query 1 completes and its results are saved on the server.
- You can now send commands to retrieve the rows of Query 1's result set. These rows are stored on the server. Retrieved rows are sent to the client along with the status of the result set. By keeping track of the status of the result set, Vertica is able to keep track of which rows have been retrieved from the server.
-
Now that Query 1 has successfully completed, and its result sets are being stored on the server, Query 2 can be executed.
Query 2:
-
Query 2 is sent to the server.
-
Query 2's row description and the status of its result set are returned to the client. However, no results are returned to the client at this time.
-
Query 2 completes and its results are stored on the server. Both Query 1 and Query 2 now have result sets stored on the server.
-
You can now send retrieval requests to both Query 1 and Query 2's result sets that are stored on the server. Whenever a retrieval request is made for rows from Query 1, the request is sent and rows and the result set status are sent to the client. The same occurs for Query 2.
Once all rows have been read by the client, the MARS storage on the server closes the active results session. The MARS storage on the server is then freed to store more data. The MARS storage also closes and frees once your session is finished.
Enabling and disabling MARS
You can enable and disable MARS in two different ways:
-
To enable MARS using the JDBC client connection properties, see JDBC connection properties.
-
To enable MARS using the SET SESSION command, see SET SESSION MULTIPLEACTIVERESULTSETS.