VAR model example

The following example trains a vector autoregression (VAR) model on an electricity usage dataset, and then makes predictions with that model.

Download and load the data

You can download the data from Mendeley Data. The dataset consists of metrics related to daily electricity load forecasting in Panama, including columns such as the following:

  • datetime: hourly datetime index corresponding to Panama time-zone (UTC-05:00)
  • nat_demand: national electricity load in Panama, measured in megawatt hours
  • T2M_toc: temperature at 2 meters in Tocumen, Panama City
  • QV2M_toc: relative humidity at 2 meters in Tocumen, Panama City
  • TQL_toc: liquid precipitation in Tocumen, Panama City
  • W2M_toc: wind speed at 2 meters in Tocumen, Panama City

The dataset also includes temperature, humidity, precipitation, and wind speed columns for two other cities in Panama: Santiago City and David City.

Unlike the AR algorithm, which can only model a single variable over time, the VAR algorithm captures the relationship between time series variables across lagged time steps. For example, when trained on the above dataset, the VAR algorithm could model whether a rise in temperature tends to lead to an increase in electricity demand hours later.

After you have downloaded the data locally, you can load the data into Vertica with the following statements:

=> CREATE TABLE electric_data(ts timestamp, nat_demand float, T2M_toc float, QV2M_toc float,
    TQL_toc float, W2M_toc float, T2M_san float, QV2M_san float, TQL_san float, W2M_san float, T2M_dav 
    float, QV2M_dav float, TQL_dav float, W2M_dav float, Holiday_ID int, holiday bool, school bool);

=> COPY electric_data (ts,nat_demand,T2M_toc,QV2M_toc,TQL_toc,W2M_toc,T2M_san,QV2M_san,TQL_san,W2M_san,
    T2M_dav,QV2M_dav,TQL_dav,W2M_dav,Holiday_ID,holiday,school) FROM LOCAL 'path-to-data' 
    DELIMITER ',';

The dataset includes some categorical columns, which need to be dropped before training a VAR model:

=> ALTER TABLE electric_data DROP column Holiday_ID;
=> ALTER TABLE electric_data DROP column holiday;
=> ALTER TABLE electric_data DROP column school;

Query the electric_data table to view a sample of the data that will be used to train the model:

=> SELECT * FROM electric_data LIMIT 3;
         ts          | nat_demand |     T2M_toc      |  QV2M_toc   |   TQL_toc    |     W2M_toc      |     T2M_san      |  QV2M_san   |   TQL_san    |     W2M_san      |     T2M_dav      |  QV2M_dav   |   TQL_dav   |     W2M_dav
---------------------+------------+------------------+-------------+--------------+------------------+------------------+-------------+--------------+------------------+------------------+-------------+-------------+------------------
 2015-01-03 01:00:00 |    970.345 | 25.8652587890625 | 0.018576382 |  0.016174316 | 21.8505458178752 | 23.4824462890625 | 0.017271755 | 0.0018553734 | 10.3289487293842 | 22.6621337890625 | 0.016562222 |  0.09609985 | 5.36414795209389
 2015-01-03 02:00:00 |   912.1755 | 25.8992553710938 | 0.018653292 |  0.016418457 | 22.1669442769882 | 23.3992553710938 | 0.017264742 | 0.0013270378 | 10.6815171370742 | 22.5789428710938 | 0.016509432 | 0.087646484 | 5.57247134626166
 2015-01-03 03:00:00 |   900.2688 | 25.9372802734375 |  0.01876786 | 0.0154800415 | 22.4549108762635 | 23.3435302734375 | 0.017211463 | 0.0014281273 | 10.8749237899741 | 22.5310302734375 | 0.016479041 |  0.07873535 | 5.87118374509993
(3 rows)

Train the model

To train a VAR model, use the AUTOREGRESSOR function. In this example, the model is trained with a lag of 3, meaning 3 previous timesteps are taken into account during computation:

=> SELECT AUTOREGRESSOR ( 'var_electric', 'electric_data', 'nat_demand, T2M_toc, QV2M_toc, TQL_toc, W2M_toc, T2M_san, QV2M_san,
    TQL_san, W2M_san, T2M_dav, QV2M_dav, TQL_dav, W2M_dav', 'ts' USING PARAMETERS P=3, compute_mse=True );
WARNING 0:  Only the Yule Walker method is currently supported for Vector Autoregression, setting method to Yule Walker
                      AUTOREGRESSOR
---------------------------------------------------------
 Finished. 48048 elements accepted, 0 elements rejected.
(1 row)

Use the GET_MODEL_SUMMARY function to view a summary of the model, including its coefficient and parameter values:

=> SELECT GET_MODEL_SUMMARY( USING PARAMETERS model_name='var_electric');
                                     GET_MODEL_SUMMARY
-------------------------------------------------------------------------------------------------------

=========
phi_(t-1)
=========
predictor |nat_demand|t2m_toc |  qv2m_toc  | tql_toc |w2m_toc |t2m_san |  qv2m_san  | tql_san  |w2m_san | t2m_dav  |  qv2m_dav  | tql_dav |w2m_dav
----------+----------+--------+------------+---------+--------+--------+------------+----------+--------+----------+------------+---------+--------
nat_demand|  1.14622 |-1.66306|-26024.21189|853.81013| 2.88135|59.48676|-28244.33013|-194.95364| 1.26052|-135.79998|-66425.52272|229.38130| 0.58089
 t2m_toc  |  0.00048 | 0.53187| -112.46610 | 1.14647 |-0.06549| 0.13092| -200.78913 | -1.58964 | 0.21530| -0.46069 | -494.82893 |-1.22152 | 0.13995
 qv2m_toc |  0.00000 | 0.00057|   0.89702  |-0.00218 |-0.00001|-0.00017|   0.06365  |  0.00151 |-0.00003| -0.00012 |  -0.06966  |-0.00027 |-0.00002
 tql_toc  | -0.00002 | 0.01902|   9.77736  | 1.39416 |-0.00235|-0.01847|   0.84052  | -0.27010 | 0.00264|  0.00547 |  15.98251  |-0.19235 |-0.00214
 w2m_toc  | -0.00392 | 2.33136|  99.79514  |14.08998 | 1.44729|-1.80924|-1756.46929 |  1.11483 | 0.12860|  0.18757 | 386.03986  |-3.28553 | 1.08902
 t2m_san  |  0.00008 |-0.28536| 474.30004  | 0.02968 |-0.05527| 1.29437| -475.49532 | -6.68332 | 0.24776| -1.09044 | -871.84132 | 0.10465 | 0.40562
 qv2m_san | -0.00000 | 0.00021|  -0.39224  | 0.00080 | 0.00002|-0.00010|   1.13421  |  0.00252 |-0.00006|  0.00001 |  -0.30554  |-0.00206 | 0.00004
 tql_san  | -0.00003 |-0.00361|  17.26050  | 0.13776 |-0.00267| 0.01761|  -9.63725  |  1.03554 |-0.00053| -0.01907 |  -8.70276  |-0.56572 | 0.00523
 w2m_san  | -0.00007 |-0.92628|   9.34644  |22.01868 | 0.15592|-0.38963| 219.53687  |  6.32666 | 0.98779|  0.50404 | 607.06291  |-2.93982 | 1.01091
 t2m_dav  | -0.00009 |-0.19734| 447.10894  |-2.09032 | 0.00302| 0.27105| -266.05516 | -3.03434 | 0.28049| -0.03718 | -750.51074 |-3.00557 | 0.01414
 qv2m_dav | -0.00000 | 0.00003|  -0.25311  | 0.00255 |-0.00002| 0.00012|   0.36524  | -0.00043 | 0.00001| -0.00001 |   0.55553  |-0.00040 | 0.00010
 tql_dav  | -0.00000 | 0.00638|  36.02787  | 0.40214 |-0.00116| 0.00352|   2.09579  |  0.14142 | 0.00192| -0.01039 | -35.63238  | 0.91257 |-0.00834
 w2m_dav  |  0.00316 |-0.48625| -250.62285 | 6.92672 | 0.13897|-0.30942|  21.40057  | -2.77030 |-0.05098|  0.49791 |  86.43985  |-5.61450 | 1.36653


=========
phi_(t-2)
=========
predictor |nat_demand| t2m_toc | qv2m_toc  |  tql_toc  |w2m_toc | t2m_san | qv2m_san  | tql_san |w2m_san |t2m_dav | qv2m_dav  | tql_dav  | w2m_dav
----------+----------+---------+-----------+-----------+--------+---------+-----------+---------+--------+--------+-----------+----------+---------
nat_demand| -0.50332 |-46.08404|18727.70487|-1054.48563|-1.64150|-36.33188|-4175.69233|498.68770|24.05641|97.39713|15062.02349|-418.70514|-27.63335
 t2m_toc  | -0.00240 |-0.01253 | 304.73283 | -5.13242  |-0.03848|-0.08204 | -27.48349 | 1.86556 | 0.06103| 0.31194| 46.52296  |  0.25737 |-0.30230
 qv2m_toc |  0.00000 |-0.00023 | -0.16013  |  0.00443  | 0.00003| 0.00003 | -0.02433  |-0.00226 | 0.00001| 0.00006|  0.10786  |  0.00055 | 0.00008
 tql_toc  | -0.00003 |-0.01472 |  3.90260  | -0.60709  | 0.00318| 0.01456 | -15.78706 | 0.23611 |-0.00201|-0.00475| -11.18542 |  0.09355 | 0.00227
 w2m_toc  |  0.00026 |-1.93177 | 864.02774 | -33.46154 |-0.64149| 1.99313 | 924.12187 |-2.23520 |-0.12906|-0.46720|-763.47613 |  8.44744 |-1.42164
 t2m_san  | -0.00449 |-0.12273 | 317.82860 | -9.44207  |-0.11911|-0.32597 | 11.71343  | 6.41420 | 0.17817| 0.78725| 147.48679 | -1.93663 |-0.73511
 qv2m_san |  0.00000 | 0.00001 |  0.29127  |  0.00430  | 0.00001| 0.00018 | -0.25311  |-0.00256 | 0.00000|-0.00030|  0.16254  |  0.00213 | 0.00007
 tql_san  | -0.00001 | 0.01255 |  8.58348  |  0.17678  | 0.00236|-0.01895 |  2.89453  |-0.13436 | 0.00492| 0.01080| -5.78848  |  0.39776 |-0.00792
 w2m_san  | -0.00402 |-0.23875 | 152.07427 | -34.35207 |-0.14634| 0.54952 |-458.82955 |-2.37583 |-0.04254| 0.02246|-523.69779 |  7.92699 |-1.13203
 t2m_dav  | -0.00340 |-0.17349 | 218.48562 | -3.75842  |-0.14256|-0.16157 | 104.28631 | 2.42379 | 0.09304| 0.59601| 47.14495  |  0.00436 |-0.36515
 qv2m_dav |  0.00000 | 0.00011 |  0.26173  |  0.00065  | 0.00004| 0.00006 | -0.36303  | 0.00036 |-0.00003|-0.00030|  0.00561  |  0.00151 |-0.00008
 tql_dav  |  0.00002 |-0.01052 | -24.63826 | -0.08533  | 0.00290| 0.00260 |  0.91509  |-0.14088 |-0.00035| 0.00148| 11.70604  | -0.09407 | 0.01194
 w2m_dav  | -0.00222 | 0.06791 | 74.24493  | -7.00070  |-0.16760| 0.41837 | -47.66437 | 2.82942 | 0.03768|-0.35919| -78.61307 |  3.12482 |-0.60352


=========
phi_(t-3)
=========
predictor |nat_demand|t2m_toc | qv2m_toc  | tql_toc |w2m_toc | t2m_san | qv2m_san  | tql_san  | w2m_san |t2m_dav | qv2m_dav  | tql_dav |w2m_dav
----------+----------+--------+-----------+---------+--------+---------+-----------+----------+---------+--------+-----------+---------+--------
nat_demand|  0.28126 |53.13756|3338.49498 |-26.56706| 1.60491|-57.36669|-8104.16577|-172.24549|-38.55564|74.30481|96465.63629|245.06242|44.47450
 t2m_toc  |  0.00221 | 0.46994|-348.87103 | 2.62141 | 0.16466|-0.25496 |-137.21750 |  2.19341 |-0.40865 | 0.30965|1079.93357 | 0.14082 | 0.25677
 qv2m_toc | -0.00000 |-0.00039|  0.32237  |-0.00208 |-0.00004| 0.00015 | -0.01736  |  0.00052 | 0.00004 | 0.00014| -0.12526  |-0.00012 |-0.00006
 tql_toc  |  0.00006 |-0.00387| -17.12826 | 0.11237 |-0.00034| 0.00313 | 13.96030  |  0.08266 |-0.00155 |-0.00080| -0.47584  | 0.14030 | 0.00030
 w2m_toc  |  0.00498 |-0.70555|-955.01026 |17.98079 | 0.18909|-0.64634 | 496.07411 |  2.36480 |-0.07010 | 1.03103| 779.48099 |-5.90297 | 0.44579
 t2m_san  |  0.00510 | 0.47889|-1166.20907| 7.15240 | 0.27574|-0.48793 |-215.11669 |  5.31181 |-0.66397 | 0.72268|1871.66380 | 0.51522 | 0.50609
 qv2m_san | -0.00000 |-0.00029|  0.28296  |-0.00588 |-0.00005|-0.00014 | -0.04735  | -0.00013 | 0.00010 | 0.00040|  0.17917  |-0.00013 |-0.00015
 tql_san  |  0.00004 |-0.00467| -27.72270 |-0.40887 | 0.00016| 0.00234 |  4.26912  | -0.00376 |-0.00318 | 0.00015| 22.54523  | 0.24970 | 0.00081
 w2m_san  |  0.00493 | 1.14186|-373.37429 |13.42031 | 0.02690|-0.47403 | 49.48392  | -3.10954 |-0.11381 |-0.05802| 153.91529 |-4.97422 | 0.21238
 t2m_dav  |  0.00365 | 0.55030|-1007.89174| 3.91560 | 0.22359|-0.44299 |-400.17878 |  4.14859 |-0.58507 | 0.58492|1659.35503 | 2.60303 | 0.54396
 qv2m_dav | -0.00000 |-0.00014|  0.19838  |-0.00385 |-0.00004|-0.00020 | -0.08521  | -0.00031 | 0.00008 | 0.00030|  0.37780  |-0.00163 |-0.00007
 tql_dav  | -0.00002 | 0.00898| -10.89014 |-0.36915 |-0.00241|-0.00164 | -8.04515  | -0.03469 | 0.00140 |-0.00446| 34.64281  | 0.11446 |-0.00757
 w2m_dav  | -0.00120 | 0.51977| 149.50193 |-0.12824 | 0.02324| 0.07751 | 109.64435 | -0.52736 | 0.08222 |-0.46739| -20.74587 | 2.62600 | 0.09365


====
mean
====
predictor | value
----------+--------
nat_demand| 0.00000
 t2m_toc  | 0.00000
 qv2m_toc | 0.00000
 tql_toc  | 0.00000
 w2m_toc  | 0.00000
 t2m_san  | 0.00000
 qv2m_san | 0.00000
 tql_san  | 0.00000
 w2m_san  | 0.00000
 t2m_dav  | 0.00000
 qv2m_dav | 0.00000
 tql_dav  | 0.00000
 w2m_dav  | 0.00000


==================
mean_squared_error
==================
predictor |   value
----------+-----------
nat_demand|17962.23433
 t2m_toc  |  1.69488
 qv2m_toc |  0.00000
 tql_toc  |  0.00055
 w2m_toc  |  2.46161
 t2m_san  |  5.50800
 qv2m_san |  0.00000
 tql_san  |  0.00129
 w2m_san  |  2.09820
 t2m_dav  |  4.17091
 qv2m_dav |  0.00000
 tql_dav  |  0.00131
 w2m_dav  |  0.37255


=================
predictor_columns
=================
"nat_demand", "t2m_toc", "qv2m_toc", "tql_toc", "w2m_toc", "t2m_san", "qv2m_san", "tql_san", "w2m_san", "t2m_dav", "qv2m_dav", "tql_dav", "w2m_dav"

================
timestamp_column
================
ts

==============
missing_method
==============
error

===========
call_string
===========
autoregressor('public.var_electric', 'electric_data', '"nat_demand", "t2m_toc", "qv2m_toc", "tql_toc", "w2m_toc", "t2m_san", "qv2m_san", "tql_san", "w2m_san", "t2m_dav", "qv2m_dav", "tql_dav", "w2m_dav"', 'ts'
USING PARAMETERS p=3, method=yule-walker, missing=error, regularization='none', lambda=1, compute_mse=true, subtract_mean=false);

===============
Additional Info
===============
       Name       | Value
------------------+--------
    lag_order     |   3
  num_predictors  |   13
      lambda      | 1.00000
rejected_row_count|   0
accepted_row_count| 48048

(1 row)

Make predictions

After you have trained the model, use PREDICT_AUTOREGRESSOR to make predictions. The following query makes predictions for 10 timesteps after the end of the training data:

=> SELECT PREDICT_AUTOREGRESSOR (nat_demand, T2M_toc, QV2M_toc, TQL_toc, W2M_toc, T2M_san, QV2M_san, TQL_san, W2M_san, T2M_dav,
    QV2M_dav, TQL_dav, W2M_dav USING PARAMETERS model_name='var_electric', npredictions=10) OVER (ORDER BY ts) FROM electric_data;
 index |    nat_demand    |     T2M_toc      |      QV2M_toc      |      TQL_toc       |     W2M_toc      |     T2M_san      |      QV2M_san      |      TQL_san      |     W2M_san      |     T2M_dav      |      QV2M_dav      |      TQL_dav      |     W2M_dav
-------+------------------+------------------+--------------------+--------------------+------------------+------------------+--------------------+-------------------+------------------+------------------+--------------------+-------------------+------------------
     1 |  1078.0855626373 | 27.5432174013135 | 0.0202896580655671 | 0.0735420728737344 | 10.8175311126823 |  26.430434929925 | 0.0192831578421391 | 0.107198653008438 | 3.05244789585641 | 24.5096742655262 | 0.0184037299403769 |  0.16295453121027 | 3.08228477169708
     2 | 1123.01343816948 | 27.5799917618547 | 0.0204207744201445 | 0.0720447905881737 | 10.8724214941076 | 26.3610153442989 | 0.0194263633273137 | 0.108930877007977 | 2.72589694499722 | 24.4670623561271 | 0.0186472805344351 | 0.165398914107496 | 2.87751855475047
     3 | 1131.90496161147 | 27.3065074421367 | 0.0206625082516192 | 0.0697170726826932 | 10.5264893921207 |   25.81201637743 |  0.019608966941237 |  0.10637712791638 | 2.17340369566314 | 24.1521703335357 | 0.0188528868910987 | 0.167392378142989 | 2.80663029425841
     4 | 1138.96441161386 | 27.4576230482214 | 0.0207001599239755 | 0.0777394805028406 | 10.3601575817394 | 26.1392475032107 | 0.0195632331195498 | 0.104149788020336 | 2.46022124286432 | 24.4888899706856 | 0.0187304304955302 | 0.164373252722466 | 2.78678931032488
     5 | 1171.39047791301 | 28.0057288751278 | 0.0205956267885475 | 0.0848090062223719 | 10.6253279384262 | 27.0670669914329 | 0.0195635438719142 | 0.114456870352482 | 2.76078540220627 | 25.2647929547485 | 0.0187651697256172 |  0.17343826852935 | 2.69927291097792
     6 | 1207.73967000806 | 28.3228814221316 | 0.0206018765585195 | 0.0822472149970854 | 10.9208806093031 | 27.4723192020112 | 0.0197095736612743 | 0.120234389446089 | 2.66435208358109 | 25.5897884504046 | 0.0190138227508656 | 0.181730688934231 | 2.48952353476086
     7 | 1201.36034218262 | 28.0729800850783 | 0.0207697611016373 | 0.0771476148672282 | 10.8746443523915 | 26.9706455927136 | 0.0198268925550646 | 0.108597236269397 |  2.4209888271894 | 25.1762432351852 | 0.0191249010707135 | 0.172747303182877 | 2.23183544428884
     8 | 1224.23208010817 | 28.3089311565846 | 0.0208114201116026 | 0.0850389146168386 | 11.0236068974249 | 27.4555219109112 | 0.0198191134457088 | 0.106317040996309 | 2.53443574238199 | 25.6496524683801 | 0.0189950796868336 | 0.164772798858117 | 2.03398232662887
     9 | 1276.63054426938 | 28.9264909381223 | 0.0207288153284714 | 0.0949502386997371 | 11.3825529048681 | 28.5602280897709 | 0.0198383353555985 | 0.119437241768372 | 2.81035565170478 | 26.6092121876548 | 0.0189965295736137 | 0.173848915902565 | 1.97943631786186
    10 | 1279.80379750225 | 28.9655412855392 | 0.0207630823553549 | 0.0899990557577538 | 11.4992364863754 | 28.5096132340911 | 0.0199982128259766 | 0.122916656987609 |  2.8045000981305 | 26.4950144592136 | 0.0192642881348006 | 0.183789154418929 | 1.97414217031828
(10 rows)

See also