This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

SVD (singular value decomposition)

Singular Value Decomposition (SVD) is a matrix decomposition method that allows you to approximate matrix X with dimensions n-by-p as a product of 3 matrices: X(n-by-p) = U(n-by-k).S(k-byk).VT(k-by-p) where k is an integer from 1 to p, and S is a diagonal matrix.

Singular Value Decomposition (SVD) is a matrix decomposition method that allows you to approximate matrix X with dimensions n-by-p as a product of 3 matrices: X(n-by-p) = U(n-by-k).S(k-byk).VT(k-by-p) where k is an integer from 1 to p, and S is a diagonal matrix. Its diagonal has non-negative values, called singular values, sorted from the largest, at the top left, to the smallest, at the bottom right. All other elements of S are zero.

In practice, the matrix V(p-by-k), which is the transposed version of VT, is more preferred.

If k (an input parameter to the decomposition algorithm, also known as the number of components to keep in the output) is equal to p, the decomposition is exact. If k is less than p, the decomposition becomes an approximation.

An application of SVD is lossy data compression. For example, storing X required n.p elements, while storing the three matrices U, S, and VT requires storing n.k + k + k.p elements. If n=1000, p=10, and k=2, storing X would require 10,000 elements while storing the approximation would require 2,000+4+20 = 2,024 elements. A smaller value of k increases the savings in storage space, while a larger value of k gives a more accurate approximation.

Depending on your data, the singular values may decrease rapidly, which allows you to choose a value of k that is much smaller than the value of p.

Another common application of SVD is to perform the principal component analysis.

You can use the following functions to train and apply the SVD model:

For a complete example, see Computing SVD.

1 - Computing SVD

This SVD example uses a small data set named small_svd.

This SVD example uses a small data set named small_svd. The example shows you how to compute SVD using the given data set. The table is a matrix of numbers. The singular value decomposition is computed using the SVD function. This example computes the SVD of the table matrix and assigns it to a new object, which contains one vector, and two matrices, U and V. The vector contains the singular values. The first matrix, U contains the left singular vectors and V contains the right singular vectors.

Before you begin the example, load the Machine Learning sample data.
  1. Create the SVD model, named svdmodel.

    => SELECT SVD ('svdmodel', 'small_svd', 'x1,x2,x3,x4');
    SVD
    --------------------------------------------------------------
    Finished in 1 iterations.
    Accepted Rows: 8  Rejected Rows: 0
    (1 row)
    
  2. View the summary output of svdmodel.

    
    => SELECT GET_MODEL_SUMMARY(USING PARAMETERS model_name='svdmodel');
    GET_MODEL_SUMMARY
    ----------------------------------------------------------------------------------------------------------
    ----------------------------------------------------------------------------------------------------------
    ----------------------------------------------------------------------------------------------------------
    ----------------------------------------------------------------------------------------------------------
    ----------------------------------------------------------------------------------------------------------
    ----------------------------------------------------------------------------------------------------------
    ----------------------------------------------------------------------------------------------------------
    ----------------------------------------------------------------------------------------------------------
    ----------------------------------------------------------------------------------------------------------
    ----------------------------------------------------------------------------------------------------------
    ---------------------------------------------------------------------------------------
    =======
    columns
    =======
    index|name
    -----+----
    1    | x1
    2    | x2
    3    | x3
    4    | x4
    ===============
    singular_values
    ===============
    index| value  |explained_variance|accumulated_explained_variance
    -----+--------+------------------+------------------------------
    1    |22.58748|      0.95542     |            0.95542
    2    | 3.79176|      0.02692     |            0.98234
    3    | 2.55864|      0.01226     |            0.99460
    4    | 1.69756|      0.00540     |            1.00000
    ======================
    right_singular_vectors
    ======================
    index|vector1 |vector2 |vector3 |vector4
    -----+--------+--------+--------+--------
    1    | 0.58736| 0.08033| 0.74288|-0.31094
    2    | 0.26661| 0.78275|-0.06148| 0.55896
    3    | 0.71779|-0.13672|-0.64563|-0.22193
    4    | 0.26211|-0.60179| 0.16587| 0.73596
    ========
    counters
    ========
    counter_name      |counter_value
    ------------------+-------------
    accepted_row_count|      8
    rejected_row_count|      0
    iteration_count   |      1
    ===========
    call_string
    ===========
    SELECT SVD('public.svdmodel', 'small_svd', 'x1,x2,x3,x4');
    (1 row)
    
  3. Create a new table, named Umat to obtain the values for U.

    => CREATE TABLE Umat AS SELECT APPLY_SVD(id, x1, x2, x3, x4 USING PARAMETERS model_name='svdmodel',
    exclude_columns='id', key_columns='id') OVER() FROM small_svd;
    CREATE TABLE
    
  4. View the results in the Umat table. This table transforms the matrix into a new coordinates system.

    => SELECT * FROM Umat ORDER BY id;
      id |        col1        |        col2        |        col3         |        col4
    -----+--------------------+--------------------+---------------------+--------------------
    1    | -0.494871802886819 | -0.161721379259287 |  0.0712816417153664 | -0.473145877877408
    2    |  -0.17652411036246 | 0.0753183783382909 |  -0.678196192333598 | 0.0567124770173372
    3    | -0.150974762654569 | -0.589561842046029 | 0.00392654610109522 |  0.360011163271921
    4    |  -0.44849499240202 |  0.347260956311326 |   0.186958376368345 |  0.378561270493651
    5    | -0.494871802886819 | -0.161721379259287 |  0.0712816417153664 | -0.473145877877408
    6    |  -0.17652411036246 | 0.0753183783382909 |  -0.678196192333598 | 0.0567124770173372
    7    | -0.150974762654569 | -0.589561842046029 | 0.00392654610109522 |  0.360011163271921
    8    |  -0.44849499240202 |  0.347260956311326 |   0.186958376368345 |  0.378561270493651
    (8 rows)
    
  5. Then, we can optionally transform the data back by converting it from Umat to Xmat. First, we must create the Xmat table and then apply the APPLY_INVERSE_SVD function to the table:

    => CREATE TABLE Xmat AS SELECT APPLY_INVERSE_SVD(* USING PARAMETERS model_name='svdmodel',
    exclude_columns='id', key_columns='id') OVER() FROM Umat;
    CREATE TABLE
    
  6. Then view the data from the Xmat table that was created:

    => SELECT id, x1::NUMERIC(5,1), x2::NUMERIC(5,1), x3::NUMERIC(5,1), x4::NUMERIC(5,1) FROM Xmat
    ORDER BY id;
    id | x1  | x2  | x3  | x4
    ---+-----+-----+-----+-----
    1  | 7.0 | 3.0 | 8.0 | 2.0
    2  | 1.0 | 1.0 | 4.0 | 1.0
    3  | 2.0 | 3.0 | 2.0 | 0.0
    4  | 6.0 | 2.0 | 7.0 | 4.0
    5  | 7.0 | 3.0 | 8.0 | 2.0
    6  | 1.0 | 1.0 | 4.0 | 1.0
    7  | 2.0 | 3.0 | 2.0 | 0.0
    8  | 6.0 | 2.0 | 7.0 | 4.0
    (8 rows)
    

See also