入门
以下教程演示了创建、浏览和使用 Flex 表的基本信息。
将这些 JSON 数据用于本节其余部分的练习:
{"name": "Everest", "type":"mountain", "height":29029, "hike_safety": 34.1}
{"name": "Mt St Helens", "type":"volcano", "height":29029, "hike_safety": 15.4}
{"name": "Denali", "type":"mountain", "height":17000, "hike_safety": 12.2}
{"name": "Kilimanjaro", "type":"mountain", "height":14000 }
{"name": "Mt Washington", "type":"mountain", "hike_safety": 50.6}
创建 Flex 表并加载数据
-
创建名为
mountains
的 Flex 表:=> CREATE flex table mountains();
-
使用 Flex 表解析器
fjsonparser
加载您保存的 JSON 文件:=> COPY mountains from '/home/dbadmin/data/flex/mountains.json' parser fjsonparser(); Rows Loaded ------------- 5 (1 row)
-
查询示例文件中的值:
=> SELECT name, type, height from mountains; name | type | height ---------------+----------+-------- Everest | mountain | 29029 Mt St Helens | volcano | 29029 Denali | mountain | 17000 Kilimanjaro | mountain | 14000 Mt Washington | mountain | (5 rows)
现在,您已经创建一个 Flex 表并加载数据。接下来,了解更多有关在数据库中使用 Flex 表数据的信息。
更多地查询 Flex 表
-
查询 Flex 表以查看您加载的数据,它们存储在
__raw__
列中。示例说明了表的内容,并为了便于说明而添加了换行符:=> \x Expanded display is on. => SELECT * from mountains; [ RECORD 1 ]+---------------------------------------------------------------------------- __identity__ | 1 __raw__ | \001\000\000\000,\000\000\000\004\000\000\000\024\000\000\000\031\000\000\000\ 035\000\000\000$\000\000\0002902934.1Everestmountain\004\000\000\000\024\000\000\000\032\000\ 000\000%\000\000\000)\000\000\000heighthike_safetynametype [ RECORD 2 ]+---------------------------------------------------------------------------- __identity__ | 2 __raw__ | \001\000\000\0000\000\000\000\004\000\000\000\024\000\000\000\031\000\000\000\ 035\000\000\000)\000\000\0002902915.4Mt St Helensvolcano\004\000\000\000\024\000\000\000\032\000\ 000\000%\000\000\000)\000\000\000heighthike_safetynametype [ RECORD 3 ]+---------------------------------------------------------------------------- __identity__ | 3 __raw__ | \001\000\000\000+\000\000\000\004\000\000\000\024\000\000\000\031\000\000\000\ 035\000\000\000#\000\000\0001700012.2Denalimountain\004\000\000\000\024\000\000\000\032\000\000 \000%\000\000\000)\000\000\000heighthike_safetynametype [ RECORD 4 ]+---------------------------------------------------------------------------- __identity__ | 4 __raw__ | \001\000\000\000(\000\000\000\003\000\000\000\020\000\000\000\025\000\000\000\ 000\000\00014000Kilimanjaromountain\003\000\000\000\020\000\000\000\026\000\000\000\032\000\ 000\000heightnametype [ RECORD 5 ]+---------------------------------------------------------------------------- __identity__ | 5 __raw__ | \001\000\000\000)\000\000\000\003\000\000\000\020\000\000\000\024\000\000\000\ 000\000\00050.6Mt Washingtonmountain\003\000\000\000\020\000\000\000\033\000\000\000\037\000\ 000\000hike_safetynametype
-
使用
mapToString()
函数(用于mountains
的__raw__
列)检查其采用可读 JSON 文本格式的内容:=> SELECT maptostring(__raw__) from mountains; MAPTOSTRING ------------------------------------------------------------------------------------------ { "height" : "29029", "hike_safety" : "34.1", "name" : "Everest", "type" : "mountain" } { "height" : "29029", "hike_safety" : "15.4", "name" : "Mt St Helens", "type" : "volcano" } { "height" : "17000", "hike_safety" : "12.2", "name" : "Denali", "type" : "mountain" } { "height" : "14000", "name" : "Kilimanjaro", "type" : "mountain" } { "hike_safety" : "50.6", "name" : "Mt Washington", "type" : "mountain" }
-
现在,使用
compute_flextable_keys()
函数填充mountain_keys
表。Vertica 在您创建 Flex 表时自动生成此表。=> SELECT compute_flextable_keys('mountains'); compute_flextable_keys --------------------------------------------------- Please see public.mountains_keys for updated keys (1 row)
-
查询键表 (
mountains_keys
),并检查结果:=> SELECT * from public.mountains_keys; key_name | frequency | data_type_guess -------------+-----------+----------------- hike_safety | 4 | varchar(20) name | 5 | varchar(26) height | 4 | varchar(20) type | 5 | varchar(20) (4 rows)
构建 Flex 表视图
-
使用
build_flextable_view()
函数填充从mountains_keys
表生成的视图。=> SELECT build_flextable_view('mountains'); build_flextable_view ------------------------------------------------------ The view public.mountains_view is ready for querying (1 row)
-
查询视图
mountains_view
:=> SELECT * from public.mountains_view; hike_safety | name | type | height -------------+---------------+----------+-------- 50.6 | Mt Washington | mountain | 34.1 | Everest | mountain | 29029 22.8 | Kilimanjaro | mountain | 14000 15.4 | Mt St Helens | volcano | 29029 12.2 | Denali | mountain | 17000 (5 rows)
-
使用
view_columns
系统表查询mountains_view
的column_name
列和data_type
列:=> SELECT column_name, data_type from view_columns where table_name = 'mountains_view'; column_name | data_type -------------+------------- hike_safety | varchar(20) name | varchar(26) type | varchar(20) height | varchar(20) (4 rows)
-
审阅查询结果:
-
请注意
data_type
列及其值和大小。这些均为在您使用compute_flextable_keys()
计算 Flex 表键时计算得出。 -
您在调用该函数后查询
mountains_keys
表时,是否还注意到data_type_guess
列?
-
-
使用来自
mountains_view
的data_type
信息,覆盖hike_safety
的data_type_guess
。然后,COMMIT
更改,并使用build_flextable_view()
重新构建视图:=> UPDATE mountains_keys SET data_type_guess = 'float' where key_name = 'hike_safety'; OUTPUT -------- 1 (1 row) => commit; => SELECT build_flextable_view('mountains'); build_flextable_view ------------------------------------------------------ The view public.mountains_view is ready for querying (1 row)
-
接下来,使用
view_columns
系统表。请注意,hike_safety
现在属于 float 数据类型:=> SELECT column_name, data_type from view_columns where table_name = 'mountains_view'; column_name | data_type -------------+------------- hike_safety | float name | varchar(26) type | varchar(20) height | varchar(20) (4 rows)
创建混合 Flex 表
如果您已经知道您定期加载和查询的部分数据需要全面的 Vertica 性能和支持,您可以创建一个混合 Flex 表。混合 Flex 表具有您定义的一个或多个实际列,以及一个用于存储您加载的所有非结构化数据的 __raw__
列。查询实际列比查询 raw 列中的可变数据更快。您可以定义各列的默认值。
-
创建一个混合 Flex 表,并加载相同的示例 JSON 文件:
=> CREATE flex table mountains_hybrid(name varchar(41) default name::varchar(41), hike_safety float default hike_safety::float); => COPY mountains_hybrid from '/home/dbadmin/Downloads/mountains.json' parser fjsonparser(); Rows Loaded ------------- 5 (1 row)
-
使用
compute_flextable_keys_and_build_view()
函数为mountains_hybrid
填充键表并构建视图:=> SELECT compute_flextable_keys_and_build_view('mountains_hybrid'); compute_flextable_keys_and_build_view ------------------------------------------------------------------------------------------------ Please see public.mountains_hybrid_keys for updated keys The view public.mountains_hybrid_view is ready for querying (1 row)
-
查询
mountains_hybrid
键表。再次审阅data_type_guesses
列的值。类型列出了您在创建混合表时声明的列定义:=> SELECT * from mountains_hybrid_keys; key_name | frequency | data_type_guess -------------+-----------+----------------- height | 4 | varchar(20) name | 5 | varchar(41) type | 5 | varchar(20) hike_safety | 4 | float (4 rows)
如果您创建了一个基本 Flex 表,随后发现您希望将一个或多个虚拟列提升为实际列,请参阅实体化 Flex 表以添加列。
实体化混合 Flex 表中的虚拟列
浏览 Flex 表数据后,您可以将 Flex 表中的一个或多个虚拟列提升为实际列。您无需创建单独的列式表。
-
在混合表上调用
materialize_flextable_columns()
函数,指定要实体化的虚拟列数量:=> SELECT materialize_flextable_columns('mountains_hybrid', 3); materialize_flextable_columns -------------------------------------------------------------------------------------------- The following columns were added to the table public.mountains_hybrid: type For more details, run the following query: SELECT * FROM v_catalog.materialize_flextable_columns_results WHERE table_schema = 'public' and table_name = 'mountains_hybrid'; (1 row)
-
您指定要实体化三 (
3
) 列,但该表是使用两个实际列(name
和hike_safety
)创建的。为了满足您指定三列的愿望,该函数仅再提升另外一列type
。下一个示例已展开显示,以垂直列出各列。请注意,刚实体化的列处于ADDED
状态,而不是您在创建该表时所定义的两列所处的EXISTS
状态:=> \x Expanded display is on. => SELECT * from materialize_flextable_columns_results where table_name = 'mountains_hybrid'; -[ RECORD 1 ]-+------------------------------------------------------- table_id | 45035996273766044 table_schema | public table_name | mountains_hybrid creation_time | 2013-11-30 20:09:37.765257-05 key_name | type status | ADDED message | Added successfully -[ RECORD 2 ]-+------------------------------------------------------- table_id | 45035996273766044 table_schema | public table_name | mountains_hybrid creation_time | 2013-11-30 20:09:37.765284-05 key_name | hike_safety status | EXISTS message | Column of same name already exists in table definition -[ RECORD 3 ]-+------------------------------------------------------- table_id | 45035996273766044 table_schema | public table_name | mountains_hybrid creation_time | 2013-11-30 20:09:37.765296-05 key_name | name status | EXISTS message | Column of same name already exists in table definition
-
现在,显示混合表定义,列出
__raw__
列和三个已实体化的列。Flex 表的数据类型由相关联的键表派生而来,因此您可以在必要时对其进行更新。请注意,__raw__
列具有默认的NOT NULL
约束:=> \d mountains_hybrid List of Fields by Tables -[ RECORD 1 ]------------------------------------------------------------ Schema | public Table | mountains_hybrid Column | __raw__ Type | long varbinary(130000) Size | 130000 Default | Not Null | t Primary Key | f Foreign Key | -[ RECORD 2 ]------------------------------------------------------------ Schema | public Table | mountains_hybrid Column | name Type | varchar(41) Size | 41 Default | (MapLookup(mountains_hybrid.__raw__, 'name'))::varchar(41) Not Null | f Primary Key | f Foreign Key | -[ RECORD 3 ]------------------------------------------------------------ Schema | public Table | mountains_hybrid Column | hike_safety Type | float Size | 8 Default | (MapLookup(mountains_hybrid.__raw__, 'hike_safety'))::float Not Null | f Primary Key | f Foreign Key | -[ RECORD 4 ]------------------------------------------------------------ Schema | public Table | mountains_hybrid Column | type Type | varchar(20) Size | 20 Default | (MapLookup(mountains_hybrid.__raw__, 'type'))::varchar(20) Not Null | f Primary Key | f Foreign Key |
您现在已经完成有关 Flex 表基本信息、混合 Flex 表和 Flex 函数运用的入门指南。