User-defined parser
A parser takes a stream of bytes and passes a corresponding sequence of tuples to the Vertica load process. You can use user-defined parser functions to parse data in formats that the COPY statement's native parsers cannot handle.
For example, you can load a CSV file using a specific CSV library. See the Vertica SDK for two CSV examples.
COPY supports a single user-defined parser that you can use with a user-defined source and zero or more instances of a user-defined filter. If you implement a UDParser class, you must also implement a corresponding ParserFactory.
Sometimes, you can improve the performance of your parser by adding a chunker. A chunker divides up the input and uses multiple threads to parse it. Chunkers are available only in the C++ API. For details, see Cooperative parse and UDChunker class. Under special circumstances you can further improve performance by using apportioned load, an approach where multiple Vertica nodes parse the input.
1 - UDParser class
You can subclass the UDParser class when you need to parse data that is in a format that the COPY statement's native parser cannot handle.
During parser execution, Vertica always calls three methods: setup(), process(), and destroy(). It might also call getRejectedRecord().
UDParser constructor
The UDParser class performs important initialization required by all subclasses, including initializing the StreamWriter object used by the parser. Therefore, your constructor must call super().
UDParser methods
Your UDParser subclass must override process() or processWithMetadata():
Note
processWithMetadata() is available only for user-defined extensions (UDxs) written in the C++ programming language.
- process() reads the raw input stream as one large file. If there are any errors or failures, the entire load fails. You can implement process() when the upstream source or filter implements processWithMetadata(), but it might result in parsing errors.
- processWithMetadata() is useful when the data source has metadata about record boundaries available in some structured format that's separate from the data payload. With this interface, the source emits a record length for each record in addition to the data. By implementing processWithMetadata() instead of process() in each phase, you can retain this record length metadata throughout the load stack, which enables a more efficient parse that can recover from errors on a per-message basis, rather than a per-file or per-source basis. KafkaSource and the Kafka parsers (KafkaAvroParser, KafkaJSONParser, and KafkaParser) use this mechanism to support per-Kafka-message rejections when individual Kafka messages are corrupted.
Note
To implement processWithMetadata(), you must override useSideChannel() to return true.
Additionally, you must override getRejectedRecord() to return information about rejected records. Optionally, you can override the other UDParser class methods.
Parser execution
The following sections detail the execution sequence each time a user-defined parser is called, assuming a parser that overrides the process() method.
Setting up
COPY calls setup() before the first time it calls process(). Use setup() to perform any initial setup tasks that your parser needs to parse data, such as retrieving parameters from the class context structure or initializing data structures for use during parsing. Your object might be destroyed and re-created during use, so make sure that your object is restartable.
Parsing
COPY calls process() repeatedly during query execution. Vertica passes this method a buffer of data to parse into columns and rows and one of the following input states defined by InputState:

- OK: currently at the start of or in the middle of a stream
- END_OF_FILE: no further data is available.
- END_OF_CHUNK: the current data ends on a record boundary and the parser should consume all of it before returning. This input state only occurs when using a chunker.
- START_OF_PORTION: the input does not start at the beginning of a source. The parser should find the first end-of-record mark. This input state only occurs when using apportioned load. You can use the getPortion() method to access the offset and size of the portion.
- END_OF_PORTION: the source has reached the end of its portion. The parser should finish processing the last record it started and advance no further. This input state only occurs when using apportioned load.
The parser must reject any data that it cannot parse, so that Vertica can report the rejection and write the rejected data to files.
The process() method must parse as much data as it can from the input buffer. The buffer might not end on a row boundary. Therefore, it might have to stop parsing in the middle of a row of input and ask for more data. The input can contain null bytes, if the source file contains them, and is not automatically null-terminated.
A parser has an associated StreamWriter object, which performs the actual writing of the data. When your parser extracts a column value, it uses one of the type-specific methods on StreamWriter to write that value to the output stream. See Writing data for more information about these methods.
A single call to process() might write several rows of data. When your parser finishes processing a row of data, it must call next() on its StreamWriter to advance the output stream to a new row. (Usually a parser finishes processing a row because it encounters an end-of-row marker.)
When your process() method reaches the end of the buffer, it tells Vertica its current state by returning one of the following values defined by StreamState (see the sketch after this list):

- INPUT_NEEDED: the parser has reached the end of the buffer and needs more data to parse.
- DONE: the parser has reached the end of the input data stream.
- REJECT: the parser has rejected the last row of data it read (see Rejecting rows).
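For illustration, the following minimal C++ sketch (not SDK-provided code) shows the shape of a process() method for a hypothetical format of newline-terminated integers. It assumes a single INTEGER output column and ignores a partial record left at end of file:

// Sketch only: a method of a C++ UDParser subclass. 'writer' is the parser's
// StreamWriter member; assumes one INTEGER column and '\n' row terminators.
virtual StreamState process(ServerInterface &srvInterface,
                            DataBuffer &input, InputState input_state) {
    size_t row_start = input.offset;
    for (size_t i = input.offset; i < input.size; i++) {
        if (input.buf[i] != '\n') continue;            // not yet at end of row
        std::string row(&input.buf[row_start], i - row_start);
        writer->setInt(0, (vint) atoll(row.c_str()));  // write column 0
        writer->next();                                // advance to a new output row
        row_start = i + 1;
    }
    input.offset = row_start;                          // consume only complete rows
    return (input_state == END_OF_FILE) ? DONE : INPUT_NEEDED;
}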
Tearing down
COPY calls destroy() after the last time that process() is called. It frees any resources reserved by the setup() or process() method.
Vertica calls this method after the process() method indicates it has completed parsing the data source. However, sometimes data sources that have not yet been processed might remain. In such cases, Vertica might later call setup() on the object again and have it parse the data in a new data stream. Therefore, write your destroy() method so that it leaves an instance of your UDParser subclass in a state where setup() can be safely called again.
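As a sketch, a restartable parser re-initializes all of its state in setup() and releases resources in destroy() without preventing another call to setup(); the counter field here is hypothetical:

// Sketch only: methods of a C++ UDParser subclass.
virtual void setup(ServerInterface &srvInterface, SizedColumnTypes &returnType) {
    rowsParsed = 0;   // hypothetical member; reset all state on every setup()
}

virtual void destroy(ServerInterface &srvInterface, SizedColumnTypes &returnType) {
    // Free anything reserved by setup() or process(), leaving the object in a
    // state where setup() can safely be called again.
}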
Reporting rejections
If process() rejects a row, Vertica calls getRejectedRecord() to report it. Usually, this method returns an instance of the RejectedRecord class with details of the rejected row.
Writing data
A parser has an associated StreamWriter object, which you access by calling getStreamWriter(). In your process() implementation, use the setType() methods on the StreamWriter object to write values in a row to specific column indexes. Verify that the data types you write match the data types expected by the schema.
The following example shows how you can write a value of type long to the fourth column (index 3) in the current row:

StreamWriter writer = getStreamWriter();
...
writer.setLongValue(3, 98);
StreamWriter provides methods for all the basic types, such as setBooleanValue(), setStringValue(), and so on. See the API documentation for a complete list of StreamWriter methods, including options that take primitive types or explicitly set entries to null.
Rejecting rows
If your parser finds data it cannot parse, it should reject the row by following these steps (see the sketch after this list):
- Saving details about the rejected row data and the reason for the rejection. These pieces of information can be stored directly in a RejectedRecord object, or in fields on your UDParser subclass, until they are needed.
- Updating the row's position in the input buffer by updating input.offset so it can resume parsing with the next row.
- Signaling that it has rejected a row by returning with the value StreamState.REJECT.
- Returning an instance of the RejectedRecord class with the details about the rejected row.
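The following C++ sketch ties these steps together for a hypothetical format of newline-terminated digit strings. It is illustrative only; in particular, check the SDK headers for the exact RejectedRecord constructor signature:

// Sketch only: members of a C++ UDParser subclass.
std::string rejectReason, rejectData;                    // step 1: saved details

virtual StreamState process(ServerInterface &srvInterface,
                            DataBuffer &input, InputState input_state) {
    for (size_t i = input.offset; i < input.size; i++) {
        if (input.buf[i] != '\n') continue;
        std::string row(&input.buf[input.offset], i - input.offset);
        input.offset = i + 1;                            // step 2: move past the row
        if (row.find_first_not_of("0123456789") != std::string::npos) {
            rejectReason = "row is not an integer";
            rejectData = row;
            return REJECT;                               // step 3: signal the rejection
        }
        writer->setInt(0, (vint) atoll(row.c_str()));
        writer->next();
    }
    return (input_state == END_OF_FILE) ? DONE : INPUT_NEEDED;
}

virtual RejectedRecord getRejectedRecord() {             // step 4: report the details
    return RejectedRecord(rejectReason, rejectData.c_str(),
                          rejectData.size(), "\n");
}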
Breaking up large loads
Vertica provides two ways to break up large loads. Apportioned load allows you to distribute a load among several database nodes. Cooperative parse (C++ only) allows you to distribute a load among several threads on one node.
API
The UDParser API provides the following methods for extension by subclasses:
A UDParser uses a StreamWriter to write its output. StreamWriter provides methods for all the basic types, such as setBooleanValue(), setStringValue(), and so on. In the Java API, this class also provides the setValue() method, which automatically sets the data type.
The methods described so far write single column values. StreamWriter also provides a method to write a complete row from a map. The setRowFromMap() method takes a map of column names and values and writes all the values into their corresponding columns. This method does not define new columns but instead writes values only to existing columns. The JsonParser example uses this method to write arbitrary JSON input. (See Java example: JSON parser.)
Note
The setRowFromMap() method does not automatically advance the output to the next row; you must call next(). You can thus read a row and then override selected column values.
setRowFromMap() also populates any VMap ('raw') column of flex tables (see Flex tables) with the entire provided map. For most cases, setRowFromMap() is the appropriate way to populate a flex table. However, you can also generate a VMap value into a specified column using setVMap(), similar to other setValue() methods.
The setRowFromMap() method automatically coerces the input values into the types defined for those columns using an associated TypeCoercion. In most cases, using the default implementation (StandardTypeCoercion) is appropriate.
TypeCoercion uses policies to govern its behavior. For example, the FAIL_INVALID_INPUT_VALUE policy means invalid input is treated as an error instead of using a null value. Errors are caught and handled as rejections (see "Rejecting rows" in User-defined parser). Policies also govern whether input that is too long is truncated. Use the setPolicy() method on the parser's TypeCoercion to set policies. See the API documentation for supported values.
You might need to customize type coercion beyond setting these policies. To do so, subclass one of the provided implementations of TypeCoercion and override the asType() methods. Such customization could be necessary if your parser reads objects that come from a third-party library. A parser handling geo-coordinates, for example, might override asLong() to translate inputs like "40.4397N" into numbers. See the Vertica API documentation for a list of implementations.
In Python, the process() method requires both an input buffer and an output buffer (see InputBuffer and OutputBuffer APIs). The input buffer represents the source of the information that you want to parse. The output buffer delivers the filtered information to Vertica.
If the parser rejects a record, use the REJECT() method to identify the rejected data and the reason for the rejection.
2 - UDChunker class
You can subclass the UDChunker class to allow your parser to support Cooperative parse. This class is available only in the C++ API.
Fundamentally, a UDChunker is a very simplistic parser. Like UDParser, it has the following three methods: setup(), process(), and destroy(). You must override process(); you may override the others. This class has one additional method, alignPortion(), which you must implement if you want to enable Apportioned load for your UDChunker.
Setting up and tearing down
As with UDParser, you can define initialization and cleanup code for your chunker. Vertica calls setup() before the first call to process() and destroy() after the last call to process(). Your object might be reused among multiple load sources, so make sure that setup() completely initializes all fields.
Chunking
Vertica calls process() to divide an input into chunks that can be parsed independently. The method takes an input buffer and an indicator of the input state:

- OK: the input buffer begins at the start of or in the middle of a stream.
- END_OF_FILE: no further data is available.
- END_OF_PORTION: the source has reached the end of its portion. This state occurs only when using apportioned load.
If the input state is END_OF_FILE, the chunker should set the input.offset marker to input.size and return DONE. Returning INPUT_NEEDED is an error.
If the input state is OK, the chunker should read data from the input buffer and find record boundaries. If it finds the end of at least one record, it should align the input.offset marker with the byte after the end of the last record in the buffer and return CHUNK_ALIGNED. For example, if the input is "abc~def" and "~" is a record terminator, this method should set input.offset to 4, the position of "d". If process() reaches the end of the input without finding a record boundary, it should return INPUT_NEEDED.
You can divide the input into smaller chunks, but consuming all available records in the input can have better performance. For example, a chunker could scan backwards from the end of the input to find a record terminator, which might be the last of many records in the input, and return it all as one chunk without scanning through the rest of the input.
If the input state is END_OF_PORTION, the chunker should behave as it does for an input state of OK, except that it should also set a flag. When called again, it should find the first record in the next portion and align the chunk to that record.
The input data can contain null bytes, if the source file contains them. The input argument is not automatically null-terminated.
The process() method must not block indefinitely. If this method cannot proceed for an extended period of time, it should return KEEP_GOING. Failing to return KEEP_GOING has several consequences, such as preventing your user from being able to cancel the query.
See C++ example: delimited parser and chunker for an example of the process() method using chunking.
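As a minimal illustration of the backward-scanning approach described above, the following sketch (assuming '\n' as the record terminator and omitting END_OF_PORTION flag handling) emits the largest possible chunk of whole records:

// Sketch only: a method of a C++ UDChunker subclass.
virtual StreamState process(ServerInterface &srvInterface,
                            DataBuffer &input, InputState input_state) {
    if (input_state == END_OF_FILE) {
        input.offset = input.size;            // consume everything that remains
        return DONE;
    }
    for (size_t i = input.size; i > input.offset; i--) {
        if (input.buf[i - 1] == '\n') {       // last terminator in the buffer
            input.offset = i;                 // byte after the last whole record
            return CHUNK_ALIGNED;
        }
    }
    return INPUT_NEEDED;                      // no complete record in the buffer
}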
Aligning portions
If your chunker supports apportioned load, implement the alignPortion() method. Vertica calls this method one or more times, before calling process(), to align the input offset with the beginning of the first complete chunk in the portion. The method takes an input buffer and an indicator of the input state:
- START_OF_PORTION: the beginning of the buffer corresponds to the start of the portion. You can use the getPortion() method to access the offset and size of the portion.
- OK: the input buffer is in the middle of a portion.
- END_OF_PORTION: the end of the buffer corresponds to the end of the portion or beyond the end of a portion.
- END_OF_FILE: no further data is available.
The method should scan from the beginning of the buffer to the start of the first complete record. It should set input.offset to this position and return one of the following values (see the sketch after this list):
- DONE, if it found a chunk. input.offset is the first byte of the chunk.
- INPUT_NEEDED, if the input buffer does not contain the start of any chunk. It is an error to return this from an input state of END_OF_FILE.
- REJECT, if the portion (not buffer) does not contain the start of any chunk.
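A bare-bones sketch of this logic, again assuming '\n' as the record terminator:

// Sketch only: skip the partial record at the start of the portion.
virtual StreamState alignPortion(ServerInterface &srvInterface,
                                 DataBuffer &input, InputState state) {
    for (size_t i = input.offset; i < input.size; i++) {
        if (input.buf[i] == '\n') {
            input.offset = i + 1;     // first byte of the first complete record
            return DONE;
        }
    }
    // No boundary found: at END_OF_FILE the portion contains no chunk start.
    return (state == END_OF_FILE) ? REJECT : INPUT_NEEDED;
}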
API
The UDChunker API provides the following methods for extension by subclasses:
3 - ParserFactory class
If you write a parser, you must also write a factory to produce parser instances. To do so, subclass the ParserFactory class.
Parser factories are singletons. Your subclass must be stateless, with no fields containing data. Your subclass also must not modify any global variables.
The ParserFactory class defines the following methods. Your subclass must override the prepare() method. It may override the other methods.
Setting up
Vertica calls plan() once on the initiator node to perform the following tasks:
- Check any parameters that have been passed from the function call in the COPY statement and generate error messages if there are any issues. You read the parameters by getting a ParamReader object from the instance of ServerInterface passed into your plan() method.
- Store any information that the individual hosts need in order to parse the data. For example, you could store parameters in the PlanContext instance passed in through the planCtxt parameter. The plan() method runs only on the initiator node, and the prepare() method runs on each host reading from a data source. Therefore, this object is the only means of communication between them. You store data in the PlanContext by getting a ParamWriter object from the getWriter() method. You then write parameters by calling methods on the ParamWriter, such as setString().
Note
ParamWriter offers only the ability to store simple data types. For complex types, you need to serialize the data in some manner and store it as a string or long string.
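For example, a plan() method might validate and forward a separator parameter like this sketch (the parameter name is illustrative):

// Sketch only: copy a COPY-statement parameter into the PlanContext.
virtual void plan(ServerInterface &srvInterface,
                  PerColumnParamReader &perColumnParamReader,
                  PlanContext &planCtxt) {
    ParamReader args = srvInterface.getParamReader();
    if (args.containsParameter("separator")) {
        std::string sep = args.getStringRef("separator").str();
        if (sep.size() != 1) {
            vt_report_error(0, "separator must be a single character");
        }
        planCtxt.getWriter().setString("separator", sep);
    }
}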
Creating parsers
Vertica calls prepare() on each node to create and initialize your parser, using data stored by the plan() method.
Defining parameters
Implement getParameterTypes() to define the names and types of parameters that your parser uses. Vertica uses this information to warn callers about unknown or missing parameters. Vertica ignores unknown parameters and uses default values for missing parameters. While you should define the types and parameters for your function, you are not required to override this method.
Defining parser outputs
Implement getParserReturnType() to define the data types of the table columns that the parser outputs. If applicable, getParserReturnType() also defines the size, precision, or scale of the data types. Usually, this method reads data types of the output table from the argType and perColumnParamReader arguments and verifies that it can output the appropriate data types. If getParserReturnType() is prepared to output the data types, it calls methods on the SizedColumnTypes object passed in the returnType argument. In addition to the data type of the output column, your method should also specify any additional information about the column's data type (see the sketch after this list):
- For binary and string data types (such as CHAR, VARCHAR, and LONG VARBINARY), specify the maximum length.
- For NUMERIC types, specify the precision and scale.
- For Time/Timestamp types (with or without time zone), specify the precision (-1 means unspecified).
- For ARRAY types, specify the maximum number of elements.
- For all other types, no length or precision specification is required.
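A sketch of such a method for a hypothetical three-column output (the names and sizes are illustrative):

// Sketch only: declare sized output columns on the returnType object.
virtual void getParserReturnType(ServerInterface &srvInterface,
                                 PerColumnParamReader &perColumnParamReader,
                                 PlanContext &planCtxt,
                                 const SizedColumnTypes &argTypes,
                                 SizedColumnTypes &returnType) {
    returnType.addVarchar(64, "name");      // string type: maximum length
    returnType.addNumeric(10, 2, "price");  // NUMERIC: precision and scale
    returnType.addInt("quantity");          // INTEGER needs no size information
}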
Supporting cooperative parse
To support Cooperative parse, implement prepareChunker() and return an instance of your UDChunker subclass. If isChunkerApportionable() returns true, then it is an error for this method to return null.

Cooperative parse is currently supported only in the C++ API.
Supporting apportioned load
To support Apportioned load, your parser, chunker, or both must support apportioning. To indicate that the parser can apportion a load, implement isParserApportionable() and return true. To indicate that the chunker can apportion a load, implement isChunkerApportionable() and return true.
The isChunkerApportionable() method takes a ServerInterface as an argument, so you have access to the parameters supplied in the COPY statement. You might need this information if the user can specify a record delimiter, for example. Return true from this method if and only if the factory can create a chunker for this input.
API
The ParserFactory API provides the following methods for extension by subclasses:
If you are using Apportioned load to divide a single input into multiple load streams, implement isParserApportionable(), isChunkerApportionable(), or both, and return true. Returning true from these methods does not guarantee that Vertica will apportion the load. However, returning false from both indicates that it will not try to do so.
If you are using Cooperative parse, implement prepareChunker() and return an instance of your UDChunker subclass. Cooperative parse is supported only for the C++ API.

Vertica calls the prepareChunker() method only for unfenced functions. This method is not available when you use the function in fenced mode.
If you want your chunker to be available for apportioned load, implement isChunkerApportionable() and return true.
After creating your ParserFactory, you must register it with the RegisterFactory macro.
4 - C++ example: BasicIntegerParser
The BasicIntegerParser example parses a string of integers separated by non-numeric characters. For a version of this parser that uses continuous load, see C++ example: ContinuousIntegerParser.
Loading and using the example
Load and use the BasicIntegerParser example as follows.
=> CREATE LIBRARY BasicIntegerParserLib AS '/home/dbadmin/BIP.so';
=> CREATE PARSER BasicIntegerParser AS
LANGUAGE 'C++' NAME 'BasicIntegerParserFactory' LIBRARY BasicIntegerParserLib;
=> CREATE TABLE t (i integer);
=> COPY t FROM stdin WITH PARSER BasicIntegerParser();
0
1
2
3
4
5
\.
Implementation
The BasicIntegerParser class implements only the process() method from the API. (It also implements a helper method for type conversion.) This method processes each line of input, looking for numbers on each line. When it advances to a new line, it moves the input.offset marker and checks the input state. It then writes the output.
In the factory, the plan() method is a no-op; there are no parameters to check. The prepare() method instantiates the parser using the macro provided by the SDK.
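The pattern looks like this sketch (the shipped example may differ in detail):

// Sketch only: instantiate the parser with the SDK's allocation macro.
virtual UDParser* prepare(ServerInterface &srvInterface,
                          PerColumnParamReader &perColumnParamReader,
                          PlanContext &planCtxt,
                          const SizedColumnTypes &returnType) {
    return vt_createFuncObject<BasicIntegerParser>(srvInterface.allocator);
}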
The getParserReturnType() method declares the single output.
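In outline, it looks like this sketch (the shipped example also validates the output column):

// Sketch only: declare a single INTEGER output column.
virtual void getParserReturnType(ServerInterface &srvInterface,
                                 PerColumnParamReader &perColumnParamReader,
                                 PlanContext &planCtxt,
                                 const SizedColumnTypes &argTypes,
                                 SizedColumnTypes &returnType) {
    returnType.addInt(argTypes.getColumnName(0));
}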
As for all UDxs written in C++, the example ends by registering its factory:
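// Register the factory so Vertica can find it in the library.
RegisterFactory(BasicIntegerParserFactory);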
5 - C++ example: ContinuousIntegerParser
The ContinuousIntegerParser example is a variation of BasicIntegerParser. Both examples parse integers from input strings. ContinuousIntegerParser uses Continuous load to read data.
Loading and using the example
Load the ContinuousIntegerParser example as follows.
=> CREATE LIBRARY ContinuousIntegerParserLib AS '/home/dbadmin/CIP.so';
=> CREATE PARSER ContinuousIntegerParser AS
LANGUAGE 'C++' NAME 'ContinuousIntegerParserFactory'
LIBRARY ContinuousIntegerParserLib;
Use it in the same way that you use BasicIntegerParser. See C++ example: BasicIntegerParser.
Implementation
ContinuousIntegerParser is a subclass of ContinuousUDParser. Subclasses of ContinuousUDParser place the processing logic in the run() method.
For a more complex example of a ContinuousUDParser, see ExampleDelimitedParser in the examples. (See Downloading and running UDx example code.) ExampleDelimitedParser uses a chunker; see C++ example: delimited parser and chunker.
6 - Java example: numeric text
This NumericTextParser example parses integer values spelled out in words rather than digits (for example, "one two three" for one hundred twenty-three). The parser:
- Accepts a single parameter to set the character that separates columns in a row of data. The separator defaults to the pipe (|) character.
- Ignores extra spaces and the capitalization of the words used to spell out the digits.
- Recognizes the digits using the following words: zero, one, two, three, four, five, six, seven, eight, nine.
- Assumes that the words spelling out an integer are separated by at least one space.
- Rejects any row of data that cannot be completely parsed into integers.
- Generates an error if the output table has a non-integer column.
Loading and using the example
Load and use the parser as follows:
Parser implementation
The following code implements the parser.
ParserFactory implementation
The following code implements the parser factory.
NumericTextParser accepts a single optional parameter named separator. This parameter is defined in the getParameterTypes() method, and the plan() method stores its value. NumericTextParser outputs only integer values. Therefore, if the output table contains a column whose data type is not integer, the getParserReturnType() method throws an exception.
7 - Java example: JSON parser
The JSON parser consumes a stream of JSON objects. Each object must be well formed and on a single line in the input. Use line breaks to delimit the objects. The parser uses the field names as keys in a map, which become column names in the table. You can find the code for this example in /opt/vertica/packages/flextable/examples. This directory also contains an example data file.
This example uses the setRowFromMap() method to write data.
Loading and using the example
Load the library and define the JSON parser, using the third-party library (gson-2.2.4.jar
) as follows. See the comments in JsonParser.java for a download URL:
You can now define a table and then use the JSON parser to load data into it, as follows:
The data file contains a value (hike_safety) that was not loaded because the table definition did not include that column. The data file follows:
Implementation
The process() method in JsonParser.java attempts to read the input into a Map. If the read is successful, the parser calls setRowFromMap().
The factory, JsonParserFactory.java, instantiates and returns a parser in the prepare() method. No additional setup is required.
8 - C++ example: delimited parser and chunker
The ExampleDelimitedUDChunker class divides an input at delimiter characters. You can use this chunker with any parser that understands delimited input. ExampleDelimitedParser is a ContinuousUDParser subclass that uses this chunker.
Loading and using the example
Load and use the example as follows.
=> CREATE LIBRARY ExampleDelimitedParserLib AS '/home/dbadmin/EDP.so';
=> CREATE PARSER ExampleDelimitedParser AS
LANGUAGE 'C++' NAME 'DelimitedParserFrameworkExampleFactory'
LIBRARY ExampleDelimitedParserLib;
=> COPY t FROM stdin WITH PARSER ExampleDelimitedParser();
0
1
2
3
4
5
6
7
8
9
\.
Chunker implementation
This chunker supports apportioned load. The alignPortion() method finds the beginning of the first complete record in the current portion and aligns the input buffer with it. The record terminator is passed as an argument and set in the constructor.
The process() method has to account for chunks that span portion boundaries. If the previous call was at the end of a portion, the method set a flag. The code begins by checking for and handling that condition. The logic is similar to that of alignPortion(), so the example calls it to do part of the division.
Now the method looks for the delimiter. If the input began at the end of a portion, it sets the flag.
Finally, process() moves the input offset and returns.
Factory implementation
The file ExampleDelimitedParser.cpp defines a factory that uses this UDChunker. The chunker supports apportioned load, so the factory implements isChunkerApportionable().
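The method simply reports that the chunker can apportion a load; in sketch form:

// Sketch only: report that this factory's chunker supports apportioning.
virtual bool isChunkerApportionable(ServerInterface &srvInterface) {
    return true;
}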
The prepareChunker() method creates the chunker.
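In outline, assuming the chunker can be created with the SDK's allocation macro (the shipped example also reads the delimiter parameter and passes it to the chunker's constructor):

// Sketch only: create the chunker on each node.
virtual UDChunker* prepareChunker(ServerInterface &srvInterface,
                                  PerColumnParamReader &perColumnParamReader,
                                  PlanContext &planCtxt,
                                  const SizedColumnTypes &returnType) {
    return vt_createFuncObject<ExampleDelimitedUDChunker>(srvInterface.allocator);
}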
9 - Python example: complex types JSON parser
The following example details a UDParser that takes a JSON object and parses it into complex types. For this example, the parser assumes the input data are arrays of rows with two integer fields. The input records should be separated by newline characters. If any row fields aren't specified by the JSON input, the function parses those fields as NULL.

The source code for this UDParser also contains a factory method for parsing rows that have an integer and an array of integer fields. The implementation of the parser is independent of the return type in the factory, so you can create factories with different return types that all point to the ComplexJsonParser() class in the prepare() method. The complete source code is in /opt/vertica/sdk/examples/python/UDParsers.py.
Loading and using the example
Load the library and create the parser as follows:
=> CREATE OR REPLACE LIBRARY UDParsers AS '/home/dbadmin/examples/python/UDParsers.py' LANGUAGE 'Python';
=> CREATE PARSER ComplexJsonParser AS LANGUAGE 'Python' NAME 'ArrayJsonParserFactory' LIBRARY UDParsers;
You can now define a table and then use the JSON parser to load data into it, for example:
=> CREATE TABLE orders (a bool, arr array[row(a int, b int)]);
CREATE TABLE
=> COPY orders (arr) FROM STDIN WITH PARSER ComplexJsonParser();
[]
[{"a":1, "b":10}]
[{"a":1, "b":10}, {"a":null, "b":10}]
[{"a":1, "b":10},{"a":10, "b":20}]
[{"a":1, "b":10}, {"a":null, "b":null}]
[{"a":1, "b":2}, {"a":3, "b":4}, {"a":5, "b":6}, {"a":7, "b":8}, {"a":9, "b":10}, {"a":11, "b":12}, {"a":13, "b":14}]
\.
=> SELECT * FROM orders;
a | arr
--+--------------------------------------------------------------------------
| []
| [{"a":1,"b":10}]
| [{"a":1,"b":10},{"a":null,"b":10}]
| [{"a":1,"b":10},{"a":10,"b":20}]
| [{"a":1,"b":10},{"a":null,"b":null}]
| [{"a":1,"b":2},{"a":3,"b":4},{"a":5,"b":6},{"a":7,"b":8},{"a":9,"b":10},{"a":11,"b":12},{"a":13,"b":14}]
(6 rows)
Setup
All Python UDxs must import the Vertica SDK library. ComplexJsonParser() also requires the json library.
Factory implementation
The prepare() method instantiates and returns a parser:
getParserReturnType() declares that the return type must be an array of rows that each have two integer fields:
Parser implementation
The process() method reads in data with an InputBuffer and then splits that input data on the newline character. The method then passes the processed data to the writeRows() method. writeRows() turns each data row into a JSON object, checks the type of that JSON object, and then writes the appropriate value or object to the output.