C++ example: BasicIntegerParser
The BasicIntegerParser example parses a string of integers separated by non-numeric characters. For a version of this parser that uses continuous load, see C++ example: ContinuousIntegerParser.
Loading and using the example
Load and use the BasicIntegerParser example as follows.
=> CREATE LIBRARY BasicIntegerParserLib AS '/home/dbadmin/BIP.so';
=> CREATE PARSER BasicIntegerParser AS
LANGUAGE 'C++' NAME 'BasicIntegerParserFactory' LIBRARY BasicIntegerParserLib;
=> CREATE TABLE t (i integer);
=> COPY t FROM stdin WITH PARSER BasicIntegerParser();
0
1
2
3
4
5
\.
Implementation
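The BasicIntegerParser class implements only the process() method from the API. (It also implements a helper method for type conversion.) This method scans the input for runs of digits, treating any non-digit character (such as a newline) as a delimiter. Each time it finds a delimiter, it writes the integer that precedes it to the output and moves on to the next position. When it reaches the end of the buffer without finding a delimiter, it sets the input.offset marker to the start of the unfinished number and checks the input state: at end of file it writes the last integer (if any) and returns DONE; otherwise it returns INPUT_NEEDED.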
    virtual StreamState process(ServerInterface &srvInterface, DataBuffer &input,
                                InputState input_state) {
        // WARNING: This implementation is not trying for efficiency.
        // It is trying for simplicity, for demonstration purposes.

        size_t start = input.offset;
        const size_t end = input.size;

        do {
            bool found_newline = false;
            size_t numEnd = start;
            // Any non-digit byte ends the current number, not only a newline.
            for (; numEnd < end; numEnd++) {
                if (input.buf[numEnd] < '0' || input.buf[numEnd] > '9') {
                    found_newline = true;
                    break;
                }
            }

            if (!found_newline) {
                // Leave the unfinished number in the buffer.
                input.offset = start;
                if (input_state == END_OF_FILE) {
                    // If we're at end-of-file,
                    // emit the last integer (if any) and return DONE.
                    if (start != end) {
                        writer->setInt(0, strToInt(input.buf + start, input.buf + numEnd));
                        writer->next();
                    }
                    return DONE;
                } else {
                    // Otherwise, we need more data.
                    return INPUT_NEEDED;
                }
            }

            writer->setInt(0, strToInt(input.buf + start, input.buf + numEnd));
            writer->next();

            start = numEnd + 1;
        } while (true);
    }
};
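To experiment with the buffering logic outside the server, the scan can be approximated with a minimal standalone sketch. Everything here is an assumption made for illustration: Buffer, the two enums, and parseChunks() are hypothetical stand-ins rather than SDK types, strToInt() is a guess at what the unshown helper does, and the output goes to a std::vector instead of the writer. This sketch also skips empty tokens between adjacent delimiters and marks the buffer as fully consumed at end of file, which are choices made here and not something the example states.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the SDK types; these are NOT the Vertica API.
enum StreamState { INPUT_NEEDED, DONE };
enum InputState { OK, END_OF_FILE };

struct Buffer {
    const char *buf;  // bytes available to the parser
    size_t size;      // number of valid bytes in buf
    size_t offset;    // bytes consumed so far; updated by process()
};

// Assumed helper: converts the digits in [start, end) to an integer.
long long strToInt(const char *start, const char *end) {
    long long value = 0;
    for (const char *p = start; p != end; ++p) {
        value = value * 10 + (*p - '0');
    }
    return value;
}

// Same scan as the example, writing to a vector instead of a writer.
StreamState process(Buffer &input, InputState inputState,
                    std::vector<long long> &out) {
    assert(input.offset <= input.size);
    size_t start = input.offset;
    const size_t end = input.size;

    while (true) {
        // Advance numEnd over a run of digits.
        size_t numEnd = start;
        while (numEnd < end &&
               input.buf[numEnd] >= '0' && input.buf[numEnd] <= '9') {
            ++numEnd;
        }

        if (numEnd == end) {
            // No delimiter found: the number may continue in the next chunk.
            input.offset = start;
            if (inputState != END_OF_FILE) {
                return INPUT_NEEDED;
            }
            if (start != end) {
                out.push_back(strToInt(input.buf + start, input.buf + numEnd));
            }
            input.offset = end;  // sketch-only: mark everything consumed
            return DONE;
        }

        if (numEnd > start) {  // sketch-only: skip empty tokens
            out.push_back(strToInt(input.buf + start, input.buf + numEnd));
        }
        start = numEnd + 1;  // step over the delimiter
    }
}

// Hypothetical driver that mimics a caller feeding data in pieces and
// keeping the unconsumed tail for the next call.
std::vector<long long> parseChunks(const std::vector<std::string> &chunks) {
    std::vector<long long> out;
    std::string pending;
    for (size_t i = 0; i < chunks.size(); ++i) {
        pending += chunks[i];
        Buffer input{pending.data(), pending.size(), 0};
        InputState state = (i + 1 == chunks.size()) ? END_OF_FILE : OK;
        process(input, state, out);
        pending.erase(0, input.offset);  // drop what the parser consumed
    }
    return out;
}
```

Calling parseChunks({"12\n3", "4\n56"}) yields 12, 34, and 56. The 3 at the end of the first chunk has no delimiter after it, so process() returns INPUT_NEEDED and leaves it in the buffer, where it joins the 4 from the second chunk.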
In the factory, the plan() method is a no-op; there are no parameters to check. The prepare() method instantiates the parser using the macro provided by the SDK:
    virtual UDParser* prepare(ServerInterface &srvInterface,
                              PerColumnParamReader &perColumnParamReader,
                              PlanContext &planCtxt,
                              const SizedColumnTypes &returnType) {
        return vt_createFuncObject<BasicIntegerParser>(srvInterface.allocator);
    }
The getParserReturnType() method declares the single output:
    virtual void getParserReturnType(ServerInterface &srvInterface,
                                     PerColumnParamReader &perColumnParamReader,
                                     PlanContext &planCtxt,
                                     const SizedColumnTypes &argTypes,
                                     SizedColumnTypes &returnType) {
        // We only and always have a single integer column
        returnType.addInt(argTypes.getColumnName(0));
    }
As for all UDxs written in C++, the example ends by registering its factory:
RegisterFactory(BasicIntegerParserFactory);