Data Input

Operators in the Data Input category

Operators

| Operator | Description |
| --- | --- |
| Arrow File Scan | Scan data from an Arrow file |
| CSV File Scan | Scan data from a CSV file |
| CSVOld File Scan | Scan data from a CSVOld file |
| File Lister | Select a dataset version and output one filename tuple per file |
| File Scan | Scan data from a file |
| File Scan From Input | Scan data from file paths provided by input tuples |
| JSONL File Scan | Scan data from a JSONL file |
| Text Input | Source data from manually inputted text |

Total: 8 operators

1 - Arrow File Scan

Scan data from an Arrow file

Input Properties

| Property | Requirement | Type | Default | Description |
| --- | --- | --- | --- | --- |
| File | | String | - | |
| Limit | | Integer | - | Max output count |
| Offset | | Integer | - | Starting point of output |
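
The Limit and Offset properties recur on most operators in this category. The following Python sketch illustrates one plausible reading of them, not the operator's actual implementation: it assumes Offset is a zero-based count of rows to skip, that it is applied before Limit, and that an unset value (`-`) means "no restriction". The function name `apply_offset_limit` is hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def apply_offset_limit(
    rows: Iterable[T],
    offset: Optional[int] = None,
    limit: Optional[int] = None,
) -> Iterator[T]:
    """Skip `offset` rows, then yield at most `limit` rows.

    Assumption: None stands for an unset property (shown as "-" in the table).
    """
    start = offset or 0
    stop = None if limit is None else start + limit
    return islice(rows, start, stop)


rows = ["r0", "r1", "r2", "r3", "r4"]
window = list(apply_offset_limit(rows, offset=1, limit=3))
print(window)  # ['r1', 'r2', 'r3']
```

Because `islice` is lazy, the sketch never reads rows past the requested window, which is the behavior one would usually want from a scan over a large file.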

Output Ports

| Port | Mode |
| --- | --- |
| 0 | Set Snapshot |

2 - CSV File Scan

Scan data from a CSV file

Input Properties

| Property | Requirement | Type | Default | Description |
| --- | --- | --- | --- | --- |
| File | | String | - | |
| File Encoding | | UTF_8, UTF_16, US_ASCII | UTF_8 | Decoding charset to use on input |
| Limit | | Integer | - | Max output count |
| Offset | | Integer | - | Starting point of output |
| Delimiter | | String | , | Delimiter to separate each line into fields |
| Header | | Boolean | true | Whether the CSV file contains a header line |
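
To make the interplay of File Encoding, Delimiter, Header, Limit, and Offset concrete, here is a minimal Python sketch built on the standard `csv` module. It is an illustration under stated assumptions, not the operator's code: the function `scan_csv`, the `ENCODINGS` mapping to Python codec names, the generated `column-N` names used when there is no header, and the choice to count Offset in data rows after the header are all hypothetical.

```python
import csv
import io
from itertools import islice

# Assumed mapping from the documented encoding names to Python codec names.
ENCODINGS = {"UTF_8": "utf-8", "UTF_16": "utf-16", "US_ASCII": "ascii"}


def scan_csv(stream, delimiter=",", header=True, offset=None, limit=None):
    """Read delimited text and return a list of dicts, one per data row."""
    reader = csv.reader(stream, delimiter=delimiter)
    names = None
    if header:
        names = next(reader, None)  # first line supplies the column names
        if names is None:
            return []  # empty input
    start = offset or 0
    stop = None if limit is None else start + limit
    rows = []
    for fields in islice(reader, start, stop):
        if names is None:
            # Hypothetical naming scheme for header-less files.
            names = [f"column-{i + 1}" for i in range(len(fields))]
        rows.append(dict(zip(names, fields)))
    return rows


# In-memory bytes stand in for the file selected by the File property.
data = "id;name\n1;ada\n2;grace\n3;alan\n".encode("utf-8")
text = io.TextIOWrapper(io.BytesIO(data), encoding=ENCODINGS["UTF_8"], newline="")
rows = scan_csv(text, delimiter=";", header=True, offset=1, limit=1)
print(rows)  # [{'id': '2', 'name': 'grace'}]
```

All values come back as strings here; any type inference the real operator performs is outside the scope of this sketch.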

Output Ports

| Port | Mode |
| --- | --- |
| 0 | Set Snapshot |

3 - CSVOld File Scan

Scan data from a CSVOld file

Input Properties

| Property | Requirement | Type | Default | Description |
| --- | --- | --- | --- | --- |
| File | | String | - | |
| File Encoding | | UTF_8, UTF_16, US_ASCII | UTF_8 | Decoding charset to use on input |
| Limit | | Integer | - | Max output count |
| Offset | | Integer | - | Starting point of output |
| Delimiter | | String | , | Delimiter to separate each line into fields |
| Header | | Boolean | true | Whether the CSV file contains a header line |

Output Ports

| Port | Mode |
| --- | --- |
| 0 | Set Snapshot |

4 - File Lister

Select a dataset version and output one filename tuple per file

Input Properties

| Property | Requirement | Type | Default | Description |
| --- | --- | --- | --- | --- |
| Dataset | | String | - | |

Output Ports

| Port | Mode |
| --- | --- |
| 0 | Set Snapshot |

5 - File Scan

Scan data from a file

Input Properties

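| Property | Requirement | Type | Default | Description |
| --- | --- | --- | --- | --- |
| File | | String | - | |
| Encoding | | UTF_8, UTF_16, US_ASCII | UTF_8 | |
| Extract | | Boolean | false | |
| ↳ Include Filename | | Boolean | false | |
| Attribute Type | | string, single string, integer, long, double, boolean, timestamp, binary, large binary | string | |
| Attribute Name | | String | line | |
| Limit | | Integer | - | |
| Offset | | Integer | - | |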

Output Ports

| Port | Mode |
| --- | --- |
| 0 | Set Snapshot |

6 - File Scan From Input

Scan data from file paths provided by input tuples

Input Properties

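| Property | Requirement | Type | Default | Description |
| --- | --- | --- | --- | --- |
| Encoding | | UTF_8, UTF_16, US_ASCII | UTF_8 | |
| Extract | | Boolean | false | |
| Include Filename | | Boolean | false | |
| Attribute Type | | string, single string, integer, long, double, boolean, timestamp, binary, large binary | string | |
| Attribute Name | | String | line | |
| Limit | | Integer | - | |
| Offset | | Integer | - | |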
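
This operator takes its file paths from incoming tuples rather than from a File property. The Python sketch below shows the general idea only. Several details are assumptions that the table does not confirm: that each input tuple carries its path in an attribute called `filename`, that each line of a file becomes one output tuple, and that Include Filename adds the file's base name to every output tuple. The function `scan_from_input` and its parameters are hypothetical.

```python
import os
import tempfile


def scan_from_input(
    tuples,
    path_attribute="filename",  # hypothetical name of the path attribute
    encoding="utf-8",
    include_filename=False,
    attribute_name="line",
):
    """Read every file named by the input tuples, one output row per line."""
    rows = []
    for item in tuples:
        path = item[path_attribute]
        with open(path, encoding=encoding) as handle:
            for line in handle:
                row = {attribute_name: line.rstrip("\n")}
                if include_filename:
                    row["filename"] = os.path.basename(path)
                rows.append(row)
    return rows


# Demo: write a temporary file, then scan it through an input tuple.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "a.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("first\nsecond\n")
    rows = scan_from_input([{"filename": path}], include_filename=True)

print(rows)
# [{'line': 'first', 'filename': 'a.txt'}, {'line': 'second', 'filename': 'a.txt'}]
```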

Output Ports

| Port | Mode |
| --- | --- |
| 0 | Set Snapshot |

7 - JSONL File Scan

Scan data from a JSONL file

Input Properties

| Property | Requirement | Type | Default | Description |
| --- | --- | --- | --- | --- |
| File | | String | - | |
| File Encoding | | UTF_8, UTF_16, US_ASCII | UTF_8 | Decoding charset to use on input |
| Limit | | Integer | - | Max output count |
| Offset | | Integer | - | Starting point of output |
| Flatten | | Boolean | false | Flatten nested objects and arrays |
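
The Flatten property turns nested objects and arrays into top-level fields. The Python sketch below shows one common way to do that. The naming scheme is an assumption: it joins keys with dots and uses zero-based array indices, which may differ from what the operator actually produces. The functions `flatten` and `scan_jsonl` are hypothetical helpers.

```python
import json


def flatten(value, prefix=""):
    """Recursively flatten dicts and lists into a single-level dict.

    Assumed naming: nested keys joined with ".", list items by index.
    """
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = ((str(index), item) for index, item in enumerate(value))
    else:
        return {prefix: value}  # scalar: emit under the accumulated name
    flat = {}
    for key, child in items:
        name = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(child, name))
    return flat


def scan_jsonl(lines, flatten_nested=False):
    """Parse one JSON object per non-empty line."""
    rows = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        rows.append(flatten(record) if flatten_nested else record)
    return rows


sample = ['{"id": 1, "user": {"name": "ada", "tags": ["x", "y"]}}']
flat_rows = scan_jsonl(sample, flatten_nested=True)
print(flat_rows)
# [{'id': 1, 'user.name': 'ada', 'user.tags.0': 'x', 'user.tags.1': 'y'}]
```

With `flatten_nested=False` (the documented default for Flatten), the nested `user` object is kept as a single field.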

Output Ports

| Port | Mode |
| --- | --- |
| 0 | Set Snapshot |

8 - Text Input

Source data from manually inputted text

Input Properties

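| Property | Requirement | Type | Default | Description |
| --- | --- | --- | --- | --- |
| Text | | String | - | |
| Attribute Type | | string, single string, integer, long, double, boolean, timestamp, binary, large binary | string | |
| Attribute Name | | String | line | |
| Limit | | Integer | - | |
| Offset | | Integer | - | |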
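
The table gives no descriptions for Attribute Type and Attribute Name, so the following Python sketch is a guess at their intent rather than a statement of behavior. It assumes the text is split into lines, each line is cast to the chosen type and emitted under the chosen attribute name, and that `single string` emits the whole text as one tuple. Only a subset of the listed types is covered, and the function `text_input` is hypothetical.

```python
def text_input(
    text,
    attribute_type="string",
    attribute_name="line",
    offset=None,
    limit=None,
):
    """Turn pasted text into a list of single-attribute rows."""
    if attribute_type == "single string":
        # Assumption: the entire text becomes one row.
        return [{attribute_name: text}]

    # Assumed casts for a subset of the documented types.
    casts = {
        "string": str,
        "integer": int,
        "long": int,
        "double": float,
        "boolean": lambda s: s.strip().lower() == "true",
    }
    cast = casts[attribute_type]

    start = offset or 0
    stop = None if limit is None else start + limit
    return [{attribute_name: cast(line)} for line in text.splitlines()[start:stop]]


numbers = text_input("1\n2\n3", attribute_type="integer", attribute_name="n", offset=1)
print(numbers)  # [{'n': 2}, {'n': 3}]
```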

Output Ports

| Port | Mode |
| --- | --- |
| 0 | Set Snapshot |