Python

Operators in the Python category

Home > User-defined Functions > Python

Operators

Operator | Description
2-in Python UDF | User-defined function operator in Python script
Python Lambda Function | Modify or add a new column with more ease
Python Table Reducer | Reduce Table to Tuple
1-out Python UDF | User-defined function operator in Python script
Python UDF | User-defined function operator in Python script

Total: 5 operators

1 - 1-out Python UDF

User-defined function operator in Python script

Input Properties

Property | Requirement | Type | Default | Description
Python script |  | Code (python) | See template below | Input your code here
Worker count |  | Integer | 1 | Specify how many parallel workers to launch
Columns |  | List | - | The columns of the source
↳ Attribute Name |  | String | - |
↳ Attribute Type |  | string, integer, long, double, boolean, timestamp, binary, large_binary | - |

Default Code Template

Python script

# from pytexera import *
# class GenerateOperator(UDFSourceOperator):
# 
#     @overrides
#     
#     def produce(self) -> Iterator[Union[TupleLike, TableLike, None]]:
#         yield
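
The template above is fully commented out and yields nothing. As a rough illustration of what an uncommented source operator might look like, the sketch below uses a plain-Python stand-in class rather than importing `pytexera`, so the `UDFSourceOperator` base class and the `@overrides` decorator are omitted. The column names `id` and `label` are hypothetical, and it is an assumption here that each yielded dict becomes one output tuple whose keys match the entries declared under the Columns property.

```python
from typing import Any, Dict, Iterator


class GenerateOperator:
    """Stand-in for a UDFSourceOperator subclass (pytexera is not imported here)."""

    def produce(self) -> Iterator[Dict[str, Any]]:
        # Yield one dict per output tuple. The keys are hypothetical column
        # names that would have to be declared under the Columns property.
        for i in range(3):
            yield {"id": i, "label": f"row-{i}"}


# Drive the generator by hand to see what the operator would emit.
rows = list(GenerateOperator().produce())
```

With this sketch, the Columns property would presumably list `id` as integer and `label` as string.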

Output Ports

Port | Mode
0 | Set Snapshot

2 - 2-in Python UDF

User-defined function operator in Python script

Input Properties

Property | Requirement | Type | Default | Description
Python script |  | Code (python) | See template below | Input your code here
Worker count |  | Integer | 1 | Specify how many parallel workers to launch
Retain input columns |  | Boolean | true | Keep the original input columns?
Extra output column(s) |  | List | - | Names of the newly added output columns that the UDF will produce, if any
↳ Attribute Name |  | String | - |
↳ Attribute Type |  | string, integer, long, double, boolean, timestamp, binary, large_binary | - |

Default Code Template

Python script

# Choose from the following templates:
# 
# from pytexera import *
# 
# class ProcessTupleOperator(UDFOperatorV2):
#     
#     @overrides
#     def process_tuple(self, tuple_: Tuple, port: int) -> Iterator[Optional[TupleLike]]:
#         yield tuple_
# 
# class ProcessBatchOperator(UDFBatchOperator):
#     BATCH_SIZE = 10 # must be a positive integer
# 
#     @overrides
#     def process_batch(self, batch: Batch, port: int) -> Iterator[Optional[BatchLike]]:
#         yield batch
# 
# class ProcessTableOperator(UDFTableOperator):
# 
#     @overrides
#     def process_table(self, table: Table, port: int) -> Iterator[Optional[TableLike]]:
#         yield table
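
With two inputs, the `port` argument in the templates above is what tells the UDF which input a tuple came from. The sketch below shows one possible use, a simple build-and-probe join. It is a plain-Python stand-in that does not import `pytexera`, the join column `id` is hypothetical, and it assumes that all tuples on port 0 arrive before any tuple on port 1, which this page does not confirm.

```python
from typing import Any, Dict, Iterator, List


class TwoInputJoin:
    """Stand-in for a UDFOperatorV2 subclass; pytexera is not imported here."""

    def __init__(self) -> None:
        # Tuples seen on port 0, grouped by the hypothetical join column "id".
        self.build_side: Dict[Any, List[Dict[str, Any]]] = {}

    def process_tuple(self, tuple_: Dict[str, Any], port: int) -> Iterator[Dict[str, Any]]:
        if port == 0:
            # Build phase: remember the tuple and emit nothing.
            self.build_side.setdefault(tuple_["id"], []).append(tuple_)
            return
        # Probe phase: emit one merged tuple per matching port-0 tuple.
        for match in self.build_side.get(tuple_["id"], []):
            yield {**match, **tuple_}


op = TwoInputJoin()
for row in [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]:
    list(op.process_tuple(row, port=0))  # consume the generator so the body runs
joined = list(op.process_tuple({"id": 2, "score": 9.5}, port=1))
unmatched = list(op.process_tuple({"id": 7, "score": 1.0}, port=1))
```

If the arrival order is not guaranteed, the sketch would miss matches and both sides would need to be buffered.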

Output Ports

Port | Mode
0 | Set Snapshot

3 - Python Lambda Function

Modify or add a new column with more ease

Input Properties

Property | Requirement | Type | Default | Description
Add/Modify column(s) |  | List | - |
↳ Attribute Name |  | String | - |
↳ Expression |  | String | - |
↳ Attribute Type |  | string, integer, long, double, boolean, timestamp, binary, large_binary | - |
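
The property table does not show what an Expression looks like. The sketch below illustrates one plausible reading: each Expression is a Python expression evaluated once per input tuple, with the tuple bound to the name `tuple_`. That binding is an assumption, `apply_lambda_columns` is a hypothetical helper rather than part of the operator, and `eval` is used only to keep the illustration short.

```python
from typing import Any, Dict, List, Tuple


def apply_lambda_columns(row: Dict[str, Any], columns: List[Tuple[str, str]]) -> Dict[str, Any]:
    """Hypothetical helper: add or overwrite columns from (name, expression) pairs."""
    out = dict(row)  # copy so the input row is left untouched
    for name, expression in columns:
        # Evaluate the expression with the current row exposed as `tuple_`.
        # A new name adds a column; an existing name modifies it.
        out[name] = eval(expression, {"tuple_": out})
    return out


result = apply_lambda_columns(
    {"price": 2.5, "qty": 4},
    [
        ("total", 'tuple_["price"] * tuple_["qty"]'),  # adds a column
        ("qty", 'tuple_["qty"] + 1'),                  # modifies a column
    ],
)
```

In this sketch, `total` would be declared with Attribute Type double and `qty` with integer.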

Output Ports

Port | Mode
0 | Set Snapshot

4 - Python Table Reducer

Reduce Table to Tuple

Input Properties

Property | Requirement | Type | Default | Description
Output columns |  | List | - |
↳ Attribute Name |  | String | - |
↳ Expression |  | String | - |
↳ Attribute Type |  | string, integer, long, double, boolean, timestamp, binary, large_binary | - |
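
"Reduce Table to Tuple" suggests that each output column is computed from the whole input table, producing a single tuple. The sketch below illustrates that idea under stated assumptions: the table is modeled as a plain list of dicts rather than the operator's real table type, each Expression is assumed to see the table under the name `table`, and `reduce_table` is a hypothetical helper, not the operator's implementation.

```python
from typing import Any, Dict, List, Tuple


def reduce_table(table: List[Dict[str, Any]], columns: List[Tuple[str, str]]) -> Dict[str, Any]:
    """Hypothetical helper: collapse a whole table into one output tuple."""
    # Each expression is evaluated once, with the full table bound to `table`.
    return {name: eval(expr, {"table": table}) for name, expr in columns}


table = [{"price": 2.0}, {"price": 3.5}, {"price": 4.5}]
summary = reduce_table(
    table,
    [
        ("row_count", "len(table)"),
        ("total_price", 'sum(r["price"] for r in table)'),
    ],
)
```

Here `row_count` would be declared as integer and `total_price` as double under Output columns.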

Output Ports

Port | Mode
0 | Set Snapshot

5 - Python UDF

User-defined function operator in Python script

Input Properties

Property | Requirement | Type | Default | Description
Python script |  | Code (python) | See template below | Input your code here
Worker count |  | Integer | 1 | Specify how many parallel workers to launch
Retain input columns |  | Boolean | true | Keep the original input columns?
Extra output column(s) |  | List | - | Names of the newly added output columns that the UDF will produce, if any
↳ Attribute Name |  | String | - |
↳ Attribute Type |  | string, integer, long, double, boolean, timestamp, binary, large_binary | - |

Default Code Template

Python script

# Choose from the following templates:
# 
# from pytexera import *
# 
# class ProcessTupleOperator(UDFOperatorV2):
#     
#     @overrides
#     def process_tuple(self, tuple_: Tuple, port: int) -> Iterator[Optional[TupleLike]]:
#         yield tuple_
# 
# class ProcessBatchOperator(UDFBatchOperator):
#     BATCH_SIZE = 10 # must be a positive integer
# 
#     @overrides
#     def process_batch(self, batch: Batch, port: int) -> Iterator[Optional[BatchLike]]:
#         yield batch
# 
# class ProcessTableOperator(UDFTableOperator):
# 
#     @overrides
#     def process_table(self, table: Table, port: int) -> Iterator[Optional[TableLike]]:
#         yield table
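
The tuple-level template above passes each input through unchanged. A common variation is to add a column and drop some tuples, which is sketched below with a plain-Python stand-in that does not import `pytexera`. The column names `text` and `text_length` are hypothetical, and it is an assumption that yielding nothing for a tuple simply removes it from the output.

```python
from typing import Any, Dict, Iterator, Optional


class AddLengthColumn:
    """Stand-in for a UDFOperatorV2 subclass; pytexera is not imported here."""

    def process_tuple(self, tuple_: Dict[str, Any], port: int) -> Iterator[Optional[Dict[str, Any]]]:
        text = tuple_.get("text")  # hypothetical input column
        if not text:
            # Yielding nothing drops the tuple, so this branch acts as a filter.
            return
        # Keep the input columns and add one hypothetical extra column.
        yield {**tuple_, "text_length": len(text)}


op = AddLengthColumn()
kept = list(op.process_tuple({"id": 1, "text": "hello"}, port=0))
dropped = list(op.process_tuple({"id": 2, "text": ""}, port=0))
```

For this sketch, Retain input columns would stay at its default of true, and `text_length` would be listed under Extra output column(s) with Attribute Type integer.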

Output Ports

Port | Mode
0 | Set Snapshot