pyspark.sql.datasource.InputPartition

class pyspark.sql.datasource.InputPartition(value)

A base class representing an input partition returned by the partitions() method of DataSourceReader.

Notes

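This class and any subclass of it must be picklable, because partition objects are serialized and sent to the executors that read them.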
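A quick way to check the pickling requirement is a round trip through the standard pickle module. The sketch below is illustrative only: the dictionary payload is an arbitrary choice, and the fallback InputPartition is a stand-in (an assumption that mirrors the documented constructor) so the snippet runs even where PySpark is not installed.

```python
import pickle

try:
    from pyspark.sql.datasource import InputPartition
except ImportError:
    # Stand-in mirroring the documented constructor, used only so this
    # sketch runs without PySpark installed (an assumption, not the real class).
    class InputPartition:
        def __init__(self, value):
            self.value = value


# Plain data such as ints, strings, tuples, and dicts pickles cleanly.
original = InputPartition({"start": 1, "end": 3})

# Serialize and deserialize the partition, then compare the payloads.
restored = pickle.loads(pickle.dumps(original))
round_trip_ok = restored.value == original.value
```

Values that cannot be pickled, such as open file handles, locks, or live connections, are best kept out of a partition; storing the path or connection parameters instead and opening the resource at read time is one common way around this.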

Examples

Use the default input partition implementation:

>>> from pyspark.sql.datasource import InputPartition
>>> def partitions(self):
...     return [InputPartition(1)]

Subclass the input partition class:

>>> from dataclasses import dataclass
>>> @dataclass
... class RangeInputPartition(InputPartition):
...     start: int
...     end: int
>>> def partitions(self):
...     return [RangeInputPartition(1, 3), RangeInputPartition(4, 6)]
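To show how the partitions returned above might be consumed, here is a minimal sketch of a reader whose read() method receives one partition at a time. RangeReader is a hypothetical name, treating end as inclusive is an assumption made for illustration, and the fallback classes are stand-ins so the snippet runs without PySpark installed. The final loop only simulates locally what Spark would do by calling read() once per partition.

```python
from dataclasses import dataclass

try:
    from pyspark.sql.datasource import DataSourceReader, InputPartition
except ImportError:
    # Stand-ins so the sketch runs without PySpark installed (assumption).
    class InputPartition:
        def __init__(self, value):
            self.value = value

    class DataSourceReader:
        pass


@dataclass
class RangeInputPartition(InputPartition):
    start: int
    end: int


class RangeReader(DataSourceReader):  # hypothetical reader for illustration
    def partitions(self):
        return [RangeInputPartition(1, 3), RangeInputPartition(4, 6)]

    def read(self, partition):
        # Yield one single-column row per integer; `end` is treated as
        # inclusive here, which is an assumption of this sketch.
        for i in range(partition.start, partition.end + 1):
            yield (i,)


# Local simulation: call read() once for each planned partition.
reader = RangeReader()
rows = [row for p in reader.partitions() for row in reader.read(p)]
```

Keeping only the range bounds in the partition, rather than the data itself, keeps each partition object small and easy to pickle.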