pyspark.pandas.range¶
pyspark.pandas.range(start: int, end: Optional[int] = None, step: int = 1, num_partitions: Optional[int] = None) → pyspark.pandas.frame.DataFrame[source]¶
Create a DataFrame with some range of numbers.
The resulting DataFrame has a single int64 column named id, containing elements in a range from start to end (exclusive) with step value step. If only the first parameter (i.e. start) is specified, we treat it as the end value with the start value being 0.
This is like the range function in SparkSession and is used primarily for testing.
- Parameters
- start : int
the start value (inclusive)
- end : int, optional
the end value (exclusive)
- step : int, optional, default 1
the incremental step
- num_partitions : int, optional
the number of partitions of the DataFrame
- Returns
- DataFrame
Examples
When only the first parameter is specified, we generate a range of values up to that number.
>>> ps.range(5)
   id
0   0
1   1
2   2
3   3
4   4
When start, end, and step are specified:
>>> ps.range(start=100, end=200, step=20)
    id
0  100
1  120
2  140
3  160
4  180
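The id values produced above follow the same semantics as Python's built-in range. The sketch below (range_ids is a hypothetical helper, not part of the pyspark.pandas API) shows the value-generation rule, including the start-only case where the single argument is treated as the end value:

```python
def range_ids(start, end=None, step=1):
    """Sketch of the values ps.range would place in the `id` column.

    Assumption: ps.range mirrors Python's built-in range semantics,
    with an inclusive start and an exclusive end.
    """
    # When only the first argument is given, treat it as the end
    # value with start defaulting to 0, as the docstring describes.
    if end is None:
        start, end = 0, start
    return list(range(start, end, step))

print(range_ids(5))             # [0, 1, 2, 3, 4]
print(range_ids(100, 200, 20))  # [100, 120, 140, 160, 180]
```

With a Spark session available, ps.range(5).to_numpy() would be expected to contain these same values, one per row.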