pyspark.sql.functions.grouping
pyspark.sql.functions.grouping(col)
Aggregate function: indicates whether a specified column in a GROUP BY list is aggregated or not. Returns 1 for aggregated or 0 for not aggregated in the result set.
New in version 2.0.0.
Changed in version 3.4.0: Supports Spark Connect.
- Parameters
  - col : Column or str
    column to check if it's aggregated.
- Returns
  - Column
    returns 1 for aggregated or 0 for not aggregated in the result set.
Examples
>>> df = spark.createDataFrame([("Alice", 2), ("Bob", 5)], ("name", "age"))
>>> df.cube("name").agg(grouping("name"), sum("age")).orderBy("name").show()
+-----+--------------+--------+
| name|grouping(name)|sum(age)|
+-----+--------------+--------+
| NULL|             1|       7|
|Alice|             0|       2|
|  Bob|             0|       5|
+-----+--------------+--------+
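A common follow-on use, sketched below under the assumption that a SparkSession named spark is available (as in the example above), is to alias grouping() and filter on it to isolate the subtotal rows produced by a rollup; the column and alias names (is_total, total_age) are illustrative only, and the output shown is what the sample data above would produce.
>>> from pyspark.sql.functions import grouping, sum
>>> df = spark.createDataFrame([("Alice", 2), ("Bob", 5)], ("name", "age"))
>>> rolled = df.rollup("name").agg(
...     grouping("name").alias("is_total"),  # 1 only on the grand-total row
...     sum("age").alias("total_age"))
>>> rolled.where("is_total = 1").show()
+----+--------+---------+
|name|is_total|total_age|
+----+--------+---------+
|NULL|       1|        7|
+----+--------+---------+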