pyspark.pandas.CategoricalIndex.map#

CategoricalIndex.map(mapper)[source]#

Map values using input correspondence (a dict, Series, or function).

Maps the values (their categories, not the codes) of the index to new categories. If the mapping correspondence is one-to-one the result is a CategoricalIndex which has the same order property as the original, otherwise an Index is returned.

If a dict or Series is used any unmapped category is mapped to missing values. Note that if this happens an Index will be returned.

Parameters
mapperfunction, dict, or Series

Mapping correspondence.

Returns
CategoricalIndex or Index

Mapped index.

See also

Index.map

Apply a mapping correspondence on an Index.

Series.map

Apply a mapping correspondence on a Series

Series.apply

Apply more complex functions on a Series

Examples

>>> idx = ps.CategoricalIndex(['a', 'b', 'c'])
>>> idx  
CategoricalIndex(['a', 'b', 'c'],
                 categories=['a', 'b', 'c'], ordered=False, dtype='category')
>>> idx.map(lambda x: x.upper())  
CategoricalIndex(['A', 'B', 'C'],
                 categories=['A', 'B', 'C'], ordered=False, dtype='category')
>>> pser = pd.Series([1, 2, 3], index=pd.CategoricalIndex(['a', 'b', 'c'], ordered=True))
>>> idx.map(pser)  
CategoricalIndex([1, 2, 3],
                 categories=[1, 2, 3], ordered=False, dtype='category')
>>> idx.map({'a': 'first', 'b': 'second', 'c': 'third'})  
CategoricalIndex(['first', 'second', 'third'],
                 categories=['first', 'second', 'third'], ordered=False, dtype='category')

If the mapping is one-to-one the ordering of the categories is preserved:

>>> idx = ps.CategoricalIndex(['a', 'b', 'c'], ordered=True)
>>> idx  
CategoricalIndex(['a', 'b', 'c'],
                 categories=['a', 'b', 'c'], ordered=True, dtype='category')
>>> idx.map({'a': 3, 'b': 2, 'c': 1})  
CategoricalIndex([3, 2, 1],
                 categories=[3, 2, 1], ordered=True, dtype='category')

If the mapping is not one-to-one an Index is returned:

>>> idx.map({'a': 'first', 'b': 'second', 'c': 'first'})
Index(['first', 'second', 'first'], dtype='object')

If a dict is used, all unmapped categories are mapped to None and the result is an Index:

>>> idx.map({'a': 'first', 'b': 'second'})
Index(['first', 'second', None], dtype='object')