pyspark.pandas.CategoricalIndex.map#

CategoricalIndex.map(mapper)[source]#

Map values using input correspondence (a dict, Series, or function).

Maps the values (their categories, not the codes) of the index to new categories. If the mapping correspondence is one-to-one the result is a CategoricalIndex which has the same order property as the original, otherwise an Index is returned.

If a dict or Series is used any unmapped category is mapped to missing values. Note that if this happens an Index will be returned.

Parameters

mapperfunction, dict, or Series: Mapping correspondence.

Returns

CategoricalIndex or Index: Mapped index.

See also

Index.map: Apply a mapping correspondence on an Index.
Series.map: Apply a mapping correspondence on a Series
Series.apply: Apply more complex functions on a Series

Examples

>>> idx = ps.CategoricalIndex(['a', 'b', 'c'])
>>> idx  
CategoricalIndex(['a', 'b', 'c'],
                 categories=['a', 'b', 'c'], ordered=False, dtype='category')

>>> idx.map(lambda x: x.upper())  
CategoricalIndex(['A', 'B', 'C'],
                 categories=['A', 'B', 'C'], ordered=False, dtype='category')

>>> pser = pd.Series([1, 2, 3], index=pd.CategoricalIndex(['a', 'b', 'c'], ordered=True))
>>> idx.map(pser)  
CategoricalIndex([1, 2, 3],
                 categories=[1, 2, 3], ordered=False, dtype='category')

>>> idx.map({'a': 'first', 'b': 'second', 'c': 'third'})  
CategoricalIndex(['first', 'second', 'third'],
                 categories=['first', 'second', 'third'], ordered=False, dtype='category')

If the mapping is one-to-one the ordering of the categories is preserved:

>>> idx = ps.CategoricalIndex(['a', 'b', 'c'], ordered=True)
>>> idx  
CategoricalIndex(['a', 'b', 'c'],
                 categories=['a', 'b', 'c'], ordered=True, dtype='category')

>>> idx.map({'a': 3, 'b': 2, 'c': 1})  
CategoricalIndex([3, 2, 1],
                 categories=[3, 2, 1], ordered=True, dtype='category')

If the mapping is not one-to-one an Index is returned:

>>> idx.map({'a': 'first', 'b': 'second', 'c': 'first'})
Index(['first', 'second', 'first'], dtype='object')

If a dict is used, all unmapped categories are mapped to None and the result is an Index:

>>> idx.map({'a': 'first', 'b': 'second'})
Index(['first', 'second', None], dtype='object')