Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - numpy convert categorical string arrays to an integer array

I'm trying to convert a string array of categorical variables to an integer array of categorical variables.

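Example:

import numpy as np

a = np.array(['a', 'b', 'c', 'a', 'b', 'c'])
print(a.dtype)
# |S1  (on Python 2; Python 3 reports <U1)

b = np.unique(a)
print(b)
# ['a' 'b' 'c']

# desired_function is a placeholder for the function I'm looking for
c = a.desired_function(b)
print(c, c.dtype)
# desired output: [1 2 3 1 2 3] int32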

I realize this can be done with a loop but I imagine there is an easier way. Thanks.


1 Reply


np.unique has some optional return values.

return_inverse=True additionally returns the integer encoding: for each element of the input, its index in the array of unique values. I use it very often.

>>> b, c = np.unique(a, return_inverse=True)
>>> b
array(['a', 'b', 'c'], 
      dtype='|S1')
>>> c
array([0, 1, 2, 0, 1, 2])
>>> c+1
array([1, 2, 3, 1, 2, 3])
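
As a minimal self-contained sketch of the session above, written for Python 3 (where the string dtype prints as <U1 rather than |S1): the +1 offset and the int32 cast are only there to match the output asked for in the question, and the helper name encode_categories is made up for illustration, not a NumPy function.

```python
import numpy as np

def encode_categories(values, start=1):
    """Return (categories, codes), with integer codes starting at `start`.

    Hypothetical helper for illustration; it only wraps np.unique.
    """
    # inverse[i] is the index of values[i] in the sorted unique array
    categories, inverse = np.unique(values, return_inverse=True)
    codes = (inverse + start).astype(np.int32)
    return categories, codes

a = np.array(['a', 'b', 'c', 'a', 'b', 'c'])
b, c = encode_categories(a)
print(b)           # ['a' 'b' 'c']
print(c, c.dtype)  # [1 2 3 1 2 3] int32
```

Keeping the 0-based codes (start=0) is usually more convenient, since they can index back into the unique values directly.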

The inverse indices can also be used to recreate the original array from the unique values:

>>> b[c]
array(['a', 'b', 'c', 'a', 'b', 'c'], 
      dtype='|S1')
>>> (b[c] == a).all()
True
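
If the unique values b have already been computed, as in the question, one possible alternative (not part of the answer above) is np.searchsorted, which looks up each element of a in the sorted array b. This sketch assumes every element of a is present in b; for a value missing from b, searchsorted returns an insertion position instead of raising an error.

```python
import numpy as np

a = np.array(['a', 'b', 'c', 'a', 'b', 'c'])
b = np.unique(a)  # sorted unique values

# Position of each element of a within the sorted array b
c = np.searchsorted(b, a)
print(c)  # [0 1 2 0 1 2]

# Indexing b with the codes restores the original array
print((b[c] == a).all())  # True
```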
