python - replace value by using regex to np.nan

Question

Posted Oct 17, 2021 in Technique by 深蓝 (71.8m points)

I have a DataFrame as shown below:

import numpy as np
import pandas as pd

data1 = {"first": ["alice", "bob", "carol"],
         "last": ["foo", "bar", "baz"]}
df = pd.DataFrame(data1)

For example, I want to replace every character 'o' with 'a', so I run:

df.replace({"o":"a"},regex=True)
Out[668]: 
   first last
0  alice  faa
1    bab  bar
2  caral  baz

This gives back what I need.

However, when I try to replace 'o' with np.nan, the entire string is changed to np.nan. Is there any explanation of this in the pandas documentation? I can find some information in the source code.

More information (the whole string is changed to np.nan):

df.replace({"o":np.nan},regex=True)
Out[669]: 
   first last
0  alice  NaN
1    NaN  bar
2    NaN  baz

1 Reply

深蓝 · Answer 1 · 2021-10-17T01:33:47+0000

NaN is consistently used as a placeholder for missing data, so replacing part of a string with "missing" can only mean that the entire entry is compromised. I've heard this called NaN pollution (or something similar; I'll see if I can find some references): once NaN touches a value, that value is compromised.

That said, that's not always the case:

In [11]: s = pd.Series([1, 2, np.nan, 4])

In [12]: s.sum()
Out[12]: 7.0

In [13]: s.sum(skipna=False)
Out[13]: nan

In some languages skipna=False is the default behaviour, and some people vehemently argue that NaN should always pollute all data. Pandas takes a somewhat more pragmatic approach...

The real question is: what do you expect it to do in the case of NaN?
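
If the goal was to drop the character rather than mark the value as missing, one option is to replace it with an empty string. If the goal really was "any value containing 'o' is missing", that intent can be spelled out with a boolean mask. The following is only a minimal sketch, assuming the question's df with the second column named last (as in the printed output); the names removed, whole and explicit are illustrative and not from the original post.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"first": ["alice", "bob", "carol"],
                   "last": ["foo", "bar", "baz"]})

# String replacement: only the matched characters are substituted,
# so replacing with "" deletes every 'o' and keeps the rest.
removed = df.replace({"o": ""}, regex=True)
# first -> alice, bb, carl ; last -> f, bar, baz

# NaN replacement: there is no string to substitute into the text,
# so each cell where the pattern matches becomes NaN as a whole.
whole = df.replace({"o": np.nan}, regex=True)

# The same result stated explicitly: build a boolean mask of the cells
# that contain 'o', then blank out exactly those cells.
contains_o = df.apply(lambda col: col.str.contains("o"))
explicit = df.mask(contains_o)
# first -> alice, NaN, NaN ; last -> NaN, bar, baz
```

The mask version is more verbose, but it makes it obvious to a reader that whole entries are being discarded, not just single characters.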
