Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Searching Multiple Strings in pandas without predefining number of strings to use

Is there a more general way to do the below? I'd like to write the st function so that it can search for an arbitrary number of strings instead of a predefined number.

For instance, I'd like a generalized st function that I could call as st('Governor', 'Virginia', 'Google').

Here's my current function, but it hard-codes the number of words you can use (df is a pandas DataFrame):

def st(word1, word2, word3, df):
    """
    Return the rows whose Name contains all three terms (an intersection).
    """
    return df[df.Name.str.contains(word1) & df.Name.str.contains(word2) & df.Name.str.contains(word3)]

st('Governor', 'Virginia', 'Google', newauthdf)

1 Reply


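str.contains accepts a regular expression, so you can use '|'.join(words) as the pattern. To be safe, pass each word through re.escape first so that any regex metacharacters are matched literally. Note that '|' is alternation, so this pattern matches rows containing any of the words, not all of them: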

>>> df
                 Name
0                Test
1            Virginia
2              Google
3  Google in Virginia
4               Apple

[5 rows x 1 columns]
>>> words = ['Governor', 'Virginia', 'Google']

'|'.join(map(re.escape, words)) would be the search pattern:

>>> import re
>>> pat = '|'.join(map(re.escape, words))
>>> df.Name.str.contains(pat)
0    False
1     True
2     True
3     True
4    False
Name: Name, dtype: bool
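
The pattern above can be wrapped in a function that accepts any number of words. The following is only a sketch under stated assumptions: the names search_any and search_all and the column parameter are hypothetical, not from the question or from pandas. search_any reuses the '|'.join approach shown above (a union), while search_all combines one mask per word with &, which is the intersection the question's original function computed.

```python
import operator
import re
from functools import reduce

import pandas as pd


def search_any(df, *words, column="Name"):
    """Rows whose `column` contains at least one of `words` (logical OR)."""
    if not words:
        # An empty pattern would match every row, so return no rows instead.
        return df.iloc[0:0]
    pat = "|".join(map(re.escape, words))
    return df[df[column].str.contains(pat, na=False)]


def search_all(df, *words, column="Name"):
    """Rows whose `column` contains every one of `words` (logical AND)."""
    # regex=False treats each word as a literal substring, so no escaping is needed.
    masks = [df[column].str.contains(w, regex=False, na=False) for w in words]
    # Start from an all-True mask and AND each per-word mask into it.
    mask = reduce(operator.and_, masks, pd.Series(True, index=df.index))
    return df[mask]


df = pd.DataFrame(
    {"Name": ["Test", "Virginia", "Google", "Google in Virginia", "Apple"]}
)
any_hits = search_any(df, "Governor", "Virginia", "Google")
all_hits = search_all(df, "Virginia", "Google")
```

Because *words collects all remaining positional arguments, the DataFrame is passed first here instead of last as in the question. With the sample data, any_hits holds the rows at index 1, 2 and 3, and all_hits holds only "Google in Virginia".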


...