Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


RegEx: extract Key=Value pairs with Escape =

I have a string in which some values themselves contain the equals sign (=):

attr1=ActiveList attr2=<Resource URI="/All Active Lists/_CAL/Infrastructure/Active Directory/Inventory/Inventory - By User (Dedd)" ID="DSdddCcSSS=="/> attr3=ActiveLis  attr4=ActiveList attr5=<Resource URI="/All Active Lists/_CAL/Infrastructure/Active Directory/Inventory/Inventory - By User (Bobo)" ID="CCCsSSdDDD=="/> attr6=ActiveLis 

P.S. Sometimes the input might also contain:

key=value = otherthink

How do I convert it to key=value pairs with a regex, like this:

attr1=ActiveList 
attr2=<Resource URI="/All Active Lists/_CAL/Infrastructure/Active Directory/Inventory/Inventory - By User (Dedd)" ID="DSdddCcSSS=="/> 
attr3=ActiveLis  
attr4=ActiveList 
attr5=<Resource URI="/All Active Lists/_CAL/Infrastructure/Active Directory/Inventory/Inventory - By User (Bobo)" ID="CCCsSSdDDD=="/>   
attr6=ActiveLis 
key=value = otherthink

I've tried a few patterns, e.g.

\s?(\w+)\s?=\s?(.(?!=(?<!\=))(?!\w+=))+

without success.

The target languages are Java and Python. I would prefer a pure regex solution.



1 Reply


Assuming that each value is either a single word like ActiveList or something in between < and >, you could use this regular expression:

\s*(\w+)\s*=\s*(\w+|<.*?>)\s*

The first group captures the attribute name, and the second captures either a word (ActiveList) or anything between < and >, such as <Resource URI="/All Active Lists/_CAL/Infrastructure/Active Directory/Inventory/Inventory - By User (Bobo)" ID="CCCsSSdDDD=="/>

Then, you just need to iterate over results and join them with =:

>>> import re
>>> # text holds the input string from the question
>>> for attr, value in re.findall(r"\s*(\w+)\s*=\s*(\w+|<.*?>)\s*", text):
...     print("%s=%s" % (attr, value))

attr1=ActiveList
attr2=<Resource URI="/All Active Lists/_CAL/Infrastructure/Active Directory/Inventory/Inventory - By User (Dedd)" ID="DSdddCcSSS=="/>
attr3=ActiveLis
attr4=ActiveList
attr5=<Resource URI="/All Active Lists/_CAL/Infrastructure/Active Directory/Inventory/Inventory - By User (Bobo)" ID="CCCsSSdDDD=="/>
attr6=ActiveLis
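
The pattern above stops at the first word, so for the "key=value = otherthink" case from the question it would only return key=value. One possible way to cover that case as well is to let the value run lazily until a lookahead sees the next "word=" or the end of the input. The sketch below is an assumption on my side, not part of the pattern above: the names PAIR and extract_pairs are made up for illustration, and the sample string is a shortened version of the one in the question.

```python
import re

# Hypothetical pattern: value is either a <...> element or free text that
# ends right before the next "key=" (or at the end of the input).
PAIR = re.compile(
    r"(\w+)\s*=\s*"          # key, then '=' with optional surrounding spaces
    r"(<.*?>|.*?)"           # value: a <...> element first, else lazy free text
    r"(?=\s+\w+\s*=|\s*$)"   # lookahead: next 'key=' or end of input
)

def extract_pairs(text):
    """Return a list of (key, value) tuples found in text."""
    return PAIR.findall(text)

# Shortened sample, same shape as the string in the question.
text = ('attr1=ActiveList attr2=<Resource URI="/x y/z" ID="AbC=="/> '
        'attr3=ActiveLis  key=value = otherthink')

for key, value in extract_pairs(text):
    print("%s=%s" % (key, value))
```

Because the <...> alternative is tried first, attributes inside the tag (URI=, ID=) are swallowed as part of the value instead of being treated as new keys. A free-text value that itself contains something like " word=" would still be split at that point, so this only works if such values do not occur. The same pattern should carry over to java.util.regex with the backslashes doubled in the string literal, though I have not tested that.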
