Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


regex - Python Regular Expression from File

I want to extract the lines of a file that follow a particular sequence. For example, the file contains many lines like these, and I want only the ones that fit the sequence:

journey (a,b) from station south chennai to station punjab chandigarh
journey (c,d) from station jammu katra to city punjab chandigarh
journey (e) from station 

Let's say the above is the file content, and I want to extract the information from the first two lines.

The sequence is: the first word is journey, then brackets containing two words, then the word from, then either the word station or city, then any string, then the word to, then again either the word station or city.

What would be the regular expression for that? Note: the words in brackets may contain special characters, e.g. - and _.



1 Reply


This will return the elements you want:

import re

s = '''journey (a,b) from station south chennai to station punjab chandigarh
journey (c,d) from station jammu katra to city punjab chandigarh
journey (e) from station
journey (c,d) from station ANYSTRING jammu katra to ANYSTRING city punjab chandigarh
'''

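# Capture the bracketed pair, the origin and the destination as separate groups.
matches_single = re.findall(r'journey \(([^,]+,[^,]+)\) from (\S+ \S+\s{0,1}\S*) to (\S+ \S+\s{0,1}\S*)', s)
for match in matches_single:
    print(match)

# Capture each matching line as a whole.
matches_line = re.findall(r'(journey \([^,]+,[^,]+\) from \S+ \S+\s{0,1}\S* to \S+ \S+\s{0,1}\S*)', s)
for match in matches_line:
    print(match)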
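
Since the question is about reading from a file, here is a minimal sketch of a stricter, line-by-line variant. It is one possible reading of the sequence described in the question, not the only one: it assumes each journey sits on its own line, that the bracketed words contain no spaces, commas or parentheses, and that "any string" after station/city runs up to the word to (or to the end of the line). The names JOURNEY_RE, extract_journeys, the group names and the file name journeys.txt are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical pattern; the group names are illustrative only.
JOURNEY_RE = re.compile(
    r"""
    ^journey\s+
    \(\s*(?P<first>[^,()\s]+)\s*,\s*(?P<second>[^,()\s]+)\s*\)  # two bracketed words
    \s+from\s+
    (?P<from_kind>station|city)\s+(?P<from_name>.+?)           # origin
    \s+to\s+
    (?P<to_kind>station|city)\s+(?P<to_name>.+?)               # destination
    \s*$
    """,
    re.VERBOSE,
)


def extract_journeys(lines):
    """Yield a dict of named groups for every line that fits the pattern."""
    for line in lines:
        m = JOURNEY_RE.match(line.strip())
        if m:
            yield m.groupdict()


sample = """journey (a,b) from station south chennai to station punjab chandigarh
journey (c,d) from station jammu katra to city punjab chandigarh
journey (e) from station
journey (c,d) from station ANYSTRING jammu katra to ANYSTRING city punjab chandigarh
"""

results = list(extract_journeys(sample.splitlines()))
for journey in results:
    print(journey)

# With a real file (hypothetical name), pass the open handle instead:
# with open("journeys.txt", encoding="utf-8") as fh:
#     for journey in extract_journeys(fh):
#         print(journey)
```

Matching one line at a time keeps a negated class such as [^,]+ from running across line breaks, and the named groups make it easier to pick out the station/city keyword separately from the place name. Only the first two sample lines should match; the third has a single bracketed word and the fourth has no station or city directly after to.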
