Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
460 views
in Technique[技术] by (71.8m points)

python - How can I use pyparsing to parse nested expressions that have multiple opener/closer types?

I'd like to use pyparsing to parse an expression of the form: expr = '(gimme [some {nested [lists]}])', and get back a python list of the form: [[['gimme', ['some', ['nested', ['lists']]]]]]. Right now my grammar looks like this:

nestedParens = nestedExpr('(', ')')
nestedBrackets = nestedExpr('[', ']')
nestedCurlies = nestedExpr('{', '}')
enclosed = nestedParens | nestedBrackets | nestedCurlies

Presently, enclosed.searchString(expr) returns a list of the form: [[['gimme', ['some', '{nested', '[lists]}']]]]. This is not what I want because it's not recognizing the square or curly brackets, but I don't know why.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Here's a pyparsing solution that uses a self-modifying grammar to dynamically match the correct closing brace character.

from pyparsing import *

data = '(gimme [some {nested, nested [lists]}])'

opening = oneOf("( { [")
nonBracePrintables = ''.join(c for c in printables if c not in '(){}[]')
closingFor = dict(zip("({[",")}]"))
closing = Forward()
# initialize closing with an expression
closing << NoMatch()
closingStack = []
def pushClosing(t):
    closingStack.append(closing.expr)
    closing << Literal( closingFor[t[0]] )
def popClosing():
    closing << closingStack.pop()
opening.setParseAction(pushClosing)
closing.setParseAction(popClosing)

matchedNesting = nestedExpr( opening, closing, Word(alphas) | Word(nonBracePrintables) )

print matchedNesting.parseString(data).asList()

prints:

[['gimme', ['some', ['nested', ',', 'nested', ['lists']]]]]

Updated: I posted the above solution because I had actually written it over a year ago as an experiment. I just took a closer look at your original post, and it made me think of the recursive type definition created by the operatorPrecedence method, and so I redid this solution, using your original approach - much simpler to follow! (might have a left-recursion issue with the right input data though, not thoroughly tested):

from pyparsing import *

enclosed = Forward()
nestedParens = nestedExpr('(', ')', content=enclosed) 
nestedBrackets = nestedExpr('[', ']', content=enclosed) 
nestedCurlies = nestedExpr('{', '}', content=enclosed) 
enclosed << (Word(alphas) | ',' | nestedParens | nestedBrackets | nestedCurlies)


data = '(gimme [some {nested, nested [lists]}])' 

print enclosed.parseString(data).asList()

Gives:

[['gimme', ['some', ['nested', ',', 'nested', ['lists']]]]]

EDITED: Here is a diagram of the updated parser, using the railroad diagramming support coming in pyparsing 3.0. railroad diagram


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

56.9k users

...