Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

c++ - Prevent the Boost Spirit Symbol parser from accepting a keyword too early

How can I prevent the Boost Spirit symbols parser from accepting a keyword (symbol) when the input merely starts with a valid keyword? I would like parsing of 'ONEMORE' to fail as a whole, rather than succeed in parsing 'ONE' (because that is a valid keyword) and then fail on 'MORE'.

Here is the actual output of the code below:

Keyword as a number: 1
Keyword as a number: 2
Keyword as a number: 1
Invalid keyword: MORETHREE

And this is what I would like it to be:

Keyword as a number: 1
Keyword as a number: 2
Invalid keyword: ONEMORE
Keyword as a number: 3

The code is just a sample to get the point across.

#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>

using namespace std;
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;

void printNumber( unsigned u )
{
    cout << "Keyword as a number: " << u << endl;
}

void printInvalidKeyword( const string &s )
{
    cout << "Invalid keyword: " << s << endl;
}

template <typename Iterator>
struct keyword_parser : qi::grammar<Iterator, ascii::space_type>
{
    struct mySymbols_ : qi::symbols<char, unsigned>
    {
        mySymbols_()
        {
            add
            ("ONE"   , 1)
            ("TWO"   , 2)
            ("THREE" , 3)
            ;
        }

    } mySymbols;

    keyword_parser() : keyword_parser::base_type(start)
    {
        using qi::_1;
        using qi::raw;
        using ascii::char_;

        start %= *(
                   mySymbols[&printNumber]
                    |
                   invalid[&printInvalidKeyword]
                   );

        invalid = +char_;

    }

    qi::rule<Iterator, ascii::space_type> start;
    qi::rule<Iterator, std::string(), ascii::space_type> invalid;
};

int main()
{
    using boost::spirit::ascii::space;
    typedef std::string::const_iterator iterator_type;
    typedef keyword_parser<iterator_type> keyword_parser;

    std::string s = "ONE TWO ONEMORE THREE";
    iterator_type b = s.begin();
    iterator_type e = s.end();
    phrase_parse(b, e, keyword_parser(), space);

    return 0;
}


1 Reply


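The symbols parser succeeds as soon as it has matched the longest keyword it knows, regardless of what follows, so the keyword needs an explicit boundary check. Look at the distinct directive from the Spirit repository (boost::spirit::repository::qi::distinct), or take some measures yourself with a negative lookahead: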

start %= *(
           keyword  [cout << val("Keyword as a number: ") << _1 << endl]
         | invalid  [cout << val("Invalid keyword: ")     << _1 << endl]
         );

keyword = mySymbols >> !(char_("a-zA-Z0-9_"));

invalid = +ascii::graph;

The rules are declared as:

qi::rule<Iterator, ascii::space_type> start;

// lexemes do not ignore embedded skippables
qi::rule<Iterator, int()> keyword;
qi::rule<Iterator, std::string()> invalid;

See it Live On Coliru

Prints:

Keyword as a number: 1
Keyword as a number: 2
Invalid keyword: ONEMORE
Keyword as a number: 3

Full source:

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <iostream>
#include <string>

using namespace std;
namespace qi    = boost::spirit::qi;
namespace phx   = boost::phoenix;
namespace ascii = boost::spirit::ascii;

template <typename Iterator>
struct keyword_parser : qi::grammar<Iterator, ascii::space_type>
{
    struct mySymbols_ : qi::symbols<char, unsigned>
    {
        mySymbols_()
        {
            add
            ("ONE"   , 1)
            ("TWO"   , 2)
            ("THREE" , 3)
            ;
        }

    } mySymbols;

    keyword_parser() : keyword_parser::base_type(start)
    {
        using qi::_1;
        using ascii::char_;
        using phx::val;

        start %= *(
                   keyword  [cout << val("Keyword as a number: ") << _1 << endl]
                 | invalid  [cout << val("Invalid keyword: ")     << _1 << endl]
                 );

        keyword = mySymbols >> !(char_("a-zA-Z0-9_"));

        invalid = +ascii::graph;

    }

    qi::rule<Iterator, ascii::space_type> start;
    // lexemes do not ignore embedded skippables
    qi::rule<Iterator, int()> keyword;
    qi::rule<Iterator, std::string()/*IMPLICIT LEXEME:, ascii::space_type*/> invalid;
};

int main()
{
    using boost::spirit::ascii::space;
    typedef std::string::const_iterator iterator_type;
    typedef keyword_parser<iterator_type> keyword_parser;

    std::string s = "ONE TWO ONEMORE THREE";
    iterator_type b = s.begin();
    iterator_type e = s.end();
    phrase_parse(b, e, keyword_parser(), space);

    return 0;
}
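
To see the boundary check in isolation, here is a minimal library-free sketch of what the keyword rule does: find the longest keyword at the current position, then reject it if the next character could continue an identifier. The names is_ident_char, match_keyword and classify are hypothetical helpers invented for this illustration (they are not part of Boost.Spirit), and THREE is mapped to 3 here, as in the desired output of the question.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: can this character continue an identifier?
bool is_ident_char(char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Hypothetical helper mirroring  mySymbols >> !char_("a-zA-Z0-9_"):
// take the longest keyword starting at pos, then fail if an identifier
// character follows it. On failure, value and length are left untouched.
bool match_keyword(const std::string& in, std::size_t pos,
                   const std::map<std::string, unsigned>& table,
                   unsigned& value, std::size_t& length)
{
    assert(pos <= in.size());
    std::size_t best_len = 0;
    unsigned best_val = 0;
    for (const auto& entry : table) {
        const std::string& kw = entry.first;
        if (kw.size() > best_len && in.compare(pos, kw.size(), kw) == 0) {
            best_len = kw.size();
            best_val = entry.second;
        }
    }
    if (best_len == 0)
        return false;                      // no keyword starts here
    const std::size_t next = pos + best_len;
    if (next < in.size() && is_ident_char(in[next]))
        return false;                      // negative lookahead failed
    value = best_val;
    length = best_len;
    return true;
}

// Hypothetical driver: one message per whitespace-separated token.
std::vector<std::string> classify(const std::string& input)
{
    const std::map<std::string, unsigned> table = {
        {"ONE", 1}, {"TWO", 2}, {"THREE", 3}};
    std::vector<std::string> out;
    std::size_t pos = 0;
    while (true) {
        // Skip whitespace between tokens, as the skipper would.
        while (pos < input.size() &&
               std::isspace(static_cast<unsigned char>(input[pos])))
            ++pos;
        if (pos >= input.size())
            break;
        unsigned value = 0;
        std::size_t length = 0;
        if (match_keyword(input, pos, table, value, length)) {
            out.push_back("Keyword as a number: " + std::to_string(value));
            pos += length;
        } else {
            // Fallback, like  +ascii::graph : consume up to the next space.
            const std::size_t start = pos;
            while (pos < input.size() &&
                   !std::isspace(static_cast<unsigned char>(input[pos])))
                ++pos;
            out.push_back("Invalid keyword: " + input.substr(start, pos - start));
        }
    }
    return out;
}
```

Calling classify("ONE TWO ONEMORE THREE") reports ONEMORE as a single invalid keyword, because the match on ONE is discarded once the lookahead sees the M.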
