Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


java - How to get the error messages from ANTLR parsing?

I wrote a grammar with ANTLR 4.4 like this:

grammar CSV;

file
  :  row+ EOF
  ;

row
  :  value (Comma value)* (LineBreak | EOF)
  ;

value
  :  SimpleValue
  |  QuotedValue
  ;

Comma
  :  ','
  ;

LineBreak
  :  '\r'? '\n'
  |  '\r'
  ;

SimpleValue
  :  ~(',' | '\r' | '\n' | '"')+
  ;

QuotedValue
  :  '"' ('""' | ~'"')* '"'
  ;

Then I used ANTLR 4.4 to generate the parser and lexer; this step succeeded.

After generating the classes, I wrote some Java code that uses the grammar:

import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;

public class Main {

    public static void main(String[] args)
    {
        String source = "\"a\",\"b\",\"c";
        CSVLexer lex = new CSVLexer(new ANTLRInputStream(source));
        CommonTokenStream tokens = new CommonTokenStream(lex);
        tokens.fill();
        CSVParser parser = new CSVParser(tokens);
        CSVParser.FileContext file = parser.file();
    }
}

The code above is a parser for CSV strings, for example: "a","b",c

Output window:

line 1:8 token recognition error at: '"c'
line 1:10 missing {SimpleValue, QuotedValue} at '<EOF>'

I want to know how I can get these errors in code, from a method (getErrors() or similar), rather than from the output window.

Can anyone help me?


1 Reply


Using ANTLR for CSV parsing is a nuclear option IMHO, but since you're at it...

  • Implement the interface ANTLRErrorListener. You may extend BaseErrorListener for that. Collect the errors and append them to a list.
  • Call parser.removeErrorListeners() to remove the default listeners
  • Call parser.addErrorListener(yourListenerInstance) to add your own listener
  • Parse your input
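
The steps above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the class name CollectingErrorListener and its getErrors() method are hypothetical names chosen here (getErrors() echoes the question), and the message format simply imitates the default console output. Only BaseErrorListener and its syntaxError callback come from the ANTLR 4 runtime.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.antlr.v4.runtime.BaseErrorListener;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.Recognizer;

// Hypothetical helper (not part of ANTLR): stores every reported
// syntax error in a list instead of printing it to the console.
class CollectingErrorListener extends BaseErrorListener {

    private final List<String> errors = new ArrayList<>();

    // ANTLR calls this once for each syntax error it reports.
    @Override
    public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol,
                            int line, int charPositionInLine,
                            String msg, RecognitionException e) {
        // Assumed format, modeled on the default "line L:C message" output.
        errors.add("line " + line + ":" + charPositionInLine + " " + msg);
    }

    // Read-only view of the errors collected so far.
    public List<String> getErrors() {
        return Collections.unmodifiableList(errors);
    }
}
```

With the classes from the question, you would create one CollectingErrorListener, call parser.removeErrorListeners() and parser.addErrorListener(listener) before parser.file(), and read listener.getErrors() afterwards.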

Now, for the lexer, you may either do the same thing (removeErrorListeners/addErrorListener) or add the following rule at the end of the grammar:

UNKNOWN_CHAR : . ;

With this rule, the lexer will never fail (it will generate UNKNOWN_CHAR tokens when it can't do anything else) and all errors will be generated by the parser (because it won't know what to do with these UNKNOWN_CHAR tokens). I recommend this approach.

