Well, I found the answer, which was given by @BalusC on a different thread:
- If you just want to use an XML-based tool to traverse it: JTidy.
- If you want to unit test the HTML: HtmlUnit.
- If you want to extract specific data from the HTML: Jsoup.
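
For the third case, here is a minimal sketch of what extracting data with Jsoup can look like. It assumes the jsoup library is on the classpath; the class name `JsoupExtractSketch`, the helper `linkTexts`, and the sample HTML are made up for illustration and are not from the original thread.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical example class; requires the jsoup jar on the classpath.
class JsoupExtractSketch {

    // Returns the visible text of every <a> element that has an href attribute.
    static List<String> linkTexts(String html) {
        // Parse the HTML string into a DOM-like Document.
        Document doc = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        // select() takes a CSS selector; "a[href]" matches anchors with an href.
        for (Element link : doc.select("a[href]")) {
            texts.add(link.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        // Made-up sample markup, only for demonstration.
        String html = "<html><head><title>Demo</title></head>"
                + "<body><a href=\"/one\">First</a> <a href=\"/two\">Second</a></body></html>";
        System.out.println(linkTexts(html)); // prints [First, Second]
    }
}
```

The same `select` call accepts other CSS selectors, so picking out a table cell or a `div` with a given class should only need a different selector string.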
Thank you @BalusC.