Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


jquery - A JavaScript parser for DOM

We have a special requirement in a project: we have to parse a string of HTML (from an AJAX response) on the client side, using JavaScript only. That's right, no parsing in PHP or Java! I've been going through Stack Overflow this entire week and have not yet found an acceptable solution.

Some more details on the requirements:

  • We can use any library (preferably dojo and/or jQuery) or go native!

  • We need to parse an entire HTML document that we receive as a string, including the <head> and <body>.

  • We also need to serialise the parsed DOM structures back out to strings at times.

  • Finally, we don't want to append the parsed DOM to the current document. Rather, we'll send it back to the server for permanent storage.

For example, we need something like:

var dom = HTMLtoDOM('<html><head><title> This is the old title. </title></head></html>');
dom.getElementsByTagName('title')[0].innerHTML = "This is a new Title";

From my research, these are our options:

  1. The TinyMCE parser. The problem: I think we would have to include the whole editor. What about parsing HTML where we don't need an editor?

  2. John Resig's parser. This should be our best bet. Unfortunately, the parser crashes when it is given the entire contents of a page!

  3. jQuery's $(htmlString) or dojo.toDom(htmlString). Both rely on DocumentFragment and hence gobble up <head> and <body>!

EDIT: We want to serialize the HTML so that we can catch certain custom HTML comments via RegExp. We also need to give users the opportunity to edit meta tags, title tags, etc., hence the HTML parser.

Oh, and I feel I will be murdered on Stack Overflow if I so much as hint at parsing HTML via RegExp!
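
The comment-catching step mentioned in the edit could be sketched roughly as follows. The question does not say what the custom comments look like, so the `cms:` prefix, the `CUSTOM_COMMENT` pattern, and the `findCustomComments` helper are all hypothetical names chosen for illustration. The sketch only scans an already serialized string for comments; it does not attempt to parse HTML structure with a regular expression.

```javascript
// Hypothetical marker format, assumed for this sketch only:
//   <!-- cms:NAME optional payload -->
var CUSTOM_COMMENT = /<!--\s*cms:(\w+)\s*([\s\S]*?)\s*-->/g;

// Return every custom comment found in a serialized HTML string.
function findCustomComments(html) {
    var found = [];
    var match;
    CUSTOM_COMMENT.lastIndex = 0; // the regex is global and reused, so reset it
    while ((match = CUSTOM_COMMENT.exec(html)) !== null) {
        found.push({ name: match[1], payload: match[2], index: match.index });
    }
    return found;
}

var sample = '<html><head><!-- cms:title editable --><title>Old</title></head>' +
             '<body><!-- plain comment --><!--cms:footer--></body></html>';

// Logs the two "cms:" comments and skips the plain one.
console.log(findCustomComments(sample));
```

Ordinary comments are ignored because the pattern requires the assumed prefix directly after the opening `<!--`.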


1 Reply


You can leverage the current document without appending any nodes to it.

Try something like this:

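// Build a detached <html> element; it is never attached to the live document.
function toNode(html) {
    var doc = document.createElement('html');
    // Assigning innerHTML parses the string into child nodes of the detached element.
    doc.innerHTML = html;
    return doc;
}

var node = toNode('<html><head><title> This is the old title. </title></head></html>');

console.log(node);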

http://jsfiddle.net/6SvqA/3/
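
As an alternative to the element-based approach, current browsers also offer `DOMParser`, which returns a complete detached `Document`, so `<head>` and `<body>` are available and the result can be serialized again. The following is a minimal sketch rather than a definitive implementation: `htmlToDocument` and `documentToHtml` are hypothetical helper names, and the doctype handling is simplified to the doctype name only (any public or system identifier is dropped).

```javascript
// Parse a complete HTML string into a detached Document.
// Assumes a browser whose DOMParser supports the 'text/html' type.
function htmlToDocument(html) {
    return new DOMParser().parseFromString(html, 'text/html');
}

// Serialise a Document back to a string. outerHTML on the root element
// does not include the doctype, so it is re-added here when present.
// Simplification: only the doctype name is written out.
function documentToHtml(doc) {
    var doctype = doc.doctype ? '<!DOCTYPE ' + doc.doctype.name + '>\n' : '';
    return doctype + doc.documentElement.outerHTML;
}

// Usage (runs only where DOMParser exists, i.e. in a browser):
if (typeof DOMParser !== 'undefined') {
    var dom = htmlToDocument(
        '<!DOCTYPE html><html><head><title> This is the old title. </title></head><body></body></html>'
    );
    dom.getElementsByTagName('title')[0].textContent = 'This is a new Title';
    console.log(documentToHtml(dom)); // string that could be sent back to the server
}
```

Nothing here is appended to the current document; the parsed `Document` exists only in memory until it is serialized.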

