Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
526 views
in Technique[技术] by (71.8m points)

php - DOMDocument: Ignore Duplicate Element IDs

I'm putting some page content (which has been run through Tidy, but doesn't need to be if this is a source of problems) into DOMDocument using DOMDocument::loadHTML.

It's coming up with various errors:

ID x already defined in Entity, line X

Is there any way to make either DOMDocument (or Tidy) ignore or strip out duplicate element IDs, so it will actually create the DOMDocument?

Thanks. :)

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

A quick search on the subject reveals this (incorrect) bug report:

http://bugs.php.net/bug.php?id=46136

The last reply states the following:

You're using HTML 4 rules to load an XHTML document. Either use the load() method to parse as XML or the libxml_use_internal_errors() function to ignore the warnings.

I can't be sure if you are encountering this problem for the same reasons, since you did not include a reference to the HTML page being loaded. In any case, using libxml_use_internal_errors() should at least suppress the error.

ID's in HTML documents are generally unique, so the best solution would still be validating your document, if at all possible.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...