I'm trying to get the elements in an HTML doc that contain the following pattern of text: #S{11}
<h2> this is cool #12345678901 </h2>
So, the previous would match by using:
soup('h2',text=re.compile(r' #S{11}'))
And the results would be something like:
[u'blahblah #223409823523', u'thisisinteresting #293845023984']
I'm able to get all the text that matches (see line above). But I want the parent element of the text to match, so I can use that as a starting point for traversing the document tree. In this case, I'd want all the h2 elements to return, not the text matches.
Ideas?
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…