HTML Parsing - Get Innermost HTML Tags


When I parse HTML, I want to get only the innermost tags for the entire document. My intention is to parse the data semantically from the HTML document.

So if I have something like this HTML:

   

It should leave me with only <td>X</td> and <td>Y</td>. Is it possible to do this using Beautiful Soup or lxml?

In .NET I have used a library to do all my HTML parsing. It makes it easy to load the DOM and select nodes from it; in your case, you can select the nodes that have no children. Hope that helps.
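
To make the "select the nodes that have no children" idea concrete in Python, here is a minimal sketch. It uses the standard library's html.parser rather than Beautiful Soup or lxml, so it runs without extra installs. The names InnermostTagCollector and innermost_tags and the sample markup are made up for illustration, and it assumes reasonably well-formed HTML.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are not tracked as open
# elements. They still count as a child of their parent.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class InnermostTagCollector(HTMLParser):
    """Collect (tag, text) pairs for elements that contain no child elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._stack = []     # open elements as [tag, has_child, text_parts]
        self.innermost = []  # results, in the order the elements were closed

    def handle_starttag(self, tag, attrs):
        if self._stack:
            self._stack[-1][1] = True  # the enclosing element now has a child
        if tag not in VOID_TAGS:
            self._stack.append([tag, False, []])

    def handle_data(self, data):
        if self._stack:
            self._stack[-1][2].append(data)

    def handle_endtag(self, tag):
        # Ignore closing tags that do not match any open element.
        if not any(entry[0] == tag for entry in self._stack):
            return
        # Pop until the matching element is closed (tolerates unclosed tags).
        while self._stack:
            name, has_child, parts = self._stack.pop()
            if not has_child:
                self.innermost.append((name, "".join(parts).strip()))
            if name == tag:
                break


def innermost_tags(html):
    """Return (tag, text) for every element without child elements."""
    parser = InnermostTagCollector()
    parser.feed(html)
    parser.close()
    return parser.innermost


if __name__ == "__main__":
    # Hypothetical sample, not the markup from the question.
    sample = "<table><tr><td>X</td><td>Y</td></tr></table>"
    print(innermost_tags(sample))
```

Elements that are never closed by the end of the input are not reported in this sketch. The same test (an element with no child elements) should carry over to a tree built by Beautiful Soup or lxml.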

