python - How do I select an element with the exact class using cssselect in lxml?
I am scraping HTML from the web with lxml, but I run into a problem when selecting elements. For example:

    html.cssselect('a.asig')

This selects the elements with class="asig", but the selection also contains elements whose class attribute merely includes "asig", for example:

    <a class="asig drcha" ...>

What can I do to get only the elements whose class is exactly asig, and not the asig drcha ones? Thanks!

Either use html.xpath and adjust the expression accordingly, or match on the whole class attribute as in the code below, which covers the case where the class attribute holds more than one class name.

    from lxml import html

    # Bytes literal: lxml may reject a str that carries an encoding declaration.
    sample = b'<?xml version="1.0" encoding="UTF-8"?><root><a class="asig">I am right</a><a class="asig drcha">I am wrong</a></root>'
    tree = html.fromstring(sample)

    # XPath: compare the whole class attribute against 'asig'
    print(tree.xpath("//a[@class='asig']/text()")[0])
    # cssselect: attribute selector instead of the .asig class selector
    print(tree.cssselect("a[class='asig']")[0].text)

The result is as follows:

    I am right
    I am right

Note how cssselect is used in the last line: a[class='asig'] compares the entire attribute value, whereas a.asig matches any element that has asig among its classes. Hope this helps.
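If it is unclear why a.asig picks up both links, the difference between the two selectors can be sketched in plain Python. This is only an illustration of the selector semantics, not how lxml or cssselect implement them. It uses the standard library's xml.etree.ElementTree instead of lxml to stay dependency-free, and the helper names has_class_token and has_exact_class are hypothetical.

```python
import xml.etree.ElementTree as ET

# Same sample markup as in the answer, minus the XML declaration.
sample = (
    '<root>'
    '<a class="asig">I am right</a>'
    '<a class="asig drcha">I am wrong</a>'
    '</root>'
)
root = ET.fromstring(sample)
links = root.findall(".//a")


def has_class_token(el, name):
    """Mimic `a.asig`: true when `name` is one of the whitespace-separated classes."""
    return name in el.get("class", "").split()


def has_exact_class(el, name):
    """Mimic `a[class='asig']`: true only when the whole attribute equals `name`."""
    return el.get("class") == name


token_matches = [el.text for el in links if has_class_token(el, "asig")]
exact_matches = [el.text for el in links if has_exact_class(el, "asig")]

print(token_matches)  # both links match the class-token test
print(exact_matches)  # only the first link matches the exact test
```

One caveat with the exact comparison: it is a plain string match, so an attribute written as class="asig " (with stray whitespace) would not match either.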