An easy way to set priority? about soupsieve HOT 18 CLOSED

facelessuser commented on August 16, 2024

An easy way to set priority?

from soupsieve.

Comments (18)

facelessuser commented on August 16, 2024

Specificity really only applies in a stylesheet, when you are trying to see which style (based on the specificity of the selectors) gets applied. It doesn't really apply in the context in which Soup Sieve uses selectors.

I guess I would have to see an example of what you are trying to do, and why it doesn't work.

yjqiang commented on August 16, 2024

Using requests.get(url='https://m.banzhuer.com/booklist/5393_39/') you get an HTML file. In it you can find <li><a href="/booklist/5393_38/" class="xbk">1481 - 1520章</a><li><a class="xbk this tb">1521 - 1560章</a><li><a href="/booklist/5393_40/" class="xbk">1561 - 1600章</a><li>.
<a class="xbk this tb">1521 - 1560章</a> is the current index, and I want to find the next index. It should be easy (<a href="/booklist/5393_40/" class="xbk">1561 - 1600章</a> is exactly what I want). But there is a problem: this markup is repeated 4 times in the HTML, so I have to filter the results. That is very hard for me.
There is another problem too. I want to find all the indexes, but every one of them is repeated 4 times. I have to say, this website is disgusting.

facelessuser commented on August 16, 2024

I feel like you should be able to do something like li:nth-child(3) a.xbk, but I still don't exactly understand what you are trying to target.

yjqiang commented on August 16, 2024

1. I want to get the current index.
   <a class="xbk this tb">1521 - 1560章</a> is what I want. But that tag is repeated 4 times (you can check it; even the parents look the same). I can't find a good way to filter.
2. I want to get the next index.
   <a href="/booklist/5393_40/" class="xbk">1561 - 1600章</a> is what I want (it comes right after the current index). But that tag is repeated 4 times, too. I can't find a good way to filter.
3. I want to get all of the indexes.
   But I can't filter them either.

facelessuser commented on August 16, 2024

Are you able to retrieve the current index without issue? If so, I may have an idea.

yjqiang commented on August 16, 2024

> Are you able to retrieve the current index without issue? If so, I may have an idea.

I can get the index easily. The problem is that there are clones. But I don't want to use something like set() in Python. Can I do it using only soupsieve?

facelessuser commented on August 16, 2024

If you are able to get the current index, and extract that text, you could then maybe construct another selector li:contains("1521 - 1560章") + li > a.xbk.

So something like:

select('li:contains("{}") + li > a.xbk'.format(current.text))

You'd have to play around and figure out what works best. This is really outside the scope of Soup Sieve support, as I'm more interested in addressing bugs and features, and not applications of the library, but maybe this helps.
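
A minimal sketch of that two-step idea, assuming markup shaped like the snippet quoted earlier in the thread (the HTML string and variable names below are illustrative, not taken from the real page). It also assumes a Soup Sieve release recent enough to provide :-soup-contains(), the newer spelling of :contains().

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the snippet quoted in this thread;
# the real page may differ.
HTML = """
<ul>
  <li><a href="/booklist/5393_38/" class="xbk">1481 - 1520章</a></li>
  <li><a class="xbk this tb">1521 - 1560章</a></li>
  <li><a href="/booklist/5393_40/" class="xbk">1561 - 1600章</a></li>
</ul>
"""

soup = BeautifulSoup(HTML, "html.parser")

# Step 1: the current index is the link that carries the extra "this" class.
current = soup.select_one("a.xbk.this")

# Step 2: build a second selector from its text. Newer Soup Sieve releases
# spell the pseudo-class :-soup-contains(); older ones only know :contains().
selector = 'li:-soup-contains("{}") + li > a.xbk'.format(current.text)
next_link = soup.select_one(selector)

print(next_link["href"])
```

If the pagination block is repeated on the page, select_one() returns only the first match in document order, which may be enough to sidestep the duplicates.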

yjqiang commented on August 16, 2024

> If you are able to get the current index, and extract that text, you could then maybe construct another selector li:contains("1521 - 1560章") + li > a.xbk.
>
> So something like:
>
> select('li:contains("{}") + li > a.xbk'.format(current.text))
>
> You'd have to play around and figure out what works best. This is really outside the scope of Soup Sieve support, as I'm more interested in addressing bugs and features, and not applications of the library, but maybe this helps.

But how can I get all of the indexes? I mean, I can get them easily, but the problem is that they are repeated in the HTML, and I can't filter them.

facelessuser commented on August 16, 2024

I didn't look too closely, but while it repeats, I thought their parents were different.

I might be wrong, but you could get all indexes with parent li.

Anyway, not all problems can be solved with just selectors. Sometimes you may have to use additional logic if the HTML is constructed in a way that does not lend itself easily to simple selectors.
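
As one example of such additional logic, here is a small sketch that drops repeated results after selection while keeping document order. The (href, text) pairs are made-up stand-ins for whatever a selector returns on the real page, and unique_in_order is a hypothetical helper name.

```python
# Hypothetical (href, text) pairs, as they might come back from a selector
# when the same pagination block appears four times on the page.
block = [
    ("/booklist/5393_38/", "1481 - 1520"),
    (None, "1521 - 1560"),  # the current index has no href
    ("/booklist/5393_40/", "1561 - 1600"),
]
scraped = block * 4


def unique_in_order(items):
    """Drop repeats while keeping first-seen order (dicts preserve insertion order)."""
    return list(dict.fromkeys(items))


indexes = unique_in_order(scraped)
print(indexes)
```

Unlike set(), dict.fromkeys() keeps the entries in the order they were first seen, so the indexes stay in page order.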

yjqiang commented on August 16, 2024

Here is my way to filter to get the next index. (The code for the current index, the next index, and all indexes runs in different places, so I can't fetch the current index tag first and then derive the next index from it; one selector has to produce the result.)
select('div[class="fenye"]:first-child div[class="showpage r3"]:first-child > ul li > a[class="xbk this tb"] + li > a[class="xbk"]')
That is very hard for me to read. But maybe you can help me by suggesting another way.

yjqiang commented on August 16, 2024

And maybe you could describe the function priority in the documentation?

facelessuser commented on August 16, 2024

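If you have a complicated selector, you can annotate it with CSS comments. It can be helpful when you come back to understand what it is you were doing.

>>> # Setup assumed here; the original comment showed only the selector.
>>> from bs4 import BeautifulSoup
>>> import soupsieve as sv
>>> soup = BeautifulSoup('<p class="a">Cat</p><p class="b">Dog</p><p class="c">Mouse</p>', 'html.parser')
>>> selector = """
... /* This isn't complicated, but we're going to annotate it anyways.
...    This is the a class */
... .a,
... /* This is the b class */
... .b,
... /* This is the c class */
... .c
... """
>>> sv.select(selector, soup)
[<p class="a">Cat</p>, <p class="b">Dog</p>, <p class="c">Mouse</p>]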

Like with regular expressions, sometimes a selector solution can be complex. Complex doesn't always mean it's a bad solution; sometimes that is just necessary.

Unfortunately, I don't have time to analyze the HTML you've provided in great detail and provide a complete solution.

facelessuser commented on August 16, 2024

I'm also not sure what you mean by function priority.

yjqiang commented on August 16, 2024

> I'm also not sure what you mean by function priority.

li > a[class="xbk this tb"] ~ li > a[class="xbk"] means ((li > a[class="xbk this tb"]) ~ li) > a[class="xbk"]. (I use parentheses only to point out the priority; the parenthesized form is not valid and can't be executed by soupsieve.) So maybe you can document the priority, like * / + - in math?

facelessuser commented on August 16, 2024

A complex selector (one with combinators such as >, +, etc.) is evaluated from right to left. Check out this answer on Stack Overflow: https://stackoverflow.com/a/8135729/3609487.

Soup Sieve basically does the same thing. The element under consideration is evaluated with a[class="xbk"], then it checks that it has the parent li, then it checks that li comes after the sibling a[class="xbk this tb"], and that sibling is a child of li.

Soup Sieve doesn't try to spell out the entire CSS spec. It is expected that the user will reference CSS rules if they are confused about how a selector should work, but I will consider the suggestion.
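
To make that reading concrete, here is a small sketch comparing the selector as written with one way of expressing what was probably intended. The markup is a hypothetical reduction of the snippet quoted earlier in the thread, and the :has() rewrite is a suggestion under that assumption, not something confirmed against the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the snippet quoted earlier in the thread.
HTML = """
<ul>
  <li><a href="/booklist/5393_38/" class="xbk">prev</a></li>
  <li><a class="xbk this tb">current</a></li>
  <li><a href="/booklist/5393_40/" class="xbk">next</a></li>
</ul>
"""
soup = BeautifulSoup(HTML, "html.parser")

# Read right to left: an <a>, whose parent is an <li>, which follows a sibling
# <a class="xbk this tb">, which is itself a child of an <li>. No <li> here has
# an <a> as a sibling, so nothing matches.
as_written = soup.select('li > a[class="xbk this tb"] ~ li > a[class="xbk"]')

# Probably intended: the <li> that contains the current link, then the next
# <li>, then its link. :has() expresses the "contains" part.
intended = soup.select('li:has(> a[class="xbk this tb"]) + li > a[class="xbk"]')

print(as_written)
print([a["href"] for a in intended])
```

Combinators have no precedence to override: each one simply relates the compound selector on its right to the one on its left, so grouping has to be expressed with pseudo-classes such as :has() rather than parentheses.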

yjqiang commented on August 16, 2024

> A complex selector (one with combinators such as >, +, etc.) is evaluated from right to left. Check out this answer on Stack Overflow: https://stackoverflow.com/a/8135729/3609487.
>
> Soup Sieve basically does the same thing. The element under consideration is evaluated with a[class="xbk"], then it checks that it has the parent li, then it checks that li comes after the sibling a[class="xbk this tb"], and that sibling is a child of li.
>
> Soup Sieve doesn't try to spell out the entire CSS spec. It is expected that the user will reference CSS rules if they are confused about how a selector should work, but I will consider the suggestion.

Thx.
But why from right to left? Usually the left part is the parent and the right part is the child. If you search for the right part first, there may be too many results. If you check the parents first and then search for the children among those results, I think it could save time.

facelessuser commented on August 16, 2024

The tree is crawled from parent to child, but each tag is matched individually. When a tag is matched, we start from the right: first we want to know if it is even the element that we want, then we look at its ancestry.
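
A toy illustration of that order of checks for the single selector li > a.xbk. This is a hand-written sketch, not Soup Sieve's actual implementation; matches_li_child_a_xbk is a hypothetical helper and the markup is invented for the example.

```python
from bs4 import BeautifulSoup


def matches_li_child_a_xbk(tag):
    """Hand-rolled check for the selector `li > a.xbk`, read right to left."""
    # Rightmost compound first: is this tag even an <a> with class "xbk"?
    if tag.name != "a" or "xbk" not in tag.get("class", []):
        return False
    # Only then look at ancestry: is the direct parent an <li>?
    return tag.parent is not None and tag.parent.name == "li"


soup = BeautifulSoup(
    '<ul><li><a class="xbk">one</a></li></ul><p><a class="xbk">two</a></p>',
    "html.parser",
)

# The tree is walked from parents to children; every tag is a candidate.
manual = [tag for tag in soup.find_all(True) if matches_li_child_a_xbk(tag)]

print([tag.text for tag in manual])
```

Most tags fail the cheap rightmost test immediately, so the ancestry check only runs for the few candidates that could match at all.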

yjqiang commented on August 16, 2024

Thanks a lot.
