Comments (7)
I don't know why you use `limit` to control the number of results. I think people would need either one result or all of them.
Not necessarily. Because it uses a generator internally, `limit` gives you a way to terminate the search once you have everything you need to work with, saving you from having to parse the entire tree. It depends on the kind of data you are working with and what your task is.
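To make the early-termination point concrete, here is a minimal sketch using only the standard library. It is a toy model, not Soup Sieve's actual code: `Node`, `iter_matches`, and the `visited` list are hypothetical names made up for illustration, and only the idea of a lazy generator plus a `limit` (with 0 meaning "no limit") is taken from the discussion above.

```python
from dataclasses import dataclass, field
from itertools import islice


@dataclass
class Node:
    """Toy stand-in for a parsed element (hypothetical, not a bs4 Tag)."""
    name: str
    children: list = field(default_factory=list)


visited = []  # records every node the walk actually touches


def iter_matches(node, name):
    """Lazily yield descendants of `node` named `name`, in document order."""
    for child in node.children:
        visited.append(child.name)
        if child.name == name:
            yield child
        yield from iter_matches(child, name)


def select(node, name, limit=0):
    """Return matches as a list; a limit of 0 means no limit (assumption)."""
    matches = iter_matches(node, name)
    return list(islice(matches, limit)) if limit else list(matches)


tree = Node("html", [
    Node("body", [
        Node("p"),
        Node("div", [Node("p"), Node("p")]),
        Node("p"),
    ]),
])

# With a limit, the generator is abandoned as soon as enough matches exist.
first = select(tree, "p", limit=1)
visited_with_limit = len(visited)

# Without a limit, every descendant has to be walked.
visited.clear()
everything = select(tree, "p")
visited_without_limit = len(visited)
```

In this toy tree the limited search touches only part of the tree, while the unlimited one walks all of it. How much that saves in practice depends on the document and on where the matches sit.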
Regardless of whether it makes sense, this mirrors Beautiful Soup's API in this respect, as it has always had a `limit` as well.
With all of that said, for completeness with the original Beautiful Soup API, I can see the desire to have the familiar `select_one()` as well, with its convenience of not having to index into the first item in the list. This seems a reasonable request.
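As a rough illustration of the convenience being requested, the sketch below layers a `select_one` helper on top of a `select` that accepts a `limit`. It is a standard-library toy over a plain list of strings, not the Soup Sieve or Beautiful Soup implementation; the `predicate` argument and the `iselect` helper here are assumptions for the example.

```python
from itertools import islice


def iselect(items, predicate):
    """Lazily yield the items that satisfy `predicate` (toy matcher)."""
    return (item for item in items if predicate(item))


def select(items, predicate, limit=0):
    """Return matches as a list; a limit of 0 means no limit (assumption)."""
    matches = iselect(items, predicate)
    return list(islice(matches, limit)) if limit else list(matches)


def select_one(items, predicate):
    """Return the first match, or None when nothing matches."""
    matches = select(items, predicate, limit=1)
    return matches[0] if matches else None


tags = ["div", "p", "span", "p"]

first_via_index = select(tags, lambda t: t == "p")[0]    # caller indexes the list
first_via_helper = select_one(tags, lambda t: t == "p")  # no indexing needed
missing = select_one(tags, lambda t: t == "table")       # None, not an IndexError
```

The helper also gives a natural answer for the "no match" case, which plain indexing into an empty list does not.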
Keep in mind, Beautiful Soup will be using Soup Sieve as its select library moving forward, so you'll get `select_one` built into the tag objects for free there. I always wrote Soup Sieve to be used internally in Beautiful Soup, so at the time `select` seemed like all I'd need. It wasn't until afterwards that I realized there were cases where I would still like to use it externally as well, especially being able to precompile patterns and to build a document parser that uses `match` to return only the elements you care about.
If I could go back, I'd probably call the two functions `select` and `select_all`. Oh well, looks like it is `select` and `select_one` moving forward 🙂.
from soupsieve.
And thanks for the CSS selectors you invented, like `!=`, that is what I need now.
Well, I didn't invent it; it is something other libraries have been using for a while. I know jQuery and the lxml CSSSelect library implement it. It seemed too logical and convenient not to add here as well.
I like the precompile. It looks like the re module (re.compile).
Yeah, that was kind of my model when I wrote it. It seemed like if you had a script with known things you were looking for, then instead of calling `select` over and over, reparsing the selector string and rebuilding the search object each time, it would be best to:
- Allow you to build that thing once and reuse it.
- Cache them in case you didn't think to precompile to begin with.
The `re` module employs both of these.
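The two ideas in that list (build once and reuse, plus a cache for callers who never precompile) can be sketched with the standard library alone. This is a toy model under stated assumptions, not Soup Sieve's internals: `ToyPattern`, `compile_selector`, the `tag.class` mini-grammar, and the `(tag, classes)` element tuples are all hypothetical, and `functools.lru_cache` merely stands in for whatever caching the real library does.

```python
from functools import lru_cache

parse_calls = 0  # counts how often a selector string is actually parsed


class ToyPattern:
    """Hypothetical compiled form of a `tag` or `tag.class` selector."""

    def __init__(self, selector):
        global parse_calls
        parse_calls += 1
        self.tag, _, self.cls = selector.partition(".")

    def match(self, element):
        """`element` is a (tag, classes) tuple in this toy model."""
        tag, classes = element
        return tag == self.tag and (not self.cls or self.cls in classes)

    def select(self, elements):
        """Return every element that matches this pattern."""
        return [el for el in elements if self.match(el)]


@lru_cache(maxsize=128)
def compile_selector(selector):
    """Explicit precompile step; the cache also covers repeat callers."""
    return ToyPattern(selector)


def select(selector, elements):
    """Convenience wrapper that goes through the cache on every call."""
    return compile_selector(selector).select(elements)


elements = [("p", {"intro"}), ("p", set()), ("div", {"intro"})]

# Option 1: build the pattern once and reuse it explicitly.
intro_p = compile_selector("p.intro")
explicit = [intro_p.select(elements) for _ in range(3)]

# Option 2: call the wrapper repeatedly and rely on the cache.
implicit = [select("p.intro", elements) for _ in range(3)]
```

Either way the selector string is parsed only once in this sketch, which is the same trade `re.compile` and the `re` module's internal cache offer for regular expressions.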
Had a little time this morning, so I implemented the feature via #46. It will be in the `1.5.0` release. Not sure when that release is coming yet; maybe not too far out. I imagine major releases will slow down moving forward, as I feel the project has a solid feature base now, but I think I'll at least get this one out within the week.
find_all(name, attrs, recursive, string, limit, **kwargs)
Just found it, and this makes sense. I like the precompile. It looks like the `re` module (`re.compile`).
And thanks for the CSS selectors you invented, like `!=`, that is what I need now.
With all of that said, for completeness with the original Beautiful Soup API, I can see the desire to have the familiar `select_one()` as well, with its convenience of not having to index into the first item in the list. This seems a reasonable request.
So now I just need to sit down and wait for the new `select_one` API.
FYI, I've released `1.5.0`, as I found a bug in the `nth` selectors while I was adding the `:dir()` selector. I wanted to get the fix out, so I went ahead and cut the `1.5.0` release.