Comments (1)
@Michael-Purtill ElementRef::wrap returns Some only if the node is an element. However, next_sibling can return nodes that aren't elements, such as text nodes. As a result, you need to iterate with next_siblings and take the first node that is an element instead.
Here's a version that works:
use scraper::Html;
use scraper::Selector;

fn main() {
    let response = ureq::get("https://en.wiktionary.org/wiki/pes#Czech")
        .call()
        .expect("invalid request");
    let response_text = response.into_string().expect("invalid response");
    let doc = Html::parse_document(&response_text);

    // "#Czech" matches the headline <span>; its parent is the <h2> node.
    let h2_selector = Selector::parse("#Czech").expect("invalid selector");
    let h2 = doc
        .select(&h2_selector)
        .next()
        .expect("missing h2")
        .parent()
        .expect("missing parent");
    println!(
        "{}\n",
        scraper::ElementRef::wrap(h2)
            .expect("h2 is not an element")
            .html()
    );

    // Skip non-element siblings (e.g. text) and take the first element.
    let element = h2
        .next_siblings()
        .find(|node| node.value().is_element())
        .expect("missing next_sibling");
    println!(
        "{}",
        scraper::ElementRef::wrap(element)
            .expect("element is not an element")
            .html()
    );
}
yields:
<h2><span id="Czech" class="mw-headline">Czech</span><span class="mw-editsection"><span class="mw-editsection-bracket">[</span><a title="Edit section: Czech" href="/w/index.php?title=pes&action=edit&section=16">edit</a><span class="mw-editsection-bracket">]</span></span></h2>
<div class="sister-wikipedia sister-project noprint floatright" style="border: 1px solid #aaa; font-size: 90%; background: #f9f9f9; width: 250px; padding: 4px; text-align: left;"><div style="float: left;"><div class="floatnone"><img alt="" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/6/63/Wikipedia-logo.png/66px-Wikipedia-logo.png 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/6/63/Wikipedia-logo.png/88px-Wikipedia-logo.png 2x" data-file-width="200" height="44" decoding="async" width="44" src="//upload.wikimedia.org/wikipedia/commons/thumb/6/63/Wikipedia-logo.png/44px-Wikipedia-logo.png" data-file-height="200"></div></div><div style="margin-left: 60px;">Czech <a title="Wikipedia" href="/wiki/Wikipedia">Wikipedia</a> has an article on:<div style="margin-left: 10px;"><b lang="cs" class="Latn"><a class="extiw" title="w:cs:pes" href="https://en.wikipedia.org/wiki/cs:pes">pes</a></b></div></div><span class="interProject"><a href="https://en.wikipedia.org/wiki/cs:pes" class="extiw" title="w:cs:pes">Wikipedia <sup>cs</sup></a></span></div>
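To see why the filter is needed, here is a small crate-free sketch of the same idea. The Node enum, wrap, and next_element_sibling below are hypothetical stand-ins written for illustration, not scraper's or ego-tree's real types; the sketch also assumes the heading is followed by a text node before the next element, which is the kind of case described above.

```rust
// Hypothetical stand-in for a parsed HTML node; NOT scraper's real type.
#[derive(Debug, PartialEq)]
enum Node {
    Text(String),
    Element(String),
}

// Plays the role of ElementRef::wrap: yields Some only for element nodes.
fn wrap(node: &Node) -> Option<&str> {
    match node {
        Node::Element(name) => Some(name.as_str()),
        Node::Text(_) => None,
    }
}

// Plays the role of next_siblings() plus a filter: walk the nodes after
// `index` and return the first one that is an element.
fn next_element_sibling(siblings: &[Node], index: usize) -> Option<&str> {
    siblings.iter().skip(index + 1).find_map(|node| wrap(node))
}

fn main() {
    // Assumed layout: an <h2>, then a text node, then a <div>.
    let siblings = vec![
        Node::Element("h2".to_string()),
        Node::Text("\n".to_string()),
        Node::Element("div".to_string()),
    ];

    // The immediate next sibling is text, so wrapping it gives None.
    println!("immediate sibling: {:?}", wrap(&siblings[1]));

    // Filtering the remaining siblings skips the text node.
    println!("first element sibling: {:?}", next_element_sibling(&siblings, 0));
}
```

Calling expect on the first result is what fails in the original attempt; the second call mirrors the next_siblings().find(...) step in the working version.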