phpgt / cssxpath
Translate CSS selectors to XPath queries
Home Page: https://www.php.gt/cssxpath
License: MIT License
The pseudo selectors are the only selectors that haven't been tested yet. Here is the code coverage: https://24-117023753-gh.circle-artifacts.com/0/TestCoverage/index.html
Here's an example of the Dom repository's CI:
https://github.com/PhpGt/Dom/blob/master/.github/workflows/ci.yml#L70-L86
It pushes to Codecov, which will automatically make the badge in the README work.
.my-class works fine, but div.my-class, or multiple classes such as .my-class.another-class, fail.
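For reference, the three selectors above can be expressed by hand in XPath using a whole-token class test. This is only a sketch using PHP's built-in DOMDocument and DOMXPath: the XPath is hand-written and is not claimed to be what CssXPath generates, and classPredicate() is a hypothetical helper introduced for illustration.

```php
<?php
// Hypothetical helper: builds a predicate matching one whole class token,
// so "my-class" does not accidentally match "my-class-extra".
function classPredicate(string $class): string {
    return "contains(concat(' ', normalize-space(@class), ' '), ' $class ')";
}

$document = new DOMDocument();
$document->loadHTML(
    '<div class="my-class another-class">A</div><p class="my-class">B</p>'
);
$xpath = new DOMXPath($document);

// .my-class
$anyWithClass = $xpath->query("//*[" . classPredicate("my-class") . "]");
// div.my-class
$divWithClass = $xpath->query("//div[" . classPredicate("my-class") . "]");
// .my-class.another-class (one predicate per class)
$bothClasses = $xpath->query(
    "//*[" . classPredicate("my-class") . "]"
    . "[" . classPredicate("another-class") . "]"
);

echo $anyWithClass->length, " ", $divWithClass->length, " ", $bothClasses->length, "\n";
```

Chaining one predicate per class keeps the element name and each class test independent, which is the combination the failing selectors need.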
All other PHP.Gt repositories have a good code quality score on Scrutinizer-CI, but with all the logic being within a single class and many nested conditionals, CssXpath is getting a really bad score.
https://scrutinizer-ci.com/g/PhpGt/CssXPath/code-structure/master
form [name] should select all elements with a name attribute that are descendants of a form. The selection breaks:
TypeError : Argument 1 passed to Gt\Dom\HTMLCollection::__construct() must be an instance of DOMNodeList, bool given, called in /home/g105b/Code/PhpGt/DomTemplate/vendor/phpgt/dom/src/ParentNode.php on line 72
There is currently no functionality when using :not, due to how the regex parsing is set up, but for completeness this would be nice in v2.
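As a rough idea of what a :not translation could target, XPath has a not() function that negates a predicate. The sketch below is hand-written XPath run through PHP's built-in DOMXPath; the selector li:not(.done) and the markup are assumptions chosen for illustration, not taken from the library.

```php
<?php
$document = new DOMDocument();
$document->loadHTML(
    '<ul><li class="done">A</li><li>B</li><li class="todo">C</li></ul>'
);
$xpath = new DOMXPath($document);

// li:not(.done), written by hand: wrap the class test in not().
$notDone = $xpath->query(
    "//li[not(contains(concat(' ', normalize-space(@class), ' '), ' done '))]"
);

$texts = [];
foreach ($notDone as $li) {
    $texts[] = $li->textContent;
}
echo implode(",", $texts), "\n";
```

Elements with no class attribute also pass the test, because the class string is empty and cannot contain the token.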
I can't find clear examples of how to use this class.
Please provide some clearer examples.
When an array-named element selector is used, it fails to match anything.
When a selector passed to the querySelector() method specifies an attribute value, as in tag[attr='value'], the method doesn't work as expected and returns null.
Example to reproduce the issue:
<?php
require "vendor/autoload.php";
$html = file_get_contents("https://github.com");
$document = new \Gt\Dom\HTMLDocument($html);
//will print "Enterprise"
echo $document->querySelector("nav > ul > li > a[data-ga-click]")->innerText . "\n";
//will throw PHP Notice: Trying to get property 'innerText' of non-object
echo $document->querySelector("nav > ul > li > a[data-ga-click='(Logged out) Header, go to Enterprise']")->innerText . "\n";
The expected behaviour is that the last line should print "Enterprise".
If I try to run the exact querySelector call in my browser, it works correctly and both lines print "Enterprise":
document.querySelector("nav > ul > li > a[data-ga-click]").innerText
document.querySelector("nav > ul > li > a[data-ga-click='(Logged out) Header, go to Enterprise']").innerText
These are currently not implemented. Specific usage that I've hit is when wanting to add the selected attribute to the last option of a select element.
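The use case described above can be sketched directly in XPath with the last() function. The missing selectors are not named in the text, so this only assumes the goal is to reach the last option of a select; the markup is made up for illustration and the XPath is hand-written, not CssXPath output.

```php
<?php
$document = new DOMDocument();
$document->loadHTML(
    '<select name="size"><option>S</option><option>M</option><option>L</option></select>'
);
$xpath = new DOMXPath($document);

// last() is evaluated per parent, so this picks the final option of each select.
$lastOption = $xpath->query("//select/option[last()]")->item(0);
$lastOption->setAttribute("selected", "selected");

echo $lastOption->textContent, "\n";
```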
I have noticed that HTML attribute names are matched in a case-sensitive manner, for example:
Given a document with the contents <div data-FOO='bar'>baz</div>, the selector "[data-foo='bar']" does not match.
This is the correct behaviour for XML documents, which are case-sensitive everywhere, but not for HTML documents, where tag names and attribute names are case-insensitive.
I've submitted a draft PR with a trivial fix for this, but this will break matching in XML documents.
Perhaps there should be an optional argument to the Translator constructor which specifies the document's DOMDocumentType? Or an HtmlTranslator subclass with the different behaviour?
I'll be happy to contribute the code, but wanted to get opinions from the community.
I should like to add that I'm very grateful for this excellent package!
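To make the difference concrete, here is a sketch of a case-sensitive and a case-insensitive attribute-name match in XPath 1.0, using PHP's built-in DOMXPath. The document is loaded as XML so that the attribute name keeps its original case. The translate()-based expression is one possible approach under that assumption, not necessarily what the draft PR does.

```php
<?php
$document = new DOMDocument();
// Loaded as XML so the attribute name's case is preserved exactly as written.
$document->loadXML("<div data-FOO='bar'>baz</div>");
$xpath = new DOMXPath($document);

$upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
$lower = "abcdefghijklmnopqrstuvwxyz";

// Case-sensitive name test: does not match data-FOO.
$caseSensitive = $xpath->query("//*[@data-foo='bar']");

// Case-insensitive name test: lowercase every attribute name before comparing,
// then compare the selected attribute's value to 'bar'.
$caseInsensitive = $xpath->query(
    "//*[@*[translate(name(), '$upper', '$lower') = 'data-foo'] = 'bar']"
);

echo $caseSensitive->length, " ", $caseInsensitive->length, "\n";
```

The second form is noticeably more verbose, which is one reason to keep it behind an opt-in such as a constructor argument or a subclass.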
HTML:
<form>
<button name="do" value="save">Save!</button>
</form>
PHP:
$saveButton = $document->querySelector("form [name='do'][value='save']");
Exception raised: Gt\Dom\Exception\XPathQueryException - Query is malformed: //form//[@value="save"]
As a quick fix, I can change the query selector to: form button[name='do'][value='save'], and by explicitly mentioning that the button is being selected, the problem goes away.
I think the issue is within the "attribute" part of the regex on line 15. It should optionally match an element before it, outside of its named matching group.
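The query in the exception is malformed because the second // is followed by a predicate with no node test. A minimal sketch with PHP's built-in DOMXPath shows the failure and a hand-written form that uses * as the node test. The corrected XPath is an assumption about what a fix could emit, not the library's actual output.

```php
<?php
$document = new DOMDocument();
$document->loadHTML(
    '<form><button name="do" value="save">Save!</button></form>'
);
$xpath = new DOMXPath($document);

// The query from the exception. DOMXPath::query() raises a warning
// (suppressed here with @) and returns false for an invalid expression.
$malformed = @$xpath->query('//form//[@value="save"]');

// Same intent with "*" standing in for "any element".
$fixed = $xpath->query('//form//*[@name="do"][@value="save"]');

var_dump($malformed);
echo $fixed->item(0)->textContent, "\n";
```

A false return value passed on as if it were a node list is also consistent with the "bool given" TypeError reported for form [name].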
Dependabot needs to be tamed, as per PhpGt/WebEngine#568
There is a bug when selecting option elements using a selector like [name=from] option. The translated xpath selector actually selects all options in the document - not good.
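For comparison, a hand-written XPath that keeps the attribute predicate on the ancestor step limits the result to options inside the matching element. This sketch uses PHP's built-in DOMXPath with made-up markup; it illustrates the expected result, not the library's generated query.

```php
<?php
$document = new DOMDocument();
$document->loadHTML(
    '<form>'
    . '<select name="from"><option>A</option><option>B</option></select>'
    . '<select name="to"><option>C</option></select>'
    . '</form>'
);
$xpath = new DOMXPath($document);

// [name=from] option: only options beneath an element whose name is "from".
$scoped = $xpath->query("//*[@name='from']//option");

// What the bug report describes: every option in the document.
$all = $xpath->query("//option");

echo $scoped->length, " of ", $all->length, "\n";
```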