A tree-sitter parser for XML & DTD files.
- Extensible Markup Language (XML) 1.0
- Associating Schemas with XML documents 1.0
- Associating Style Sheets with XML documents 1.0
- Neovim
- Helix (has alternatives)
- Emacs
- Zed
XML & DTD grammars for tree-sitter
License: MIT License
<From>Jani</from>
is invalid in XML but the parser allows it.
The parser should ensure that start tags and end tags have the same name.
Just leave it to the language server/linter.
Specification: https://www.w3.org/TR/xml/#sec-starttags
Example: https://www.w3schools.com/xml/note_error.xml
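As a sketch of the "leave it to the linter" option, a well-formedness check outside the grammar does catch the mismatch. The example below uses Python's standard xml.etree.ElementTree rather than tree-sitter, and the helper name is_well_formed is made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # Raised, among other cases, when an end tag does not match its start tag.
        return False
    return True


# XML names are case-sensitive, so "From" and "from" are different names.
mismatched = is_well_formed("<From>Jani</from>")
matched = is_well_formed("<From>Jani</From>")
```

Here mismatched is False and matched is True, which is the distinction the syntax tree alone does not make.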
I'd like to try out this nice project, but I didn't manage to install it by following the official documentation. The command tree-sitter generate needs to be run from the directories that contain grammar.js, which is not obvious to new users. I haven't yet worked out how to install the parser, even from the cited page.
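A minimal sketch of how the right directories could be located, assuming a local checkout of the repository. The function names grammar_dirs and generate_all are hypothetical, and the only CLI call used is tree-sitter generate, as mentioned above.

```python
import subprocess
from pathlib import Path


def grammar_dirs(checkout: Path) -> list[Path]:
    """List directories under `checkout` that contain a grammar.js file."""
    return sorted(
        p.parent
        for p in checkout.rglob("grammar.js")
        # Skip grammars pulled in as npm dependencies.
        if "node_modules" not in p.relative_to(checkout).parts
    )


def generate_all(checkout: Path) -> None:
    """Run `tree-sitter generate` inside each grammar directory."""
    for directory in grammar_dirs(checkout):
        subprocess.run(["tree-sitter", "generate"], cwd=directory, check=True)
```

Calling generate_all(Path("tree-sitter-xml")) would then run the command once per grammar directory, where the path is whatever the checkout is named locally.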
tree-sitter --version
tree-sitter 0.22.2
Hey Devs,
This might be more of an understanding issue than a bug, because I'm new to the tree-sitter syntax.
I'm using tree-sitter-xml 0.6.1 alongside tree-sitter 0.22.2 in a Rust project, in which I want to parse an XML file like this:
...
<import>
<module name="module1" />
<module name="module2" />
<module name="module3" />
<module name="module4" />
...
</import>
...
My goal is to check whether certain modules are listed, which I try to do with this query:
(element
  (_
    (Name) @name
    (Attribute
      (AttValue) @module_name
    )
  )
  (#eq? @name "module")
  (#any-of? @module_name "module1" "module3" "module4")
)
Apparently this query produces no matches. Without the (#any-of? @module_name "module1" "module3" "module4") predicate, I get all modules.
I expected it to match only "module1", "module3", and "module4".
Thanks in advance.
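One possible explanation, offered as an assumption rather than a confirmed diagnosis: if the AttValue node spans the surrounding quotation marks, the captured text is "module1" with the quotes included, and a predicate that compares the whole node text will not match the bare name. The sketch below mimics that comparison in plain Python without the tree-sitter bindings; any_of is a hypothetical stand-in for the #any-of? predicate.

```python
def any_of(captured: str, *candidates: str) -> bool:
    """Exact whole-text comparison, standing in for #any-of? (hypothetical)."""
    return captured in candidates


# Assumption: the AttValue capture includes the quotes from the source text.
captured = '"module1"'
wanted = ("module1", "module3", "module4")

# Comparing against bare names fails; comparing against quoted names succeeds.
bare_match = any_of(captured, *wanted)
quoted_match = any_of(captured, *(f'"{name}"' for name in wanted))
```

If the assumption holds, writing escaped quotes into the predicate arguments, or using a #match? pattern that tolerates them, would be worth trying.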
see in the description
tree-sitter --version
4e8a9e654f90834be77a80a33264a54434a2ead3
When using longer XML tag names, the parser seems to generate an incorrect tree (a mismatch?).
Note: I am using the C API.
Here's the bad parse tree:
(ERROR)(1:1 - 2:48)
  (prolog)(1:1 - 2:1)
    (XMLDecl)(1:1 - 1:39)
      (<?)(1:1 - 1:3)
      (xml)(1:3 - 1:6)
      (version)(1:7 - 1:14)
      (=)(1:14 - 1:15)
      (")(1:15 - 1:16)
      (VersionNum)(1:16 - 1:19)
      (")(1:19 - 1:20)
      (encoding)(1:21 - 1:29)
      (=)(1:29 - 1:30)
      (")(1:30 - 1:31)
      (EncName)(1:31 - 1:36)
      (")(1:36 - 1:37)
      (?>)(1:37 - 1:39)
  (STag)(2:1 - 2:21)
    (<)(2:1 - 2:2)
    (Name)(2:2 - 2:20)
    (>)(2:20 - 2:21)
  (content)(2:21 - 2:27)
    (CharData)(2:21 - 2:27)
  (</)(2:27 - 2:29)
  (Name)(2:29 - 2:47)
  (>)(2:47 - 2:48)
Here's the expected parse tree:
(document)(1:1 - 2:44)
  (prolog)(1:1 - 2:1)
    (XMLDecl)(1:1 - 1:39)
      (<?)(1:1 - 1:3)
      (xml)(1:3 - 1:6)
      (version)(1:7 - 1:14)
      (=)(1:14 - 1:15)
      (")(1:15 - 1:16)
      (VersionNum)(1:16 - 1:19)
      (")(1:19 - 1:20)
      (encoding)(1:21 - 1:29)
      (=)(1:29 - 1:30)
      (")(1:30 - 1:31)
      (EncName)(1:31 - 1:36)
      (")(1:36 - 1:37)
      (?>)(1:37 - 1:39)
  (root: element)(2:1 - 2:44)
    (STag)(2:1 - 2:19)
      (<)(2:1 - 2:2)
      (Name)(2:2 - 2:18)
      (>)(2:18 - 2:19)
    (content)(2:19 - 2:25)
      (CharData)(2:19 - 2:25)
    (ETag)(2:25 - 2:44)
      (</)(2:25 - 2:27)
      (Name)(2:27 - 2:43)
      (>)(2:43 - 2:44)
<?xml version="1.1" encoding="UTF-8"?>
<exampleofaverylong>foobar</exampleofaverylong>
tree-sitter --version
tree-sitter 0.22.6 (b40f342067a89cd6331bf4c27407588320f3c263)
Parsing the following document
<test>]</test>
results in the following tree:
(ERROR [0, 0] - [1, 0]
  (STag [0, 0] - [0, 6]
    (Name [0, 1] - [0, 5]))
  (content [0, 6] - [1, 0]
    (CharData [0, 6] - [1, 0])))
Probably a false positive happening in the scanner when looking for CDATA delimiters.
echo '<test>]</test>' > test.xml
tree-sitter parse test.xml
Expected tree:
(document [0, 0] - [1, 0]
  root: (element [0, 0] - [0, 14]
    (STag [0, 0] - [0, 6]
      (Name [0, 1] - [0, 5]))
    (content [0, 6] - [0, 7]
      (CharData [0, 6] - [0, 7]))
    (ETag [0, 7] - [0, 14]
      (Name [0, 9] - [0, 13]))))
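As a cross-check that the input itself is well-formed, a conforming XML parser accepts it. This sketch uses Python's standard xml.etree.ElementTree, not tree-sitter, so it only shows that a lone "]" is legal character data, not where the scanner goes wrong.

```python
import xml.etree.ElementTree as ET

# Only the sequence "]]>" is disallowed in character data; a single "]" is fine.
root = ET.fromstring("<test>]</test>")
tag, text = root.tag, root.text
```

The parse succeeds with tag equal to "test" and text equal to "]", which supports reading the ERROR node as a false positive.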
Parameter-entity (PE) references are not parsed in all the contexts where they are allowed.
See https://github.com/tree-sitter-grammars/tree-sitter-xml/actions/runs/7993455735#summary-21829321669
Most DTD nodes should be replaceable with PE references.
I'm getting a compile error for nvim-treesitter[xml] in the scanner.c function 'string push': the compiler sees an implicit declaration of the function "max". There is a macro in the file that should define it on Windows, but it doesn't seem to be doing so.
I expect nvim-treesitter to compile, but it doesn't.
I am trying to submit an extension for Zed, but the build process fails with the error:
Error: Failed to instantiate wasm module: language version 12 is too old for wasm
The Wasm build should compile correctly.
issue: zed-industries/extensions#590 (comment)
proposed Zed extension: https://github.com/sweetppro/zed-xml
I'm trying to add XML support to the Emacs world, but I cannot since your repo isn't a standard tree-sitter-[language] kind of repository. I think it would be nice if you could split this into two repos so you have one language per repo.
Thanks!