Comments (7)
Even worse, it stops minifying after the > token, meaning it erroneously treats the first > as the end of the tag.
$ cat test.html
<tag attr="var > 1" foo bar >
$ cat test.min.html
<tag attr="var> 1" foo bar >%
from htmlclean.
Hi @gilbsgilbs, thank you for the comment.
That is correct behavior for htmlclean because that HTML code is definitely wrong. htmlclean assumes valid HTML code.
Also, some parsers (such as web browsers) may misinterpret invalid HTML code.
That is, you should replace that > with &gt; regardless of htmlclean.
See: https://github.com/anseki/htmlclean#note
htmlclean is not a validator; it does not check that the code is valid.
htmlclean should be simple, lightweight, and small.
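The escaping suggested above can be sketched with Python's standard library. This is only an illustration of the idea, not something htmlclean does for you; the variable names are made up for the example.

```python
import html

# Attribute value as the author wants the browser to see it.
raw_value = "var > 1"

# html.escape replaces &, < and > (and quotes when quote=True) with entities.
escaped = html.escape(raw_value, quote=True)

# The tag no longer contains a bare ">" inside the attribute value.
tag = f'<tag attr="{escaped}" foo bar>'
print(tag)

# An HTML parser decodes the entity again, so the value is unchanged.
assert html.unescape(escaped) == raw_value
```

With the value escaped this way, a tool that looks for the first > to find the end of the tag finds the real one.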
@anseki Thanks for your answer. You are right. It's not valid HTML, I totally understand that. However, it's very common to write this kind of thing in Angular 1, and pretty much all browsers don't consider this a closing token. I can't imagine having to write ng-if statements with a &gt; b, and I'm not even sure it would work.
That being said, I reckon htmlclean should at least raise a warning (because minification obviously failed in this case) or be a bit more permissive, and it should avoid altering attribute values at all costs anyway. If not, the "safe" keyword should definitely be removed from the readme; it's just as unsafe as htmlmin.
Thank you for your proposal.
That "safe" means that htmlclean never changes the structure of the document.
See: https://github.com/anseki/htmlclean#note
htmlclean assumes valid HTML code, and it doesn't understand HTML at all.
Checking whether HTML code is valid requires an HTML parser. I think that htmlclean should not do that because other tools already do.
See: https://github.com/anseki/htmlclean#see-also
You can use the protect or unprotect option to control which code is changed.
https://github.com/anseki/htmlclean#protect
https://github.com/anseki/htmlclean#unprotect
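As a rough illustration of what protecting a region from a minifier means, here is a minimal sketch. It is not htmlclean's implementation or API: `minify_with_protect`, `collapse_spaces`, and the `protect` pattern are hypothetical names for this example, and the "minifier" only collapses whitespace.

```python
import re

def minify_with_protect(source, protect, minify):
    """Hide every match of `protect`, run `minify`, then restore the matches."""
    saved = []

    def stash(match):
        # Remember the protected text and leave a NUL-delimited placeholder.
        saved.append(match.group(0))
        return f"\x00{len(saved) - 1}\x00"

    masked = protect.sub(stash, source)
    minified = minify(masked)
    # Put the protected text back exactly as it was.
    return re.sub(r"\x00(\d+)\x00", lambda m: saved[int(m.group(1))], minified)

def collapse_spaces(text):
    # Stand-in minifier (assumption): collapse whitespace runs to one space.
    return re.sub(r"\s+", " ", text)

# Protect double-quoted strings so attribute values are never touched.
protect = re.compile(r'"[^"]*"')

source = '<tag   attr="a   >   b"   foo>'
result = minify_with_protect(source, protect, collapse_spaces)
print(result)
```

The whitespace outside the quoted value is collapsed, while the value itself, including its > character, is returned untouched.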
Thanks. After a quick check in HTML5 spec it appears that we were both wrong: this is not invalid HTML.
8.2.4.38 Attribute value (double-quoted) state
Consume the next input character:
U+0022 QUOTATION MARK (")
    Switch to the after attribute value (quoted) state.
U+0026 AMPERSAND (&)
    Switch to the character reference in attribute value state, with the additional allowed character being U+0022 QUOTATION MARK (").
U+0000 NULL
    Parse error. Append a U+FFFD REPLACEMENT CHARACTER character to the current attribute's value.
EOF
    Parse error. Switch to the data state. Reconsume the EOF character.
Anything else
    Append the current input character to the current attribute's value.
https://www.w3.org/TR/html5/single-page.html#attribute-value-(double-quoted)-state
This means that the only characters that have a special meaning in a double-quoted HTML attribute value are:
- U+0022 QUOTATION MARK (")
- U+0026 AMPERSAND (&) (only if it is ambiguous I guess)
- U+0000 NULL => Parse error
- EOF => Parse error
Anything else is valid and considered as the attribute value.
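The difference between stopping at the first > and honoring quoted attribute values can be sketched as follows. This is a simplified illustration, not the full HTML5 tokenizer and not htmlclean's code: both function names are hypothetical, and the sketch treats any quote inside a tag as opening a value, whereas the spec only does so after an equals sign.

```python
def naive_tag_end(source: str, start: int = 0) -> int:
    """Return the index just past the first '>' found, ignoring quotes."""
    return source.index(">", start) + 1

def quote_aware_tag_end(source: str, start: int = 0) -> int:
    """Return the index just past the '>' that closes the tag.

    A '>' inside a single- or double-quoted attribute value is treated as
    ordinary text, matching the "anything else" rule quoted above.
    """
    quote = None  # the quote character currently open, if any
    for i in range(start, len(source)):
        ch = source[i]
        if quote is not None:
            if ch == quote:
                quote = None  # closing quote ends the attribute value
        elif ch in ('"', "'"):
            quote = ch  # opening quote starts an attribute value
        elif ch == ">":
            return i + 1
    raise ValueError("unterminated tag")

sample = '<tag attr="var > 1" foo bar >'
print(sample[:naive_tag_end(sample)])        # cut short inside the attribute
print(sample[:quote_aware_tag_end(sample)])  # the whole tag
```

The naive scan reproduces the behavior reported at the top of this thread, where the tag is considered finished at the > inside attr.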
Thank you for the important information.
I will update htmlclean to support that spec in a future version.
Anyhow, we had better escape those characters for HTML parsers.
I updated the code.
Please try the new version.
E.g.
INPUT:
A B C <element attr1 = " value 'value' < > &lt; &gt; &quot; " attr2 = ' value "value" < > &lt; &gt; &quot; ' attrNoValue > D E F
OUTPUT:
A B C<element attr1=" value 'value' < > &lt; &gt; &quot; " attr2=' value "value" < > &lt; &gt; &quot; ' attrNoValue> D E F