Encode.forJavaScript() (owasp-java-encoder issue, closed)

owasp commented on May 29, 2024
Encode.forJavaScript()

from owasp-java-encoder.

Comments (7)

ilatypov commented on May 29, 2024

Sorry for the spam above, but I agree with the original post by Nathan and with the analysis by @pthorson.

https://github.com/OWASP/owasp-java-encoder/blob/789a380/core/src/main/java/org/owasp/encoder/JavaScriptEncoder.java#L128

Encoding a raw - as raw \- produces a string literal that is valid ECMAScript but invalid JSON. The JSON string grammar (RFC 8259, section 7) permits only the following escapes:

string = quotation-mark *char quotation-mark

      char = unescaped /
          escape (
              %x22 /          ; "    quotation mark  U+0022
              %x5C /          ; \    reverse solidus U+005C
              %x2F /          ; /    solidus         U+002F
              %x62 /          ; b    backspace       U+0008
              %x66 /          ; f    form feed       U+000C
              %x6E /          ; n    line feed       U+000A
              %x72 /          ; r    carriage return U+000D
              %x74 /          ; t    tab             U+0009
              %x75 4HEXDIG )  ; uXXXX                U+XXXX

      escape = %x5C              ; \

      quotation-mark = %x22      ; "

      unescaped = %x20-21 / %x23-2E / %x30-5B / %x5D-10FFFF

https://www.rfc-editor.org/errata_search.php?rfc=8259

ECMAScript 2015, by contrast, lets a backslash precede any NonEscapeCharacter, which includes raw -:

CharacterEscapeSequence ::
    SingleEscapeCharacter
    NonEscapeCharacter

SingleEscapeCharacter :: one of
    ' " \ b f n r t v

NonEscapeCharacter ::
    SourceCharacter but not one of EscapeCharacter or LineTerminator

https://www.ecma-international.org/ecma-262/6.0/#sec-literals-string-literals
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String#Escape_notation
https://www.json.org/
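
The difference between the two grammars can be checked mechanically. The following is a minimal sketch, not code from the library: the class name EscapeCheck and the method hasOnlyJsonEscapes are hypothetical, and the check covers only backslash escapes (it does not reject unescaped control characters or quotes).

```java
final class EscapeCheck {
    // Characters that may follow a backslash in a JSON string (RFC 8259, section 7).
    private static final String JSON_ESCAPES = "\"\\/bfnrtu";
    private static final String HEX_DIGITS = "0123456789abcdefABCDEF";

    /** Returns true if every backslash escape in the literal body is legal JSON. */
    static boolean hasOnlyJsonEscapes(String body) {
        for (int i = 0; i < body.length(); i++) {
            if (body.charAt(i) != '\\') {
                continue; // ordinary character, nothing to check here
            }
            if (++i >= body.length()) {
                return false; // dangling backslash at the end
            }
            char c = body.charAt(i);
            if (JSON_ESCAPES.indexOf(c) < 0) {
                return false; // e.g. backslash-hyphen: fine in ECMAScript, not in JSON
            }
            if (c == 'u') {
                if (i + 4 >= body.length()) {
                    return false; // fewer than four characters left after the u
                }
                for (int k = 1; k <= 4; k++) {
                    if (HEX_DIGITS.indexOf(body.charAt(i + k)) < 0) {
                        return false; // not a hex digit
                    }
                }
                i += 4; // skip the four hex digits
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(hasOnlyJsonEscapes("a\\-b")); // body: a\-b
        System.out.println(hasOnlyJsonEscapes("a\\/b")); // body: a\/b
    }
}
```

Under this check the body a\-b is rejected while a\/b is accepted, which matches the reading of the two grammars above.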

There is little need for the "mode" parameter in the JavaScript string encoder. It complicates usage unnecessarily, because output encoded for the worst case still parses correctly in the simplest use case.

In addition, the existing code misses some attack surfaces:

https://security.stackexchange.com/questions/11091/is-this-json-encoding-vulnerable-to-cdata-injection/11097#11097

This suggests encoding the single characters listed below as raw \uHHHH, regardless of "mode" (and adding another method signature without the parameter, deprecating the current method). This adds to the existing encodings, such as the double quote raw " as raw \", but removes raw \-, which is ECMAScript-compatible but not valid JSON. The encoding of the forward slash raw / as raw \/ is compatible with both ECMAScript and JSON. To sum up, these characters need replacing on output:

  • the double quote raw " with raw \", against escaping a JavaScript string surrounded by double quotes,
  • the single quote raw ' with raw \u0027, against escaping a JavaScript string surrounded by single quotes,
  • the forward slash raw / with raw \/, against constructing a closing script tag in a JavaScript string,
  • the opening angle bracket raw < with raw \u003c, against closing the script tag or opening a comment tag,
  • the closing angle bracket raw > with raw \u003e, against closing a possible CDATA wrapper around the JavaScript text embedded in XHTML (and HTML?),
  • U+2028 and U+2029 with raw \u2028 and raw \u2029, because they are allowed unescaped in JSON strings but not in JavaScript string literals (before ES2019), against breaking the JavaScript parser,
  • the ampersand raw & with raw \u0026, against the mandatory entity decoding in XHTML documents that embed scripts without a CDATA wrapper. (HTML documents do not apply entity decoding to embedded scripts, but the suggested string literal encoding does not change the string's value.)
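
The list above can be sketched as a single-pass encoder. This is an illustration under assumptions, not the library's implementation: the class name LiteralEncoderSketch and the method encode are hypothetical, and the handling of the backslash and of control characters is assumed to be among the "existing encodings" mentioned earlier.

```java
final class LiteralEncoderSketch {
    /** Encodes a value for use inside a quoted JavaScript or JSON string literal. */
    static String encode(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                // Short escapes that are valid in both ECMAScript and JSON.
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '/':  out.append("\\/");  break;
                case '\b': out.append("\\b");  break;
                case '\f': out.append("\\f");  break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                // Characters from the list that get a four-digit hex escape.
                case '\'':
                case '<':
                case '>':
                case '&':
                case '\u2028':
                case '\u2029':
                    appendHexEscape(out, c);
                    break;
                default:
                    if (c < 0x20) {
                        appendHexEscape(out, c); // remaining control characters
                    } else {
                        out.append(c); // note: the hyphen passes through unchanged
                    }
            }
        }
        return out.toString();
    }

    // Appends a backslash, the letter u and four lowercase hex digits.
    private static void appendHexEscape(StringBuilder out, char c) {
        out.append(String.format("\\u%04x", (int) c));
    }

    public static void main(String[] args) {
        System.out.println(encode("</script> it's a-b"));
    }
}
```

Because every character is inspected exactly once, no later pass can misread an escape produced by an earlier one, and there is no "mode" to choose.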

I understand that these layers look secondary to JSON or JavaScript string literal encoding per se, but leaving the protection of the string literal against these layers to the caller would require re-implementing the encoder for each use case. Applying additional encoding afterwards with simple .replace() chaining destroys the context of characters that are already protected by a backslash. Luckily, the suggested "preliminary optimization" of using the raw \uHHHH encoding is both accepted by the lower-level interpreter (the JavaScript engine) and a defense against the additional interpreters on top of it.
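
One way to read the .replace() remark is shown below. This is a hedged sketch: the class ReplaceChainingDemo and both methods are hypothetical, and decodeSimple understands only single-character escapes such as \\ and \/, which is enough for the demonstration.

```java
final class ReplaceChainingDemo {
    // Naive second pass: escape every slash in text that is already a literal body.
    static String naiveSlashEscape(String literalBody) {
        return literalBody.replace("/", "\\/");
    }

    // Minimal decoder: a backslash makes the next character stand for itself.
    static String decodeSimple(String literalBody) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < literalBody.length(); i++) {
            char c = literalBody.charAt(i);
            if (c == '\\' && i + 1 < literalBody.length()) {
                c = literalBody.charAt(++i); // take the escaped character literally
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String once = "a\\/b";                 // literal body a\/b, value a/b
        String twice = naiveSlashEscape(once); // the slash gets a second backslash
        System.out.println(decodeSimple(once));
        System.out.println(decodeSimple(twice));
    }
}
```

The second pass cannot tell that the slash was already escaped, so the new backslash pairs with the old one and the decoded value gains a stray backslash. A single pass that emits the final escape for each input character avoids this.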

pthorson commented on May 29, 2024

Perhaps the application of this rule?

if (mode == Mode.BLOCK || mode == Mode.HTML) {
        // in <script> blocks, we need to prevent the browser from seeing
        // "</anything>" and "<!--". To do so we escape "/" as "\/" and
        // escape "-" as "\-". 

ndp-opendap commented on May 29, 2024

I think so.

jmanico commented on May 29, 2024

This seems like a good argument. Would you care to give us a PR? I'll also ask Jeff to take a look at this.

jmanico commented on May 29, 2024

PS: The earlier spam was accidental and I deleted it. I'm back to updating this project.

jmanico commented on May 29, 2024

After speaking to Jeff on this, we have the following response and wish to close the issue and open a new feature request.

Summary: encodeForJavaScript is not meant to be safe for JSON, but we do plan to add a new JSON encode function. In the meantime, consider https://github.com/yahoo/serialize-javascript type logic to safely embed JSON on a page.
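
In the spirit of that suggestion, already-serialized JSON can be post-processed before it is placed in a script block. The sketch below is not a port of serialize-javascript: the class JsonForScriptSketch and the method escapeForScript are hypothetical, the input is assumed to be valid JSON, and the character set follows the list earlier in this thread. The forward slash is deliberately left alone, because valid JSON may already contain \/ and a blind replacement would change its meaning.

```java
final class JsonForScriptSketch {
    /**
     * Rewrites characters that are risky inside a script block as hex escapes.
     * In valid JSON these characters can occur only inside string literals,
     * and never directly after an unescaped backslash, so the text stays valid.
     */
    static String escapeForScript(String json) {
        StringBuilder out = new StringBuilder(json.length() + 16);
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            switch (c) {
                case '<':
                case '>':
                case '&':
                case '\u2028':
                case '\u2029':
                    // backslash, the letter u, four lowercase hex digits
                    out.append(String.format("\\u%04x", (int) c));
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeForScript("{\"a\":\"</script>\"}"));
    }
}
```

With the opening angle bracket escaped, neither a closing script tag nor an HTML comment opener can appear in the output, and the parsed value of every string is unchanged.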
