First of all, please forgive me for posting this here. It is really a shortcoming in the KML specification or in the Google Earth implementation. However, I am not familiar with Google's issue tracking system, and the following could be a useful enhancement for owasp-java-encoder, too.
What's the problem?
I just realized that the KML files I generate are vulnerable to injection despite (in my opinion correct) use of HTML encoding powered by owasp-java-encoder. OWASP's Encode.forHtml/forXml works correctly: the input <strike>oops</strike> gets encoded to &lt;strike&gt;oops&lt;/strike&gt;. I do not want my users to inject HTML.
However, the KML specification is unclear about the encoding of angle brackets (please see the KML Reference Errata), and the software consuming the KML (Google Earth) will render the encoded value as HTML.
How can this problem be solved?
According to the errata page mentioned above, numeric character references have to be used instead of entity references, i.e. < and > must be encoded as &#60; and &#62; instead of &lt; and &gt;. From looking at OWASP's HTMLEncoder/XMLEncoder classes, I cannot find an option to control the encoding behavior. Do you think this would be a reasonable enhancement? Alternatively, do you think introducing a special KMLEncoder would be reasonable?
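To illustrate what I have in mind, here is a minimal sketch in plain Java. The class name KmlEncoder and the method forKml are hypothetical and not part of owasp-java-encoder. The sketch only replaces the five markup-significant characters with decimal numeric character references and does none of the other work a full encoder would do (for example, handling characters that are invalid in XML).

```java
/**
 * Hypothetical helper, NOT part of owasp-java-encoder.
 * Sketch of an encoder that emits numeric character references
 * (e.g. &#60;) instead of named entity references (e.g. &lt;).
 */
final class KmlEncoder {

    private KmlEncoder() {
        // static utility, no instances
    }

    /** Returns the input with <, >, &, " and ' replaced by decimal numeric character references. */
    static String forKml(String input) {
        if (input == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '<':
                case '>':
                case '&':
                case '"':
                case '\'':
                    // Append the character's code point as a decimal reference, e.g. '<' becomes &#60;
                    out.append("&#").append((int) c).append(';');
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this sketch, KmlEncoder.forKml("<strike>oops</strike>") returns &#60;strike&#62;oops&#60;/strike&#62;. Because every > is replaced, the result also never contains the sequence ]]> and can be placed inside a CDATA section.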
How to reproduce?
- Download Google Earth, start it and then load a KML file with this content:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<kml xmlns="http://www.opengis.net/kml/2.2" xmlns:gx="http://www.google.com/kml/ext/2.2">
<Document>
<name><![CDATA[HTML encoding]]></name>
<Style id="myStyle">
<BalloonStyle>
<text>
<![CDATA[
Text should be bold only: <b>$[text]</b>
]]>
</text>
</BalloonStyle>
</Style>
<Folder>
<name>Placemarks</name>
<Placemark>
<name><![CDATA[insufficient escaping]]></name>
<address><![CDATA[1200-C Agora Drive, Bel Air, MD 21014, US]]></address>
<styleUrl><![CDATA[#myStyle]]></styleUrl>
<ExtendedData>
<Data name="text">
<value><![CDATA[&lt;strike&gt;oops&lt;/strike&gt;]]></value>
</Data>
</ExtendedData>
</Placemark>
<Placemark>
<name><![CDATA[sufficient escaping]]></name>
<address><![CDATA[Leinstraat 104A, B-9660 Opbrakel, Belgium]]></address>
<styleUrl><![CDATA[#myStyle]]></styleUrl>
<ExtendedData>
<Data name="text">
<value><![CDATA[&#60;strike&#62;fine&#60;/strike&#62;]]></value>
</Data>
</ExtendedData>
</Placemark>
</Folder>
</Document>
</kml>
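- Open the placemark called insufficient escaping and see that the word oops is shown with a strikethrough, i.e. the markup was interpreted as HTML.
- Open the placemark called sufficient escaping and see that everything is working correctly: the markup is shown as text and is not interpreted.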
Please note that this issue cannot be reproduced when uploading the KML file to the web-based Google Earth. Things may have changed in the online version. However, the desktop version is still maintained (last build: February 6, 2018).