
rarecoil / unwebpack-sourcemap


Extract uncompiled, uncompressed SPA code from Webpack source maps.

Home Page: https://medium.com/@rarecoil/spa-source-code-recovery-by-un-webpacking-source-maps-ef830fc2351d

License: MIT License

JavaScript 16.97% TypeScript 10.59% Python 65.97% SCSS 0.86% EJS 5.61%
webpack4 sourcemaps information-disclosure security-tools security-tool spa

unwebpack-sourcemap's Introduction

unwebpack-sourcemap

Archive Notice (April 15 2022)

This script seems to be helpful for many, but unfortunately I also do not have time to maintain it and properly code review the work of potential contributors. I'll leave it in an archived state for a while for anyone that wants to fork it, but I will eventually delete this repository.

Recover uncompiled TypeScript sources, JSX, and more from Webpack sourcemaps.

As single-page applications take over the world, more and more is being asked of the browser as a client. It is common for SPAs to use Webpack to handle browser script build processes. Usually, Webpack will transpile React/Vue/TypeScript/etc. to JavaScript, minify/compress it, and then serve it as a single bundle to the application.

However, Webpack also produces JavaScript source maps to assist in the debugging and development process; when things go wrong, the browser's debugger can use the SourceMap to point to a line in the code that contains the issue at hand. Most developers do not adequately protect the source maps and ship them to production environments.

When the browser was simply handling an array of JavaScript files concatenated and (maybe) packed, this wasn't so much of an issue. However, developers of SPAs assume the use of JavaScript as an intermediate representation. Developers often expect production to contain obfuscated and/or otherwise-processed scripts, and do not understand just what the sourcemaps contain in many cases. This model aligns closely with shipping binaries: source is compiled and you ship the interpretable version. If this is the case, the sourcemap is akin to leaking your source alongside the "binary" (bundle) you have made. The bundle can be reverse engineered just as a binary can, but sourcemaps make this far easier.
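To make concrete what a sourcemap can carry, here is a minimal sketch. It assumes a Source Map v3 document whose `sources` array lists original file paths and whose optional `sourcesContent` array embeds the matching original text. The function name `list_embedded_sources` and the sample map are hypothetical and are not taken from the script itself.

```python
import json


def list_embedded_sources(sourcemap_text):
    """Return {source_path: original_source} for entries that embed content."""
    sourcemap = json.loads(sourcemap_text)
    sources = sourcemap.get("sources", [])
    contents = sourcemap.get("sourcesContent") or []
    recovered = {}
    for path, content in zip(sources, contents):
        # sourcesContent entries may be null when the content is not embedded.
        if content is not None:
            recovered[path] = content
    return recovered


# A tiny hand-written map for illustration (not taken from a real build).
example_map = json.dumps({
    "version": 3,
    "sources": ["webpack:///./src/App.tsx", "webpack:///./src/secret.ts"],
    "sourcesContent": ["export const App = () => null;", None],
    "mappings": "",
})

for path, content in list_embedded_sources(example_map).items():
    print(path, "->", content)
```

When `sourcesContent` is populated, anyone who can download the map can read the original files without reverse engineering the bundle at all.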

Usage

The script requires Python3, BeautifulSoup4 and requests. Install dependencies with pip3 install -r requirements.txt. The script can handle downloaded sourcemaps, or attempt to fetch and parse them from remote sources for you. The examples below assume that you have created a directory called output alongside the script:

$ mkdir output

In order of increasing noisiness, to unpack a local sourcemap:

$ ./unwebpack_sourcemap.py --local /path/to/source.map output

To unpack a remote sourcemap:

$ ./unwebpack_sourcemap.py https://pathto.example.com/source.map output

To attempt to read all <script src> on an HTML page, fetch JS assets, look for sourceMappingURL, and pull sourcemaps from remote sources:

$ ./unwebpack_sourcemap.py --detect https://pathto.example.com/spa_root/ output
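The detection step relies on the comment that bundlers append to a script to point at its map. The following is a rough sketch of finding that comment in already-downloaded JavaScript text. The regular expression and the function name `find_sourcemap_reference` are assumptions made for illustration and may differ from what the script does internally.

```python
import re

# Matches "//# sourceMappingURL=..." and the legacy "//@ sourceMappingURL=..." form.
SOURCEMAP_COMMENT = re.compile(
    r"^\s*//[#@]\s*sourceMappingURL=(\S+)\s*$", re.MULTILINE
)


def find_sourcemap_reference(js_text):
    """Return the last sourceMappingURL value in a script, or None if absent."""
    matches = SOURCEMAP_COMMENT.findall(js_text)
    return matches[-1] if matches else None


bundle = "console.log('hi');\n//# sourceMappingURL=main.js.map\n"
print(find_sourcemap_reference(bundle))
print(find_sourcemap_reference("console.log(1)"))
```

The returned value is often a relative path such as main.js.map, so it still has to be resolved against the URL of the script that contained it before it can be fetched.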

I'm a developer and this scares me. What do?

You have a few options:

  1. Turn off sourcemaps in production entirely.
  2. Push sourcemaps to a private server, and ACL sourcemap URIs to developers only.
  3. Load sourcemaps from local sources only and do not push them to production.
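For options 1 and 3, one way to catch mistakes is a check that runs before deployment and flags anything map-related in the build output. This is only a sketch under assumptions: the function name `find_sourcemap_leaks` is hypothetical, and it treats any .map file or any .js file containing a sourceMappingURL comment as a finding, which is too strict if you deliberately point at a private server as in option 2.

```python
import os
import tempfile


def find_sourcemap_leaks(build_dir):
    """Return relative paths of .map files and scripts that reference a map."""
    leaks = []
    for root, _dirs, files in os.walk(build_dir):
        for name in files:
            path = os.path.join(root, name)
            rel = os.path.relpath(path, build_dir)
            if name.endswith(".map"):
                leaks.append(rel)
            elif name.endswith(".js"):
                with open(path, encoding="utf-8", errors="replace") as handle:
                    if "sourceMappingURL=" in handle.read():
                        leaks.append(rel)
    return sorted(leaks)


# Demonstration against a throwaway directory standing in for a build output.
with tempfile.TemporaryDirectory() as build_dir:
    with open(os.path.join(build_dir, "main.js"), "w", encoding="utf-8") as f:
        f.write("console.log(1);\n//# sourceMappingURL=main.js.map\n")
    with open(os.path.join(build_dir, "main.js.map"), "w", encoding="utf-8") as f:
        f.write("{}")
    with open(os.path.join(build_dir, "clean.js"), "w", encoding="utf-8") as f:
        f.write("console.log(2);\n")
    demo_leaks = find_sourcemap_leaks(build_dir)

print(demo_leaks)
```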

Example Vulnerable Application

An example TypeScript+React application is included in example-react-ts-app. You can run this locally and run the script against it.

Contributions

This is an alpha-level script built for a series of engagements I was working on in which sourcemaps are disclosed in production environments. It currently is only meant to work with TypeScript+React and TypeScript+Vue templates. Pull requests to harden the script, make it read more sourcemaps, et cetera are greatly appreciated.

License

MIT.

unwebpack-sourcemap's People

Contributors

arthur4ires, dee-see, dependabot[bot], kartiksoneji, ra80533, rarecoil


unwebpack-sourcemap's Issues

Unable to handle urls that start with // and crash when returning False from _parse_sourcemap

Here is a small patch that fixes the two issues.

@@ -91,7 +91,7 @@ class SourceMapExtractor(object):
     def _parse_remote_sourcemap(self, uri):
         """GET a remote sourcemap and parse it."""
         data = self._get_remote_data(uri)
-        if data is not None:
+        if data is not None and data:
             self._parse_sourcemap(data, True)
         else:
             print("WARNING: Could not retrieve sourcemap from URI %s" % uri)
@@ -116,6 +116,8 @@ class SourceMapExtractor(object):
             next_target_uri = ""
             if parsed_uri.scheme != '':
                 next_target_uri = source
+            elif source.startswith("//"):
+                next_target_uri = urlparse(uri).scheme + ":" + source
             else:
                 current_uri = urlparse(uri)
                 built_uri = current_uri.scheme + "://" + current_uri.netloc + source

error in URL parsing

Great tool, but I keep receiving an error when parsing. It seems like a slash is missing.

The initial map is detected, but then it fails, although the jquery.min.js file is at that location.

c:\temp\portal>unwebpack_sourcemap.py --detect --make-directory --disable-ssl-verification https://192.168.1.210:4443 output
C:\Users\gebruiker\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py:1013: InsecureRequestWarning: Unverified HTTPS request is being made to host '192.168.1.210'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings
warnings.warn(
C:\Users\gebruiker\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py:1013: InsecureRequestWarning: Unverified HTTPS request is being made to host '192.168.1.210'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings
warnings.warn(
Detecting sourcemaps in HTML at https://192.168.1.210:4443/
C:\Users\gebruiker\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py:1013: InsecureRequestWarning: Unverified HTTPS request is being made to host '192.168.1.210'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings
warnings.warn(
Detected sourcemap at remote location https://192.168.1.210:4443//vendors.js.map
C:\Users\gebruiker\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py:1013: InsecureRequestWarning: Unverified HTTPS request is being made to host '192.168.1.210'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/1.26.x/advanced-usage.html#ssl-warnings
warnings.warn(
Detected sourcemap at remote location https://192.168.1.210:4443//main.js.map
Traceback (most recent call last):
File "C:\Users\gebruiker\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\models.py", line 382, in prepare_url
scheme, auth, host, port, path, query, fragment = parse_url(url)
File "C:\Users\gebruiker\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\util\url.py", line 394, in parse_url
return six.raise_from(LocationParseError(source_url), None)
File "<string>", line 3, in raise_from
urllib3.exceptions.LocationParseError: Failed to parse: https://192.168.1.210:4443jquery.min.js

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\temp\portal\unwebpack_sourcemap.py", line 359, in <module>
extractor.run()
File "C:\temp\portal\unwebpack_sourcemap.py", line 69, in run
detected_sourcemaps = self._detect_js_sourcemaps(self._target)
File "C:\temp\portal\unwebpack_sourcemap.py", line 128, in _detect_js_sourcemaps
js_data, last_target_uri = self._get_remote_data(next_target_uri)
File "C:\temp\portal\unwebpack_sourcemap.py", line 213, in _get_remote_data
result = requests.get(uri, verify=False)
File "C:\Users\gebruiker\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\gebruiker\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\gebruiker\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\sessions.py", line 528, in request
prep = self.prepare_request(req)
File "C:\Users\gebruiker\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\sessions.py", line 456, in prepare_request
p.prepare(
File "C:\Users\gebruiker\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\models.py", line 316, in prepare
self.prepare_url(url, params)
File "C:\Users\gebruiker\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\models.py", line 384, in prepare_url
raise InvalidURL(*e.args)
requests.exceptions.InvalidURL: Failed to parse: https://192.168.1.210:4443jquery.min.js
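The failing URL in the traceback (https://192.168.1.210:4443jquery.min.js) looks like the result of concatenating scheme, host, and a relative src by hand. A possible fix, sketched here and not the project's actual patch, is to let the standard library's `urllib.parse.urljoin` do the resolution. The helper name `resolve_script_uri` and the cdn.example.com and other.example.com hosts are made up for illustration.

```python
from urllib.parse import urljoin


def resolve_script_uri(page_uri, src):
    """Resolve a <script src> value against the page it was found on."""
    return urljoin(page_uri, src)


page = "https://192.168.1.210:4443"

# Relative path with no leading slash (the case from the traceback).
print(resolve_script_uri(page, "jquery.min.js"))
# Root-relative path.
print(resolve_script_uri(page, "/vendors.js"))
# Protocol-relative URL starting with "//".
print(resolve_script_uri(page, "//cdn.example.com/lib.js"))
# Absolute URL is returned unchanged.
print(resolve_script_uri(page, "https://other.example.com/app.js"))
```

Because `urljoin` also handles src values that start with //, the same call would cover the protocol-relative case reported in the earlier issue.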

Folder names ending with spaces are invalid on Windows

Hi,

Thanks for putting this together, it works great! I have a small tweak required on Windows to the sanitise_filesystem_name function. I found with one sourcemap I tested, the post-sanitized filename ended up with a trailing space, which is not allowed on Windows.

I created a simple local patch by doing this before the return from this method:

valid_filename = valid_filename.strip()

Let me know if you're happy with this and I can provide a patch.

Cheers,
Peter
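A slightly broader version of the tweak described above might look like the sketch below. Windows rejects names that end in a space or a period, so this trims both. The helper name `trim_windows_unsafe_ends` and the "_" fallback are assumptions for illustration; the body of the script's own sanitise_filesystem_name is not shown in this issue, so this is written as a standalone step for a single path component.

```python
def trim_windows_unsafe_ends(name):
    """Trim surrounding whitespace and trailing dots from one path component."""
    trimmed = name.strip().rstrip(". ")
    # Fall back to a placeholder so an all-space or all-dot name is not empty.
    return trimmed or "_"


print(repr(trim_windows_unsafe_ends("components ")))
print(repr(trim_windows_unsafe_ends("archive. ")))
print(repr(trim_windows_unsafe_ends("   ")))
```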

Traceback

Traceback (most recent call last):
File "./unwebpack_sourcemap.py", line 350, in <module>
extractor.run()
File "./unwebpack_sourcemap.py", line 67, in run
self._parse_remote_sourcemap(sourcemap)
File "./unwebpack_sourcemap.py", line 93, in _parse_remote_sourcemap
data = self._get_remote_data(uri)
File "./unwebpack_sourcemap.py", line 217, in _get_remote_data
print("WARNING: Got status code %d for URI %s" % (uri, result.status_code))
TypeError: %d format: a number is required, not str
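The traceback shows the two arguments passed in the opposite order to the %d and %s placeholders. A small sketch of the corrected ordering follows; the function name `format_status_warning` is hypothetical and exists only to make the example testable.

```python
def format_status_warning(status_code, uri):
    """Build the warning with arguments in the same order as the placeholders."""
    return "WARNING: Got status code %d for URI %s" % (status_code, uri)


print(format_status_warning(404, "https://pathto.example.com/source.map"))

# Swapping the arguments, as in the traceback, raises TypeError because
# %d receives the URI string instead of a number.
try:
    "WARNING: Got status code %d for URI %s" % (
        "https://pathto.example.com/source.map",
        404,
    )
except TypeError as exc:
    print("TypeError:", exc)
```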

Script doesn't work

When I use the script, I get this error:

Traceback (most recent call last):
File "./unwebpack_sourcemap.py", line 362, in <module>
extractor.run()
File "./unwebpack_sourcemap.py", line 71, in run
self._parse_remote_sourcemap(self._target)
File "./unwebpack_sourcemap.py", line 95, in _parse_remote_sourcemap
data, final_uri = self._get_remote_data(uri)
File "./unwebpack_sourcemap.py", line 215, in _get_remote_data
if self.disable_verify_ssl == True:
AttributeError: 'SourceMapExtractor' object has no attribute 'disable_verify_ssl'

Receiving a non-200 status code causes a TypeError

TypeError: the JSON object must be str, bytes or bytearray, not NoneType

if result.status_code == 200:
    return result.text
else:
    print("WARNING: Got status code %d for URI %s" % (result.status_code, uri))
    return False

data = self._get_remote_data(uri)
if data is not None:
    self._parse_sourcemap(data, True)
else:
    print("WARNING: Could not retrieve sourcemap from URI %s" % uri)

Line 218 should be changed to return None rather than False.
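The point of the suggestion is that the fetch helper and its caller need to agree on one "no data" value. Here is a minimal sketch of that contract, assuming None is the sentinel. The names `body_or_none` and `describe_result` are hypothetical, and `SimpleNamespace` objects stand in for a requests response so the example runs without network access.

```python
from types import SimpleNamespace


def body_or_none(response):
    """Return the response text on HTTP 200, otherwise None."""
    if response.status_code == 200:
        return response.text
    return None


def describe_result(response, uri):
    """Mirror the caller's check: only parse when data is not None."""
    data = body_or_none(response)
    if data is not None:
        return "parse %d chars" % len(data)
    return "WARNING: Could not retrieve sourcemap from URI %s" % uri


# SimpleNamespace objects stand in for requests.Response here.
ok = SimpleNamespace(status_code=200, text='{"version": 3}')
missing = SimpleNamespace(status_code=404, text="Not Found")

print(describe_result(ok, "https://pathto.example.com/source.map"))
print(describe_result(missing, "https://pathto.example.com/source.map"))
```

With False as the failure value, the `is not None` check passes and the failure value is handed to the parser; returning None keeps the warning branch reachable.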
