Comments (8)
Hey there! That's interesting. I don't think the Network panel errors are a huge problem, as long as the assets load. Do they appear to be missing or broken in the saved HTML file when the browser renders the page?
from monolith.
@snshn Yes, they don't load, and that is what the Network panel reflects. The CSS file in question was several megabytes, so the data URL didn't work. The page then looked wrong because it was missing its CSS.
Uh-oh, that's a real boo-boo. Thank you for letting me know. I've previously done some research on minimizing image size, and I believe that could help. What browser was it in, if I may ask?
@snshn Modern Chrome 😅 (120, I think). If the data URL is too long, you can embed the string in JavaScript and create a blob URL from it instead, e.g.:
img.src = URL.createObjectURL(new Blob([data here]))
The blob URL would then be very short, though this does mess with the initial load.
For CSS and JS you could turn <link rel="stylesheet" href="https://example.com/style.css">
into <style>[embedded css here]</style>
and likewise turn script[src]
into an inline <script>
(be careful with attributes like defer, though).
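To make the link/script inlining idea concrete, here is a rough sketch. The function name `inlineAssets` and the `assets` map (URL to already-downloaded text) are made up for illustration, and this is not how Monolith works internally. It uses regexes for brevity, where a real tool would want a proper HTML parser, and it simply leaves defer/async scripts alone because inlining them would change when they run.

```javascript
// Hypothetical helper: swaps external stylesheet and script references for
// inline <style>/<script> elements. `assets` maps a URL to its text content.
function inlineAssets(html, assets) {
  // Reads a quoted attribute value out of a tag, or returns null.
  const attr = (tag, name) => {
    const m = new RegExp(`\\b${name}\\s*=\\s*["']([^"']*)["']`, "i").exec(tag);
    return m ? m[1] : null;
  };
  const has = (key) => Object.prototype.hasOwnProperty.call(assets, key);

  // <link rel="stylesheet" href="..."> becomes <style>...</style>
  const withStyles = html.replace(/<link\b[^>]*>/gi, (tag) => {
    const rel = (attr(tag, "rel") || "").toLowerCase();
    const href = attr(tag, "href");
    if (rel !== "stylesheet" || href === null || !has(href)) return tag;
    // Escape "</style" so the CSS text cannot close the element early.
    const css = assets[href].replace(/<\/style/gi, "<\\/style");
    return `<style>${css}</style>`;
  });

  // <script src="..."></script> becomes <script>...</script>
  return withStyles.replace(/<script\b([^>]*)><\/script>/gi, (tag, attrs) => {
    const src = attr(tag, "src");
    // Drop quoted values before looking for the bare defer/async attributes.
    const bare = attrs.replace(/(["']).*?\1/g, "");
    if (src === null || !has(src) || /\b(defer|async)\b/i.test(bare)) return tag;
    // Escape "</script" so the code cannot close the element early.
    const js = assets[src].replace(/<\/script/gi, "<\\/script");
    return `<script>${js}</script>`;
  });
}
```

Anything not found in `assets` is left untouched, so a partially downloaded page still renders from its original URLs.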
Ideally JS should be unobtrusive: many people disable it by default (e.g. with the NoScript extension), and modifying the original document too much could cause other problems (aside from bloating Monolith's code base). Still, you're suggesting very good solutions, thank you. I'll look into MHTML more; it could be a good format for solving these types of issues. I agree that it's a big problem. I'm honestly a bit shocked there's a limit on the length in the browser. I assumed it would behave like file size: as long as the machine has enough memory, it should work.
@snshn Your project seems a lot like SingleFile; you could see how they do it.
If it's been solved in SingleFile, it's likely done the way you've described it: using JS to build blobs and unwrapping assets, embedding them into the HTML. I'd rather stay away from relying on JS, but if there's no better alternative, that approach is better than nothing. It still looks like a browser issue more than the tool's issue, as there's no length limit in the spec; that's just something browsers impose. I will eventually make Monolith reduce the file size of embedded assets by not using base64 when it isn't necessary; that's coming in future versions. I'll see how MHTML behaves and what could be done with large files (e.g. splitting one .js file into multiple consecutively included .js files, and doing the same with .css assets). There's always a way.
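As a rough illustration of the "no base64 when not necessary" idea, the sketch below builds both a base64 and a percent-encoded data URL for a text asset and keeps the shorter one. The function name `shortestDataUrl` is made up, `Buffer` assumes Node.js, and this is only one possible reading of the plan, not what Monolith does.

```javascript
// Hypothetical helper: returns the shorter of the two data URL encodings
// for a text asset. Assumes Node.js (Buffer) and UTF-8 text input.
function shortestDataUrl(text, mimeType) {
  // Base64 grows the payload by about a third regardless of content.
  const payload = Buffer.from(text, "utf8").toString("base64");
  const base64 = `data:${mimeType};base64,${payload}`;
  // Percent-encoding costs nothing for plain letters and digits, but
  // triples every escaped character (spaces, braces, colons, ...).
  const percent = `data:${mimeType},${encodeURIComponent(text)}`;
  return percent.length <= base64.length ? percent : base64;
}
```

Which form wins depends on the content: mostly alphanumeric text tends to be shorter percent-encoded, while text dense with punctuation can come out shorter as base64. Either way this only trims the URL; it does not remove a browser's length limit.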
@snshn For all styles and scripts you can simply inline them using script and style tags instead of link/script[src]. This should solve the issue most of the time. For images you could:
- compress them (convert to AVIF/WebP, etc.)
- just inline them anyway and hope it works out (browsers have different limits)
- pick 1 or 2 as a baseline, then use JS on page load; that provides at least an OK solution for people without JS.
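A rough sketch of that last option, under some assumptions: each img keeps a compressed inline src that works without JS, and carries the full payload as base64 in made-up `data-full` and `data-type` attributes, which a script swaps for a short blob URL on page load. The attribute and function names are hypothetical, and it assumes the length limit only applies to URLs, not to ordinary attribute values.

```javascript
// Decodes a base64 string into a Blob with the given MIME type.
function base64ToBlob(base64, mimeType) {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return new Blob([bytes], { type: mimeType });
}

// Browser-only: replaces the fallback src of every <img data-full="...">
// with a blob URL built from the full payload.
function upgradeImages(doc) {
  for (const img of doc.querySelectorAll("img[data-full]")) {
    const blob = base64ToBlob(img.dataset.full, img.dataset.type);
    img.src = URL.createObjectURL(blob);
  }
}
```

In a saved page this would be called once, e.g. `upgradeImages(document)` from a DOMContentLoaded listener. With JS disabled nothing runs and the inline src stays in place.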