xarantolus / collect
A server to collect & archive websites that also supports video downloads
Home Page: https://010.one/Collect/
License: MIT License
I am not sure what you want to know, so:
I get this:
Error 500
ENOENT: no such file or directory, open 'public/s/content.json'
Have this installed:
npm -v
5.8.0
nodejs -v
v8.11.4
and run this to start
sudo npm start production
When visiting a downloaded website that has HTML content but a .php or .md extension, the server assumes the wrong content type. Browsers usually display the file incorrectly.
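One possible direction, sketched under assumptions: look at the beginning of the file and prefer what the content looks like over what the extension says. The helper name `guessContentType` and the small extension table are hypothetical and not taken from the project.

```typescript
// Hypothetical helper, not part of the project: decide which Content-Type to
// send for an archived file whose extension may not match its content.
const HTML_MARKER = /^\s*(<!doctype html|<html[\s>])/i;

// Small illustrative table; a real server would use a full MIME database.
const KNOWN_TYPES: { [ext: string]: string } = {
    html: "text/html; charset=utf-8",
    css: "text/css",
    js: "text/javascript",
    md: "text/markdown",
};

function guessContentType(fileName: string, head: string): string {
    // If the start of the file looks like an HTML document, trust the content.
    if (HTML_MARKER.test(head)) {
        return "text/html; charset=utf-8";
    }
    // Otherwise fall back to the extension (text after the last dot).
    const ext = fileName.slice(fileName.lastIndexOf(".") + 1).toLowerCase();
    if (Object.prototype.hasOwnProperty.call(KNOWN_TYPES, ext)) {
        return KNOWN_TYPES[ext];
    }
    return "application/octet-stream";
}
```

Here `head` is assumed to be the first few hundred characters of the file, read before the response headers are written.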
Use the API to delete & update sites with client-side JavaScript
Add an option to use cookies. They should be added to the request option in the website function in tools/download.ts.
This would also require changing the 'New Entry' form and the files that accept input from this form.
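A minimal sketch of the cookie handling, assuming the request option accepts a `headers` object and that the form delivers cookies as `name=value` pairs separated by semicolons or newlines. The function name `buildRequestOptions` and the input format are assumptions for illustration, not the project's actual code.

```typescript
// Hypothetical helper: turn the raw cookie text from the 'New Entry' form
// into an options object carrying a Cookie header.
function buildRequestOptions(cookieInput: string): { headers: { [name: string]: string } } {
    const pairs = cookieInput
        .split(/[;\n]/)                         // accept ";" or newline separators
        .map(function (p) { return p.trim(); })
        .filter(function (p) { return p.indexOf("=") > 0; }); // need a non-empty name

    const headers: { [name: string]: string } = {};
    if (pairs.length > 0) {
        headers["Cookie"] = pairs.join("; ");
    }
    return { headers: headers };
}
```

With this shape, an empty form field yields no Cookie header at all, so existing downloads would behave as before.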
It would be great to have an option (either global or per site) to allow unauthenticated access to websites (and potentially also to the list of websites).
The main problem is that this would require moving the call to the auth middleware into the individual URL handlers, or an equivalent solution.
As this server is for archiving all types of websites, it should also support downloading only videos. To specify whether to download only the video or the website with the video, the URL could be given as video:https://youtube.com/watch?v=.... or selected with another option on the new page.
This feature would use youtube-dl because it supports a wide range of sites.
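The prefix idea could be handled by a small parsing step before the download starts. This is only a sketch: the type `EntryRequest` and the function `parseEntryUrl` are hypothetical names, and the actual call to youtube-dl is left out.

```typescript
// Hypothetical types and helper for the proposed "video:" URL prefix.
type EntryRequest = { kind: "video" | "website"; url: string };

const VIDEO_PREFIX = "video:";

function parseEntryUrl(input: string): EntryRequest {
    const trimmed = input.trim();
    // Compare the prefix case-insensitively, then strip it from the URL.
    if (trimmed.slice(0, VIDEO_PREFIX.length).toLowerCase() === VIDEO_PREFIX) {
        return { kind: "video", url: trimmed.slice(VIDEO_PREFIX.length) };
    }
    return { kind: "website", url: trimmed };
}
```

A caller would then branch on `kind` to pick either the video downloader or the normal website download.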
A user should be redirected to the page they wanted to visit after logging in
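When implementing this, the requested target should probably be validated so that the login page cannot be abused to redirect to other hosts. A minimal sketch, assuming the target is passed along as a string (for example in a query parameter); `safeRedirectTarget` is a hypothetical name.

```typescript
// Hypothetical helper: pick where to send the user after a successful login.
function safeRedirectTarget(requested: string | undefined, fallback: string = "/"): string {
    if (!requested) {
        return fallback;
    }
    // Only allow local absolute paths such as "/site/123".
    if (requested.charAt(0) !== "/") {
        return fallback;
    }
    // "//host" and "/\host" are treated by browsers as pointing to another host.
    const second = requested.charAt(1);
    if (second === "/" || second === "\\") {
        return fallback;
    }
    return requested;
}
```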
A route that deletes a saved site
Since this isn't done in the class, it is done individually by all routes that need it.
This redundant code should be refactored
Have you considered using webarchive format and/or PDF for archiving? .war is pretty standard now
When clicking on the title in the table view, redirect to a site with an iframe so that the header is still displayed
There is no API route for changing the title of a site
Add an option to use the website-scraper-phantom module to download pages.
This might be accomplished by using the module only if it is installed (users just have to install the module, and the normal install doesn't fail if their platform is not supported by PhantomJS)
There aren't yet any events for:
Improve the "details" page:
This software is
website-scraper
node module, so it's still on an older version
While it still works (I've had it running for 4+ years), it feels like it's time for an overhaul. The following could be good steps:
browser.js
works, but is a mess)
2 & 4 should be possible without major rewrites; the others are somewhat more involved.
Maybe it also makes sense to rewrite the server in Go because I like it better, but I'm not sure if there's a good website-scraper-like module for Go. I actually tried writing something like that, which started a Chromium browser and used SingleFile to download pages, but it didn't work that well.
I'm not sure when (or if) I'll work on this.
Add an editor (linked from the details page) for HTML pages so they can be edited from the web interface
The layout of the details page looks horrible in mobile browsers.
If an operation is running, the title on initial page load is "(1) (1) Title", while it should be "(1) Title"
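One way to avoid the doubled prefix is to make the title update idempotent: strip any existing count prefix before adding the current one. This is a sketch under that assumption, not the project's actual code; `titleWithCount` is a hypothetical name.

```typescript
// Matches one or more leading "(n) " groups, e.g. "(1) (1) ".
const COUNT_PREFIX = /^(\(\d+\)\s*)+/;

// Hypothetical helper: build the page title for `running` active operations.
function titleWithCount(currentTitle: string, running: number): string {
    // Remove whatever count prefix is already there, so repeated calls
    // never stack prefixes.
    const base = currentTitle.replace(COUNT_PREFIX, "");
    return running > 0 ? "(" + running + ") " + base : base;
}
```

One caveat of this approach: a page title that legitimately begins with a number in parentheses would lose that part, so keeping the unprefixed title in a separate variable may be the more robust option.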
Right now, any client can connect to the server to receive events. This should be restricted to clients that have an API token or the right cookie.
How to reproduce: register events and enter io.connect() in your browser console
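A possible check is sketched below as a plain predicate over the connection handshake, so it does not depend on a specific Socket.IO version. Everything here is an assumption for illustration: the query parameter name `token`, the cookie name `session`, and the names `HandshakeLike`, `readCookie` and `mayReceiveEvents` are hypothetical and do not come from the project.

```typescript
// Minimal shape of the handshake data this sketch relies on.
interface HandshakeLike {
    query: { [key: string]: string | undefined };
    headers: { cookie?: string };
}

// Read a single cookie value from a raw Cookie header.
function readCookie(header: string | undefined, name: string): string | undefined {
    if (!header) {
        return undefined;
    }
    const parts = header.split(";");
    for (let i = 0; i < parts.length; i++) {
        const part = parts[i].trim();
        const eq = part.indexOf("=");
        if (eq > 0 && part.slice(0, eq) === name) {
            return part.slice(eq + 1);
        }
    }
    return undefined;
}

// Allow the connection if it carries a known API token or a valid session cookie.
function mayReceiveEvents(
    handshake: HandshakeLike,
    validTokens: string[],
    isValidSession: (value: string) => boolean
): boolean {
    const token = handshake.query["token"];
    if (token !== undefined && validTokens.indexOf(token) !== -1) {
        return true;
    }
    const session = readCookie(handshake.headers.cookie, "session");
    return session !== undefined && isValidSession(session);
}
```

Such a predicate could be called from a connection middleware on the server, rejecting the connection when it returns false.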