montferret / ferret-server

Advanced declarative web scraping

License: Apache License 2.0

Makefile 1.40% Go 98.44% Dockerfile 0.16%
golang query-language data-mining scraping scraping-api crawling scraper server ferret-server

ferret-server's Introduction

Ferret Server

Server for advanced web scraping.
Open API definition.

Features

  • Scripts persistence
  • Scraped data persistence
  • Script execution scheduling
  • Integration with 3rd party systems
  • Web Hooks
  • Security

WIP

Be aware that the project is under heavy development.
There is no documentation yet, and some things may change before the final release.

Installation

Binary

You can download the latest binaries from here.

Source code

Production

  • Go >=1.11
  • Chrome or Docker
  • ArangoDB

Development

Quick start

ferret-server --db=http://0.0.0.0:8529

ferret-server's People

Contributors

dependabot-preview[bot], dependabot-support, dependabot[bot], ziflex

ferret-server's Issues

Dependabot can't resolve your Go dependency files

Dependabot can't resolve your Go dependency files.

As a result, Dependabot couldn't update your dependencies.

The error Dependabot encountered was:

verifying github.com/MontFerret/[email protected]/go.mod: checksum mismatch
	downloaded: h1:FnLRMYxmWxiuv10sNS9XgEy4/x01qgBcLgrIzqbzwHA=
	go.sum:     h1:9t7z7TqeVYgabjFQz9UytgbUotR0sA/k/3EUlPRV4aw=

SECURITY ERROR
This download does NOT match an earlier download recorded in go.sum.
The bits may have been replaced on the origin server, or an attacker may
have intercepted the download attempt.

For more information, see 'go help module-auth'.

If you think the above is an error on Dependabot's side please don't hesitate to get in touch - we'll do whatever we can to fix it.

View the update logs.

Script dedicated collection

Add the ability to create a collection, dedicated to a specific script, that keeps all of that script's output data.

Dependabot can't resolve your Go dependency files

Dependabot can't resolve your Go dependency files.

As a result, Dependabot couldn't update your dependencies.

The error Dependabot encountered was:

go: github.com/go-openapi/[email protected] requires
	github.com/docker/[email protected]: reading github.com/docker/go-units/go.mod at revision v0.4.0: unknown revision

If you think the above is an error on Dependabot's side please don't hesitate to get in touch - we'll do whatever we can to fix it.

View the update logs.

Dependabot can't resolve your Go dependency files

Dependabot can't resolve your Go dependency files.

As a result, Dependabot couldn't update your dependencies.

The error Dependabot encountered was:

github.com/MontFerret/ferret-server/server: cannot find module providing package github.com/MontFerret/ferret-server/server

If you think the above is an error on Dependabot's side please don't hesitate to get in touch - we'll do whatever we can to fix it.

View the update logs.

Webhooks

Trigger script execution using webhooks.

Email notifications

Add the ability to send an email notification when a script starts, ends, or gets cancelled.

Documentation

This is a reminder and one of the milestones towards the 1.0 release.
It should cover guidelines, API documentation, and examples,
preferably as a GitHub Page with a nice theme,
with a link to or integration with the Ferret documentation.

Dependabot can't resolve your Go dependency files

Dependabot can't resolve your Go dependency files.

As a result, Dependabot couldn't update your dependencies.

The error Dependabot encountered was:

go: golang.org/x/[email protected]: invalid version: git fetch -f origin refs/heads/*:refs/heads/* refs/tags/*:refs/tags/* in /opt/go/gopath/pkg/mod/cache/vcs/ed42bd05533fd84ae290a5d33ebd3695a0a2b06131beebd5450825bee8603aca: exit status 128:

If you think the above is an error on Dependabot's side please don't hesitate to get in touch - we'll do whatever we can to fix it.

View the update logs.

Webhook notifications

Add the ability to create webhooks that notify 3rd party systems when a script starts, ends, or gets cancelled.
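
A minimal sketch of what such a notification could look like, assuming a JSON payload posted to a URL registered by the 3rd party system. The Event type, its field names, and the Payload and Notify helpers are hypothetical and are not part of the ferret-server API; a local test server stands in for the receiving system.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// EventType names the script lifecycle moments mentioned in the issue.
type EventType string

const (
	EventStarted   EventType = "started"
	EventEnded     EventType = "ended"
	EventCancelled EventType = "cancelled"
)

// Event is a hypothetical webhook payload.
type Event struct {
	Type        EventType `json:"event"`
	ScriptID    string    `json:"scriptId"`
	ExecutionID string    `json:"executionId"`
}

// Payload serializes an event to the JSON body that would be sent.
func Payload(ev Event) ([]byte, error) {
	return json.Marshal(ev)
}

// Notify posts the event to the given URL and treats any non-2xx
// response as a failure.
func Notify(client *http.Client, url string, ev Event) error {
	body, err := Payload(ev)
	if err != nil {
		return err
	}
	resp, err := client.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode >= 300 {
		return fmt.Errorf("webhook %s returned status %d", url, resp.StatusCode)
	}
	return nil
}

func main() {
	// A local test server stands in for the 3rd party system.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var got Event
		if err := json.NewDecoder(r.Body).Decode(&got); err != nil {
			http.Error(w, "bad payload", http.StatusBadRequest)
			return
		}
		fmt.Printf("received %s for script %s\n", got.Type, got.ScriptID)
		w.WriteHeader(http.StatusNoContent)
	}))
	defer srv.Close()

	ev := Event{Type: EventStarted, ScriptID: "script-1", ExecutionID: "exec-1"}
	if err := Notify(srv.Client(), srv.URL, ev); err != nil {
		fmt.Println("notify failed:", err)
	}
}
```

A production version would probably also need retries and some way for the receiver to verify the sender, neither of which is shown here.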

Execution history

Add a collection that keeps all past and active executions.

It must have the following data:

  • Unique id
  • Script id
  • Script revision
  • Params that were used
  • Status (queued, running, cancelled, complete, or error)
  • Started by (schedule, hook or manual)
  • Start time
  • End time
  • Logs

And add API endpoints:

  • projects/{project}/history - Query all entries
  • projects/{project}/history/{scriptId} - Query all entries for a given script

history is not a final segment name.
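
The field list above could map onto a Go type along these lines. This is only a sketch: the type names, field names, and JSON keys are assumptions, and FilterByScript is a hypothetical in-memory stand-in for the per-script query, not the actual storage layer.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Status is the lifecycle state of an execution (values from the list above).
type Status string

const (
	StatusQueued    Status = "queued"
	StatusRunning   Status = "running"
	StatusCancelled Status = "cancelled"
	StatusComplete  Status = "complete"
	StatusError     Status = "error"
)

// Trigger records what started the execution.
type Trigger string

const (
	TriggerSchedule Trigger = "schedule"
	TriggerHook     Trigger = "hook"
	TriggerManual   Trigger = "manual"
)

// Execution is a hypothetical shape for one history entry.
type Execution struct {
	ID        string                 `json:"id"`
	ScriptID  string                 `json:"scriptId"`
	ScriptRev string                 `json:"scriptRev"`
	Params    map[string]interface{} `json:"params,omitempty"`
	Status    Status                 `json:"status"`
	StartedBy Trigger                `json:"startedBy"`
	StartedAt time.Time              `json:"startedAt"`
	EndedAt   *time.Time             `json:"endedAt,omitempty"` // nil while queued or running
	Logs      []string               `json:"logs,omitempty"`
}

// Active reports whether the execution has not finished yet.
func (e Execution) Active() bool {
	return e.Status == StatusQueued || e.Status == StatusRunning
}

// FilterByScript returns the entries that belong to the given script,
// mirroring the projects/{project}/history/{scriptId} query.
func FilterByScript(entries []Execution, scriptID string) []Execution {
	out := make([]Execution, 0, len(entries))
	for _, e := range entries {
		if e.ScriptID == scriptID {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	now := time.Now()
	entries := []Execution{
		{ID: "1", ScriptID: "a", ScriptRev: "r1", Status: StatusRunning, StartedBy: TriggerManual, StartedAt: now},
		{ID: "2", ScriptID: "b", ScriptRev: "r1", Status: StatusComplete, StartedBy: TriggerSchedule, StartedAt: now},
	}
	b, err := json.Marshal(FilterByScript(entries, "a"))
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(b))
}
```

Keeping EndedAt as a pointer lets a single record type cover both past and active executions, which matches the requirement that the collection hold both.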

Dependabot can't resolve your Go dependency files

Dependabot can't resolve your Go dependency files.

As a result, Dependabot couldn't update your dependencies.

The error Dependabot encountered was:

go: github.com/go-openapi/[email protected] requires
	github.com/go-openapi/[email protected]: reading github.com/go-openapi/analysis/go.mod at revision v0.19.10: unknown revision

If you think the above is an error on Dependabot's side please don't hesitate to get in touch - we'll do whatever we can to fix it.

View the update logs.

Dependabot can't resolve your Go dependency files

Dependabot can't resolve your Go dependency files.

As a result, Dependabot couldn't update your dependencies.

The error Dependabot encountered was:

go: github.com/go-openapi/[email protected] requires
	github.com/go-openapi/[email protected]: reading github.com/go-openapi/jsonpointer/go.mod at revision v0.19.3: unknown revision

If you think the above is an error on Dependabot's side please don't hesitate to get in touch - we'll do whatever we can to fix it.

View the update logs.

Authentication

I do not have any particular ideas yet about authentication with roles.
Just a draft:

  • System level (Owner)
  • Project level (Admin)
  • User level (User)
  • Guest level ? (Not sure)

Data querying

Add the possibility to query the results data using native AQL.

There are some security concerns.
Probably, we will need to implement #2 first, in order to limit access to system collections.

Dependabot can't resolve your Go dependency files

Dependabot can't resolve your Go dependency files.

As a result, Dependabot couldn't update your dependencies.

The error Dependabot encountered was:

go: github.com/rs/[email protected] requires
	github.com/zenazn/[email protected]: reading github.com/zenazn/goji/go.mod at revision v0.9.0: unknown revision

If you think the above is an error on Dependabot's side please don't hesitate to get in touch - we'll do whatever we can to fix it.

View the update logs.

Data validation

Add the possibility to validate script results before saving them using JSON Schema.
Must be optional.

Dependabot can't resolve your Go dependency files

Dependabot can't resolve your Go dependency files.

As a result, Dependabot couldn't update your dependencies.

The error Dependabot encountered was:

go: github.com/arangodb/[email protected] requires
	github.com/dgrijalva/[email protected]+incompatible: reading github.com/dgrijalva/jwt-go/go.mod at revision v3.2.0: unknown revision

If you think the above is an error on Dependabot's side please don't hesitate to get in touch - we'll do whatever we can to fix it.

View the update logs.

Add endpoint to run script manually

POST projects/{projectId}/runtime/start/{scriptId}
With the possibility to redefine param values.
Should return a unique id that can be used to check the execution status and retrieve output data.
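
A rough sketch of such an endpoint using only the Go standard library. The request and response shapes (params, jobId), the 202 status code, and the helper names are assumptions made for illustration; the handler only generates an id and does not actually enqueue anything.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// startRequest carries optional parameter overrides (hypothetical shape).
type startRequest struct {
	Params map[string]interface{} `json:"params"`
}

// startResponse returns the id used to poll status and fetch output.
type startResponse struct {
	JobID string `json:"jobId"`
}

// parseStartPath extracts projectId and scriptId from
// /projects/{projectId}/runtime/start/{scriptId}.
func parseStartPath(path string) (projectID, scriptID string, ok bool) {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	if len(parts) != 5 || parts[0] != "projects" || parts[2] != "runtime" || parts[3] != "start" {
		return "", "", false
	}
	if parts[1] == "" || parts[4] == "" {
		return "", "", false
	}
	return parts[1], parts[4], true
}

// newJobID returns a random 128-bit identifier encoded as hex.
func newJobID() (string, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

func startHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	if _, _, ok := parseStartPath(r.URL.Path); !ok {
		http.NotFound(w, r)
		return
	}
	var req startRequest
	// An empty body is allowed: the script would then run with its stored params.
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil && err != io.EOF {
		http.Error(w, "invalid JSON body", http.StatusBadRequest)
		return
	}
	id, err := newJobID()
	if err != nil {
		http.Error(w, "could not create job id", http.StatusInternalServerError)
		return
	}
	// A real server would enqueue the execution here, passing req.Params.
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(http.StatusAccepted)
	json.NewEncoder(w).Encode(startResponse{JobID: id})
}

func main() {
	body := strings.NewReader(`{"params":{"url":"https://example.com"}}`)
	req := httptest.NewRequest(http.MethodPost, "/projects/demo/runtime/start/script-1", body)
	rec := httptest.NewRecorder()
	startHandler(rec, req)
	fmt.Println(rec.Code, strings.TrimSpace(rec.Body.String()))
}
```

Returning the id immediately, rather than waiting for the script to finish, fits the requirement that the caller checks the execution status and retrieves output later.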

Scheduling

Implement script scheduling.
There is a caveat.
If we do in-memory scheduling, it won't work well with horizontal scaling (that is, if we deploy multiple instances of the server).
If we do 3rd party scheduling (for example, using Chronos), we will increase the complexity of deployment.

ArangoDB, unfortunately, does not support change events which would simplify the task.
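
To make the horizontal-scaling caveat concrete, here is a sketch of one possible coordination pattern: each instance tries to claim a short-lived lease for a job before running it, so only one instance fires per tick. This is an illustration, not the approach chosen by the project. LeaseStore is a hypothetical in-memory stand-in; for multiple instances the lease state would have to live in storage they all share, with an atomic compare-and-set.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// lease records which instance currently owns a job and until when.
type lease struct {
	owner   string
	expires time.Time
}

// LeaseStore is an in-memory stand-in for shared storage.
type LeaseStore struct {
	mu     sync.Mutex
	leases map[string]lease
}

// NewLeaseStore returns an empty store.
func NewLeaseStore() *LeaseStore {
	return &LeaseStore{leases: make(map[string]lease)}
}

// TryAcquire grants the lease for jobID to owner if it is free, expired,
// or already held by the same owner (which renews it).
func (s *LeaseStore) TryAcquire(jobID, owner string, now time.Time, ttl time.Duration) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	cur, held := s.leases[jobID]
	if held && cur.owner != owner && now.Before(cur.expires) {
		return false
	}
	s.leases[jobID] = lease{owner: owner, expires: now.Add(ttl)}
	return true
}

func main() {
	store := NewLeaseStore()
	now := time.Now()

	// The first instance to ask gets the lease.
	fmt.Println(store.TryAcquire("nightly", "instance-a", now, time.Minute))
	// A second instance is refused while the lease is still valid.
	fmt.Println(store.TryAcquire("nightly", "instance-b", now, time.Minute))
	// Once the lease has expired, another instance can take over.
	fmt.Println(store.TryAcquire("nightly", "instance-b", now.Add(2*time.Minute), time.Minute))
}
```

The expiry is what lets another instance take over when the current owner crashes, at the cost of requiring reasonably synchronized clocks.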

System settings

Design system settings:

  • Global level
  • Project level
  • Derived from global

What to store:

  • Chrome instance(s) address
  • Remote persistence systems
  • Security (users and roles)
  • Proxies
  • etc?
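
One way to read "derived from global" is that a project inherits every value it does not set itself. A small sketch under that assumption follows; the Settings type, its fields, and the Derive helper are hypothetical and cover only two of the items listed above.

```go
package main

import "fmt"

// Settings holds a subset of the values listed above; field names are hypothetical.
type Settings struct {
	ChromeAddresses []string
	Proxies         []string
}

// Derive returns project-level settings in which every field the project
// leaves empty falls back to the global value.
func Derive(global, project Settings) Settings {
	out := Settings{
		ChromeAddresses: append([]string(nil), project.ChromeAddresses...),
		Proxies:         append([]string(nil), project.Proxies...),
	}
	if len(out.ChromeAddresses) == 0 {
		// Copy so later changes to the result do not alter the global settings.
		out.ChromeAddresses = append([]string(nil), global.ChromeAddresses...)
	}
	if len(out.Proxies) == 0 {
		out.Proxies = append([]string(nil), global.Proxies...)
	}
	return out
}

func main() {
	global := Settings{
		ChromeAddresses: []string{"http://chrome:9222"},
		Proxies:         []string{"http://proxy-a:3128"},
	}
	project := Settings{Proxies: []string{"http://proxy-b:3128"}}

	effective := Derive(global, project)
	fmt.Println(effective.ChromeAddresses, effective.Proxies)
}
```

With this rule a project cannot explicitly clear an inherited value; if that is needed, pointer fields or an explicit "inherit" flag would be one option.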

Dependabot can't resolve your Go dependency files

Dependabot can't resolve your Go dependency files.

As a result, Dependabot couldn't update your dependencies.

The error Dependabot encountered was:

go: github.com/go-openapi/[email protected] requires
	github.com/google/[email protected]: reading github.com/google/uuid/go.mod at revision v1.1.1: unknown revision

If you think the above is an error on Dependabot's side please don't hesitate to get in touch - we'll do whatever we can to fix it.

View the update logs.
