ipfs-search / nsfw-server

A simple Node.js server to run nsfw.js for images from IPFS and return its results.

License: MIT License

JavaScript 91.71% Dockerfile 8.29%
ipfs nodejs nsfw nsfwjs tensorflowjs

nsfw-server's Introduction

Search engine for the InterPlanetary File System (IPFS). Sniffs the DHT gossip and indexes file and directory hashes.

Metadata and contents are extracted using ipfs-tika, searching is done using OpenSearch, queueing is done using RabbitMQ. The crawler is implemented in Go, the API and frontend are built using Node.js.

The ipfs-search command consists of two components: the crawler and the sniffer. The sniffer extracts hashes from the gossip between nodes. The crawler extracts data from the hashes and indexes them.

Docs

Documentation is hosted on Read the Docs, based on files contained in the docs folder. In addition, there are extensive Go docs for the internal API as well as SwaggerHub OpenAPI documentation for the REST API.

Contact

Please find us on our Freenode/Riot/Matrix channel #ipfs-search:matrix.org.

Snapshots

ipfs-search provides daily snapshots of all indexed data. To learn more about downloading and restoring snapshots, please refer to the relevant section in our documentation.

Related repos

Contributors wanted

Building a search engine like this takes a considerable amount of resources (money and TLC). If you are able to help out with either of them, do reach out (see the contact section in this file).

Please read the Contributing.md file before contributing.

Roadmap

For discussing and suggesting features, look at the issues.

External dependencies

  • Go 1.19
  • OpenSearch 2.3.x
  • RabbitMQ / AMQP server
  • Node.js 9.x
  • IPFS 0.7
  • Redis

Internal dependencies

Building

$ go get ./...
$ make

Running

Docker

The most convenient way to run the crawler is through Docker. Simply run:

docker-compose up

This will start the crawler, the sniffer and all their dependencies. Hashes can also be queued for crawling manually by running ipfs-search add <hash> from within the running container. For example:

docker-compose exec ipfs-crawler ipfs-search add QmS4ustL54uo8FzR9455qaxZwuMiUhyvMcX9Ba8nUH4uVv

Ansible deployment

Automated deployment can be done on any (virtual) Ubuntu 16.04 machine. The full production stack is automated and can be found in its own repository.

Contributors

This project exists thanks to all the people who contribute.

Backers

Thank you to all our backers! 🙏 [Become a backer]

Sponsors


ipfs-search is supported by NLNet through the EU's Next Generation Internet (NGI0) programme.


RedPencil is supporting the hosting of ipfs-search.com.

Support this project by becoming a sponsor. Your logo will show up here with a link to your website. [Become a sponsor]

nsfw-server's Issues

502 CORS errors from nsfw server on unsupported file formats

When searching images, the browser makes a lot of calls to the nsfw server.
A bunch of them return 502 errors with a reason about CORS headers not being set. It seems related to unsupported file formats, such as TIFF and WEBP.

E.g.:
https://ipfs-search.com/#/search?q=references.name%3A%28%22%2a.webp%22%29&page=1&type=images
https://ipfs-search.com/#/search?q=references.name%3A%28%22%2a.tiff%22%29&page=2&type=images&last_seen=%2a

Expected behavior for unsupported file formats is to get a 415 (Unsupported Media Type) error, not a 5xx error.
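
A minimal sketch of the expected behavior, assuming the server can inspect the fetched file's Content-Type before classifying it. The helper name statusForContentType and the list of supported types are illustrative assumptions, not taken from the nsfw-server source; the real list depends on what the image decoder in use accepts.

```javascript
// Hypothetical list of media types the classifier can decode (an assumption).
const SUPPORTED_TYPES = new Set(['image/jpeg', 'image/png', 'image/gif', 'image/bmp']);

// Hypothetical helper: choose the HTTP status for a fetched file's Content-Type.
function statusForContentType(contentType) {
  // Strip parameters such as "; charset=binary" and normalise case.
  const mediaType = String(contentType || '').split(';')[0].trim().toLowerCase();
  return SUPPORTED_TYPES.has(mediaType) ? 200 : 415;
}

console.log(statusForContentType('image/webp')); // 415
console.log(statusForContentType('image/tiff')); // 415
console.log(statusForContentType('image/jpeg; charset=binary')); // 200
```

A request handler could call such a check right after fetching the file and respond with the returned status before any decoding is attempted.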

502 Gateway errors

On roughly half of the calls to the nsfw api, a 502 error (bad gateway) is being returned.

Allow concurrent requests

From the logs below, it seems that requests are somehow handled sequentially.
We need to figure out whether this is caused by an operating system limitation, browser throttling or the server configuration, and fix it.

1640794675385 info 127.0.0.1 GET /classify {"url":"https://gateway.ipfs.io/ipfs/bafkreiazgl737iphtrr4777ju7o4chrniet7knzmtnydizhzbechwhnofi"} 415 (18366 ms)
1640794675517 info 127.0.0.1 GET /classify {"url":"https://gateway.ipfs.io/ipfs/QmbNA4nN3eLR12MBd6kVLkLiR82Pf69qZQGG4DdoM6siVq"} 200 (129 ms)
1640794675630 info 127.0.0.1 GET /classify {"url":"https://gateway.ipfs.io/ipfs/QmXnEU8ftNL4xxZeN1ksLNqVW9f7DHmeKZzEhYVKbmw76G"} 200 (111 ms)
1640794675792 info 127.0.0.1 GET /classify {"url":"https://gateway.ipfs.io/ipfs/QmSZzv7ux1LGwpehVcCMQ9ec945X6qE4qyjKDhCVwY25iw"} 200 (161 ms)
1640794675919 info 127.0.0.1 GET /classify {"url":"https://gateway.ipfs.io/ipfs/QmcPw2yPbtTiZQuabDnPgEHquURhwyDojomhSAc56y4DNM"} 200 (125 ms)
1640794678124 info 127.0.0.1 GET /classify {"url":"https://gateway.ipfs.io/ipfs/QmWx6ThfE8r6yYjAHSsc2xzJLS5nGdPyua6A75aRrLkP84"} 200 (18374 ms)
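
One way to read these logs is to estimate when each request started. The sketch below assumes that the leading number is the completion time in epoch milliseconds and that the trailing "(N ms)" is the request duration; neither is confirmed by the log format itself, and the function names are illustrative only.

```javascript
// Assumed layout: "<end epoch ms> ... <status> (<duration> ms)".
const LOG_LINE = /^(\d+) .* (\d{3}) \((\d+) ms\)$/;

function parseLogLine(line) {
  const match = LOG_LINE.exec(line.trim());
  if (!match) return null;
  const end = Number(match[1]);
  const durationMs = Number(match[3]);
  // Estimated start time = completion time minus duration.
  return { start: end - durationMs, end, status: Number(match[2]), durationMs };
}

// Gap (ms) between each request's estimated start and the previous request's end.
function gapsAfterPrevious(lines) {
  const entries = lines.map(parseLogLine).filter(Boolean);
  return entries.slice(1).map((entry, i) => entry.start - entries[i].end);
}

// The first three log lines from this issue.
const sample = [
  '1640794675385 info 127.0.0.1 GET /classify {"url":"https://gateway.ipfs.io/ipfs/bafkreiazgl737iphtrr4777ju7o4chrniet7knzmtnydizhzbechwhnofi"} 415 (18366 ms)',
  '1640794675517 info 127.0.0.1 GET /classify {"url":"https://gateway.ipfs.io/ipfs/QmbNA4nN3eLR12MBd6kVLkLiR82Pf69qZQGG4DdoM6siVq"} 200 (129 ms)',
  '1640794675630 info 127.0.0.1 GET /classify {"url":"https://gateway.ipfs.io/ipfs/QmXnEU8ftNL4xxZeN1ksLNqVW9f7DHmeKZzEhYVKbmw76G"} 200 (111 ms)',
];

console.log(gapsAfterPrevious(sample)); // [ 3, 2 ]
```

Under these assumptions, each short request starts only 2 to 3 ms after the previous one finishes, which is consistent with requests queuing behind the slow 18366 ms call rather than running in parallel.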

Excessive memory usage

I'm seeing 20 GB+ memory usage for NSFW-server in production.

Would be great to have a 'neat' way to limit this.
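
One possible approach, sketched here under the assumption that peak memory grows with the number of images being decoded and classified at the same time, is to cap the number of in-flight classifications. createLimiter and fakeClassify are hypothetical names, not part of nsfw-server, and this would not help if the memory growth is caused by a leak.

```javascript
// Hypothetical helper: run at most maxConcurrent async tasks at once.
function createLimiter(maxConcurrent) {
  let active = 0;
  const waiting = [];

  const next = () => {
    if (active >= maxConcurrent || waiting.length === 0) return;
    active += 1;
    const { task, resolve, reject } = waiting.shift();
    // The promise executor calls task() immediately; a thrown error becomes a rejection.
    new Promise((settle) => settle(task()))
      .then(resolve, reject)
      .finally(() => {
        active -= 1;
        next(); // Start the next queued task, if any.
      });
  };

  return function limit(task) {
    return new Promise((resolve, reject) => {
      waiting.push({ task, resolve, reject });
      next();
    });
  };
}

// Stand-in for the real classifier: resolves with its input after 10 ms.
const fakeClassify = (id) => new Promise((resolve) => setTimeout(() => resolve(id), 10));

const limit = createLimiter(2);
Promise.all([1, 2, 3, 4].map((id) => limit(() => fakeClassify(id))))
  .then((ids) => console.log(ids)); // logs the ids in submission order
```

A limit like this trades throughput for a bounded working set, so the value would need tuning against the memory actually available to the process.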

IPFS_API_ADDRESS setting undocumented

Related to #24

Another question is whether we can't simply hash files without actually starting a node. It does seem the start: false simply doesn't come through. If it did, there would be no need to use or configure a locally running node.
