Topic: robots-txt
Something interesting about robots-txt
robots-txt,Dark Web: an information-gathering, footprinting, and recon tool written in Python 3. It needs only a domain or IP to run, and works on any Linux distribution that supports Python 3. Author: AKASHBLACKHAT (for ethical hackers)
User: akashblackhat
Home Page: https://www.youtube.com/channel/UCWlAUmBOM07RXwwfZyfj_uw
robots-txt,Lightweight robots.txt parser and generator written in Rust.
User: alexander-irbis
robots-txt,Opt-out tool to check copyright reservations in a way that even machines can understand.
User: alexjc
robots-txt,Makes it easy to add a robots.txt, sitemap, and web app manifest to your Astro app at build time.
User: alextim
robots-txt,Open-source, Python-based SEO web crawler
User: beb7
Home Page: https://greenflare.io
robots-txt,Manage the robots.txt from the Kirby config file
User: bnomei
robots-txt,A set of reusable Java components that implement functionality common to any web crawler
Organization: crawler-commons
robots-txt,robots.txt parser for Node.js
User: ekalinin
robots-txt,advertools - online marketing productivity and analysis tools
User: eliasdabbas
Home Page: https://advertools.readthedocs.io
robots-txt,Optimizes your site's robots.txt to reduce server load and CO2 footprint by blocking unnecessary crawlers while allowing major search engines and specific tools.
Organization: emilia-capital
Home Page: https://joost.blog/plugins/eco-friendly-robots-txt/
robots-txt,This package helps you easily add meta tags, sitemap.xml, and robots.txt to your project.
User: engincanv
robots-txt,🤖 robots.txt as a service. Crawls robots.txt files, downloads and parses them to check rules through an API
User: fooock
Home Page: https://robotstxt.io
robots-txt,Known tags and settings suggested to opt out of having your content used for AI training.
User: healsdata
robots-txt,robots.txt generator for Node.js
Organization: itgalaxy
robots-txt,A webpack plugin to generate a robots.txt file
Organization: itgalaxy
robots-txt,grobotstxt is a native Go port of Google's robots.txt parser and matcher library.
User: jimsmart
robots-txt,Java sitemap generator. This library generates a web sitemap, can ping Google, and can generate an RSS feed, robots.txt, and more, in a friendly, easy-to-use Java 8 functional style.
User: jirkapinkas
robots-txt,Simple robots.txt template. Keeps unwanted robots out (disallow) and whitelists (allows) legitimate user agents. Useful for all websites.
User: jonasjacek
Home Page: https://www.ditig.com/publications/robots-txt-template
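A robots.txt template along these lines is a short plain-text file; the sketch below is illustrative (the paths and bot name are made up, not taken from the linked template):

```text
# Allow well-behaved crawlers everywhere except private paths
User-agent: *
Disallow: /admin/
Disallow: /tmp/

# Block one specific crawler entirely (hypothetical bot name)
User-agent: BadBot
Disallow: /

Sitemap: https://www.example.com/sitemap.xml
```

Rules are grouped by `User-agent`; a crawler uses the most specific group that matches its name, falling back to `*`.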
robots-txt,.NET Core plugin manager: extend web applications using plugin technology, enabling true SOLID and DRY principles when developing applications
User: k3ldar
robots-txt,List of useful links, tools and resources
User: kappa-wingman
robots-txt,An Astro project template for decent projects: auth, i18next, Bootstrap, sitemap, webworker, robots.txt, preact, react, endpoints, endpoint clients, OAuth, various Astro features and data loading preconfigured
User: kyr0
Home Page: https://astro-launchpad.vercel.app
robots-txt,ScrapeGPT is a RAG-based Telegram bot designed to scrape and analyze websites, then answer questions based on the scraped content. The bot utilizes Retrieval Augmented Generation and webscraping to return natural language answers to the user's queries.
User: lexiestleszek
robots-txt,Privacy-focused web search engine (not a metasearch engine; uses its own crawler)
User: liameno
robots-txt,🧑🏻👩🏻 "We are people, not machines": an initiative to get to know the creators of a website. A Nuxt module to statically integrate and generate a humans.txt author file with information about the humans behind the website. Based on the HumansTxt Project.
User: luxdamore
Home Page: https://luxdamore.github.io/nuxt-humans-txt
robots-txt,Behat extension for testing some On-Page SEO factors: meta title/description, canonical, hreflang, meta robots, robots.txt, redirects, sitemap validation, HTML validation, performance...
User: marcortola
robots-txt,Gatsby plugin that automatically creates robots.txt for your site
User: mdreizin
Home Page: https://mdreizin.github.io/gatsby-plugin-robots-txt
robots-txt,Ultimate Website Sitemap Parser
Organization: mediacloud
Home Page: https://mediacloud.org/
robots-txt,Laravel package to manage robots.txt
User: mguinea
robots-txt,Enumerate old versions of robots.txt paths using Wayback Machine for content discovery
User: mhmdiaa
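The approach above, mining historical robots.txt snapshots for forgotten paths, can be sketched against the Wayback Machine's public CDX API. The endpoint and parameter names below follow its published documentation, but treat the exact query as an assumption; this sketch only builds the query URL and performs no network I/O:

```python
from urllib.parse import urlencode

def wayback_robots_query(domain: str) -> str:
    """Build a Wayback Machine CDX API query URL listing archived
    snapshots of a domain's robots.txt (assumed endpoint/params)."""
    params = {
        "url": f"{domain}/robots.txt",  # target resource
        "output": "json",               # machine-readable response
        "filter": "statuscode:200",     # only successful captures
        "collapse": "digest",           # skip byte-identical snapshots
    }
    return "https://web.archive.org/cdx/search/cdx?" + urlencode(params)

print(wayback_robots_query("example.com"))
```

Fetching that URL returns one row per distinct capture; each archived robots.txt can then be parsed for `Disallow` paths worth probing.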
robots-txt,Python-based web crawling script with randomized intervals, user-agent rotation, and proxy server IP rotation to outsmart website bots and prevent blocking.
User: mlartist
robots-txt,A tool that extracts all paths from robots.txt and opens them in the browser.
User: momenbasel
robots-txt,NuxtJS module for robots.txt
Organization: nuxt-modules
robots-txt,YiraBot: Simplifying Web Scraping for All. A user-friendly tool for developers and enthusiasts, offering command-line ease and Python integration. Ideal for research, SEO, and data collection.
User: owenorcan
robots-txt,A Python script to check whether URLs are allowed or disallowed by a robots.txt file.
User: p0dalirius
Home Page: https://podalirius.net/
robots-txt,A simple and flexible web crawler that follows the robots.txt policies and crawl delays.
Organization: puerkitobio
robots-txt,Polite, slim and concurrent web crawler.
Organization: puerkitobio
robots-txt,Robots.txt parser and fetcher for Elixir
User: ravern
Home Page: https://hexdocs.pm/gollum
robots-txt,🤖 A curated list of websites that restrict access to AI Agents, AI crawlers and GPTs
User: samber
robots-txt,NodeJS robots.txt parser with support for wildcard (*) matching.
User: samclarke
robots-txt,Go robots.txt parser
User: samclarke
robots-txt,A pure-Python robots.txt parser with support for modern conventions.
Organization: scrapy
robots-txt,Determine if a page may be crawled from robots.txt, robots meta tags and robot headers
Organization: spatie
Home Page: https://spatie.be/en/opensource/php
robots-txt,Provides robots.txt middleware for .NET core
Organization: stormid
robots-txt,PHP class for parsing robots.txt
User: t1gor
robots-txt,Site-Scanner - Web application vulnerability assessment tool.
User: talmaika
robots-txt,The robots.txt exclusion protocol implementation for Go language
User: temoto
robots-txt,Simple robots generation module for Silverstripe (SS 4 and above)
User: tractorcow
robots-txt,A simple but powerful web crawler library for .NET
Organization: turnersoftware
robots-txt,A "robots.txt" parsing and querying library for .NET
Organization: turnersoftware
robots-txt,An extensible robots.txt parser and client library, with full support for every directive and specification.
Organization: vipnytt