Description of the main features:
- robots.txt
- IP address
- Full URL
- Domain name
- WHOIS details
- Known security weaknesses
- Location, phone number, and other contact info
When the user enters a website, all of this information is collected and displayed.
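As a minimal sketch of the first step such a tool performs, the snippet below extracts the domain name from a full URL using Python's standard library (the sample URL is illustrative; resolving the domain to an IP address would additionally require a DNS lookup, e.g. `socket.gethostbyname`):

```python
from urllib.parse import urlparse

def domain_of(url: str) -> str:
    """Extract the domain (host) name from a full URL."""
    return urlparse(url).hostname

url = "https://www.example.com/path/page.html"
print(domain_of(url))  # → www.example.com

# A DNS lookup would turn the domain into an IP address, e.g.:
#   import socket
#   ip = socket.gethostbyname(domain_of(url))
```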
Web Robots (also known as Web Wanderers, Crawlers, or Spiders), are programs that traverse the Web automatically. Search engines such as Google use them to index the web content, spammers use them to scan for email addresses, and they have many other uses.
In general, a site's robots.txt file tells crawlers and spiders which parts of the site they are allowed or disallowed to access. You should always check robots.txt first before extracting data from a particular website.
There are two important considerations when using /robots.txt:
- Robots can ignore your /robots.txt. In particular, malware robots that scan the web for security vulnerabilities, and email-address harvesters used by spammers, will pay no attention to it.
- The /robots.txt file is publicly available. Anyone can see which sections of your server you don't want robots to use.
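A well-behaved scraper can honor these rules with Python's standard `urllib.robotparser`. The sketch below parses a sample robots.txt body directly (in practice you would fetch it from the site, e.g. with `RobotFileParser.set_url(...)` and `read()`; the rules shown here are illustrative):

```python
from urllib.robotparser import RobotFileParser

# A sample robots.txt body; normally fetched from https://example.com/robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Check which paths a well-behaved crawler may fetch.
print(parser.can_fetch("*", "https://example.com/index.html"))  # → True
print(parser.can_fetch("*", "https://example.com/private/x"))   # → False
```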
WHOIS (pronounced as the phrase "who is") is a query-and-response protocol widely used to query databases that store the registered users or assignees of an Internet resource, such as a domain name, an IP address block, or an autonomous system; it is also used for a wider range of other information. The protocol stores and delivers database content in a human-readable format. The WHOIS protocol is documented in RFC 3912.
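The protocol itself is very simple: open a TCP connection to port 43 on a WHOIS server, send the query name followed by CRLF, and read the reply until the server closes the connection. The sketch below implements this directly with Python sockets (the server `whois.iana.org` is an illustrative choice; real lookups often need to follow referrals to registry-specific servers):

```python
import socket

def build_query(name: str) -> bytes:
    """Format a WHOIS query per RFC 3912: the name followed by CRLF."""
    return name.encode("ascii") + b"\r\n"

def whois(name: str, server: str = "whois.iana.org", timeout: float = 10.0) -> str:
    """Open TCP port 43 on the server, send the query, and read the
    response until the server closes the connection."""
    with socket.create_connection((server, 43), timeout=timeout) as sock:
        sock.sendall(build_query(name))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")

# Usage (requires network access):
#   print(whois("example.com"))
```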