pistasjis / listencaddy
A Caddy plugin that checks for scrapers who try to find sensitive files and reports them to AbuseIPDB.
License: Apache License 2.0
I'd like to allowlist my ISP's entire IP range (to account for IP changes) for things like admin panels, and block + report anyone else who accesses these admin panels (likely bots scraping for endpoints like /admin).
Attempting to use a subnet does not work with the existing allowlist IP feature, as it seems to use regex instead. If the user has a simple /24 or /16, they COULD probably just do 123\.456\..*\..* (/16, looks cursed yeah), but my ISP has a /12, in which case wildcarding the last three octets will not work because it would allow unrelated IP addresses: xx.yxz and xx.xyz would be in the same /12 subnet, but not xx.abc.
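For illustration, CIDR matching could be sketched in Go with the standard net/netip package. This is only a sketch under assumptions, not ListenCaddy's actual code: the names parseAllowlist and isAllowed are hypothetical, and the ranges are documentation-style examples.

```go
package main

import (
	"fmt"
	"net/netip"
)

// parseAllowlist (hypothetical helper) converts entries such as
// "10.16.0.0/12" or a bare IP like "192.0.2.7" into prefixes.
func parseAllowlist(entries []string) ([]netip.Prefix, error) {
	prefixes := make([]netip.Prefix, 0, len(entries))
	for _, e := range entries {
		if p, err := netip.ParsePrefix(e); err == nil {
			// Masked zeroes the host bits, e.g. 10.17.3.4/12 -> 10.16.0.0/12.
			prefixes = append(prefixes, p.Masked())
			continue
		}
		addr, err := netip.ParseAddr(e)
		if err != nil {
			return nil, fmt.Errorf("invalid allowlist entry %q: %w", e, err)
		}
		// A bare IP becomes a single-address prefix (/32 or /128).
		prefixes = append(prefixes, netip.PrefixFrom(addr, addr.BitLen()))
	}
	return prefixes, nil
}

// isAllowed (hypothetical helper) reports whether ip falls inside any prefix.
func isAllowed(prefixes []netip.Prefix, ip string) bool {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return false
	}
	// Unmap turns ::ffff:a.b.c.d into plain IPv4 so it can match IPv4 prefixes.
	addr = addr.Unmap()
	for _, p := range prefixes {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

func main() {
	allow, err := parseAllowlist([]string{"10.16.0.0/12", "2001:db8::/32"})
	if err != nil {
		panic(err)
	}
	fmt.Println(isAllowed(allow, "10.31.255.1")) // inside 10.16.0.0/12
	fmt.Println(isAllowed(allow, "10.32.0.1"))   // just outside the /12
}
```

A /12 covers sixteen consecutive values of the second octet, which is why an octet-based regex cannot express it cleanly while a prefix check can.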
Hi, yesterday I received a weird email saying I exceeded my 5000 API request limit on AbuseIPDB. I found that very unusual because I have never exceeded even 1k requests a day, so I checked my AbuseIPDB account and saw that I had reported the same IP with the same URI 7 times.
https://www.abuseipdb.com/check/162.240.159.246
That was weird, but it definitely was not showing 5000+ reports. I checked my Caddy logs for that same IP and found that it was scanning my IP and domain over a hundred thousand times, and ListenCaddy was reporting basically every URI because it was an extremely loud scanner (wc -l counts lines in output).
[root@girlcock ~]# grep 'reporting IP to AbuseIPDB' access-2023-11-17T*.log | wc -l
29344
[root@girlcock ~]# grep '162.240.159.246' access-2023-11-17T*.log | wc -l
149519
[root@girlcock ~]# grep 'manager/html' access-2023-11-17T*.log | wc -l
13266
[root@girlcock ~]#
So basically, in one single day, that IP made 100k+ requests to me and ListenCaddy sent reports 29344 times (which collectively, and obviously, exceeded the daily 5k request limit). One of the endpoints that IP was constantly scanning was Apache Tomcat (/manager/html), which ListenCaddy kept reporting even though it was the exact same offender and URI.
To avoid future API usage saturation, could suppression be implemented per-IP?
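One possible shape for per-IP suppression is an in-memory map of "last reported" timestamps, sketched below in Go. Everything here is an assumption for illustration: the reportLimiter type, its method names, and the 15-minute window are hypothetical and not part of ListenCaddy.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// reportLimiter (hypothetical) remembers when each IP was last reported.
type reportLimiter struct {
	mu       sync.Mutex
	window   time.Duration
	lastSeen map[string]time.Time
}

func newReportLimiter(window time.Duration) *reportLimiter {
	return &reportLimiter{window: window, lastSeen: make(map[string]time.Time)}
}

// shouldReport returns true at most once per window for a given IP.
// The caller passes the current time so the logic is easy to test.
func (l *reportLimiter) shouldReport(ip string, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if last, ok := l.lastSeen[ip]; ok && now.Sub(last) < l.window {
		return false // already reported recently: suppress
	}
	l.lastSeen[ip] = now
	return true
}

// prune drops entries older than the window so the map does not grow forever.
func (l *reportLimiter) prune(now time.Time) {
	l.mu.Lock()
	defer l.mu.Unlock()
	for ip, last := range l.lastSeen {
		if now.Sub(last) >= l.window {
			delete(l.lastSeen, ip)
		}
	}
}

func main() {
	limiter := newReportLimiter(15 * time.Minute) // window is an assumed value
	now := time.Now()
	fmt.Println(limiter.shouldReport("162.240.159.246", now))                     // first hit: report
	fmt.Println(limiter.shouldReport("162.240.159.246", now.Add(time.Minute)))    // suppressed
	fmt.Println(limiter.shouldReport("162.240.159.246", now.Add(20*time.Minute))) // window elapsed
}
```

With a scheme like this, a scanner that sends 100k+ requests in a day would trigger at most one report per window instead of one per matching request. Keying on the IP alone (rather than IP plus URI) is a design choice that keeps API usage lowest.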
Sometimes I get hits from bad IPs with suspicious or weird user agents, or even just scanning user-agents in general. Would be nice to have the ability to add the user-agent in the report message too.
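As a rough idea of what that could look like, here is a Go sketch that builds a report comment including the user agent. The function name buildComment, the message format, and the 200-character cap are all made up for this example and do not reflect what ListenCaddy currently sends.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// maxUALen is an arbitrary cap chosen for this sketch so that a huge
// User-Agent header cannot bloat the report comment.
const maxUALen = 200

// buildComment (hypothetical) formats a report message with the requested
// path and a sanitized user agent.
func buildComment(path, userAgent string) string {
	// Drop control characters (newlines, tabs, ...) that a client might inject.
	ua := strings.Map(func(r rune) rune {
		if unicode.IsControl(r) {
			return -1
		}
		return r
	}, userAgent)
	ua = strings.TrimSpace(ua)
	if ua == "" {
		ua = "(none)"
	}
	if runes := []rune(ua); len(runes) > maxUALen {
		ua = string(runes[:maxUALen]) + "..."
	}
	return fmt.Sprintf("Requested %s (User-Agent: %s)", path, ua)
}

func main() {
	fmt.Println(buildComment("/manager/html", "curl/8.0"))
	fmt.Println(buildComment("/admin", ""))
}
```

Since the User-Agent header is fully attacker-controlled, sanitizing and truncating it before it goes into a public report seems worth the few extra lines.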
(hope you're still interested in maintaining this project, I use it a lot :P)
I am thinking maybe making a sqlite database or something like that would be good. Or a csv.
There are cases where I'd like to test if the regex (and other stuff) work just fine in a real situation, but I don't want to report my own IP. Or I'd like to keep a universal regex but allowlist an IP that I know will access said blocked path such as an external server.
Current workaround for the former use-case is to just use regex101.com, copy the regex I have, and type in a path and see if it picks it up.
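Another local workaround, sketched here in Go, is to run the pattern against a list of sample paths without sending any report. This assumes the plugin's patterns are compatible with Go's regexp package (RE2 syntax); the helper name matchingPaths and the example pattern are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchingPaths (hypothetical) returns the sample paths the pattern would flag.
// Nothing is reported anywhere; this is purely a local dry run.
func matchingPaths(pattern string, paths []string) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, fmt.Errorf("invalid pattern: %w", err)
	}
	var hits []string
	for _, p := range paths {
		if re.MatchString(p) {
			hits = append(hits, p)
		}
	}
	return hits, nil
}

func main() {
	// Example pattern only; (?i) makes the match case-insensitive.
	pattern := `(?i)^/(admin|wp-login\.php|\.env)`
	samples := []string{"/admin", "/index.html", "/.env", "/WP-LOGIN.php"}

	hits, err := matchingPaths(pattern, samples)
	if err != nil {
		panic(err)
	}
	for _, h := range hits {
		fmt.Println("would flag:", h)
	}
}
```

A built-in dry-run or IP allowlist option would still be nicer, since this only checks the regex and not the rest of the request handling.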
Right now, ListenCaddy splits the port off the IP at the ":" character. This works fine for IPv4, but it does not work for IPv6, where the address itself contains colons. I have found a RegEx (of course I wouldn't make RegEx) that works on both IPv4 and IPv6: https://regexr.com/3hpvt
Not sure about performance though.
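A regex may not be needed at all: Go's standard library has net.SplitHostPort, which understands the bracketed "[addr]:port" form used for IPv6. The sketch below assumes the input looks like Go's http.Request.RemoteAddr; the function name clientIP is hypothetical and this is not the plugin's current code.

```go
package main

import (
	"fmt"
	"net"
	"net/netip"
	"strings"
)

// clientIP (hypothetical) extracts the bare IP from values such as
// "203.0.113.5:443", "[2001:db8::1]:8080", or an address with no port.
func clientIP(remoteAddr string) (string, error) {
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		// No port (or unparsable form): treat the whole value as the host,
		// stripping brackets from inputs like "[2001:db8::1]".
		host = strings.Trim(remoteAddr, "[]")
	}
	addr, err := netip.ParseAddr(host)
	if err != nil {
		return "", fmt.Errorf("cannot parse IP from %q: %w", remoteAddr, err)
	}
	// Drop any IPv6 zone and unwrap IPv4-mapped IPv6 addresses.
	return addr.WithZone("").Unmap().String(), nil
}

func main() {
	for _, in := range []string{
		"203.0.113.5:443",
		"[2001:db8::1]:8080",
		"2001:db8::1",
	} {
		ip, err := clientIP(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", in, ip)
	}
}
```

This avoids the performance question around a large regex, since both calls are simple string parsers.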