Comments (13)
Perfect! That produced the full text of the OSError, and the problem was then very easy to identify. A directory contained a symbolic link to itself:
/var/www/websites/userN_homepage/subdirectory# la
total 4
lrwxrwxrwx 1 websites websites 1 Aug 30 10:24 Bad_Link -> .
This is the full error output:
Traceback (most recent call last):
  File "/root/wordfence_test.py", line 25, in <module>
    locate(path=path)
  File "/root/wordfence_test.py", line 17, in locate
    for path in search_directory(real_path):
  File "/root/wordfence_test.py", line 9, in search_directory
    yield from search_directory(item.path)
  File "/root/wordfence_test.py", line 9, in search_directory
    yield from search_directory(item.path)
  File "/root/wordfence_test.py", line 9, in search_directory
    yield from search_directory(item.path)
  [Previous line repeated 40 more times]
  File "/root/wordfence_test.py", line 8, in search_directory
    if item.is_dir():
OSError: [Errno 40] Too many levels of symbolic links: '/var/www/websites/userN_homepage/subdirectory/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/Bad_Link/'
Removing the Bad_Link now allows the scan to complete without issue.
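The failure can be reproduced in isolation. This is a minimal sketch, assuming a POSIX system where symlinks can be created; the temporary directory and the `reproduce_eloop` helper are hypothetical, and only the `Bad_Link -> .` link mirrors the listing above:

```python
import errno
import os
import tempfile


def reproduce_eloop(depth: int = 50) -> int:
    """Return the errno raised when a path passes through `depth` copies of a self-referential symlink."""
    with tempfile.TemporaryDirectory() as base:
        # Same shape as the listing above: Bad_Link -> .
        os.symlink('.', os.path.join(base, 'Bad_Link'))
        nested = os.path.join(base, *(['Bad_Link'] * depth))
        try:
            os.stat(nested)  # resolving the path follows every link in it
        except OSError as error:
            return error.errno
    return 0


if __name__ == '__main__':
    code = reproduce_eloop()
    print(code == errno.ELOOP, os.strerror(code))
```

The depth of 50 is chosen only to exceed the kernel's symlink-follow limit (40 on Linux, which matches the length of the path in the traceback).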
Thank you for that! Your patience and support are very much appreciated :)
from wordfence-cli.
Thanks for working with me to diagnose that! We need to handle this better within Wordfence CLI. I'm going to leave this issue open for now and update it to reflect that we need additional handling for recursive symlinks.
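One possible shape for that handling, as a sketch only: it is not the change that was made in Wordfence CLI, and the `search_directory_safely` name and `ancestors` parameter are hypothetical. It assumes a directory can be identified by its device and inode numbers, and it stops descending when a directory is already one of its own ancestors:

```python
import os
import sys
from typing import FrozenSet, Iterator, Tuple


def search_directory_safely(
    path: str,
    ancestors: FrozenSet[Tuple[int, int]] = frozenset(),
) -> Iterator[str]:
    """Yield file paths under `path`, refusing to re-enter a directory it is already inside."""
    info = os.stat(path)  # follows symlinks, so a link and its target share a key
    key = (info.st_dev, info.st_ino)
    if key in ancestors:
        print(f'Recursive symlink detected at {path}', file=sys.stderr)
        return
    with os.scandir(path) as contents:
        for item in contents:
            if item.is_dir():
                yield from search_directory_safely(item.path, ancestors | {key})
            elif item.is_file():
                yield item.path


if __name__ == '__main__':
    for found in search_directory_safely(sys.argv[1]):
        print(found)
```

Tracking only the ancestors of the current directory, rather than every directory seen so far, means a symlink to a sibling directory is still followed and only true loops are reported.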
from wordfence-cli.
That exception is raised from another exception that should be an OSError and appear earlier in the output. Can you share that part of the message?
from wordfence-cli.
Thanks for replying.
I'm not seeing any other error in the output before the exception I posted. Sent all debug output to a file, and grepped the file for OSError, OSE, Error, and I'm seeing nothing.
from wordfence-cli.
The error would be written to stderr rather than stdout. Just to confirm, did you use something like 2>&1 to capture the stderr output as well?
It initially looks like a permission issue, but that should impact your find usage as well. What are the permissions on /var/www/webpages? I noticed in your find command, you're using /var/www/webpages whereas when running Wordfence CLI you're using /var/www/websites. Are you using the same path in both tests?
from wordfence-cli.
My apologies--the /var/www/webpages and /var/www/websites are standing in for a specific directory name which I redacted. To confirm, I used the same path in every test. This path contains about 100 subdirectories, each of which is a personal homepage for a unique user.
I did use 2>&1 to redirect all output to a text file.
I agree that it felt like a permissions issue, but the fact that I can iterate through every single subdirectory without issue with a for loop seems to suggest otherwise. Each subdirectory within /var/www/websites has the same owner and group--there are no exceptions.
/var/www
|-- websites
| |-- user1_homepage
| |-- user2_homepage
| |-- user3_homepage
...
| |-- user99_homepage
| `-- user100_homepage
from wordfence-cli.
What are the modes on each directory tier? /var, /var/www, /var/www/websites, /var/www/websites/userX_homepage
I suspect there's a directory in that hierarchy that has read without execute. By creating a similar structure and setting the permissions for /var/www/websites to 444, I am able to recreate the behavior you've encountered where the scanner is unable to find files, but shell globbing (as in your for d in /var/www/websites/* example) does still work.
Directories should have the execute permission. Adding that should fix the scanning issue, assuming it's missing.
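A quick way to check every tier at once is sketched below. The helper names `classes_missing_execute` and `report_tiers` are hypothetical, and the check only looks at the classic mode bits, so it assumes no ACLs or other access controls are involved:

```python
import os
import stat
import sys
from typing import List


def classes_missing_execute(mode: int) -> List[str]:
    """Return the permission classes that have read but not execute (search) set."""
    checks = [
        ('owner', stat.S_IRUSR, stat.S_IXUSR),
        ('group', stat.S_IRGRP, stat.S_IXGRP),
        ('other', stat.S_IROTH, stat.S_IXOTH),
    ]
    return [name for name, read, execute in checks if mode & read and not mode & execute]


def report_tiers(path: str) -> None:
    """Print the mode of `path` and of each ancestor, flagging read-without-execute tiers."""
    current = os.path.abspath(path)
    while True:
        mode = stat.S_IMODE(os.stat(current).st_mode)
        flagged = classes_missing_execute(mode)
        note = f'  <- read without execute for: {", ".join(flagged)}' if flagged else ''
        print(f'{mode:03o} {current}{note}')
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent


if __name__ == '__main__':
    report_tiers(sys.argv[1])
```

Run against /var/www/websites/userX_homepage, this prints one line per tier from the homepage directory up to /, so a directory set to 444 stands out immediately.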
from wordfence-cli.
Thanks for that idea.
So normally everything within /var/www and /var/www/websites is 750, with a few 770 and 755.
I tried setting everything to 777 just to test, and I got the exact same errors as reported above happening in the exact same place.
I also want to reconfirm that every one of the homepage directories (/var/www/websites/userX_homepage) is owned by the same user and group. So it is odd that scanning should fail for some and not for others.
from wordfence-cli.
It sounds like the permissions should be OK, then. There aren't any additional permission controls like ACLs in place, are there? Is the user who is running Wordfence CLI the same as the owner of the directories?
Can you confirm that find /var/www/websites -type f yields the expected results without permission issues?
from wordfence-cli.
No ACLs in place. I've been running the testing as the root user. find /var/www/websites -type f does return everything without complaint.
We do use the apache module mpm_itk (http://mpm-itk.sesse.net/), and so /var/www/websites has a uid and gid that is not just unique to it and its subdirectories, but is also not in use anywhere else on the system. Its home is /var/www/websites, and when I attempted to run the wordfence scan as this user, I first was prompted to initialize the ~/.config and ~/.cache directories. The debug output is identical to my previous tests, the only difference being that this test was only scanning /var/www/websites, and not /var/www/, as this user has no permissions for /var/www/. I then also ran find /var/www/websites -type f as this user, and as with the root user it returned everything without complaint.
Not being familiar with the wordfence code, it almost feels as if a buffer were getting filled up. Does that seem possible?
from wordfence-cli.
If this issue occurs when running as root, it's almost certainly not a permission issue. The queue used for sending located files to the actual scan workers does have a fixed size (currently 1,000), but that should just block until the workers process the queue; it shouldn't trigger an OSError (which is the underlying cause of the error you're seeing).
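The blocking behavior of a fixed-size queue can be illustrated with the standard library. This is a sketch only: Wordfence CLI's actual queue and worker code are not shown in this thread, so `queue.Queue`, the `run_pipeline` helper, and the tiny `maxsize` of 2 are stand-ins chosen for illustration.

```python
import queue
import threading
from typing import Iterable, List


def run_pipeline(paths: Iterable[str], maxsize: int = 2) -> List[str]:
    """Push `paths` through a bounded queue to one worker and return what it processed."""
    pending: queue.Queue = queue.Queue(maxsize=maxsize)
    processed: List[str] = []

    def worker() -> None:
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more work
                break
            processed.append(item)

    thread = threading.Thread(target=worker)
    thread.start()
    for path in paths:
        pending.put(path)  # waits while the queue is full; it does not raise
    pending.put(None)
    thread.join()
    return processed


if __name__ == '__main__':
    print(len(run_pipeline(f'file{i}' for i in range(10))))
```

Even with far more items than the queue can hold, the producer simply pauses until the worker catches up, which is why a full queue on its own would not explain an OSError.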
Do you have the ability to run Python on this system? I've put together a test script with a simplified version of the file locator so hopefully we can actually capture the underlying error.
import os
import sys


def search_directory(path: str):
    contents = os.scandir(path)
    for item in contents:
        if item.is_dir():
            yield from search_directory(item.path)
        elif item.is_file():
            yield item.path


def locate(path: str):
    real_path = os.path.realpath(path)
    if os.path.isdir(real_path):
        for path in search_directory(real_path):
            print(path)
    else:
        print(real_path)


path = sys.argv[1]
print(f'Base path: {path}')
locate(path=path)
You can save this script to a file and then run the following:
python3 /path/to/script.py /var/www/websites
That should yield an OSError that will help us identify the underlying cause.
from wordfence-cli.
To prevent the need for debugging such issues outside of the Wordfence CLI tool in the future, I've added #31 to the next milestone.
from wordfence-cli.
Created 5 symlinks to test a variety of scenarios, and verified error no longer occurs, and instead a notification appears in the output as "Recursive symlink detected at /path/being/scanned" for each occurrence:
- Parent directory just outside of root of scan path
- Different path than scan path
- Recursive - same as root of scan path
- Recursive - from subdirectory to parent directory of scan path
- Recursive - from subdirectory to self of scan path
Symlinks that are not recursive do not present a warning and scan without error.
Additional testing included a variety of file paths with both stdout and progress output, adding in options that include number of workers, file types to include, large and small scale malware findings, debug, verbose, routing stderr to file, defining output-path, as root, as user, with and without allow-io-errors option.
from wordfence-cli.