Comments (9)
Hi,
First of all, I suggest looking at the operations log inside the MooseFS client.
Simply execute:
cat /home/data/logs/nginx/.oplog
This shows what is really going on inside mfsmount and which operations are being executed.
Please send us the output of the .oplog MooseFS object.
It looks like your Nginx is using a cache and executes a flush operation from time to time.
Please check the "flush" parameter in your Nginx configuration:
http://nginx.org/en/docs/http/ngx_http_log_module.html
from moosefs.
Hi,
The oplog looks like this:
02.17 09:23:17.971750: uid:0 gid:0 pid:24716 cmd:write (292010,201,569577): OK (201)
02.17 09:23:17.999473: uid:0 gid:0 pid:24716 cmd:write (292010,288,569778): OK (288)
02.17 09:23:18.014322: uid:0 gid:0 pid:24716 cmd:write (292010,240,570066): OK (240)
02.17 09:23:18.031706: uid:0 gid:0 pid:24716 cmd:write (292010,241,570306): OK (241)
02.17 09:23:18.124160: uid:0 gid:0 pid:24716 cmd:write (292010,355,570547): OK (355)
02.17 09:23:18.129366: uid:0 gid:0 pid:24716 cmd:write (292010,201,570902): OK (201)
02.17 09:23:18.178810: uid:0 gid:0 pid:24716 cmd:write (292010,313,571103): OK (313)
02.17 09:23:18.203245: uid:0 gid:0 pid:24716 cmd:write (292010,345,571416): OK (345)
02.17 09:23:18.219016: uid:0 gid:0 pid:24716 cmd:write (292010,199,571761): OK (199)
02.17 09:23:18.264838: uid:0 gid:0 pid:24716 cmd:write (292010,243,571960): OK (243)
02.17 09:23:18.299924: uid:0 gid:0 pid:24716 cmd:write (292010,218,572203): OK (218)
02.17 09:23:18.323139: uid:0 gid:0 pid:24716 cmd:write (292010,250,572421): OK (250)
02.17 09:23:18.331646: uid:0 gid:0 pid:24716 cmd:write (292010,249,572671): OK (249)
02.17 09:23:18.347683: uid:0 gid:0 pid:24716 cmd:write (292010,273,572920): OK (273)
02.17 09:23:18.359451: uid:0 gid:0 pid:24716 cmd:write (292010,256,573193): OK (256)
02.17 09:23:18.364929: uid:0 gid:0 pid:24716 cmd:write (292010,271,573449): OK (271)
02.17 09:23:18.368938: uid:0 gid:0 pid:24716 cmd:write (292010,268,573720): OK (268)
02.17 09:23:18.375024: uid:0 gid:0 pid:24716 cmd:write (292010,250,573988): OK (250)
02.17 09:23:18.378074: uid:0 gid:0 pid:24716 cmd:write (292010,249,574238): OK (249)
02.17 09:23:18.391258: uid:0 gid:0 pid:24716 cmd:write (292010,269,574487): OK (269)
02.17 09:23:18.441630: uid:0 gid:0 pid:24716 cmd:write (292010,284,574756): OK (284)
02.17 09:23:18.466238: uid:0 gid:0 pid:24716 cmd:write (292010,331,575040): OK (331)
02.17 09:23:18.466359: uid:0 gid:0 pid:24716 cmd:write (292010,352,575371): OK (352)
02.17 09:23:18.483821: uid:0 gid:0 pid:24716 cmd:write (292010,276,575723): OK (276)
02.17 09:23:18.498730: uid:0 gid:0 pid:24716 cmd:write (292010,374,575999): OK (374)
02.17 09:23:18.504726: uid:0 gid:0 pid:24716 cmd:write (292010,277,576373): OK (277)
02.17 09:23:18.517666: uid:0 gid:0 pid:24716 cmd:write (292010,276,576650): OK (276)
02.17 09:23:18.521152: uid:0 gid:0 pid:24716 cmd:write (292010,276,576926): OK (276)
02.17 09:23:18.522902: uid:0 gid:0 pid:24716 cmd:write (292010,206,577202): OK (206)
02.17 09:23:18.529218: uid:0 gid:0 pid:24716 cmd:write (292010,387,577408): OK (387)
02.17 09:23:18.530717: uid:0 gid:0 pid:24716 cmd:write (292010,298,577795): OK (298)
02.17 09:23:18.544620: uid:0 gid:0 pid:24716 cmd:write (292010,323,578093): OK (323)
02.17 09:23:18.581265: uid:0 gid:0 pid:24716 cmd:write (292010,236,578416): OK (236)
02.17 09:23:18.606531: uid:0 gid:0 pid:24716 cmd:write (292010,201,578652): OK (201)
02.17 09:23:18.671402: uid:0 gid:0 pid:24716 cmd:write (292010,308,578853): OK (308)
02.17 09:23:18.703426: uid:0 gid:0 pid:24716 cmd:write (292010,198,579161): OK (198)
02.17 09:23:18.776515: uid:0 gid:0 pid:24716 cmd:write (292010,199,579359): OK (199)
02.17 09:23:18.803434: uid:0 gid:0 pid:24716 cmd:write (292010,348,579558): OK (348)
02.17 09:23:18.847969: uid:0 gid:0 pid:24716 cmd:write (292010,199,579906): OK (199)
02.17 09:23:18.903167: uid:0 gid:0 pid:24716 cmd:write (292010,308,580105): OK (308)
02.17 09:23:18.982354: uid:0 gid:0 pid:24716 cmd:write (292010,425,580413): OK (425)
02.17 09:23:18.990796: uid:0 gid:0 pid:24716 cmd:write (292010,200,580838): OK (200)
02.17 09:23:18.992178: uid:0 gid:0 pid:24716 cmd:write (292010,292,581038): OK (292)
02.17 09:23:19.017782: uid:0 gid:0 pid:24716 cmd:write (292010,271,581330): OK (271)
02.17 09:23:19.024706: uid:0 gid:0 pid:24716 cmd:write (292010,297,581601): OK (297)
02.17 09:23:19.049587: uid:0 gid:0 pid:24716 cmd:write (292010,295,581898): OK (295)
02.17 09:23:19.203131: uid:0 gid:0 pid:24716 cmd:write (292010,209,582193): OK (209)
02.17 09:23:19.251906: uid:0 gid:0 pid:24716 cmd:write (292010,305,582402): OK (305)
02.17 09:23:19.276591: uid:0 gid:0 pid:24716 cmd:write (292010,324,582707): OK (324)
02.17 09:23:19.284785: uid:0 gid:0 pid:24716 cmd:write (292010,199,583031): OK (199)
02.17 09:23:19.377134: uid:0 gid:0 pid:24716 cmd:write (292010,296,583230): OK (296)
02.17 09:23:19.382819: uid:0 gid:0 pid:24716 cmd:write (292010,197,583526): OK (197)
02.17 09:23:19.382937: uid:0 gid:0 pid:24716 cmd:write (292010,289,583723): OK (289)
02.17 09:23:19.403451: uid:0 gid:0 pid:24716 cmd:write (292010,313,584012): OK (313)
02.17 09:23:19.460539: uid:0 gid:0 pid:24716 cmd:write (292010,210,584325): OK (210)
02.17 09:23:19.486978: uid:0 gid:0 pid:24716 cmd:write (292010,298,584535): OK (298)
02.17 09:23:19.547490: uid:0 gid:0 pid:24716 cmd:write (292010,306,584833): OK (306)
02.17 09:23:19.604439: uid:0 gid:0 pid:24716 cmd:write (292010,316,585139): OK (316)
02.17 09:23:19.621328: uid:0 gid:0 pid:24716 cmd:write (292010,314,585455): OK (314)
02.17 09:23:19.648566: uid:0 gid:0 pid:24716 cmd:write (292010,201,585769): OK (201)
02.17 09:23:19.764895: uid:0 gid:0 pid:24716 cmd:write (292010,329,585970): OK (329)
02.17 09:23:19.828972: uid:0 gid:0 pid:24716 cmd:write (292010,323,586299): OK (323)
02.17 09:23:19.834529: uid:0 gid:0 pid:24716 cmd:write (292010,197,586622): OK (197)
02.17 09:23:19.870466: uid:0 gid:0 pid:24716 cmd:write (292010,319,586819): OK (319)
02.17 09:23:19.954825: uid:0 gid:0 pid:24716 cmd:write (292010,403,587138): OK (403)
02.17 09:23:19.965577: uid:0 gid:0 pid:24716 cmd:write (292010,404,587541): OK (404)
02.17 09:23:19.981905: uid:0 gid:0 pid:24716 cmd:write (292010,401,587945): OK (401)
...
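A minimal sketch for summarizing write lines like the ones above. It assumes the three numbers in parentheses are (inode, size, offset), which is inferred from the sample (each offset equals the previous offset plus the previous size) rather than taken from MooseFS documentation. The names `parse_writes`, `is_sequential_append` and `writes_per_second` are hypothetical helpers, not part of any MooseFS tool.

```python
import re
from typing import NamedTuple

# Pattern inferred from the sample oplog lines; the field order
# (inode, size, offset) is an assumption based on the numbers shown.
WRITE_RE = re.compile(
    r"^\d\d\.\d\d (\d\d):(\d\d):(\d\d\.\d+): uid:\d+ gid:\d+ pid:(\d+) "
    r"cmd:write \((\d+),(\d+),(\d+)\): OK \((\d+)\)$"
)

class Write(NamedTuple):
    seconds: float  # seconds since midnight (the date part is ignored)
    pid: int
    inode: int
    size: int
    offset: int

def parse_writes(lines):
    """Return a Write for every successful write line; skip everything else."""
    writes = []
    for line in lines:
        m = WRITE_RE.match(line.strip())
        if m is None:
            continue
        hh, mm, ss, pid, inode, size, offset, _written = m.groups()
        seconds = int(hh) * 3600 + int(mm) * 60 + float(ss)
        writes.append(Write(seconds, int(pid), int(inode), int(size), int(offset)))
    return writes

def is_sequential_append(writes):
    """True if every write starts exactly where the previous one ended."""
    return all(a.offset + a.size == b.offset for a, b in zip(writes, writes[1:]))

def writes_per_second(writes):
    """Average write rate over the parsed span (0.0 if it cannot be computed)."""
    if len(writes) < 2:
        return 0.0
    elapsed = writes[-1].seconds - writes[0].seconds
    return (len(writes) - 1) / elapsed if elapsed > 0 else 0.0

if __name__ == "__main__":
    sample = [
        "02.17 09:23:17.971750: uid:0 gid:0 pid:24716 cmd:write (292010,201,569577): OK (201)",
        "02.17 09:23:17.999473: uid:0 gid:0 pid:24716 cmd:write (292010,288,569778): OK (288)",
        "02.17 09:23:18.014322: uid:0 gid:0 pid:24716 cmd:write (292010,240,570066): OK (240)",
    ]
    parsed = parse_writes(sample)
    print(len(parsed), is_sequential_append(parsed), writes_per_second(parsed))
```

Applied to the 67 lines quoted above, this pattern of small, strictly sequential writes comes to roughly 33 writes per second from a single pid.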
And when I cat the nginx logs from another machine, it sometimes returns:
cat: 114.log: Input/output error
The corresponding oplog entries are as follows:
02.17 09:36:46.840322: uid:0 gid:0 pid:0 cmd:invalidate cache (292010:0:67108864): ok
02.17 09:36:46.840817: uid:0 gid:0 pid:0 cmd:invalidate cache (292010:0:67108864): ok
02.17 09:36:47.437919: uid:0 gid:0 pid:8181 cmd:opendir (1): OK [handle:00000001]
02.17 09:36:47.438748: uid:0 gid:0 pid:8181 cmd:readdir (1,4096,270): OK (200)
02.17 09:36:47.438825: uid:0 gid:0 pid:8181 cmd:readdir (1,4096,270): OK (no data)
02.17 09:36:47.438880: uid:0 gid:0 pid:0 cmd:releasedir (1): OK
02.17 09:36:47.439335: uid:0 gid:0 pid:8181 cmd:getattr (1): OK (1.0,[drwxr-xr-x:0040755,2,65534,65534,1487295407,1487294549,1487294549,2002734])
02.17 09:36:47.439913: uid:0 gid:0 pid:8181 cmd:lookup (1,114.log): OK (0.0,292010,1.0,[-rw-r--r--:0100644,1,65534,65534,1487294770,1487295407,1487295407,10051099])
02.17 09:36:47.440560: uid:0 gid:0 pid:8181 cmd:lookup (1,114.log): OK (0.0,292010,1.0,[-rw-r--r--:0100644,1,65534,65534,1487294770,1487295407,1487295407,10051099])
02.17 09:36:47.994046: uid:0 gid:0 pid:8195 cmd:lookup (1,114.log): OK (0.0,292010,1.0,[-rw-r--r--:0100644,1,65534,65534,1487294770,1487295407,1487295407,10051099])
02.17 09:36:47.994152: uid:0 gid:0 pid:8195 cmd:open (292010) (using cached data from lookup): OK (direct_io:0,keep_cache:0) [handle:02000001]
02.17 09:36:52.994360: uid:0 gid:0 pid:8195 cmd:read (292010,131072,0): EIO (Input/output error)
02.17 09:36:52.994431: uid:0 gid:0 pid:8195 cmd:read (292010,131072,131072): EIO (Input/output error)
02.17 09:36:52.994572: uid:0 gid:0 pid:8195 cmd:read (292010,4096,0): EIO (Input/output error)
02.17 09:36:52.994694: uid:0 gid:0 pid:8195 cmd:flush (292010): OK
02.17 09:36:52.994731: uid:0 gid:0 pid:0 cmd:release (292010): OK
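In the trace above, the first failed read is logged about 5 seconds after the open. A small sketch for pulling that gap out of an oplog follows; `parse_line` and `stall_before_first_eio` are hypothetical helper names, and the line layout is assumed from the sample only. The log itself does not say what the client was waiting for during that time.

```python
import re

# Generic oplog line layout, assumed from the sample lines only.
LINE_RE = re.compile(
    r"^\d\d\.\d\d (\d\d):(\d\d):(\d\d\.\d+): uid:\d+ gid:\d+ pid:(\d+) cmd:(.*)$"
)

def parse_line(line):
    """Return (seconds_since_midnight, pid, command_text) or None."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    hh, mm, ss, pid, text = m.groups()
    return int(hh) * 3600 + int(mm) * 60 + float(ss), int(pid), text

def failed_reads(lines):
    """Return the parsed entries for read operations that ended in EIO."""
    result = []
    for line in lines:
        parsed = parse_line(line)
        if parsed and parsed[2].startswith("read (") and "EIO" in parsed[2]:
            result.append(parsed)
    return result

def stall_before_first_eio(lines):
    """Map pid -> seconds between its latest open and its first EIO read."""
    opened = {}
    stalls = {}
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        seconds, pid, text = parsed
        if text.startswith("open ("):
            opened[pid] = seconds
        elif (text.startswith("read (") and "EIO" in text
              and pid in opened and pid not in stalls):
            stalls[pid] = seconds - opened[pid]
    return stalls

if __name__ == "__main__":
    sample = [
        "02.17 09:36:47.994152: uid:0 gid:0 pid:8195 cmd:open (292010) (using cached data from lookup): OK (direct_io:0,keep_cache:0) [handle:02000001]",
        "02.17 09:36:52.994360: uid:0 gid:0 pid:8195 cmd:read (292010,131072,0): EIO (Input/output error)",
        "02.17 09:36:52.994431: uid:0 gid:0 pid:8195 cmd:read (292010,131072,131072): EIO (Input/output error)",
    ]
    for pid, gap in stall_before_first_eio(sample).items():
        print(f"pid {pid}: {gap:.3f} s between open and first EIO")
```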
Sometimes it returns correctly, but the data is delayed by nearly 20 seconds.
I do not use a cache (or a buffer) for the access log.
My access log config is:
access_log /home/data/logs/access.log main;
Thanks for your help!
from moosefs.
Hi,
I suggest updating MooseFS to version 3.0.88.
Please check whether there are any errors on your NIC or on the switch port.
It is also a good idea to check the system log on the MooseFS client machine.
from moosefs.
Hi,
Do you have any updates regarding the EIO problem?
from moosefs.
Hi,
Yes, I updated, but the problem still exists. I suspect it is caused by the high rate of write operations, so I changed the nginx log directory.
from moosefs.
OK.
Thank you for this information.
Personally, I believe the problem is connected with some other aspects.
You can also try mounting the MooseFS client in DIRECT mode, like this:
mfsmount -o mfscachemode=DIRECT -H master.host.name /mnt/mfs
or set an extra attribute on specific objects, such as the log folder:
mfsseteattr -r -f nodatacache /mnt/mfs/log_folder
This option does not require remounting mfsmount.
Could you tell us your current hardware configuration: the number of chunkservers, the HDDs, and the LAN?
Also, is your MooseFS cluster installed on bare metal or on VMs?
By the way, we have MooseFS clients handling over 28000 write operations per minute in production environments.
from moosefs.
Hi,
Thanks for your help. I will try again with "mfscachemode=DIRECT".
My system configuration is as follows:
1 Master (Mem 64G)
3 chunkservers (250G, 250G, 1.8T), RAID1
Network bandwidth: 1G
The MooseFS cluster is installed on bare metal
The MFS client runs in a Docker container
from moosefs.
Thanks,
I will check the scenario with mfsmount and Docker.
from moosefs.
Hi,
DIRECT mode solves the problem!
After I switched to DIRECT mode, the delay dropped from 20 seconds to less than 1 second.
Thanks for your help!
from moosefs.