Comments (17)
Following these steps:
- using
client.create(file_path, WriteOptions::default()).await?.close().await?;
to create an empty file, and then appending the data into this file with
self.client.append(file_path).await?.write(data).await?
from hdfs-native.
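The two steps above could be sketched roughly as follows. This is only a sketch built from the calls shown in this thread; the exact `FileWriter` method signatures (e.g. `write` taking `bytes::Bytes`) and the error type are assumptions.

```rust
use bytes::Bytes;
use hdfs_native::{Client, WriteOptions};

// Sketch: create an empty file, then reopen it for append and write the data.
async fn create_then_append(
    client: &Client,
    file_path: &str,
    data: Bytes,
) -> Result<(), Box<dyn std::error::Error>> {
    // Step 1: create an empty file and close the writer to release the lease.
    client
        .create(file_path, WriteOptions::default())
        .await?
        .close()
        .await?;

    // Step 2: reopen the file in append mode and write.
    let mut writer = client.append(file_path).await?;
    writer.write(data).await?;

    // Closing flushes the data and releases the lease on the file.
    writer.close().await?;
    Ok(())
}
```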
And when using append, the error log is as follows:
And the code is here: https://github.com/zuston/incubator-uniffle/blob/hdfs-native/rust/experimental/server/src/store/hdfs.rs
Thanks for this; the write path isn't currently very resilient to failures and could definitely use some improvements. I wanted to get a bare minimum write path working, but there are all kinds of situations the DFSOutputStream tries to handle. Some info that would be useful:
- Does this happen deterministically or was this a random error you ran into?
- Any stats about the size of the file you're appending to or how many replicas.
- What version of Hadoop is this talking to?
- What features does your cluster use? (security, HA namenodes, RBF, etc.)
Does this happen deterministically or was this a random error you ran into?
Deterministically.
Any stats about the size of the file you're appending to or how many replicas.
Append to an empty file, with 3 replicas.
We use Hadoop 3.2.1 and enable Kerberos, HA namenodes, and RBF.
Append to an empty file, with 3 replicas
What do you mean by append to empty file? Does the file already exist with no size and you open for appending? Or does the file not exist yet?
Can you run with debug-level Rust logs and share the output? It should include all RPC messages being sent to the namenode. That exact scenario is part of the integration tests, so I'm not sure why it's still holding onto the lease.
Actually I took a look at your code again. Is this function: https://github.com/zuston/incubator-uniffle/blob/hdfs-native/rust/experimental/server/src/store/hdfs.rs#L291 missing a close after appending? That could cause the issue you're seeing, if you're doing multiple appends.
Oh, yes. I don't close this client.
Do you mean every append operation should be closed? And is the client thread-safe for multiple appends at the same time?
Do you mean every append operation should be closed? And is the client thread-safe for multiple appends at the same time?
append opens a writer in append mode. If you keep the writer around you can write to it multiple times, but you need to close it when you're done. There's no way currently to guarantee a read will see the data until you close.
Not sure exactly what you're asking about thread safety. You can't open the same file for append multiple times, but you could share a writer across threads behind a mutex.
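The "writer behind a mutex" suggestion could look roughly like this. It's a sketch only, assuming tokio, the hdfs-native calls shown in this thread, and that `FileWriter` is `Send`; the `write` signature is an assumption.

```rust
use std::sync::Arc;

use bytes::Bytes;
use hdfs_native::Client;
use tokio::sync::Mutex;

// Sketch: one append writer for a single file, shared across tasks behind a
// tokio Mutex. HDFS allows only one open appender per file, so the mutex
// serializes the writes.
async fn shared_append(client: &Client, path: &str) -> Result<(), Box<dyn std::error::Error>> {
    let writer = Arc::new(Mutex::new(client.append(path).await?));

    let mut handles = Vec::new();
    for i in 0..4u32 {
        let writer = Arc::clone(&writer);
        handles.push(tokio::spawn(async move {
            // Only one task holds the writer at a time.
            let mut w = writer.lock().await;
            w.write(Bytes::from(format!("chunk {i}\n"))).await
        }));
    }
    for h in handles {
        h.await??;
    }

    // Close once, after all writes: readers are only guaranteed to see the
    // data after close.
    writer.lock().await.close().await?;
    Ok(())
}
```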
append opens a writer in append mode. If you keep the writer around you can write to it multiple times, but you need to close it when you're done. There's no way currently to guarantee a read will see the data until you close.
Got it.
Not sure exactly what you're asking about thread safety. You can't open the same file for append multiple times, but you could share a writer across threads behind a mutex.
The client can be shared across multiple threads to append to different files at the same time, right?
The client can be shared across multiple threads to append to different files at the same time, right?
Ah yes that is the case!
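That pattern could be sketched like this: one `Client` in an `Arc`, with each task appending to a different file. Again a sketch, assuming tokio and the API calls shown in this thread.

```rust
use std::sync::Arc;

use bytes::Bytes;
use hdfs_native::Client;

// Sketch: a single Client shared across tasks, each task appending to a
// *different* file (HDFS allows one open appender per file).
async fn append_many(
    client: Arc<Client>,
    paths: Vec<String>,
) -> Result<(), Box<dyn std::error::Error>> {
    let mut handles = Vec::new();
    for path in paths {
        let client = Arc::clone(&client);
        handles.push(tokio::spawn(async move {
            let mut writer = client.append(&path).await?;
            writer.write(Bytes::from_static(b"payload\n")).await?;
            // Close to release the lease so the file can be appended later.
            writer.close().await
        }));
    }
    for h in handles {
        h.await??;
    }
    Ok(())
}
```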
Thanks @Kimahriman, let me give it a try!
Now, it works.
Awesome! Maybe I'll add a warning if the file writer is dropped without closing
Awesome! Maybe I'll add a warning if the file writer is dropped without closing
Maybe it should close implicitly via the Drop trait.
Awesome! Maybe I'll add a warning if the file writer is dropped without closing
Maybe it should close implicitly via the Drop trait.
Yeah, that would be nice, but it's tricky since it's async. I think the only way would be spawning a fire-and-forget task without error checking?
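The fire-and-forget idea could be sketched like this. `FileWriter` here is a stand-in type, not the real hdfs-native one, and the pattern assumes a tokio runtime is alive when the drop happens.

```rust
// Stand-in for an async file writer; not the real hdfs-native type.
struct FileWriter;

impl FileWriter {
    async fn close(&mut self) -> std::io::Result<()> {
        // A real implementation would flush data and release the lease here.
        Ok(())
    }
}

// Wrapper that best-effort closes the writer if it is dropped without an
// explicit close.
struct AutoClose {
    inner: Option<FileWriter>,
}

impl AutoClose {
    // Explicit close: errors are reported to the caller.
    async fn close(mut self) -> std::io::Result<()> {
        self.inner.take().expect("already closed").close().await
    }
}

impl Drop for AutoClose {
    fn drop(&mut self) {
        if let Some(mut writer) = self.inner.take() {
            // Fire-and-forget: errors are silently dropped, and this panics if
            // no tokio runtime is running -- part of why async Drop is tricky.
            tokio::spawn(async move {
                let _ = writer.close().await;
            });
        }
    }
}
```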