Comments (17)

zuston commented on July 19, 2024

I followed these steps:

  1. Create an empty file with client.create(file_path, WriteOptions::default()).await?.close().await?;
  2. Then append data to this file with self.client.append(file_path).await?.write(data).await?

from hdfs-native.

zuston commented on July 19, 2024

When calling append, the following error is logged:

[image: screenshot of the error log]

zuston commented on July 19, 2024

And the code is here: https://github.com/zuston/incubator-uniffle/blob/hdfs-native/rust/experimental/server/src/store/hdfs.rs

Kimahriman commented on July 19, 2024

Thanks for this. The write path isn't currently very resilient to failures and could definitely use some improvements. I wanted to get a bare-minimum write path working, but there are all kinds of situations that the DFSOutputStream tries to handle. Some info that would be useful:

  • Does this happen deterministically or was this a random error you ran into?
  • Any stats about the size of the file you're appending to or how many replicas.
  • What version of Hadoop is this talking to?
  • What features does your cluster use? (security, HA namenodes, RBF, etc.)

zuston commented on July 19, 2024

> Does this happen deterministically or was this a random error you ran into?

Deterministically.

> Any stats about the size of the file you're appending to or how many replicas.

Append to an empty file with 3 replicas.

We use Hadoop 3.2.1 with Kerberos, HA NameNodes, and RBF enabled.

Kimahriman commented on July 19, 2024

> Append to an empty file with 3 replicas

What do you mean by append to empty file? Does the file already exist with no size and you open for appending? Or does the file not exist yet?

Kimahriman commented on July 19, 2024

Can you run with debug Rust logs and share the output? It should include all RPC messages exchanged with the NameNode. That exact sequence is part of the integration tests, so I'm not sure why it's still holding onto the lease.

Kimahriman commented on July 19, 2024

Actually I took a look at your code again. Is this function: https://github.com/zuston/incubator-uniffle/blob/hdfs-native/rust/experimental/server/src/store/hdfs.rs#L291 missing a close after appending? That could cause the issue you're seeing, if you're doing multiple appends.

zuston commented on July 19, 2024

> https://github.com/zuston/incubator-uniffle/blob/hdfs-native/rust/experimental/server/src/store/hdfs.rs#L291

Oh, yes. I don't close it.

Do you mean that every append operation should be closed? And is the client thread-safe for multiple appends at the same time?

Kimahriman commented on July 19, 2024

> Do you mean that every append operation should be closed? And is the client thread-safe for multiple appends at the same time?

append opens a writer in append mode. If you keep the writer around, you can write to it multiple times, but you need to close it when you're done. There's currently no way to guarantee that a read will see the data until you close.

I'm not sure exactly what you're asking about thread safety. You can't open the same file for append multiple times. You could share a writer across threads behind a mutex.
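
The lifecycle described above can be sketched without a cluster. The following is a minimal, std-only model, not the hdfs-native API: MockFs, MockWriter and demo are hypothetical names, the mock is synchronous where the real client is async, and the lease set only imitates a second append being refused while a writer is still open.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical in-memory stand-in for the file system (assumption, not hdfs-native).
#[derive(Default)]
struct MockFs {
    files: Mutex<HashMap<String, Vec<u8>>>,
    leases: Mutex<HashSet<String>>, // paths currently open for writing
}

/// Hypothetical writer that buffers data until close().
struct MockWriter {
    fs: Arc<MockFs>,
    path: String,
    buf: Vec<u8>,
}

impl MockFs {
    /// Opens `path` for append; fails while another writer still holds the lease.
    fn append(fs: &Arc<MockFs>, path: &str) -> Result<MockWriter, String> {
        if !fs.leases.lock().unwrap().insert(path.to_string()) {
            return Err(format!("lease already held for {}", path));
        }
        Ok(MockWriter { fs: Arc::clone(fs), path: path.to_string(), buf: Vec::new() })
    }

    /// Returns the data made visible by closed writers.
    fn read(&self, path: &str) -> Vec<u8> {
        self.files.lock().unwrap().get(path).cloned().unwrap_or_default()
    }
}

impl MockWriter {
    fn write(&mut self, data: &[u8]) {
        self.buf.extend_from_slice(data);
    }

    /// Publishes the buffered data and releases the lease.
    fn close(self) {
        let mut files = self.fs.files.lock().unwrap();
        files.entry(self.path.clone()).or_default().extend_from_slice(&self.buf);
        self.fs.leases.lock().unwrap().remove(&self.path);
    }
}

fn demo() -> (bool, usize) {
    let fs = Arc::new(MockFs::default());

    // An unclosed writer keeps the lease, so a second append on the same path fails.
    let mut first = MockFs::append(&fs, "/f").unwrap();
    first.write(b"a");
    let second_failed = MockFs::append(&fs, "/f").is_err();
    first.close();

    // One writer shared across threads behind a mutex, closed once at the end.
    let shared = Arc::new(Mutex::new(MockFs::append(&fs, "/f").unwrap()));
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let w = Arc::clone(&shared);
            thread::spawn(move || {
                w.lock().unwrap().write(b"xy");
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let writer = Arc::try_unwrap(shared).ok().unwrap().into_inner().unwrap();
    writer.close();

    (second_failed, fs.read("/f").len())
}

fn main() {
    let (second_failed, len) = demo();
    println!("second append failed: {}, bytes visible: {}", second_failed, len);
}
```

In this model, the failure in the thread corresponds to the first half of demo: the writer from an earlier append was never closed, so the next append on the same path is rejected.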

zuston commented on July 19, 2024

> append opens a writer in append mode. If you keep the writer around, you can write to it multiple times, but you need to close it when you're done. There's currently no way to guarantee that a read will see the data until you close.

Got it.

> I'm not sure exactly what you're asking about thread safety. You can't open the same file for append multiple times. You could share a writer across threads behind a mutex.

So the client can be shared across multiple threads to append to different files at the same time, right?

Kimahriman commented on July 19, 2024

> So the client can be shared across multiple threads to append to different files at the same time, right?

Ah yes that is the case!

zuston commented on July 19, 2024

Thanks @Kimahriman, let me give it a try!

zuston commented on July 19, 2024

Now, it works.

Kimahriman commented on July 19, 2024

Awesome! Maybe I'll add a warning if the file writer is dropped without closing

zuston commented on July 19, 2024

> Awesome! Maybe I'll add a warning if the file writer is dropped without closing

Maybe it should close implicitly in its Drop implementation.

Kimahriman commented on July 19, 2024

>> Awesome! Maybe I'll add a warning if the file writer is dropped without closing

> Maybe it should close implicitly in its Drop implementation.

Yeah, that would be nice, but it's tricky since closing is async. I think the only way would be to spawn a fire-and-forget task without error checking?
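
The warn-on-drop idea can be sketched with std only. GuardedWriter and its warnings counter are hypothetical; the counter stands in for a log call, and nothing here reflects how hdfs-native does or will implement this. Because Drop::drop is synchronous, it cannot await an async close, so this sketch only reports the mistake rather than closing for the caller.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical writer (assumption): `closed` records whether close() was called.
struct GuardedWriter {
    path: String,
    closed: bool,
    warnings: Arc<AtomicUsize>, // stands in for a logging call
}

impl GuardedWriter {
    fn new(path: &str, warnings: Arc<AtomicUsize>) -> Self {
        GuardedWriter { path: path.to_string(), closed: false, warnings }
    }

    /// A real writer would flush data and release the lease here.
    fn close(mut self) {
        self.closed = true;
        // `self` is dropped at the end of this function with `closed == true`.
    }
}

impl Drop for GuardedWriter {
    fn drop(&mut self) {
        if !self.closed {
            // drop() cannot await, so the best it can do synchronously is warn.
            self.warnings.fetch_add(1, Ordering::SeqCst);
            eprintln!("warning: writer for {} dropped without close()", self.path);
        }
    }
}

fn demo() -> usize {
    let warnings = Arc::new(AtomicUsize::new(0));

    // Closed properly: no warning is recorded.
    let ok = GuardedWriter::new("/a", Arc::clone(&warnings));
    ok.close();

    // Dropped without close: one warning is recorded.
    {
        let _leaked = GuardedWriter::new("/b", Arc::clone(&warnings));
    }

    warnings.load(Ordering::SeqCst)
}

fn main() {
    println!("warnings recorded: {}", demo());
}
```

An implicit close would need drop to hand the async close to a runtime task, which is the fire-and-forget approach mentioned above; any error from that close would have nowhere to go.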
