
Comments (6)

wajda commented on August 29, 2024

Your config and networking seem to be perfectly fine. There might be several reasons why it fails, and most (if not all) of them are likely related to the Glue environment. Many people have reported different issues related to specific proprietary Spark environments like Databricks or AWS. Unfortunately, because those platforms are closed source, it's sometimes extremely difficult to debug them, or to find qualified tech support that is well aware of the system internals. So, all I'm trying to do right now is find a fancy way to say - I don't know what breaks things for you.

One thing that I would suspect is bytecode incompatibility (check that the Scala and Spark versions for which the agent bundle is compiled match the ones used in your Glue).
Another possible thing is an XY problem - there might be another error causing the Glue shutdown, but because the shutdown happens too quickly, the original error might just not have a chance to be written into the logs.
Yet another thing I would try is to replace codeless init with the programmatic init method for Spline. While enabling the Spline listener via Spark configuration works perfectly fine for open-source Spark, proprietary forks sometimes don't cope well with it.

So, all in all, keep on looking.

from spline-spark-agent.

colonbrack3t commented on August 29, 2024

Thanks for the response @wajda, any chance you can point me towards programmatic init of the agent in PySpark? The Readme shows an example in Scala/Java, I think.
About the bytecode incompatibility, I'm using Glue 4.0, which has Scala 2.12 and Spark 3.3 (Python 3). I'm using the spark-3.3-spline-agent-bundle_2.12-2.0.0.jar, so I think in theory everything should be compatible?
Also, if I remove the Spline conf job parameters the job succeeds, so it's likely related to Spline - I think you're right about the unexpected shutdown meaning no useful logs get written, though.
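
For anyone wanting to double-check the same thing, here is a minimal sketch of the version comparison, assuming the bundle jar follows the naming pattern of the file mentioned above (spark-<spark>-spline-agent-bundle_<scala>-<agent>.jar). The helper names `parse_bundle_name` and `is_compatible` are made up for illustration and are not part of Spline. In a real job the runtime values would probably come from `spark.version` and, through py4j, `spark.sparkContext._jvm.scala.util.Properties.versionNumberString()`; here they are passed in as plain strings so the snippet runs without Spark.

```python
import re
from typing import NamedTuple


class BundleInfo(NamedTuple):
    spark: str   # Spark major.minor the bundle targets, e.g. "3.3"
    scala: str   # Scala major.minor the bundle is compiled for, e.g. "2.12"
    agent: str   # Spline agent version, e.g. "2.0.0"


# Assumed naming pattern, taken from the jar name quoted in this thread.
_BUNDLE_RE = re.compile(
    r"spark-(?P<spark>\d+\.\d+)-spline-agent-bundle_"
    r"(?P<scala>\d+\.\d+)-(?P<agent>[\w.\-]+)\.jar$"
)


def parse_bundle_name(jar_name: str) -> BundleInfo:
    """Extract the Spark, Scala and agent versions from a bundle jar name."""
    match = _BUNDLE_RE.search(jar_name)
    if match is None:
        raise ValueError(f"Unrecognised bundle name: {jar_name!r}")
    return BundleInfo(match["spark"], match["scala"], match["agent"])


def _major_minor(version: str) -> str:
    """Reduce a version such as '3.3.0' or '2.12.15' to 'major.minor'."""
    return ".".join(version.split(".")[:2])


def is_compatible(bundle: BundleInfo, runtime_spark: str, runtime_scala: str) -> bool:
    """True when the runtime Spark and Scala major.minor match the bundle's."""
    return (
        _major_minor(runtime_spark) == bundle.spark
        and _major_minor(runtime_scala) == bundle.scala
    )


bundle = parse_bundle_name("spark-3.3-spline-agent-bundle_2.12-2.0.0.jar")
print(bundle, is_compatible(bundle, "3.3.0", "2.12.15"))
```

A matching result only rules out the most obvious mismatch; it says nothing about whether a vendor build of Spark differs internally from the open-source one.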


wajda commented on August 29, 2024

any chance you can point me towards programmatic init of the agent in PySpark? The Readme shows an example in Scala/Java, I think.

It's very similar. Here's the example - https://github.com/AbsaOSS/spline-spark-agent/blob/develop/examples/src/main/python/python_example.py
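
For readers who cannot open the link, the call looks roughly like the sketch below. The class path `za.co.absa.spline.harvester.SparkLineageInitializer` and its `enableLineageTracking` method are written from memory of the agent's documentation, so treat them as an assumption and check them against the linked example. The wrapper function `enable_spline_tracking` is a hypothetical name used only here. The agent bundle jar still has to be on the job's classpath.

```python
def enable_spline_tracking(spark):
    """Enable Spline lineage tracking programmatically on a SparkSession.

    Assumption: the agent exposes the JVM class
    za.co.absa.spline.harvester.SparkLineageInitializer with an
    enableLineageTracking(sparkSession) method, reachable through py4j.
    """
    jvm = spark.sparkContext._jvm
    initializer = jvm.za.co.absa.spline.harvester.SparkLineageInitializer
    # _jsparkSession is the underlying JVM SparkSession behind the PySpark one.
    initializer.enableLineageTracking(spark._jsparkSession)
    return spark


# Typical use inside a job (not executed here, needs a Spark runtime):
#
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.appName("spline-demo").getOrCreate()
#   enable_spline_tracking(spark)
```

When switching to this style, it is probably worth removing the listener setting from the job's Spark configuration as well, so the agent is not initialised twice.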


colonbrack3t commented on August 29, 2024

Thanks @wajda, I've tried that and unfortunately it yields the same error. I have just tried a much older Spline agent version (0.7.13), which appears to work. I will probably try to find the highest agent version that works, to determine which patch/update is breaking things for me.


wajda commented on August 29, 2024

That would be a great help. Thank you!


colonbrack3t commented on August 29, 2024

It seems that 0.7.13 is the highest version that works in this situation. The next version up (1.0.0) causes the same error. Issue #602 describes a different problem in Glue 4.0 with version 1.0.0 that was then fixed in 1.1.0. However, 1.1.0 also yields the same result for me. I also don't see anything wildly different about their setup compared with my own.
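
The search for the highest working version can be sketched as a simple bisection. This is only an illustration: `job_succeeds` is a hypothetical callback standing in for "run the Glue job with this agent version and report whether it finished", and the version list contains only the releases named in this thread, not the full release history. It also assumes that once a version breaks, every later version stays broken, which may not hold in practice.

```python
from typing import Callable, Optional, Sequence


def highest_working(
    versions: Sequence[str],
    job_succeeds: Callable[[str], bool],
) -> Optional[str]:
    """Return the newest version for which job_succeeds is True.

    `versions` must be ordered oldest to newest. Assumes failures are
    monotonic: if one version fails, all newer versions fail too.
    """
    low, high = 0, len(versions) - 1
    best = None
    while low <= high:
        mid = (low + high) // 2
        if job_succeeds(versions[mid]):
            best = versions[mid]   # works, so look for something newer
            low = mid + 1
        else:
            high = mid - 1         # broken, so look at older versions
    return best


# Versions mentioned in this thread, oldest first.
candidates = ["0.7.13", "1.0.0", "1.1.0", "2.0.0"]

# Stand-in for real job runs, mirroring the results reported above.
reported = {"0.7.13": True, "1.0.0": False, "1.1.0": False, "2.0.0": False}
print(highest_working(candidates, lambda v: reported[v]))
```

With only a handful of releases a linear scan is just as practical; bisection mainly saves job runs when the list of candidate versions is long.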

