Sparklens for streaming (sparklens, open)

qubole commented on May 17, 2024
Sparklens for streaming

from sparklens.

Comments (4)

iamrohit commented on May 17, 2024

Thanks for bringing this up @dominikabasaj. This is definitely on our radar, and we will be adding support for streaming. I would encourage you to put on a PM hat and help us define the requirements and use cases for this feature. That will help us validate our thinking and make sure you get what you are looking for. CC: @itsvikramagr

iamrohit commented on May 17, 2024

@dominikabasaj

Here is one way to get it working with a streaming job. I haven't tried it with streaming yet. Let me know if this serves your purpose.

1. Start your application with --packages qubole:sparklens:0.1.2-s_2.11, but don't specify the spark.extraListeners config.
2. As part of your application, do the following:

import com.qubole.sparklens.QuboleNotebookListener
val QNL = new QuboleNotebookListener(sc.getConf)
sc.addSparkListener(QNL)

Basically, create a listener (note that this is the notebook listener, not the job listener) and register it.
3. Within your streaming function (whatever is called repeatedly), wrap your code in the following:

QNL.profileIt {
    //Your code here
}

Alternatively, if you need more control:

if (QNL.estimateSize() > QNL.getMaxDataSize()) {
  QNL.purgeJobsAndStages()
}
val startTime = System.currentTimeMillis
// Your Scala code here
val endTime = System.currentTimeMillis
// Wait for some time so that all events accumulate (Thread.sleep takes milliseconds)
Thread.sleep(QNL.getWaiTimeInSeconds() * 1000L)
println(QNL.getStats(startTime, endTime))
4. Check out https://github.com/qubole/sparklens/blob/master/src/main/scala/com/qubole/sparklens/QuboleNotebookListener.scala for more information.
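
For anyone unfamiliar with the wrapper pattern in step 3, here is a rough, Spark-free sketch of how a block can be passed by name and bracketed with timestamps. This is not the Sparklens implementation: `ProfileSketch` and `timed` are made-up names for illustration, and the summation is only a stand-in for the per-batch work.

```scala
object ProfileSketch {
  // Hypothetical helper, not part of Sparklens: runs `block` once and
  // returns its result together with the start and end timestamps (ms).
  def timed[R](block: => R): (R, Long, Long) = {
    val startTime = System.currentTimeMillis
    val result = block // the by-name argument is evaluated here
    val endTime = System.currentTimeMillis
    (result, startTime, endTime)
  }

  def main(args: Array[String]): Unit = {
    val (total, startTime, endTime) = timed {
      (1 to 1000).sum // stand-in for the per-batch streaming work
    }
    println(s"result=$total, window=[$startTime, $endTime]")
  }
}
```

The start and end timestamps are the kind of window that would then be handed to something like QNL.getStats(startTime, endTime) in the manual variant above.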

thanks!

akumarb2010 commented on May 17, 2024

Sorry for duplicating, but this issue is also related to streaming, so I thought I would add an update here.

We have tried using QuboleJobListener for Structured Streaming, but it only produces a report after the streaming query terminates, and the report covers all jobs together rather than per batch.

In general, because Structured Streaming applications run continuously, users and developers will want to see stats every few batches.

A detailed proposal is attached below. Please review and share your input.

Structured_streaming_sparklens.pdf

abhishekd0907 commented on May 17, 2024

@dominikabasaj @akumarb2010
You can check out our new project, Streaminglens, if you plan to use Sparklens for streaming applications.
