Hi, I think we can simplify default usage of Engine a bit - let's ma


Improvement proposal about jgit-spark-connector (CLOSED)

src-d commented on May 20, 2024

Improvement proposal

from jgit-spark-connector.

Comments (4)

eiso commented on May 20, 2024

I really like this idea. And I think it's closer to what @mcuadros was thinking of as well. /cc @marnovo @campoy


marnovo commented on May 20, 2024

Yes, looks more straightforward.

Two questions regarding the comparison to the current:

from sourced.engine import Engine
from pyspark.sql import SparkSession
from pyspark.sql.functions import *

spark = SparkSession.builder \
        .master("local[*]").appName("Examples") \
        .getOrCreate()

engine = Engine(spark, "/repositories")

In practical terms, what flexibility is lost compared to the current approach?
Is abstracting this away from the user more likely to break something under certain circumstances, or to make a problem harder to debug?


ajnavarro commented on May 20, 2024

It is not possible to make the proposed API changes to the engine because, when you start a spark-shell, a notebook, and so on, the Spark session is already provided. Because of that, we shouldn't create a new session inside the Engine wrapper; we should use the existing one.

On the other hand, the standard Python console is only one of several ways we could run the engine (JVM application, scala-shell, pyspark-shell, Jupyter Scala and Python notebooks, Zeppelin notebooks, and so on).

If it's really important to support the standard Python shell right now, we can do that, but I think we already have plenty of ways to use the engine very simply.
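To make the "use the existing session" point concrete, here is a minimal sketch. `resolve_session` and `default_session_factory` are hypothetical names, not part of the engine's API; the only real library call relied on is `SparkSession.builder.getOrCreate()`, which returns an already-running session if there is one and creates a new one otherwise.

```python
from typing import Any, Callable, Optional


def default_session_factory() -> Any:
    """Build a local session the same way the snippet in this thread does."""
    # Imported lazily so this helper can be loaded where pyspark is not installed.
    from pyspark.sql import SparkSession

    # getOrCreate() hands back the existing session when one is already running,
    # and only creates a new one otherwise.
    return (
        SparkSession.builder
        .master("local[*]")
        .appName("Examples")
        .getOrCreate()
    )


def resolve_session(
    provided: Optional[Any] = None,
    factory: Callable[[], Any] = default_session_factory,
) -> Any:
    """Hypothetical helper: prefer the session the environment supplies."""
    if provided is not None:
        return provided
    return factory()
```

In a pyspark-shell or notebook this would be used as `Engine(resolve_session(spark), "/repositories")`, and from a plain Python console as `Engine(resolve_session(), "/repositories")`. Either way the caller still sees and controls which session the engine gets, which is roughly the flexibility the current API keeps.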

"add pyspark to PYTHONPATH during installation"

I think this is really bad practice. Changing environment variables during a dependency download? I don't think that is even possible using PyPI.
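One alternative that leaves the environment untouched is to check at runtime whether pyspark can be imported and report a clear error if it cannot. The sketch below is only an illustration of that idea, not something proposed in this thread; `module_available` and `require_pyspark` are hypothetical names, while `importlib.util.find_spec` is the standard-library call doing the work.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found on the current sys.path."""
    # find_spec() returns None for a top-level module that cannot be located,
    # without importing it and without modifying any environment variable.
    return importlib.util.find_spec(name) is not None


def require_pyspark() -> None:
    """Hypothetical guard: fail early with a readable message."""
    if not module_available("pyspark"):
        raise ImportError(
            "pyspark was not found on sys.path; install it or start from "
            "an environment that already provides it (e.g. pyspark-shell)."
        )
```

Calling `require_pyspark()` before constructing the engine keeps the installation step free of side effects and leaves the choice of how to provide pyspark with the user.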


eiso commented on May 20, 2024

@ajnavarro I think your argument makes sense. Closing this issue now.
