Currently there are a number of problems with pyspark.sql.col

Review Column annotations about pyspark-stubs HOT 4 CLOSED

zero323 commented on June 1, 2024

Review Column annotations

from pyspark-stubs.

Comments (4)

harpaj commented on June 1, 2024

@zero323,
The following valid snippet currently results in an error:

df = df.withColumn("score", sum((F.col("sim"), F.col("weight"))))
(note that sum is the stdlib sum, not the one from pyspark functions)

The error is Argument 2 to "withColumn" of "DataFrame" has incompatible type "Union[Column, int]"; expected "Column"

We actually do have a proper annotation for __radd__ on Column, but it carries a type: ignore. That sounds connected to the python/mypy#2129 issue you mention above, but that one has been fixed. Do you think the type: ignore can be safely removed now?

zero323 commented on June 1, 2024

@harpaj Indeed, it looks like it should be safe to remove type: ignore now.

However, I don't think that's really the source of the problem here. With the ignore in place, this for example is valid:

expr: Column = 1 + col("b")
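For readers less familiar with reflected operators, here is a minimal sketch of why an expression like the one above produces a column at runtime. FakeColumn is a hypothetical stand-in that stores an expression string; it is not the pyspark class, and the real Column builds its result differently.

```python
from typing import Union


class FakeColumn:
    """Hypothetical stand-in for pyspark.sql.Column; stores an expression string."""

    def __init__(self, expr: str) -> None:
        self.expr = expr

    def __radd__(self, other: Union["FakeColumn", int]) -> "FakeColumn":
        # Python calls this for `other + self` when type(other) cannot handle
        # the addition itself, e.g. int.__add__(1, column) returns NotImplemented.
        left = other.expr if isinstance(other, FakeColumn) else str(other)
        return FakeColumn(f"({left} + {self.expr})")


expr: FakeColumn = 1 + FakeColumn("b")
print(expr.expr)  # (1 + b)
```

The int on the left cannot add a FakeColumn, so Python falls back to the right operand's __radd__, and the annotated return type of that method is what a type checker would report for the whole expression.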

It looks like the problem is rather that we don't have dependent types here, so mypy cannot infer that sum(NonEmptyIterable[Column]) is Column. Instead it considers both cases:

the iterable is empty and we default to int,
the iterable is non-empty and we get Column.
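The two cases can be reproduced at runtime with a small sketch. FakeColumn is again a hypothetical stand-in for Column (an assumption for illustration, not the pyspark class), so the snippet runs without Spark.

```python
from typing import List, Union


class FakeColumn:
    """Hypothetical stand-in for pyspark.sql.Column; stores an expression string."""

    def __init__(self, expr: str) -> None:
        self.expr = expr

    def __add__(self, other: Union["FakeColumn", int]) -> "FakeColumn":
        right = other.expr if isinstance(other, FakeColumn) else str(other)
        return FakeColumn(f"({self.expr} + {right})")

    def __radd__(self, other: Union["FakeColumn", int]) -> "FakeColumn":
        left = other.expr if isinstance(other, FakeColumn) else str(other)
        return FakeColumn(f"({left} + {self.expr})")


no_columns: List[FakeColumn] = []

# Case 1: empty iterable. sum() returns its implicit start value, the int 0.
empty_total = sum(no_columns)

# Case 2: non-empty iterable. The implicit 0 is combined with the first
# element through __radd__, so the result is a FakeColumn.
total = sum([FakeColumn("sim"), FakeColumn("weight")])
```

Because both calls have the same static signature, a checker without knowledge of the iterable's length has to allow for either result, which matches the Union[Column, int] in the reported error.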

If you want sum to type check, you should rather start with a literal:

sum([F.col("sim"), F.col("weight")], F.lit(0))
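A minimal sketch of why the explicit start value helps, under the same assumption as before: FakeColumn is a hypothetical stand-in for Column, and FakeColumn("0") plays the role of lit(0). The helper name total_score is also made up for illustration.

```python
from typing import List


class FakeColumn:
    """Hypothetical stand-in for pyspark.sql.Column; stores an expression string."""

    def __init__(self, expr: str) -> None:
        self.expr = expr

    def __add__(self, other: "FakeColumn") -> "FakeColumn":
        return FakeColumn(f"({self.expr} + {other.expr})")


def total_score(columns: List[FakeColumn]) -> FakeColumn:
    # The start value is already a column, so every intermediate result,
    # including the one for an empty list, has the same type.
    return sum(columns, FakeColumn("0"))


empty_result = total_score([])
full_result = total_score([FakeColumn("sim"), FakeColumn("weight")])
```

With a column as the start value there is no int case left to account for, so the declared return type of the helper can be the column type alone.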

zero323 commented on June 1, 2024

Partially addressed by #194

github-actions commented on June 1, 2024

This issue hasn't seen any activity in a while.
