Comments (5)
Sure, there is a loadDataFrame method that takes a schema:
scala> import org.neo4j.spark._
import org.neo4j.spark._
scala> val neo = Neo4j(sc)
neo: org.neo4j.spark.Neo4j = org.neo4j.spark.Neo4j@750cd36d
scala> val df = neo.cypher("match (n) return id(n) as id1, id(n) as id2").partitions(4).batch(100).loadDataFrame(("id1"->"int"),("id2"->"int"))
df: org.apache.spark.sql.DataFrame = [id1: bigint, id2: bigint]
scala> df.take(3)
res2: Array[org.apache.spark.sql.Row] = Array([0,0], [1,1], [2,2])
from neo4j-spark-connector.
But it also works with just the query (it then inspects the result metadata):
val df = neo.cypher("match (n) return id(n) as id1, id(n) as id2").partitions(4).batch(100).loadDataFrame
df: org.apache.spark.sql.DataFrame = [id1: bigint, id2: bigint]
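As context for what partitions(4).batch(100) means, here is a hedged sketch (illustrative only, not the connector's actual implementation) of how such a configuration could translate into per-partition skip/limit windows over a result set:

```scala
// Hypothetical model: split `total` rows into one (skip, limit) window per
// partition, each holding at most `batch` rows. Names are illustrative.
def windows(total: Long, partitions: Int, batch: Long): Seq[(Long, Long)] =
  (0 until partitions).map { p =>
    val skip  = p * batch
    val limit = math.min(batch, math.max(0L, total - skip))
    (skip, limit)
  }

// 250 rows over 4 partitions of up to 100 rows each:
// (0,100), (100,100), (200,50), (300,0)
println(windows(250, 4, 100))
```

Under this model, each partition would run the same Cypher query but only fetch its own window, which is what makes a partitioned read return each row exactly once.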
Thanks. Solved.
scala> val df = neo.cypher("match (n:Emp) return id(n) as id, n.name as name, n.dept as dept").loadDataFrame(("id"->"long"),("name"->"string"),("dept"->"long"))
df: org.apache.spark.sql.DataFrame = [id: bigint, name: string ... 1 more field]
scala> df.count()
res15: Long = 28 // But I have only 7 nodes in my graph. Any specific reason for these repetitions?
@asarraf What happens if you run the same query as Cypher in Neo4j?
Related Issues (20)
- Issue : When updating node data via the Spark Connector HOT 2
- java.lang.LinkageError - Spark 3.1.2 neo4j - 4.1.0_for_spark_3 HOT 1
- IllegalArgumentException: Please provide a valid READ query
- Write example should be completed
- Code examples to use both Scala and Python HOT 1
- Update Spark v5 documentation for Neo4j v5 HOT 2
- can't acquire ExclusiveLock HOT 3
- very slow writing of data HOT 3
- Upgrade Cypher DSL to the latest version that supports a JDK 8 baseline HOT 1
- spark version validation fails on EMR / EMR Serverless HOT 1
- Not able to Insert Neo4j Map Data type using the neo4j-spark connector
- First-class support to GDS
- Project build fails after SBT upgrade to 1.9.0, but works for 1.8.3 HOT 1
- Support for pushdown limit
- Add example notebooks
- Transaction Retries using pyspark HOT 1
- Problem with datetime properties with null values HOT 1
- 5.2.0 missing from maven? HOT 2
- Neo4j connector is currently unusable to write data from Databricks with Unity Catalog enabled HOT 2
- Streaming reads: offsets are not loaded from checkpoint when restarting a stream after failure. HOT 10