spark-clickhouse's Introduction

ClickHouse Spark connector

A connector that writes a Spark DataFrame to a Yandex ClickHouse table.

Example

    import io.clickhouse.ext.ClickhouseConnectionFactory
    import io.clickhouse.ext.spark.ClickhouseSparkExt._
    import org.apache.spark.sql.SparkSession

    // spark config
    val sparkSession = SparkSession.builder
      .master("local")
      .appName("local spark")
      .getOrCreate()

    val sc = sparkSession.sparkContext
    val sqlContext = sparkSession.sqlContext
    
    // create test DF
    case class Row1(name: String, v: Int, v2: Int)
    val df = sqlContext.createDataFrame(1 to 1000 map(i => Row1(s"$i", i, i + 10)) )

    // clickhouse params

    // any node of the cluster
    val anyHost = "localhost"
    val db = "tmp1"
    val tableName = "t1"
    // the cluster must be defined in config.xml (the clickhouse server config)
    val clusterName: Option[String] = Some("perftest_1shards_1replicas")

    // define clickhouse datasource
    implicit val clickhouseDataSource = ClickhouseConnectionFactory.get(anyHost)
    
    // create db / table
    //df.dropClickhouseDb(db, clusterName)
    df.createClickhouseDb(db, clusterName)
    df.createClickhouseTable(db, tableName, "mock_date", Seq("name"), clusterNameO = clusterName)

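    // save DF to clickhouse table; every row gets the same constant date in this example
    val res = df.saveToClickhouse(db, tableName, (row) => java.sql.Date.valueOf("2000-12-01"), "mock_date", clusterNameO = clusterName)
    // with a single local node, the result holds one entry: host -> number of rows written
    assert(res.size == 1)
    assert(res.get(anyHost) == Some(df.count()))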
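
The example above passes a constant date for every row. In practice the date would usually come from the data itself. The following is a minimal sketch of a helper for that, not part of the connector: `PartitionDates`, `partitionDate`, the `1970-01-01` fallback and the `event_date` column are all hypothetical names chosen for illustration, and the usage comment assumes that the callback receives an `org.apache.spark.sql.Row`.

```scala
import java.sql.Date
import scala.util.Try

// Hypothetical helper, not part of the connector.
object PartitionDates {
  // Parses a "yyyy-MM-dd" string into java.sql.Date.
  // Returns `fallback` when the value is null or cannot be parsed.
  def partitionDate(raw: String, fallback: Date = Date.valueOf("1970-01-01")): Date =
    Option(raw)
      .flatMap(s => Try(Date.valueOf(s.trim)).toOption)
      .getOrElse(fallback)
}

// Possible usage, assuming the callback parameter is a Spark Row and the
// DataFrame has a string column named "event_date" (both are assumptions):
//
//   df.saveToClickhouse(db, tableName,
//     row => PartitionDates.partitionDate(row.getAs[String]("event_date")),
//     "mock_date", clusterNameO = clusterName)
```

Falling back to a fixed date keeps the job running when a value is malformed, but it also groups all bad rows under one date. Failing fast instead may be the better choice when such rows indicate a bug upstream.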
