
Comments (5)

clisho commented on July 19, 2024

@clisho Bug fixed 96ec3f0

OK, thanks! I'm swamped at the moment as well. Once things calm down, if I add some features, I'll submit them then.

from flink-connector-clickhouse.

itinycheng commented on July 19, 2024

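Hello, I downloaded the code and tested it. When scan.partition.* and related options are not set and the job runs with parallelism N, the Source reads the data N times. How can this be solved? Thanks!

Hello, could you share your configuration and the table details?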


clisho commented on July 19, 2024

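Hello, I downloaded the code and tested it. When scan.partition.* and related options are not set and the job runs with parallelism N, the Source reads the data N times. How can this be solved? Thanks!

Hello, could you share your configuration and the table details?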

Hello!
====================Flink code====================

final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment()
        .setRuntimeMode(RuntimeExecutionMode.BATCH);
StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

tableEnv.executeSql(
        "CREATE TABLE event_jian_ying_2 (\n" +
        " arg STRING,\n" +
        " user_id DECIMAL(11,0),\n" +
        " event_time TIMESTAMP\n" +
        ") WITH (\n" +
        " 'connector' = 'clickhouse',\n" +
        " 'username' = 'default',\n" +
        " 'password' = '...',\n" +
        " 'url' = '...',\n" +
        " 'database-name' = '...',\n" +
        " 'table-name' = 'event_jian_ying_2',\n" +
        " 'sink.batch-size' = '500',\n" +
        " 'sink.flush-interval' = '1000',\n" +
        " 'sink.max-retries' = '3'" +
        ")");

Table table = tableEnv.sqlQuery("select * from event_jian_ying_2");
table.execute().print();

====================Table schema====================
CREATE TABLE dxp.event_jian_ying_2
(
    `user_id` UInt64,
    `event_time` DateTime,
    `arg` String
)
ENGINE = MergeTree
ORDER BY event_time
SETTINGS index_granularity = 8192

I looked at the createInputSplits method in ClickHouseBatchInputFormat:

@Override
public InputSplit[] createInputSplits(int minNumSplits) {
    int splitNum = parameterValues != null ? parameterValues.length : minNumSplits;
    return createGenericInputSplits(splitNum);
}

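I haven't read all of the code yet, so I don't know what parameterValues contains. However, minNumSplits is the parallelism of the job. If parameterValues is null and my query is not partitioned, every split runs the same query, so the data is read N times. In that case splitNum should be set to 1, shouldn't it?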
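The proposal above can be illustrated with a minimal stand-alone sketch. The class `SplitCountSketch` and the method `splitCount` are hypothetical names used only for illustration. This is not the connector's code and not necessarily what the eventual fix does, and it ignores special cases such as distributed tables.

```java
// Hypothetical sketch of the proposed split-count rule; not the connector's code.
final class SplitCountSketch {

    /**
     * Returns how many input splits to create.
     *
     * @param parameterValues one row of query parameters per partition,
     *                        or null when no scan.partition.* options are set
     * @param minNumSplits    the parallelism hint passed in by Flink; kept only to
     *                        mirror the original signature, unused in this sketch
     */
    static int splitCount(Object[][] parameterValues, int minNumSplits) {
        if (parameterValues == null) {
            // Every split would run the same unpartitioned query,
            // so a single split avoids reading the data N times.
            return 1;
        }
        // One split per row of partition parameters.
        return parameterValues.length;
    }

    public static void main(String[] args) {
        // No partition options, parallelism 4: a single split.
        System.out.println(splitCount(null, 4));
        // Two parameter rows: two splits, regardless of the parallelism hint.
        Object[][] params = {{0L, 49L}, {50L, 99L}};
        System.out.println(splitCount(params, 4));
    }
}
```

With a single split, only one subtask issues the query and the remaining subtasks stay idle, which trades parallelism for correctness.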

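In addition, a few personal suggestions:
1. Could scan.partition.column support both numeric and date columns?
2. If scan.partition.num is not set, default it to the parallelism.
3. If scan.partition.lower-bound and scan.partition.upper-bound are not provided, query the minimum and maximum of scan.partition.column and partition on that range.

Thanks!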


itinycheng commented on July 19, 2024

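@clisho, parameterValues holds the two parameters needed for between ? and ?;

  1. scan.partition.column currently supports only integer values; there are no plans to support other types for now;
  2. Without scan.partition.num the source cannot run with multiple parallelism, because it cannot tell how to split the data for parallel reads. Distributed tables are the exception: their local tables can be read in parallel;
  3. Defaulting lower-bound and upper-bound is a nice feature, but I have no spare time for it in the near future. You are welcome to help implement it;

Reading the same data multiple times is a bug; I plan to fix it tonight.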
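As a rough illustration of the parameterValues described in this comment, the sketch below splits an inclusive integer range into contiguous (start, end) pairs, one pair per `between ? and ?` query. The class `RangeParamsSketch` and the method `betweenParams` are hypothetical, and the even-split strategy is an assumption; the connector may compute its ranges differently.

```java
// Hypothetical sketch; the connector's own range computation may differ.
final class RangeParamsSketch {

    /**
     * Splits the inclusive range [lower, upper] into at most num contiguous,
     * non-overlapping (start, end) pairs, each usable as "between ? and ?".
     * Assumes upper - lower + 1 does not overflow a long.
     */
    static long[][] betweenParams(long lower, long upper, int num) {
        if (num < 1 || lower > upper) {
            throw new IllegalArgumentException("need num >= 1 and lower <= upper");
        }
        long total = upper - lower + 1;
        // Never create more partitions than there are distinct values.
        int n = (int) Math.min(num, total);
        long base = total / n;   // minimum size of each partition
        long extra = total % n;  // the first 'extra' partitions get one more value
        long[][] params = new long[n][2];
        long start = lower;
        for (int i = 0; i < n; i++) {
            long size = base + (i < extra ? 1 : 0);
            params[i][0] = start;
            params[i][1] = start + size - 1;
            start += size;
        }
        return params;
    }

    public static void main(String[] args) {
        // Print the (start, end) pairs for the range 0..9 split three ways.
        for (long[] p : betweenParams(0, 9, 3)) {
            System.out.println("between " + p[0] + " and " + p[1]);
        }
    }
}
```

Because the pairs are inclusive and do not overlap, each row in the range is matched by exactly one query, which is the property the duplicated reads in this issue violated.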


itinycheng commented on July 19, 2024

@clisho
Bug fixed 96ec3f0

