Comments (15)
> 2024-08-29 23:11:28 org.apache.hudi.exception.HoodieIOException: Could not check if hdfs://hdfs-name-node:9820/flink/ceshi9 is a valid table

Did you write into the table successfully?
> Did you write into the table successfully?

Yes, Flink reads and writes the table normally, but the sync to Hive does not work.
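Since the Flink write path is healthy and only the Hive sync step fails, one way to isolate the problem is to run the sync out-of-band with the standalone sync tool that ships with Hudi. A minimal sketch, assuming hms sync mode and the endpoints quoted in this thread; flag names vary between Hudi versions, so check `run_sync_tool.sh --help` before running (everything below is illustrative, not a confirmed fix):

```sh
# hypothetical invocation of Hudi's standalone hive sync tool;
# the db/table names are taken from the DDL later in this thread
# and may need adjusting for the /flink/ceshi9 table
./hudi-hive-sync/run_sync_tool.sh \
  --sync-mode hms \
  --metastore-uris thrift://hive-metastore-server:9083 \
  --base-path hdfs://hdfs-name-node:9820/flink/ceshi9 \
  --database hudi \
  --table test_hudi_flink4 \
  --partitioned-by dt
```

If the standalone run reproduces the "Could not check if ... is a valid table" error, the problem is in the sync path (classpath or metastore connectivity) rather than in the Flink job itself.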
> Did you write into the table successfully?

I downgraded to flink1.16 + hudi0.13 and still could not sync, but running `flink run -c org.apache.hudi.sink.compact.HoodieFlinkCompactor /opt/flink/lib/hudi-flink1.16-bundle-0.13.1.jar --path s3a://ceshi/hudi9/` compacts successfully. What could be the reason for this?
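For reference, the offline compactor above can also schedule a plan itself when none is pending. A sketch based on the same command; the `--schedule` flag comes from Hudi's offline-compaction docs and should be verified against the 0.13.1 bundle:

```sh
# schedule a compaction plan (if none is pending) and then execute it
flink run -c org.apache.hudi.sink.compact.HoodieFlinkCompactor \
  /opt/flink/lib/hudi-flink1.16-bundle-0.13.1.jar \
  --path s3a://ceshi/hudi9/ \
  --schedule
```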
This is not the write job; it's the separate compaction job. I still believe it is related to the classloader.
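Since the classloader is the suspect here: Flink resolves user code child-first by default, which can clash with a Hudi bundle sitting in flink/lib. A minimal sketch of the usual knob to rule that out, offered as an assumption for this particular failure rather than a confirmed fix:

```sh
# switch user-code class resolution to parent-first when the hudi
# bundle lives in flink/lib (assumption: the conflict is between
# bundle classes and job classes); restart the cluster afterwards
echo "classloader.resolve-order: parent-first" >> conf/flink-conf.yaml
```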
> compaction

Although this version can perform manual compaction, it still cannot compact automatically:

CREATE TABLE test_hudi_flink4 (
  id int PRIMARY KEY NOT ENFORCED,
  name VARCHAR(10),
  price int,
  ts int,
  dt VARCHAR(10)
)
PARTITIONED BY (dt)
WITH (
  'connector' = 'hudi',
  'path' = 'hdfs://hdfs-name-node:9820/flink/ceshi4',
  'table.type' = 'COPY_ON_WRITE',
  'hoodie.datasource.write.keygenerator.class' = 'org.apache.hudi.keygen.ComplexAvroKeyGenerator',
  'hoodie.datasource.write.recordkey.field' = 'id',
  'hoodie.datasource.write.hive_style_partitioning' = 'true',
  'hive_sync.enable' = 'true',
  'hive_sync.mode' = 'hms',
  'hive_sync.metastore.uris' = 'thrift://hive-metastore-server:9083',
  'hive_sync.conf.dir' = '/opt/hive/conf',
  'hive_sync.db' = 'hudi',
  'hive_sync.table' = 'test_hudi_flink4',
  'hive_sync.partition_fields' = 'dt',
  'hive_sync.partition_extractor_class' = 'org.apache.hudi.hive.HiveStylePartitionValueExtractor'
);
A COW table does not need compaction; only MOR does.
> COW table does not need compaction, only MOR needs that.

When syncing to Hive I set 'changelog.enabled' = 'true', 'compaction.async.enabled' = 'true', and 'compaction.delta_commits' = '2'; why is automatic compaction still not happening?
You declared the table as COW, and COW does not generate log files, so compaction is needless:
'table.type' = 'COPY_ON_WRITE'
> You declare the table as cow, cow does not generate logs so the compaction is needless.

Sorry, the screenshot was wrong. The actual table is:

CREATE TABLE test_hudi_flink9 (
  id int PRIMARY KEY NOT ENFORCED,
  name VARCHAR(10),
  price int,
  ts int,
  dt VARCHAR(10)
)
PARTITIONED BY (dt)
WITH (
  'connector' = 'hudi',
  'path' = 's3a://ceshi/hudi9/',
  'table.type' = 'MERGE_ON_READ',
  'hoodie.datasource.write.keygenerator.class' = 'org.apache.hudi.keygen.ComplexAvroKeyGenerator',
  'hoodie.datasource.write.recordkey.field' = 'id',
  'hoodie.datasource.write.hive_style_partitioning' = 'true',
  'changelog.enabled' = 'true',
  'compaction.async.enabled' = 'true',
  'compaction.delta_commits' = '2',
  'compaction.trigger.strategy' = 'num_commits',
  'hive_sync.enable' = 'true',
  'hive_sync.table' = 't_hdm',
  'hive_sync.db' = 'default',
  'hive_sync.mode' = 'hms',
  'hive_sync.metastore.uris' = 'thrift://hive-metastore:9083'
);

I use MERGE_ON_READ; whether on S3 or MinIO, the table is never automatically compacted and merged into storage. What could be the cause?
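One quick way to tell whether async compaction is even being scheduled is to inspect the table's timeline: scheduled plans appear as `<instant>.compaction.requested` files under `.hoodie`. A sketch using the S3 path from the DDL above; the MinIO endpoint is a placeholder for whatever your deployment uses:

```sh
# *.compaction.requested = a plan was scheduled;
# *.deltacommit = completed log writes; *.commit = compacted commits
aws s3 ls s3://ceshi/hudi9/.hoodie/ --endpoint-url http://minio:9000
```

If no `.compaction.requested` instants ever appear, scheduling is the problem; if they appear but never complete, execution is.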
You can check the compaction scheduling log in the JM log file; if the plan exists, maybe there are some parquet write errors.
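Following that suggestion, grepping the JobManager log for compaction activity is usually enough to see whether plans are being generated. A sketch; the log file name and location vary by deployment:

```sh
# search the JM log for compaction scheduling entries
grep -i "compaction" /opt/flink/log/*.log
```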
> you can check the compaction scheduling log in the JM log file, if the plan exists, maybe there are some parquet write errors.

I have tried many versions. Both the COW and MOR modes run normally from Flink SQL, but an error occurs during compaction; the logs are attached below. I see no error in the JM; the error is in the TM. What may be the cause?

jm.txt
tm.txt
Related Issues (20)
- [SUPPORT] Compile Hudi 0.15 with Scala 2.13 and Spark 3.2
- Executor executes action [initialize instant] error
- [SUPPORT] After adding a new field to the schema, a HUDI MOR table query failed when reading a log file missing the required field
- [SUPPORT] org/apache/calcite/plan/RelOptRule
- [SUPPORT] Properties file corruption caused by write failure
- [SUPPORT] Duplicate records getting inserted at Hudi S3 sink
- [SUPPORT] For example, when two writers write to non overlapping files, both writes are allowed to succeed. However, when the writes from different writers overlap (touch the same set of files), only one of them will succeed. Please note that this feature is currently experimental and requires external lock providers to acquire locks briefly at critical sections during the write. More on lock providers below.
- [SUPPORT] In https://hudi.apache.org/docs/concurrency_control it is written "Please note that this feature is currently experimental and requires external lock providers to acquire locks briefly at critical sections during the write. More on lock providers below." Is this general for OCC, or is it experimental for internal lock providers?
- [SUPPORT] OCC experimental?
- [SUPPORT] Compile Hudi 0.15 with Spark 3.5 and Scala 2.13
- [SUPPORT] No easy way to append classpath in hudi hive sync
- [SUPPORT] Hudi Sync tool is dependent on Hadoop 2.10.2 and Hadoop AWS 2.10.2. Needs upgrade to newer versions like 3.3.4
- [SUPPORT] run_sync_tool.sh hudi-sync needs to be upgraded to avoid the AWS SDK V1 warning message
- [SUPPORT] When using AWS Hadoop 3.3.4 libraries, Hudi Sync will give java.lang.ClassNotFoundException: org.apache.hadoop.fs.statistics.IOStatisticsSource
- [SUPPORT] Hudi sync requires Hadoop and Hive installed. Very heavyweight
- [SUPPORT] COW + hiveStylePartitioning + glob.paths on Spark: reads incomplete values of partition column
- flinksql writes to hudi and then synchronizes hive
- [SUPPORT] Schema evolution setting affects Spark's 'describe table' output
- [SUPPORT] "Failed to read schema/check compatibility" on Hudi upgrade from 0.12.2 to Hudi 0.14.1