Comments (15)
> 2024-08-29 23:11:28 org.apache.hudi.exception.HoodieIOException: Could not check if hdfs://hdfs-name-node:9820/flink/ceshi9 is a valid table

Did you write into the table successfully?
> Did you write into the table successfully?

Yes, Flink reads and writes the table normally, but the sync to Hive does not work.
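Since the Flink write path is healthy and only the Hive sync step fails, one way to isolate the problem is to run the sync out-of-band with the standalone sync tool that ships with Hudi. A minimal sketch, assuming hms sync mode and the endpoints quoted in this thread; flag names vary between Hudi versions, so check `run_sync_tool.sh --help` before running (everything below is illustrative, not a confirmed fix):

```sh
# hypothetical invocation of Hudi's standalone hive sync tool;
# the db/table names are taken from the DDL later in this thread
# and may need adjusting for the /flink/ceshi9 table
./hudi-hive-sync/run_sync_tool.sh \
  --sync-mode hms \
  --metastore-uris thrift://hive-metastore-server:9083 \
  --base-path hdfs://hdfs-name-node:9820/flink/ceshi9 \
  --database hudi \
  --table test_hudi_flink4 \
  --partitioned-by dt
```

If the standalone run reproduces the "Could not check if ... is a valid table" error, the problem is in the sync path (classpath or metastore connectivity) rather than in the Flink job itself.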
> Did you write into the table successfully?

I downgraded to flink1.16 + hudi0.13 and still could not sync, but running `flink run -c org.apache.hudi.sink.compact.HoodieFlinkCompactor /opt/flink/lib/hudi-flink1.16-bundle-0.13.1.jar --path s3a://ceshi/hudi9/` compacts successfully. What could be the reason for this?
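For reference, the offline compactor above can also schedule a plan itself when none is pending. A sketch based on the same command; the `--schedule` flag comes from Hudi's offline-compaction docs and should be verified against the 0.13.1 bundle:

```sh
# schedule a compaction plan (if none is pending) and then execute it
flink run -c org.apache.hudi.sink.compact.HoodieFlinkCompactor \
  /opt/flink/lib/hudi-flink1.16-bundle-0.13.1.jar \
  --path s3a://ceshi/hudi9/ \
  --schedule
```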
This is not the write job; it's the separate compaction job. I still believe it is related to the classloader.
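Since the classloader is the suspect here: Flink resolves user code child-first by default, which can clash with a Hudi bundle sitting in flink/lib. A minimal sketch of the usual knob to rule that out, offered as an assumption for this particular failure rather than a confirmed fix:

```sh
# switch user-code class resolution to parent-first when the hudi
# bundle lives in flink/lib (assumption: the conflict is between
# bundle classes and job classes); restart the cluster afterwards
echo "classloader.resolve-order: parent-first" >> conf/flink-conf.yaml
```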
> compaction

Although this version can perform manual compaction, it still cannot compact automatically:

CREATE TABLE test_hudi_flink4 (
  id int PRIMARY KEY NOT ENFORCED,
  name VARCHAR(10),
  price int,
  ts int,
  dt VARCHAR(10)
)
PARTITIONED BY (dt)
WITH (
  'connector' = 'hudi',
  'path' = 'hdfs://hdfs-name-node:9820/flink/ceshi4',
  'table.type' = 'COPY_ON_WRITE',
  'hoodie.datasource.write.keygenerator.class' = 'org.apache.hudi.keygen.ComplexAvroKeyGenerator',
  'hoodie.datasource.write.recordkey.field' = 'id',
  'hoodie.datasource.write.hive_style_partitioning' = 'true',
  'hive_sync.enable' = 'true',
  'hive_sync.mode' = 'hms',
  'hive_sync.metastore.uris' = 'thrift://hive-metastore-server:9083',
  'hive_sync.conf.dir' = '/opt/hive/conf',
  'hive_sync.db' = 'hudi',
  'hive_sync.table' = 'test_hudi_flink4',
  'hive_sync.partition_fields' = 'dt',
  'hive_sync.partition_extractor_class' = 'org.apache.hudi.hive.HiveStylePartitionValueExtractor'
);
A COW table does not need compaction; only MOR does.
> COW table does not need compaction, only MOR needs that.

When syncing to Hive I set 'changelog.enabled' = 'true', 'compaction.async.enabled' = 'true', and 'compaction.delta_commits' = '2'; why is automatic compaction still not happening?
You declared the table as COW, and COW does not generate log files, so compaction is needless:
'table.type' = 'COPY_ON_WRITE'
> You declare the table as cow, cow does not generate logs so the compaction is needless.

Sorry, the screenshot was wrong. The actual table is:

CREATE TABLE test_hudi_flink9 (
  id int PRIMARY KEY NOT ENFORCED,
  name VARCHAR(10),
  price int,
  ts int,
  dt VARCHAR(10)
)
PARTITIONED BY (dt)
WITH (
  'connector' = 'hudi',
  'path' = 's3a://ceshi/hudi9/',
  'table.type' = 'MERGE_ON_READ',
  'hoodie.datasource.write.keygenerator.class' = 'org.apache.hudi.keygen.ComplexAvroKeyGenerator',
  'hoodie.datasource.write.recordkey.field' = 'id',
  'hoodie.datasource.write.hive_style_partitioning' = 'true',
  'changelog.enabled' = 'true',
  'compaction.async.enabled' = 'true',
  'compaction.delta_commits' = '2',
  'compaction.trigger.strategy' = 'num_commits',
  'hive_sync.enable' = 'true',
  'hive_sync.table' = 't_hdm',
  'hive_sync.db' = 'default',
  'hive_sync.mode' = 'hms',
  'hive_sync.metastore.uris' = 'thrift://hive-metastore:9083'
);

I use MERGE_ON_READ; whether on S3 or MinIO, the table is never automatically compacted and merged into storage. What could be the cause?
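One quick way to tell whether async compaction is even being scheduled is to inspect the table's timeline: scheduled plans appear as `<instant>.compaction.requested` files under `.hoodie`. A sketch using the S3 path from the DDL above; the MinIO endpoint is a placeholder for whatever your deployment uses:

```sh
# *.compaction.requested = a plan was scheduled;
# *.deltacommit = completed log writes; *.commit = compacted commits
aws s3 ls s3://ceshi/hudi9/.hoodie/ --endpoint-url http://minio:9000
```

If no `.compaction.requested` instants ever appear, scheduling is the problem; if they appear but never complete, execution is.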
You can check the compaction scheduling log in the JM log file; if the plan exists, maybe there are some parquet write errors.
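Following that suggestion, grepping the JobManager log for compaction activity is usually enough to see whether plans are being generated. A sketch; the log file name and location vary by deployment:

```sh
# search the JM log for compaction scheduling entries
grep -i "compaction" /opt/flink/log/*.log
```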
> you can check the compaction scheduling log in the JM log file, if the plan exists, maybe there are some parquet write errors.

I have tried many versions. Both the COW and MOR modes run normally from Flink SQL, but an error occurs during compaction; the logs are attached below. I see no error in the JM; the error is in the TM. What may be the cause?

jm.txt
tm.txt
Related Issues (20)
- [SUPPORT] Compile Hudi 0.15 with Scala 2.13 and Spark 3.2
- Executor executes action [initialize instant] error
- [SUPPORT] After adding a new field to the schema, a HUDI MOR table query failed when reading a log file missing the required field
- [SUPPORT] org/apache/calcite/plan/RelOptRule
- [SUPPORT] Properties file corruption caused by write failure
- [SUPPORT] Duplicate records getting inserted at Hudi S3 sink
- [SUPPORT] For example, when two writers write to non overlapping files, both writes are allowed to succeed. However, when the writes from different writers overlap (touch the same set of files), only one of them will succeed. Please note that this feature is currently experimental and requires external lock providers to acquire locks briefly at critical sections during the write. More on lock providers below.
- [SUPPORT] In https://hudi.apache.org/docs/concurrency_control it is written "Please note that this feature is currently experimental and requires external lock providers to acquire locks briefly at critical sections during the write. More on lock providers below." Is this general for OCC, or is it experimental for internal lock providers?
- [SUPPORT] OCC experimental?
- [SUPPORT] Compile Hudi 0.15 with Spark 3.5 and Scala 2.13
- [SUPPORT] No easy way to append classpath in hudi hive sync
- [SUPPORT] Hudi Sync tool is dependent on Hadoop 2.10.2 and Hadoop AWS 2.10.2. Needs upgrade to newer versions like 3.3.4
- [SUPPORT] run_sync_tool.sh hudi-sync needs to be upgraded to avoid the AWS SDK V1 warning message
- [SUPPORT] When using AWS Hadoop 3.3.4 libraries, Hudi Sync will give java.lang.ClassNotFoundException: org.apache.hadoop.fs.statistics.IOStatisticsSource
- [SUPPORT] Hudi sync requires Hadoop and Hive installed. Very heavyweight
- [SUPPORT] COW + hiveStylePartitioning + glob.paths on Spark: reads incomplete values of partition column
- flinksql writes to hudi and then synchronizes hive
- [SUPPORT] Schema evolution setting affects Spark's 'describe table' output
- [SUPPORT] "Failed to read schema/check compatibility" on Hudi upgrade from 0.12.2 to Hudi 0.14.1