
Apache Doris is an easy-to-use, high performance and unified analytics database.

Home Page: https://doris.apache.org

License: Apache License 2.0


doris's Introduction

Apache Doris


Apache Doris is an easy-to-use, high-performance, real-time analytical database built on an MPP architecture, known for its extreme speed and ease of use. It returns query results over massive datasets within sub-second response times and supports not only high-concurrency point queries but also high-throughput complex analytical queries.

All this makes Apache Doris an ideal tool for scenarios including report analysis, ad-hoc queries, unified data warehousing, and data lake query acceleration. On Apache Doris, users can build applications such as user behavior analysis, A/B testing platforms, log retrieval and analysis, user profiling, and order analysis.

🎉 Version 2.1.1 is released now. Check out the 🔗Release Notes here. The 2.1 version delivers exceptional performance: out-of-the-box query speed 100% higher than before, as proven by TPC-DS 1TB tests; data lake analytics 4-6 times faster than Trino and Spark; solid support for semi-structured data analysis with a new Variant type and a suite of analytical functions; asynchronous materialized views for query acceleration; optimized real-time writing at scale; and better workload management with improved stability and runtime SQL resource tracking.

🎉 Version 2.0.8 is now released! This fully evolved and stable release is ready for all users to upgrade. Check out the 🔗Release Notes here.

👀 Have a look at the 🔗Official Website for a comprehensive list of Apache Doris's core features, blogs, and use cases.

📈 Usage Scenarios

As shown in the figure below, after various forms of data integration and processing, data sources usually land in the real-time data warehouse Apache Doris and in offline data lakes or data warehouses (Apache Hive, Apache Iceberg, or Apache Hudi).

Apache Doris is widely used in the following scenarios:

  • Reporting Analysis

    • Real-time dashboards
    • Reports for in-house analysts and managers
    • Highly concurrent user-facing or customer-facing report analysis, such as website analytics and ad reporting, which usually require thousands of QPS and response times measured in milliseconds. One successful use case: the Chinese e-commerce giant JD.com uses Doris for ad reporting, where it ingests 10 billion rows of data per day, handles over 10,000 QPS, and delivers a 99th-percentile query latency of 150 ms.
  • Ad-Hoc Query. Analyst-oriented self-service analytics with irregular query patterns and high throughput requirements. Xiaomi has built a growth analytics platform (Growth Analytics, GA) on Doris that uses user behavior data for business growth analysis, with an average query latency of 10 seconds, a 95th-percentile query latency of 30 seconds or less, and tens of thousands of SQL queries per day.

  • Unified Data Warehouse Construction. Apache Doris lets users build a unified data warehouse on a single platform and saves them the trouble of operating a complicated software stack. The Chinese hot pot chain Haidilao has built a unified data warehouse on Doris to replace its old, complex architecture consisting of Apache Spark, Apache Hive, Apache Kudu, Apache HBase, and Apache Phoenix.

  • Data Lake Query. Apache Doris avoids data copying by federating the data in Apache Hive, Apache Iceberg, and Apache Hudi through external tables, achieving outstanding query performance; a minimal sketch of this access path follows the list.
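
As a sketch (the catalog name, metastore URI, and table names below are hypothetical, and the multi-catalog syntax follows recent Doris releases; older releases expressed the same idea with per-table external tables):

    -- Register a Hive Metastore catalog once; Doris then queries Hive/Iceberg/Hudi
    -- tables in place, without copying data into Doris.
    CREATE CATALOG hive_cat PROPERTIES (
        "type"                = "hms",
        "hive.metastore.uris" = "thrift://metastore-host:9083"
    );

    -- Query a lake table directly through the catalog.
    SELECT count(*) FROM hive_cat.sales_db.orders;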

🖥️ Core Concepts

📂 Architecture of Apache Doris

The overall architecture of Apache Doris is shown in the following figure. The Doris architecture is very simple, with only two types of processes.

  • Frontend (FE): handles user request access, query parsing and planning, metadata management, node management, etc.

  • Backend (BE): data storage and query plan execution

Both types of processes are horizontally scalable: a single cluster can grow to hundreds of machines and tens of petabytes of storage capacity. The two process types guarantee high service availability and high data reliability through consistency protocols. This highly integrated architecture greatly reduces the operational cost of running a distributed system.

The overall architecture of Apache Doris

In terms of interfaces, Apache Doris adopts the MySQL protocol, supports standard SQL, and is highly compatible with the MySQL dialect. Users can access Doris through various client tools, and it connects seamlessly with BI tools.

💾 Storage Engine

Doris uses a columnar storage engine, which encodes, compresses, and reads data by column. This enables a very high compression ratio and largely reduces irrelevant data scans, making more efficient use of I/O and CPU resources. Doris supports various index structures to minimize data scans (a table sketch illustrating them follows this list):

  • Sorted Compound Key Index: users can specify up to three columns to form a compound sort key, which effectively prunes data and better supports highly concurrent reporting scenarios.
  • MIN/MAX Index: enables effective filtering of equality and range queries on numeric types.
  • Bloom Filter Index: very effective for equality filtering and pruning on high-cardinality columns.
  • Inverted Index: enables fast search over any field.
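
A hypothetical table definition illustrating how these indexes can be declared (table, column, and index names are made up; the bloom_filter_columns property and the USING INVERTED clause follow the syntax documented for recent Doris versions, while MIN/MAX indexing is maintained automatically):

    CREATE TABLE example_logs (
        event_time DATETIME,
        user_id    BIGINT,
        url        VARCHAR(256),
        message    STRING,
        INDEX idx_message (message) USING INVERTED  -- inverted index: fast search on any field
    ) DUPLICATE KEY(event_time, user_id)            -- sorted compound key used to prune data
    DISTRIBUTED BY HASH(user_id) BUCKETS 10
    PROPERTIES (
        "replication_num"      = "1",
        "bloom_filter_columns" = "url"              -- bloom filter on a high-cardinality column
    );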

💿 Storage Models

Doris supports a variety of storage models, each optimized for a different scenario (a sketch of all three follows this list):

  • Aggregate Key Model: merges the value columns of rows with the same key, significantly improving performance

  • Unique Key Model: keys are unique in this model, and data with the same key is overwritten, enabling row-level data updates

  • Duplicate Key Model: a detailed data model capable of storing fact tables in full detail
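
A sketch of how each model is selected in CREATE TABLE (the schemas are hypothetical; the key clauses themselves are standard Doris DDL):

    -- Aggregate Key Model: rows with the same key are merged; pv is pre-aggregated with SUM.
    CREATE TABLE site_pv (
        siteid   INT,
        citycode SMALLINT,
        pv       BIGINT SUM DEFAULT '0'
    ) AGGREGATE KEY(siteid, citycode)
    DISTRIBUTED BY HASH(siteid) BUCKETS 10;

    -- Unique Key Model: a new row with an existing key overwrites the old one (row-level update).
    CREATE TABLE user_info (
        userid INT,
        uname  VARCHAR(32),
        city   VARCHAR(32)
    ) UNIQUE KEY(userid)
    DISTRIBUTED BY HASH(userid) BUCKETS 10;

    -- Duplicate Key Model: every row is kept; the key columns only define the sort order.
    CREATE TABLE access_log (
        ts     DATETIME,
        userid INT,
        url    VARCHAR(256)
    ) DUPLICATE KEY(ts, userid)
    DISTRIBUTED BY HASH(userid) BUCKETS 10;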

Doris also supports strongly consistent materialized views. Materialized views are automatically selected and updated, which greatly reduces maintenance costs for users.
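
For a single-table materialized view, a minimal sketch might look like the following (the base table and view names are hypothetical); an aggregation query matching the view's shape can then be routed to it automatically:

    -- Pre-aggregate per-site page views on a hypothetical page_views(siteid, pv) table;
    -- queries like SELECT siteid, SUM(pv) FROM page_views GROUP BY siteid can read the view.
    CREATE MATERIALIZED VIEW site_pv_mv AS
    SELECT siteid, SUM(pv)
    FROM page_views
    GROUP BY siteid;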

🔍 Query Engine

Doris adopts the MPP model in its query engine to realize parallel execution both between and within nodes. It also supports distributed shuffle joins between multiple large tables to handle complex queries.

The Doris query engine is vectorized, with all memory structures laid out in a columnar format. This greatly reduces virtual function calls, improves cache hit rates, and makes efficient use of SIMD instructions. In wide-table aggregation scenarios, Doris delivers 5-10 times the performance of non-vectorized engines.

Apache Doris uses adaptive query execution to dynamically adjust the execution plan based on runtime statistics. For example, it can generate a runtime filter, push it to the probe side, and automatically push it down to the Scan node at the bottom, which drastically reduces the amount of data scanned on the probe side and speeds up joins. Runtime filters in Doris support In/Min/Max/Bloom filter types.
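
Runtime filter behavior can be tuned per session; the session variable and value below follow what Doris documentation describes, but treat the exact names as an assumption of this sketch (the tables are hypothetical):

    -- Ask the planner to generate IN or Bloom runtime filters for joins in this session.
    SET runtime_filter_type = 'IN_OR_BLOOM_FILTER';

    -- EXPLAIN should show the runtime filter built on the build side (users)
    -- and pushed down to the probe-side Scan node (orders).
    EXPLAIN
    SELECT o.userid, SUM(o.amount)
    FROM orders o JOIN users u ON o.userid = u.userid
    WHERE u.city = 'Beijing'
    GROUP BY o.userid;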

🚅 Query Optimizer

In terms of optimization, Doris uses a combination of CBO and RBO. The RBO supports constant folding, subquery rewriting, predicate pushdown, and more, while the CBO supports join reordering. The Doris CBO is under continuous improvement toward more accurate statistics collection and derivation and more accurate cost model predictions.

Technical Overview: 🔗Introduction to Apache Doris

🎆 Why choose Apache Doris?

  • 🎯 Easy to Use: two processes with no other dependencies; online cluster scaling and automatic replica recovery; compatible with the MySQL protocol and standard SQL.

  • 🚀 High Performance: extremely fast, low-latency, and high-throughput queries thanks to the columnar storage engine, modern MPP architecture, vectorized query engine, pre-aggregated materialized views, and data indexes.

  • 🖥️ Single Unified: a single system supports real-time data serving, interactive data analysis, and offline data processing scenarios.

  • ⚛️ Federated Querying: supports federated querying of data lakes such as Hive, Iceberg, and Hudi, and of databases such as MySQL and Elasticsearch.

  • Various Data Import Methods: supports batch import from HDFS/S3 and stream import from MySQL binlog/Kafka; supports micro-batch writing through an HTTP interface and real-time writing via INSERT over JDBC (see the broker load sketch after this list).

  • 🚙 Rich Ecology: Spark reads from and writes to Doris via the Spark-Doris-Connector; the Flink-Doris-Connector lets Flink CDC write data to Doris exactly once; the DBT Doris Adapter transforms data inside Doris with DBT.
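
As one example of the batch path, a broker load from HDFS can be expressed in SQL like the sketch below (the label, file path, table, and broker name are placeholders):

    -- Asynchronous batch import of a CSV file from HDFS into a Doris table.
    LOAD LABEL example_db.label_20240101
    (
        DATA INFILE("hdfs://namenode:8020/user/data/file.csv")
        INTO TABLE example_tbl
        COLUMNS TERMINATED BY ","
    )
    WITH BROKER "hdfs_broker";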

🙌 Contributors

Apache Doris graduated from the Apache Incubator and became a Top-Level Project in June 2022.

Currently, the Apache Doris community has gathered more than 400 contributors from nearly 200 companies across different industries, and the number of monthly active contributors is close to 100.

Monthly Active Contributors

Contributors over time

We deeply appreciate 🔗community contributors for their contribution to Apache Doris.

👨‍👩‍👧‍👦 Users

Apache Doris now has a wide user base in China and around the world; as of today, it runs in production at thousands of companies worldwide. More than 80% of the top 50 Chinese Internet companies by market capitalization or valuation have been using Apache Doris for a long time, including Baidu, Meituan, Xiaomi, Jingdong, Bytedance, Tencent, NetEase, Kwai, Sina, 360, Mihoyo, and Ke Holdings. It is also widely used in traditional industries such as finance, energy, manufacturing, and telecommunications.

The users of Apache Doris: 🔗Users

Add your company logo at Apache Doris Website: 🔗Add Your Company

👣 Get Started

📚 Docs

All Documentation 🔗Docs

⬇️ Download

All releases and binary versions 🔗Download

🗄️ Compile

See how to compile 🔗Compilation

📮 Install

See how to install and deploy 🔗Installation and deployment

🧩 Components

📝 Doris Connector

Doris provides connectors that let Spark and Flink read data stored in Doris and write data into Doris.

🔗apache/doris-flink-connector

🔗apache/doris-spark-connector

🌈 Community and Support

📤 Subscribe Mailing Lists

Mailing lists are the most recognized form of communication in the Apache community. See how to 🔗Subscribe to Mailing Lists

🙋 Report Issues or Submit Pull Request

If you run into any problems, feel free to file a 🔗GitHub Issue or post in 🔗GitHub Discussion, and fix them by submitting a 🔗Pull Request

🍻 How to Contribute

We welcome your suggestions, comments (including criticisms), and contributions. See 🔗How to Contribute and the 🔗Code Submission Guide

⌨️ Doris Improvement Proposals (DSIP)

🔗Doris Improvement Proposals (DSIP) can be thought of as a collection of design documents for all major feature updates and improvements.

🔑 Backend C++ Coding Specification

The 🔗 Backend C++ Coding Specification should be strictly followed; it helps us achieve better code quality.

💬 Contact Us

Contact us through the following mailing list.

Name: [email protected]
Scope: Development-related discussions (Subscribe | Unsubscribe | Archives)


📜 License

Apache License, Version 2.0

Note: some licenses of the third-party dependencies are not compatible with the Apache 2.0 License, so you need to disable certain Doris features to comply with the Apache 2.0 License. For details, refer to thirdparty/LICENSE.txt

doris's People

Contributors

adonis0147 bitetheddddt caiconghui dataroaring eldenmoon emmymiao87 englefly gabriel39 happenlee hello-stephen hf200012 jacktengg jackwener jibing-li kaijchen kikyou1997 morningman morrysnow mrhhsg mryange starocean999 wangbo xiejiann xinyizzz xy720 yangzhg yiguolei yujun777 zhangstar333 zy-kkk


doris's Issues

llvm 3.3 has a bug

Debian stretch (the current stable release), gcc version 6.3.0 20170516 (Debian 6.3.0-18)

Compiling the BE fails with

cstddef:51:11: error: no member named 'max_align_t' in the global namespace

Upon investigation, this is a bug in old versions of LLVM.
See "clang++ only compiles C++11 program using boost::format when -std=c++11 option is dropped" and "Teach Clang to provide ::max_align_t in C11 and C++11 modes.". Manually adding the following after line 85 of cfe/lib/Headers/stddef.h

    #if __STDC_VERSION__ >= 201112L || __cplusplus >= 201103L
    typedef struct {
      long long __clang_max_align_nonce1
            __attribute__((__aligned__(__alignof__(long long))));
      long double __clang_max_align_nonce2
            __attribute__((__aligned__(__alignof__(long double))));
    } max_align_t;
    #define __CLANG_MAX_ALIGN_T_DEFINED
    #endif

fixes this problem. But recompiling then fails with a new error:

clang-3.3: /home/crc/palo/thirdparty/src/llvm-3.3.src/tools/clang/lib/Sema/SemaTemplateInstantiate.cpp:2695: llvm::PointerUnion<clang::Decl*, llvm::SmallVector<clang::Decl*, 4u>> clang::LocalInstantiationScope::findInstantiationOf(const clang::Decl*): Assertion `isa(D) && "declaration not instantiated in this scope"' failed.

Upon investigation, this is also a bug in old LLVM. See Bug 18653 - Abort after class declaration as template argument

I suggest upgrading the LLVM version this project depends on.

Compiling Palo on Ubuntu 16.04 fails

Building Palo with gcc 5.4: the thirdparty stage passes, but palo_be fails at the link stage. Details below:

[100%] Linking CXX executable palo_be
../../../../thirdparty/installed//lib/libglog.a(libglog_la-utilities.o): In function ‘google::GetStackTrace(void**, int, int)’:

/myresearch/palo/thirdparty/src/glog-0.3.3/src/stacktrace_libunwind-inl.h:65: undefined reference to ‘_Ux86_64_getcontext’
/myresearch/palo/thirdparty/src/glog-0.3.3/src/stacktrace_libunwind-inl.h:66: undefined reference to ‘_ULx86_64_init_local’
/myresearch/palo/thirdparty/src/glog-0.3.3/src/stacktrace_libunwind-inl.h:78: undefined reference to ‘_ULx86_64_step’
/myresearch/palo/thirdparty/src/glog-0.3.3/src/stacktrace_libunwind-inl.h:70: undefined reference to ‘_ULx86_64_get_reg’
/myresearch/palo/thirdparty/src/glog-0.3.3/src/stacktrace_libunwind-inl.h:78: undefined reference to ‘_ULx86_64_step’
collect2: error: ld returned 1 exit status
src/service/CMakeFiles/palo_be.dir/build.make:164: recipe for target 'src/service/palo_be' failed
make[2]: *** [src/service/palo_be] Error 1
CMakeFiles/Makefile2:696: recipe for target 'src/service/CMakeFiles/palo_be.dir/all' failed
make[1]: *** [src/service/CMakeFiles/palo_be.dir/all] Error 2
Makefile:127: recipe for target 'all' failed
make: *** [all] Error 2

Googling turned up https://code.google.com/p/google-glog/issues/detail?id=207
I modified glog-0.3.3/libglog.pc.in and added a line as suggested, but recompiling still gives the same error. Please take a look, thanks.

Unknown database 'default_cluster:bddatabase'

Cluster name: bdcluster, database: bddatabase
Connecting through Java or Navicat fails with Unknown database 'default_cluster:bddatabase'. How should the connection string be specified?

About sort field of non-aggregation table sql statement

In https://github.com/baidu/palo/blob/master/README.md:

In non-aggregation type of table, columns are not distinguished between dimensions and metrics, but should specify the sort columns in order to sort all rows. Palo will sort the table data according to the sort columns without any aggregation. The following figure gives an example of creating non-aggregation table.

-- Create non-aggregation table --
CREATE TABLE example_tbl (
    `date`      DATE,
    id          BIGINT,
    country     VARCHAR(32),
    click       BIGINT,
    cost        BIGINT
) DUPLICATE KEY(`date`, id, country)
DISTRIBUTED BY HASH(id) BUCKETS 32;

should specify the sort columns in order to sort all rows

Question:
Where is the sort column in the SQL statement? I assumed it should be something like:
date DATE Sort,
right? Is there something missing?

thirdparty build fails

===== begin build boost_1_64_0
Building Boost.Build engine with toolset gcc... tools/build/src/engine/bin.linuxx86_64/b2
Detecting Python version... 2.7
Detecting Python root... /data/anaconda
Unicode/ICU support for Boost.Regex?... not found.
Backing up existing Boost.Build configuration in project-config.jam.2
Generating Boost.Build configuration in project-config.jam...

Bootstrapping is done. To build, run:

./b2

To adjust configuration, edit 'project-config.jam'.
Further information:

Performing configuration checks

- 32-bit                   : no  (cached)
- 64-bit                   : yes (cached)
- arm                      : no  (cached)
- mips1                    : no  (cached)
- power                    : no  (cached)
- sparc                    : no  (cached)
- x86                      : yes (cached)
- symlinks supported       : yes (cached)
- C++11 mutex              : yes (cached)
- lockfree boost::atomic_flag : yes (cached)
- Boost.Config Feature Check: cxx11_auto_declarations : yes (cached)
- Boost.Config Feature Check: cxx11_constexpr : yes (cached)
- Boost.Config Feature Check: cxx11_defaulted_functions : yes (cached)
- Boost.Config Feature Check: cxx11_final : yes (cached)
- Boost.Config Feature Check: cxx11_hdr_mutex : yes (cached)
- Boost.Config Feature Check: cxx11_hdr_tuple : yes (cached)
- Boost.Config Feature Check: cxx11_lambdas : yes (cached)
- Boost.Config Feature Check: cxx11_noexcept : yes (cached)
- Boost.Config Feature Check: cxx11_nullptr : yes (cached)
- Boost.Config Feature Check: cxx11_rvalue_references : yes (cached)
- Boost.Config Feature Check: cxx11_template_aliases : yes (cached)
- Boost.Config Feature Check: cxx11_thread_local : yes (cached)
- Boost.Config Feature Check: cxx11_variadic_templates : yes (cached)
- zlib                     : yes (cached)
- bzip2                    : no  (cached)
- iconv (libc)             : yes (cached)
- icu                      : no  (cached)
- icu (lib64)              : no  (cached)
- native-atomic-int32-supported : yes (cached)
- native-syslog-supported  : yes (cached)
- pthread-supports-robust-mutexes : yes (cached)
- compiler-supports-visibility : yes (cached)
- compiler-supports-ssse3  : yes (cached)
- compiler-supports-avx2   : yes (cached)
- has_icu builds           : no  (cached)
- gcc visibility           : yes (cached)
- long double support      : yes (cached)
- zlib                     : yes (cached)
- bzip2                    : no  (cached)

Component configuration:

- atomic                   : building
- chrono                   : building
- container                : building
- context                  : building
- coroutine                : building
- coroutine2               : building
- date_time                : building
- exception                : building
- fiber                    : building
- filesystem               : building
- graph                    : not building
- graph_parallel           : not building
- iostreams                : building
- locale                   : building
- log                      : building
- math                     : building
- metaparse                : building
- mpi                      : not building
- program_options          : building
- python                   : not building
- random                   : building
- regex                    : building
- serialization            : building
- signals                  : building
- system                   : building
- test                     : building
- thread                   : building
- timer                    : building
- type_erasure             : building
- wave                     : building

libs/fiber/src/context.cpp: In constructor ‘boost::fibers::context::context(boost::fibers::dispatcher_context_t, const boost::context::preallocated&, const default_stack&, boost::fibers::scheduler*)’:
libs/fiber/src/context.cpp:236:14: error: call of overloaded ‘callcc(const std::allocator_arg_t&, const boost::context::preallocated&, const default_stack&, boost::fibers::context::context(boost::fibers::dispatcher_context_t, const boost::context::preallocated&, const default_stack&, boost::fibers::scheduler*)::__lambda5)’ is ambiguous
});
^
libs/fiber/src/context.cpp:236:14: note: candidates are:
In file included from /data/palo/palo/thirdparty/..//thirdparty/installed/include/boost/fiber/context.hpp:28:0,
from libs/fiber/src/context.cpp:7:
/data/palo/palo/thirdparty/..//thirdparty/installed/include/boost/context/continuation.hpp:469:1: note: boost::context::continuation boost::context::callcc(std::allocator_arg_t, StackAlloc, Fn&&, Arg ...) [with StackAlloc = boost::context::preallocated; Fn = const boost::context::basic_fixedsize_stackboost::context::stack_traits&; Arg = {boost::fibers::context::context(boost::fibers::dispatcher_context_t, const boost::context::preallocated&, const default_stack&, boost::fibers::scheduler*)::__lambda5}]
callcc( std::allocator_arg_t, StackAlloc salloc, Fn && fn, Arg ... arg) {
^
/data/palo/palo/thirdparty/..//thirdparty/installed/include/boost/context/continuation.hpp:483:1: note: boost::context::continuation boost::context::callcc(std::allocator_arg_t, boost::context::preallocated, StackAlloc, Fn&&, Arg ...) [with StackAlloc = boost::context::basic_fixedsize_stackboost::context::stack_traits; Fn = boost::fibers::context::context(boost::fibers::dispatcher_context_t, const boost::context::preallocated&, const default_stack&, boost::fibers::scheduler*)::__lambda5; Arg = {}]
callcc( std::allocator_arg_t, preallocated palloc, StackAlloc salloc, Fn && fn, Arg ... arg) {
^
/data/palo/palo/thirdparty/..//thirdparty/installed/include/boost/context/continuation.hpp:514:1: note: boost::context::continuation boost::context::callcc(std::allocator_arg_t, boost::context::preallocated, StackAlloc, Fn&&) [with StackAlloc = boost::context::basic_fixedsize_stackboost::context::stack_traits; Fn = boost::fibers::context::context(boost::fibers::dispatcher_context_t, const boost::context::preallocated&, const default_stack&, boost::fibers::scheduler*)::__lambda5]
callcc( std::allocator_arg_t, preallocated palloc, StackAlloc salloc, Fn && fn) {
^
/data/palo/palo/thirdparty/..//thirdparty/installed/include/boost/context/continuation.hpp:457:1: note: boost::context::continuation boost::context::callcc(Fn&&, Arg ...) [with Fn = const std::allocator_arg_t&; Arg = {boost::context::preallocated, boost::context::basic_fixedsize_stackboost::context::stack_traits, boost::fibers::context::context(boost::fibers::dispatcher_context_t, const boost::context::preallocated&, const default_stack&, boost::fibers::scheduler*)::__lambda5}; <template-parameter-1-3> = void]
callcc( Fn && fn, Arg ... arg) {
^
...skipped <pbin.v2/libs/fiber/build/gcc-4.8.5/release/link-static/threading-multi>libboost_fiber.a(clean) for lack of <pbin.v2/libs/fiber/build/gcc-4.8.5/release/link-static/threading-multi>context.o...
...skipped <pbin.v2/libs/fiber/build/gcc-4.8.5/release/link-static/threading-multi>libboost_fiber.a for lack of <pbin.v2/libs/fiber/build/gcc-4.8.5/release/link-static/threading-multi>context.o...
...skipped <p/data/palo/palo/thirdparty/installed/lib>libboost_fiber.a for lack of <pbin.v2/libs/fiber/build/gcc-4.8.5/release/link-static/threading-multi>libboost_fiber.a...
libs/fiber/src/context.cpp: In constructor ‘boost::fibers::context::context(boost::fibers::dispatcher_context_t, const boost::context::preallocated&, const default_stack&, boost::fibers::scheduler*)’:
libs/fiber/src/context.cpp:236:14: error: call of overloaded ‘callcc(const std::allocator_arg_t&, const boost::context::preallocated&, const default_stack&, boost::fibers::context::context(boost::fibers::dispatcher_context_t, const boost::context::preallocated&, const default_stack&, boost::fibers::scheduler*)::__lambda5)’ is ambiguous
});
^
libs/fiber/src/context.cpp:236:14: note: candidates are:
In file included from /data/palo/palo/thirdparty/..//thirdparty/installed/include/boost/fiber/context.hpp:28:0,
from libs/fiber/src/context.cpp:7:
/data/palo/palo/thirdparty/..//thirdparty/installed/include/boost/context/continuation.hpp:469:1: note: boost::context::continuation boost::context::callcc(std::allocator_arg_t, StackAlloc, Fn&&, Arg ...) [with StackAlloc = boost::context::preallocated; Fn = const boost::context::basic_fixedsize_stackboost::context::stack_traits&; Arg = {boost::fibers::context::context(boost::fibers::dispatcher_context_t, const boost::context::preallocated&, const default_stack&, boost::fibers::scheduler*)::__lambda5}]
callcc( std::allocator_arg_t, StackAlloc salloc, Fn && fn, Arg ... arg) {
^
/data/palo/palo/thirdparty/..//thirdparty/installed/include/boost/context/continuation.hpp:483:1: note: boost::context::continuation boost::context::callcc(std::allocator_arg_t, boost::context::preallocated, StackAlloc, Fn&&, Arg ...) [with StackAlloc = boost::context::basic_fixedsize_stackboost::context::stack_traits; Fn = boost::fibers::context::context(boost::fibers::dispatcher_context_t, const boost::context::preallocated&, const default_stack&, boost::fibers::scheduler*)::__lambda5; Arg = {}]
callcc( std::allocator_arg_t, preallocated palloc, StackAlloc salloc, Fn && fn, Arg ... arg) {
^
/data/palo/palo/thirdparty/..//thirdparty/installed/include/boost/context/continuation.hpp:514:1: note: boost::context::continuation boost::context::callcc(std::allocator_arg_t, boost::context::preallocated, StackAlloc, Fn&&) [with StackAlloc = boost::context::basic_fixedsize_stackboost::context::stack_traits; Fn = boost::fibers::context::context(boost::fibers::dispatcher_context_t, const boost::context::preallocated&, const default_stack&, boost::fibers::scheduler*)::__lambda5]
callcc( std::allocator_arg_t, preallocated palloc, StackAlloc salloc, Fn && fn) {
^
/data/palo/palo/thirdparty/..//thirdparty/installed/include/boost/context/continuation.hpp:457:1: note: boost::context::continuation boost::context::callcc(Fn&&, Arg ...) [with Fn = const std::allocator_arg_t&; Arg = {boost::context::preallocated, boost::context::basic_fixedsize_stackboost::context::stack_traits, boost::fibers::context::context(boost::fibers::dispatcher_context_t, const boost::context::preallocated&, const default_stack&, boost::fibers::scheduler*)::__lambda5}; <template-parameter-1-3> = void]
callcc( Fn && fn, Arg ... arg) {
^
...skipped <pbin.v2/libs/fiber/build/gcc-4.8.5/release/threading-multi>libboost_fiber.so.1.64.0 for lack of <pbin.v2/libs/fiber/build/gcc-4.8.5/release/threading-multi>context.o...
...skipped <p/data/palo/palo/thirdparty/installed/lib>libboost_fiber.so.1.64.0 for lack of <pbin.v2/libs/fiber/build/gcc-4.8.5/release/threading-multi>libboost_fiber.so.1.64.0...
...skipped <p/data/palo/palo/thirdparty/installed/lib>libboost_fiber.so for lack of <p/data/palo/palo/thirdparty/installed/lib>libboost_fiber.so.1.64.0...
...failed updating 2 targets...

Three BEs are alive, but the error ERROR 1050 (42S01): Failed to find enough host in all backends. need: 3 is reported

CREATE TABLE ssb.lineorder (
lo_orderkey BIGINT,
lo_linenumber BIGINT,
lo_custkey INT,
lo_partkey INT,
lo_suppkey INT,
lo_orderdate INT,
lo_orderpriotity VARCHAR(16) REPLACE,
lo_shippriotity INT SUM,
lo_quantity BIGINT SUM,
lo_extendedprice BIGINT SUM,
lo_ordtotalprice BIGINT SUM,
lo_discount BIGINT SUM,
lo_revenue BIGINT SUM,
lo_supplycost BIGINT SUM,
lo_tax BIGINT SUM,
lo_commitdate BIGINT SUM,
lo_shipmode VARCHAR(11) REPLACE )
DISTRIBUTED BY RANDOM;

mysql> SHOW PROC '/backends' ;
+-----------------+-----------+--------------+-----------------------------------+---------------+--------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+
| Cluster | BackendId | IP | HostName | HeartbeatPort | BePort | HttpPort | LastStartTime | LastHeartbeat | Alive | SystemDecommissioned | ClusterDecommissioned | TabletNum |
+-----------------+-----------+--------------+-----------------------------------+---------------+--------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+
| default_cluster | 10001 | 192.168.4.67 | db86-test-qc1 | 9050 | 9060 | 8040 | 2017-09-21 11:29:54 | 2017-09-21 17:09:15 | true | false | true | 10 |
| default_cluster | 10002 | 192.168.4.69 | db88-test-qc1 | 9050 | 9060 | 8040 | 2017-09-21 11:40:14 | 2017-09-21 17:09:15 | true | false | false | 23 |
| default_cluster | 10003 | 192.168.4.67 | db86-test-qc1 | 9060 | -1 | -1 | N/A | N/A | false | false | false | 0 |
| default_cluster | 10004 | 192.168.4.37 | qc1-testenv3-tomcat-quantgroup.cn | 9050 | 9060 | 8040 | 2017-09-21 16:19:34 | 2017-09-21 17:09:15 | true | false | false | 27 |
+-----------------+-----------+--------------+-----------------------------------+---------------+--------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+
4 rows in set (0.00 sec)

BE compilation error, please help take a look

Determining if the pthread_create exist failed with the following output:
Change Dir: /root/palo/palo-master/be/build/CMakeFiles/CMakeTmp

Run Build Command:"/usr/bin/gmake" "cmTC_709a2/fast"
/usr/bin/gmake -f CMakeFiles/cmTC_709a2.dir/build.make CMakeFiles/cmTC_709a2.dir/build
gmake[1]: Entering directory '/root/palo/palo-master/be/build/CMakeFiles/CMakeTmp'
Building C object CMakeFiles/cmTC_709a2.dir/CheckSymbolExists.c.o
/usr/bin/cc -o CMakeFiles/cmTC_709a2.dir/CheckSymbolExists.c.o -c /root/palo/palo-master/be/build/CMakeFiles/CMakeTmp/CheckSymbolExists.c
Linking C executable cmTC_709a2
/usr/bin/cmake -E cmake_link_script CMakeFiles/cmTC_709a2.dir/link.txt --verbose=1
/usr/bin/cc CMakeFiles/cmTC_709a2.dir/CheckSymbolExists.c.o -o cmTC_709a2 -rdynamic
CMakeFiles/cmTC_709a2.dir/CheckSymbolExists.c.o: In function `main':
CheckSymbolExists.c:(.text+0x16): undefined reference to `pthread_create'
collect2: error: ld returned 1 exit status
gmake[1]: *** [cmTC_709a2] Error 1
gmake[1]: Leaving directory '/root/palo/palo-master/be/build/CMakeFiles/CMakeTmp'
gmake: *** [cmTC_709a2/fast] Error 2

File /root/palo/palo-master/be/build/CMakeFiles/CMakeTmp/CheckSymbolExists.c:
/* */
#include <pthread.h>

int main(int argc, char** argv)
{
(void)argv;
#ifndef pthread_create
return ((int*)(&pthread_create))[argc];
#else
(void)argc;
return 0;
#endif
}

Determining if the function pthread_create exists in the pthreads failed with the following output:
Change Dir: /root/palo/palo-master/be/build/CMakeFiles/CMakeTmp

Run Build Command:"/usr/bin/gmake" "cmTC_69589/fast"
/usr/bin/gmake -f CMakeFiles/cmTC_69589.dir/build.make CMakeFiles/cmTC_69589.dir/build
gmake[1]: Entering directory '/root/palo/palo-master/be/build/CMakeFiles/CMakeTmp'
Building C object CMakeFiles/cmTC_69589.dir/CheckFunctionExists.c.o
/usr/bin/cc -DCHECK_FUNCTION_EXISTS=pthread_create -o CMakeFiles/cmTC_69589.dir/CheckFunctionExists.c.o -c /usr/share/cmake-3.5/Modules/CheckFunctionExists.c
Linking C executable cmTC_69589
/usr/bin/cmake -E cmake_link_script CMakeFiles/cmTC_69589.dir/link.txt --verbose=1
/usr/bin/cc -DCHECK_FUNCTION_EXISTS=pthread_create CMakeFiles/cmTC_69589.dir/CheckFunctionExists.c.o -o cmTC_69589 -rdynamic -lpthreads
/usr/bin/ld: cannot find -lpthreads
collect2: error: ld returned 1 exit status
gmake[1]: *** [cmTC_69589] Error 1
gmake[1]: Leaving directory '/root/palo/palo-master/be/build/CMakeFiles/CMakeTmp'
gmake: *** [cmTC_69589/fast] Error 2

BE compilation error: redefinition of struct std::hash<__int128>

Compiling the BE fails with

In file included from /home/crc/palo/be/src/runtime/decimal_value.h:34:0,
from /home/crc/palo/be/src/udf/udf.cpp:27:
/home/crc/palo/be/src/util/hash_util.hpp:303:8: error: redefinition of ‘struct std::hash<__int128>’
struct hash<__int128> {
^~~~~~~~~~~~~~
In file included from /usr/include/c++/6/bits/basic_string.h:5643:0,
from /usr/include/c++/6/string:52,
from /usr/include/c++/6/bits/locale_classes.h:40,
from /usr/include/c++/6/bits/ios_base.h:41,
from /usr/include/c++/6/ios:42,
from /usr/include/c++/6/ostream:38,
from /usr/include/c++/6/iostream:39,
from /home/crc/palo/be/src/udf/udf.cpp:23:
/usr/include/c++/6/bits/functional_hash.h:153:3: error: previous definition of ‘struct std::hash<__int128>’
_Cxx_hashtable_define_trivial_hash(__GLIBCXX_TYPE_INT_N_0)
^

I am using Debian stretch (the current stable release), gcc version 6.3.0 20170516 (Debian 6.3.0-18).
The output of gcc -print-prog-name=cc1plus -v is as follows:
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/6/../../../../x86_64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
/usr/include/c++/6
/usr/include/x86_64-linux-gnu/c++/6
/usr/include/c++/6/backward
/usr/lib/gcc/x86_64-linux-gnu/6/include
/usr/local/include
/usr/lib/gcc/x86_64-linux-gnu/6/include-fixed
/usr/include
End of search list.

I manually set CLANG_BASE_FLAGS in be/CMakeLists.txt as follows:
set(CLANG_BASE_FLAGS
"-I/usr/include/c++/6/"
"-I/usr/include/x86_64-linux-gnu/c++/6")

Cluster decommission never completes

After creating a cluster with 3 BEs, I dropped one BE and then re-added it; at that point the cluster cannot restore this BE into the cluster it belonged to. I then ran alter cluster cluster_name properties("instance_num" = "2") to set the instance number to 2, and the cluster has stayed in the cluster decommission state ever since, never completing the replica reduction.

Compiling thirdparty on Red Hat 6.3 fails in LLVM.

-- Build files have been written to: /root/zzp/palo-master/thirdparty/src/llvm-3.3.src/build
[  1%] Built target count
[  1%] Built target compiler-rt-headers
[  1%] Built target LLVMHello
[  1%] Building CXX object projects/compiler-rt/lib/interception/CMakeFiles/RTInterception.x86_64.dir/interception_type_test.cc.o
[  2%] Built target RTSanitizerCommon.x86_64
[  6%] Built target LLVMSupport
Scanning dependencies of target clang_rt.san-x86_64
Scanning dependencies of target RTSanitizerCommon.test.x86_64
[  6%] Linking CXX static library libRTSanitizerCommon.test.x86_64.a
[ 14%] Built target clang_rt.x86_64
[ 14%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_allocator.cc.o
/root/zzp/palo-master/thirdparty/src/llvm-3.3.src/projects/compiler-rt/lib/interception/interception_type_test.cc:28: error: reference to ‘OFF64_T’ is ambiguous
/root/zzp/palo-master/thirdparty/src/llvm-3.3.src/projects/compiler-rt/lib/interception/interception.h:31: error: candidates are: typedef __sanitizer::OFF64_T OFF64_T
/root/zzp/palo-master/thirdparty/src/llvm-3.3.src/projects/compiler-rt/lib/interception/../sanitizer_common/sanitizer_internal_defs.h:80: error:                 typedef __sanitizer::u64 __sanitizer::OFF64_T
/root/zzp/palo-master/thirdparty/src/llvm-3.3.src/projects/compiler-rt/lib/interception/interception_type_test.cc:28: error: reference to ‘OFF64_T’ is ambiguous
/root/zzp/palo-master/thirdparty/src/llvm-3.3.src/projects/compiler-rt/lib/interception/interception.h:31: error: candidates are: typedef __sanitizer::OFF64_T OFF64_T
/root/zzp/palo-master/thirdparty/src/llvm-3.3.src/projects/compiler-rt/lib/interception/../sanitizer_common/sanitizer_internal_defs.h:80: error:                 typedef __sanitizer::u64 __sanitizer::OFF64_T
/root/zzp/palo-master/thirdparty/src/llvm-3.3.src/projects/compiler-rt/lib/interception/interception_type_test.cc:36: error: reference to ‘OFF_T’ is ambiguous
/root/zzp/palo-master/thirdparty/src/llvm-3.3.src/projects/compiler-rt/lib/interception/interception.h:30: error: candidates are: typedef __sanitizer::OFF_T OFF_T
/root/zzp/palo-master/thirdparty/src/llvm-3.3.src/projects/compiler-rt/lib/interception/../sanitizer_common/sanitizer_internal_defs.h:76: error:                 typedef __sanitizer::u64 __sanitizer::OFF_T
/root/zzp/palo-master/thirdparty/src/llvm-3.3.src/projects/compiler-rt/lib/interception/interception_type_test.cc:36: error: reference to ‘OFF_T’ is ambiguous
/root/zzp/palo-master/thirdparty/src/llvm-3.3.src/projects/compiler-rt/lib/interception/interception.h:30: error: candidates are: typedef __sanitizer::OFF_T OFF_T
/root/zzp/palo-master/thirdparty/src/llvm-3.3.src/projects/compiler-rt/lib/interception/../sanitizer_common/sanitizer_internal_defs.h:76: error:                 typedef __sanitizer::u64 __sanitizer::OFF_T
cc1plus: warning: unrecognized command line option "-Wno-c99-extensions"
[ 14%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_common.cc.o
make[2]: *** [projects/compiler-rt/lib/interception/CMakeFiles/RTInterception.x86_64.dir/interception_type_test.cc.o] Error 1
make[1]: *** [projects/compiler-rt/lib/interception/CMakeFiles/RTInterception.x86_64.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[ 14%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_flags.cc.o
[ 14%] Built target RTSanitizerCommon.test.x86_64
[ 14%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_libc.cc.o
[ 14%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_linux.cc.o
[ 14%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_mac.cc.o
[ 14%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_platform_limits_posix.cc.o
[ 14%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_posix.cc.o
[ 14%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_printf.cc.o
[ 14%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_stackdepot.cc.o
[ 14%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_stacktrace.cc.o
[ 14%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_stoptheworld_linux.cc.o
[ 14%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_symbolizer.cc.o
[ 14%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_symbolizer_itanium.cc.o
[ 14%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_symbolizer_linux.cc.o
[ 15%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_symbolizer_mac.cc.o
[ 15%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_symbolizer_win.cc.o
[ 15%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_thread_registry.cc.o
[ 15%] Building CXX object projects/compiler-rt/lib/sanitizer_common/CMakeFiles/clang_rt.san-x86_64.dir/sanitizer_win.cc.o
[ 15%] Linking CXX static library ../../../../lib/clang/3.3/lib/linux/libclang_rt.san-x86_64.a
[ 15%] Built target clang_rt.san-x86_64
make: *** [all] Error 2

The README is really well written. I have 3 questions and hope you can answer them

1. Does Paxos solve the performance problem of master-replica replication mainly by mixing synchronous and asynchronous copying? If so, that seems really simple. It seems comparable to binlog synchronization in a multi-master MySQL cluster. But I don't understand why this is only used in the FE and not the BE. Isn't Palo an MPP OLAP system? And what is the real-time data update requirement about? I don't quite get it.

2. The part about MVCC updating multiple tables and bumping version numbers feels underexplained. With an MVCC like Clojure's STM, a failed transaction is automatically rolled back and retried, while MySQL InnoDB's MVCC actually requires row-level locks. So it is unclear how MVCC supports highly concurrent cross-table update transactions?

3. Is the LLVM-related code a custom implementation modeled on Impala's, or mostly code ported from Impala? I see that variables and classes are all named palo* and the like.

HTTP mini-batch load fails

One FE and one BE, using the prebuilt package palo-0.8.0_20170816_ubuntu16.04_gcc540.tar.gz. Following the mini-batch load section of the tutorial at https://github.com/baidu/palo/blob/master/docs/user_guide/tutorial.md, I ran the command below

$ curl --location-trusted -u test@example_cluster:test -T table1_data http://127.0.0.1:8030/api/example_db/table1/_load?label=table1_20170816

It returns

{
    "status": "Fail",
    "msg": "Request miniload from master(127.0.1.1:9020) because: No more data to read."
}

The FE log is as follows

2017-08-16 18:02:41,764 ERROR 126 [TThreadPoolServer$WorkerProcess.run():297] Error occurred during processing of message.
java.lang.NullPointerException
at com.baidu.palo.load.Load.createLoadJob(Load.java:515) ~[palo-fe.jar:?]
at com.baidu.palo.load.Load.addLoadJob(Load.java:339) ~[palo-fe.jar:?]
at com.baidu.palo.load.Load.addLoadJob(Load.java:322) ~[palo-fe.jar:?]
at com.baidu.palo.service.FrontendServiceImpl.miniLoad(FrontendServiceImpl.java:295) ~[palo-fe.jar:?]
at com.baidu.palo.thrift.FrontendService$Processor$miniLoad.getResult(FrontendService.java:1113) ~[palo-fe.jar:?]
at com.baidu.palo.thrift.FrontendService$Processor$miniLoad.getResult(FrontendService.java:1098) ~[palo-fe.jar:?]
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[libthrift-0.9.3.jar:0.9.3]
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[libthrift-0.9.3.jar:0.9.3]
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [libthrift-0.9.3.jar:0.9.3]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_141]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_141]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]

The BE log is as follows

I0816 18:02:41.753830 10454 mini_load.cpp:472] accept one request HttpRequest:
method:1
uri:/api/example_db/table1/_load
raw_path:/api/example_db/table1/_load
headers:
key=Accept, value=/
key=Authorization, value=Basic dGVzdEBleGFtcGxlX2NsdXN0ZXI6dGVzdA==
key=Content-Length, value=65
key=Expect, value=100-continue
key=Host, value=127.0.1.1:8040
key=User-Agent, value=curl/7.52.1
params:
key=db, value=example_db
key=label, value=table1_20170816
key=table, value=table1
I0816 18:02:41.758327 10454 mini_load.cpp:221] Save file to path /home/crc/palo-0.8.0_ubuntu16.04_gcc540/disks/palo.SSD/mini_download/example_db/table1_20170816/table1..20170816180241.755049 success.
W0816 18:02:41.761920 10454 mini_load.cpp:345] Retrying mini load from master(127.0.1.1:9020) because: No more data to read.
W0816 18:02:41.766263 10454 mini_load.cpp:362] Request miniload from master(127.0.1.1:9020) because: No more data to read.
I0816 18:02:41.797876 10454 status.cpp:49] Request miniload from master(127.0.1.1:9020) because: No more data to read.
@ 0xb48300 palo::Status::Status()
@ 0xf995a9 palo::MiniLoadAction::load()
@ 0xf9a5e0 palo::MiniLoadAction::handle()
@ 0xf8346a palo::Webserver::mongoose_callback()
@ 0xf83bb9 palo::Webserver::mongoose_callback_static()
@ 0xfa4fec call_user
@ 0xfada54 handle_request
@ 0xfafbf3 process_new_connection
@ 0xfb006d worker_thread
@ 0x7f2f52fbe493 start_thread
@ 0x7f2f51d0eafe (unknown)

Compiling the BE module on Fedora fails

-- Boost version: 1.64.0
-- Found the following Boost libraries:
--   thread
--   regex
--   system
--   filesystem
--   date_time
--   program_options
--   chrono
--   atomic
-- 
-- LLVM llvm-config found at: /home/aegeaner/Play/palo/thirdparty/installed/bin/llvm-config
-- LLVM clang++ found at: /home/aegeaner/Play/palo/thirdparty/installed/bin/clang++
-- LLVM opt found at: /home/aegeaner/Play/palo/thirdparty/installed/bin/opt
llvm-config: unknown component name: jit
CMake Error at CMakeLists.txt:252 (string):
  string sub-command REPLACE requires at least four arguments.


-- LLVM include dir: /usr/include/llvm3.7
-- LLVM lib dir: /usr/lib64/llvm3.7
-- LLVM libs: -ldl
-- LLVM compile flags: -L/usr/lib64/llvm3.7 -D__GLIBCXX_BITSIZE_INT_N_0=128 -D__GLIBCXX_TYPE_INT_N_0=__int128
-- fedora

-- Compiler Flags: -msse4.2 -Wall -Wno-sign-compare -Wno-deprecated -pthread -DBOOST_DATE_TIME_POSIX_TIME_STD_CONFIG -D__STDC_FORMAT_MACROS -g -O2 -ggdb -Wno-unused-local-typedefs -Wno-strict-aliasing -std=gnu++11 -DPERFORMANCE -D_FILE_OFFSET_BITS=64
-- Configuring incomplete, errors occurred!

be/build/CMakeFiles/CMakeError.log:

Determining if the pthread_create exist failed with the following output:
Change Dir: /home/aegeaner/palo/be/build/CMakeFiles/CMakeTmp

Run Build Command:"/usr/bin/gmake" "cmTC_f8f50/fast"
/usr/bin/gmake -f CMakeFiles/cmTC_f8f50.dir/build.make CMakeFiles/cmTC_f8f50.dir/build
gmake[1]: Entering directory '/home/aegeaner/palo/be/build/CMakeFiles/CMakeTmp'
Building C object CMakeFiles/cmTC_f8f50.dir/CheckSymbolExists.c.o
/usr/bin/cc    -o CMakeFiles/cmTC_f8f50.dir/CheckSymbolExists.c.o   -c /home/aegeaner/palo/be/build/CMakeFiles/CMakeTmp/CheckSymbolExists.c
Linking C executable cmTC_f8f50
/usr/bin/cmake -E cmake_link_script CMakeFiles/cmTC_f8f50.dir/link.txt --verbose=1
/usr/bin/cc      -rdynamic CMakeFiles/cmTC_f8f50.dir/CheckSymbolExists.c.o  -o cmTC_f8f50 
CMakeFiles/cmTC_f8f50.dir/CheckSymbolExists.c.o: In function `main':
CheckSymbolExists.c:(.text+0x16): undefined reference to `pthread_create'
collect2: error: ld returned 1 exit status
gmake[1]: *** [CMakeFiles/cmTC_f8f50.dir/build.make:98: cmTC_f8f50] Error 1
gmake[1]: Leaving directory '/home/aegeaner/palo/be/build/CMakeFiles/CMakeTmp'
gmake: *** [Makefile:126: cmTC_f8f50/fast] Error 2

File /home/aegeaner/palo/be/build/CMakeFiles/CMakeTmp/CheckSymbolExists.c:
/* */
#include <pthread.h>

int main(int argc, char** argv)
{
  (void)argv;
#ifndef pthread_create
  return ((int*)(&pthread_create))[argc];
#else
  (void)argc;
  return 0;
#endif
}

Determining if the function pthread_create exists in the pthreads failed with the following output:
Change Dir: /home/aegeaner/palo/be/build/CMakeFiles/CMakeTmp

Run Build Command:"/usr/bin/gmake" "cmTC_b8bdb/fast"
/usr/bin/gmake -f CMakeFiles/cmTC_b8bdb.dir/build.make CMakeFiles/cmTC_b8bdb.dir/build
gmake[1]: Entering directory '/home/aegeaner/palo/be/build/CMakeFiles/CMakeTmp'
Building C object CMakeFiles/cmTC_b8bdb.dir/CheckFunctionExists.c.o
/usr/bin/cc   -DCHECK_FUNCTION_EXISTS=pthread_create   -o CMakeFiles/cmTC_b8bdb.dir/CheckFunctionExists.c.o   -c /usr/share/cmake/Modules/CheckFunctionExists.c
Linking C executable cmTC_b8bdb
/usr/bin/cmake -E cmake_link_script CMakeFiles/cmTC_b8bdb.dir/link.txt --verbose=1
/usr/bin/cc  -DCHECK_FUNCTION_EXISTS=pthread_create    -rdynamic CMakeFiles/cmTC_b8bdb.dir/CheckFunctionExists.c.o  -o cmTC_b8bdb -lpthreads 
/usr/bin/ld: cannot find -lpthreads
collect2: error: ld returned 1 exit status
gmake[1]: *** [CMakeFiles/cmTC_b8bdb.dir/build.make:98: cmTC_b8bdb] Error 1
gmake[1]: Leaving directory '/home/aegeaner/palo/be/build/CMakeFiles/CMakeTmp'
gmake: *** [Makefile:126: cmTC_b8bdb/fast] Error 2

Failed to find enough alive backends

Creating a table with the default three-replica setting reports that there are not enough alive nodes.
MySQL [dbtest]> CREATE TABLE table1 ( siteid INT DEFAULT '10', citycode SMALLINT, username VARCHAR(32) DEFAULT '', pv BIGINT SUM DEFAULT '0' ) AGGREGATE KEY(siteid, citycode, username) DISTRIBUTED BY HASH(siteid) BUCKETS 10 ;
ERROR 1050 (42S01): Failed to find enough alive backends. need: 3
MySQL [dbtest]> SHOW PROC '/backends';
+-----------------+-----------+---------------+----------+---------------+--------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+
| Cluster | BackendId | IP | HostName | HeartbeatPort | BePort | HttpPort | LastStartTime | LastHeartbeat | Alive | SystemDecommissioned | ClusterDecommissioned | TabletNum |
+-----------------+-----------+---------------+----------+---------------+--------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+
| | 10376 | 192.168.5.100 | FineBI1 | 10050 | 9060 | 10040 | 2017-08-17 11:00:44 | 2017-08-17 11:13:09 | true | false | false | 0 |
| example_cluster | 10001 | 192.168.5.105 | FineBI2 | 9050 | 9060 | 8040 | 2017-08-17 11:00:54 | 2017-08-17 11:13:09 | true | false | false | 37 |
| example_cluster | 10002 | 192.168.5.106 | FineBI3 | 9050 | 9060 | 8040 | 2017-08-17 11:00:59 | 2017-08-17 11:13:09 | true | false | false | 37 |
+-----------------+-----------+---------------+----------+---------------+--------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+
3 rows in set (0.00 sec)

Error when running sh build.sh

When running sh build.sh, the build fails.
The error message is as follows:
CMake Error at CMakeLists.txt:291 (message):

Root cause found:
the lsb_release command is not available, because CentOS 7.1 does not install the corresponding package by default.

Fix:
install the corresponding package.

The detailed steps follow.
1. Symptom
[root@kdhs1091 palo-master]# sh build.sh
Apache Ant(TM) version 1.9.2 compiled on June 10 2014
Get params:
BUILD_BE -- 1
BUILD_FE -- 1
CLEAN -- 0
RUN_UT -- 0

Build generated code
make -C script
make[1]: Entering directory '/opt/soft/palo-master/gensrc/script'
/opt/soft/palo-master/gensrc/script/gen_build_version.sh
make[1]: Leaving directory '/opt/soft/palo-master/gensrc/script'
make -C proto
make[1]: Entering directory '/opt/soft/palo-master/gensrc/proto'
make[1]: Nothing to be done for 'all'.
make[1]: Leaving directory '/opt/soft/palo-master/gensrc/proto'
make -C thrift
make[1]: Entering directory '/opt/soft/palo-master/gensrc/thrift'
make[1]: Nothing to be done for 'all'.
make[1]: Leaving directory '/opt/soft/palo-master/gensrc/thrift'
make -C parser
make[1]: Entering directory '/opt/soft/palo-master/gensrc/parser'
make[1]: Nothing to be done for 'all'.
make[1]: Leaving directory '/opt/soft/palo-master/gensrc/parser'
Build Backend
-- GCC version: 4.8.5

-- GCC major version: 4
-- GCC minor version: 8
-- defined PIC_LIB_PATH
-- build gensrc if necessary
make: Entering directory '/opt/soft/palo-master/gensrc'
make -C script
make[1]: Entering directory '/opt/soft/palo-master/gensrc/script'
/opt/soft/palo-master/gensrc/script/gen_build_version.sh
make[1]: Leaving directory '/opt/soft/palo-master/gensrc/script'
make -C proto
make[1]: Entering directory '/opt/soft/palo-master/gensrc/proto'
make[1]: Nothing to be done for 'all'.
make[1]: Leaving directory '/opt/soft/palo-master/gensrc/proto'
make -C thrift
make[1]: Entering directory '/opt/soft/palo-master/gensrc/thrift'
make[1]: Nothing to be done for 'all'.
make[1]: Leaving directory '/opt/soft/palo-master/gensrc/thrift'
make -C parser
make[1]: Entering directory '/opt/soft/palo-master/gensrc/parser'
make[1]: Nothing to be done for 'all'.
make[1]: Leaving directory '/opt/soft/palo-master/gensrc/parser'
make: Leaving directory '/opt/soft/palo-master/gensrc'
-- Boost version: 1.64.0
-- Found the following Boost libraries:
-- thread
-- regex
-- system
-- filesystem
-- date_time
-- program_options

-- LLVM llvm-config found at: /opt/soft/palo-master/thirdparty/installed/bin/llvm-config
-- LLVM clang++ found at: /opt/soft/palo-master/thirdparty/installed/bin/clang++
-- LLVM opt found at: /opt/soft/palo-master/thirdparty/installed/bin/opt
-- GCC version is less than 5.0.0, no need to set -D__GLIBCXX_BITSIZE_INT_N_0=128 and -D__GLIBCXX_TYPE_INT_N_0=__int128
-- CLANG_IR_CXX_FLAGS: -std=gnu++11;-c;-emit-llvm;-D__STDC_CONSTANT_MACROS;-D__STDC_FORMAT_MACROS;-D__STDC_LIMIT_MACROS;-DIR_COMPILE;-DNDEBUG;-DHAVE_INTTYPES_H;-DHAVE_NETINET_IN_H;-DBOOST_DATE_TIME_POSIX_TIME_STD_CONFIG;-D__GLIBCXX_BITSIZE_INT_N_0=128;-D__GLIBCXX_TYPE_INT_N_0=__int128;-U_GLIBCXX_USE_FLOAT128
-- LLVM include dir: /opt/soft/palo-master/thirdparty/installed/include
-- LLVM lib dir: /opt/soft/palo-master/thirdparty/installed/lib
-- LLVM libs: -ldl;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMBitReader.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMipo.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMVectorize.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMX86Disassembler.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMX86AsmParser.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMX86CodeGen.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMSelectionDAG.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMAsmPrinter.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMMCParser.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMX86Desc.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMX86Info.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMX86AsmPrinter.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMX86Utils.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMJIT.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMRuntimeDyld.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMExecutionEngine.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMCodeGen.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMObjCARCOpts.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMScalarOpts.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMInstCombine.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMTransformUtils.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMipa.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMAnalysis.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMTarget.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMMC.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMObject.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMCore.a;/opt/soft/palo-master/thirdparty/installed/lib/libLLVMSupport.a
-- LLVM compile flags: -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
CMake Error at CMakeLists.txt:272 (string):
string no output variable specified

--
CMake Error at CMakeLists.txt:291 (message):
Currently not support system

-- Configuring incomplete, errors occurred!
See also "/opt/soft/palo-master/be/build/CMakeFiles/CMakeOutput.log".
[root@kdhs1091 palo-master]#

2. Find the cause of the error
[root@kdhs1091 be]# lsb_release
bash: lsb_release: command not found...
[root@kdhs1091 be]#

3. Find the package that needs to be installed
[root@kdhs1091 palo-master]# yum provides */lsb_release
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile

  • base: centos.ustc.edu.cn
  • extras: mirrors.163.com
  • updates: centos.ustc.edu.cn
    redhat-lsb-core-4.1-27.el7.centos.1.i686 : LSB Core module support
    Repo : base
    Matched from:
    Filename : /usr/bin/lsb_release

redhat-lsb-core-4.1-27.el7.centos.1.x86_64 : LSB Core module support
Repo : base
Matched from:
Filename : /usr/bin/lsb_release

4. Install the package
[root@kdhs1091 palo-master]# yum install redhat-lsb-core-4.1-27.el7.centos.1.i686

5. Running it again succeeds
[root@kdhs1091 palo-master]# sh build.sh

Third-party library build failure

Build environment:

Linux dmp1 3.10.0-693.2.2.el7.x86_64 #1 SMP Tue Sep 12 22:26:13 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Build version:

Branch:Master

Last Commit:44ed16cd23797137ba08f62a7e66fad7a87b3abf

Error message:

The screenshot shows the detailed information of the build failure:

[image: build failure details]

Export LD_LIBRARY_PATH in build-thirdparty.sh

mkdir -p $TP_DIR/src
mkdir -p $TP_DIR/installed (line 52)
export LD_LIBRARY_PATH=$TP_DIR/installed/lib # add this line

Otherwise, when the local machine lacks the libcrypto.so.1.0.0 library, configuring thrift fails with

===== begin build thrift-0.9.3
/home/crc/palo/thirdparty/..//thirdparty/src/thrift-0.9.3
/home/crc/palo/thirdparty/..//thirdparty/installed/lib
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking whether UID '1000' is supported by ustar format... yes
checking whether GID '1000' is supported by ustar format... yes
checking how to create a ustar tar archive... gnutar
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... configure: error: in `/home/crc/palo/thirdparty/src/thrift-0.9.3':
configure: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details

config.log contains this line:
./conftest error while loading shared libraries: libcrypto.so.1.0.0: cannot open shared object file: No such file or directory
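A quick way to confirm this class of failure, sketched under the assumption that the thirdparty tree lives at $TP_DIR (the paths here are illustrative, not from the original report):

# Check whether the freshly built OpenSSL libs exist and are visible to the dynamic linker
find $TP_DIR/installed/lib -name "libcrypto.so*"
ldconfig -p | grep libcrypto
# Workaround before re-running build-thirdparty.sh (this is the fix proposed above)
export LD_LIBRARY_PATH=$TP_DIR/installed/lib:$LD_LIBRARY_PATH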

FE fails to start

After running the start script, fe.log contains the following errors:
2017-09-15 17:06:53,871 INFO 1 [PaloFe.main():66] Palo FE start
2017-09-15 17:06:54,172 INFO 1 [ConsistencyChecker.initWorkTime():106] consistency checker will work from 23:00 to 4:00
2017-09-15 17:06:54,386 INFO 1 [Catalog.loadImage():888] image does not exist: /fe/palo-meta/image/image.0
2017-09-15 17:06:55,426 INFO 1 [BDBEnvironment.setup():148] add helper[12.99.106.135:9040] as ReplicationGroupAdmin
2017-09-15 17:06:55,433 WARN 1 [BDBStateChangeListener.stateChange():88] transfer from INIT to UNKNOWN
2017-09-15 17:06:55,455 WARN 43 [Catalog.setCanRead():1646] meta out of date. current time: 1505466415454, synchronized time: 0, has log: false, fe type: UNKNOWN
2017-09-15 17:06:57,455 WARN 32 [BDBStateChangeListener.stateChange():39] transfer from UNKNOWN to MASTER
2017-09-15 17:06:57,459 INFO 42 [BDBHA.fencing():63] start fencing, epoch number is 4
2017-09-15 17:06:57,467 ERROR 42 [Catalog.checkCurrentNodeExist():803] current node is not added to the cluster, will exit
What is the cause of this, and how can it be fixed?
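The last log line suggests this FE elected itself master without ever being registered as a member of the cluster. A plausible recovery sketch, using the helper address from the log above (the 9030 query port, the new_fe_host placeholder, and the exact statement syntax are assumptions based on current Doris docs, not the original report):

# 1) From a mysql client connected to an FE that is already in the cluster,
#    register the new FE first (new_fe_host is a hypothetical placeholder):
mysql -h 12.99.106.135 -P 9030 -uroot -e "ALTER SYSTEM ADD FOLLOWER 'new_fe_host:9040';"
# 2) Then start the new FE pointing at the existing member as helper:
sh bin/start_fe.sh --helper 12.99.106.135:9040 --daemon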

Filtering a VARCHAR column in a WHERE clause returns no rows

The system is CentOS 7.3, using the prebuilt palo-0.8.0_centos7.1.
The table was created exactly as in the tutorial:

CREATE TABLE table1
(
    siteid INT DEFAULT '10',
    citycode SMALLINT,
    username VARCHAR(32) DEFAULT '',
    pv BIGINT SUM DEFAULT '0'
)
AGGREGATE KEY(siteid, citycode, username)
DISTRIBUTED BY HASH(siteid) BUCKETS 10
PROPERTIES("replication_num" = "1");

After loading the data, filtering on the VARCHAR column in a WHERE clause returns an empty result set.

MySQL [example_db]> select * from table1;
+--------+----------+----------+------+------+
| siteid | citycode | username | pv   | uv   |
+--------+----------+----------+------+------+
|      1 |        1 | 'jim'    |    2 |    0 |
|      3 |        2 | 'tom'    |    2 |    0 |
|      4 |        3 | 'bush'   |    3 |    0 |
|      5 |        3 | 'helen'  |    3 |    0 |
|      2 |        1 | 'grace'  |    2 |    0 |
+--------+----------+----------+------+------+
5 rows in set (0.00 sec)

MySQL [example_db]> select * from table1 where username like 'j%';
Empty set (0.01 sec)

MySQL [example_db]> select * from table1 where username = 'jim';
Empty set (0.01 sec)
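Note that the displayed usernames include literal single quotes ('jim', not jim), so the quote characters were most likely loaded as part of the data. Under that assumption, matching the stored value with the quotes included should return rows (a sketch; FE_HOST and the 9030 port are placeholders):

# The stored value is 'jim' including the quotes, so put them inside the literal:
mysql -h FE_HOST -P 9030 -uroot example_db -e "select * from table1 where username = \"'jim'\";"
# The lasting fix is to re-load the data without the surrounding quotes.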

java.net.UnknownHostException when loading data from HDFS

10106 | table2_20170710 | CANCELLED | ETL:N/A; LOAD:N/A | N/A | cluster:N/A; timeout(s):3600; max_filter_ratio:0.1 | type:ETL_SUBMIT_FAIL; msg:create job request fail. Broker list path failed.path=hdfs://c3prc-hadoop:8020/user/h_miui_ad/bi/palo_test/table2_data,broker=TNetworkAddress(hostname:10.136.138.54, port:8000),msg=java.lang.IllegalArgumentException: java.net.UnknownHostException: c3prc-hadoop | 2017-09-13 15:26:09 | N/A | N/A | N/A | 2017-09-13 15:26:10 | N/A |
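c3prc-hadoop looks like an HDFS HA nameservice rather than a resolvable hostname, so the broker cannot resolve it without the HA client configuration. A hedged sketch of passing those properties with the load (the property names follow current Doris broker-load documentation; the broker name, namenode hosts, and the minimal column mapping are placeholders):

mysql -h FE_HOST -P 9030 -uroot -e "
LOAD LABEL example_db.table2_20170710
(DATA INFILE('hdfs://c3prc-hadoop/user/h_miui_ad/bi/palo_test/table2_data') INTO TABLE table2)
WITH BROKER 'broker_name' (
    'dfs.nameservices' = 'c3prc-hadoop',
    'dfs.ha.namenodes.c3prc-hadoop' = 'nn1,nn2',
    'dfs.namenode.rpc-address.c3prc-hadoop.nn1' = 'namenode1_host:8020',
    'dfs.namenode.rpc-address.c3prc-hadoop.nn2' = 'namenode2_host:8020',
    'dfs.client.failover.proxy.provider' = 'org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider'
);"
# Note: with a nameservice URI the port is omitted; the HA properties tell the client where the namenodes are.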

The hadoop referenced by the load job

Cannot run program "/home/worker/palo-0.8.0_20170822_centos7.1_gcc485/fe/lib/hadoop-client/hadoop/bin/hadoop": error=2, No such file or directory

I don't quite understand where this hadoop comes from. Or does it have to rely on my own Hadoop cluster? (A workaround sketch follows.)
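The Hadoop (DPP) load path shells out to a hadoop client expected under fe/lib/hadoop-client; it is not bundled, so it does come from your own Hadoop distribution. A hedged workaround, assuming an existing client at /usr/local/hadoop (a placeholder path):

# Point the expected location at wherever your hadoop client actually lives
mkdir -p /home/worker/palo-0.8.0_20170822_centos7.1_gcc485/fe/lib/hadoop-client
ln -s /usr/local/hadoop /home/worker/palo-0.8.0_20170822_centos7.1_gcc485/fe/lib/hadoop-client/hadoop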

sudo: updatedb: command not found

Section 1 (system dependencies) of the install doc says to run "updatedb" after installing cmake and the other dependency packages, but neither Ubuntu nor CentOS ships this command by default; the mlocate package must be installed first for it to work.
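For example (the package is named mlocate on both families):

# CentOS / RHEL
yum install -y mlocate && updatedb
# Ubuntu / Debian
sudo apt-get install -y mlocate && sudo updatedb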

"Apache" Impala vs. "Cloudera" Impala

Cloudera no longer owns the trademark for Impala; it is now an Apache Software Foundation project. I noticed it is still called "Cloudera Impala" in the README; it may appear in other documentation in this repo as well.

Error when creating a table: ERROR 1046 (3D000): Failed to create partition[table1]. Timeout

1) FE log errors:
2017-09-07 17:35:30,628 INFO 101 [Database.checkQuota():239] database[example_cluster:example_db] data quota: left bytes: 1024.000 GB / total: 1024.000 GB
2017-09-07 17:35:40,671 WARN 101 [Catalog.createPartitionWithIndices():2779] Failed to create partition[table1]. Timeout. unfinished marks: 10000=10017, 10000=10019, 10000=10005, 10000=10021, 10000=10007, 10000=10023, 10000=10009, 10000=10011, 10000=10013, 10000=10015

Suggest fixing the license statements

The license headers in many source files look like this:

// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. ... The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); ...

They seem to be borrowed from an ASF project. Since Palo has not (at least currently) been donated to the ASF, it may not be appropriate to include such statements.

Errors when querying columns, schemata, and tables in the information_schema database

1) Querying columns, schemata, or tables in the information_schema database fails:
ERROR 1064 (HY000): reopen rpc error, address=TNetworkAddress(
2) The BE's Alive status flips directly to false;
3) The FE log reports:
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_131]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_131]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_131]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_131]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_131]
at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_131]
at org.apache.thrift.transport.TSocket.open(TSocket.java:221) ~[libthrift-0.9.3.jar:0.9.3]
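The "Connection refused" plus the BE flipping to Alive=false point at the BE process itself being down (information_schema queries fan out RPCs to the BEs). A minimal triage sketch, with FE_HOST as a placeholder:

# On the BE machine: is the process still running? If not, check its startup log.
ps aux | grep palo_be
tail -n 100 be.out
# Restart the BE, then confirm Alive returns to true:
sh bin/start_be.sh --daemon
mysql -h FE_HOST -P 9030 -uroot -e "SHOW PROC '/backends';"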

Support S3 as the persistent storage

Some OLAP systems, such as Snowflake, use S3 directly as their table storage, even for temporary data. This saves money for cloud users and, additionally, saves time on data ETL. From this benchmark we can see that an S3-based OLAP system (Snowflake) shows no remarkable performance difference from a local-storage-based one (Redshift). There are also similar projects, such as rocksdb-cloud, which uses S3 as RocksDB's persistent storage; building on it could save some time in delivering such a feature.

Errors when building the BE on CentOS 6.x; any advice appreciated

openldap.c:(.text+0xc0f): undefined reference to `ldap_abandon_ext'
../../../../thirdparty/installed//lib/libcurl.a(libcurl_la-openldap.o): In function `ldap_do':
openldap.c:(.text+0xc6f): undefined reference to `ldap_url_parse'
openldap.c:(.text+0xd18): undefined reference to `ldap_search_ext'
openldap.c:(.text+0xd25): undefined reference to `ldap_free_urldesc'
openldap.c:(.text+0xd32): undefined reference to `ldap_err2string'
../../../../thirdparty/installed//lib/libcurl.a(libcurl_la-openldap.o): In function `ldap_setup_connection':
openldap.c:(.text+0xdeb): undefined reference to `ldap_url_parse'
openldap.c:(.text+0xe5a): undefined reference to `ldap_pvt_url_scheme2proto'
openldap.c:(.text+0xe67): undefined reference to `ldap_free_urldesc'
collect2: error: ld returned 1 exit status
make[2]: *** [src/service/palo_be] Error 1
make[1]: *** [src/service/CMakeFiles/palo_be.dir/all] Error 2
make: *** [all] Error 2
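These symbols come from OpenLDAP: the bundled libcurl was configured with LDAP support, but the final link line carries no -lldap/-llber. Two hedged options (the configure flags are stock curl flags; where exactly they go in build-thirdparty.sh is an assumption):

# Option 1: rebuild the thirdparty curl without LDAP (in its configure step)
./configure --prefix=$TP_DIR/installed --disable-ldap --disable-ldaps
# Option 2: keep LDAP support and add the libraries to the BE link flags,
# e.g. append -lldap -llber wherever libcurl.a is linked.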

create table error: failed to create partition. Timeout

When creating a table, the following error is reported: failed to create partition. Timeout.
1. CREATE TABLE statement:
CREATE TABLE table1
(
siteid INT DEFAULT '10',
citycode SMALLINT,
username VARCHAR(32) DEFAULT '',
pv BIGINT SUM DEFAULT '0'
)
AGGREGATE KEY(siteid, citycode, username)
DISTRIBUTED BY HASH(siteid) BUCKETS 10
PROPERTIES("replication_num" = "1");

2. Backends info:
MySQL [(none)]> SHOW PROC '/backends';
+-----------------+-----------+---------------+----------+---------------+--------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+
| Cluster         | BackendId | IP            | HostName | HeartbeatPort | BePort | HttpPort | LastStartTime       | LastHeartbeat       | Alive | SystemDecommissioned | ClusterDecommissioned | TabletNum |
+-----------------+-----------+---------------+----------+---------------+--------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+
|                 | 10376     | 192.168.5.100 | FineBI1  | 10050         | 9060   | 10040    | 2017-08-17 09:31:20 | 2017-08-17 09:36:15 | true  | false                | false                 | 0         |
| example_cluster | 10001     | 192.168.5.105 | FineBI2  | 9050          | 9060   | 8040     | 2017-08-16 11:50:53 | 2017-08-17 09:36:15 | true  | false                | false                 | 0         |
| example_cluster | 10002     | 192.168.5.106 | FineBI3  | 9050          | 9060   | 8040     | 2017-08-16 11:51:08 | 2017-08-17 09:36:15 | true  | false                | false                 | 0         |
+-----------------+-----------+---------------+----------+---------------+--------+----------+---------------------+---------------------+-------+----------------------+-----------------------+-----------+
3 rows in set (0.00 sec)

3. FE log:
2017-08-17 09:40:23,089 INFO 434 [StmtExecutor.analyze():306] the originStmt is =CREATE TABLE table1 ( siteid INT DEFAULT '10', citycode SMALLINT, username VARCHAR(32) DEFAULT '', pv BIGINT SUM DEFAULT '0' ) AGGREGATE KEY(siteid, citycode, username) DISTRIBUTED BY HASH(siteid) BUCKETS 10 PROPERTIES("replication_num" = "1")
2017-08-17 09:40:23,090 INFO 434 [Database.checkQuota():239] database[example_cluster:dbtest] data quota: left bytes: 1024.000 GB / total: 1024.000 GB
2017-08-17 09:40:23,121 WARN 207 [MasterImpl.finishTask():102] backend does not found. host: 127.0.0.1, be port: 9060. task: TFinishTaskRequest(backend:TBackend(host:127.0.0.1, be_port:9060, http_port:8040), task_type:CREATE, signature:10381, task_status:TStatus(status_code:OK, error_msgs:[]), report_version:15028554640082)
2017-08-17 09:40:23,123 WARN 119 [MasterImpl.finishTask():102] backend does not found. host: 127.0.0.1, be port: 9060. task: TFinishTaskRequest(backend:TBackend(host:127.0.0.1, be_port:9060, http_port:8040), task_type:CREATE, signature:10385, task_status:TStatus(status_code:OK, error_msgs:[]), report_version:15028554640083)
2017-08-17 09:40:23,125 WARN 207 [MasterImpl.finishTask():102] backend does not found. host: 127.0.0.1, be port: 9060. task: TFinishTaskRequest(backend:TBackend(host:127.0.0.1, be_port:9060, http_port:8040), task_type:CREATE, signature:10389, task_status:TStatus(status_code:OK, error_msgs:[]), report_version:15028554640084)
2017-08-17 09:40:23,127 WARN 119 [MasterImpl.finishTask():102] backend does not found. host: 127.0.0.1, be port: 9060. task: TFinishTaskRequest(backend:TBackend(host:127.0.0.1, be_port:9060, http_port:8040), task_type:CREATE, signature:10393, task_status:TStatus(status_code:OK, error_msgs:[]), report_version:15028554640085)
2017-08-17 09:40:23,129 WARN 207 [MasterImpl.finishTask():102] backend does not found. host: 127.0.0.1, be port: 9060. task: TFinishTaskRequest(backend:TBackend(host:127.0.0.1, be_port:9060, http_port:8040), task_type:CREATE, signature:10397, task_status:TStatus(status_code:OK, error_msgs:[]), report_version:15028554640086)
2017-08-17 09:40:33,094 WARN 434 [Catalog.createPartitionWithIndices():2786] Failed to create partition[table1]. Timeout. unfinished marks: 10002=10385, 10002=10393, 10002=10381, 10002=10389, 10002=10397
2017-08-17 09:40:34,498 INFO 61 [BDBJEJournal.getFinalizedJournalId():400] database names: 1
2017-08-17 09:40:34,498 INFO 61 [Checkpoint.runOneCycle():94] checkpoint imageVersion 0, checkPointVersion 0

4. BE log:
I0817 09:40:23.095836 25097 command_executor.cpp:249] begin to process create table. [tablet=10379, schema_hash=1421156361]
I0817 09:40:23.095849 25095 command_executor.cpp:249] begin to process create table. [tablet=10383, schema_hash=1421156361]
I0817 09:40:23.095865 25096 command_executor.cpp:249] begin to process create table. [tablet=10387, schema_hash=1421156361]
I0817 09:40:23.137383 25097 command_executor.cpp:333] finish to process create table. [res=0]
I0817 09:40:23.137900 25097 task_worker_pool.cpp:276] finish task success.result: 0
I0817 09:40:23.137923 25097 task_worker_pool.cpp:239] type: 0, signature: 10379 has been erased. queue size: 4
I0817 09:40:23.137938 25097 command_executor.cpp:249] begin to process create table. [tablet=10391, schema_hash=1421156361]
I0817 09:40:23.139243 25095 command_executor.cpp:333] finish to process create table. [res=0]
I0817 09:40:23.140229 25095 task_worker_pool.cpp:276] finish task success.result: 0
I0817 09:40:23.140252 25095 task_worker_pool.cpp:239] type: 0, signature: 10383 has been erased. queue size: 3
I0817 09:40:23.140270 25095 command_executor.cpp:249] begin to process create table. [tablet=10395, schema_hash=1421156361]
I0817 09:40:23.141095 25096 command_executor.cpp:333] finish to process create table. [res=0]
I0817 09:40:23.141538 25096 task_worker_pool.cpp:276] finish task success.result: 0

1. In the trash folder there are .dat files containing the table schema, which matches the logs: the tablets were created successfully on the BE side.
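The FE warnings ("backend does not found. host: 127.0.0.1") are the actual clue: the tablets are created, but each BE reports task completion as 127.0.0.1, which matches no registered backend, so the FE waits until the timeout. A hedged fix sketch, assuming a loopback mapping in /etc/hosts is the culprit (addresses are taken from the SHOW PROC output above):

# On each BE host, make sure its hostname resolves to the LAN IP, not 127.0.0.1:
getent hosts FineBI1        # should print 192.168.5.100, not 127.0.0.1
# Fix /etc/hosts if needed, e.g.:
#   192.168.5.100  FineBI1
# Then restart the BE so it reports with the correct address:
sh bin/start_be.sh --daemon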

Loading HDFS data into Palo

Hi everyone, I ran into a couple of problems loading HDFS files into Palo:
1. It asks for bin/hadoop. Can this be configured somewhere? I worked around it with ln -s.
2. 2017-09-28 18:23:41,627 INFO 2533 [DppScheduler.submitEtlJob():201] Error: Could not find or load main class bistreaming

Also, is there a technical discussion group? Thanks.
