Coder Social home page Coder Social logo

jansu-dev / tichange_for_lightning Goto Github PK

View Code? Open in Web Editor NEW
3.0 1.0 2.0 166 KB

TiChange 是一个能让你快速将csv文件适配 tidb-lightning csv 文件格式要求的工具,遵照 TiDB 官网给出对 CSV 文件的格式支持。

Shell 100.00%

tichange_for_lightning's Introduction

TiChange_for_lightning 是一个能让你快速将csv文件适配 tidb-lightning csv 文件格式要求的工具。遵照 TiDB 官网给出对 CSV 文件的格式支持,本脚本主要功能如下:

  1. 切分 csv 文件数与 CPU vCore 数相同;
  2. 适配 lightning 可识别的文件名;
  3. 批量替换当前字段分隔符(separator)为 lightning 默认字段分隔符(',');
  4. 批量替换当前引用定界符(delimiter)为 lightning 默认引用定界符('"');
  5. 批量替换当前 null 值字符为 lightning 默认字符('\N');

使用简介

[tidb@tidb-51-pd lightning]$ ./TiChange_for_lightning.sh 
Auther    : jan su
Introduce : TiChange_for_lightning 是一个能让你快速将csv文件适配 tidb-lightning csv 文件格式要求的工具,如有任何 BUG 请及时反馈,作者将及时修复!
 
Usage: TiChange_for_lightning.sh [option] [parameter]
option: -i --input-file          [input_csv_path]          |               | 需要处理的csv文件路径;
        -o --operate-path        [operate_dir_path]        |               | 需要处理csv文件的,空间足够的文件夹路径;
        -m --schema-meta         [schema_meta]             |               | 需要指定库中 csv 文件所属对象信息,eg: -m schema_name.table_name;
        -s --separator_import    [separator_import_format] |(default: ',' )| 需要指定当前 csv 文件字段分隔符,eg: -s '||' TiChange 自动将其转换为 "," : "A"||"B" --> "A","B" ;
        -d --delimiter_import    [delimiter_import_format] |(default: '"' )| 需要指定当前 csv 文件引用定界符,eg: -d  ''  TiChange 自动将其转换为 '"' :    ABC   -->  "ABC" ;
        -n --null_import         [null_import_format]      |(default: '\N')| 需要指定解析 csv 文件中字段值为 NULL 的字符, eg: '\N' 导入 TiDB 中会被解析为 NULL ;
        -h --help                                          |               | 获取关于 TiChange.sh 的操作指引,详细 Demo 请参考 : https://github.com/jansu-dev/TiChange_for_lightning ;

Demo-标准CSV

首先,使用 TiChange_for_lightning.sh 将一整块 csv 文件处理为多个 96MB csv,将 terminal 输出的 data-source-dir 路径填写至 tidb-lightning.toml。随后,便可按 TiDB 官网要求配置好后启动进程,导入数据。

[tidb@tidb-51-pd examples]$ mysql -uroot -P4000 -h192.168.169.61 -A jan

MySQL [jan]> create table TiChange_test(id int ,name varchar(20));

[tidb@tidb-51-pd eg_standerd]$ cat TiChange_test_standerd.csv 
"1","jan_standerd_csv"

[tidb@tidb-51-pd lightning]$ ./TiChange_for_lightning.sh \
      -i '/home/tidb/lightning/examples/eg_standerd/TiChange_test_standerd.csv' \
      -o '/home/tidb/lightning/examples/eg_standerd/test' \
      -m 'jan.TiChange_test'

Option i == /home/tidb/lightning/examples/eg_standerd/TiChange_test_standerd.csv
Option o == /home/tidb/lightning/examples/eg_standerd/test
Option s == jan.TiChange_test
---------------------------------------------------------------------------
------------  TiChange starting  ------------------------------------------
---------------------------------------------------------------------------
------------  using below information for tidb-lightning.toml  ------------
---------------------------------------------------------------------------
Please write the string path to tidb-lightning.toml config file!!!
and ,delete the dealed files by hand after imported data into database!!!


[mydumper]
data-source-dir = "/home/tidb/lightning/examples/eg_standerd/test/e78e341_operating_dir"
[mydumper]
no-schema = true
---------------------------------------------------------------------------


[tidb@tidb-51-pd lightning]$ cd examples/eg_nonstanderd/


[tidb@tidb-51-pd eg_standerd]$ tiup tidb-lightning --config ./tidb-lightning.toml
Found tidb-lightning newer version:
......
......
Verbose debug logs will be written to tidb-lightning.log
tidb lightning exit

[tidb@tidb-51-pd eg_standerd]$ mysql -uroot -P4000 -h192.168.169.61 -A jan

MySQL [jan]> select * from Tichange_test;
+------+------------------+
| id   | name             |
+------+------------------+
|    1 | jan_standerd_csv |
+------+------------------+

Demo-非标准CSV

[tidb@tidb-51-pd lightning]$ ./TiChange_for_lightning.sh  \
      -i '/home/tidb/lightning/examples/eg_nonstanderd/TiChange_test_nonstanderd.csv'   \
      -o '/home/tidb/lightning/examples/eg_nonstanderd/test'   \
      -m 'jan.TiChange_test'  \
      -s '||'   \
      -d ''
Option i == /home/tidb/lightning/examples/eg_nonstanderd/TiChange_test_nonstanderd.csv
Option o == /home/tidb/lightning/examples/eg_nonstanderd/test
Option s == jan.TiChange_test
Option s == ||
Option d == 
---------------------------------------------------------------------------
------------  TiChange starting  ------------------------------------------
---------------------------------------------------------------------------
------------  using below information for tidb-lightning.toml  ------------
---------------------------------------------------------------------------
Please write the string path to tidb-lightning.toml config file!!!
and ,delete the dealed files by hand after imported data into database!!!


[mydumper]
data-source-dir = "/home/tidb/lightning/examples/eg_nonstanderd/test/6edfc43_operating_dir"
[mydumper]
no-schema = true
---------------------------------------------------------------------------


[tidb@tidb-51-pd lightning]$ cd examples/eg_standerd/


[tidb@tidb-51-pd eg_nonstanderd]$ tiup tidb-lightning --config ./tidb-lightning.toml


MySQL [jan]> select * from TiChange_test;
+------+---------------------+
| id   | name                |
+------+---------------------+
|    1 | jan_standerd_csv    |
|    2 | jan_nonstanderd_csv |
+------+---------------------+

Demo-NULL值处理

[tidb@tidb-51-pd lightning]$ ./TiChange_for_lightning.sh  \
>       -i '/home/tidb/lightning/examples/eg_null/TiChange_test_null.csv'   \
>       -o '/home/tidb/lightning/examples/eg_null/test'   \
>       -m 'jan.TiChange_test'  \
>       -s '||'   \
>       -d ''     \
>       -n '""'
Option i == /home/tidb/lightning/examples/eg_null/TiChange_test_null.csv
Option o == /home/tidb/lightning/examples/eg_null/test
Option s == jan.TiChange_test
Option s == ||
Option d == 
Option n == ""
---------------------------------------------------------------------------
------------  TiChange starting  ------------------------------------------
---------------------------------------------------------------------------
------------  using below information for tidb-lightning.toml  ------------
---------------------------------------------------------------------------
Please write the string path to tidb-lightning.toml config file!!!
and ,delete the dealed files by hand after imported data into database!!!


[mydumper]
data-source-dir = "/home/tidb/lightning/examples/eg_null/test/e08a2a7_operating_dir"
[mydumper]
no-schema = true
---------------------------------------------------------------------------


[tidb@tidb-51-pd lightning]$ cd examples/eg_null/

[tidb@tidb-51-pd eg_null]$ tiup tidb-lightning --config ./tidb-lightning.toml

MySQL [jan]> select * from TiChange_test;
+------+---------------------+
| id   | name                |
+------+---------------------+
|    1 | jan_standerd_csv    |
|    2 | jan_nonstanderd_csv |
|    3 | NULL                |
+------+---------------------+

Demo-CSV字段包含\n字符

对于 csv 文件中含有 "\n" 值导入 TiDB 的显出非预期的行为不是本脚本的问题,是 TiDB 自身行为实现所致,望关注者周知!!!

[tidb@tidb-51-pd lightning]$ ./TiChange_for_lightning.sh  \
>        -i '/home/tidb/lightning/examples/eg_fields_include_null/TiChange_test_fields_include_null.csv'   \
>        -o '/home/tidb/lightning/examples/eg_fields_include_null/test'   \
>        -m 'jan.TiChange_test' 
Option i == /home/tidb/lightning/examples/eg_fields_include_null/TiChange_test_fields_include_null.csv
Option o == /home/tidb/lightning/examples/eg_fields_include_null/test
Option s == jan.TiChange_test
---------------------------------------------------------------------------
------------  TiChange starting  ------------------------------------------
---------------------------------------------------------------------------
------------  using below information for tidb-lightning.toml  ------------
---------------------------------------------------------------------------
Please write the string path to tidb-lightning.toml config file!!!
and ,delete the dealed files by hand after imported data into database!!!


[mydumper]
data-source-dir = "/home/tidb/lightning/examples/eg_fields_include_null/test/4f9407f_operating_dir"
[mydumper]
no-schema = true
---------------------------------------------------------------------------


[tidb@tidb-51-pd lightning]$ cd examples/eg_fields_include_null/


[tidb@tidb-51-pd eg_fields_include_null]$ tiup tidb-lightning --config ./tidb-lightning.toml


MySQL [jan]> select * from TiChange_test;
+------+---------------------+
| id   | name                |
+------+---------------------+
|    1 | jan_standerd_csv    |
|    2 | jan_nonstanderd_csv |
|    3 | NULL                |
|    4 | 123123
sdffsdfdf    |
+------+---------------------+

MySQL [jan]> select * from TiChange_test where id =4;
+------+------------------+
| id   | name             |
+------+------------------+
|    4 | 123123
sdffsdfdf |
+------+------------------+

tichange_for_lightning's People

Contributors

jansu-dev avatar hey-hoho avatar

Stargazers

 avatar qizheng avatar dongxu avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.