talent-plan / tinysql

A course to build the SQL layer of a distributed database.

License: Apache License 2.0

Dockerfile 0.02% Makefile 0.19% Shell 0.09% Go 97.36% Assembly 0.01% Yacc 2.33%

tinysql's People

xxsgo thincher log-e-e chiaoteo yangsongbao dasinlsb xingyingone wishyong digitsisyph xiaodong-ji wfnuser lindasummer kikkon connor1996 jianghang viking714 brucechin kaikanwu yisaer chihiro2014 kuguobing leexiaofeng honor100 gmshepard raygift mmyj arthuryangcs hawkingrei ekalinin benoxo andrewmatilde mkxxq jmpotato vodkav jenenliu pansonpanson hailanwhu b41sh tallguys impluse007 jychen7 zct huashen 1399689727 kathy-baixue win-man whatot tciooc sdjdd doytsujin zhoush41 ekexium sa3036 reminiscent senzhangai caizt16 xhcom-ui yutiansut gogoyao jackwener chyanmio buptjimmyyang monsooooon wanglei1q84 fallensouls yiminmin aierui hooper9973 paranoiaupc kirisky fzhedu zyctree xffxff sgolecha hu00yan keleqnma tszkitlo40 sylzd huahang junlong-gao le0po1d leiwingqueen xz1220 sutrahsing antiknot kstrwind meilin96 navono quuuudp ncuwaln ben2077 wheeeeeeeeels shizy818 clark1013 egnchen onesizefitsquorum pingworld whichxjy divanodestiny sunnerrr

tinysql's Issues

Tracking issue for improving course documentation and code comments

This issue aims to address the lack of course documentation and code comments for TinySQL. We will start from this step to improve the user experience.

The documentation layout

## Project
### Introduction

### Topic
#### Knowledge topic
#### Related code

### Exercises
### References

Chinese version

## Project
### 简介

### XX主题
#### 相关知识
#### 相关代码

### 练习
### 引用

The code comments layout

We should use /* Project 1: Your Code Here */ to mark where the user needs to fill in code.

We should also add detailed comments to help users, such as:

func MyFunc(a int, b int) (int, error) {
	/* What does `MyFunc` need to do?
	 * Explain parameters:
	 *     a int: ....;
	 *     b int: ....;
	 * Explain return values:
	 *     int: ....;
	 *     error: ....;
	 * `MyFunc` may need to follow these steps:
	 *     1. xxxx;
	 *     2. xxxx;
	 *     .....
	 * Some hints that might be useful:
	 *     - you may need some structs
	 *     - you may need to add some members for struct A
	 *     - you need to implement 'FuncA' first
	 *     - be aware of concurrency issues
	 *     .....
	 */
	/* Project 1: Your Code Here */
	return 0, nil // placeholder so the template compiles
}

TODO

Possibly mismatched zipfFactor and avgError in statistics/cmsketch_test.go

Hi!

While struggling with TestCMSketch, I noticed that values generated with a lower zipfFactor are more dispersed (and thus lead to more collisions), yet the test pairs them with a lower expected avgErr. The dispersion follows from the definition of the Zipf distribution.

I have passed case 2 (zipfFactor = 2) and case 3 (zipfFactor = 3), and the avgErr is 0 in both. I am sure that the hashing results for different rows are independent, and I have tried many groups of hashers. However, all of these groups failed case 1 (zipfFactor = 1.1) with an avgErr of 10, which really upsets me :)

The numbers of distinct values generated in terms of different zipfFactor are as follows:

zipfFactor: 1.1  len(lMap): 23965
zipfFactor: 2  len(lMap): 420
zipfFactor: 3  len(lMap): 58

It seems that a lower zipfFactor should be coupled with a higher avgErr, and the "incorrect" avgErr of 10 I got in case 1 looks reasonable given width = 2048. Is that wrong? I really need some advice.

Thanks!
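
The relationship between the Zipf exponent and the number of distinct values can be reproduced with a small sketch. It uses `rand.NewZipf` from the Go standard library; the seed, the value range `1<<20`, the sample count, and the helper name `distinctZipf` are assumptions chosen for illustration and are not the parameters used in cmsketch_test.go.

```go
package main

import (
	"fmt"
	"math/rand"
)

// distinctZipf draws n samples from a Zipf distribution with exponent s
// over [0, imax] and returns how many distinct values appeared.
func distinctZipf(seed int64, s float64, imax uint64, n int) int {
	r := rand.New(rand.NewSource(seed))
	z := rand.NewZipf(r, s, 1, imax) // v = 1 is the smallest value NewZipf accepts
	seen := make(map[uint64]struct{})
	for i := 0; i < n; i++ {
		seen[z.Uint64()] = struct{}{}
	}
	return len(seen)
}

func main() {
	// A smaller exponent spreads the samples over many more distinct values.
	for _, s := range []float64{1.1, 2, 3} {
		fmt.Printf("s=%.1f distinct=%d\n", s, distinctZipf(1, s, 1<<20, 100000))
	}
}
```

With a fixed sketch width, more distinct values means more values share each counter, which is consistent with the observation that the smallest zipfFactor is the hardest case.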

Add lab 1 guide

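Proj4 part1 is missing test cases

Test cases are missing: simply adding la.baseLogicalPlan.PredicatePushDown(predicates) to the code below is enough to pass TestPredicatePushDown. Was this left out on purpose so that we add tests ourselves, or was it forgotten?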

Fixes GitHub workflow

The current GitHub workflow is broken, and we can hardly fix it while @rebelice is preparing Talent Plan 3.0.

We will first remove the workflow, as it is totally broken; this issue tracks adding back a proper GitHub workflow.

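"github.com/pingcap/tidb/parser/mysql" does not exist

"github.com/pingcap/tidb/parser/mysql" appears in both proj1 and proj2, but there is no parser directory under tidb.

Broken links to the relational algebra documents in the course materials

While studying the tinysql course, I found that the links here in the course materials are broken. Are there backup URLs for these two documents?

Relational algebra

SQL Grammar & Relation Algebra

Course materials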

Translation of project README to English

Please translate the project READMEs into English.

workflow run failed in project1 part2

Environment

macos
go 1.14.0

Steps to reproduce this issue

Pass the go test for tablecodec
Run gofmt on tablecodec
Push to the master branch

Expected Result

workflow run successful

Actual Result

Add README for tinysql 3.0

Project 5 TestAggPushDown times out and proj5-part3-README-zh_CN.md images are missing

Executing "alter table t add index idx(a, b, c)" times out at executor/aggregate_test.go:56 (in TestAggPushDown). The test passes after deleting this line. I didn't change any code related to DDL for indexes.

Another issue is that the images are missing:

add issue and PR template

take https://github.com/pingcap/talent-plan/pull/329/files as an example.

Fix `lock_test.go/TestLockTTL`

In lock_test.go, the lock TTL settings are all scaled down to speed up tests:

func init() {
    // Speed up tests.
    defaultLockTTL = 3
    maxLockTTL = 120
    ttlFactor = 6
    oracleUpdateInterval = 2
}

However, the remaining code still works in milliseconds when computing elapsed time (e.g. 2pc.go/txnLockTTL), which adds a relatively large bias to the basic formula ttl = ttlFactor * sqrt(sizeInMiB).

In addition, ttlEquals ignores the case x < y, where float(x - y) gives a wrong answer because x and y are of an unsigned integer type.
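
The unsigned wrap-around can be shown in isolation. This is a minimal sketch, assuming `uint64` operands; `naiveDiff` and `absDiff` are hypothetical helpers and not the actual ttlEquals code.

```go
package main

import (
	"fmt"
	"math"
)

// naiveDiff mirrors the pattern described above: it subtracts two unsigned
// values before converting. When x < y the subtraction wraps around.
func naiveDiff(x, y uint64) float64 {
	return math.Abs(float64(x - y))
}

// absDiff orders the operands first, so the subtraction never wraps.
func absDiff(x, y uint64) uint64 {
	if x > y {
		return x - y
	}
	return y - x
}

func main() {
	fmt.Println(naiveDiff(3, 5)) // a value close to 2^64, not 2
	fmt.Println(absDiff(3, 5))   // 2
}
```

Comparing `absDiff(x, y)` against the tolerance works for both orderings of x and y.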

Add tests for MySQL protocol layer

GitHub Classroom assignment link is broken

I clicked the assignment-invitations URL in tutorial.md, but something went wrong.

Add code for lab 2

Wrong links in proj3.md

The links in proj3-README-zh_CN.md lead to 404 pages; please update them.
#36 LGTM, please check.

Question about the language of the PingCAP blog posts linked from the material list

I noticed that the reading material in https://github.com/tidb-incubator/tinysql/blob/course/courses/material.md includes links to https://pingcap.com/zh/blog/ , for example https://pingcap.com/zh/blog/tidb-source-code-reading-5 . I was wondering whether the same blog posts are available in English too. For now I'm planning to use Google Translate.

There may be a bug in checkColumnAttributes

https://github.com/pingcap-incubator/tinysql/blob/17211e65f907ffb7deb437a722485440682845d6/ddl/ddl_api.go#L647-L660

The code here is a little strange: why do mysql.TypeDatetime, mysql.TypeDuration, and mysql.TypeTimestamp need a precision check?

Perhaps this check was intended for case mysql.TypeNewDecimal, mysql.TypeDouble, mysql.TypeFloat.

proj4: Cannot find the file to edit

I cannot find the code file listed in https://github.com/pingcap-incubator/tinysql/blob/course/courses/proj4-part1-README-zh_CN.md.

Could you specify the folder more clearly?

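Could you build an in-memory version of MySQL?

Use cases

It could be used for unit tests, similar to miniredis
Business development could proceed without installing MySQL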

Improve course documentation and code comments for project 3

Add lab 2 guide

refine project 2

Add MySQL protocol layer code

In TinySQL, we hope everyone can focus on the design and implementation of the SQL layer. The content of the MySQL protocol layer is complex and messy, and it is not our focus. So we will provide a simple implementation at the beginning.

remove this redundant func

https://github.com/tidb-incubator/tinysql/blob/1bdb9a3bc6ba158424a7da352d14d2b72d39ebaf/tablecodec/tablecodec.go#L96

Autograding reports GOPATH not set

It works fine when I run make test-proj1 locally, but when I submit the code, the autograding workflow reports the error below:

Makefile:7: *** Please set the environment variable GOPATH before running `make`.  Stop.
  
❌ proj1
::error::Error: Exit with code: 2 and signal: null

Am I missing some configs? How can I make the autograding work?

Typo in proj1-part1-README-zh_CN.md

proj1-part1-README-zh_CN.md
Original sentence:

还可以制定指输出需要的列，例如：
"指" -> "只"

Add code for lab 1

parser/model/ddl.go: Job.String may cause nil error which stops debugger

In parser/model/ddl.go, line 273:

// String implements fmt.Stringer interface.
func (job *Job) String() string {
	rowCount := job.GetRowCount()
	return fmt.Sprintf("ID:%d, Type:%s, State:%s, SchemaState:%s, SchemaID:%d, TableID:%d, RowCount:%d, ArgLen:%d, start time: %v, Err:%v, ErrCount:%d, SnapshotVersion:%v",
		job.ID, job.Type, job.State, job.SchemaState, job.SchemaID, job.TableID, rowCount, len(job.Args), TSConvert2Time(job.StartTS), job.Error, job.ErrorCount, job.SnapshotVer)
}

job.Error may be nil, which causes a nil-pointer error when the fmt package calls Error() (defined in parse/terror.go, line 217) and stops the debugger.
So I think job.String should check whether job.Error is nil, or Error.Error should check whether e is nil.
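
The second option, a nil check inside the Error method, could look roughly like the sketch below. The types `jobError` and `job` are hypothetical, trimmed-down stand-ins and not the real definitions in parser/model or terror.go; the sketch assumes Error is declared on a pointer receiver.

```go
package main

import "fmt"

// jobError is a hypothetical stand-in for the error type stored in job.Error.
type jobError struct {
	msg string
}

// Error tolerates a nil receiver, so formatting a job whose error is unset
// never dereferences a nil pointer.
func (e *jobError) Error() string {
	if e == nil {
		return "<nil>"
	}
	return e.msg
}

// job is a hypothetical, trimmed-down version of the Job struct.
type job struct {
	ID  int64
	Err *jobError
}

// String implements fmt.Stringer; %v ends up calling Err.Error().
func (j *job) String() string {
	return fmt.Sprintf("ID:%d, Err:%v", j.ID, j.Err)
}

func main() {
	fmt.Println((&job{ID: 1}).String())
	fmt.Println((&job{ID: 2, Err: &jobError{msg: "boom"}}).String())
}
```

Guarding inside Error covers every caller at once, whereas a check in String only protects that one call site.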

Why does the vectorization code need to test Join-related functionality?

Simplify the MySQL protocol layer

The current MySQL protocol layer is a simplified version of the TiDB implementation: I just deleted the unnecessary parts. It works, but I think it might not be good enough.

Optimizations:

Simplify the config module and delete unnecessary configurations
Refactor the code of the MySQL protocol layer to make it more readable

tablecodec.go:EncodeIndexSeekKey change the initial capacity of key

https://github.com/pingcap-incubator/tinysql/blob/bf13c144faf71010b2d77f754d6c81463b1ac6e8/tablecodec/tablecodec.go#L88

Why not take the length of idxID into account when allocating the capacity of key in the EncodeIndexSeekKey function, like the following:

key := make([]byte, 0, prefixLen+idLen+len(encodedValue))
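
The effect of an undersized capacity can be observed directly. In this sketch the values of `prefixLen` and `idLen` are assumptions for illustration (the real constants are defined in tablecodec), and `appendKey` is a hypothetical helper rather than the actual EncodeIndexSeekKey.

```go
package main

import "fmt"

// Hypothetical sizes for illustration only; the real constants live in tablecodec.
const (
	prefixLen = 11 // assumed length of the key prefix
	idLen     = 8  // assumed length of the encoded index ID
)

// appendKey builds a key in a buffer with the given initial capacity and
// reports whether append had to reallocate along the way.
func appendKey(capacity int, prefix, id, value []byte) (key []byte, grew bool) {
	key = make([]byte, 0, capacity)
	key = append(key, prefix...)
	key = append(key, id...)
	key = append(key, value...)
	return key, cap(key) != capacity
}

func main() {
	prefix := make([]byte, prefixLen)
	id := make([]byte, idLen)
	value := []byte("hello")

	_, grew := appendKey(prefixLen+len(value), prefix, id, value)
	fmt.Println("without idLen, reallocated:", grew)

	_, grew = appendKey(prefixLen+idLen+len(value), prefix, id, value)
	fmt.Println("with idLen, reallocated:", grew)
}
```

Either capacity produces a correct key; the difference is only the extra allocation and copy when the buffer is too small.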

Typo in proj1-2 code comments

In tableCodec.go, line 146, the comment suggests using errInvalidRecordKey.GenWithStack, although the function decodes an index key prefix. It might be better to change this to errInvalidIndexKey.GenWithStack.

func DecodeIndexKeyPrefix(key kv.Key) (tableID int64, indexID int64, indexValues []byte, err error) {
	...
	 *   3. errInvalidRecordKey.GenWithStack is a useful function to generate invalid record key errors.
	 ...
}

func DecodeIndexKeyPrefix(key kv.Key) (tableID int64, indexID int64, indexValues []byte, err error) {
	...
	 *   3. errInvalidIndexKey.GenWithStack is a useful function to generate invalid record key errors.
	 ...
}

Incomplete comment in `store/tikv/lock_resolver.go`

Comment in lock_resolver.go#L334 is broken.

ddl: Project 3 Go test fail

What did I do

Ran go test in the ddl folder

Expect

Assertion failure or success

See Instead

# github.com/pingcap/tidb/planner/core [github.com/pingcap/tidb/ddl.test]
../planner/core/rule_join_reorder_dp.go:17:2: imported and not used: "math/bits"
FAIL    github.com/pingcap/tidb/ddl [build failed]

The GitHub classroom invitation link is invalid

The link in the classroom.md file is invalid.

The lack of master branch leads to inconsistencies in tutorial.md

The default branch is now "course", and the "master" branch is missing.
Projects created from the assignment template (作业模版) are also missing master, so local work following https://github.com/pingcap-incubator/tinysql/blob/course/tutorial.md does not work well. Maybe the description in tutorial.md should be edited?

proj2: parser_test passes without any changes

When I start proj2, parser_test passes right away without any changes.

Makefile error causes the targets test-proj4-1 and test-proj4-2 to fail

roger@192 tinysql-course % make test-proj4-1
pwd
/Users/roger/Downloads/pichunying/tinysql-course
cd planner/core &&
go test -check.f TestPredicatePushDown && \

/bin/sh: -c: line 1: syntax error: unexpected end of file
make: *** [test-proj4-1] Error 2

A line in Proj6 may corrupt Proj3 randomly

Env setup
Go: 1.16
Current progress: proj3 in progress, proj1 & 2 finished

When I test my code as described in the last section of this page:
https://github.com/tidb-incubator/tinysql/blob/course/courses/proj3-README-zh_CN.md

I hit this line sometimes:
https://github.com/tidb-incubator/tinysql/blob/1bdb9a3bc6ba158424a7da352d14d2b72d39ebaf/store/tikv/snapshot.go#L150

After I commented it out, those tests passed.

Can someone please check whether that's a real issue or a bug introduced by my change?

refine project 1

Typo in material.md

关系代数 is "relational algebra", not "relation algebra"

https://github.com/tidb-incubator/tinysql/blob/course/courses/material.md

Tracking issue for improving TinySQL as a learning-friendly mini distributed relational database

It corresponds to the effort towards Talent Plan v3.0.

According to user feedback and my investigation, TinySQL has serious issues that keep it from being a learning-friendly mini distributed relational database:

Not mini. TinySQL has more than 100,000 lines of code. It is almost a copy of TiDB with part of the code deleted, and it contains a lot of irrelevant code and design.
Unfriendly documentation. It only briefly explains the relevant knowledge topics and does not explain the project structure.
Poor course design. The topics explained in each lab are very large, but the content that needs to be implemented is only a small part.
Poor comments. They do not help in understanding the code.

To solve the above problems, I will redesign and reimplement TinySQL. The main planned improvements are as follows:

Redesign the course.
- Divide TinySQL into five stages. Each stage has a clear target and an iconic function, and each stage builds on and extends the previous one.
- In the first stage, we want TinySQL to be simple enough. As more stages are completed, we will add the necessary functions to make TinySQL a true distributed relational database.
- The specific stage division is at the end of this issue.
Adopt an incremental framework mode.
- Initially, the course framework has no content, and every stage/substage introduces only the framework code required for that stage/substage. The purpose is to keep the framework code concise and to show clearly what each stage/substage introduces.

Optimize documentation and comments

The documentation layout

## Stage

### Introduction
#### Objectives
#### Materials

### Topic 1
#### Knowledge topic
#### Related code

### Exercises
### References

The following is the stage design:

Stage 1: read-only relational database
- Target: the ability to read data using KV engine API
- Iconic function: the ability to handle simple SELECT statements
- Knowledge topic:
  - parser
  - data mapping from the relational model to KV
  - generating operator
Stage 2: insert and update
- Target: the ability to write data using KV engine API
- Iconic function: the ability to handle simple INSERT/UPDATE statements
- Knowledge topic:
  - volcano model
Stage 3: DDL
- Target: the ability to process DDL online
- Iconic function: the ability to process CREATE/DROP TABLE/INDEX online
- Knowledge topic:
  - online DDL algorithm
Stage 4: Optimizer
- Target: implement an optimizer and be able to choose the appropriate index and Join Order
- Iconic function:
  - ability to collect statistics
  - ability to choose the appropriate index and Join Order
- Knowledge topic:
  - SQL optimization
  - statistics
  - SystemR optimizer
Stage 5: Calculation optimization
- Target: optimize the calculation framework to improve performance
- Iconic function:
  - vectorization
  - Massively Parallel Processing(MPP)
- Knowledge topic:
  - vectorization
  - MPP

Issues

Confusing test case in `2pc_test.go/TestPrewriteRollback`

This test seems to validate the new value of key b while only key a is committed (see 2pc_test.go#L165). Maybe some comments could be added to explain this counterintuitive design.
Maintaining such a complicated learner-oriented project is hard. I hope I can help, although I am still struggling with the code.
I really appreciate your work :)

refine project 4

go build failure in project 2 test file

Environment Details:
System:  Manjaro Linux x86_64
Kernel: 5.10.7-1-MANJARO
Go Version: 1.15.6

I found a bug in tinysql/parser/lexer_test.go:
running go test in the tinysql/parser directory fails to build with

# github.com/pingcap/tidb/parser
./lexer_test.go:199:12: conversion from untyped int to string yields a string of one rune, not a string of digits (did you mean fmt.Sprint(x)?)
FAIL    github.com/pingcap/tidb/parser [build failed]

Changing line 199 as follows solves the problem:

// {`'\Z'`, string(26)},
{`'\Z'`, string(rune(26))},

This error may be caused by a newer Go version.
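
The difference between the two conversions the compiler message refers to can be seen in a few lines. `ctrlChar` and `digits` are hypothetical helper names used only for this sketch.

```go
package main

import (
	"fmt"
	"strconv"
)

// ctrlChar converts an integer code point to the one-character string it
// names, which is what the lexer test expects for '\Z' (code point 26).
func ctrlChar(n int) string {
	return string(rune(n))
}

// digits converts an integer to its decimal text, which is what the
// diagnostic's suggested fmt.Sprint(x) would also produce.
func digits(n int) string {
	return strconv.Itoa(n)
}

func main() {
	fmt.Printf("%q %q\n", ctrlChar(26), digits(26)) // "\x1a" "26"
}
```

For this test the explicit `rune` conversion is the right fix, since the expected value is the single control character and not the text "26".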

go test error on parser

Environment

os windows 10
go 1.15.7

Steps to reproduce this issue

1. cd parser
2. go test .

Expected Result

test build successful

Actual Result

# github.com/pingcap/tidb/parser
.\lexer_test.go:199:12: conversion from untyped int to string yields a string of one rune, not a string of digits (did you mean fmt.Sprint(x)?)
FAIL    github.com/pingcap/tidb/parser [build failed]
FAIL

parser, tests: `TestscanString` type conversion failed

Description

If you are working on project 2 (implement and test JoinTable for the SQL parser), you may get the error below:

./lexer_test.go:199:12: conversion from untyped int to string yields a string of one rune, not a string of digits (did you mean fmt.Sprint(x)?)

The code block looks like this:

191   func (s *testLexerSuite) TestscanString(c *C) {
192	   table := []struct {
193	   	   raw    string
194		   expect string
195	   }{
196		   {`' \n\tTest String'`, " \n\tTest String"},
197		   {`'\x\B'`, "xB"},
198		   {`'\0\'\"\b\n\r\t\\'`, "\000'\"\b\n\r\t\\"},
199		   {`'\Z'`, string(26)},// Error

It breaks the tests, but it is not caused by the user's code.

Advice

Use a hardcoded string.

Repeated make clean should not show an error

Running "make clean" repeatedly should not show an error.
