Comments (22)
Status: I am still researching the database theory for this.
This work is quite time-consuming, and I have little time for it since I am a senior high school student and am also limited by my own knowledge. There is a high probability that I will not be able to realize this idea myself.
from pttbbs.
I finally have time to start making some initial progress on the basic structure of the server.
I am quite confused about the database.
I found that most of the data is stored in a single leveldb database file (e.g. commentd, postd).
So apparently only one machine handles the database work.
Is that right?
from pttbbs.
I would like to propose refactoring PTT into a horizontally scalable architecture.
To my knowledge, PTT is currently efficient because it relies heavily on shm and mmap.
Given current technical advances, the following is what I would like to propose:
- use redis-cluster (distributed in-mem db) as a replacement for shm.
- use mongo-cluster (distributed mem / file db) as a replacement for mmap.
We may need to refactor some data structures as well
(e.g. separating the main content from the comments).
Comments on the proposal are welcome in this thread. Once we reach consensus on the proposal,
I can start the refactor full-time from Dec. this year (2017), and I expect to finish before Jun. 2018.
https://docs.google.com/presentation/d/1If_-ZmcDviIxLWq2USmdTtN8xsWLSlPNpUGMciZ_R-w/edit#slide=id.p
from pttbbs.
I'm experimenting with a FUSE-based solution. Many things still need to be sorted out before it can be tested with the current system.
from pttbbs.
You are welcome to try.
from pttbbs.
How about Rust? It is a new language proposed by a programmer at Mozilla. Some people are rewriting the Firefox engine in Rust. Here is the link: https://github.com/servo/servo
from pttbbs.
Rust is definitely a fairly good idea.
But it is quite difficult for me to understand its philosophy.
And there is already an implementation of the PTT web server in Go.
I think it is good to follow the previous practice.
from pttbbs.
OK, understood.
BTW, it is hard for me to believe that you are just a senior high school student. The way I see it, you are more like an experienced programmer 👍
from pttbbs.
I know this is quite a stupid question,
but I don't understand how PTT handles such a large volume of requests with one machine...
from pttbbs.
PTT does not use a database. Everything is stored in the file system as individual files.
Metadata is stored in .DIR files, which are just a bunch of structs. .BOARD and .PASSWD
store board and user metadata.
The leveldb databases are later add-ons and are not used directly by the core BBS system.
They may be used by additional services, such as web BBS.
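
To make the "bunch of structs" description concrete, here is a minimal Python sketch of reading fixed-size records from an index file. The record layout (`RECORD_FORMAT`, the field names and their sizes) and the helper names are invented for illustration and are not the real pttbbs struct; only the idea of back-to-back fixed-size records comes from the comment above.

```python
import struct
from typing import NamedTuple

# Hypothetical record layout, NOT the real pttbbs struct:
# 32-byte filename, 16-byte owner, 64-byte title, all NUL-padded.
RECORD_FORMAT = "<32s16s64s"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)  # 112 bytes


class Record(NamedTuple):
    filename: str
    owner: str
    title: str


def pack_record(rec: Record) -> bytes:
    """Serialize one record; struct pads short fields with NUL bytes."""
    return struct.pack(
        RECORD_FORMAT,
        rec.filename.encode("utf-8"),
        rec.owner.encode("utf-8"),
        rec.title.encode("utf-8"),
    )


def unpack_record(raw: bytes) -> Record:
    """Deserialize one record and strip the NUL padding."""
    fields = struct.unpack(RECORD_FORMAT, raw)
    return Record(*(f.split(b"\0", 1)[0].decode("utf-8") for f in fields))


def read_record(path: str, index: int) -> Record:
    """Read the index-th record by seeking straight to its offset."""
    with open(path, "rb") as fh:
        fh.seek(index * RECORD_SIZE)
        raw = fh.read(RECORD_SIZE)
    if len(raw) != RECORD_SIZE:
        raise IndexError(f"no record at index {index}")
    return unpack_record(raw)
```

Because every record has the same size, fetching entry N is a single seek plus one read, which is one reason a layout like this can work without a database engine.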
from pttbbs.
I see.
I actually want to build the program on a platform called Kubernetes.
http://kubernetes.io/
I think this would be a good choice, as the platform is mature enough and is used in production at Google.
Of course, I don't know whether this fits the existing PTT platform.
I think it would benefit PTT, as it increases scalability and redundancy.
But some data structures would have to be reworked, since this is a different system rather than a file-based one.
And the cost of the migration might reduce this benefit, or even outweigh it.
I am still investigating possible solutions and alternatives.
from pttbbs.
Kubernetes is very low in the application stack. PTT is a monolithic system.
Moving to Kubernetes, or any distributed system for that matter, requires a complete separation
and rewrite of data storage and user connection handling.
from pttbbs.
This is exactly what I want to do, though.
I know the data migration is not easy, especially for the aging PTT system.
I am still investigating a solution and trying to document PTT's workflow to understand the whole system.
So far this is just an idea, and a pretty naive one.
from pttbbs.
@LivingInPortal Love your idea. I wonder if there is any progress so far.
from pttbbs.
> I know this is quite a stupid question,
> but I don't understand how PTT handles such a large volume of requests with one machine...

FYI: http://www.kegel.com/c10k.html
PTT is built on libevent, but you could also consider more modern solutions such as libuv or libev.
from pttbbs.
> PTT is built on libevent

This is not true.
We did use libevent in some parts, but mbbsd, the main server, is a simple fork-based server.
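
For readers unfamiliar with the term, a "fork-based server" hands each accepted connection to its own child process. The following is a minimal Python sketch of that pattern, not mbbsd's actual code; `handle_client`, `serve_one`, `serve_forever` and the port number are hypothetical names and values chosen for illustration, and the handler simply echoes bytes back.

```python
import os
import socket


def handle_client(conn: socket.socket) -> None:
    """Hypothetical per-connection handler: echo bytes until the peer closes."""
    while True:
        data = conn.recv(4096)
        if not data:
            break
        conn.sendall(data)


def serve_one(listener: socket.socket) -> int:
    """Accept one connection, hand it to a forked child, return the child's pid."""
    conn, _addr = listener.accept()
    pid = os.fork()
    if pid == 0:
        # Child: it only needs the connected socket, not the listener.
        listener.close()
        try:
            handle_client(conn)
        finally:
            conn.close()
            os._exit(0)  # never fall back into the parent's code path
    # Parent: the child owns the connection now.
    conn.close()
    return pid


def serve_forever(host: str = "127.0.0.1", port: int = 2323) -> None:
    """Accept loop; the default port is an arbitrary example value."""
    with socket.create_server((host, port)) as listener:
        while True:
            serve_one(listener)
            # Reap any finished children so they do not linger as zombies.
            try:
                while os.waitpid(-1, os.WNOHANG)[0] != 0:
                    pass
            except ChildProcessError:
                pass
```

The design is simple because each connection gets an isolated process, at the cost of one process per connected user; this sketch relies on `os.fork`, so it runs on Unix-like systems only.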
from pttbbs.
Will the plan allow for an API? If we can make PTT's data accessible, then artificial intelligence applications can make use of it for analysis, similar to this or this
from pttbbs.
Access to data is also a policy issue, which is not covered by this technical project here.
from pttbbs.
Besides the policy issue, it also seems hard to implement a proper API on top of the current code structure. As I understand it, it would need a lot of structural changes.
from pttbbs.
@chhsiao1981
How's the progress going now?
I was too busy last year since I was dealing with personal issues, and I almost forgot it.
I think there are two main factors to consider before starting the refactor: frontend and backend.
First, I think we need to reconsider how our data is arranged and how to migrate from the current file-based data store.
The choice of database depends on everyone's consensus (as do lots of other things). But the most important question is how we migrate our data.
I wonder if we can create a FUSE-based virtual file system to fix this...
from pttbbs.
Hi @LivingInPortal,
I'm proposing to separate main content and comments (in #31).
I'm also trying to integrate a unit-test framework now (#34).
Currently I'm trying to set up unit tests based on Check:
https://libcheck.github.io/check/index.html
I think I'll finish the unit-test setup first, and then continue with separating main content / comments.
The storage mechanism is open for discussion once the main content / comments separation is done~
(I prefer a database that makes heavy use of memory.)
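
As a rough illustration of what separating main content from comments could mean at the data-structure level, here is a small Python sketch. It assumes that comments sit in the same text as the article body and that each comment line has the form `<marker> <author>: <text>`, with the markers 推 (push), 噓 (boo) and → (neutral); that line format, and the names `Article`, `Comment` and `split_article`, are assumptions made for this example and are not taken from #31.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed comment markers (push / boo / neutral arrow) as they might appear
# in rendered article text; the on-disk format may differ.
COMMENT_MARKERS = ("推", "噓", "→")
COMMENT_PREFIXES = tuple(m + " " for m in COMMENT_MARKERS)


@dataclass
class Comment:
    kind: str
    author: str
    text: str


@dataclass
class Article:
    content: str
    comments: List[Comment] = field(default_factory=list)


def parse_comment(line: str) -> Comment:
    """Parse a line of the assumed form '<marker> <author>: <text>'."""
    kind, rest = line.split(" ", 1)
    author, _, text = rest.partition(":")
    return Comment(kind=kind, author=author.strip(), text=text.strip())


def split_article(raw: str) -> Article:
    """Split one combined text blob into main content and a comment list."""
    body_lines: List[str] = []
    comments: List[Comment] = []
    for line in raw.splitlines():
        if line.startswith(COMMENT_PREFIXES):
            comments.append(parse_comment(line))
        else:
            body_lines.append(line)
    return Article(content="\n".join(body_lines), comments=comments)
```

Once the two parts are separate values, each can be stored and updated independently, which is what leaves the choice of storage mechanism open.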
from pttbbs.
I would like to close this. Please open a separate issue when there is a concrete design. Just naming language X, database Y, or framework Z isn't concrete enough.
from pttbbs.