An attempt to rewrite Ptt (pttbbs, closed)

ptt commented on August 11, 2024
An attempt to rewrite Ptt

from pttbbs.

Comments (22)

LivingInPortal commented on August 11, 2024

Status: I am still researching the database theory for this.
This work is quite time-consuming, and I have little time for it since I am a senior high school student and am also limited by my knowledge. There is a high probability that I will not be able to realize this idea myself.

LivingInPortal commented on August 11, 2024

I finally have time to make some initial progress on the basic structure of the server.
I am quite confused about the database.
I found that most of the data is stored in a single leveldb database file (as in commentd and postd).
So apparently there is only one machine handling the database work.
Is that right?

chhsiao1981 commented on August 11, 2024

I would like to propose refactoring ptt into a horizontally scalable architecture.

To my knowledge, ptt currently achieves its efficiency by relying heavily on shm and mmap.
Given current technical advances, I would like to propose the following:

  1. use redis-cluster (distributed in-memory db) as the replacement for shm.
  2. use mongo-cluster (distributed memory / file db) as the replacement for mmap.

We may need to refactor some data structures as well.
(e.g. separating the main content from the comments)
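
To make the idea of separating main content and comments concrete, here is a minimal sketch in C. It assumes, purely for illustration, that an article is one text blob in which comment lines start with the hypothetical marker `> `. The marker, `COMMENT_PREFIX`, and `split_article` are invented names for this example and are not taken from pttbbs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical marker for a comment line; not the real on-disk format. */
#define COMMENT_PREFIX "> "

/*
 * Scan an article and report where the main content ends.
 * *body_len receives the byte offset of the first comment line
 * (or the full length if there are no comments).
 * Returns the number of comment lines found.
 */
size_t split_article(const char *text, size_t *body_len)
{
    assert(text != NULL && body_len != NULL);

    const size_t prefix_len = strlen(COMMENT_PREFIX);
    size_t comments = 0;
    const char *line = text;

    *body_len = strlen(text);
    while (*line != '\0') {
        if (strncmp(line, COMMENT_PREFIX, prefix_len) == 0) {
            if (comments == 0)
                *body_len = (size_t)(line - text);
            comments++;
        }
        /* Advance to the start of the next line, if any. */
        const char *newline = strchr(line, '\n');
        if (newline == NULL)
            break;
        line = newline + 1;
    }
    return comments;
}
```

Once the boundary is known, the body and the comments could be written to separate stores, whichever storage backend ends up being chosen.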

Comments on the proposal are welcome in this thread. Once we reach consensus on the proposal,
I can start the refactoring full-time in Dec. this year (2017), and I expect to finish before Jun. 2018.

https://docs.google.com/presentation/d/1If_-ZmcDviIxLWq2USmdTtN8xsWLSlPNpUGMciZ_R-w/edit#slide=id.p

robertabcd commented on August 11, 2024

I'm experimenting with a FUSE-based solution. Many things still need to be sorted out before it can be tested with the current system.

wens commented on August 11, 2024

You are welcome to try.

SLMT commented on August 11, 2024

How about Rust? It is a new language proposed by a programmer at Mozilla. Some people are rewriting the Firefox engine in Rust. Here is the link: https://github.com/servo/servo

LivingInPortal commented on August 11, 2024

Rust is definitely a fairly good idea.
But it is quite difficult for me to understand its philosophy.
And there is already an implementation of the PTT web server in Go.
I think it is good to follow that existing practice.

SLMT commented on August 11, 2024

OK, I understand.
BTW, it is hard for me to believe that you are just a senior high school student. The way I see it, you are more like an experienced programmer 👍

LivingInPortal commented on August 11, 2024

I know this is quite a stupid question,
but I don't understand how Ptt handles such a large volume of requests with one machine...

wens commented on August 11, 2024

PTT does not use a database. Everything is stored in the file system as individual files.
Metadata is stored in .DIR files, which are just a bunch of structs. .BOARD and .PASSWD
store board and user metadata.

The leveldb databases are later add-ons, and not used directly by the core BBS system.
They may be used by additional services, such as web BBS.
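
As a rough illustration of what "a file that is just a bunch of structs" means, here is a minimal sketch in C. The record layout and the names `dir_record_t`, `dir_append`, `dir_read`, and `dir_count` are hypothetical and chosen for this example; the real pttbbs structs have different fields. The point is only that fixed-size records make the N-th entry reachable with a single seek.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size record; not the actual pttbbs layout. */
typedef struct {
    char filename[32];
    char owner[16];
    char title[64];
} dir_record_t;

/* Append one record to the end of the index file. Returns 0 on success. */
int dir_append(const char *path, const dir_record_t *rec)
{
    assert(path != NULL && rec != NULL);
    FILE *fp = fopen(path, "ab");
    if (fp == NULL)
        return -1;
    size_t written = fwrite(rec, sizeof *rec, 1, fp);
    return (fclose(fp) == 0 && written == 1) ? 0 : -1;
}

/* Read the record at zero-based position idx. Returns 0 on success. */
int dir_read(const char *path, long idx, dir_record_t *out)
{
    assert(path != NULL && out != NULL && idx >= 0);
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    /* Fixed-size records: the offset is simply idx * record size. */
    int ok = fseek(fp, idx * (long)sizeof *out, SEEK_SET) == 0
          && fread(out, sizeof *out, 1, fp) == 1;
    fclose(fp);
    return ok ? 0 : -1;
}

/* Number of records = file size / record size. Returns -1 on error. */
long dir_count(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    long size = (fseek(fp, 0, SEEK_END) == 0) ? ftell(fp) : -1;
    fclose(fp);
    return size < 0 ? -1 : size / (long)sizeof(dir_record_t);
}
```

With a layout like this, listing a board is a sequential read of the index file, and no database engine is involved.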

LivingInPortal commented on August 11, 2024

I see.
I actually want to build the program on a platform called Kubernetes.
http://kubernetes.io/
I think this would be a good choice, as the platform is mature enough and is deployed in Google's production systems.
Of course, I don't know if this fits the existing PTT platform.
I think this would benefit PTT, as it increases scalability and redundancy.
But some restructuring of the data must be introduced, since this is a different system rather than a file-based data structure.
And the cost introduced by the migration might reduce this benefit, or even outweigh it.
I am still investigating possible solutions and alternatives.

wens commented on August 11, 2024

Kubernetes is very low in the application stack. PTT is a monolithic system.
Moving to Kubernetes, or any distributed system for that matter, requires a complete separation
and rewrite of data storage and user connection handling.

LivingInPortal commented on August 11, 2024

This is exactly what I want to do, though.
I know the data migration is not easy to do, especially for the aging PTT system.
I am still investigating solutions, and trying to document the workflow of PTT to understand the whole system.
So far this is just an idea, and a pretty naive one.

tonytonyjan commented on August 11, 2024

@LivingInPortal Love your idea. I wonder if there is any progress so far.

tonytonyjan commented on August 11, 2024

@LivingInPortal

> I know this is quite stupid question.
> But I dont understand how Ptt handles such a big flow of requests with one machine....

FYI: http://www.kegel.com/c10k.html

PTT is built on libevent, but you could also consider more modern solutions such as libuv or libev.

kcwu commented on August 11, 2024

> PTT is built on libevent

This is not true.
We did use libevent in some portions, but the main mbbsd is a simple fork-based server.
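
For readers unfamiliar with the term, a fork-based server can be sketched roughly as follows. This is not the mbbsd code: `run_session`, `spawn_session`, and `serve_forever` are hypothetical names, and the session body is a placeholder that only writes a banner. The sketch assumes a POSIX system and one child process per accepted connection.

```c
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Placeholder for a whole user session; runs only in the child. */
static void run_session(int fd)
{
    const char banner[] = "welcome\n";
    if (write(fd, banner, sizeof banner - 1) < 0) {
        /* Nothing useful to do in this sketch if the peer is gone. */
    }
}

/*
 * Fork one child for an already-accepted connection.
 * Returns the child's pid in the parent, or -1 if fork failed.
 */
pid_t spawn_session(int conn_fd)
{
    assert(conn_fd >= 0);
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: serve this one connection, then exit. */
        run_session(conn_fd);
        close(conn_fd);
        _exit(0);
    }
    /* Parent (or fork failure): drop our copy of the socket. */
    close(conn_fd);
    return pid;
}

/* Accept loop: every connection gets its own process. */
int serve_forever(int listen_fd)
{
    /* Ignoring SIGCHLD lets the kernel reap exited children for us. */
    signal(SIGCHLD, SIG_IGN);
    for (;;) {
        int conn_fd = accept(listen_fd, NULL, NULL);
        if (conn_fd < 0)
            return -1;
        spawn_session(conn_fd);
    }
}
```

The trade-off of this design is simplicity and isolation per user at the cost of one process per connection, which is the opposite of the event-loop approach taken by libevent.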

robhawkins commented on August 11, 2024

Will the plan allow for an API? If we can make PTT's data accessible, then artificial intelligence applications can make use of it for analysis, similar to this or this

wens commented on August 11, 2024

Access to data is also a policy issue, which is not covered by this technical project here.

SLMT commented on August 11, 2024

Besides the policy issue, it also seems hard to implement a proper API on top of the current code structure. As I understand it, that may require a lot of changes to the structure.

LivingInPortal commented on August 11, 2024

@chhsiao1981
How is the progress going?
I was too busy last year dealing with personal issues, and I almost forgot about this.
I think there are two main factors to consider before starting the refactor: frontend and backend.
First, I think we need to reconsider how our data is arranged and how to migrate from the current file-based data store.
The choice of database depends on everyone's consensus (and many other things need our consensus too). But the most important question is how we migrate our data.
I wonder if we can create a FUSE-based virtual file system to solve this...

chhsiao1981 commented on August 11, 2024

Hi @LivingInPortal,

I'm proposing to separate the main content and the comments (in #31).
I'm also trying to integrate a unit-test framework now (#34).

The current work is setting up unit tests based on Check:

https://libcheck.github.io/check/index.html

I think I'll finish the unit-test setup first, and then continue separating main content / comments.

The storage mechanism is open to discussion once the separation of main content / comments is done.
(I prefer a database that makes good use of memory.)

robertabcd commented on August 11, 2024

I would like to close this. Please open a separate issue when there is a concrete design. Talking about using language X, database Y, framework Z isn't concrete enough.
