Coder Social home page Coder Social logo

v5 Discussion Thread about medici HOT 35 CLOSED

flash-oss avatar flash-oss commented on August 13, 2024
v5 Discussion Thread

from medici.

Comments (35)

Uzlopak avatar Uzlopak commented on August 13, 2024 1

Finally fixed the performance issues.

Automattic/mongoose#11115

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024 1

Closing issue in favor of PR

from medici.

koresar avatar koresar commented on August 13, 2024

We can't make mongoose a peerDependency as yet. But I think you fixed the timeouts by adding .exec(). Not sure though.

The book.ledger() lost its populate option. This is a feature removal. Breaking change. Needs to be listed in the README.

Also, README needs a reversal of the collection name "medici_transactions".

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

Ah, right sorry. I copied from my local fork. Will add back the populate option

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

@koresar
Added back populate.

What timeouts are you referring to?

from medici.

koresar avatar koresar commented on August 13, 2024

the linked #30 talks about timeouts.

Mongoose 6 had a number of breaking API changes. One of them looks like timeout(s). I could be wrong though.

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

Ah Ok.

Added back prettier, should now also run prettier when staging changes.

Anything else you need from my side?

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

Made the date range more resilient.

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

Unified the credit and debit logic in private method transact, so credit and debit are "curried" from transact method. now all fields of a transaction are set. E.g. in debit setting the timestamp was missing. Now that we set everything, we can ensure, that we wont miss anything.

Made it again possible to "inject" your own mongoose schema. :). With that now it determines the keys of a transaction object from the used model schema, like you had it before BUT isValidTransactionKey is now used in all places were transaction keys are handled. So that is also handled uniform.

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

Well, I think I refactored as much as I could. Maybe you give it a shot?

from medici.

koresar avatar koresar commented on August 13, 2024

Help me to understand the .husky/pre-commit file contents:

#!/bin/sh
. "$(dirname "$0")/_/husky.sh"

npx lint-staged
npx lint-staged

Why I can't find "husky.sh" on my hard drive? Why we execute remote code via npx? And why twice?

If this was all about Prettier, then prettier makes most sense when typing code. It has quite minimal sense putting it to git hooks. IMO

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

I took it from here

https://prettier.io/docs/en/precommit.html

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

I guess we need to modify .gitignore so that .idea files are ignored.

Also I guess now mongodb-memory-server is using only mongo 4.2.5. So we have to make it possible to detect if we are using CI and use the mongo database from the CI or if we are local, use mongodb-memory-server

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

ok, now uses process.env.CI to determine if we are in the CI and if so, use the database from the CI.If not, we use mongodb-memory-server :)

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

@koresar
I wanted to answer to your remark here
#33 (comment)

You wrote:

Because we find that every dependency results in time waste.

I agree. But I usually try to keep the amount of the production dependencies low. But on the code quality development dependencies I go crazy, if they help me to reduce potential bugs. :)

from medici.

koresar avatar koresar commented on August 13, 2024

I like your refactorings! Sorry I pushed .idea/ files. I didn't realise the .gitignore got shrunk in this branch.

I find lint-staged useless, as I never stage files. I just push them (as you might have noticed). :)

I would highly highly recommend installing a prettier plugin to your IDE. It speeds up development by a ton.
It doesn't matter what code style we have committed to the repo. I just want developers to be productive when typing code. Try it.

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

I am currently checking the performance.

I am looking to optimize the balance functionality. I think we can remove projection stage, but we dont get actually good performance from it. I think it is necessary to create a new collection like accounts and store there the balance of an account and every time when the journal is written also the account balance should be incremented or decremented.

I assume you use this package in your production environment, so I guess, there is no opportunity to add the account table in medici, correct?

from medici.

koresar avatar koresar commented on August 13, 2024

Hi.

I am not happy with the performance either. But we have a way to overcome them.

I like the "medici_accounts" because in our production those accounts are a mess!
But the "medici_accounts" balance can potentially get out of sync with "medici_transactions". It will happen at some point (even we use ACID transactions, aka session). Medici would not be production-ready module if we do not guarantee the "medici_accounts" always accurate, always represents the latest balance.

What we can do instead - we can attach "balance: Number" to some transactions (like, the last one in a commit of txs). And when doing the aggregation pipeline we would simply lookup the collection down to the first "transaction" with the "balance" property on it.

If implemented my way (without any additional collections for now) then there is an opportunity to be satisfied with the balance querying. :)

Also, if you are going to add balance performance improvements - then start a new branch. Current branch typescript_migration better be focused on one thing only. Trust me. It's better this way for all of us. :)

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

Dont worry, that Branch is only about typescript and refactoring.

But I somehow feel, that putting that information to transactions, will be opening a can of worms. E.g. predating entries.

Btw. I fixed also eslint for linting.

Is there anything we should focus on now in this branch?

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

@koresar
Can it be, that the pagination in .ledger() is broken?

const sort = {

Usually pagination is in the order sort, skip, limit. In ledger we have skip, limit and then sort. Maybe it doesnt matter, because it is a query and we use the query builder...

On the other hand we dont limit the aggregation in .balance().

how should this be handled? Is it a code smell?

from medici.

koresar avatar koresar commented on August 13, 2024

Correct. Order in queries doesn't matter. It matters in pipelines only.

We don't have limit in the .balance(). As soon as you get 1 billion transactions your balance gets calculated in about 1 second. That is a big trouble.

It's not a "code smell", but more like "performance stink". :)
Yeah. We have to fix it somehow.

from medici.

koresar avatar koresar commented on August 13, 2024

Sorry. I'm removing pre-commit hook. Not you nor me need the prettification on commit.

Also, I'm fixing eslint to avoid linting auto-generated folders (build/ and types/).

from medici.

koresar avatar koresar commented on August 13, 2024

I was about to merge this branch to master, but it doesn't build any more. See CI.

Error: src/Book.ts(68,15): error TS2769: No overload matches this call.
  Overload 2 of 2, '(pipeline: PipelineStage[], cb: Function): Aggregate<any[]>', gave the following error.
    Type '{ $sort: { datetime: number; timestamp: number; }; }' is not assignable to type 'PipelineStage'.
  Overload 2 of 2, '(pipeline: PipelineStage[], cb: Function): Aggregate<any[]>', gave the following error.
    Type '{ $group: { _id: string; credit: { $sum: string; }; debit: { $sum: string; }; count: { $sum: number; }; }; }' is not assignable to type 'PipelineStage'.
Error: src/Book.ts(75,39): error TS2769: No overload matches this call.
  Overload 1 of 2, '(pipeline?: PipelineStage[] | undefined, options?: AggregateOptions | undefined, callback?: Callback<any[]> | undefined): Aggregate<any[]>', gave the following error.
    Type '{ $group: { _id: string; credit: { $sum: string; }; debit: { $sum: string; }; count: { $sum: number; }; }; }' is not assignable to type 'PipelineStage'.
  Overload 2 of 2, '(pipeline: PipelineStage[], cb: Function): Aggregate<any[]>', gave the following error.
    Type '{ $group: { _id: string; credit: { $sum: string; }; debit: { $sum: string; }; count: { $sum: number; }; }; }' is not assignable to type 'PipelineStage'.

image

Please help.

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

This happened because of mongoose 6.1.0
https://github.com/Automattic/mongoose/releases/tag/6.1.0

They implemented typings for PipelineStages. I upgraded mongoose to 6.1.1 and all the other packages too. And fixed the typing issue.

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

Regarding .balance()

I think the pagination is wrong. We skip x-elements but still add up till the end of all transactions

I would like to solve it like this:
50d928f

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

I created https://github.com/flash-oss/medici/tree/medici_accounts branch, were i created a medici_accounts table. With that it has a O(1) lookup time for getting the balance. I did not replace the original balance method, as it makes sense to have two methods. One for fast lookup and one for slow but reliable lookup.

balanceFast does take only one account (like the normal balance) and also expects the book as parameter,

For Data Migration we could have two ways:
1- write an aggregation which runs only once, and probably takes long to create the accounts collection
2- when balanceFast is called, it makes a lookup and if it fails, does a balance and stores the data into the accounts database. But the risk is, that in the meantime somebody created a transaction and created the accounts documents, which only contain the last transaction.

But if you initialize the installation from scratch, you dont need data migration and the data is correct from the beginning.

I ran the benchmark and if I have 50.000 entries in my transactions collection balance will take about 140 ms and balanceFast will take about 1,4 ms. There is also a benchmark added.

Also I want to mention, that balanceFast is a work name. It is a proof of concept, so the naming discussion is still open.

I also realized that balance is not restricting the balance to the book. So balance is always creating the total balance over all books. This seems to be a bug, as a user would assume that if you do a balance on a book instance, that you get the balance of said book and not over all books.

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

Or we could make it so, that if we use ACID-sessions, we make balanceFast call, and if we dont find the account, we make the normal balance call and we create the entries in the account collection, according to the original balance result. Also if we do a balance call and we use sessions, we update the value in the account.

So then successively a v4 medici installation would fill up the account table. So then the longer the system runs, and the more we do the normal balance calls, the overall performance would improve.

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

@koresar
If you agree with the latest approach, I would implement it. The thing is, that I need the fast account lookup in my product, so for my needs i am already satisfied with the changes. I think the last approach has the best tradeoffs. It ensures, that we have reliable data in the accounts collection, the source of truth are still the transactions and we dont force the mongodb to use alot of resources just to initialize the database.

Also tbh, I am happy that I refactored the whole code and reviewed it completely.

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

Added unit tests for ACID

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

@koresar

Done some changes in the ci-workflow

I know, you live Australia, so I guess you will be at work in a short. Looking forward for your feedback.

from medici.

koresar avatar koresar commented on August 13, 2024

Thank you for all the changes!

  1. We can drop MongoDB support which do not have ACID. Meaning, support only v4 and up.
  2. I think I'm okay with your "balanceFast" solution. You'd need to store ObjectID "link" to the most recent transaction used in the balanceFast value. We have to track it for other important reasons.

Also please think of the following scenario:

  • We have a lot of ledger entries. And the medici_accounts have the most recent balance.
  • Then someone goes and edits (yes, manually, in the DB) an old entry/transaction.
  • As the result out balanceFast is out of sync once and forever!

How do we overcome issues like that? We have to be ready for bad actors/actions.

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

@koresar
I modified ledger to take lean as option. Tbh I dont know why somebody wants to use hydrated objects there anyway. It exposes the risk that you could actually update exactly one transaction in the field approved, credit and debit and mess up the whole ledger.

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

@koresar
I think it is nearly perfect. From DX-PoV I guess this is also super good, as I typed it to the extreme. It has a code coverage of 100% and solved Bugs. Also I added some typescript types test and some stresstests, to ensure that the sessions are working properly.

Please review it.

Only the balance pagination behavior is something i find still to be considered a bug, See plz 50d928f for a potential patch.If you agree, I would integrate that patch into the current branch.

Also i dont understand why you would solve medici_accounts as you describe it. But I think the typescript_migration Branch is ready for merge and publish.

I think some operations will be faster, some operations will be slower.

from medici.

Uzlopak avatar Uzlopak commented on August 13, 2024

@koresar are you active today?

from medici.

koresar avatar koresar commented on August 13, 2024

from medici.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.