Comments (13)

jrossi commented on July 19, 2024

Read this this morning, and the more I think about it, the more I think this is not a technical problem. Hang in with me here.

No amount of work on our part will make OSSEC performing md5/sha1 on 1 million files acceptable. It is just not going to be possible: the number of IO operations is going to be too high. Unless someone wants to push this onto a GPU, and even then that does not help the IO issue.

This sounds like a config issue, and we don't have the tooling in rootcheck/syscheck to allow people to deal with this type of issue. Or am I completely off base?

from ossec-hids.

gaelmuller commented on July 19, 2024

There is clearly room for improvement in syscheck's performance. On syscheck's side, hashes are computed several times, which is not necessary. But the biggest problem is with analysisd. For each new hash coming to analysisd, the daemon parses the whole database file in order to decide what to do. When you have millions of records, it obviously takes time. Integrity records should be indexed somehow and not just stored as a plain text file.
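To make the cost difference concrete, here is a minimal sketch (not OSSEC code) that contrasts a full scan of a flat record list with a lookup against an indexed SQLite table. The record layout (`checksum path`), the table name `files`, and all function names are assumptions made for illustration only.

```python
import sqlite3


def find_linear(lines, path):
    """Scan every record until the path matches, as a plain text file forces."""
    for line in lines:
        # Hypothetical record layout: "<checksum> <path>"
        checksum, _, name = line.partition(" ")
        if name == path:
            return checksum
    return None


def build_index(lines):
    """Load the same records into an in-memory SQLite table keyed by path."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE files (path TEXT PRIMARY KEY, checksum TEXT NOT NULL)"
    )
    rows = []
    for line in lines:
        checksum, _, name = line.partition(" ")
        rows.append((name, checksum))
    db.executemany(
        "INSERT OR REPLACE INTO files (path, checksum) VALUES (?, ?)", rows
    )
    return db


def find_indexed(db, path):
    """Look up one record through the primary-key index instead of scanning."""
    row = db.execute(
        "SELECT checksum FROM files WHERE path = ?", (path,)
    ).fetchone()
    return row[0] if row else None


# Made-up sample data: 1000 records in the hypothetical flat format.
records = [f"hash{i:06d} /data/file{i}" for i in range(1000)]
db = build_index(records)
```

With the flat list, each incoming hash costs a pass over all records, so processing N files is roughly quadratic in N. With the primary key, each lookup is a B-tree search, which is the kind of indexing the comment above asks for.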

jrossi commented on July 19, 2024

I am planning on doing something in this area. But this should not stop anyone from working on this problem.

mbrossard commented on July 19, 2024

I've done some work to make the syscheck decoder in analysisd use sqlite3 as a storage backend.

The basic work is done and the results are really impressive (if I may say so): ossec-analysisd goes from hogging 100% CPU to barely showing up in top.

There are a few problems with the patch in its current state:

  • It's not optional (there's work to be done in the build system).
  • It doesn't support the auto ignore.

There are some possible improvements:

  • There's no upgrade path or conversion script from an old dataset.
  • The storage backend is not configurable.

Is there a developer that is willing to review and help me get this in?

awiddersheim commented on July 19, 2024

I am interested. What do you need?

mbrossard commented on July 19, 2024

I pushed a first iteration in https://github.com/mbrossard/ossec-hids/tree/stable

I'd like some help with a few points: there are a few choices I made and some work left to do, and I want to know what needs to be done to make it an acceptable pull request.

  • The current format when updating a file, comments out the line and inserts a new line at the end of the file. I chose to update the line.
  • I didn't see where the date (of the update) is actually read back, I can easily add that if needed.
  • The auto-ignore logic is not implemented yet, because I'm not sure I understand it correctly. My understanding is that when auto ignore is enabled, there's a counter stored in a peculiar encoding: 0 (+++), 1 (!++), 2 (!!+), 3 (!!!), and 4 (!!?). The counter is increased until it gets greater than 3, at which point we stop caring about changes on the file.
  • Since my previous message I made the sqlite backend a compile-time option. I don't think it would make sense to support a config file option: if you have sqlite support compiled in, there's essentially no good reason you'd ever want to use the older "naive" back-end.
  • In terms of migration, I'm still unsure about the proper thing to do. A migration script would be painful to automate, and integrated migration logic would be kept around for years as mostly dead code. The "agent is completed" logic is still using the file-based logic. If I change it, upgrading to an sqlite-enabled version would effectively reset the database.
  • Obviously, I'll port the changes to the master branch. The most important change there is seemingly the build system; the rest of the questions above are still relevant.
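The counter encoding described in the list above can be sketched as follows. This only mirrors the understanding stated in this comment (five three-character markers, changes ignored once the counter exceeds 3); it has not been checked against the OSSEC source, and the names `MARKERS`, `decode_counter`, `encode_counter`, and `record_change` are hypothetical.

```python
# Marker prefixes as listed in the comment, indexed by counter value.
MARKERS = ["+++", "!++", "!!+", "!!!", "!!?"]


def decode_counter(prefix):
    """Map a three-character marker to its integer counter (0..4)."""
    return MARKERS.index(prefix)  # raises ValueError for an unknown marker


def encode_counter(count):
    """Map an integer counter back to its marker, saturating at the last one."""
    return MARKERS[min(count, len(MARKERS) - 1)]


def record_change(prefix):
    """Advance the counter by one change.

    Returns (new_prefix, ignored), where ignored is True once the counter
    is greater than 3, matching the rule described in the comment.
    """
    count = min(decode_counter(prefix) + 1, len(MARKERS) - 1)
    return encode_counter(count), count > 3
```

For example, `record_change("+++")` yields `("!++", False)`, while a file already at `!!!` moves to `!!?` and is reported as ignored.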

awiddersheim commented on July 19, 2024

Great. I'll probably need a few days to grab enough time to fully review but I'm very interested in this.

awiddersheim commented on July 19, 2024

@mbrossard Sorry, been pretty busy lately but haven't forgotten about this. Hoping to get to look at it this weekend. Thanks for your patience.

awiddersheim commented on July 19, 2024

The current format when updating a file, comments out the line and inserts a new line at the end of the file. I chose to update the line.

Can you link me to the part of the code where you changed this? I am guessing you mean this update statement:

mbrossard@6e85730#diff-7aa1fcad047a2e8e680e0d76bc406676R586

I'm not familiar enough with the reasons why the old hash info was kept around and just "commented" out, or whether it is kept for history tracking and used later. If it's not used at all (I haven't yet found where it might be used), then it makes sense to not keep it and just write over the line as you have done, unless anyone else in @ossec has a better idea or objections. You only did this in the sqlite code correct?

I didn't see where the date (of the update) is actually read back, I can easily add that if needed.

Sorry, you lost me on this one.

The auto-ignore logic is not implemented yet, because I'm not sure I understand it correctly. My understanding is that when auto ignore is enabled, there's a counter stored in a peculiar encoding: 0 (+++), 1 (!++), 2 (!!+), 3 (!!!), and 4 (!!?). The counter is increased until it gets greater than 3, at which point we stop caring about changes on the file.

Yeah, this logic is pretty strange. I've never personally looked at this before either but it seems like your understanding is correct. Maybe change this to something else for the sqlite stuff? Make this less hard. Don't think you need to replicate this logic exactly. Something simple like another column called updates that takes an integer maybe.
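As a rough illustration of that suggestion, the sketch below keeps one row per file, overwrites the checksum in place, and tracks changes in an integer `updates` column. It is not the schema from the patch: the table layout, the `IGNORE_AFTER` threshold, and the function names are all assumptions.

```python
import sqlite3

# Assumed threshold, taken from the "greater than 3" rule discussed above.
IGNORE_AFTER = 3


def open_db():
    """Create an in-memory database with a hypothetical per-file table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE files (
               path     TEXT PRIMARY KEY,
               checksum TEXT NOT NULL,
               updates  INTEGER NOT NULL DEFAULT 0
           )"""
    )
    return db


def record_checksum(db, path, checksum):
    """Store a checksum and return 'new', 'unchanged', 'changed' or 'ignored'."""
    row = db.execute(
        "SELECT checksum, updates FROM files WHERE path = ?", (path,)
    ).fetchone()
    if row is None:
        db.execute(
            "INSERT INTO files (path, checksum) VALUES (?, ?)", (path, checksum)
        )
        return "new"
    old_checksum, updates = row
    if old_checksum == checksum:
        return "unchanged"
    # Overwrite the row in place rather than appending a new record.
    db.execute(
        "UPDATE files SET checksum = ?, updates = updates + 1 WHERE path = ?",
        (checksum, path),
    )
    return "ignored" if updates + 1 > IGNORE_AFTER else "changed"
```

A plain integer column also makes resetting the counter a one-line `UPDATE`, which is easier to reason about than rewriting marker prefixes.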

I think making the sqlite backend a compile time option is fine. I agree, no reason it should be a configuration option.

I'm not sure having a migration path is worth the time and investment at this point either. I'd rather see these additions happen sooner rather than later. If people want to use it they can add the necessary compile time options and we'll make sure to document that this will essentially reset the database.

If somewhere down the road someone can come up with something we can certainly entertain adding a migration path then.

The only major thing I am seeing is you need to make changes to read-agents.c which gets used by the syscheck_control binary. This gets used to print changes, reset the auto ignore counters, etc. This will need to get updated to handle both the old and new stuff appropriately. Doesn't look like it will be that difficult to update.

awiddersheim commented on July 19, 2024

@mbrossard Have you had any time to work on this? Anything I can help with?

mbrossard commented on July 19, 2024

The current format when updating a file, comments out the line and inserts a new line at the end of the file. I chose to update the line.

Can you link me to the part of the code where you changed this? I am guessing you mean this update statement:

mbrossard/ossec-hids@6e85730#diff-7aa1fcad047a2e8e680e0d76bc406676R586

That's right.

You only did this in the sqlite code correct?

Yes. Unless I made a mistake, the behavior should be unchanged when not using SQLite.

I didn't see where the date (of the update) is actually read back, I can easily add that if needed.

Sorry, you lost me on this one.

There's a date added for each file entry. This date is set and from what I understand never read back (except maybe if you read the files manually).

The auto-ignore logic is not implemented yet, because I'm not sure I understand it correctly. My understanding is that when auto ignore is enabled, there's a counter stored in a peculiar encoding: 0 (+++), 1 (!++), 2 (!!+), 3 (!!!), and 4 (!!?). The counter is increased until it gets greater than 3, at which point we stop caring about changes on the file.

Yeah, this logic is pretty strange. I've never personally looked at this before either but it seems like your understanding is correct. Maybe change this to something else for the sqlite stuff? Make this less hard. Don't think you need to replicate this logic exactly. Something simple like another column called updates that takes an integer maybe.

That's what I was thinking.

I think making the sqlite backend a compile time option is fine. I agree, no reason it should be a configuration option.

I'm not sure having a migration path is worth the time and investment at this point either. I'd rather see these additions happen sooner rather than later. If people want to use it they can add the necessary compile time options and we'll make sure to document that this will essentially reset the database.

If somewhere down the road someone can come up with something we can certainly entertain adding a migration path then.

Ok. The issue might be revisited later.

The only major thing I am seeing is you need to make changes to read-agents.c which gets used by the syscheck_control binary. This gets used to print changes, reset the auto ignore counters, etc. This will need to get updated to handle both the old and new stuff appropriately. Doesn't look like it will be that difficult to update.

Nice catch. I didn't know about that. I was expecting shared code but read-agents.c implements its own access to the files. I've started working on this (and the ignore counter mentioned earlier). I'm sorry for the delay, it seems there's always something urgent.

awiddersheim commented on July 19, 2024

No worries. Glad you are still able to contribute time even if it is few and far between. As far as I'm concerned as soon as you add the read-agents.c stuff and implement your own logic for the counter this is in pretty good shape and worth starting to test and eventually merge.

Let us know if you have any other questions.

mbrossard commented on July 19, 2024

Hi,

I pushed a second iteration to https://github.com/mbrossard/ossec-hids/tree/stable and added two branches based on v2.8.2 (v2.8.2 and squashed).

Since the last time:

  • In the first attempt I tried too hard to minimize code duplication at the expense of readability (too many ifdef sections, etc.). In this iteration, I instead duplicated whole functions in syscheck.c.
  • I added the necessary code for read-agents.c which meant adding the missing logic we discussed earlier (date, change count).

I'm planning a third push with some added comments.
