
cloudpassage-lib's People

Contributors

derwolfe, ehashman, fboxwala, fhocutt, fxfitz, irinarenteria, lvh, reaperhulk, sirsean


cloudpassage-lib's Issues

Increase cache TTL for auth tokens

I was wondering why we need to fetch a new auth token every 4-5 requests in clark-kent, and realized this is because the cache TTL is set to 8000ms (i.e. 8 seconds): https://github.com/RackSec/cloudpassage-lib/blob/48eb2c6ee7840665a8a63cbf6719527a3fcb4ab4/src/cloudpassage_lib/core.clj#L83

The Halo docs suggest the tokens usually live for 15 minutes, although we can check the exact token lifetime in the response body. Is it possible to cache these for 8 minutes (which I believe we did in redis) instead?
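A minimal sketch of deriving the TTL from the response instead of hard-coding 8000ms. This assumes the auth response carries a token lifetime in seconds under an `:expires_in` key (the field name is an assumption; check the actual response body), falling back to 8 minutes when absent:

```clojure
(def safety-margin-ms
  "Refresh a little before the token actually expires."
  (* 60 1000))

(defn token-cache-ttl-ms
  "TTL in ms for caching an auth token, given the parsed response body.
  Falls back to 8 minutes when no :expires_in field is present."
  [{:keys [expires_in]}]
  (if expires_in
    (max 0 (- (* expires_in 1000) safety-margin-ms))
    (* 8 60 1000)))
```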

Refactor the library such that it doesn't need to look at environment variables

Here's an example of an error message that doesn't say very much:

=> (scans/fim-report! "$ID" "$KEY")
16-03-11 20:25:17 MJV0HLDKQ4 INFO [cloudpassage-lib.core] - fetching new auth token for $ID
Mar 11, 2016 2:25:17 PM clojure.tools.logging$eval420$fn__424 invoke
SEVERE: error in stream handler
java.lang.NullPointerException
    at java.util.Arrays.copyOfRange(Arrays.java:3521)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
    at clojure.lang.Reflector.invokeStaticMethod(Reflector.java:207)
    at fernet.core$split_key.invokeStatic(core.clj:29)
    at fernet.core$split_key.invoke(core.clj:28)
    ...

Yay NPE!

The actual problem here was that the fernet key/redis environment variables weren't set properly in profiles.clj. It would be great to validate these settings up front and provide a more user-friendly error message.
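A sketch of what that validation could look like, as a pure function over a config map. The setting names mirror the profiles.clj example elsewhere in this tracker but are otherwise assumptions:

```clojure
(require '[clojure.string :as string])

(def required-settings
  ;; Illustrative; match whatever the library actually reads.
  [:fernet-key :redis-url :accounts])

(defn check-config!
  "Throws an ex-info naming every missing setting; returns config when
  everything required is present."
  [config]
  (let [missing (remove #(some? (get config %)) required-settings)]
    (if (seq missing)
      (throw (ex-info (str "Missing required settings: "
                           (string/join ", " (map name missing)))
                      {:missing missing}))
      config)))
```

Calling this once at startup turns the NPE above into a message that names the absent keys.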

Sometimes, we fail to fetch an auth token

This looks something like the following, from the clark-kent logs:

16-07-13 14:44:19 cloudpassage-reporter INFO [cloudpassage-lib.core:56] - fetching new auth token for <id redacted>
16-07-13 14:44:19 cloudpassage-reporter INFO [cloudpassage-lib.core:74] - fetching https://api.cloudpassage.com/v1/servers/
16-07-13 14:44:21 cloudpassage-reporter INFO [cloudpassage-lib.scans:94] - no more urls to fetch
Jul 13, 2016 2:44:21 PM clojure.tools.logging$eval36$fn__40 invoke
SEVERE: error in stream handler
clojure.lang.ExceptionInfo: Invalid token. {}
        at clojure.core$ex_info.invokeStatic(core.clj:4617)
        at clojure.core$ex_info.invoke(core.clj:4617)
        at fernet.core$invalid_token.invokeStatic(core.clj:16)
        at fernet.core$invalid_token.invoke(core.clj:13)
        at fernet.core$decrypt_token.invokeStatic(core.clj:89)
        at fernet.core$decrypt_token.doInvoke(core.clj:78)
        at clojure.lang.RestFn.invoke(RestFn.java:425)
        at clojure.lang.AFn.applyToHelper(AFn.java:156)
        at clojure.lang.RestFn.applyTo(RestFn.java:132)
        at clojure.core$apply.invokeStatic(core.clj:650)
        at clojure.core$apply.invoke(core.clj:641)
        at fernet.core$decrypt_to_string.invokeStatic(core.clj:110)
        at fernet.core$decrypt_to_string.doInvoke(core.clj:101)
        at clojure.lang.RestFn.invoke(RestFn.java:425)
        at cloudpassage_lib.core$fetch_token_BANG_.invokeStatic(core.clj:104)
        at cloudpassage_lib.core$fetch_token_BANG_.invoke(core.clj:93)
        at cloudpassage_lib.scans$get_page_BANG_.invokeStatic(scans.clj:73)
        at cloudpassage_lib.scans$get_page_BANG_.invoke(scans.clj:70)
        at cloudpassage_lib.scans$scan_each_server_BANG_$scan_server_BANG___16574.invoke(scans.clj:141)
        at cloudpassage_lib.scans$scan_each_server_BANG_$fn__16577.invoke(scans.clj:145)

<giant manifold traceback>

We should add token-checking logic and a retry to ensure we don't attempt to proceed with an empty auth token.

(I thought I had filed this bug ages ago, but turns out I did not.)
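One possible shape for the token-checking-plus-retry logic, sketched around a generic fetch function (`fetch-fn` stands in for `cloudpassage-lib.core/fetch-token!`; the attempt count is illustrative):

```clojure
(require '[clojure.string :as string])

(defn valid-token? [token]
  (and (string? token) (not (string/blank? token))))

(defn fetch-token-with-retry!
  "Calls fetch-fn up to max-attempts times until it yields a usable
  token; throws rather than proceeding with an empty one."
  [fetch-fn max-attempts]
  (loop [attempt 1]
    (let [token (fetch-fn)]
      (cond
        (valid-token? token)     token
        (< attempt max-attempts) (recur (inc attempt))
        :else (throw (ex-info "Could not fetch a valid auth token"
                              {:attempts attempt}))))))
```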

Fetch scans for more precise time ranges

Right now, when I call fim-report! with the hard-coded time range of the last three hours, I may actually receive multiple scan reports for the same host in the results (as FIM scans are completed on an hourly basis). This almost certainly isn't the desired behaviour; rather, I want the last/most recent FIM scan for each individual host.

We need to figure out if this is just a matter of tweaking the time range, or if we'll have to approach this differently (i.e. by requesting scans per host).
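Even if we keep the coarse time range, deduplicating down to the most recent scan per host is straightforward, since the ISO-8601 `:completed_at` timestamps sort correctly as strings. A sketch:

```clojure
(defn latest-scan-per-host
  "Given scan report maps (as returned by fim-report!), keeps only the
  most recent scan for each :server_id."
  [scans]
  (->> scans
       (group-by :server_id)
       vals
       (map #(last (sort-by :completed_at %)))))
```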

Rework Pagination / use a url package, walk links

Instead of creating pagination links in advance, we should walk the links returned by the API. This is more robust in the case that our clock is out of sync with CloudPassage's.

This will likely require

  1. parsing the URL returned by the next link and turning it into a request
  2. pushing the resulting request map onto the input-stream
  3. reworking the tests

Also, this looks like a good place to clean up how URLs and request data are sent to the actual page fetcher. Instead of building a string representing the complete URI in advance, we could use Aleph's support for query parameters; fetch-events! currently builds this string manually.

(original issue here https://github.com/RackSec/cloudpassage-poller/issues/45#issuecomment-195562946)
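Step 1 could be sketched with nothing more than java.net.URI; the request-map keys below follow the usual Ring/Aleph shape but are illustrative:

```clojure
(defn next-link->request
  "Parses the \"next\" link from an API response into a request map,
  so pagination walks whatever URL the server hands back."
  [url]
  (let [uri (java.net.URI. url)]
    {:scheme       (keyword (.getScheme uri))
     :server-name  (.getHost uri)
     :uri          (.getPath uri)
     :query-string (.getQuery uri)}))
```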

Off by one error when calling scans!

Calling the SCA API likely returns one too few pages of results.

The cause is here: https://github.com/RackSec/cloudpassage-lib/blob/4c67340c1224c90789deb8b7a7af1854e7bcdf82/src/cloudpassage_lib/scans.clj#L81. The call to ms/put-all might be made on a stream that has already been closed. The fix would be to move the call to ms/put-all above the conditional branch that closes the streams.

Add retry logic on HTTP errors

I got my first 502 gateway error from the CloudPassage SCA API today! If we end up losing a single report like that, for our purposes the combined report is incomplete and hence incorrect, so we need to start over from scratch.

But it's wasteful to have to throw all the fetched data away in order to try again. Ideally, we should add some retry logic on encountering an error, and only fail after trying some number of times.
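A minimal per-request retry sketch, independent of any HTTP client (`fetch-fn` and the backoff numbers are stand-ins). Retrying at this level means one 502 costs one extra request instead of a full re-fetch:

```clojure
(defn with-retries
  "Calls fetch-fn; on exception, sleeps backoff-ms and retries up to
  max-retries more times, rethrowing the last failure."
  [fetch-fn max-retries backoff-ms]
  (loop [attempt 0]
    (let [result (try {:ok (fetch-fn)}
                      (catch Exception e {:error e}))]
      (cond
        (contains? result :ok)  (:ok result)
        (< attempt max-retries) (do (Thread/sleep backoff-ms)
                                    (recur (inc attempt)))
        :else                   (throw (:error result))))))
```

A real version would probably only retry on 5xx responses and use exponential backoff.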

Add docs for the clojars release process

  • How to sign up for clojars and deal with credentials
  • Adding someone to the group so they can upload releases
  • How to cut a release and push changes back to master using lein release

Fix the docs for compatibility with the lein-env plugin

The lein-env plugin will blow away the .lein-env file if there isn't a profiles.clj in the root of the repo. We should update the docs to reflect this.

Sample profiles.clj for dev testing in the repl:

{:repl {:env {:accounts "lvh:hunter2"
              :redis-url "redis://localhost:6379"
              :redis-timeout 4000
              :fernet-key "very_secret_fernet_key"}}}

fim-report! should fetch more data

Currently the output of fim-report! looks like this:

({:server_id "09d36abea5cc11e591527d9f85f6c9bc",
  :server_url
  "https://api.cloudpassage.com/v1/servers/09d36abea5cc11e591527d9f85f6c9bc",
  :completed_at "2016-03-07T18:45:59.840Z",
  :server_hostname "111111-cf01",
  :module "fim",
  :non_critical_findings_count 0,
  :status "completed_clean",
  :id "d4d0caace49411e58d021b460156fb0c",
  :ok_findings_count 3801,
  :url
  "https://api.cloudpassage.com/v1/scans/d4d0caace49411e58d021b460156fb0c",
  :critical_findings_count 4,
  :created_at "2016-03-07T18:45:58.006Z"}
...)

This probably isn't detailed enough for compliance purposes. It looks like the rest of the scan data lives at the :url; we should probably improve fim-report! to fetch and return the data located there as well.
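One way to structure that enrichment, with the fetching abstracted out (`fetch-json!` is a stand-in for whatever authenticated GET cloudpassage-lib uses internally, and `:details` is an illustrative key):

```clojure
(defn with-scan-details
  "For each scan summary, follows its :url and attaches the full scan
  body under :details."
  [fetch-json! summaries]
  (map (fn [summary]
         (assoc summary :details (fetch-json! (:url summary))))
       summaries))
```

Note this adds one extra request per host, which interacts badly with the API performance issues described below.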

Performance issues with CloudPassage API calls

As the number of hosts in the reports increases, the CloudPassage API's performance issues become more readily apparent. Based on a trial run, I calculated:

  • SCA calls take ~7s but can take up to 10-15s and return ~190kB of data for Windows, ~250kB for Linux (RHEL)
  • SVM calls take ~3.5s and return ~20kB per server
  • FIM calls take ~0.5s and return ~40kB per server

Part of the discrepancy between the performance of the FIM vs. SCA calls can be explained by a large difference in data size returned: FIM scans at the "details" level have many fewer details than the SCA and SVM scans. However, this doesn't explain everything; SVM scans are smaller than FIM scans and still take longer to fetch.

I gathered this data by working with scans for a client with 58 hosts. It took a total of 10 minutes to fetch data the first time around, but could take longer on occasion. This concerns me because if those numbers are accurate, we can expect to spend nearly 3 hours fetching data for a client with 1000 hosts.

We haven't tried parallelizing these requests, because of worries about rate-limiting (see #41). It may also be helpful if we could batch the requests, but I don't see any API docs on the topic.
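As a sanity check on the 1000-host estimate above, linear extrapolation from the observed run:

```clojure
(defn estimated-minutes
  "Linearly scales an observed fetch time to a different host count."
  [observed-minutes observed-hosts target-hosts]
  (/ (* observed-minutes target-hosts) observed-hosts))

;; 10 minutes for 58 hosts scaled to 1000 hosts comes out around
;; 172 minutes, i.e. just under 3 hours.
```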

Clean up some of the server URL logic in scans.clj

There is at least one unused function among the URL helpers at the top of scans.clj. Someone should go through them, remove any unused code, and standardize how we handle URLs for scans vs. servers.

Fix error-handling anywhere `get-page!` is used

scans/get-page! throws an Exception, but if an exception is thrown inside a manifold deferred, manifold catches it and prints a stack trace instead. I suspect what we actually want is to md/catch the Exception before manifold does and throw our own, so that it propagates to the report consumer.

Consolidate mocks in tests to use behaviors

The tests inside scans_test use a mixture of fake-get-page! and anonymous functions defined inside the tests. This is confusing to maintainers and should be refactored to use a single behavior-driven mock that can exercise both the happy and failure paths.
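A sketch of what a single parameterized fake could look like; the behavior keywords and response shapes are illustrative, not the library's actual ones:

```clojure
(defn make-fake-get-page!
  "Returns a fake get-page! exhibiting the named behavior, so each test
  declares the path it exercises instead of defining an inline fn."
  [behavior]
  (case behavior
    :ok    (fn [_url] {:status 200 :body {:scans []}})
    :error (fn [_url] (throw (ex-info "fetch failed" {:status 502})))))
```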
