rackerlabs / cloudpassage-lib
A Clojure library for interacting with CloudPassage APIs.
License: Eclipse Public License 1.0
I was wondering why we need to fetch a new auth token every 4-5 requests in clark-kent, and realized this is because the cache TTL is set to 8000ms (i.e. 8 seconds): https://github.com/RackSec/cloudpassage-lib/blob/48eb2c6ee7840665a8a63cbf6719527a3fcb4ab4/src/cloudpassage_lib/core.clj#L83
The Halo docs suggest the tokens usually live for 15 minutes, although we can check the exact token lifetime in the response body. Is it possible to cache these for 8 minutes (which I believe we did in redis) instead?
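A minimal sketch of deriving the TTL from the token response instead of hard-coding 8000ms. The `expires_in` field name and the 8-minute cap are assumptions; the exact lifetime field should be confirmed against the Halo token response body.

```clojure
(defn token-cache-ttl-ms
  "Derive a cache TTL (ms) from a parsed token response: the token's
  reported lifetime minus a safety margin, capped at 8 minutes.
  The :expires_in key (seconds) is an assumption, defaulting to the
  15-minute lifetime the Halo docs suggest."
  [{:keys [expires_in] :or {expires_in 900}}]
  (let [margin-s 60
        cap-ms   (* 8 60 1000)]
    (min cap-ms (* 1000 (max 0 (- expires_in margin-s))))))
```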
Here's an example of an error message that doesn't say very much:
=> (scans/fim-report! "$ID" "$KEY")
16-03-11 20:25:17 MJV0HLDKQ4 INFO [cloudpassage-lib.core] - fetching new auth token for $ID
Mar 11, 2016 2:25:17 PM clojure.tools.logging$eval420$fn__424 invoke
SEVERE: error in stream handler
java.lang.NullPointerException
at java.util.Arrays.copyOfRange(Arrays.java:3521)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
at clojure.lang.Reflector.invokeStaticMethod(Reflector.java:207)
at fernet.core$split_key.invokeStatic(core.clj:29)
at fernet.core$split_key.invoke(core.clj:28)
...
Yay NPE!
The actual problem here was that the fernet key/redis environment variables weren't set properly in profiles.clj. It would be great to do some checking for that and provide a more user-friendly error message.
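A minimal sketch of the kind of up-front check that would turn this NPE into a readable error. The required key names are assumptions modeled on the sample profiles.clj in these issues.

```clojure
(require '[clojure.string :as string])

(defn check-config!
  "Fail fast with a readable message when required settings are missing,
  instead of letting a nil value surface later as a NullPointerException
  deep inside fernet. The key names here are assumptions."
  [env]
  (let [required [:fernet-key :redis-url :accounts]
        missing  (remove env required)]
    (when (seq missing)
      (throw (ex-info (str "Missing configuration (check profiles.clj): "
                           (string/join ", " (map name missing)))
                      {:missing missing})))))
```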
We encountered this issue on another repo as well. We're not sure what's going on; all the settings look right on our end, so we suspect it's a Codecov problem.
This looks something like the following, from the clark-kent logs:
16-07-13 14:44:19 cloudpassage-reporter INFO [cloudpassage-lib.core:56] - fetching new auth token for <id redacted>
16-07-13 14:44:19 cloudpassage-reporter INFO [cloudpassage-lib.core:74] - fetching https://api.cloudpassage.com/v1/servers/
16-07-13 14:44:21 cloudpassage-reporter INFO [cloudpassage-lib.scans:94] - no more urls to fetch
Jul 13, 2016 2:44:21 PM clojure.tools.logging$eval36$fn__40 invoke
SEVERE: error in stream handler
clojure.lang.ExceptionInfo: Invalid token. {}
at clojure.core$ex_info.invokeStatic(core.clj:4617)
at clojure.core$ex_info.invoke(core.clj:4617)
at fernet.core$invalid_token.invokeStatic(core.clj:16)
at fernet.core$invalid_token.invoke(core.clj:13)
at fernet.core$decrypt_token.invokeStatic(core.clj:89)
at fernet.core$decrypt_token.doInvoke(core.clj:78)
at clojure.lang.RestFn.invoke(RestFn.java:425)
at clojure.lang.AFn.applyToHelper(AFn.java:156)
at clojure.lang.RestFn.applyTo(RestFn.java:132)
at clojure.core$apply.invokeStatic(core.clj:650)
at clojure.core$apply.invoke(core.clj:641)
at fernet.core$decrypt_to_string.invokeStatic(core.clj:110)
at fernet.core$decrypt_to_string.doInvoke(core.clj:101)
at clojure.lang.RestFn.invoke(RestFn.java:425)
at cloudpassage_lib.core$fetch_token_BANG_.invokeStatic(core.clj:104)
at cloudpassage_lib.core$fetch_token_BANG_.invoke(core.clj:93)
at cloudpassage_lib.scans$get_page_BANG_.invokeStatic(scans.clj:73)
at cloudpassage_lib.scans$get_page_BANG_.invoke(scans.clj:70)
at cloudpassage_lib.scans$scan_each_server_BANG_$scan_server_BANG___16574.invoke(scans.clj:141)
at cloudpassage_lib.scans$scan_each_server_BANG_$fn__16577.invoke(scans.clj:145)
<giant manifold traceback>
We should add token-checking logic and a retry to ensure we don't attempt to proceed with an empty auth token.
(I thought I had filed this bug ages ago, but turns out I did not.)
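As a sketch of the token-checking half of the fix (a plain validation helper; where exactly it would sit relative to fetch-token! is an open question):

```clojure
(require '[clojure.string :as string])

(defn ensure-token
  "Reject blank or missing tokens before they reach fernet decryption,
  where they currently surface as an opaque 'Invalid token' error."
  [token]
  (if (and (string? token) (not (string/blank? token)))
    token
    (throw (ex-info "empty or missing auth token from CloudPassage"
                    {:token token}))))
```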
Right now, when I call fim-report! with the hard-coded time range of the last three hours, I may actually receive multiple scan reports for the same host in the results (as FIM scans are completed on an hourly basis). This almost certainly isn't the desired behaviour; rather, I want the last/most recent FIM scan for each individual host.
We need to figure out if this is just a matter of tweaking the time range, or if we'll have to approach this differently (i.e. by requesting scans per host).
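If we end up deduplicating client-side, a minimal sketch (assuming the :server_id and ISO-8601 :completed_at fields seen in the fim-report! output; ISO-8601 strings of the same form sort correctly as plain strings):

```clojure
(defn latest-scan-per-host
  "From a seq of scan summaries, keep only the most recent scan per
  server, comparing :completed_at timestamps lexicographically."
  [scans]
  (->> scans
       (group-by :server_id)
       vals
       (map #(last (sort-by :completed_at %)))))
```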
@derwolfe pointed out the style clash between the snake case data returned by the API and the kebab case we usually use in Clojure land. Hence, for consistency's sake, it would be nice to convert all the snake case API data to kebab case for consumer use.
@sirsean has pointed me towards https://github.com/qerub/camel-snake-kebab, which will surely solve this problem. @lvh pointed out an example of this used in RackSec/desdemona@7fe497f.
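camel-snake-kebab's transform-keys is the right tool for the real fix; purely as a dependency-free illustration of the transformation we want:

```clojure
(require '[clojure.string :as string])

(defn snake->kebab-keys
  "Recursively convert snake_case map keys to kebab-case keywords.
  A minimal sketch; camel-snake-kebab handles this more robustly."
  [x]
  (cond
    (map? x)  (into {} (map (fn [[k v]]
                              [(keyword (string/replace (name k) "_" "-"))
                               (snake->kebab-keys v)])
                            x))
    (coll? x) (mapv snake->kebab-keys x)
    :else     x))
```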
fetch-token! is a blocking, non-asynchronous operation. It should instead return a deferred (or future) that wraps the value.
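A sketch of the shape of the change, using a plain Clojure future for illustration (in the library itself, manifold's deferred would compose better with the existing stream code; `blocking-fetch!` is a hypothetical stand-in for the current blocking implementation):

```clojure
(defn fetch-token-async!
  "Wrap a blocking token fetch so callers get an async handle
  immediately instead of blocking the calling thread."
  [blocking-fetch! client-id client-secret]
  (future (blocking-fetch! client-id client-secret)))
```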
Instead of creating pagination links in advance, we should walk the links returned by the API. This is more robust in the case that our clock is out of sync with CloudPassage's.
This will likely require
Also, this looks like a good place to clean up how URLs and request data are sent to the actual page fetcher. Instead of building a string representing the complete URI in advance, we can use Aleph's support for query parameters; fetch-events! currently builds this string manually.
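A sketch of the idea: keep the query parameters as data and hand them to the client, rather than string-concatenating the URI. The endpoint path is taken from the logs above; the parameter names here are assumptions for illustration.

```clojure
(defn events-request
  "Build a request description for the events endpoint as plain data.
  An Aleph/clj-http-style client can take :query-params directly,
  e.g. (aleph.http/get (:url req) (select-keys req [:query-params]))."
  [since]
  {:url "https://api.cloudpassage.com/v1/events"
   :query-params {"since"    since
                  "per_page" 100}})
```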
(original issue here https://github.com/RackSec/cloudpassage-poller/issues/45#issuecomment-195562946)
It would be nice to have a changelog documenting what's new in each release, above and beyond the git log.
Calling the SCA API is (likely) returning one too few pages of results.
The reason for this behavior is [here](https://github.com/RackSec/cloudpassage-lib/blob/4c67340c1224c90789deb8b7a7af1854e7bcdf82/src/cloudpassage_lib/scans.clj#L81): the call to ms/put-all might be made on a stream that has already been closed. The solution would be to move the call to ms/put-all above the conditional branch that closes the streams, as shown here.
Although "CSM" might actually be inaccurate: it looks like the docs call this "SCA". See https://support.cloudpassage.com/entries/24082902-Scan-History for more details.
I got my first 502 gateway error from the CloudPassage SCA API today! If we end up losing a single report like that, for our purposes the combined report is incomplete and hence incorrect, so we need to start over from scratch.
But it's wasteful to have to throw all the fetched data away in order to try again. Ideally, we should add some retry logic on encountering an error, and only fail after trying some number of times.
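A minimal sketch of per-request retry, so one transient 502 doesn't force us to discard everything fetched so far. `fetch!` and the shape of its result are hypothetical stand-ins for the real page fetcher.

```clojure
(defn fetch-with-retry
  "Call (fetch! url), retrying up to max-attempts times on any
  exception; rethrow with context once attempts are exhausted."
  [fetch! url max-attempts]
  (loop [attempt 1]
    (let [result (try {:value (fetch! url)}
                      (catch Exception e {:error e}))]
      (cond
        (contains? result :value) (:value result)
        (< attempt max-attempts)  (recur (inc attempt))
        :else (throw (ex-info "page fetch failed after retries"
                              {:url url :attempts attempt}
                              (:error result)))))))
```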
lein release
Cheshire can handle parsing a stream directly; it shouldn't use the bs/to-string API. Instead it should use something like:
(->> (:body response)
     byte-stream/to-reader
     (#(cheshire.core/parse-stream % true)))
The lein-env plugin will blow away the .lein-env file if there isn't a profiles.clj in the root of the repo. We should update the docs to reflect this.
Sample profiles.clj for dev testing in the repl:
{:repl {:env {:accounts "lvh:hunter2"
:redis-url "redis://localhost:6379"
:redis-timeout 4000
:fernet-key "very_secret_fernet_key"}}}
Currently, core.clj#L86 and scans.clj#L59 log as errors and not warnings. These would be more appropriately logged as warnings. I'd like to change the log levels here so we can only alert on ERROR or FATAL messages.
Currently the output of fim-report! looks like this:
({:server_id "09d36abea5cc11e591527d9f85f6c9bc",
:server_url
"https://api.cloudpassage.com/v1/servers/09d36abea5cc11e591527d9f85f6c9bc",
:completed_at "2016-03-07T18:45:59.840Z",
:server_hostname "111111-cf01",
:module "fim",
:non_critical_findings_count 0,
:status "completed_clean",
:id "d4d0caace49411e58d021b460156fb0c",
:ok_findings_count 3801,
:url
"https://api.cloudpassage.com/v1/scans/d4d0caace49411e58d021b460156fb0c",
:critical_findings_count 4,
:created_at "2016-03-07T18:45:58.006Z"}
...)
This probably isn't detailed enough for compliance purposes. It looks like the rest of the scan data lives at the :url; we should probably improve fim-report! to fetch and return the data located there as well.
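A sketch of what that augmentation could look like. `fetch!` is a hypothetical function from URL to parsed body; in the library it would be the existing page fetcher.

```clojure
(defn with-scan-details
  "Augment each scan summary with the full scan body found at its :url,
  attached under a new :details key (name is an assumption)."
  [fetch! scans]
  (map (fn [scan] (assoc scan :details (fetch! (:url scan)))) scans))
```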
As the number of hosts in the reports increases, the CloudPassage API's performance issues become more readily apparent. Based on a trial run, I calculated:
Part of the discrepancy between the performance of the FIM vs. SCA calls can be explained by a large difference in data size returned: FIM scans at the "details" level have many fewer details than the SCA and SVM scans. However, this doesn't explain everything; SVM scans are smaller than FIM scans and still take longer to fetch.
I gathered this data by working with scans for a client with 58 hosts. It took a total of 10 minutes to fetch data the first time around, but could take longer on occasion. This concerns me because if those numbers are accurate, we can expect to spend nearly 3 hours fetching data for a client with 1000 hosts.
We haven't tried parallelizing these requests, because of worries about rate-limiting (see #41). It may also be helpful if we could batch the requests, but I don't see any API docs on the topic.
There is at least one unused function among the URL helpers at the top of scans. Someone should go through this and remove any unused code, and standardize how we do things between scans vs. servers.
This is currently in progress à la #18.
scans/get-page! throws an Exception, but if an exception is thrown inside a manifold deferred, manifold catches it and prints a stack trace instead. I suspect what we actually want is to md/catch the Exception before manifold does and throw another one ourselves, so that it propagates to the report consumer.
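The rewrapping half of that idea, shown on a plain function for illustration; in the library this would be applied with manifold.deferred/catch on the page deferred rather than try/catch.

```clojure
(defn rewrap-error
  "Catch an exception from a page fetch and rethrow it as a
  report-level error so it reaches the consumer instead of being
  swallowed by manifold's default stream-handler logging."
  [f]
  (try (f)
       (catch Exception e
         (throw (ex-info "failed to fetch scan page"
                         {:cause (.getMessage e)}
                         e)))))
```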
The tests inside of scans_test use a mixture of fake-get-page! and anonymous functions defined inside the tests. This is confusing to maintainers and should be refactored to use a single behavior-driven mock that can cover both the happy and failure paths.