Comments (12)

mmzyk commented on May 28, 2024

We'll see if we can figure it out @skymob. I'm not familiar with the Chef-API gem, but I'll dig up the auth code in the Chef server when I have a minute and link it here with any thoughts I have that might point to what is going on.

from chef-server.

skymob commented on May 28, 2024

Hi @mmzyk - checking back in on this issue. Do you have any suggestions for things we could be looking at on our end? Specific logs, increased log verbosity, etc.? Would a tcpdump capture on either end be helpful? We ended up adding retries to our gem that wraps the Chef API gem, but even then the retry sometimes fails up to 4 times in a row.

mmzyk commented on May 28, 2024

Hi @skymob. Sorry for the long silence on this. You caught me on a two-week vacation and then coming back to some internal work that had to be sorted out. I'm going to dig into the Chef code and pull out where this is failing, and maybe that can lead us to why.

mmzyk commented on May 28, 2024

So the workhorse file that's doing the auth check is here: https://github.com/opscode/chef_wm/blob/master/src/chef_wm_base.erl

While the erlang code might look strange, I think the naming is clear enough that you can follow this and get an idea where or why it might be failing.

So, one thing to know is that this file is plugged into webmachine, which is the webserver Chef uses. Based on what the functions in the chef_wm_base file return, webmachine determines which HTTP response to send. More info on the functions webmachine looks for can be found here: https://github.com/basho/webmachine/wiki/Resource-Functions

The key takeaway is that webmachine looks for the is_authorized function and if that function returns anything other than true, a 401 is returned. So let's look at what is_authorized is doing.

https://github.com/opscode/chef_wm/blob/master/src/chef_wm_base.erl#L174
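As a rough mental model of the webmachine rule described above, here is a small Python sketch. It is not webmachine's or Chef's code: `respond` is a made-up name, and the 200 stands in for "carry on with the rest of the request handling". Per the webmachine docs linked above, a non-true return value is used as the WWW-Authenticate header of the 401.

```python
def respond(is_authorized_result):
    """Toy model of how webmachine treats the is_authorized callback.

    Hypothetical helper for illustration only.
    """
    if is_authorized_result is True:
        # Authorized: webmachine moves on to the next resource function.
        # 200 is a placeholder for "whatever the rest of the flow decides".
        return 200, {}
    # Anything other than true ends the request with 401 Unauthorized.
    return 401, {"WWW-Authenticate": str(is_authorized_result)}
```

So any failure inside the signature check only has to make is_authorized return something other than true for the client to see a 401.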

We can see it's calling verify_request_signature, which is doing most of the work. That function is found here: https://github.com/opscode/chef_wm/blob/master/src/chef_wm_base.erl#L174

verify_request_signature is gathering a bunch of info (which I'm going to assume is working properly, as the error message you gave isn't one of not_found). It is then calling out here: https://github.com/opscode/chef_wm/blob/master/src/chef_wm_base.erl#L174 to actually authenticate the request.

That call is using some included code that can be found on github here: https://github.com/opscode/chef_authn/blob/master/src/chef_authn.erl#L422

It looks like from that called code this error is resulting: https://github.com/opscode/chef_authn/blob/master/src/chef_authn.erl#L426

So we need to figure out what is happening where the actual code is being run that's throwing an error.
That code that is throwing the error is somewhere in the do_authenticate_user_request function here: https://github.com/opscode/chef_authn/blob/master/src/chef_authn.erl#L438

I've got to leave it at that for now, but will try to circle back around to look at this more. Hopefully that gives you enough to go on to possibly look at this more on your end.

mmzyk commented on May 28, 2024

All right, coming back to this, I see that my copy/paste foo failed in the last message I posted and I posted the same link to chef_wm_base line 174 three times. Go figure.

So, just to wrap up some loose ends on how the code works (I've been purposefully thinking out loud here, or maybe typing out loud, mostly because I don't know what the cause of this error is going to be, so I want to give you as much info as possible to try and solve it).

The entry point for this code is here: https://github.com/opscode/chef_wm/blob/master/src/chef_wm_base.erl#L174, in the is_authorized function.

It will move to the verify_request_signature function, here: https://github.com/opscode/chef_wm/blob/master/src/chef_wm_base.erl#L261

If this method fails, https://github.com/opscode/chef_wm/blob/master/src/chef_wm_base.erl#L287, we construct the failure message, by calling verify_request_message, and so if we follow that we can find the exact error message that you see, which is coming from here: https://github.com/opscode/chef_wm/blob/master/src/chef_wm_base.erl#L469

So that wraps up where the error message is coming from, but this doesn't tell us what is causing the error. That takes us back to the previous code tracing I was doing in the last comment.

I had traced the code across modules to chef_authn and the authenticate_user_request function, https://github.com/opscode/chef_authn/blob/master/src/chef_authn.erl#L422

In that function is the code that causes the error message above, https://github.com/opscode/chef_authn/blob/master/src/chef_authn.erl#L426

So I left off trying to find what was triggering that error path. That took me down to do_authenticate_user_request, https://github.com/opscode/chef_authn/blob/master/src/chef_authn.erl#L438

Following the code, no method before verify_sig will cause the error seen. The call to verify_sig is here https://github.com/opscode/chef_authn/blob/master/src/chef_authn.erl#L438 and the verify_sig function is just below: https://github.com/opscode/chef_authn/blob/master/src/chef_authn.erl#L455

The code actually branches here based on the signing version used. If it is 1.0 or 1.1 it will go this path:
https://github.com/opscode/chef_authn/blob/master/src/chef_authn.erl#L455
If it is 1.2 it will go this path: https://github.com/opscode/chef_authn/blob/master/src/chef_authn.erl#L465

1.1 is the default, as defined in the code here: https://github.com/opscode/chef_authn/blob/master/src/chef_authn.erl#L86

(FYI, the ?SIGNING_VERSION references in the code are macros. They are defined in the header file here: https://github.com/opscode/chef_authn/blob/master/src/chef_authn.hrl so that is how those values are being resolved.)

Since I can't be sure which signing version you are using from the info I have, if we go back a bit, we can see it is pulled from the headers here: https://github.com/opscode/chef_authn/blob/master/src/chef_authn.erl#L445 and is then passed along through the request. So you should be able to look at the headers being sent, see whether you are using 1.0, 1.1, or 1.2, and follow the code path as appropriate.

The code that pulls the sign version from the header, if you follow the code deep enough, is here: https://github.com/opscode/chef_authn/blob/master/src/chef_authn.erl#L392 You can see it's looking for the X-Ops-Sign value to determine what the signing version is, so look for that header value in your requests.
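If you want to check this on the client side, a minimal Python sketch for reading the version out of a captured X-Ops-Sign header might look like the following. The function name is hypothetical, and the `algorithm=...;version=...;` layout of the header value is an assumption about what the client sends, so compare it against your own captures first.

```python
def signing_version(headers):
    """Return the version named in an X-Ops-Sign header, or None if absent.

    Assumes a value shaped like "algorithm=sha1;version=1.1;" (an assumption,
    not taken from the server code).
    """
    value = ""
    for name, header_value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "x-ops-sign":
            value = header_value
            break
    for part in value.split(";"):
        key, sep, val = part.strip().partition("=")
        if sep and key.strip().lower() == "version":
            return val.strip()
    return None


print(signing_version({"X-Ops-Sign": "algorithm=sha1;version=1.1;"}))
```

Running it over the headers of both a failing and a successful request tells you which of the code paths below applies to each.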

So, if we follow the 1.0, 1.1 default path, we're going to call decrypt_sig https://github.com/opscode/chef_authn/blob/master/src/chef_authn.erl#L392 with the AuthSig value and the public key. decrypt_sig is here, and it branches based on if we find an RSA public key or not: https://github.com/opscode/chef_authn/blob/master/src/chef_authn.erl#L469

However, note that if we don't find an RSA Public Key, we assume the key is encoded and try to decode it, then call decrypt_sig again. For completeness, the decode_key function is here: https://github.com/opscode/chef_authn/blob/master/src/chef_authn.erl#L494

So when decrypt_sig returns, in the 1.0 and 1.1 default path you can see that we attempt to do what looks like an assignment here: https://github.com/opscode/chef_authn/blob/master/src/chef_authn.erl#L458. Except Plain was passed in and is already bound, so what we're really doing is pattern matching. If the result of decrypt_sig doesn't match Plain, the match fails and Erlang raises a badmatch error, which will trigger the code path seen above. Plain was set in the previous function here: https://github.com/opscode/chef_authn/blob/master/src/chef_authn.erl#L447

If we follow the 1.2 code path, we end up in a similar situation: in the verify_sig function for that code path we call public_key:verify and match the result against true. If public_key:verify doesn't return true, the match fails, resulting in the error seen.

The functions that are in the public_key module and are called here are located in the core Erlang library. http://erlang.org/doc/man/public_key.html
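For anyone not used to Erlang, the "match as assertion" behaviour on both paths can be sketched in Python roughly like this. It is only an analogy: `BadMatch`, `match`, `authenticate`, and the return strings are invented for illustration and are not names from chef_authn.

```python
class BadMatch(Exception):
    """Stands in for the error Erlang raises when a pattern match fails."""


def match(expected, actual):
    # Erlang: `Plain = decrypt_sig(...)` with Plain already bound, or
    # `true = public_key:verify(...)`, behaves like an equality assertion.
    if actual != expected:
        raise BadMatch(actual)
    return actual


def authenticate(version, plain, decrypted=None, verify_result=None):
    """Toy model of the two verify_sig branches (hypothetical signature)."""
    try:
        if version in ("1.0", "1.1"):
            match(plain, decrypted)      # decrypted signature must equal Plain
        else:
            match(True, verify_result)   # verify call must have returned true
        return "authenticated"
    except BadMatch:
        # The failed match is caught and turned into the auth failure.
        return "bad_sig"
```

In both branches a single mismatched byte in the signed data or signature is enough to end up in the failure path.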

So, to come full circle, I can't say exactly what is happening @skymob, except that for some reason the auth info being sent across is failing to authenticate properly after it reaches a very deep level where it is comparing the key and the signature. We do know that having a client with the same name as a user can cause this, but in that case it would be expected to fail for each request, not in the pattern being seen here.

I'd suggest trying to capture the auth info being sent from both failing and successful requests and seeing if there is a difference. Given this is an intermittent issue (meaning there doesn't seem to be a consistent reproducible case, not that it doesn't happen often), I am inclined to think this is an issue with the chef-api gem and not with the server itself, especially since knife works just fine. I am not surprised this fails across different types of requests, as this auth code is fundamental to every request made to the chef server.
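One way to do that comparison, sketched in Python under a few assumptions: the captured headers are available as plain dicts, the signing headers share an `X-Ops-` prefix (only X-Ops-Sign is confirmed above), and `diff_auth_headers` is a made-up helper name. Timestamps and signatures will naturally differ between any two requests, so the interesting results are headers that are missing on one side or identity/version values that change.

```python
def diff_auth_headers(ok_headers, bad_headers, prefix="x-ops-"):
    """Return {header: (ok_value, bad_value)} for signing headers that differ.

    A header missing from one capture shows up with None on that side.
    """
    def pick(headers):
        # Keep only the signing headers, with case-insensitive names.
        return {k.lower(): v for k, v in headers.items()
                if k.lower().startswith(prefix)}

    ok, bad = pick(ok_headers), pick(bad_headers)
    return {name: (ok.get(name), bad.get(name))
            for name in sorted(set(ok) | set(bad))
            if ok.get(name) != bad.get(name)}
```

Feeding it one capture from a request that succeeded and one from a request that got the 401 gives a short list of things to look at first.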

Let us know if during further investigation you still think this might be a server issue, but I'm going to close this issue out for now, since I don't believe this to be a chef server issue but instead to likely be a chef-api gem issue. The chef-api gem is a community project and is not maintained by Chef, to be clear on that.

Hopefully this helps @skymob. Good luck figuring it out.

skymob commented on May 28, 2024

@mmzyk, this makes sense. Thanks again for your very thorough research!

phene commented on May 28, 2024

I tried filing this issue with the chef-api gem, but the maintainer is still unconvinced. chef-boneyard/chef-api#32

mmzyk commented on May 28, 2024

@phene For better or worse, the only reports I've seen of this happening are with the chef-api gem, which isn't a Chef-maintained project. Beyond what I've already investigated, I don't have the bandwidth to devote to trying to track this down, especially since I don't personally use the chef-api gem. The maintainer may be right or wrong, but it's likely going to be up to one of the users of the chef-api gem to try and track this down or come up with a consistently reproducible case. Hopefully the code paths I've laid out above can be helpful to anyone who wants to try and track this down.

neurogenesis commented on May 28, 2024

also having this problem intermittently.

spuder commented on May 28, 2024

I can reproduce this. I've covered the full details with scripts and tcp dumps here:

chef-boneyard/chef-api#32 (comment)

stevendanna commented on May 28, 2024

@spuder Thanks for the detailed investigation. I'll try to take a look at the issue this week.

stevendanna commented on May 28, 2024

@spuder If you could retry your various tests with chef-boneyard/chef-api#39, I'd love to know if that solves it for you.
