Coder Social home page Coder Social logo

Code audit: DAS about das HOT 108 CLOSED

 avatar commented on August 30, 2024
Code audit: DAS

from das.

Comments (108)

 avatar commented on August 30, 2024

lat: Initial set of deployment review comments are on [https://wave.google.com/wave/waveref/googlewave.com/w+XsBuMPFwb google wave].

from das.

 avatar commented on August 30, 2024

lat: After various rounds of deployments comments on mailing list, currently outstanding deployment issues are:

  • RPM %post should do less (#398)
  • Need automatic server restart on boot; patch pending review
  • Documentation to be provided separately on http group pages
    • Exact scheme to be communicated and implemented
  • Can't run DAS under separate account, as it writes to RPM install area
    • Possibly happens only on first start to populate MongoDB?

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS packaging: would prefer %post to do even less, in particular would prefer that servers don't ship with a running configurations in the RPM. An example is ok, but don't create the default configuration the server will use and run with. It's too risky, if there's a mistake one might start a highly insecure server.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS bin directory: suggest to rename all programs consistently with or without suffix. I'd prefer without suffix and it seems to be the more common here too, so perhaps the suffixes can be removed.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS {{{bin}}} directory: Use syntax {{{foo ${1+"$@"} }}} not just {{{foo "$@"}}}. The former handles no arguments correctly, the latter has problems if user doesn't supply any command line arguments at all.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS {{{bin}}} directory: Use {{{#!/bin/sh}}} instead of {{{#!/usr/bin/env bash}}}. I don't think the scripts use any features that are not in POSIX standard. And even if they did, they can probably be trivially expressed in POSIX-compatible subset.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS {{{bin}}} directory: start-up scripts should be folded into {{{manage}}}. We very much prefer to see everything inlined directly into the {{{manage}}} script without several layers of indirection, for simplicity, comprehension and transparency.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS {{{bin}}} directory: I don't know if {{{das_test}}} is meant to be used by anyone other than developers, but in the general spirit of not holding configuration in the software area, would suggest to take {{{das.cfg}}} from outside {{{$DAS_ROOT}}}. Would also suggest to use python configuration file, not .cfg style.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS {{{bin}}} directory: {{{dassh}}} appears to touch $HOME, which seems suspicious. How exactly does DAS use ipython? Is this going to appear on server side, and if so, why? I did notice we had /home/cmsweb/.ipython appear at one point.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS {{{bin}}} directory: suggest to remove non-production binaries from default project bin directory when the RPM is built. In general would prefer if test artifacts didn't make it to the running servers at all. I have no objection to having them in SVN mixed up, as long as the build recipe can put the bits in the right place in the RPM (or omit them, as the case may be).

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: nice job with unit tests.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: xwho is obsolete and probably should be converted to use xldap.cern.ch. See {{{phonebook}}} command on any CERN linux system. It's just a perl script which makes a simple LDAP query to xldap.cern.ch. Note that xldap.cern.ch is only available within CERN, (testing) outside CERN you'll need (SSL + password authenticated) connection to ldap.cern.ch instead.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: would suggest to generally consider slight flattening of name space. "from DAS.utils.utils import foo" seems a little redundant :-)

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: generally like the use of yaml for describing schema, and how it is linked to service providers - good job.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: curious about this coding pattern:

{{{
#!python
try:
jsondict = json.loads(data)
except:
jsondict = eval(data)
}}}

Use of {{{eval()}}}, especially without any dictionary restrictions - eval global and local dictionary options - seems very dangerous. It would seem that if someone manages to compromise some other CMS application, they can also contaminate DAS because it will eval the data produced by those sources. Admittedly I am not sure where the "data" can originate here. But if the data is already sanitised, the construct seems unnecessary, so I am assuming it originates from untrusted outside source.

What's the reason falling back on eval? Is the data python-esque json, but not completely legal json? Can we not fix the upstream data source? The above example was from SiteDB, so for that we should just fix the output...

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: suggest to rename services/runsum/run_summary.py to something/cern_sso_auth.py.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: services/runsum/runsum_service.py, runsum_keys() should not be referring to configuration files under $DAS_ROOT. Parametre file locations should come exclusively from server configuration, not picked up via environment, or RPMs.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: general comment, would find expression {{{if not isinstance(x, dict)}}} style more readable than {{{if type(x) is not types.DictType}}} style.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: services/runsum/runsum_service.py, {{{parser()}}} method is not protected against uninitialised in {{{root.clear()}}} at the end of the function. Also not obvious why the function has to call {{{source.close()}}}, seems like that should be automatic, either in caller, or just handled automatically by it going out of scope.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: overview plotfairy version, session arguments are unnecessary and can be omitted.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: {{{encode4admin}}} seems unused. I'd prefer it was removed entirely, just to avoid it accidentally getting used somewhere.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: I have not yet read all the code, but have a general comment in that it's somewhat unclear where input validation occurs, especially on data loaded from the upstream sources in services. There are some int(x) type conversions which will naturally throw, but I didn't see where types were specified, and how DAS handles situation where query requests say numerical comparison against mixed / non-numerical types.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: DAS seems pretty liberal on using various ports, especially for MongoDB. We should ensure DAS uses port range allocated to it.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: {{{das_top.tmpl}}} pulls yui from yahoo's servers - seems like das should serve it itself.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: {{{das_table.tmpl}}} defines variable {{{YAHOO.example.DynamicData}}} - seems like it probably should CMS/DAS related name instead?

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: {{{das_table.tmpl}}} (among other places) generates URLs which don't use quoting for arguments. It's possible the arguments don't require it, but would recommend that as a general style, anything originating from possible user source goes via encodeURIComponent() or alike.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: {{{das_services.tmpl}}} - among other places - generates HTML and URI components from values where it's not obvious whether quoting is required.

I assume for {{{das_services}}} the text cannot contain unsafe characters, and in particular, DAS cannot be induced to generate a XSS attack here by injecting malicious content, but nevertheless I would suggest the page contents are automatically quoted.

Not sure if the template engine automatically does that, but given that there aren't any distinctive directives for different output contexts (HTML vs. URI inside A tag), I assume no quoting is happening by default.

from das.

 avatar commented on August 30, 2024

lat: Following [comment:30 comment 30], {{{das_query_info}}} seems vulnerable to XSS and other injection attacks. You should be careful that if someone constructs malicious queries, you don't inject them as-is to the admin/expert page, which may get the admin's browser do all kinds of funny things! This has been one classic way several sites got compromised...

from das.

 avatar commented on August 30, 2024

lat: Further to [comment:30 comment 30], it seems many templates and bits from code like {{{DAS/web/*.py}}} may pass in parameters from client unvalidated all the way straight to the template and back to the client.

So it seems there may be possibility for various kinds of XSS and injection attacks. I am definitely not positive about this. It's possible there's a central validation somewhere, and quoting in the template engine, but it's not very obvious from reading this code that that will happen.

If that's the case I would definitely appreciate localising the knowledge to the functions (e.g. {{{records()}}} in {{{das_web.py}}}), so reading the code it's essentially obvious what sort of protection is in place. I'd definitely recommend against splitting any protections to another location, away from the code that needs them.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: {{{das_jsontable.tmpl}}} seems like standard YUI boilerplate template, complete with tracking bug. This seems like a testing thing that belongs somewhere else. As far as I can tell it's unused, so perhaps can be removed, or put waiting in dev patch queue until it's needed?

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: authorisation to 'expert' page appears to be protected by 'admin.dns' mongodb database.

It is good DAS uses X509 certificate access status to make the check, but it's unclear to me how this database gets populated and where, and whether it's possible to subvert it. Is it somehow related to the admins in SiteDB?

However more generally, DAS should use standard CMS authentication and authorisation system to protect access. Even if it uses information from SiteDB, it shouldn't use an alternate path to that information, and alternate code to make the checks. It should use standard method decorators and other agreed conventions.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: accessing {{{/das/help}}} URL has two issues. First it has a problem generating the {{{das_help}}} page because {{{services}}} is missing. Second is that it spits out a traceback to the client browser, which means the server isn't properly configured to hide the errors.

from das.

 avatar commented on August 30, 2024

lat: Further to [comment:30 comment 30], the {{{das_error.tmpl}}} does give an example how to perform quoting. So it does seem the vast majority of other pages, both python code and template, need to be carefully revisited to make sure all data, whether from user or another database source, is carefully sanitised before putting it into output.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: the various method decorators in {{{DAS/web/tools.py}}} seem to require review. I would suggest to carefully go through all of them, and all their uses in the code. I would suggest to remove all decorators which aren't strictly required.

Some of the uses of the decorators seem even more curious than the decorators themselves. The wrap2das* methods seem to have very little in way of validation, and I would really like to see some comments to whether they can be used to get the server reflect arbitrary data back to the client. You really should avoid any clever redirection (including internal redirection) and wrapping tools, especially against any possibly tainted data that originated from user.

from das.

 avatar commented on August 30, 2024

lat: Further to [comment:37 comment 37], I did verify the wrap2das* methods are accessible to the clients, even though they appear to be used internally as well. The internal calls pass real data, so it seems that sufficiently clever call to wrap2das* method with suitably complex GET or POST might be possible to use as a reflector. If true this again opens window for XSS and XSRF attacks.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: {{{DAS/web/das_webmanager.py}}}'s {{{css()}}} and {{{js()}}} methods have duplicate logic for locating files. The {{{check_scripts()}}} should probably return the decision it made on file path. Also related to this, I am not sure modifying a list while iterating it is safe (in {{{check_scripts()}}}, and I am not sure the code is properly protected against duplicated invalid entries.

I would suggest to simplify the code so it makes the path decisions in exactly one location, by simple iteration and producing a new output, rather than trying to modify the data structure in-place.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: I suggest to centralise URL scraping. It seems several places in code use {{{urllib2_request}}} which seems to perform an appropriate amount of encoding for protection, but it's not consistently used everywhere. For general understanding of the code it would be helpful if there were as few scraping / internal http request patterns as possible.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: the {{{status()}}} method on {{{DAS/web/das_web.py}}} seems to pass any request through unmodified to the underlying data store.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: {{{DAS/web/das_server.py}}} defaults to writing pid files to /tmp. I would suggest to default to not writing a pid file at all if one wasn't requested, rather the defaulting to "somewhere".

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: the default socket listen queue should probably default to 100+, especially as we'd expect this to be a high-traffic server.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: the server should default to production mode, turning tracebacks to client off, and not reloading code whenever the files change. Not sure if that's handled somewhere other than {{{DAS/web/das_server.py}}}. In fact would like to see this use more of WMCore...

from das.

 avatar commented on August 30, 2024

lat: Further to [comment:44 comment 44] indeed it seems there is very little use of WMCore. I suggest to reconsider reusing at least the base server, possibly quite a bit more from WMCore now that it is available - I appreciate it might not have been in place when this code got started.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: like many other places, {{{DAS/web/das_expert.py}}} seems to reflect caller arguments directly back to the client in URLs and otherwise. Seems like more aggressive input and output sanitisation is in order.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: {{{DAS/web/das_expert.py}}} also seems to use the input parameters completely unchecked. Granted the code is protected with DN check, and some of the mishaps would likely just produce python KeyError, but would still prefer to see a style that rigorously untaints all client input before using it. (Using WMCore should help here, and would probably also help verify the HTTP request method is appropriate to the method call.)

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: as previously discussed would like to see {{{das_doc.py}}} dropped.

from das.

 avatar commented on August 30, 2024

lat: Review note on DAS: suggest {{{DASLogger}}} to default to use {{{rotatelogs}}} now that we've made that available as a stand-alone utility. Alternatively perhaps support pipe output syntax. Preferably in any case shared with WMCore as much as possible. I do see it uses python internal logging facility rotation ability, but I'd suggest to standardise on rotatelogs instead.

from das.

 avatar commented on August 30, 2024

lat: Stopping this round of review here. This leaves DAS core and analytics for another round of code audit / review.

from das.

 avatar commented on August 30, 2024

lat: Marking this work item completed for HG1009 milestone, and assigning ticket to DAS project, for a milestone and priority to be determined together with other project managers. It would probably be best to split this ticket to individual items. I am certainly available to discuss how to handle review notes and to make sure they don't get lost.

from das.

 avatar commented on August 30, 2024

lat: == Summary of my findings ==

A lot of the issues mentioned above are low-level technical points. While I did not review the core and analytics code, my overall impression of DAS code is very favourable. I compliment in particular the choice of YAML for describing the schema, queries and notation, and its mapping to service python code. Overall I find the code simple, elegant and understandable, and very approachable. Well done on the core technical choices.

The greatest weakness I see is almost complete lack of input and output sanitisation, but from DAS clients and validation of upstream data. I am a little concerned this might be quite vulnerable to XSS, XSRF and injection attacks, and very vulnerable to compromised upstream data sources. I did not reason deeply enough about the code to understand if it is possible for clients to inject code to the map reduce algorithms, but I'd consider that possible.

I would also very much encourage more code sharing from current WMCore server. Many trivial issues seem best addressed by hooking up to WMCore server and its validation support - not by recoding the same functionality into DAS.

Depending on what sort of schedule you'd anticipate for fixing some of the handling of tainted data, I might feel a bit more comfortable if we put this behind mandatory X509 certificate authentication.

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:6 lat]:

Review note on DAS packaging: would prefer %post to do even less, in particular would prefer that servers don't ship with a running configurations in the RPM. An example is ok, but don't create the default configuration the server will use and run with. It's too risky, if there's a mistake one might start a highly insecure server.

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Lassi, thank you for comprehensive feedback. I'm working on all topics you raise, but want to clarify a few of them:

'''Lat comment:''' Would also suggest to use python configuration file, not .cfg style.
'''VK reply:''' I support both configuration styles: via python configuration and cfg. The former is used for DAS CMS configuration, while cfg can be used for DAS used elsewhere apart from CMS environment (see below).

'''Lat comment:''' xwho is obsolete and probably should be converted to use xldap.cern.ch.
'''VK reply:''' The services such as xwho, ipservice, etc. are used for DAS tests. They're not activated in production, but pretty useful for testing/development of generic DAS behavior. The service activation is done via das_cms.py configuration where all serviced are listed (aka whitelist).

'''Lat comment:''' DAS seems pretty liberal on using various ports, especially for MongoDB. We should ensure DAS uses port range allocated to it.
'''VK reply:''' I don't really understand this statement. What sort of liberty we're talking about? MongoDB port usage is configured via das_cms.py configuration file, see config.mongodb.dbport = 27017. All connections are done through one port. The port number is not hardcoded in a code. Please elaborate more.

'''Lat comment:''' das_top.tmpl pulls yui from yahoo's servers - seems like das should serve it itself.
'''VK reply:''' My understanding that a new service StaticScruncher should take care of that. Right now it is not in production, therefore I use Yahoo to server YUI files. Also I intentionally would like to keep DAS code pretty generic and portable outside of CMS environment, see below.

'''Lat comment:''' It seems there is very little use of WMCore. I suggest to reconsider reusing at least the base server, possibly quite a bit more from WMCore now that it is available - I appreciate it might not have been in place when this code got started.
'''VK reply:''' The usage of WMCore was minimized on purpose due to significant interest in DAS for other scientific domains apart from CMS. This has been discussed with L2 managers up-front. Since none of the bits of DAS code is really require WMCore, including web base server (the DAS web server depends on CherryPy rather on WMCore), I would like to keep DAS core very generic to allow its portability to non-CMS environments.

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:18 lat]:

Review note on DAS: curious about this coding pattern:

{{{
#!python
try:
jsondict = json.loads(data)
except:
jsondict = eval(data)
}}}

Use of {{{eval()}}}, especially without any dictionary restrictions - eval global and local dictionary options - seems very dangerous. It would seem that if someone manages to compromise some other CMS application, they can also contaminate DAS because it will eval the data produced by those sources. Admittedly I am not sure where the "data" can originate here. But if the data is already sanitised, the construct seems unnecessary, so I am assuming it originates from untrusted outside source.

What's the reason falling back on eval? Is the data python-esque json, but not completely legal json? Can we not fix the upstream data source? The above example was from SiteDB, so for that we should just fix the output...

Yes some data services which returns JSON, e.g. overview, is not fully parseable, that's was a reason to have eval. I totally agree that usage of eval is not a best solution is better to fix the upcoming source. But this is sort of chicken and egg problem. If I put strict json.loads, then some services/data will not be accessible and it is reduce value of DAS per-se. So I would rely on L2 decision about fixing DAS code to use json.loads or make a gradual approach to keep eval around until we fix incoming data-service's data.

For your convenience I'm attaching test_monitor.py script which fails in json.load for overview service. Feel free to investigate it more.

from das.

 avatar commented on August 30, 2024

lat: Comments [comment:15 15] and [comment:55 55]: xwho is deprecated and will go away, so even as test may stop being useful.

from das.

 avatar commented on August 30, 2024

lat: Comments [comment:26 26] and [comment:55 55]: Apologies, partly a misunderstanding here. Indeed there are a ton of ports in use - I counted 424 on vocms53 - but they are mostly ESTABLISHED, not LISTEN, so non-bound randomly assigned ports by the operating system. So it looks like DAS just creates a large number of ports to talk with MongoDB. But please do move !MongoDB to the DAS assigned port range. See [https://cms-http-group.web.cern.ch/cms-http-group/activity.html#info assigned port ranges]. Note though default ulimits on the servers are 1024 file descriptors. You'll want to make sure DAS retains enough available descriptors to talk to clients after it's created sockets to talk to !MongoDB.

from das.

 avatar commented on August 30, 2024

lat: Comments [comment:27 27] and [comment:55 55]: We have had servers deliver internally deployed YUI installation for years. I would suggest this is a case of just making it happen. New services really shouldn't regress in this basic a matter. I can provide example code for bundling minimised collapsed YUI, ExtJS and other resources - javascript, css, sprites, etc. - from DQM GUI / Overview, it should take no more than 30 minutes to adapt it for DAS.

from das.

 avatar commented on August 30, 2024

lat: Comments [comment:44 44] plus many others, and [comment:55 55]: Personally my feedback is that this is mission creep.

Basically you would effectively be saying that the one concrete client that funds the development work, CMS, is getting an inferior product because there is unquantified interest from hypothetical other clients. The dependencies on WMCore base server aren't that big, and can certainly be slimmed if necessary. Many issues in deployment would have been avoided if the base server had been used.

I would suggest merely discussing the matter with L2s is not enough. I would suggest we need a statement from the L2s that they actively support this choice. If the latter materialises we will need to redraft the CMSWEB SLA as this is a pretty significant change in direction.

If concrete interest beyond corridor discussions does manifest I'd be quite happy to look at making sure DAS is general and reusable outside CMS. But unless it actually has outside people funded to work on it upfront, I would suggest it needs to be primarily and actively focused on addressing the priorities of its sole client, CMS.

from das.

drsm79 avatar drsm79 commented on August 30, 2024

metson: Replying to [comment:60 lat]:

Comments [comment:44 44] plus many others, and [comment:55 55]: Personally my feedback is that this is mission creep.

Basically you would effectively be saying that the one concrete client that funds the development work, CMS, is getting an inferior product because there is unquantified interest from hypothetical other clients. The dependencies on WMCore base server aren't that big, and can certainly be slimmed if necessary. Many issues in deployment would have been avoided if the base server had been used.

I don't really recall discussions where not using the WMCore stuff was agreed on - I remember there being features DAS needed that weren't in WMCore that now are. I was surprised by how little DAS uses the WMCore pieces for the web interface related stuff, and think that moving to a WMCore base should be part of the next major release. Using WMCore doesn't preclude it's use by other interested parties, since the WMCore code is open source (though don't ask me what license...).

from das.

 avatar commented on August 30, 2024

lat: Comments [comment:18 18] and [comment:56 56]: Thanks for the test case. I suggest to actively report any such issues when you run into them. Most of the data sources can be fixed pretty quickly. Could you please file the particular one for overview in savannah, iguana project, and I'll see it gets fixed?

Where you do find yourself stuck having to use eval from outside sources, you should do so with a restricted dictionary to prevent access to outside program with {{{ eval(x, { "builtins": None }, {}) }}}

I'd like to see at least the latter fix as soon as possible. Fixing upstream sources I can't say much about until we have a concrete list of bugs/tickets for the specific issues and are able to determine how much work they are to fix.

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:61 metson]:

Replying to [comment:60 lat]:

Comments [comment:44 44] plus many others, and [comment:55 55]: Personally my feedback is that this is mission creep.

Basically you would effectively be saying that the one concrete client that funds the development work, CMS, is getting an inferior product because there is unquantified interest from hypothetical other clients. The dependencies on WMCore base server aren't that big, and can certainly be slimmed if necessary. Many issues in deployment would have been avoided if the base server had been used.

I don't really recall discussions where not using the WMCore stuff was agreed on - I remember there being features DAS needed that weren't in WMCore that now are. I was surprised by how little DAS uses the WMCore pieces for the web interface related stuff, and think that moving to a WMCore base should be part of the next major release. Using WMCore doesn't preclude it's use by other interested parties, since the WMCore code is open source (though don't ask me what license...).

Ok, code will be migrated in next release, see https://svnweb.cern.ch/trac/CMSDMWM/ticket/518

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:62 lat]:

Comments [comment:18 18] and [comment:56 56]: Thanks for the test case. I suggest to actively report any such issues when you run into them. Most of the data sources can be fixed pretty quickly. Could you please file the particular one for overview in savannah, iguana project, and I'll see it gets fixed?

Where you do find yourself stuck having to use eval from outside sources, you should do so with a restricted dictionary to prevent access to outside program with {{{ eval(x, { "builtins": None }, {}) }}}

I'd like to see at least the latter fix as soon as possible. Fixing upstream sources I can't say much about until we have a concrete list of bugs/tickets for the specific issues and are able to determine how much work they are to fix.

I applied {{{ eval(x, { "builtins": None }, {}) }}} in a code. While I unable to find overview in savannah. I don't have a clue how it is called and search results for '''overview''' does not yield anything. SiteDB also has this problem, the ticket is https://svnweb.cern.ch/trac/CMSDMWM/ticket/523

from das.

 avatar commented on August 30, 2024

lat: Replying to comment [comment:64 64]: Thanks. The comment you quoted mentioned "Could you please file the particular one for overview in savannah, iguana project, and I'll see it gets fixed?". Let me know if you still can't find the project and I'll post a direct bug report link.

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: I found IGUANA savannah and submitted ticket over there, https://savannah.cern.ch/bugs/index.php?73620

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:58 lat]:

Comments [comment:26 26] and [comment:55 55]: Apologies, partly a misunderstanding here. Indeed there are a ton of ports in use - I counted 424 on vocms53 - but they are mostly ESTABLISHED, not LISTEN, so non-bound randomly assigned ports by the operating system. So it looks like DAS just creates a large number of ports to talk with MongoDB. But please do move !MongoDB to the DAS assigned port range. See [https://cms-http-group.web.cern.ch/cms-http-group/activity.html#info assigned port ranges]. Note though default ulimits on the servers are 1024 file descriptors. You'll want to make sure DAS retains enough available descriptors to talk to clients after it's created sockets to talk to !MongoDB.

I want to clarify this a little bit. I can reassign MongoDB port to DAS range. But I think we need separate port range slot for MongoDB itself. It can be used elsewhere apart from DAS and then it may be a port conflict between services. It is database back-end. By default Mongo uses 27017, 27018 ports. Another point if we will use sharding and spread Mongo across the nodes. In later case a separate port slot would be more convenient. But, I don't mind and will follow whatever you decide, just want to raise a point.

from das.

drsm79 avatar drsm79 commented on August 30, 2024

metson: I agree with Valentin, MongoDB should be given it's own port range (same as CouchDB has 5984), however I'd also like to understand why there are 424 open ports if it's using the two default ports (27017, 27018)...

from das.

 avatar commented on August 30, 2024

lat: Regarding [comment:67 ports], there's no problem assigning another port range to MongoDB, but I'd rather not use ports above 10000. We still get machines reallocated from "special" netblocks within CERN, with pass-through permissions for ports in 10000-30000 range.

We've had servers supposedly protected by the global firewall accessed from out in the wild in these high ports, just because the machine was some grid server in some previous life before being reallocated to us, and the global firewall had some huge gaping hole for the netblock it was in.

For example port range 8230 - 8239, just above DAS, is currently not reserved for anything. Can we relocate MongoDB there?

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: I can use this port slot, no problem.

from das.

 avatar commented on August 30, 2024

lat: Regarding [comment:68 open connections], they are connected sockets, i.e sockets between DAS and MongoDB. Just ssh to [email protected] and run {{{netstat -tanlp | grep ESTABLISHED | grep 27017}}} to see them. We have currently:

{{{
$ netstat -tanlp | grep ESTABLISHED | grep 27017 | awk '{print $NF}' | sort | uniq -c
212 4500/mongod
138 4860/python
74 4875/python
}}}

Why there are that many I can't answer. Maybe every DAS thread creates some number of connections? Note that half of the sockets are for python side, the other half is the mongod side, as shown above.

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:69 lat]:

Regarding [comment:67 ports], there's no problem assigning another port range to MongoDB, but I'd rather not use ports above 10000. We still get machines reallocated from "special" netblocks within CERN, with pass-through permissions for ports in 10000-30000 range.

We've had servers supposedly protected by the global firewall accessed from out in the wild in these high ports, just because the machine was some grid server in some previous life before being reallocated to us, and the global firewall had some huge gaping hole for the netblock it was in.

For example port range 8230 - 8239, just above DAS, is currently not reserved for anything. Can we relocate MongoDB there?

Please open new tickets for new features defects, rather keep a monster thread. This task has been assigned to ticket #528

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:70 valya]:

I can use this port slot, no problem.

This will be tracked separately, see #529

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: I walk through all of the issues shown originally in this ticket. The attached patch address issues with bin area, input parameter validation, JSON vs eval, templates, URL quoting., wrap2das, etc. It also fix the following tickets: #493, #494, #446, #449, #447

All other issues are relocated to stand-alone tickets: #528, #529, #523, #398, #452

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: (In 29f1b00) Work on code based on Code Audit, fixes #290, #493, #494, #446, #449, #447

Signed-off-by: Valentin Kuznetsov [email protected]

from das.

 avatar commented on August 30, 2024

lat: There's a few issues with the patch, so I am re-opening the ticket. If you want to split these off to new tickets, that's fine, but I thought the comments belong here. Am adding them as individual comments for easier reference.

from das.

 avatar commented on August 30, 2024

lat: Here in das_map, you should take $dir as command line argument, not make checks on host name:

{{{
#!diff
+if [ hostname -d == "cern.ch" ]

  • dir=/data/projects/das/config/maps
    +else
  • dir=$DAS_ROOT/src/python/DAS/services/maps
    +fi
    }}}

from das.

 avatar commented on August 30, 2024

lat: Comment [comment:12 12] wasn't really answered, the {{{dassh}}} was just removed. Is ipython dependency completely removed now, in RPM rules as well, and ipython is completely irrelevant to DAS?

from das.

 avatar commented on August 30, 2024

lat: Comment [comment:10 10] still needs addressing. Will we see init scripts folded into DAS 'manage'?

from das.

 avatar commented on August 30, 2024

lat: Comment [comment:18 18] follow-up: thanks for switching to the new {{{eval()}}} scheme. Though I would note that the {{{eval()}}} isn't inside {{{try ... except}}} - it probably should be. Or are you deliberately intending it to raise an exception?

from das.

 avatar commented on August 30, 2024

lat: Comment [comment:21 21] follow-up: I'd still find {{{isinstance}}} more readable. (Cf. PEP 8.)

from das.

 avatar commented on August 30, 2024

lat: Comment [comment:23 23] follow-up: I guess it wasn't clear enough, but "session arguments" meant "session" and "version". All you need is the actual data arguments. Also would prefer they were deleted, not just commented out.

from das.

 avatar commented on August 30, 2024

lat: I didn't understand the addition of urllib quoting in, for example, das_table.tmpl. Shouldn't you use encodeURIComponent in javascript code / arguments, and urllib when quoting something originating from DAS server itself? To me it seems you are now sometimes quoting javascript itself, not the javascript variable value.

from das.

 avatar commented on August 30, 2024

lat: Also I note here that the quoting wasn't added universally everywhere - not in all templates, and not even systematically in the one example I happened to quote, das_table.tmpl. As I wrote before, it looks like every template needs to be sanitised. I can't easily tell which values are safe.

from das.

 avatar commented on August 30, 2024

lat: I would suggest to remove code, not comment it out. There can be exceptions, but I'd like to see most of the commented out code removed. There's version management history if you need to go back; the history doesn't need to be in the code itself.

from das.

 avatar commented on August 30, 2024

lat: Thank you for adding {{{checkargs}}} to verify parameters. It has a few flaws I'd like to see fixed:

  • You don't use what you verify. Some arguments are casted to strings (str(x)) before checking. You should instead verify what you will use.
  • You should type check all arguments for reasons above. A keyword argument can be None (not given), a string (given once), or a list (if given several times).
  • Contents of many, but not all arguments are checked. I didn't see any additional checking added for remaining arguments elsewhere so it looks like several vulnerabilities remain. You should always sanitise all arguments. Even if the argument is free form input, you can often make sure it only consists of certain legitimate characters (e.g. letters only).
  • Failure to verify arguments should raise an exception.
  • Failure to check an argument should not return the argument value back to caller. This is unsafe; you don't know what the value contains, and you just determined it's not valid. Returning the value to caller can be used to create XSS and other attacks. My general preference is to never return anything to the caller - you simply return suitable HTTP status code.
  • It's not sanitising the HTTP method; note that 'method' keyword argument is not the same as the request method!

from das.

 avatar commented on August 30, 2024

lat: Quite a number of comments above had no response or follow-up, and were not addressed in the patch or new tickets created. Should I assume closing the ticket means the comments are dismissed?

from das.

 avatar commented on August 30, 2024

lat: Forgot to mention that the logger change in json_parser() seems to have a bug as it calls logger.warining, not .warning.

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:76 lat]:

There's a few issues with the patch, so I am re-opening the ticket. If you want to split these off to new tickets, that's fine, but I thought the comments belong here. Am adding them as individual comments for easier reference.

Lassi I prefer individual tickets, so I'll walk through your comments and create them. The closing of this one was done automatically when patch was committed. So it is not intentional and neither means dismissing issues.

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:78 lat]:

Comment [comment:12 12] wasn't really answered, the {{{dassh}}} was just removed. Is ipython dependency completely removed now, in RPM rules as well, and ipython is completely irrelevant to DAS?

ipython is not part of DAS. I used to manage mongo. So we can safely remove this dependency. This should be done in das.spec file and therefore doesn't reflect in a patch.

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:80 lat]:

Comment [comment:18 18] follow-up: thanks for switching to the new {{{eval()}}} scheme. Though I would note that the {{{eval()}}} isn't inside {{{try ... except}}} - it probably should be. Or are you deliberately intending it to raise an exception?

Yes, I want to get exception if eval is not succeed. I don't know how to handle input data if I can't parse it and throwing exception is a best way to address the unparseable data.

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:81 lat]:

Comment [comment:21 21] follow-up: I'd still find {{{isinstance}}} more readable. (Cf. PEP 8.)

Open new ticket, https://svnweb.cern.ch/trac/CMSDMWM/ticket/541

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:86 lat]:

Thank you for adding {{{checkargs}}} to verify parameters. It has a few flaws I'd like to see fixed:

  • You don't use what you verify. Some arguments are casted to strings (str(x)) before checking. You should instead verify what you will use.
  • You should type check all arguments for reasons above. A keyword argument can be None (not given), a string (given once), or a list (if given several times).
  • Contents of many, but not all arguments are checked. I didn't see any additional checking added for remaining arguments elsewhere so it looks like several vulnerabilities remain. You should always sanitise all arguments. Even if the argument is free form input, you can often make sure it only consists of certain legitimate characters (e.g. letters only).
  • Failure to verify arguments should raise an exception.
  • Failure to check an argument should not return the argument value back to caller. This is unsafe; you don't know what the value contains, and you just determined it's not valid. Returning the value to caller can be used to create XSS and other attacks. My general preference is to never return anything to the caller - you simply return suitable HTTP status code.
  • It's not sanitising the HTTP method; note that 'method' keyword argument is not the same as the request method!

See new ticket #542

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:87 lat]:

Quite a number of comments above had no response or follow-up, and were not addressed in the patch or new tickets created. Should I assume closing the ticket means the comments are dismissed?

Lassi, I would prefer individual tickets, since it's much easier to trac than monolitic thread where you can easily lost not because of dismissing issue, but just unintentionally pass it. Feel free to open tickets for those comments which are not addressed. Having individual tickets also simplify code patching/management.

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:88 lat]:

Forgot to mention that the logger change in json_parser() seems to have a bug as it calls logger.warining, not .warning.

Thanks for catching this typo. It's fixed now.

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:79 lat]:

Comment [comment:10 10] still needs addressing. Will we see init scripts folded into DAS 'manage'?

Working on it, now in separate ticket #543

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:82 lat]:

Comment [comment:23 23] follow-up: I guess it wasn't clear enough, but "session arguments" meant "session" and "version". All you need is the actual data arguments. Also would prefer they were deleted, not just commented out.

Now under #544

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:84 lat]:

Also I note here that the quoting wasn't added universally everywhere - not in all templates, and not even systematically in the one example I happened to quote, das_table.tmpl. As I wrote before, it looks like every template needs to be sanitised. I can't easily tell which values are safe.

This and previous comment now under separate ticket, #545

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:85 lat]:

I would suggest to remove code, not comment it out. There can be exceptions, but I'd like to see most of the commented out code removed. There's version management history if you need to go back; the history doesn't need to be in the code itself.

Now under #546

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:76 lat]:

There's a few issues with the patch, so I am re-opening the ticket. If you want to split these off to new tickets, that's fine, but I thought the comments belong here. Am adding them as individual comments for easier reference.

Lassi please review this one more time. I reassigned all your comments to separate tickets. If I miss something please fire up new ticket. We can keep this ticket open as you wish until all other ones will be addressed. Since we put this into DAS-Commissioning milestone the milestone can be completed once all tickets assigned to it will be closed.

from das.

 avatar commented on August 30, 2024

lat: I'd prefer not to go through the comments one by one myself. I'd suggest you do that. Just start from the top and create a ticket for every item that wasn't yet covered.

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:15 lat]:

Review note on DAS: xwho is obsolete and probably should be converted to use xldap.cern.ch. See {{{phonebook}}} command on any CERN linux system. It's just a perl script which makes a simple LDAP query to xldap.cern.ch. Note that xldap.cern.ch is only available within CERN, (testing) outside CERN you'll need (SSL + password authenticated) connection to ldap.cern.ch instead.

All YML files will move into SITECONF, see ticket #546. Xwho will not be present over there.

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:77 lat]:

Here in das_map, you should take $dir as command line argument, not make checks on host name:

{{{
#!diff
+if [ hostname -d == "cern.ch" ]

  • dir=/data/projects/das/config/maps
    +else
  • dir=$DAS_ROOT/src/python/DAS/services/maps
    +fi
    }}}

Ticket #550

from das.

vkuznet avatar vkuznet commented on August 30, 2024

valya: Replying to [comment:16 lat]:

Review note on DAS: would suggest to generally consider slight flattening of name space. "from DAS.utils.utils import foo" seems a little redundant :-)

I don't think it is relevant, it's a matter of preferences. DAS code structure was established based on WMCore rules. WMCORE has similar long namespaces. I'll skip it, since it involves a large redesign of dir structure, changing all the names, etc.

from das.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.