uabrc / cri_xcbc
This project forked from jprorama/cri_xcbc
Home of the XSEDE Compatible Basic Cluster
License: MIT License
Error when starting an RStudio Server after the install of ood_updated_rstudio:
undefined local variable or method `version' for #<BatchConnect::SessionContext:0x0000000003183f78>
* The RStudio Server session data for this session can be accessed under the staged root directory.
Per Louis, the fix is to change version.split to r_version.split in roles/ood_updated_rstudio/files/submit.yml:
diff --git a/roles/ood_updated_rstudio/files/submit.yml b/roles/ood_updated_rstudio/files/submit.yml
index a2a174c..d292772 100644
--- a/roles/ood_updated_rstudio/files/submit.yml
+++ b/roles/ood_updated_rstudio/files/submit.yml
@@ -8,7 +8,7 @@ script:
- "--mem-per-cpu=<%= bc_num_mems.blank? ? 4 : bc_num_mems.to_i %>G"
- "--partition=<%= bc_partition %>"
- "--time=<%= bc_num_hours.blank? ? 1 : bc_num_hours.to_i %>:00:00"
- - "--job-name=ood-r-<%= version.split("/")[-1] %>"
+ - "--job-name=ood-r-<%= r_version.split("/")[-1] %>"
<%- if bc_partition.include? "pascalnodes" -%>
- "--gres=gpu:1"
<%- end -%>
Currently users can only run interactive apps on a subset of partitions. Recent tickets show legitimate use cases for running workflows in RStudio with more than 190GB of RAM (especially genomics pipelines).
Can we get largemem, largemem-long, and amd-hdr100 added to the list of choices?
Ideally this would be automated and pull from scontrol, but that looks like a longer-term solution involving a fair bit of Ruby coding.
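As a rough illustration of the automated approach, the sketch below parses the text that `scontrol show partition --oneliner` prints (one record per line, as space-separated Key=Value pairs) and extracts the partition names. It is written in Python rather than Ruby purely for illustration; the function names and the sample text are assumptions, and values containing spaces are not handled.

```python
def parse_partitions(scontrol_output: str) -> list:
    """Turn `scontrol show partition --oneliner` text into one dict per partition."""
    partitions = []
    for line in scontrol_output.splitlines():
        line = line.strip()
        if not line.startswith("PartitionName="):
            continue  # skip blank lines and anything that is not a partition record
        record = {}
        for token in line.split():
            key, sep, value = token.partition("=")
            if sep:  # keep only well-formed Key=Value tokens
                record[key] = value
        partitions.append(record)
    return partitions


def partition_names(scontrol_output: str) -> list:
    """Return the partition names in the order scontrol reported them."""
    return [p["PartitionName"] for p in parse_partitions(scontrol_output)]


# Illustrative sample only; real output carries many more fields per record.
SAMPLE = (
    "PartitionName=largemem AllowGroups=ALL State=UP\n"
    "PartitionName=largemem-long AllowGroups=ALL State=UP\n"
    "PartitionName=amd-hdr100 AllowGroups=ALL State=UP\n"
)

print(partition_names(SAMPLE))
```

On a node with Slurm installed, the input text could come from `subprocess.run(["scontrol", "show", "partition", "--oneliner"], capture_output=True, text=True).stdout`.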
We need to change the current behavior of reading /etc/motd into the default page and instead use a markdown file included in the repo. The /etc/motd file isn't formatted for a web page.
On SAS startup there is a "Quickstart" overview, and clicking on the button leads to an error message that the local document browser connection failed.
The local URL does appear to work if a browser is opened manually, but the SAS link doesn't open the browser. Not sure if that's a SAS issue or something related to how our desktops are configured.
OOD is logging the following error during initialization under /var/log/nginx/${USER}/error.log on the OOD node:
App 179829 stderr: --> No passenger_native_support.so found for current Ruby interpreter.
App 179829 stderr: This library provides various optimized routines that make
App 179829 stderr: Phusion Passenger faster. Please run 'sudo yum install passenger-devel-4.0.50'
App 179829 stderr: so that Phusion Passenger can compile one on the next run.
App 179829 stderr: --> Continuing without passenger_native_support.so.
It looks like OOD isn't locating the library, even though it exists:
$ rpm -q passenger
passenger-4.0.53-4.el7.x86_64
$ rpm -ql passenger | grep passenger_native_support.so
/usr/lib64/passenger/passenger_native_support.so
$ ls -l /usr/share/passenger/phusion_passenger.rb
-rw-r--r-- 1 root root 10435 Mar 22 2016 /usr/share/passenger/phusion_passenger.rb
Right now the web form requires the user to input memory per CPU, which is all well and good. New users frequently miss the "per CPU" part, no matter how much instruction we provide up front. Having a UI element that displays a read-only computed memory TOTAL (i.e. CPU * mem/CPU) would be helpful for avoiding this issue.
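The arithmetic behind the proposed read-only field is small enough to sketch. The function names and the summary wording below are assumptions for illustration; an actual form widget would have to run in the browser, so this only shows the computation and a possible display string.

```python
def total_memory_gb(num_cpus: int, mem_per_cpu_gb: int) -> int:
    """Total memory requested by the job: CPUs times memory per CPU."""
    if num_cpus < 1 or mem_per_cpu_gb < 1:
        raise ValueError("CPU count and memory per CPU must both be at least 1")
    return num_cpus * mem_per_cpu_gb


def memory_summary(num_cpus: int, mem_per_cpu_gb: int) -> str:
    """Text for a read-only field that makes the 'per CPU' part explicit."""
    total = total_memory_gb(num_cpus, mem_per_cpu_gb)
    return f"{num_cpus} CPU(s) x {mem_per_cpu_gb} GB per CPU = {total} GB total"


print(memory_summary(4, 8))
```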
The /register link works as intended for users without "@uab.edu" in their REMOTE_USER but users who have "@uab.edu" see the default failure to map error message.
I suspect this is due to the parsing in our .regex mapping function. We likely need to update it to check for actual accounts being defined.
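One way to check for an actual account, sketched in Python under stated assumptions: strip the "@uab.edu" suffix, then map the user only if the resulting name exists in the system account database. The function names are hypothetical, and the real mapping script may be structured quite differently.

```python
import pwd
import re
from typing import Callable, Optional

# Matches a trailing "@uab.edu", ignoring case.
UAB_SUFFIX = re.compile(r"@uab\.edu$", re.IGNORECASE)


def account_exists(username: str) -> bool:
    """True if the local account database (passwd, LDAP via NSS, ...) knows the user."""
    try:
        pwd.getpwnam(username)
    except KeyError:
        return False
    return True


def map_remote_user(remote_user: str,
                    exists: Callable[[str], bool] = account_exists) -> Optional[str]:
    """Return the local username for REMOTE_USER, or None if no account is defined."""
    candidate = UAB_SUFFIX.sub("", remote_user.strip())
    if candidate and exists(candidate):
        return candidate
    return None  # the caller could send the user to /register instead of failing
```

The `exists` parameter is there so the lookup can be replaced with a stub in tests or pointed at a different account source.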
Issue:
Lmod tracking stopped logging after Oct 27th. The issue seems to be with reading the config file that holds the MySQL credentials for the lmod DB. This led to the lmod tracking data not being ingested into the database from the lmod log files. After a certain period, log rotation started clearing old files whose data had not yet been pushed into the MySQL DB because of the issue with reading the encoded password. However, we've been rsyncing /var/log on cm01 to a backup location on /data/rc nightly. Mike (@flakrat) copied all of the missing moduleUsage.log-%DATE% files to /data/rc/lmod_logs. We should use them to restore the missing data into the DB.
Workaround
Modify the logrotate config to move old files to a different location after rotation, so that log files are preserved until we fix this issue.
Solution
Fix the error where the python-mysql client is unable to read the credentials from the config file.
After integrating Shibboleth, clicking Logout directs to:
https://idp.my-center.edu/idp/profile/Logout
We need to identify the correct URL for our site / UAB.
We made the GPU partition available with a single GPU for simplicity of development.
Now it seems that users need more control over the number of GPUs when using the VNC desktop.
We are using Shibboleth as the authentication method, yet the following message is still triggered when using the Safari web browser:
As currently configured, the Cluster and Interactive Apps of Open OnDemand do not work with Safari. This is due to a bug in Safari with using websockets through servers protected using "Basic" auth. Open OnDemand can be installed with another authentication mechanism such as Shibboleth or OpenID Connect. If "Basic" auth is required, Mac users can connect with other browsers like Chrome or Firefox. Please contact this site’s technical support and report this message.
Max memory per partition is 128GB per CPU, so users can't request 1 CPU and 1024GB on largemem, for example. We might want to consider restricting memory selections based on the partition chosen. It isn't clear how to make this work in a fully automated way (i.e. using scontrol), but I imagine the solution to this may be related to the complete solution to #86.
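The restriction can be sketched as a simple check, using the 128GB per-CPU figure and the 1 CPU / 1024GB example from above. The function names are hypothetical, and where the limit would come from (a hard-coded table or scontrol output) is left open.

```python
def request_fits(num_cpus: int, mem_per_cpu_gb: int, max_mem_per_cpu_gb: int) -> bool:
    """True if the per-CPU memory request stays within the partition's per-CPU cap."""
    return num_cpus >= 1 and 1 <= mem_per_cpu_gb <= max_mem_per_cpu_gb


def min_cpus_for_memory(total_mem_gb: int, max_mem_per_cpu_gb: int) -> int:
    """Smallest CPU count that can carry total_mem_gb under the per-CPU cap."""
    # Ceiling division without floating point.
    return -(-total_mem_gb // max_mem_per_cpu_gb)


# 1 CPU with 1024 GB exceeds a 128 GB per-CPU cap ...
print(request_fits(1, 1024, 128))
# ... and at least this many CPUs are needed to reach 1024 GB in total.
print(min_cpus_for_memory(1024, 128))
```

A form could use the second function to suggest a CPU count instead of simply rejecting the request.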
<link rel="shortcut icon" type="image/x-icon" href="/public/favicon.ico">
There is no "favicon.ico" in the rc.uab.edu/public repo.
The "eye" icon is missing from the button that accepts the connect to jupyter notebook action. Here's the code on the web page:
Here's the reference to the image as it appears on the HPC desktop
It's basically the same icon, so it's odd that it's not appearing. Did a name change occur with the upgrade that would require us to update the Jupyter repo clone?
It would be nice if the mandatory job requirements of the form (for Interactive Apps) came from a default template so that edits are made in one place as the underlying scheduler changes.
For example, with three apps deployed, we have three completely different forms with a number of job resources that don't apply to our scheduler configuration:
Note that the playbook for Jupyter was already modified once to align better with our scheduler to include a hard coded list of partitions:
We need to move the steps that create our site ood personality into a dedicated playbook to make applying them and managing different environments easier.
For example, the following roles in ood.yaml are more specifically related to personalizing the prod site and could be considered separate from building or deploying a stock ood node:
ood_auth_regex
ood_uab_ui
ood_easter_egg
ood_static_user_reg
It seems that isolating these roles into a personality playbook could enable a site specific set of tasks that could then be easily applied to any stock (rpm released) ondemand environment.
The user attempted to log in using their mixed-case upstream ID and it failed with:
Error -- user doesn't exist: JohnDoe
They also tried all lowercase; however, the ID still comes back from the identity provider in mixed case.
Usernames on the HPC cluster are all lowercase.
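Since cluster usernames are all lowercase, one possible fix is to lowercase the ID returned by the identity provider before looking up the account. The sketch below is an assumption about how that could look; the function names and the account list are illustrative only.

```python
from typing import Iterable, Optional


def normalize_username(upstream_id: str) -> str:
    """Lowercase the upstream ID so it matches the cluster's all-lowercase usernames."""
    return upstream_id.strip().lower()


def find_account(upstream_id: str, cluster_accounts: Iterable[str]) -> Optional[str]:
    """Return the matching cluster username, or None if there is no such account."""
    wanted = normalize_username(upstream_id)
    return wanted if wanted in set(cluster_accounts) else None


# "JohnDoe" is the ID from the error message above; the account list is made up.
print(find_account("JohnDoe", ["johndoe", "asmith"]))
```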
Loading the /register page in a Chrome incognito session showed the link for OSC Open OnDemand, but the "powered by" logo was missing.
We had an issue when trying to set the default value for GPU selection.
Further investigation shows that there are duplicate fields in the form file: https://github.com/uabrc/CRI_XCBC/blob/uab-prod/roles/ood_vnc_form/templates/cluster.yml.erb.j2?plain=1#L25 and https://github.com/uabrc/CRI_XCBC/blob/uab-prod/roles/ood_vnc_form/templates/cluster.yml.erb.j2?plain=1#L40
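A quick way to catch this kind of duplication is to scan the form file for repeated list items. The sketch below assumes the duplicates appear as repeated `- name` entries, which may not match the actual template; the sample text and the `num_gpus` field name are hypothetical.

```python
import re
from collections import Counter

# A YAML list item consisting of a bare identifier, e.g. "  - bc_partition".
ITEM = re.compile(r"^\s*-\s*([A-Za-z_]\w*)\s*$")


def duplicate_form_fields(form_text: str) -> list:
    """Return field names that appear more than once, in first-seen order."""
    names = []
    for line in form_text.splitlines():
        match = ITEM.match(line)
        if match:
            names.append(match.group(1))
    counts = Counter(names)
    return [name for name, count in counts.items() if count > 1]


# Hypothetical form snippet with one field listed twice.
SAMPLE_FORM = """\
form:
  - bc_num_hours
  - num_gpus
  - bc_partition
  - num_gpus
"""

print(duplicate_form_fields(SAMPLE_FORM))
```

A line scan is used instead of a YAML parser because the template mixes ERB and Jinja2 markup, which a parser would reject before rendering.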