
Deploy CODA at neuropoly [CLOSED]

namgo commented on August 21, 2024
Deploy CODA at neuropoly

from coda.

Comments (24)

namgo commented on August 21, 2024

Small update on this: Louis-Francois will continue working on CODA today, and we'll probably have a call on Wednesday. I offered to set up some more of the system, but he stressed the importance of learning how it works himself, which I'm very happy with :)

namgo commented on August 21, 2024

We could probably use Bireli as our toss-at-projects-that-need-root machine in the future since it's largely unused. We'd want to discuss this formally though.

louisfb01 commented on August 21, 2024

tl;dr: We cannot enable HTTPS since we are deploying caprover locally, which ends up causing SSL issues (see below).

Right now, we create a single caprover instance in one VM and deploy both the hub and the site on it. When deploying the hub for coda, we change the hub.coda domain to http://captain.captain.localhost/ within the caprover instance, as mentioned in the caprover documentation, but HTTPS cannot be enabled on localhost. This makes the CODA hub-deployer crash because SSL is not enabled.

Possible solution: Skip the SSL check for coda when deployed locally. Maybe create a branch for local deployment alternative for QA testing at Neuropoly?
Question: Can it be done, and how complicated would this be?
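
To make the "skip the SSL check when deployed locally" idea concrete, here is a minimal, language-neutral sketch written in Python. It is not taken from the CODA codebase: the function name `is_local_deployment` and the rule it applies (a `localhost` name, a loopback address, or a private address counts as local) are assumptions for illustration only.

```python
import ipaddress
from urllib.parse import urlsplit


def is_local_deployment(url):
    """Guess whether a base URL points at a local/private deployment.

    Hypothetical helper: the real deployer may need a different rule.
    """
    host = urlsplit(url).hostname or ""
    # Names such as captain.captain.localhost are treated as local.
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # a regular DNS name: assume a public deployment
    return ip.is_loopback or ip.is_private


# A caller could then decide whether to enforce the HTTPS check.
enforce_https = not is_local_deployment("http://captain.captain.localhost/")
print(enforce_https)
```

An explicit opt-in setting would probably be safer than guessing from the hostname, since skipping the check silently on a misconfigured production host would be worse than the crash.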

[Screenshot 2023-09-08 at 10 39 29 AM]

[Screenshot 2023-09-07 at 2 33 10 PM]

kousu commented on August 21, 2024

In the CODA deployment guide, they state

> In order to deploy the CODA platform sandbox using CapRover, you will need a registered domain name and access to DNS settings. Throughout this guide, we will use coda-platform.com as the example base domain.

We have DNS we could use: @namgo has access to Namecheap and could give you subdomains, say, *.coda.neuropoly.org. But since they use letsencrypt by default to get SSL working, you will run into the problem that our network admins are skeptical about opening ports (e.g. https://github.com/neuropoly/computers/issues/320, https://github.com/neuropoly/computers/issues/337) and will drag their feet on doing it for you. If you can make the case to our network admins clearly enough, you should be able to get them to open the ports, and then you'd be fine. Alternatively, maybe there is a setting in their deployment script that would let you switch to the DNS challenge, but I don't think there's a Namecheap API that would work with that, so we'd have to delegate coda.neuropoly.org to a different DNS host, and that'd be pretty tricky.


I'm confused. Those instructions don't seem to address deployment behind a firewall. I must be missing something: isn't the target audience researchers working at institutions? Institutional firewalls are always strict and always block letsencrypt. I also don't understand why they suggest deploying outside of the firewall

> As an alternative, you can deploy a VM with CapRover pre-installed on DigitalOcean.

because DigitalOcean is not covered by any kind of NDA, data sharing agreement, or PII protection plan that any institution would agree to. And isn't the point of CODA that everyone can keep and analyse their data locally without having to break their PII plans?


If you grep the CODA codebase for caprover serversetup (it might be written ["caprover", "serversetup"] or ["caprover"] + ["serversetup"] or {'caprover', 'serversetup'} or one of many variations, watch out!) and find where that happens, you should be able to patch it to add skipVerifyingDomains, as the caprover instructions say.
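
A plain grep for the literal string would miss the quoted and concatenated variants listed above. As a rough sketch (the function name and the 12-character gap are my own choices, not anything from CODA), a small Python scan can match "caprover" followed by "serversetup" with only punctuation or whitespace in between:

```python
import re
from pathlib import Path

# "caprover", then 1 to 12 non-word characters (spaces, quotes, commas,
# brackets, plus signs), then "serversetup". The gap size is an assumption.
PATTERN = re.compile(r"caprover\W{1,12}serversetup", re.IGNORECASE)


def find_serversetup_calls(root):
    """Return (path, line number, stripped line) for each matching line."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling `find_serversetup_calls(".")` from a checkout would list candidate lines. The scan works line by line, so an invocation whose arguments are split across several lines would still be missed.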

kousu commented on August 21, 2024

We delete

https://github.com/neuropoly/computers/blob/16a6de2b6114ee49f694de51b0354320c51c141a/ansible/hosts#L19

and then we have to manually uninstall netdata and maybe some other stuff like tmpreaper and sssd (roles/neuropoly.grames) -- ansible doesn't have any way to propagate deleting that line into removing everything that line implied.

kousu commented on August 21, 2024

> We could probably use Bireli as our toss-at-projects-that-need-root machine in the future since it's largely unused. We'd want to discuss this formally though.

I'd rather handle "projects-that-need-root" with neuropoly/computers#461! And/or containers. podman can handle 99% of what people think they need root for but don't really.

The CODA project is an exception, because they have a multi-stage setup and ansible scripts that make strong assumptions. What we could be doing is running our own internal cloud (maybe openstack but neuropoly/computers#461 is enough) that we can point such scripts at. And also we could be working with them to see if they can adapt their ansible scripts into ansible roles that might be willing to play nicer with pre-existing ansible deployments.

namgo commented on August 21, 2024

I got the go-ahead from Julien to work with Louis-Francois on CODA. I think my first step will be going over the requirements with Louis-Francois, to figure out what we actually need and how to set it up.

namgo commented on August 21, 2024

CODA provides deployment documentation (https://github.com/coda-platform/guides-and-policies/tree/main/guides/deployment).

I misunderstood some parts of this project: we do in fact need a GPU node (which Bireli provides), and the system requirements for individual nodes are pretty steep.

I'm going to have a call with one of the developers of CODA and Louis-Francois next week.

namgo commented on August 21, 2024

CODA would be really difficult to provision without caprover. Their caprover config (https://github.com/coda-platform/site-deployer) might as well be a script that tells docker what to deploy and which ports are where, but I don't know if it's worth the effort/risk to try to rewrite their system in ansible.

They are planning to use ansible in the long-term for prod but are not at that point yet.

I'm going ahead and removing Bireli from our ansible inventory so I can deploy docker to it (https://github.com/neuropoly/computers/commit/344cb13fa749eae75ce3e9f12a4869355a8d9fd1).

namgo commented on August 21, 2024

I'm adding a new dnsmasq config, coda-resolver, which I'll run directly for now; it should make subdomains of .coda resolve locally. I'm not sure how well this will work with remote connections, we'll see (it won't, but I might have a workaround).

I got confused about the site-deployer and installed it directly on bireli; I'm installing it in VMs now.

namgo commented on August 21, 2024

Alright, so I have a working-ish solution where Bireli resolves hub and site1 locally, but this is only going to work on Bireli specifically. If Hub and Site1 need to talk to each other this isn't going to work.

kousu commented on August 21, 2024

@namgo can you pull out whatever apt commands and config file edits you made and post them in here? I can probably pull it out but it'd be faster if you could remember.

Once that's here to examine, I suspect there is a way to combine it with neuropoly/computers#461! It sounds like you didn't have to make too many invasive changes in the end -- you set up libvirt (i.e. neuropoly/computers#461) and dnsmasq, and those should fit into our ansible :) The bulk of the work in doing CODA is going to be inside those VMs, and I think we can probably leave that out of ansible for the foreseeable future.

namgo commented on August 21, 2024

@kousu this one's fairly non-invasive you're right, but it's not stable by any means. I'm using dnsmasq as a non-daemon process that I have running in a tmux session to ensure that each VM has a domain name addressable by Bireli.

It's not an unprivileged VM setup as I haven't given anyone libvirt group access.

Nevertheless, documenting is good, you're right!

I run

dnsmasq -R --interface=docker0 --except-interface=lo0 -d -C coda-resolver --bind-interfaces

with a config file coda-resolver containing:

address=/hub.coda/192.168.122.2
address=/site1.coda/192.168.122.3
interface=docker0

The two VMs were set up manually and I assigned each an address.
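
Since the addresses were assigned by hand, the config could be generated from a single mapping if more VMs are added later. This is only a sketch: the helper `render_dnsmasq_config` is hypothetical, and it emits nothing beyond the `address=` and `interface=` lines shown above.

```python
import ipaddress


def render_dnsmasq_config(hosts, interface):
    """Build dnsmasq config text: one address= line per host, then interface=."""
    lines = []
    for name, ip in hosts.items():
        ipaddress.ip_address(ip)  # raises ValueError on a malformed address
        lines.append(f"address=/{name}/{ip}")
    lines.append(f"interface={interface}")
    return "\n".join(lines) + "\n"


# Host names and addresses copied from the coda-resolver file above.
config = render_dnsmasq_config(
    {"hub.coda": "192.168.122.2", "site1.coda": "192.168.122.3"},
    interface="docker0",
)
print(config, end="")
```

Writing the returned text to coda-resolver should reproduce the file quoted above.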

namgo commented on August 21, 2024

Just had a call with Louis-Francois, looks like we can up the memory and cpu count of the hub.coda VM and get rid of site1.

He'll be working on this tomorrow so I'll document further discussion here.

namgo commented on August 21, 2024

We had a call with our contact at the CODA project; he found out that caprover might not work well with snap's docker... I've redirected DNS to the other ubuntu VM I set up on bireli for this project and am re-installing docker on it from the repository.

(somewhat surprisingly? docker is weird) it works now!

namgo commented on August 21, 2024

Louis-Francois Bouchard is on vacation at the moment, so we'll resume when he gets back.

namgo commented on August 21, 2024

Louis-Francois and I have been working at this. We've run into a bit of a snag where the repo is called every time we initialize a container, and the repo has security checks that prevent us from working locally (ensuring "captain" is running with https enabled, regardless of whether or not it is behind a firewall, is probably the first of many such checks).

Our two options appear to be:

  • connect with the developers again for some help
  • try to patch the docker init script and hope we keep notes

Louis-Francois and I have opted for the first option because there's probably something we're not seeing.

namgo commented on August 21, 2024

I got a message from Louis-Francois (originally from a developer) whose suggestion and question made me wonder whether a valid SSL certificate is actually required, or whether we only need some certificate to pass the check. I may well still be misunderstanding the system, but I am going to try to force https on captain and see what happens. Gonna try that in 20 minutes or so.

louisfb01 commented on August 21, 2024

This is what we get when trying to enable https (as needed) on the caprover instance:

[Screenshot 2023-09-06 at 1 56 41 PM]

namgo commented on August 21, 2024

Excellent writeup @louisfb01! Your framing of the solution was good and got me thinking that we might be able to import the docker/caprover deployment system into this repository and comment out the checks as a temporary measure, like you're suggesting.

I feel that before we do this, we need to get confirmation that this is the right way to go.

@louisfb01 Would you be comfortable reading through the docker deployment system and taking note of any checks like the https one? I want to make sure that there aren't going to be any further surprises :D just in case.

louisfb01 commented on August 21, 2024

Update: Managed to deploy all of coda hub and site on the same caprover instance.

I had to fork most repositories to remove SSL verifications and update some npm packages. Louis also helped me with other bugs for the site-deployer with the stats-api repo, which he pushed to the main repo.

Now we only need to confirm that everything works as expected!

namgo commented on August 21, 2024

Looks like we're going to have some trouble accessing the containers over the network, so my suggestion is to script a socat redirect from the VM's (external net) ip to the VM's docker network. I gave Louis-Francois the basic ideas and would recommend he reads up on http://www.dest-unreach.org/socat/ but if he doesn't get to it, I'll be able to.
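
As a starting point for scripting that redirect, here is a small sketch that only builds the socat command line rather than running it. The helper name is hypothetical, and the container address 172.17.0.2 is a made-up example on docker's default bridge range; 192.168.122.2 is the hub.coda VM address from the dnsmasq config earlier in this thread. `TCP-LISTEN` with `bind`, `fork` and `reuseaddr`, and `TCP:<host>:<port>`, are standard socat address forms.

```python
import shlex


def socat_forward_command(listen_ip, listen_port, target_ip, target_port):
    """Return the argv for a socat TCP forwarder from listen_ip to target_ip."""
    # fork: handle each incoming connection in a child process.
    # reuseaddr: allow a quick restart on the same port.
    listen = f"TCP-LISTEN:{listen_port},bind={listen_ip},fork,reuseaddr"
    target = f"TCP:{target_ip}:{target_port}"
    return ["socat", listen, target]


# Hypothetical: forward port 443 on the VM's external address to a container.
argv = socat_forward_command("192.168.122.2", 443, "172.17.0.2", 443)
print(shlex.join(argv))
```

Returning an argv list keeps the command easy to pass to `subprocess.run` or to paste into a systemd unit once the real addresses are known.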

kousu commented on August 21, 2024

I believe this issue is done for now: @louisfb01 has left to pursue a company, so this work is on pause. Should we archive this whole repo?

I'll open a new issue in https://github.com/neuropoly/computers/ to pull bireli back under the normal ansible fleet.

jcohenadad commented on August 21, 2024

> Should we archive this whole repo?

Yes, maybe we should do that indeed. Thank you
