AWS support about linuxkit HOT 25 CLOSED

linuxkit avatar linuxkit commented on May 18, 2024
AWS support

from linuxkit.

Comments (25)

FrenchBen avatar FrenchBen commented on May 18, 2024

Some idea of what people have been doing to get Alpine running:
https://gist.github.com/kennwhite/d89174749ce468f7c455

justincormack avatar justincormack commented on May 18, 2024

Yes there was some discussion on Alpine list about making it easier.

Moby will also need TLS cert support for this, currently we are not creating certs for docker.

nathanleclaire avatar nathanleclaire commented on May 18, 2024

> Moby will also need TLS cert support for this, currently we are not creating certs for docker.

What for exactly? I don't think it's strictly needed just to get an AMI up?

nathanleclaire avatar nathanleclaire commented on May 18, 2024

So, I have been making some baby steps towards implementing this (because I really want it for Docker AWS edition), and I want to type up the results of my research so far and get your input on how we should proceed, @justincormack.

First off, I want to say that the existing Dockerfile and Makefile automation in this repo is really great. Thank you! It has made taking in the information about how to build these artifacts, and of course actually building them, so much easier.

There aren't too many guides on making AMIs completely from scratch (usually they presume that you are going to take a snapshot of a modification of a running instance to generate a new AMI), but I managed to find out some information, and even take a first whack at it (which of course failed miserably, but that's not going to stop me from continuing to try 😉 ).

Amazon offers two varieties of virtualization for instances, paravirtualization (PV) and hardware virtual machine (HVM).

It seems that the main difference between these instance types for building a from-scratch Moby AMI is the way boot is handled. In PV's case, Amazon has this thing called PV-GRUB, which:

> is a paravirtual boot loader that runs a patched version of GNU GRUB 0.97. When you start an instance, PV-GRUB starts the boot process and then chain loads the kernel specified by your image's menu.lst file.

This can be used to specify a custom kernel (as opposed to just their out-of-the-box ones, which are called Amazon Kernel Images or AKIs). It looks like you also specify the initrd in the GRUB config.
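To make the menu.lst idea concrete, here is a minimal sketch that renders a single-entry GRUB 0.97 style config naming both the kernel and the initrd. The function name, the `/boot/...` paths, the `root (hd0)` device and the `console=hvc0` argument are assumptions for illustration, not something taken from this repo; the right root device depends on whether the image is partitioned.

```python
def render_menu_lst(kernel_path, initrd_path, title="Moby", kernel_args=""):
    """Render a GRUB 0.97 style menu.lst with one boot entry.

    All paths and arguments are supplied by the caller; nothing here is
    specific to Moby beyond the default title.
    """
    # Avoid a trailing space when no kernel arguments are given.
    kernel_line = f"kernel {kernel_path} {kernel_args}".rstrip()
    lines = [
        "default 0",      # boot the first (only) entry
        "timeout 0",      # do not wait for input
        "",
        f"title {title}",
        "root (hd0)",     # assumption: unpartitioned image
        kernel_line,
        f"initrd {initrd_path}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical paths; console=hvc0 is the Xen PV console device.
    print(render_menu_lst("/boot/vmlinuz64", "/boot/initrd.img",
                          kernel_args="console=hvc0"))
```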

As for HVM, it seems more complicated. As noted by the article I link in the end of this post:

> HVM images are a bit more tricky, since we need to create a complete disk image including partition and MBR

So, basically, if I understand correctly, we would have to create our own disk image from scratch, including GRUB, and configure it to boot into Moby. I'm not really sure how to do this. It seems that people mostly use a local loopback device, install the operating system, do their own configuration manually, and then bundle it up into a volume using aws ec2 bundle-image (that creates an instance-store backed AMI though, and I think we should maybe be after EBS). I wonder if we can leverage the generated artifacts or do this kind of build in a container. Any guidance here would be welcome.

The other option for generating an AMI from scratch in this fashion seems to be to write the image to an attached device (EBS volume), which you can then take a snapshot of, which can then be turned into the root device for a new AMI. This is what I outline as attempting in the next section.

Since gummiboot seems to be the boot loader for Pinata (if I'm understanding the code correctly, could be wrong), to support HVM we would need to make modifications so that we have the option to boot using GRUB instead for Amazon. AWS won't do UEFI boot and doesn't seem to have plans to support it.

For next steps I might try to get a PV image working, since it seems a little simpler than HVM, but HVM seems to be the preferred instance type nowadays, so I actually attempted that first.

I looked into Packer, but it doesn't seem very useful for building an AMI completely from scratch: the vast majority of Packer workflows seem to assume that you have an existing AMI you are starting from. However, there is the AMI chroot builder, which is interesting.

What I attempted so far

I know this is wrong because it didn't work but learning what's wrong about it hopefully will help lead me in the right direction. I used the AWS console just to get a feel for it but this should all be automatable through the awscli.

This is just from memory, unfortunately, so I don't have a directly reproducible example, but as mentioned above, what I'd like to do next is try to script this using awscli:

  • First I built Moby with make on a running AWS instance.
  • I powered the instance down, made a new (non-root) EBS volume (/dev/sdf IIRC) and attached it to the instance, where it appeared as /dev/xvdf when I started the instance again.
  • Next I did sudo cp alpine/initrd.img /dev/xvdf to actually set the root image for the attached device. (I know this is probably wrong now, but hey you live you learn).
  • I took a snapshot of the volume using the console.
  • I turned that snapshot into an AMI with the snapshot as the root device (/dev/sda1 I think). I used the default AKI image for kernel, but naturally we need to figure out a way to bundle the generated vmlinuz64.
  • I tried starting an instance using the baked AMI!
  • It powered down :(
  • Nothing even showed up in the system (boot) logs so the experiment was a failure, but I'm not sure why. Obviously kernel is one of several suspects.
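The steps in the list above could be laid out for awscli roughly as follows. This is only a sketch: it builds the command lines without running them, `plan_ami_commands` and the placeholder IDs are made up for illustration, and in a real script the volume and snapshot IDs would come from the output of the preceding command. Writing the image to the attached device happens on the instance itself, between the attach and snapshot steps.

```python
import json
import shlex


def plan_ami_commands(instance_id, zone, size_gib=1, device="/dev/sdf",
                      root_device="/dev/sda1", name="moby-test",
                      virtualization_type="hvm",
                      volume_id="VOLUME_ID", snapshot_id="SNAPSHOT_ID"):
    """Return the awscli invocations for volume -> snapshot -> AMI.

    Nothing is executed; each command is an argv-style list of strings.
    """
    # The snapshot becomes the root device of the registered image.
    mappings = [{"DeviceName": root_device,
                 "Ebs": {"SnapshotId": snapshot_id}}]
    return [
        ["aws", "ec2", "create-volume",
         "--size", str(size_gib), "--availability-zone", zone],
        ["aws", "ec2", "attach-volume",
         "--volume-id", volume_id, "--instance-id", instance_id,
         "--device", device],
        ["aws", "ec2", "create-snapshot", "--volume-id", volume_id],
        ["aws", "ec2", "register-image",
         "--name", name, "--architecture", "x86_64",
         "--virtualization-type", virtualization_type,
         "--root-device-name", root_device,
         "--block-device-mappings", json.dumps(mappings)],
    ]


if __name__ == "__main__":
    # Placeholder instance ID and zone, printed as copy-pasteable shell.
    for cmd in plan_ami_commands("INSTANCE_ID", "us-east-1a"):
        print(shlex.join(cmd))
```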

Questions

  • HVM or PV? (Seems that HVM is generally favored these days, here is an article outlining some reasons why).
  • sshd will probably need to be bundled, right? I think from conversation with @kencochrane @mavenugo we want to avoid exposing SSH on the host directly to the user when possible (instead offering a limited, locked-down shell running in a container), but doesn't that seem really operationally risky? (Come to think of it, this might be why you reference needing to generate certificates above.)
  • What's up with gummiboot? Is that the boot loader that's used with xhyve? (And I guess presumably xhyve supports UEFI?) Do we use systemd with it or intend to? (in the Arch wiki references it says gummiboot is now known as "systemd-boot")?
  • What commands to use to bake the HVM image including GRUB? I have a copy of GRUB source locally that I've compiled using Dockerfile, I'm not sure if we can bundle it with this repo due to GPL though.

Relevant links

Anyway hope to hear back from you on this! Hope the wall of text isn't too much -- I'm having fun, really like the layout of the repo and I'm excited about making this happen.

I have a PR with a GRUB and AWS CLI Dockerfile tool too, if you're interested.

rn avatar rn commented on May 18, 2024

@nathanleclaire there are currently three ways we boot Moby:

  • direct vmlinux/initrd (used on xhyve)
  • mobylinux-efi.iso: This rolls the vmlinux/initrd together with an EFI stub (from gummiboot) into an EFI executable which is put into a DOS partition in an ISO image. We use this on Hyper-V to boot generation 2 VMs, which don't support legacy BIOS boot.
  • mobylinux-bios.iso: This is a plain bootable ISO with a syslinux bootloader to load vmlinux/initrd. We used this on Hyper-V to boot generation 1 VMs, which go through legacy BIOS emulation.

The mobylinux-efi.iso boot also has a commented out build step which would create a VHDX (Hyper-V virtual disk format) with an EFI partition to boot vmlinux/initrd from.

If you can boot an ISO on AWS, maybe mobylinux-bios.iso is a good start

One thing to keep in mind is that Moby runs completely out of RAM disk/initrd, and the virtual disks we use on Mac/Windows are more or less purely for storing docker images.

re xhyve and EFI. As mentioned above on xhyve we load vmlinux/initrd straight. No EFI needed. But xhyve also supports EFI boot because that's how we boot windows on it ;)

Another thing to keep in mind, once you manage to boot on AWS, is that the Moby Linux init system has a few bits and pieces in there which expect certain services to be present on the host. I had to modify a few bits and bobs when I started porting it to Windows/Hyper-V. grep for /sys/bus/vmbus in ./alpine/package. There aren't many (anymore) but you may have to do some customisation for AWS as well.

Finally, /cc @ijc25, who is more skilled in the dark arts of bootloading and packaging than me.

justincormack avatar justincormack commented on May 18, 2024

Hi, welcome to the horrible world of making AWS images. We have the same issues with unikernels, the whole thing is really sucky.

You should use HVM: all the decent instance types are HVM now, and PV is being phased out slowly. Although, having said that, PV might be easier to get started with...

As Rolf says, it doesn't need a root filesystem on the disk. You should just be able to get GRUB set up with the kernel and initrd in a small boot partition, and then it can format the rest of the disk to use for docker (scripts may still need some tweaking). Alternatively, we could decide that we won't use an initramfs for the AWS edition and put the filesystem on the disk image, I guess.

This guide for installing Alpine on AWS might help https://wiki.alpinelinux.org/wiki/Install_Alpine_on_Amazon_EC2

justincormack avatar justincormack commented on May 18, 2024

The TLS support is for the docker socket. Obviously you only need this once you have it running with http...

Don't worry about GPL code - there is already some, and a mechanism to provide source to users.

justincormack avatar justincormack commented on May 18, 2024

Initial support merged in #116

justincormack avatar justincormack commented on May 18, 2024

@nathanleclaire can you add a new console log, there are a few minor cleanups that are needed I think.

For SSL certs can we obtain these from the AWS API? Or did you have another idea (do we want to make a CA)?

nathanleclaire avatar nathanleclaire commented on May 18, 2024

> can you add a new console log, there are a few minor cleanups that are needed I think.

Happy to, but I don't understand what you mean. Can you explain please?

> For SSL certs can we obtain these from the AWS API? Or did you have another idea (do we want to make a CA)?

Well, we have to decide how we'd like to handle this. I think that in Swarm V2, which is where we'd like to head with this, agents manage their certs with the manager automatically after an initial handshake on first connection (we need to sync with @diogomonica on the details of this). If that's the case, I think I'd like to just disable exposing the Docker API directly (over TCP, that is) in the cloud. Users would manage their cluster from whichever interface we choose to expose (locked down shell, web GUI, etc.).

justincormack avatar justincormack commented on May 18, 2024

I just mean the output of the boot, I just want to check a few things, I forgot when I was in the office and the internet is still not installed at home...

How do you configure the host to initially talk to the swarm in that case?

At present the tcp socket is only there for transitional reasons and should go away soon.

nathanleclaire avatar nathanleclaire commented on May 18, 2024

> I just mean the output of the boot, I just want to check a few things, I forgot when I was in the office and the internet is still not installed at home...

Oh, I see, sure, I'm happy to.

> How do you configure the host to initially talk to the swarm in that case?

We'll have to figure out a way of dropping in the manager IP / host, either via cloudinit (possibly painful due to the additional cost of having that dependency installed), mDNS, magic AWS metadata (initial research in this direction is not promising), etc. Still remains to be decided.

> At present the tcp socket is only there for transitional reasons and should go away soon.

Oh? What will it be replaced by?

justincormack avatar justincormack commented on May 18, 2024

Replaced by the unix socket connected to VSOCK/HVSOCK on Mac/Win - Mac is done already. This is basically a unix socket style thing but with a direct hypervisor transport.

justincormack avatar justincormack commented on May 18, 2024

Re getting the manager IP, I was assuming we would use the AWS metadata for the config (not just that, also other config). Why do you think it is not promising? You just curl http://169.254.169.254/latest/user-data. The only issue is the 16k max size, which could be a problem, although it should mostly be OK.
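As a rough sketch of the same fetch done from a script rather than curl: the helper below reads the user-data URL with a short timeout and returns None when the endpoint is not reachable (for example, when not running on AWS). The function name and the injectable `opener` parameter are illustrative choices, and it assumes the endpoint answers a plain unauthenticated GET as the curl command does.

```python
import urllib.request

USER_DATA_URL = "http://169.254.169.254/latest/user-data"
MAX_USER_DATA_BYTES = 16 * 1024  # the 16k limit mentioned above


def fetch_user_data(url=USER_DATA_URL, timeout=2.0,
                    opener=urllib.request.urlopen):
    """Return raw user-data bytes, or None if the endpoint is unreachable.

    `opener` exists so the function can be exercised without a network;
    by default it is urllib's urlopen.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            # Never read more than the documented maximum size.
            return resp.read(MAX_USER_DATA_BYTES)
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return None
```

A short timeout matters here because on a non-AWS host the link-local address typically just does not answer.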

nathanleclaire avatar nathanleclaire commented on May 18, 2024

> Re getting the manager IP, I was assuming we would use the AWS metadata for the config (not just that, also other config). Why do you think it is not promising? You just curl http://169.254.169.254/latest/user-data

Oh, that's good! We should do that. I had originally thought of using tags for this feature (which is not possible directly through the metadata API), but this accomplishes the same thing I was hoping to use tags for in a much better manner.

> the only issue is the 16k max size, which could be an issue, although mostly should be ok.

What would we need to get from it other than the manager IP?

nathanleclaire avatar nathanleclaire commented on May 18, 2024

> Replaced by the unix socket connected to VSOCK/HVSOCK on Mac/Win - Mac is done already. This is basically a unix socket style thing but with a direct hypervisor transport.

Interesting -- we will have to chat about how we'd like to expose the cluster access in the AWS case then.

justincormack avatar justincormack commented on May 18, 2024

Moby has a small amount of config, which can just be a json file, the stuff you can set with pinata, eg daemon.json, global sysctl settings. Should fit in 16k...
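For illustration only, a JSON config of that shape might be checked and split like this. The top-level `daemon` and `sysctl` keys are a made-up schema, not an existing Moby format, and the 16k check simply mirrors the user-data limit discussed above.

```python
import json

MAX_USER_DATA_BYTES = 16 * 1024


def parse_moby_config(raw):
    """Split a JSON config blob into (daemon settings, sysctl settings).

    The "daemon"/"sysctl" layout is hypothetical. Raises ValueError if
    the blob is too large or is not a JSON object.
    """
    if len(raw) > MAX_USER_DATA_BYTES:
        raise ValueError("config exceeds the 16k user-data limit")
    config = json.loads(raw.decode("utf-8"))
    if not isinstance(config, dict):
        raise ValueError("top-level config must be a JSON object")
    return config.get("daemon", {}), config.get("sysctl", {})


def sysctl_lines(sysctl):
    """Format sysctl settings as sysctl.conf style 'key = value' lines."""
    return [f"{key} = {value}" for key, value in sorted(sysctl.items())]
```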

rn avatar rn commented on May 18, 2024

We have a couple of places in the init scripts where we special-case running on xhyve or Hyper-V. Naturally, we are working on reducing these cases where we can, but we should be able to do something similar for AWS.

justincormack avatar justincormack commented on May 18, 2024

@rneugeba we could standardise those a bit, e.g. with a kernel command line option or database option if we can't get rid of them, although there are not many left now.
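A kernel command line option could be read along these lines. The `mobyplatform=` key is invented for the sketch; no such option is defined anywhere in this thread.

```python
def platform_from_cmdline(cmdline, key="mobyplatform", default="unknown"):
    """Return the value of `key=value` from a kernel command line string.

    `mobyplatform` is a hypothetical option name used for illustration.
    """
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        if sep and name == key and value:
            return value
    return default


def read_platform(path="/proc/cmdline"):
    """Read the running kernel's command line and extract the platform."""
    with open(path) as f:
        return platform_from_cmdline(f.read())
```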

rn avatar rn commented on May 18, 2024

@justincormack no disagreement whatsoever. My view is that it is fine to start/not start certain services based on platform (e.g. Hyper-V integration daemons only on Hyper-V, chronyd only on xhyve, etc.), and ideally we hide the less binary differences (e.g. docker daemon config, network config, etc.) behind some higher level tools like mobyconfig.
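The start/don't-start split by platform can be pictured as a simple lookup table. This is a sketch with placeholder names: `hv-integration-daemons` stands in for whatever the Hyper-V daemons are actually called, and the empty `aws` entry just marks where AWS-specific services would go.

```python
# Services started everywhere.
COMMON_SERVICES = ["docker"]

# Extra services per platform; names are placeholders for illustration.
PLATFORM_SERVICES = {
    "hyperv": ["hv-integration-daemons"],
    "xhyve": ["chronyd"],
    "aws": [],
}


def services_for(platform):
    """Return the list of services to start on the given platform."""
    return COMMON_SERVICES + PLATFORM_SERVICES.get(platform, [])
```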

nathanleclaire avatar nathanleclaire commented on May 18, 2024

> Moby has a small amount of config, which can just be a json file, the stuff you can set with pinata, eg daemon.json, global sysctl settings. Should fit in 16k...

Oh yeah, I see. Exposing config for the AWS / general higher environment stuff is another area we need to discuss and come to consensus on.

nathanleclaire avatar nathanleclaire commented on May 18, 2024

> ideally we hide the less binary differences (e.g. docker daemon config, network config etc) behind some higher level tools like mobyconfig

👍

kencochrane avatar kencochrane commented on May 18, 2024

We also need to add the ability to support passing in userdata from AWS to bootstrap the node on startup. (starting containers, etc)

For Docker for AWS, we want to use Moby as our manager node, which needs to have some containers running on startup. Currently we are using ubuntu and passing in some commands to install docker, and pull and start some containers that are needed for the platform. Since Docker will already be baked into Moby we don't need that step, but the ability to run some setup scripts and start some containers would be great.

Ubuntu and most AWS AMIs use cloud-init for this, which uses the AWS metadata service to pull down the userdata that is passed in on startup, and executes the code. Maybe we can add an init script that checks whether it is on AWS and, if so, looks at the metadata service, pulls down any userdata and then executes it. It could drop a (lock) file somewhere when complete, and we can check whether the lock file is present on future reboots; if it is, we know the node setup was already done and skip it going forward.
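The run-once logic with a lock file might look roughly like this. It is a sketch under assumptions: the userdata is treated as a shell script, the function names are made up, and the lock path is left to the caller (since Moby runs out of a RAM disk, the lock would presumably have to live on the persistent disk to survive reboots).

```python
import os
import subprocess
import tempfile


def _run_script(user_data):
    """Write user_data to a temporary file and run it with /bin/sh."""
    with tempfile.NamedTemporaryFile("wb", suffix=".sh",
                                     delete=False) as script:
        script.write(user_data)
    try:
        # check=True raises if the script exits non-zero.
        subprocess.run(["/bin/sh", script.name], check=True)
    finally:
        os.unlink(script.name)


def run_user_data_once(user_data, lock_path, runner=_run_script):
    """Run user_data unless lock_path exists. Returns True if it ran.

    The lock file is only written after the runner returns without
    raising, so a failed setup is retried on the next boot.
    """
    if os.path.exists(lock_path):
        return False
    runner(user_data)
    with open(lock_path, "w") as lock:
        lock.write("done\n")
    return True
```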

Open to other ideas as well.

nathanleclaire avatar nathanleclaire commented on May 18, 2024

@justincormack @rneugeba Let us know what your thoughts are on making mobyconfig a bit more generic. In the AWS/Azure case it would be nice for us to have a tool where we could do something like (totally spitballing here) mobyconfig set manager <ip>:4242, or reading in some kind of serialized file (YAML, TOML, etc.) to generate the engine parameters for boot.

justincormack avatar justincormack commented on May 18, 2024

In progress with Docker for AWS, closing the general issue here. We do need to converge configs soon.
