coreos / fedora-coreos-tracker Goto Github PK

Issue tracker for Fedora CoreOS

Home Page: https://fedoraproject.org/coreos/

fedora-coreos-tracker's Introduction

Welcome to the Fedora CoreOS issue tracker. This tracker will be used to discuss new features for Fedora CoreOS and also important bugs that are affecting the project. Tickets with the meeting label will be taken as agenda items during the meetings. This repo is to be used primarily for development purposes. If you are a user and have questions please use the forum or the mailing list.

Fedora CoreOS Working Group

Fedora CoreOS is an automatically updating, minimal, monolithic, container-focused operating system, designed for clusters but also operable standalone, optimized for Kubernetes but also great without it. It aims to combine the best of both CoreOS Container Linux and Fedora Atomic Host, integrating technology like Ignition from Container Linux with rpm-ostree and SELinux hardening from Project Atomic. Its goal is to provide the best container host to run containerized workloads securely and at scale.

The Fedora CoreOS Working Group works to bring together the various technologies and produce Fedora CoreOS.

Get Fedora CoreOS

Download Fedora CoreOS.

Communication channels for Fedora CoreOS

Main mailing list: [email protected]
Status mailing list: [email protected] (announcements/important messages)
Chat room: #coreos:fedoraproject.org on Matrix
Forum at https://discussion.fedoraproject.org/tag/coreos
Feature planning and important issue tracking at github.com/coreos/fedora-coreos-tracker
Website at https://getfedora.org/coreos/
Documentation at https://docs.fedoraproject.org/en-US/fedora-coreos/
Twitter: @fedoracoreos

Roadmap/Plans

Fedora CoreOS is available for general use and no longer in preview. We're continuing to add more platforms and functionality, fix bugs, and write documentation. Please try out Fedora CoreOS and give us feedback!

Adding Packages to Fedora CoreOS

We often find people asking for a particular package to be added to the base set of packages included in Fedora CoreOS. One of the goals of Fedora CoreOS is to remain as lean as possible, without impacting overall usability for our users. Thus, new package requests are carefully scrutinized to weigh the benefits and drawbacks of adding an additional package.

If you would like to propose the inclusion of a new package in the base set of packages, please file a new package request.

Releases

See RELEASES.md.

Meetings

The Fedora CoreOS Working Group has a weekly meeting. The meeting usually happens in #meeting-1:fedoraproject.org on Matrix and the schedule for the meeting can be found here: https://calendar.fedoraproject.org/CoreOS/ Currently, meetings are at 16:30 UTC on Wednesdays.

As the Matrix bridge to Libera Chat is shutdown, you can not attend the meeting from IRC and you have to join using Matrix.

Steps to run the meeting

The fedora meeting host can follow the guide which is curated by the fcos-meeting-action repo. Every Wednesday a new checklist will be available in the form of a issue in the fcos-meeting-action repo, which can be used to run the meeting.

If the action meeting repo is not available for some reason, the host can follow the below steps to run the meeting.

Legacy Meeting steps

cd to a local checkout of this repo and git pull
Ping meeting people in #fedora-coreos on libera.chat
- bash meeting-people.txt
- copy lines of output and paste into #coreos:fedoraproject.org on Matrix
Navigate to #meeting-1:fedoraproject.org on Matrix
Type:
- !startmeeting fedora_coreos_meeting
- !topic roll call

Wait for 2-4 minutes for people to check in for the roll call.

!topic Action items from last meeting

Find the last meeting log from meetbot and post the action items in the meeting for people to update the status of.

After they are done move to each meeting ticket from this tracker

Do the following for each ticket

!topic Ticket subject
!link <link_to_the_ticket>

During the meeting, you can give people action items for them to complete:

!action <nickname> description of what needs to be done

When all topics are over, go for open floor:

!topic Open Floor

After open floor, end the meeting.

!endmeeting

Then, when convenient:

Remove meeting labels from tickets that were discussed
Send an email to [email protected] with the details of the meeting from meetbot page. Minutes in textual format are directly available using .txt as URL extension. It's easiest to get the Minutes/Minutes (text)/Log URLs by copying the footer that Meetbot prints after #endmeeting. You can see examples in the archives; the usual format follows:

Subject:  Fedora CoreOS Meeting Minutes year-mm-dd

Body:

Minutes: <URL to meetbot .html>
Minutes (text): <URL to meetbot .txt>
Log:  <URL to meetbot .log.html>

<Copy/paste content of meetbot .txt>

Voting

On some topics we will need to vote. The following rules apply to the voting process.

For Regularly Scheduled Meetings

A quorum for the meeting is 5 people, or 51% of the members of the WG listed below, which ever is lower. Voting items must pass with a majority of the members voting at the meeting.

For General Ad-Hoc Votes

All ad-hoc votes will be held via tracker issues.
Ad-hoc votes must be announced on the current primary mailing list for Fedora Atomic (atomic-devel).
Ad-hoc votes must be open for at least three working days (see below) after the announcement.

At least 5 people must vote, or 51% of the WG membership, whichever is less. Votes are "+1" (in favor), "-1" (against), or +0 (abstain). Votes pass by a simple majority of those voting.

For Urgent Ad-Hoc Votes

All ad-hoc votes will be held via tracker issues in the fedora-coreos-tracker repo.
Ad-hoc votes must be announced on the current primary mailing list for Fedora CoreOS.
Ad-Hoc votes must be open for at least three hours after the announcement.

At least 5 people must vote, or 51% of the WG membership, whichever is less. Votes are "+1" (in favor), "-1" (against), or +0 (abstain). Votes pass by a 2/3 majority of those voting (round up).

Working days: non-holiday weekdays. Relevant holidays are the national holidays of the USA, Western Europe, and India.

Working Group Members and Points of Contact

Please see meeting-people.txt.

fedora-coreos-tracker's People

Contributors

Stargazers

Watchers

Forkers

dustymabe ajeddeloh miabbott rpi-user lucab dcode arithx mike-nguyen chuanchang jlebon ycs0405 sinnykumari frederiknjs secretts2019 davdunc jamescassell sumantro93 cgwalters travier piyapsri xiaoruiguo nasirhm conan-kudo jdoss kelvinfan001 jaimemagiera cverna isabella232 arjun-1 rugk darkmuggle jcajka ravanelli gursewak1997 nbebout jmarrero ekmixon ediboko1980 acidburn0zzz aaradhak ryanjenkins99 rmohr prestist socios-linux mohelt geoeducator anthr76 c4rt0 baude adam0brien dougkirkley quentin9696 gui-don theg1 marmijo yasminvalim jbtrystram madhu-pillai jasonbrooks guspan-tanadi

fedora-coreos-tracker's Issues

ostree delivery model

For Red Hat CoreOS, we are experimenting with doing ostree-inside-container as a delivery model. For people running a Kubernetes/OpenShift cluster, dealing with container images is...a known quantity. A critical aspect is that there a metric ton of tools that know how to mirror container images (and "offline"/"behind the firewall" usage is absolutely critical to support).

(Aside - one might ask: why ostree-in-container? Why can't the container filesystem itself just be the OS filesystem? Well...one issue there is SELinux labeling; container files don't carry labels today. We might be able to create some sort of new, special extensions to the OCI format but for now it's way easier to just embed an ostree repo inside a container image)

I propose that Fedora CoreOS support this - i.e. we produce an "oscontainer" image for each release. We should document how to pull it down and deploy it inside a cluster, and have the cluster use it for updates as a local mirror.

However, that still leaves the question of the default. One downside of OCI today is that there's no deltas. This is generally irrelevant for clusters (particularly deployed in the public cloud), but...for small scale usage it does matter. Now, "pure ostree" mode does have deltas. However, if one wants to use any package layering...this is where rojig comes into play. (On a semi-related topic, I will be pushing for Silverblue to use rojig by default)

One advantage of offering rojig as well for FCOS is that it'd be trivial to spin up a container and get the same RPM packages that went into the host OS - it's just a yum repo and you can yum distro-sync to it or whatever.

So in the end that's my proposal: We offer both rojig and oscontainer for FCOS. The default is rojig, but we make it falling-off-a-log-easy to switch to the oscontainer.

But that's just a proposal. Any other opinions?

meeting times and daylight savings

In the past we have kept the meeting time on UTC year round. This was mainly to minimize confusion as well as not favor any one part of the world when making decisions. I'd like to propose we do the same thing with the fedora-coreos meeting, but let's discuss and come to a conclusion.

Configuration management

Fedora CoreOS carries forward the Container Linux philosophy of immutable infrastructure: the configuration of an FCOS machine should not change after it is provisioned. When config changes are necessary, they should be done by updating the Ignition config and reprovisioning the host.

In some cases, however, that approach can be rather heavyweight, and configuration management may be a better fit. In addition, some environments have existing CM setups that would be convenient to reuse with FCOS.

Even if FCOS doesn't include enough infrastructure to support running CM tools natively in the host, it may be feasible to run them in a container. If so, we should provide documentation (and maybe even some tooling) to support this.

rules for membership and voting

I assume we'll want to establish some sort of governance structure for the project. Let's determine rules for membership and voting.

For now I left the rules for voting in the README that we used in the atomic working group. We can start with those and modify.

should notifications from github issue tracker go to the mailing list?

For the atomic working group we had pagure notifications go to the [email protected] mailing list. We also had the [email protected] mailing list that these notifications did not go to.

For Fedora CoreOS I think each person can subscribe to the issue tracker on github and we'll leave the mailing list alone.

Thoughts?

Containers building and delivery ?

We are currently using OSBS (Openshift Build Service) to build and release containers to registry.fedoraproject.org, we also inherit from containers hosted on Quay.io so the questions is what do we want to do with containers we are building in Fedora ?

arm64 / aarch64 support for Fedora CoreOS

Reading through the channel minutes for the last meeting, @glevand mentioned an interest in first-class arm64 / aarch64 support for Fedora CoreOS, a sentiment which I share.

I'd like to add this to the next agenda (Wednesday of week 29, July 18 2018) and start to identify the current state of the components that would go into this, look for some target hardware to point at first, and see what infrastructure needs there are.

no cloud agents: azure

In #12 we decided that we'd like to try to not ship cloud agents. This ticket will document investigation and strategy for shipping without a cloud agent on the azure cloud platform.

See also #41 for a discussion of how to ship cloud specific bits using ignition.

Determine how we extend partitions on boot

We've used a tool called container-storage-setup that called out to cloud-utils-growpart in the past in Atomic Host. See https://pagure.io/atomic-wg/issue/384

In CL was this handled by ignition-disks ?

Add Afterburn

We're going to focus on Ignition, but today CL includes coreos-metadata. Unlike Ignition, it runs every boot; it's obviously very useful, but IMO this blurs the story of "immutable infastructure".

(I personally vote to include it, but we should think about it)

set up logging for #fedora-coreos IRC channel

This started as a discussion on the mailing list before we had an issue tracker: https://lists.fedoraproject.org/archives/list/[email protected]/thread/ETRBNRM5ZCUP7PN3SMB3YP6I36S4Y46M/

It was decided to set up logging so that people who don't sit in the channel at all times can review conversations and also so we can link to conversations that were had for context. I'll try to work with botbot.me to set up logging similar to how it is set up for #coreos

Enabling docker by default or not

Migrating this issue from https://pagure.io/atomic-wg/issue/384

I'm going to assume that we'll include docker by default for FCOS (though that may be up for debate?). But that still leaves the question open about enabling it by default and how much we encourage its use.

We'll have podman, so...

Determine how to handle automatic rollback

We want to bring forward Container Linux's automatic rollback model and probably extend it even further. Automatic rollbacks can't solve every problem (since in some cases it may mean downgrading something like docker which is an unsupported operation) but it works well to protect against kernel issues and other such problems.

CL's model currently uses the GPT attribute bits to record if a partition has been tried or not and if was successfully booted. On a successful boot update_engine waits 45 seconds then marks the boot as successful.

We're not using A/B partitions in FCOS, so we can't use the GPT priority bits (and I think we shouldn't regardless, but that's beside the point).

Ostree currently does not support automatic rollback (@cgwalters please correct me if I'm wrong), so we'll need to implement it.

note: I'm going to use the term "install" to mean an ostree/kernel combo (for FCOS, this would be a kernel/usr-partition for CL).

Essentially there are four states an install can be in (in order of what should be chosen to boot)

Untested (just installed)
Successful (most recent)
Successful (fallback, mostly for manual recovery)
Failed

Goals I think we ought to strive for:

Handling boot all failures, including kernel failures.
Allowing users to decide what a "successful" boot means (related: greenboot)
Avoid "flapping" where the OS gets stuck in a loop of "install bad image, reboot, fail, reboot"
Never be in a state where it is unclear what the correct choice to boot (even if power is lost mid-update)
Allow manual selection of installs if necesary
With the exception of the first boot, always try to keep two successful installs around (to avoid problems like coreos/bugs#2457 (comment))

My proposal:

So I think we should use flag files in /boot like we do for Ignition. When creating a new install, it's kernel gets written to /boot along with two flag files: untested and failed. There should only ever be one install with both flags. Additionally there should be a flag file "recent" which indicates which install to boot in the case of two successful installs.

Here is a table of what combinations of flags mean what:

Untest	Failed	Recent	Meaning
N	N	N	Successful boot, fallback
N	N	Y	Successful boot, most recent
N	Y	N	Tried, unsuccessful boot, last resort only
N	Y	Y	Successful boot (probably), most recent, power most likely lost before failed flag was removed
Y	N	N	Impossible. Machine should die. Something went horribly wrong
Y	N	Y	Impossible. Machine should die. Something went horribly wrong
Y	Y	N	Just installed, boot this if available
Y	Y	Y	Impossible. Machine should die. Something went horribly wrong

The grub config should select installs in this order:

installs with both untested and failed flags
installs with the recent flag
installs with no flags
installs with just the failed flag.

When grub selects one it immediately removes the untested flag. On a successful boot a systemd unit (tbd: integrate this with greenboot?) adds the recent flag, removes the recent flag from the old entry, then removes the failed flag.

This proposal does hinge on grub being able to delete files, which I haven't confirmed yet. It also means ostree wouldn't need to write out any grub configs at all, just empty files.

Edit: hrmmm. Grub doesn't seem to be able to write to or delete files. That makes the whole "recover from a bad kernel" bit hard.

Thoughts?

cc @cgwalters and @jlebon for the ostree bits and @bgilbert to keep me honest about how CL works.

set up POC pipeline for Fedora CoreOS

Since we now have a makeshift SDK (coreos-assembler) and some configs that are the start of what will define Fedora CoreOS let's try to hook it up to some sort of build system so it can spit out artifacts and we can iterate on it.

Will first try to target CentOS CI and then go from there.

make /var/srv/containers labeled container_file_t:s0 by default

See https://pagure.io/atomic-wg/issue/505

Major release and update cycle for Fedora CoreOS

For Fedora Atomic Host, we do Atomic Host release along with Fedora major release. This follows with new release every Two Weeks with updated content from Fedora updates. This helps users to receives updated (including security fixes) and tested content every Two Weeks. For a major CVE fix, we make exception and do an in-between releases.

For FCOS, as per my knowledge we will have our first official release around Fedora 30 release based on be f30 tagged built packages (correct me if I am wrong). It will be nice to discuss and define how frequently we are going to do releases in between with updated content.

coreos/toolbox v2

We'd like to carry https://github.com/coreos/toolbox forward for FCOS and maybe give it a revamp as part of the process. Let's discuss design decisions for a v2 of coreos/toolbox in this ticket.

Equivalent to system containers from Fedora Atomic in Fedora CoreOS

This issue aims to collect use cases for system containers.

One of the ways to customize the Fedora Atomic Host is to use system containers which is does't require a reboot and can run arbitrary applications if they are built in an OCI image, not only from rpms.

Here is my input, as a community member of the OpenStack/Magnum project and as an operator of the CERN cloud.

In the OpenStack/Magnum project (present in 10-15% of openstack deployments per user survey) we are using syscontainers to run:

kubernetes, etcd, flannel (to become self-hosted at almost everyone else does)
docker-ce
an openstack-specific daemon (os-collect-config) which is actively maintained in rdo (centos) not in fedora
a cern-specific volume plugin

Apart from the kubelet, docker-ce and the openstack daemon which is used for passing configuration and runs first, the other components can be containerized in other ways, pods, docker or podman containers.

As a user I would like to see a similar solution in Fedora CoreOS that gives freedom to run newer (or from master) releases of kubernetes/cri-o/docker/containerd/kata-containers using a stable minimal base OS.

If you are another user with similar needs, please provide your input here!

create a doc that tracks important repos and how they play together

@ajeddeloh had a good idea in #49 (comment)

We should have a doc in this repo that just lists all the repos and how they play together. We never really had that for CL but I always wished we did.

Container Linux migration tools

The Container Linux distro currently has a planned EOL in 2020, and users will need to migrate to Fedora CoreOS to remain supported. We'll need to decide on what tooling can be built to automate parts of the transition, and document the differences to make the migration is as painless as possible.

PRD: Red Hat CoreOS upstream

I think the Fedora CoreOS PRD should explicitly call out being the upstream for Red Hat CoreOS.

Default filesystem choices for FCOS

Container Llinux uses ext4 by default, Fedora Workstation uses ext4, Fedora Server uses xfs. We should pick one for / and /var (if /var is a seperate partition). Some people like btrfs, so maybe consider that too?

I personally don't have any strong opinion.

Assuming we can setup ostree between the disks and files stage, users would be able to change this via Ignition as well, so this would only affect the default image we ship.

flesh out use cases for Fedora CoreOS

We should flesh out use cases for Fedora CoreOS and develop some user stories so that we can define what we are providing and what we are not providing for users. This will help us stay focused and have defined reasoning for our decisions.

Additionally defining how we expect users to interact with the OS will be key for success.

Do we separate /boot and the ESP

On Container Linux /boot and the ESP are the same. This is not the case on Fedora. Do we want to combine them for FCOS?

Pros of combining:

Fewer partitions (simpler)
Similar to CL for users migrating from CL

Cons of combining:

/boot must be fat32
Possibly confusing for users migrating from FAH

Network Management

Fedora uses NetworkManager for handling network configuration. Container Linux uses networkd. We need to decide on one. We don't want to carry both since that's just twice the maintainence and chance of breakage without much benefit.

NetworkManager is advantageous because it has wider adoption, especially within the Fedora (and RedHat) ecosystem. It is (to my knowledge) generally more stable than networkd. Unfortunately, it's also harder to write config files for. The nmstate project will help significantly (makes the configuation more declarative) but it still lacks the flexibility of networkd's configuration. nmstate would need to be rewritten in some compiled language (i.e. not python) for inclusion in FCOS.

networkd has a configuation format that lends itself nicely to Container Linux today. The ability to "layer" configs works well for having a default that can be overridden for cloud specific changes and user specified changes. This is especially powerful when combined with it's matching rules. It's configuration is very similar to systemd's in general. nmstate has a proposal for templates which would help, but they still aren't as flexible as networkd's configuation. Unfortunately, networkd tends to suffer regressions and isn't as actively maintained as the core of systemd or NetworkManager. It cannot handle config file changes without restarting the service, but that isn't an issue with FCOS since the nodes shouldn't be configured after first boot.

Finally, networkd has fewer dependencies than networkmanager (considering we're already shipping systemd), especially since Fedora enables most features. We could change this and repackage it for FCOS stripping out unneeded features, but that'd be another custom package to carry and maintain.

We don't have any visibilty into how existing CL or FAH are using networkd or NetworkManager (respectively). This makes determining what requirements we have for network configuration hard.

In my opinion, networkd is a better fit for FCOS, even if it is more regression-happy than we'd like. I'm also perhaps a bit biased coming from my CL background.

Should OSTree be used at all?

Hi All,

I've been a happy user of Fedora Atomic Workstation and really like rpm-ostree feature. Having said that, I can't find any discussion why ostree (or rpm-ostree) should be used for FCOS? With it's Ignition based configuration and read-only /usr what is the benefit of having ostree?

enhance ostree mirroring for better UX

Since we have decided on using ostree repos to deliver content to end users for now, let's work with Fedora Infra and upstream in ostree to get ostree mirroring set up and working well.

goal: ship OS images with empty /etc

I think it may be worth trying to ship FCOS with a (mostly) empty /etc/.

The rationale for that is that as distribution maintainers we like how many components (mostly coming from systemd land) already support overlaying multiple snippets and splitting configuration concerns among:

vendor defaults (packaged RO in /usr)
runtime-volatile tweaks (generated into /run)
user configuration (in /etc/ and normally not touched by us)

This came up already in multiple places, and the last one that is triggering this is #36 (comment) (motd/issue handling).

As such, I think we should:

document the etc-run-usr split explicitly in our design docs
encourage upstream developers to implement fall-through lookups
encourage packagers to avoid hard-coding config files/dirs in /etc
contribute relevant patches for the two items above
eventually get to a point where /etc is basically empty

This is a proposed goal, i.e. something that I am aware is likely not feasible in the short term but that I'd like to achieve in a longer timeframe. It may also go against many people definition of a Linux system, or eventually turn out not to be a good idea at all.

Here I'd like to gather consensus, document short-comings and snowflakes (or why this is a bad idea to avoid), and write it down as a guideline to clarify our goals to other (upstream) developers.

Fedora-CoreOS session on Fedora Classroom

Everyone is excited about Fedora-CoreOS.
It would be great if we can have a Fedora Classroom session on

Introduction
UseCase
Getting started (contribution) with Fedora-CoreOS

Just had APAC meeting and I am sure the session can reach to a broader audience.

cc'ing @sinnykumari @prakashmishra1598

Fedora QA cooperation

Note: I originally started this as a discussion at Fedora discourse, now moving it here after request.

Hello, my name is Kamil Páral and I'm part of the Fedora QA team. I was asked to be a point of contact between Fedora CoreOS and Fedora QA. I would like to learn what the current state of QA is in your project, and how I can help you with integrating it into Fedora QA processes, or extending it.

I know there hasn't been any Fedora CoreOS release yet, but do you have any QA processes existing already? Testcases to describe what to test and how, testplans to describe which tests to run when, automated tests executed regularly (before release? after each commit?), results publicly visible somewhere? You know, the usual boring QA stuff.

As an example, here's our Cloud-specific test plan containing just a few test cases that we used for Fedora 28 Cloud/Atomic release validation:
https://fedoraproject.org/wiki/Test_Results:Fedora_28_RC_1.1_Cloud
(scroll to the bottom). These were tested manually during release. Atomic folks had some automated testing as well, I believe, but I don't really know the details (I'm sure the right people in your team will know about it).

If it makes sense to you, we could adopt those test cases, extend them to cover the most important CoreOS functionality, and create a separate CoreOS section for it. If you have any automated test results, we can talk about how to best integrated into our usual workflows. For the parts that we have automated in our team, we usually combine writing test results directly to such wiki matrices as shown above, and also inspect the failures in the tools' specific frontend. For example, everything filled out by coconut in our Installation matrix has been automatically tested by OpenQA - the failures are examined in its frontend directly. But there are definitely other approaches that can be taken.

I can also help you set up test days for your project, and try to reach out to Fedora community to help you get more of a unique test coverage. Or.... tell me what else you'd like to see or help with.

Looking forward to your feedback!

update docker in Fedora to not read os-release in %posttrans

It tries to read /etc/os-release which is also updated in %posttrans so we have an ordering problem. This can be avoided since docker reading /etc/os-release is not really needed any longer (we have standardized storage configuration).

Allow binaries written in Python via a "platform python" style approach

Like https://fedoraproject.org/wiki/Changes/Platform_Python_Stack

But we can do it in even stronger way: if we wanted to make it a whole lot harder for external people to use it, we could randomize the location of the binary, e.g. /usr/libexec/platform-python-$(uuidgen).

(Walking over every python binary and rewrite the shebangs)

Experiment: Setting up OSTree from the initramfs

In #18 we discussed the possibility of setting up ostree after the Ignition disks stage but before the files stage. This would pull down the contents of the ostree from some remote location and "install" it to the new root filesystem the disks stage set up. We'd want to only do this if the root fs actually changed, since it would be somewhat slow.

We should investigate this and make sure it's possible.

This would also enable us to create an "ultra-small" OS image which is just the ESP/boot partition and optionally a bios-boot partition, which I think would be neat.

cc @jlebon @cgwalters

reboot coordination: locksmith successor

CL encourages using locksmith + etcd by default as a "cluster". Do we want to do that out of the box, or focus on e.g. https://github.com/ashcrow/container-linux-update-operator/tree/spike ?

Another option is to document how to "roll your own" coordination with e.g. Ansible; we have APIs.

issue tracker for fedora releng integration

I don't know all of the issues until we try but here's some I think may be issues:

I want our build process to use modern containers - specifically https://github.com/coreos/coreos-assembler/ (not koji+mock)
The prototype config repo currently lives on github - now I want to support "offline" builds too, do we just set up an automatic pagure import, or do people want to move it to pagure?

bring forward nice login prompts from container linux

When you log in to container linux you get a nice prompt that tells you some information about the machine.

$ ssh [email protected]
Container Linux by CoreOS stable (1800.7.0)
Failed Units: 1
  example.service
core@localhost ~ $

On the serial console there is a nice message telling you what the IP address is for the machine.

This is localhost (Linux x86_64 4.14.16-coreos) 14:19:32
SSH host key: SHA256:yP+/44/bfuj6UKHdUwAVURsO3y6haKLKfSFNcnmn7bY (ECDSA)
SSH host key: SHA256:gGDZ/JQzwL76UpT29dyZ/M6Zua7QvGyegP8aTLc/D+Y (DSA)
SSH host key: SHA256:nQEysCYP3hZgkus2+e28KQGrs0pRI2NOgJGQ6L8PnyU (RSA)
SSH host key: SHA256:A3c6toZ3/eTMKNDmyyG9CYUSWsdSunmTeOC68iuDfAg (ED25519)
eth0: 192.168.122.36 fe80::5054:ff:fe85:43a6

we should carry these nice features forward for Fedora CoreOS.

where should rpm-ostree project live

I think this one is up to @cgwalters, but figured I'd start the conversation. Some options include:

github.com/coreos org
github.com/ostreedev
leave it where it is (will want to move it eventually)
new org?

create a fedora-release-coreos rpm

We'll need one of these similar to fedora-release-atomichost. It will need to be added to the fedora-release package as a sub rpm. Here is an example where atomic host is added today. upstream for that is https://pagure.io/fedora-release

Firewall Management

What is the desired firewall management method for FCOS at the moment? Atomic Host is using firewalld, Container Linux plain iptables. AFAIK Fedora 29 will switch over from iptables to nftables and set nftables as a default backend in firewalld. FCOS, with a strong desire to not ship Python, plans to stay with iptables, switch to plain nftables?

Partition Layout

In converstations we had recently, we think that FCOS should have a default partition layout, similar to how CL has a standard fs layout since it provides consistency across bare metal and clouds and well as making the image "dd-able" directly to a drive (which makes installation trivial). Any further disk modification should be done via Ignition.

What should that partition layout look like?

My (quick and not fully thought out) proposal:

1 - EFI-SYSTEM (needed for all EFI systems)
2 - BIOS-BOOT  (holds grub for systems that need bios booting)
3 - ROOT       (i.e. everything else)

Ideally we'd be able to move ROOT around using Ignition and re-deploying the OSTree to where we moved it to between the disks and files stages. If you're on a EFI system you could even wipe away BIOS-BOOT to make more room (not that its terribly large). There's some tricky cases with that which we're still exploring, but it should be possible at very least in simple cases.

Select /etc/os-release ID, VARIANT_ID, and ID_LIKE

ID=coreos implies Container Linux, so we shouldn't use that. ID=fedora + VARIANT_ID=coreos is defensible, but tools that see ID=fedora might expect things like dnf or /etc/sysconfig/network-scripts to work. ID=fedora-coreos seems harder to justify. Other options may exist.

Ultimately this seems like a broader Fedora policy question, and maybe it should get some attention at that level.

How to ship cloud specific bits

Not to be confused with the discussion of agents in #12.

Clouds will need slightly different base/default Ignition configs and probably some extra config files for Ignition to use. Two questions arise: where do we ship them (initramfs? /? /boot?) and how do we ship them considering there are multiple clouds?

My proposal that I'm not super attached to:

Ship the bits in the initramfs itself. They're not big and this means we don't have to deal with copying things over from the real root. At least do this for the Ignition configs.
Ship all of the clouds' configs on for all the clouds (and bare metal). Teach Ignition to look under /some_path/$oem_id/{base,default}.ign or have systemd service that copys one to the location Ignition expects. Again, they aren't big and text compresses well, so it shouldn't be too much extra space used.

where to move ignition dracut modules source code

They currently live at https://github.com/dustymabe/ignition-dracut. I propose we move them somewhere more proper like the coreos org on github. Is there a better place to put it? Any issues with moving them there?

cloud agents

Today, Container Linux uses an OEM partition for various cloud agents (e.g. GCE). For Fedora Atomic, we never created such a thing and mostly limped along with the (very limited) support that cloud-init has for different sites. The only exception here is that for RHEL Atomic Host we did make a VMWare agent container.

The architecture for Fedora CoreOS calls for us to close to CL here (Ignition + coreos-metadata) but that doesn't answer the larger cloud agent problem.

A known major issue with the CL approach is that there is no update mechanism for the OEM partition.

We have a few options, and we can consider different strategies per cloud.

Layering it on as a package just for that cloud
Layering but not updating it (i.e. we don't engage the rpm-md machinery)
separate ostree streams per cloud
Rkt/atomic system containers style
Statically linked binary in /opt

Docker version

FAH ships "slightly" modified version based on Docker 1.13.1, CL follows Docker CE and currently delivers 18.06.1. AFAIR it was mentioned during one of the meetings that FCOS will ship both options (correct me if I'm wrong), the question is which one will be the default one?
The difference is significant - 11 API releases:

Docker 1.13.1 - API v1.26
Docker CE 18.06.1 - API v1.37

docker group or not

We need to decide whether to include a docker group or not. Today CL does, Fedora does not. Context: https://www.projectatomic.io/blog/2015/08/why-we-dont-let-non-root-users-run-docker-in-centos-fedora-or-rhel/

Kernel Module Support

This issue is a continuation of this conversation which details out some of the pain points of not having an official method for supporting kernel modules. On Fedora Atomic Host, and CoreOS CL you can work around not having kernel module support by building the modules in a container but it doesn't easily solve the problem of having the kernel modules built on boot and it is a pretty fragile method for building kernel modules consistently. Also, @cgwalters had some feedback on this topic detailed out here that might be relevant to this design topic.

Having some sort of support for a DKMS-like kernel module build system in Fedora CoreOS would be great.

Host Installer for Fedora CoreOS (bare metal)

Being that we are planning to boot from a common "image" on first boot in Fedora CoreOS we'd like an installer that can get that image onto a disk for a bare metal environment (cloud/VM environments should be using related image artifacts or pre-uploaded cloud artifacts). Anaconda can do this (i.e. write a pre-baked image to disk), but might be overkill for what we actually need considering we don't really want any customizations done by the installer and all of them performed by ignition on first boot. Container Linux in the past has used a small script (basically wrapping dd) as their installer.

Let's come up with a strategy for a host installer for Fedora CoreOS and implement it.

Goals for build tools

We should decide on goals (and anti-goals) for what we want from the FCOS build system. We should use them to decide what tools we can use already, what needs to be modified to meet them, and what needs to be built.

This is meant more as "I want to be able to do X" not "I want to use project Y".

Ignition versions supported in FCOS: ignition 3.0.0!

We're considering breaking compat between Ignition spec <= 2.x.y and v3.0.0. This is because of coreos/ignition#608. This would mean FCOS would only accept configs >=v3.0.0. This would mean users migrating from Container Linux would need to migrate their configs (i.e. could not boot the same config). That would include configs that are appended. Container Linux would only support 2.x configs through the rest of its lifetime.

Some consequences if we did that:

Everyone would need to migrate configs. We expect that to some degree anyway for any reasonably complex config since some filesystem paths will change. An automated tool to help translate could do a "best effort" translation, but anyone taking advantage of weird use cases may not get what they are expecting.
CL would need to support a branch of Ignition with spec 2.x through the rest of its life. This means backporting bugfixes to it.

Some consequences if we didn't do that:

Either the 3.0.0+ configs (forever) carries the same lack of declarativeness or we have an imperfect translation step from 2.3.0 to 3.0.0. I don't think we can ship a broken translator.

What are people's thoughts on only shipping 3.0.0+? Note that the 3.0.0 spec is nowhere near finalized yet.