lxc / lxcfs

FUSE filesystem for LXC

Home Page: https://linuxcontainers.org/lxcfs

License: Other

Shell 5.70% C 90.05% Makefile 0.08% Meson 3.90% Python 0.27%
c containers fuse-filesystem lxc

lxcfs's Introduction

LXC

LXC is the well-known and heavily tested low-level Linux container runtime. It has been in active development since 2008 and has proven itself in critical production environments world-wide. Some of its core contributors are the same people who helped to implement various well-known containerization features inside the Linux kernel.

Status

Type            Service             Status
CI (Linux)      GitHub              Build Status
CI (Linux)      Jenkins             Build Status
Project status  CII Best Practices  CII Best Practices
Fuzzing         OSS-Fuzz            Fuzzing Status
Fuzzing         CIFuzz              CIFuzz

System Containers

LXC's main focus is system containers. That is, containers which offer an environment as close as possible to the one you'd get from a VM but without the overhead that comes with running a separate kernel and simulating all the hardware.

This is achieved through a combination of kernel security features such as namespaces, mandatory access control and control groups.

Unprivileged Containers

Unprivileged containers are containers that are run without any privilege. This requires support for user namespaces in the kernel that the container is run on. LXC was the first runtime to support unprivileged containers after user namespaces were merged into the mainline kernel.

In essence, user namespaces isolate given sets of UIDs and GIDs. This is achieved by establishing a mapping between a range of UIDs and GIDs on the host to a different (unprivileged) range of UIDs and GIDs in the container. The kernel will translate this mapping in such a way that inside the container all UIDs and GIDs appear as you would expect from the host whereas on the host these UIDs and GIDs are in fact unprivileged. For example, a process running as UID and GID 0 inside the container might appear as UID and GID 100000 on the host. The implementation and working details can be gathered from the corresponding user namespace man page.
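
The mapping described above can be illustrated with a small sketch. It is not LXC code: the names IdMapEntry and to_host_id are hypothetical, and the single map entry (container ID 0 mapped to host ID 100000 for 65536 IDs) is only an assumed example in the spirit of the UID 0 / UID 100000 case mentioned in the text.

```python
from typing import List, NamedTuple, Optional


class IdMapEntry(NamedTuple):
    """One mapping range (hypothetical helper type, not part of LXC)."""
    container_start: int  # first ID as seen inside the container
    host_start: int       # first ID as seen on the host
    length: int           # number of consecutive IDs covered


def to_host_id(container_id: int, id_map: List[IdMapEntry]) -> Optional[int]:
    """Translate a container UID/GID to the host ID, or None if unmapped."""
    for entry in id_map:
        offset = container_id - entry.container_start
        if 0 <= offset < entry.length:
            return entry.host_start + offset
    return None


# Assumed example map: container IDs 0..65535 -> host IDs 100000..165535.
uid_map = [IdMapEntry(container_start=0, host_start=100000, length=65536)]

print(to_host_id(0, uid_map))      # 100000
print(to_host_id(1000, uid_map))   # 101000
print(to_host_id(65536, uid_map))  # None (outside the mapped range)
```

Root inside the container (ID 0) therefore corresponds to an ordinary, unprivileged ID on the host, and any ID outside the mapped range has no host equivalent at all.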

Since unprivileged containers are a security enhancement they naturally come with a few restrictions enforced by the kernel. In order to provide a fully functional unprivileged container LXC interacts with 3 pieces of setuid code:

  • lxc-user-nic (setuid helper to create a veth pair and bridge it on the host)
  • newuidmap (from the shadow package, sets up a uid map)
  • newgidmap (from the shadow package, sets up a gid map)

Everything else is run as your own user or as a uid which your user owns.

In general, LXC's goal is to make use of every security feature available in the kernel. This means LXC's configuration management will allow experienced users to intricately tune LXC to their needs.

A more detailed introduction into LXC security can be found under the following link

Removing all Privilege

In principle LXC can be run without any of these tools provided the correct configuration is applied. However, the usefulness of such containers is usually quite restricted. Just to highlight the two most common problems:

  1. Network: Without relying on a setuid helper to set up appropriate network devices for an unprivileged user (see LXC's lxc-user-nic binary) the only option is to share the network namespace with the host. Although this should be secure in principle, sharing the host's network namespace is still one step of isolation less and increases the attack surface. Furthermore, when host and container share the same network namespace the kernel will refuse any sysfs mounts. This usually means that the init binary inside of the container will not be able to boot up correctly.

  2. User Namespaces: As outlined above, user namespaces are a big security enhancement. However, without relying on privileged helpers, users who are unprivileged on the host are only permitted to map their own UID into a container. A standard POSIX system, however, requires 65536 UIDs and GIDs to be available to guarantee full functionality.

Configuration

LXC is configured via a simple set of keys. For example,

  • lxc.rootfs.path
  • lxc.mount.entry

LXC namespaces configuration keys by using single dots. This means complex configuration keys such as lxc.net.0 expose various subkeys such as lxc.net.0.type, lxc.net.0.link, lxc.net.0.ipv6.address, and others for even more fine-grained configuration.
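
To make the dotted-key idea concrete, here is a rough sketch that groups such keys into nested dictionaries. It is only an illustration of the naming scheme, not how LXC parses its configuration; the function nest_keys and the sample values (veth, lxcbr0, the rootfs path) are assumptions chosen for the example.

```python
def nest_keys(pairs):
    """Group dotted keys such as lxc.net.0.type into nested dictionaries."""
    tree = {}
    for key, value in pairs:
        node = tree
        *parents, leaf = key.split(".")
        for part in parents:
            # Walk down one level per dot, creating levels as needed.
            node = node.setdefault(part, {})
        node[leaf] = value
    return tree


# Hypothetical sample values, for illustration only.
config = nest_keys([
    ("lxc.net.0.type", "veth"),
    ("lxc.net.0.link", "lxcbr0"),
    ("lxc.rootfs.path", "/var/lib/lxc/c1/rootfs"),
])

print(config["lxc"]["net"]["0"])  # {'type': 'veth', 'link': 'lxcbr0'}
```

The sketch assumes that no key is used both as a value and as a parent of further subkeys; a real parser would have to handle that case.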

LXC is used as the default runtime for Incus, a container hypervisor exposing a well-designed and stable REST API on top of it.

Kernel Requirements

LXC runs on any kernel from 2.6.32 onwards. All it requires is a functional C compiler. LXC works on all architectures that provide the necessary kernel features. This includes (but isn't limited to):

  • i686
  • x86_64
  • ppc, ppc64, ppc64le
  • riscv64
  • s390x
  • armv7l, arm64
  • loongarch64

LXC also supports at least the following C standard libraries:

  • glibc
  • musl
  • bionic (Android's libc)

Backwards Compatibility

LXC has always focused on strong backwards compatibility. In fact, the API hasn't been broken from release 1.0.0 onwards. Main LXC is currently at version 4.*.*.

Reporting Security Issues

The LXC project has a good reputation for handling security issues quickly and efficiently. If you think you've found a potential security issue, please report it by e-mail to all of the following persons:

  • serge (at) hallyn (dot) com
  • stgraber (at) ubuntu (dot) com
  • brauner (at) kernel (dot) org

For further details please have a look at

Becoming Active in LXC development

We always welcome new contributors and are happy to provide guidance when necessary. LXC follows the kernel coding conventions. This means we only require that each commit includes a Signed-off-by line. The coding style we use is identical to the one used by the Linux kernel. You can find a detailed introduction at:

and should also take a look at the CONTRIBUTING file in this repo.

If you want to become more active it is usually also a good idea to show up in the LXC IRC channel #lxc-dev on irc.libera.chat. We try to do all development out in the open and discussion of new features or bugs is done either in appropriate GitHub issues or on IRC.

When thinking about making security critical contributions or substantial changes it is usually a good idea to ping the developers first and ask whether a PR would be accepted.

Semantic Versioning

LXC and its related projects strictly adhere to a semantic versioning scheme.

Downloading the current source code

Source for the latest released version can always be downloaded from

You can browse the up to the minute source code and change history online

Building LXC

Without considering distribution specific details a simple

meson setup -Dprefix=/usr build
meson compile -C build

is usually sufficient.

Getting help

When you find you need help, the LXC project provides you with several options.

Discuss Forum

We maintain a discuss forum at

where you can get support.

IRC

You can find us in #lxc on irc.libera.chat.

Mailing Lists

You can check out one of the two LXC mailing list archives and register if interested:

lxcfs's People

Contributors

3xx0, aither64, alexhudspith, asokoloski, blub, bmiklautz, brauner, cyphar, dasteihn, elianka, evgeni, fabian-gruenbichler, foxboron, gibmat, hallyn, hongbo-bd, hunter1016, hustcat, mihalicyn, peterrk, phanhuyn, simondeziel, sn0rt, sparlane, stgraber, tomponline, tssge, tych0, wavezhang, zhang2639

lxcfs's Issues

RE : Container isolation in LXC

Hello All,

This is a repost of conversation #40, which was closed by Hallyn before we could get any answers from you guys. Can someone look into this and give us a heads-up?

Let us jump in this conversation as we are facing the same issues alphaonex86 describes in this post.
We are a hosting provider with over 400 production websites running on our servers.

We made the deliberate choice to adopt LXC as our main virtualization technology for our nearly 69 physical servers because we needed a flexible, hardware- and OS-independent technology that can evolve along with the Debian distribution. We tested VMware, KVM, vServer and Proxmox technologies in the past, all of which had a blocking issue for us.

We were happy running LXC as our hypervisor until one of our customers pointed out this major flaw in the VM isolation mechanism. Rather than a long speech, let me give a graphical example of the problem we are facing.
Here is what htop shows, on the left-hand side from within the LXC VM, on the right-hand side on the host:

[Image: lxc_htops]

You can clearly see that the VM is not using any CPU/RAM at all: the processes are sorted by CPU usage and peak at 0.0% CPU/RAM. However, htop shows CPU #1 loaded at 96.7% because of an rsync task that runs at the host level, as you can see on the right side.

The problem is easy to see. As a hosting provider, we should be able to provide our customers with an isolated environment. A root user logged into a VM should not be able to access any of this information. This kind of information leak can lead to major risks, as a very skilled system engineer could access other VMs' processes or even host processes!

For instance, the /proc folder of the host is openly accessible from the VM and exposes all sensitive information about what is running on the server and how it is configured.

In our opinion, letting the VM access all raw information from the host and then filtering it inside the tools through patches is a dangerous path to follow.

Can you tell us if there is a plan to adapt the LXC technology to a more secure and isolated model, such as the one VMware or QEMU offers? Correcting this through a Debian patch would not be a suitable solution for us, as we are running different versions from Debian 6 to Debian 8 on our servers. An LXC patch would be best.

Best regards,

Dash read built-in behavior change with 0.9 vs 0.7

In 0.7, when running the command: read b c < /proc/uptime
the variables $b and $c are populated with the appropriate data. In 0.9 they do not get data. Running strace on lxcfs outside the container shows that in 0.7 data is sent to them, while in 0.9 they are sent /0 instead.

When I test using bash (same command above) or using cat /proc/uptime as a placebo check they both act properly in 0.7 and 0.9 both giving data for uptime.

I'm putting this ticket in at Serge's request after posting on the mailing list. Per his note it appears to be related to the implementation of direct_io which was done before the 0.8 release.

7253e0a

An strace on cat is trivial but one on dash is not (read is a builtin and strace won't trace builtins), so I did that one for comparison's sake.

Method used
stty -echo
cat | strace dash > /dev/null
Once that is done I had to type the command I wanted to strace blind when it paused waiting for input. I enabled echo after it was all done.

https://pastebin.com/5T1xbHDH

cpuset.cpus with non-consecutive cores have wrong messages

Ubuntu 14.04.2, unprivileged container configured.

lxc-create -t download -n test -- -d ubuntu -r trusty -a amd64

lxc 1.1.2 and lxcfs 0.7
(all from lxd-git-master_1_1 PPA repository)

I set up the cores in the config:

lxc.cgroup.cpuset.cpus = 0,1

or in shell with lxc-cgroup

lxc-cgroup -n base cpuset.cpus 0,1

test with stress command:

stress --cpu 4 --timeout 10s

This shows on the host that only two cores (0 and 1) are used.

Now if I change cpuset.cpus to non-consecutive cores:

lxc.cgroup.cpuset.cpus = 0,2

or

lxc-cgroup -n base cpuset.cpus 0,2

With the same stress command, two cores are used, as htop on the host confirms. In the container, however, htop, top and less /proc/cpuinfo show only one active core when the cores are non-consecutive.
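
For reference, the expected behaviour can be sketched as follows: a cpuset string is a comma-separated list of single CPUs and ranges, and a value such as 0,2 should yield two visible cores. This is an illustrative Python sketch with a hypothetical helper name (parse_cpuset), not the lxcfs implementation.

```python
def parse_cpuset(spec):
    """Expand a cpuset string such as '0,2' or '0-3,8' into a sorted CPU list."""
    cpus = set()
    for chunk in spec.strip().split(","):
        if not chunk:
            continue
        if "-" in chunk:
            # A range such as '0-3' covers both endpoints.
            first, last = chunk.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(chunk))
    return sorted(cpus)


print(parse_cpuset("0,1"))    # [0, 1]
print(parse_cpuset("0,2"))    # [0, 2]
print(parse_cpuset("0-3,8"))  # [0, 1, 2, 3, 8]
```

Checking each host CPU for membership in this list, rather than assuming the allowed CPUs form one contiguous range, gives two visible cores for both 0,1 and 0,2.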

Cache an 'fd' at open

We are duplicating work at read/write that was done at open. So implement a proper cache of 'open fds'.

guest_nice CPU stats missing in /proc/stat

/proc/stat in a container has 1 column fewer on CPU entries than on the host under Linux 3.13. It looks like (since Linux 2.6.33), that column is guest_nice.

Looking at lxcfs.c:proc_stat_read(), it looks like guest_nice isn't included under any codepath.

Impact I've noticed from this is running top in a container fails with "top: failed /proc/stat read". I'm guessing this is related to guest_nice missing.
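
As an illustration of the column layout, the sketch below parses one CPU line of /proc/stat into named fields and pads columns that are missing. The field order follows proc(5); the helper name parse_cpu_line and the sample line are assumptions made for this example, and the code is not taken from lxcfs.

```python
# Column order of the cpu lines in /proc/stat, as documented in proc(5).
CPU_FIELDS = (
    "user", "nice", "system", "idle", "iowait",
    "irq", "softirq", "steal", "guest", "guest_nice",
)


def parse_cpu_line(line):
    """Return (cpu name, {field: value}), padding missing columns with 0."""
    name, *values = line.split()
    numbers = [int(v) for v in values]
    numbers += [0] * (len(CPU_FIELDS) - len(numbers))
    return name, dict(zip(CPU_FIELDS, numbers))


# Hypothetical line with only nine columns, i.e. guest_nice is missing.
name, stats = parse_cpu_line("cpu0 100 0 50 1000 5 0 1 0 0")
print(name, stats["idle"], stats["guest_nice"])  # cpu0 1000 0
```

A tolerant reader like this copes with the missing column, but tools that expect the full layout may not, which would fit the top failure described above.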

/var/lib/lxcfs paths are not set correctly in lxc.mount.hook during ./configure

On the lxcfs portion of the lxc homepage it states that one should run:

git clone git://github.com/lxc/lxcfs
cd lxcfs
./bootstrap.sh
./configure
make
sudo mkdir -p /var/lib/lxcfs
sudo ./lxcfs -s -f -o allow_other /var/lib/lxcfs/

However, when running ./configure the paths in lxc.mount.hooks are either prefixed with the default /usr/local (e.g. in lxc.mount.hooks: /usr/local/var/lib/lxcfs/proc) or when specifying the prefix in the ./configure stage as in:

./configure --prefix=/usr

They get set to /usr/var/lib/proc which does not even exist. Maybe it would be good to mention that ./configure needs to be run as:

./configure --prefix=/usr --localstatedir=/var

Potential bug

Hello, I tested with 99 CPUs, but it failed with "truncated write to cache". Does this mean that with 12, 24 or 36 cores lxcfs will hit a bug because of the size of cpuinfo?

The patch:

> diff -U 3 -H -d -r -N -- lxcfs-master-org/lxcfs.c lxcfs-master/lxcfs.c
> --- lxcfs-master-org/lxcfs.c  2015-12-07 19:08:04.000000000 +0100
> +++ lxcfs-master/lxcfs.c  2015-12-09 07:35:27.673636198 +0100
> @@ -33,6 +33,8 @@
>  #include <sys/mount.h>
>  #include <wait.h>
>  
> +#define FAKECPUCOUNT 99
> +
>  #ifdef FORTRAVIS
>  #define GLIB_DISABLE_DEPRECATION_WARNINGS
>  #include <glib-object.h>
> @@ -2132,12 +2134,17 @@
>   if (!f)
>       goto err;
>  
> +    do
> +    {
> +
>   while (getline(&line, &linelen, f) != -1) {
>       size_t l;
>       if (is_processor_line(line)) {
>           am_printing = cpuline_in_cpuset(line, cpuset);
>           if (am_printing) {
>               curcpu ++;
> +                if((curcpu+1)>=FAKECPUCOUNT)
> +                    break;
>               l = snprintf(cache, cache_size, "processor  : %d\n", curcpu);
>               if (l < 0) {
>                   perror("Error writing to cache");
> @@ -2186,6 +2193,11 @@
>           }
>       }
>   }
> + 
> + if(fseek(f,0,SEEK_SET)!=0)
> +        break;
> +    }
> + while((curcpu+1)<FAKECPUCOUNT);
>  
>   d->cached = 1;
>   d->size = total_len;
> 

/proc/meminfo overlay shows host value for sub-cgroups

Due to the way the memory controller works, a sub-cgroup shows a -1 limit and host values even if one of its parents set a limit.

The actual limit is still being enforced by the kernel but lxcfs doesn't report it in the meminfo overlay, making the overlay pretty much useless inside any container that actively uses cgroups (such as any container using systemd).

This means any piece of software running under systemd and using meminfo to figure out how much memory it can use will fail pretty badly when it tries to allocate more than it can use.

This also provides a pretty weird user experience in that users of lxc-attach see the right limit but someone sshing in will see the host value.

As far as lxcfs bugs go, I consider this to be a pretty important issue as the meminfo overlay is one of the most useful properties of lxcfs (along with the cpuinfo overlay which does work properly in this case).
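
One way to picture the expected behaviour is to walk from the sub-cgroup up through its ancestors and report the smallest limit found. The sketch below does this over a plain dictionary. The function name effective_limit, the cgroup paths and the values are hypothetical, and -1 is used as the "no limit" marker only because that is how the report above describes it.

```python
import posixpath


def effective_limit(cgroup, limits):
    """Smallest limit set on cgroup or any of its ancestors, None if unlimited.

    limits maps absolute cgroup paths to a byte limit, with -1 meaning
    that no limit is set at that level.
    """
    best = None
    path = cgroup
    while True:
        limit = limits.get(path, -1)
        if limit >= 0 and (best is None or limit < best):
            best = limit
        if path in ("/", ""):
            return best
        path = posixpath.dirname(path)


# Hypothetical hierarchy: the container has a 1 GiB limit, its systemd
# sub-cgroup has none of its own.
limits = {
    "/": -1,
    "/lxc/c1": 1 << 30,
    "/lxc/c1/system.slice": -1,
}

print(effective_limit("/lxc/c1/system.slice", limits))  # 1073741824
print(effective_limit("/", limits))                     # None
```

With a lookup like this, a process in the sub-cgroup would see the 1 GiB limit that the kernel actually enforces instead of the host value.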

lxcfs not able to boot with systemd 218 or at least hanging

I created a privileged archlinux container and used @serge's uidmapshift tool to convert it to an unprivileged container and moved it from /var/lib/lxc/arch/ to ~/.local/share/lxc/arch. I have lxc 1.1, cgmanager 0.35 and lxcfs 0.5. lxcfs was downloaded from git and compiled and installed with

./bootstrap.sh
./configure
make

installed with sudo make install to /usr/local/share/lxcfs. I then ran it with:

sudo mkdir -p /usr/local/var/lib/lxcfs
sudo lxcfs -s -f -o allow_other /usr/local/var/lib/lxcfs/

The permissions for /usr/local/var/lib/lxcfs are root:root; 00-lxcfs.conf and lxcfs.mount.hook are both executable.

The container used the following config file (leaving out the network section):

# Template used to create this container: /usr/share/lxc/templates/lxc-archlinux
# Parameters passed to the template:
# For additional config options, please look at lxc.container.conf(5)

# Distribution configuration
lxc.include = /usr/share/lxc/config/archlinux.common.conf
lxc.include = /usr/share/lxc/config/archlinux.userns.conf
lxc.include = /usr/local/share/lxc/config/common.conf.d/00-lxcfs.conf
lxc.arch = x86_64
lxc.autodev = 1
lxc.kmsg = 0

# Container specific configuration
lxc.id_map = u 0 100000 65536
lxc.id_map = g 0 100000 65536
lxc.rootfs = /home/chb/.local/share/lxc/arch/rootfs
lxc.utsname = arch

When I boot the container it takes a really long time to switch from

Welcome to Arch Linux!
Set hostname to <arch>

it then fails to mount the fuse control file system and the huge pages file system:

[FAILED] Failed to mount FUSE Control File System.
See "systemctl status sys-fs-fuse-connections.mount" for details.
[FAILED] Failed to mount Huge Pages File System.
See "systemctl status dev-hugepages.mount" for details.

It also takes a long time when starting dbus and permanently hangs at Started Permit User Sessions and never goes any further. If I log into the container with lxc-attach -n arch I get a shell.

I tried booting a systemd-based unprivileged Debian Jessie container. This works! Although it also takes a long time to boot, hanging at:

Welcome to Debian Jessie!
Set hostname to <jessie>

But once it is past this point it works, and I am presented with a login screen and can log in. So it seems that systemd 218 is not working even with lxcfs installed. Is there a workaround or am I doing something wrong?

cgmanager on centos7

Has anyone tried compiling cgmanager on CentOS 7?
I compiled it successfully; however, the service fails to start on CentOS 7.

Please tag 0.12

0.12 was announced on 11/17, but it looks like a tag hasn't been pushed. Thank you!

lxcfs from lxc ppa (daily, stable) is broken

The hosts are mainly running ubuntu trusty (lts) and are up to date.
LXC packages come from the lxc PPA (both stable & daily) and are up to date.

This looked at first a bit like #24.

The symptom:

The first symptom is that Ubuntu vivid (systemd) based containers randomly run into problems shortly after they start: /proc gets broken (userland says proc is not mounted, and running "/bin/ls" in there produces a FUSE error: transport is not connected). This seems tied to lxcfs?
Upstart-based containers are working.

-- edit:
The problem also happens with the daily PPA.

Cache uptime

We should cache uptime contents in order to reduce the amount of work on the host when top (for instance) is running.
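
A time-based cache is one simple way to do this. The sketch below is a generic illustration in Python rather than lxcfs code; the class name TimedCache, the one-second lifetime and the read_uptime loader are assumptions made for the example.

```python
import time


class TimedCache:
    """Keep a loaded value for ttl_seconds before loading it again."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._has_value = False
        self._value = None
        self._loaded_at = 0.0

    def get(self):
        now = self._clock()
        if not self._has_value or now - self._loaded_at >= self._ttl:
            # Cache is empty or stale: do the expensive work once.
            self._value = self._loader()
            self._loaded_at = now
            self._has_value = True
        return self._value


def read_uptime():
    """Hypothetical loader: read the host's uptime file."""
    with open("/proc/uptime") as f:
        return f.read()


# Repeated get() calls within one second reuse the same contents.
uptime_cache = TimedCache(read_uptime, ttl_seconds=1.0)
```

The lifetime is a trade-off: a longer value saves more work on the host but makes the reported uptime lag further behind.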

Patch to hide loadaverage

Hello, I'm developing a patch to hide the load average, but it loops. Do you see why?
diff -U 3 -H -d -r -N -- lxcfs-master-org/lxcfs.c lxcfs-master/lxcfs.c
--- lxcfs-master-org/lxcfs.c 2015-12-07 19:08:04.000000000 +0100
+++ lxcfs-master/lxcfs.c 2015-12-27 20:50:53.000000000 +0100
@@ -49,6 +49,7 @@
     LXC_TYPE_PROC_UPTIME,
     LXC_TYPE_PROC_STAT,
     LXC_TYPE_PROC_DISKSTATS,
+    LXC_TYPE_PROC_LOADAVG,
 };

 struct file_info {
@@ -2791,6 +2842,21 @@
     return rv;
 }

+static int proc_loadavg_read(char *buf, size_t size, off_t offset,
+        struct fuse_file_info *fi)
+{
+    struct file_info *d = (struct file_info *)fi->fh;
+    char *cache = d->buf;
+    size_t cache_size = d->buflen;
+    int total_len = snprintf(d->buf, d->size, "0.01 0.01 0.01 1/99 99");
+    if (total_len < 0){
+        perror("Error writing to cache");
+        return 0;
+    }
+    if (total_len >= cache_size) {
+        fprintf(stderr, "Internal error: truncated write to cache\n");
+        return 0;
+    }
+    cache_size -= total_len;
+    d->cached = 1;
+    d->size = total_len;
+    return total_len;
+}
+
 static off_t get_procfile_size(const char *which)
 {
     FILE *f = fopen(which, "r");
@@ -2826,7 +2883,8 @@
             strcmp(path, "/proc/cpuinfo") == 0 ||
             strcmp(path, "/proc/uptime") == 0 ||
             strcmp(path, "/proc/stat") == 0 ||
-            strcmp(path, "/proc/diskstats") == 0) {
+            strcmp(path, "/proc/diskstats") == 0 ||
+            strcmp(path, "/proc/loadavg") == 0) {
         sb->st_size = 0;
         sb->st_mode = S_IFREG | 00444;
         sb->st_nlink = 1;
@@ -2843,7 +2909,8 @@
             filler(buf, "meminfo", NULL, 0) != 0 ||
             filler(buf, "stat", NULL, 0) != 0 ||
             filler(buf, "uptime", NULL, 0) != 0 ||
-            filler(buf, "diskstats", NULL, 0) != 0)
+            filler(buf, "diskstats", NULL, 0) != 0 ||
+            filler(buf, "loadavg", NULL, 0) != 0)
         return -EINVAL;

     return 0;
 }
@@ -2861,8 +2928,10 @@
         type = LXC_TYPE_PROC_UPTIME;
     else if (strcmp(path, "/proc/stat") == 0)
         type = LXC_TYPE_PROC_STAT;
-    else if (strcmp(path, "/proc/diskstats") == 0)
-        type = LXC_TYPE_PROC_DISKSTATS;
+    else if (strcmp(path, "/proc/diskstats") == 0)
+        type = LXC_TYPE_PROC_DISKSTATS;
+    else if (strcmp(path, "/proc/loadavg") == 0)
+        type = LXC_TYPE_PROC_LOADAVG;

     if (type == -1)
         return -ENOENT;

@@ -2907,8 +2976,10 @@
         return proc_uptime_read(buf, size, offset, fi);
     case LXC_TYPE_PROC_STAT:
         return proc_stat_read(buf, size, offset, fi);
-    case LXC_TYPE_PROC_DISKSTATS:
-        return proc_diskstats_read(buf, size, offset, fi);
+    case LXC_TYPE_PROC_DISKSTATS:
+        return proc_diskstats_read(buf, size, offset, fi);
+    case LXC_TYPE_PROC_LOADAVG:
+        return proc_loadavg_read(buf, size, offset, fi);
     default:
         return -EINVAL;
     }
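
A likely reason for the loop, judging only from the patch text, is that proc_loadavg_read ignores offset and returns the full length on every call (and never copies anything into buf), so a reader such as cat never receives the zero-length read that signals end of file. The general technique of serving reads at an offset can be sketched as follows; this is an illustrative Python model with hypothetical names (serve_read, read_all), not lxcfs or FUSE code.

```python
def serve_read(content, size, offset):
    """Return at most size bytes of content starting at offset.

    An empty result tells the caller that the end of the file was reached.
    """
    if offset >= len(content):
        return b""
    return content[offset:offset + size]


def read_all(content, chunk_size):
    """Model a reader that keeps asking for chunks until it sees end of file."""
    collected = bytearray()
    offset = 0
    while True:
        piece = serve_read(content, chunk_size, offset)
        if not piece:
            break
        collected += piece
        offset += len(piece)
    return bytes(collected)


fake_loadavg = b"0.01 0.01 0.01 1/99 99\n"
print(read_all(fake_loadavg, 4) == fake_loadavg)  # True
```

A handler that always returned the whole content regardless of offset would keep the loop in read_all running forever, which matches the symptom described.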

loadavg and swap

Two additional files that should be considered for lxcfs:

/proc/loadavg
/proc/swaps

Both should be relatively easy to generate using cgroups and the existing information in lxcfs.

cat /proc/loadavg

0.00 0.01 0.05 1/273 16306

loadavg consists of the 1-minute, 5-minute and 15-minute load averages, the number of running processes / total processes, and the most recently created PID.

My idea was to poll the system every second and keep a database of the last 900 polls (15 minutes) consisting of the number of running processes. This would allow you to generate averages as needed.
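
The polling idea can be sketched with a bounded history of samples. This is only an illustration of the proposal: the class name RunnableHistory is hypothetical, the caller chooses how many samples to keep, and the result is a plain mean over a window, whereas the kernel's own load average is an exponentially damped average.

```python
from collections import deque


class RunnableHistory:
    """Keep the most recent samples of the number of running processes."""

    def __init__(self, max_samples):
        # Older samples are dropped automatically once the deque is full.
        self._samples = deque(maxlen=max_samples)

    def record(self, running):
        self._samples.append(running)

    def average(self, last_n):
        """Mean of the last last_n samples (last_n must be positive)."""
        window = list(self._samples)[-last_n:]
        return sum(window) / len(window) if window else 0.0


history = RunnableHistory(max_samples=5)
for running in (1, 2, 3, 4, 5, 6, 7):
    history.record(running)

print(history.average(2))  # 6.5
print(history.average(5))  # 5.0
```

With one sample per second, average(60) would approximate a one-minute figure, and longer windows work the same way as long as max_samples is large enough.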

cat /proc/swaps

Filename    Type       Size      Used   Priority
/dev/sdb1   partition  12582908  49188  -1

I'd assume you can generate a fake partition entry with integer values from cgroup/memory.

The size would be:
(memory.memsw.limit_in_bytes - memory.limit_in_bytes) / 1024

The usage can be derived from the 'swap' row in memory.stat divided by 1024 (since /proc/swaps reports in KB)
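
The arithmetic above can be written down as a short sketch. The function name swap_kib and the sample numbers are assumptions for illustration; the clamp to zero is an addition of this sketch, so that a memory+swap limit smaller than the memory limit cannot produce a negative size.

```python
def swap_kib(memsw_limit_bytes, mem_limit_bytes, swap_used_bytes):
    """Return (total, used) swap in KiB derived from cgroup memory values."""
    # Swap size is the part of the memory+swap limit beyond the memory limit.
    total = max(memsw_limit_bytes - mem_limit_bytes, 0) // 1024
    used = swap_used_bytes // 1024
    return total, used


# Hypothetical limits: 2 GiB memory+swap, 1 GiB memory, 50 MiB swapped out.
total, used = swap_kib(2 << 30, 1 << 30, 50 * 1024 * 1024)
print(total, used)  # 1048576 51200
```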

I'm not a C programmer so I can't assist with the code, but I'd be happy to help in ways that I can.

lxcfs is readonly?

Hi guys, does lxcfs simulate the whole /proc? For example, can we provide writable files, such as /proc/sys/kernel/core_pattern, per container via lxcfs?

More DOC

Hello, I am trying to use a container without systemd. Can you make the README more verbose and explain what to test if it does not work? cgmanager and lxcfs -s -f -o allow_other /var/lib/lxcfs are working, with lxc 1.1.
Cheers,

Blocked at:
cubieboard2-6 ~ # lxc-start -n xxhash -F
INIT: version 2.88 booting

OpenRC 0.17 is starting up Gentoo Linux (armv7l)

 * Mounting /proc ...mount: permission denied
 [ !! ]
 * getmntinfo: No such file or directory
 * Mounting /run ...mount: permission denied
 * Unable to mount tmpfs on /run.
 * Can't continue.
 * Caching service dependencies ... [ ok ]
devfs | * sysfs |grep: /proc/filesystems: No such file or directory
getmntinfo: No such file or directory
sysfs | * getmntinfo: No such file or directory
devfs |grep: /proc/filesystems: No such file or directory
sysfs | * devfs |grep: getmntinfo: No such file or directory
/proc/filesystems: No such file or directory
devfs | * This kernel does not have devtmpfs or tmpfs support, and there
sysfs | * getmntinfo: No such file or directory
devfs | * is no entry for /dev in fstab.
sysfs | * devfs | * This means /dev will not be mounted.
getmntinfo: No such file or directory
devfs | * To avoid this message, set CONFIG_DEVTMPFS or CONFIG_TMPFS to y
devfs | * in your kernel configuration or see /etc/conf.d/devfs
devfs |grep: /proc/filesystems: No such file or directory
devfs |grep: /proc/filesystems: No such file or directory
devfs |grep: /proc/filesystems: No such file or directory
tmpfiles.dev | * Setting up tmpfiles.d entries for /dev ... [ ok ]
udev | * CONFIG_DEVTMPFS=y is required in your kernel configuration
udev | * for this version of udev to run successfully.
udev | * This requires immediate attention.
udev | * getmntinfo: No such file or directory
udev |mount: permission denied
udev |mkdir: cannot create directory '/dev/pts': File exists
udev | * ERROR: udev failed to start
sysctl | * Configuring kernel parameters ... [ ok ]
loopback | * Bringing up network interface lo ...fsck | * Checking local filesystems ...RTNETLINK answers: File exists
loopback | [ ok ]
ext2fs_check_if_mount: Can't check if filesystem is mounted due to missing mtab file while determining whether /dev/BOOT is mounted.
fsck |fsck.ext2: No such file or directory while trying to open /dev/BOOT
fsck |Possibly non-existent device?
fsck | * Operational error
fsck | [ !! ]
localmount | * getmntinfo: No such file or directory
localmount | * Mounting local filesystems ...root | * Remounting filesystems ... [ ok ]
 * getmntinfo: No such file or directory
root | * getmntinfo: No such file or directory
root | * getmntinfo: No such file or directory
root | [ ok ]
tmpfiles.setup | * Setting up tmpfiles.d entries ...ip6tables | * Loading ip6tables state and starting firewall ...iptables | * Loading iptables state and starting firewall ... [ ok ]
[ ok ]
bootmisc |ln: failed to create symbolic link '/var/lock': File exists
[ ok ]
bootmisc |mount: permission denied
bootmisc |umount: /tmp/tmp.CbYM7EbSac: must be superuser to unmount
bootmisc | * binfmt | * Loading custom binary format handlers ...Could not clean up underlying /run on /
bootmisc | * Creating user login records ... [ ok ]
[ ok ]
bootmisc | * getmntinfo: No such file or directory
bootmisc | * Cleaning /var/run ... [ ok ]
bootmisc | * getmntinfo: No such file or directory
bootmisc | * Wiping /tmp directory ... [ ok ]
net.lo | * Bringing up interface lo
net.lo | * Caching network module dependencies
net.lo | * 127.0.0.1/8 ... [ ok ]
net.lo | * Adding routes
net.lo | * 127.0.0.0/8 via 127.0.0.1 ... [ ok ]
sshd | * Starting sshd ... [ ok ]
INIT: Entering runlevel: 3
local | * Starting local ... [ ok ]
INIT: no more processes left in this runlevel

Mount lxcfs in direct_io mode by default

Currently lxcfs mounts with the FS cache enabled by default. As a result, any file size request (getattr call) triggers a read of the whole file to calculate its actual size (because of file cache restrictions). That is wasteful for the auto-generated content of procfs/sysfs, whose file size does not reflect the space consumed by any real data set. So maybe mount with direct_io forced and return zero as the size in the getattr call (as procfs itself does)?

lxcfs crashes on container start with double free

On a 14.04.2 host and container, catting /proc/cpuinfo inside the container fails with a FUSE error:

root@openstack-single-mmccrack:/# cat /proc/cpuinfo 
cat: /proc/cpuinfo: Transport endpoint is not connected

/var/log/upstart/lxcfs.log has the following entries, repeated once each time I started the container:

fuse: read too many bytes
fuse: writing device: Invalid argument
fuse: read too many bytes
fuse: writing device: Invalid argument
*** Error in `/usr/bin/lxcfs': double free or corruption (!prev): 0x00007f04965bc120 ***

init scripts are only in ubuntu package

It seems that the init scripts are only in http://github.com/lxc/lxcfs-pkg-ubuntu/, and really they should exist in the upstream package just like they do in cgmanager, for those of us building from source. It looks like just these need to be brought over:

debian/lxcfs.init
debian/lxcfs.service
debian/lxcfs.upstart

I don't know if the whole --with-init-script stuff needs to be ported over so they can be part of install target.

LXCFS: crash and Transport endpoint is not connected

Hi,

LXCFS often crashes, and in my container I then get: Transport endpoint is not connected: '/proc/cpuinfo'

I have this bug with a Debian Jessie container, but not with an Ubuntu Trusty container.

LXCFS log:

sudo cat /var/log/upstart/lxcfs.log
hierarchies: 2: cpuset
 3: cpu
 4: cpuacct
 5: memory
 6: devices
 7: freezer
 8: blkio
 9: hugetlb
 10: perf_event
 11: name=systemd
send_creds: failed at sendmsg: No such process
send_creds: failed at sendmsg: No such process
send_creds: failed at sendmsg: No such process
send_creds: Error getting reply from server over socketpair
hierarchies: 2: cpuset
 3: cpu
 4: cpuacct
 5: memory
 6: devices
 7: freezer
 8: blkio
 9: hugetlb
 10: perf_event
 11: name=systemd
send_creds: Error getting reply from server over socketpair
hierarchies: 2: cpuset
 3: cpu
 4: blkio
 5: cpuacct
 6: devices
 7: freezer
 8: hugetlb
 9: memory
 10: perf_event
 11: name=systemd
send_creds: failed at sendmsg: No such process
send_creds: failed at sendmsg: No such process
send_creds: Error getting reply from server over socketpair
Timed out waiting for scm_cred: No such file or directory
send_creds: failed at sendmsg: No such process
send_creds: failed at sendmsg: No such process
send_creds: Error getting reply from server over socketpair
Timed out waiting for scm_cred: No such file or directory
send_creds: failed at sendmsg: No such process
send_creds: Error getting reply from server over socketpair
do_read_pids: failed to ask child to exit: No such process
Timed out waiting for scm_cred: No such file or directory
send_creds: failed at sendmsg: No such process
send_creds: Error getting reply from server over socketpair
Timed out waiting for scm_cred: No such file or directory
send_creds: failed at sendmsg: No such process
send_creds: Error getting reply from server over socketpair
Timed out waiting for scm_cred: No such file or directory
send_creds: failed at sendmsg: No such process
send_creds: Error getting reply from server over socketpair
Timed out waiting for scm_cred: No such file or directory
send_creds: failed at sendmsg: No such process
send_creds: failed at sendmsg: No such process
send_creds: failed at sendmsg: No such process
send_creds: Error getting reply from server over socketpair
Timed out waiting for scm_cred: No such file or directory
send_creds: Error getting reply from server over socketpair
send_creds: failed at sendmsg: No such process
send_creds: failed at sendmsg: No such process
send_creds: failed at sendmsg: No such process
send_creds: failed at sendmsg: No such process
Timed out waiting for scm_cred: Success
send_creds: Error getting reply from server over socketpair
hierarchies: 2: cpuset
 3: cpu
 4: cpuacct
 5: memory
 6: devices
 7: freezer
 8: blkio
 9: perf_event
 10: hugetlb
 11: name=systemd
*** Error in `/usr/bin/lxcfs': double free or corruption (out): 0x00007f58b40009d0 ***
hierarchies: 2: cpuset
 3: cpu
 4: cpuacct
 5: memory
 6: devices
 7: freezer
 8: blkio
 9: perf_event
 10: hugetlb
 11: name=systemd
*** Error in `/usr/bin/lxcfs': munmap_chunk(): invalid pointer: 0x00007f55700010b0 ***
hierarchies: 2: cpuset
 3: cpu
 4: cpuacct
 5: memory
 6: devices
 7: freezer
 8: blkio
 9: perf_event
 10: hugetlb
 11: name=systemd

Versions:

$ lxcfs --version
0.17

$ lxd --version
2.0.0.beta1

$ uname -a
Linux lxc-integcontinue 3.13.0-77-generic #121-Ubuntu SMP  x86_64 x86_64 x86_64 GNU/Linux

Thanks,

Weird swap info on lxcfs 0.12

Seen with lxc 1.1.4 or 1.1.5 and lxcfs 0.12 on Ubuntu 14.04.

free -m output on lxcfs 0.11:

             total       used       free     shared    buffers     cached
Mem:           992        326        666          0          0        155
-/+ buffers/cache:        170        822
Swap:            0          0          0

free -m output on lxcfs 0.12:

             total       used       free     shared    buffers     cached
Mem:           992        343        648          0          0        156
-/+ buffers/cache:        187        804
Swap: 17592186043423          0 17592186043423

lxcfs with systemd containers causes containers to hang after a while

  • Create vivid container with systemd
# lxc-create -n c1 -- -d ubuntu -r vivid -a amd64
# lxc-attach -n c1 apt-get update
# lxc-attach -n c1 apt-get install systemd-sysv
# lxc-stop -n c1
  • Start-stop container repeatedly
# n=0;while true;do n=$((n+1));echo "$(date) -- Try #$n";sleep 1;lxc-start -n c1 && sleep 10 && lxc-attach -n c1 -- poweroff && lxc-console -n c1 -t console;done

Eventually the container hangs. This is the result of my latest test:

Thu Mar 26 14:50:03 WIB 2015 -- Try #12
Failed to start poweroff.target: Connection timed out

Connected to tty 0
Type <Ctrl+a q> to exit the console, <Ctrl+a Ctrl+a> to enter Ctrl+a itself

At this point, any further lxc-attach -n c1 command from the host hangs as well.

I ran this test with lxcfs started using strace -f /usr/bin/lxcfs -s -f -o allow_other /var/lib/lxcfs 2>&1 | tee /data/lxcfs.log, and the last lines of output are:

clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fe18bdcfa10) = 573
Process 573 attached
[pid  3156] select(5, [4], NULL, NULL, {2, 0} <unfinished ...>
[pid   573] set_robust_list(0x7fe18bdcfa20, 24) = 0
[pid   573] open("/proc/31766/ns/pid", O_RDONLY) = 6
[pid   573] setns(6, 0)                 = 0
[pid   573] close(6)                    = 0
[pid   573] clone(Process 574 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fe18bdcfa10) = 574
[pid   573] wait4(574,  <unfinished ...>
[pid   574] futex(0x7fe18aee2760, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  3156] <... select resumed> )      = 0 (Timeout)
[pid  3156] recvfrom(4, 0x7fffcc209c4f, 1, 64, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
[pid  3156] write(2, "send_creds: Error getting reply "..., 60send_creds: Error getting reply from server over socketpair
) = 60
[pid  3156] kill(573, SIGTERM)          = 0
[pid   573] <... wait4 resumed> 0x7fffcc209c94, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid  3156] wait4(573,  <unfinished ...>
[pid   573] --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=3156, si_uid=0} ---
[pid   573] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid   573] wait4(574, 

PIDs 573 and 574 are lxcfs processes. From the output above, it looks like 573 is waiting for 574, while 574 is stuck on:
[pid 574] futex(0x7fe18aee2760, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>

When I run strace -f lxc-attach -n c1 2>&1 | tee /data/lxc-attach.log, the last lines of output are:

[pid   752] getuid()                    = 0
[pid   752] getgid()                    = 0
[pid   752] geteuid()                   = 0
[pid   752] getegid()                   = 0
[pid   752] rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
[pid   752] ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
[pid   752] ioctl(2, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7fffa8acde60) = -1 ENOTTY (Inappropriate ioctl for device)
[pid   752] brk(0x1f78000)              = 0x1f78000
[pid   752] open("/proc/meminfo", O_RDONLY|O_CLOEXEC

So here is what I gather so far:

  • lxcfs is stuck on some futex (I don't know which one exactly)
  • any other process that accesses lxcfs (including lxc-attach, which in the case above simply reads /proc/meminfo) will hang

Swap accounting looks busted

root@blah:~# free -m
             total       used       free     shared    buffers     cached
Mem:         11912         58      11853         31          0         49
-/+ buffers/cache:          9      11903
Swap:   8796093010295          0 8796093010295
root@blah:~# 

The host has 6 GB of swap; the container has no swap limit.
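
The Swap figure is consistent with an "unlimited" combined memory+swap limit from which the memory size has been subtracted. A minimal sketch of that arithmetic, assuming (the report does not state this) that an unset limit is represented as 2^63 - 1 bytes and that the memory size is exactly 11912 MiB; the function name is hypothetical:

```python
MIB = 1024 * 1024

# Assumption: an unset memory+swap limit reads as the largest signed 64-bit value.
UNLIMITED = 2**63 - 1


def swap_total_mib(memsw_limit_bytes, mem_limit_bytes):
    """Swap total derived as (memory+swap limit) - (memory limit), in MiB."""
    return (memsw_limit_bytes - mem_limit_bytes) // MIB


# Hypothetical inputs chosen to match the free -m output above.
print(swap_total_mib(UNLIMITED, 11912 * MIB))  # 8796093010295
```

If this is what happens, clamping the result to the host's real swap size would avoid the absurd value.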

Show only the cpu of the container

Hello,

As with QEMU emulation, it would be more logical to display the CPU as 100% idle when the current VM is idle, even if that CPU is kept busy by another VM.
The current behaviour leaks information, which is bad for confidentiality.

Cheers,

lxcfs not working with systemd based unprivileged containers

Hello,

I tried to get unprivileged systemd-based containers to work in LXC and asked a couple of questions on the lxc-users mailing list about this. See:

https://lists.linuxcontainers.org/pipermail/lxc-users/2015-January/008344.html

and

https://lists.linuxcontainers.org/pipermail/lxc-users/2014-December/008155.html.

I was just pointed to lxcfs and tried to get it working with an unprivileged systemd-based Debian Jessie container. I downloaded the lxcfs source and ran:

./configure && make && sudo make install

which installed it to /usr/local/share/lxcfs.

Then I did:

sudo mkdir -p /usr/local/var/lib/lxcfs

and

sudo lxcfs -s -f -o allow_other /usr/local/var/lib/lxcfs

and put

lxc.include = /usr/local/share/lxcfs/00-lxcfs.conf

in the config file of my Debian Jessie container, after creating the file and making it executable.
However, even with LXC 1.1, I get the following errors in the log produced by:

lxc-start -n jessie -F -l DEBUG -o AAA

 lxc-start 1422277724.347 INFO     lxc_conf - conf.c:run_script_argv:344 - Executing script '@LXCFSSHAREDIR@/lxc.mount.hook' for container 'jessie', config section 'lxc'
 lxc-start 1422277724.352 ERROR    lxc_conf - conf.c:run_buffer:324 - Script exited with status 127
 lxc-start 1422277724.352 ERROR    lxc_conf - conf.c:lxc_setup:3772 - failed to run mount hooks for container 'jessie'.
 lxc-start 1422277724.352 ERROR    lxc_start - start.c:do_start:699 - failed to setup the container
 lxc-start 1422277724.352 ERROR    lxc_sync - sync.c:__sync_wait:51 - invalid sequence number 1. expected 2
 lxc-start 1422277724.417 ERROR    lxc_start - start.c:__lxc_start:1099 - failed to spawn 'jessie'
 lxc-start 1422277724.420 ERROR    lxc_start_ui - lxc_start.c:main:344 - The container failed to start.
 lxc-start 1422277724.420 ERROR    lxc_start_ui - lxc_start.c:main:348 - Additional information can be obtained by setting the --logfile and --logpriority options.
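
The first log line shows the hook path still containing the literal text @LXCFSSHAREDIR@, which suggests the included config file never went through build-time substitution; exit status 127 is what a shell conventionally reports when a command cannot be found. A small sketch for spotting such leftovers in a config file (the helper name and the sample line are hypothetical; only the path comes from the log):

```python
import re

# Autoconf-style placeholders look like @NAME@ and should be gone after the build.
PLACEHOLDER = re.compile(r"@[A-Z][A-Z0-9_]*@")


def unsubstituted_placeholders(text):
    """Return the distinct @NAME@ placeholders still present in text."""
    return sorted(set(PLACEHOLDER.findall(text)))


# Hypothetical config line; the path is the one shown in the log above.
sample = "lxc.hook.mount = @LXCFSSHAREDIR@/lxc.mount.hook\n"
print(unsubstituted_placeholders(sample))  # ['@LXCFSSHAREDIR@']
```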

lxcfs on CentOS 7

Hi,

I can't restrict memory in LXC containers on CentOS 7. cgmanager and lxcfs are compiled. Here is the config of the container:

# Template used to create this container: /usr/share/lxc/templates/lxc-download
# Parameters passed to the template:
# For additional config options, please look at lxc.container.conf(5)

# Distribution configuration
lxc.include = /usr/share/lxc/config/centos.common.conf
lxc.arch = x86_64
lxc.include = /usr/local/share/lxcfs/lxc.mount.hook
lxc.include = /usr/local/share/lxcfs/lxc.reboot.hook

# Allow for 1024 pseudo terminals
lxc.pts = 1024

# Setup 4 tty devices
lxc.tty = 4

# Drop some harmful capabilities
lxc.cap.drop = mac_admin mac_override sys_time sys_module

# Container specific configuration
lxc.rootfs = /var/lib/lxc/ramCheck/rootfs
lxc.utsname = ramCheck

# Network configuration
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0

lxc.cgroup.memory.limit_in_bytes = 536870912
lxc.cgroup.memory.kmem.limit_in_bytes = 536870912
lxc.cgroup.memory.memsw.limit_in_bytes = 1073741824
lxc.cgroup.cpu.cfs_period_us = 100000
lxc.cgroup.cpu.cfs_quota_us = 100000
lxc.cgroup.cpu.shares = 250

# CGroup whitelist
lxc.cgroup.devices.deny = a

# Allow any mknod (but not reading/writing the node)
lxc.cgroup.devices.allow = c *:* m
lxc.cgroup.devices.allow = b *:* m

# Allow specific devices
# /dev/null
lxc.cgroup.devices.allow = c 1:3 rwm
# /dev/zero
lxc.cgroup.devices.allow = c 1:5 rwm
# /dev/full
lxc.cgroup.devices.allow = c 1:7 rwm
# /dev/tty
lxc.cgroup.devices.allow = c 5:0 rwm
# /dev/console
lxc.cgroup.devices.allow = c 5:1 rwm
# /dev/ptmx
lxc.cgroup.devices.allow = c 5:2 rwm
# /dev/random
lxc.cgroup.devices.allow = c 1:8 rwm
# /dev/urandom
lxc.cgroup.devices.allow = c 1:9 rwm
# /dev/pts/*
lxc.cgroup.devices.allow = c 136:* rwm
# fuse
lxc.cgroup.devices.allow = c 10:229 rwm

The container starts fine, but the restrictions do not work.

no btime in /proc/stat when running with 160 cpus

On the p8 box, if I leave SMT=8 on, the host has 160 cpus. When I run ps aux in a container, it complains that there is no btime in /proc/stat output. If I disable SMT and have 20 cores, then btime shows up in the guest /proc/stat.

Compile lxcfs on centos7 failed

When I try to compile lxcfs on CentOS 7, I hit the following problem:

configure: error: Package requirements (libnih >= 1.0.2) were not met:
No package 'libnih' found
Consider adjusting the PKG_CONFIG_PATH environment variable if you
installed software in a non-standard prefix.
Alternatively, you may set the environment variables NIH_CFLAGS
and NIH_LIBS to avoid the need to call pkg-config.

However, I have already built and installed libnih (v1.0.2) on my server with make && make install.

Could you help me?

calculate idle time in uptime properly

Please offer ideas or patches for how to best calculate the idletime in /proc/uptime. Currently it'll always be the min of (host idletime, container uptime) which is obviously wrong.
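
To make the discussion concrete, here is a minimal sketch of the behaviour described above next to one possible alternative. The alternative is only an idea, not something lxcfs is known to implement here: it assumes the container's cumulative CPU time is available in nanoseconds (as the cgroup v1 cpuacct.usage file reports it) and treats everything else as idle. Function and parameter names are hypothetical.

```python
def idle_current(host_idle_s, container_uptime_s):
    """Behaviour described above: min(host idletime, container uptime)."""
    return min(host_idle_s, container_uptime_s)


def idle_from_cpuacct(container_uptime_s, cpuacct_usage_ns, ncpus=1):
    """Possible alternative: time during which the container's CPUs were not busy.

    Assumes cpuacct_usage_ns is the container cgroup's cumulative CPU time
    in nanoseconds.
    """
    busy_s = cpuacct_usage_ns / 1e9
    return max(0.0, container_uptime_s * ncpus - busy_s)


print(idle_current(5000.0, 120.0))               # 120.0
print(idle_from_cpuacct(120.0, 30_000_000_000))  # 90.0
```

With the first function, a container that has been busy for a quarter of its two-minute life still reports 120 seconds of idle time; the second reports 90.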

hooks.conf should be 00-lxcfs.conf

The README.md currently states that one should copy the hook.lxcfs file, but this file does not exist. I think it should refer to the 00-lxcfs.conf file instead.

Compiling on 14.04 fails

$ make                                                                                                                         
make  all-recursive
make[1]: Entering directory `/home/erkan/Test/lxcfs'
Making all in tests
make[2]: Entering directory `/home/erkan/Test/lxcfs/tests'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/home/erkan/Test/lxcfs/tests'
Making all in share
make[2]: Entering directory `/home/erkan/Test/lxcfs/share'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/home/erkan/Test/lxcfs/share'
make[2]: Entering directory `/home/erkan/Test/lxcfs'
gcc -std=gnu99 -DHAVE_CONFIG_H -I.    -Wall -ggdb -D_GNU_SOURCE -DSBINDIR=\"\" -I/usr/include/dbus-1.0 -I/usr/lib/x86_64-linux-gnu/dbus-1.0/include    -I/usr/include/dbus-1.0 -I/usr/lib/x86_64-linux-gnu/dbus-1.0/include   -I/usr/include/dbus-1.0 -I/usr/lib/x86_64-linux-gnu/dbus-1.0/include   -D_FILE_OFFSET_BITS=64 -I/usr/include/fuse   -g -O2 -MT cgmanager.o -MD -MP -MF .deps/cgmanager.Tpo -c -o cgmanager.o cgmanager.c
cgmanager.c: In function 'cgm_get_controllers':
cgmanager.c:116:2: warning: implicit declaration of function 'cgmanager_list_controllers_sync' [-Wimplicit-function-declaration]
  if ( cgmanager_list_controllers_sync(NULL, cgroup_manager, contrls) != 0 ) {
  ^
cgmanager.c: In function 'cgm_list_keys':
cgmanager.c:135:2: warning: implicit declaration of function 'cgmanager_list_keys_sync' [-Wimplicit-function-declaration]
  if ( cgmanager_list_keys_sync(NULL, cgroup_manager, controller, cgroup,
  ^
cgmanager.c:136:6: error: 'CgmanagerListKeysOutputElement' undeclared (first use in this function)
     (CgmanagerListKeysOutputElement ***)keys) != 0 ) {
      ^
cgmanager.c:136:6: note: each undeclared identifier is reported only once for each function it appears in
cgmanager.c:136:40: error: expected expression before ')' token
     (CgmanagerListKeysOutputElement ***)keys) != 0 ) {
                                        ^
make[2]: *** [cgmanager.o] Error 1
make[2]: Leaving directory `/home/erkan/Test/lxcfs'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/erkan/Test/lxcfs'
make: *** [all] Error 2

Strangeness with configure, Line 12300 failure

Ubuntu 15.04:

Upstream:

root@core01:~/source/lxcfs-current# git remote -v
origin  https://github.com/lxc/lxcfs.git (fetch)
origin  https://github.com/lxc/lxcfs.git (push)

Pulled latest:

root@core01:~/source/lxcfs-current# git pull origin master
From https://github.com/lxc/lxcfs
 * branch            master     -> FETCH_HEAD
Already up-to-date.

Git Log Commit:

root@core01:~/source/lxcfs-current# git log | head
commit cde8fac3454903900de83bf9389c866b0cdbe4d6
Author: Serge Hallyn <[email protected]>
Date:   Fri May 8 19:52:28 2015 -0500

    configure.ac: v0.9

    Signed-off-by: Serge Hallyn <[email protected]>

Run Bootstrap:

root@core01:~/source/lxcfs# ./bootstrap.sh
+ set -e
+ test -d autom4te.cache
+ aclocal -I m4
+ libtoolize
libtoolize: putting auxiliary files in `.'.
libtoolize: linking file `./ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIR, `m4'.
libtoolize: linking file `m4/libtool.m4'
libtoolize: linking file `m4/ltoptions.m4'
libtoolize: linking file `m4/ltsugar.m4'
libtoolize: linking file `m4/ltversion.m4'
libtoolize: linking file `m4/lt~obsolete.m4'
+ autoheader
+ autoconf
+ automake --add-missing --copy
configure.ac:8: installing './compile'
configure.ac:19: installing './config.guess'
configure.ac:19: installing './config.sub'
configure.ac:17: installing './install-sh'
configure.ac:17: installing './missing'
Makefile.am: installing './INSTALL'
Makefile.am: installing './depcomp'

./configure:

root@core01:~/source/lxcfs# ./configure
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking whether gcc understands -c and -o together... yes
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking minix/config.h usability... no
checking minix/config.h presence... no
checking for minix/config.h... no
checking whether it is safe to define __EXTENSIONS__... yes
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
/bin/bash: /root/missing: No such file or directory
configure: WARNING: 'missing' script is too old or missing
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking for style of include used by make... GNU
checking whether make supports nested variables... yes
checking dependency style of gcc... gcc3
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking how to print strings... printf
checking for a sed that does not truncate output... /bin/sed
checking for fgrep... /bin/grep -F
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 1572864
checking whether the shell understands some XSI constructs... yes
checking whether the shell understands "+="... yes
checking how to convert x86_64-unknown-linux-gnu file names to x86_64-unknown-linux-gnu format... func_convert_file_noop
checking how to convert x86_64-unknown-linux-gnu file names to toolchain format... func_convert_file_noop
checking for /usr/bin/ld option to reload object files... -r
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for dlltool... no
checking how to associate runtime and link libraries... printf %s\n
checking for ar... ar
checking for archiver @FILE support... @
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /usr/bin/nm -B output from gcc object... ok
checking for sysroot... no
checking for mt... mt
checking if mt is a manifest tool... no
checking for dlfcn.h... yes
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
checking for gcc option to produce PIC... -fPIC -DPIC
checking if gcc PIC flag -fPIC -DPIC works... yes
checking if gcc static flag -static works... yes
checking if gcc supports -c -o file.o... yes
checking if gcc supports -c -o file.o... (cached) yes
checking whether the gcc linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking whether -lc should be explicitly linked in... no
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... yes
checking whether to build static libraries... yes
checking for gcc... (cached) gcc
checking whether we are using the GNU C compiler... (cached) yes
checking whether gcc accepts -g... (cached) yes
checking for gcc option to accept ISO C89... (cached) none needed
checking whether gcc understands -c and -o together... (cached) yes
checking for gcc option to accept ISO C99... -std=gnu99
./configure: line 12300: syntax error near unexpected token `NIH,'
./configure: line 12300: `PKG_CHECK_MODULES(NIH, libnih >= 1.0.2)'
root@core01:~/source/lxcfs#
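
A literal PKG_CHECK_MODULES call surviving in the generated configure script usually means aclocal could not find the macro's definition (pkg.m4, which ships with pkg-config) when bootstrap.sh ran, so the shell later trips over the unexpanded text. A small sketch for checking a generated script for such leftovers; the helper name, the prefix list and the sample text are assumptions made for illustration:

```python
import re

# A macro-style call at the start of a line in a generated configure script
# means m4 never expanded it. The prefix list is a heuristic, not exhaustive.
MACRO_CALL = re.compile(r"^[ \t]*((?:PKG|AC|AM|LT)_[A-Z0-9_]+)\(", re.MULTILINE)


def unexpanded_macros(configure_text):
    """Return the names of macro calls left unexpanded in configure_text."""
    return [m.group(1) for m in MACRO_CALL.finditer(configure_text)]


# Hypothetical excerpt; the middle line is the one from the error above.
sample = 'ac_ext=c\nPKG_CHECK_MODULES(NIH, libnih >= 1.0.2)\nLIBS="$LIBS"\n'
print(unexpanded_macros(sample))  # ['PKG_CHECK_MODULES']
```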

lxcfs high(?) cpu usage cost/container

Hello,
I am using Ubuntu 15.10 on Hyper-V (2 CPUs, 4 GB RAM) because I wanted to test lxd before switching some of my non-critical www CTs, which currently run on Debian/OpenVZ.
I added "deb http://ppa.launchpad.net/ubuntu-lxc/lxd-stable/ubuntu wily main" to my apt sources, then ran update && upgrade && reboot.

dpkg -l|grep lx[cd]
ii liblxc1 2.0.0beta2-0ubuntu2ubuntu15.10.1ppa1 amd64 Linux Containers userspace tools (library)
ii lxc 2.0.0beta2-0ubuntu2ubuntu15.10.1ppa1 amd64 Linux Containers userspace tools
ii lxc-templates 2.0.0beta2-0ubuntu2ubuntu15.10.1ppa1 amd64 Linux Containers userspace tools (templates)
ii lxcfs 0.17-0ubuntu3ubuntu15.10.1ppa1 amd64 FUSE based filesystem for LXC
ii lxd 2.0.0beta1-0ubuntu4ubuntu15.10.1ppa1 amd64 Container hypervisor based on LXC - daemon
ii lxd-client 2.0.0beta1-0ubuntu4ubuntu15.10.1ppa1 amd64 Container hypervisor based on LXC - client
ii python3-lxc 2.0.0beta2-0ubuntu2ubuntu15.10.1ppa1 amd64 Linux Containers userspace tools (Python 3.x bindings)

All I have done so far is add some CTs:
lxc config set storage.zfs_pool_name lxd
lxc remote add images images.linuxcontainers.org
lxc launch images:ubuntu/wily/amd64 ct01
lxc launch images:ubuntu/wily/amd64 ct02

etc.
I added 10 CTs and played with them for a while (start, stop, create profile, clone, install MySQL on one, PostgreSQL on another) just to learn the lxc tool and the ZFS integration.

Now the BUG(?) part.
Though none of the CTs is doing much of anything now, the lxcfs process consumes a significant share of CPU for every running CT. With 20 CTs running, this is a constant 30% (of 200%, i.e. 2 CPUs) according to htop. The Munin graph shows a flat 30% system, 10% user. If I stop half of the CTs, the usage is cut in half.

Is this to be expected with a beta version (debug?), or am I running into some strange Hyper-V issue?

How should we report available swap?

Perhaps this should be discussed on the mailing list, or merged with another issue. But let's start here.

The problem: cgroup.memory.memsw.limit_in_bytes is combined available memory and swap. lxcfs's meminfo currently reports 'swap available' as memsw.limit_in_bytes - memory.limit_in_bytes. In the case where those two are the same, meminfo will report 0 swap available, and up to memsw.limit_in_bytes for swap usage.
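
The ambiguity can be shown with a few lines of arithmetic. This is only a sketch of the computation described above: the total follows the stated formula, while deriving swap usage the same way (combined usage minus memory usage) is an assumption, and all names and numbers are hypothetical.

```python
MIB = 1024 * 1024


def meminfo_swap(mem_limit, memsw_limit, mem_usage, memsw_usage):
    """Return (swap_total, swap_used) in bytes, derived from cgroup values."""
    swap_total = memsw_limit - mem_limit  # formula described above
    swap_used = memsw_usage - mem_usage   # assumption: usage derived the same way
    return swap_total, swap_used


# Equal limits: no swap is advertised although some may be in use.
print(meminfo_swap(512 * MIB, 512 * MIB, 300 * MIB, 400 * MIB))  # (0, 104857600)
```

A tool reading these values would see 100 MiB of swap in use out of a total of 0, which is the inconsistency the question is about.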

What is the best way to handle this?

Continue as now.

We could simply not report Swap Available and instead report a lxcfs-specific value.

We could make the memory and swap available numbers dynamic. That's going to be confusing for a lot of legacy software.

Other ideas?

lxc.mount.hook script exits on non-fatal error. needs additional (different) checks

Host: Debian 8 Jessie (lxc 1:1.0.6-6+deb8u1, lxcfs 0.10-0ubuntu1ubuntu15.04.1ppa1)
Guest: CentOS 6.7

The LXCFS mount hook script runs with the "-e" flag, so it exits as soon as any command fails.

Problematic command (exit code 1):
mkdir ${LXC_ROOTFS_MOUNT}/sys/fs/cgroup/$DEST

The command after it works (exit code 0):
mount -n --bind $entry ${LXC_ROOTFS_MOUNT}/sys/fs/cgroup/$DEST

Debug script output:

+ for entry in '/var/lib/lxcfs/cgroup/*'
++ basename /var/lib/lxcfs/cgroup/cpuset
+ DEST=cpuset
+ '[' cpuset = name=systemd ']'
+ mountpoint -q /usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/cgroup/cpuset
+ mkdir /usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/cgroup/cpuset
+ echo 'Exit code: 1'
+ mount -n --bind /var/lib/lxcfs/cgroup/cpuset /usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/cgroup/cpuset
+ echo 'Exit code: 0'
+ grep -q ,
+ echo cpuset

Temporary workaround: remove the "-e" flag from the script header.
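
A narrower fix than dropping "-e" would be to make the directory creation itself tolerant, presumably because the mkdir fails on a directory that already exists (the trace does not show the reason, so this is an assumption). In the shell hook that would be mkdir -p; the same idea is sketched here in Python with a hypothetical helper and a temporary directory standing in for the container rootfs:

```python
import os
import tempfile


def ensure_dir(path):
    """Create path if needed; an already existing directory is not an error."""
    os.makedirs(path, exist_ok=True)
    return os.path.isdir(path)


with tempfile.TemporaryDirectory() as root:
    dest = os.path.join(root, "sys", "fs", "cgroup", "cpuset")
    print(ensure_dir(dest))  # first call creates the directory
    print(ensure_dir(dest))  # second call succeeds instead of failing
```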

lxcfs -d doesn't work

Readme.md claims that -d can be passed to lxcfs to enter debug mode. But:

[root@localhost lxcfs]# git rev-parse HEAD
121013f74ca35633d460a699fb9ddbd46a942632
[root@localhost lxcfs]# lxcfs -s -f -d -o allow_other /usr/local/var/lib/lxcfs
Usage:

lxcfs mountpoint
lxcfs -h

lxcfs provides a blank /proc/meminfo when memory controller on host is unavailable

When the kernel isn't supplying a memory controller on the host then lxcfs fails to provide a proper /proc/meminfo to the containers. In this situation lxcfs should reflect the host values so that processes like procps and top can work properly inside the container.

These are situations where the memory controller may be unavailable:
  • The memory cgroup isn't compiled into the kernel being used at all
  • The memory cgroup functions are compiled in but are disabled at boot time (cgroup_disable=memory)
  • The memory cgroup functions are compiled in but are disabled by default at boot time

In the last two cases, lxc-checkconfig may still report that the memory controller is enabled, even though it was disabled either by the default kernel configuration or by boot-time parameters.

In all cases doing a cat against /proc/self/cgroup confirms the memory controller is missing.
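
The same check can be scripted. A minimal sketch that parses /proc/self/cgroup style text (lines of the form hierarchy-ID:controller-list:path) and reports which controllers are attached; the function name and the sample contents are hypothetical:

```python
def cgroup_controllers(proc_cgroup_text):
    """Collect controller names from /proc/<pid>/cgroup style text."""
    controllers = set()
    for line in proc_cgroup_text.splitlines():
        fields = line.split(":", 2)
        if len(fields) == 3:
            # The second field is a comma-separated controller list.
            controllers.update(name for name in fields[1].split(",") if name)
    return controllers


# Hypothetical contents on a host booted with cgroup_disable=memory.
sample = "4:cpuacct:/\n3:cpu:/\n2:cpuset:/\n1:name=systemd:/\n"
print("memory" in cgroup_controllers(sample))  # False
```

On a live system the input would come from reading /proc/self/cgroup.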

Examples of the end result of the current issue:
procps can't report memory values. ps actually dies with a floating point exception (FPE) when it tries:
shinji@trusty-x86:~$ ps -o %mem
%MEM
Signal 8 (FPE) caught by ps (procps-ng version 3.3.9).
ps:display.c:66: please report this bug
Floating point exception

Top reports a value of inf (infinity) in the memory column.

Additional relevant information:
OS: Ubuntu 14.04.2
Using Ubuntu-LXC daily PPA
lxcfs:
  Installed: 0.7-0ubuntu2ubuntu14.04.1ppa1
  Candidate: 0.7-0ubuntu2ubuntu14.04.1ppa1
  Version table:
 *** 0.7-0ubuntu2ubuntu14.04.1ppa1 0
        500 http://ppa.launchpad.net/ubuntu-lxc/daily/ubuntu/ trusty/main amd64 Packages
        100 /var/lib/dpkg/status

lxcfs man page on 0.7 appears to have been generated incorrectly

Reproduction: man lxcfs
OS: Ubuntu 14.04.2 on Ubuntu-LXC daily ppa

Expected results: A man page about lxcfs that looks appropriate
Actual results: I think it is still a man page about lxcfs but it is titled "FAILED" instead. I believe it generated improperly during compile time.

File contents (Note the FAILED title header and the failure/error messages mixed in the contents):
shinji@icarus:/usr/share/man/man1> zcat lxcfs.1.gz
.\" DO NOT MODIFY THIS FILE! It was generated by help2man 1.44.1.
.TH FAILED "1" "April 2015" "Failed opening dbus connection: org.freedesktop.DBus.Error.FileNotFound: Failed to connect to socket /sys/fs/cgroup/cgmanager/sock: No such file or directory" "User Commands"
.SH NAME
Failed - Set up cgroup fs for containers
.SH DESCRIPTION

lxcfs implements a FUSE fs which allows containers to have virtualized
cgroup filesystems and virtualized views of /proc/cpuinfo and /proc/meminfo.
.PP
Usage:
.PP
\&./lxcfs [FUSE and mount options] mountpoint
.PP
WARNING: failed to escape to root cgroup
Failed opening dbus connection: org.freedesktop.DBus.Error.FileNotFound: Failed to connect to socket /sys/fs/cgroup/cgmanager/sock: No such file or directory
.SH "SEE ALSO"
cgmanager(1),
lxc(1)

Does not work on debian jessie

I have 3 different unprivileged LXC containers. When a container boots, the correct memory limit is visible in free -m for a short period of time; after that, the total memory of the host is shown.

RE : Container isolation in LXC

Hello All,

This is a repost of conversation #40, which was closed by Hallyn before we could get any answers from you guys. Can someone look into this and give us a heads-up?

Let us jump in this conversation as we are facing the same issues alphaonex86 describes in this post.
We are a hosting provider with over 400 production websites running on our servers.

We made the deliberate choice to go with LXC as our main virtualization technology for our nearly 69 physical servers because we needed a flexible, hardware- and OS-independent technology that can evolve along with the Debian distribution. We tested VMware, KVM, vServer and Proxmox in the past, all of which had a blocking issue for us.

We were happy running LXC as our hypervisor until one of our customers pointed out this major flaw in the VM isolation mechanism. Rather than a long speech, let me give a graphical example of the problem we are facing.
Here is what htop reveals, on the left-hand side from the LXC VM, on the right-hand side on the host:

[Screenshot: lxc_htops]

You can clearly see that the VM is not using any CPU/RAM at all: the processes are sorted by CPU usage and peak at 0.0% CPU/RAM. However, htop shows CPU #1 loaded at 96.7% because of an rsync task that runs at the host level, as you can see on the right-hand side.

You can easily see what the problem is here. As a hosting provider, we should be able to provide our customers with an isolated environment. A root user logged into a VM should not be able to access any of this information. This kind of security leak can lead to major risks, as a very skilled system engineer could access other VMs' processes or even host processes!

For instance, the /proc folder of the host is openly accessible from the VM and exposes all sensitive information about what is running on the server and how it is configured.

In our opinion, the idea of letting the VM access all raw information from the host and then filtering it inside the tools through patches is a dangerous path to follow.

Can you tell us if there is a plan to adapt the LXC technology to a more secure and isolated model, such as the one VMware or QEMU can offer? Correcting this through a Debian patch would not be a suitable solution for us, as we are running different versions from Debian 6 to Debian 8 on our servers. An LXC patch would be best.

Best regards,
