kubevirt / containerized-data-importer

Data Import Service for kubernetes, designed with kubevirt in mind.

License: Apache License 2.0

Go 86.28% Makefile 0.27% Shell 4.82% Dockerfile 0.04% Starlark 8.52% C 0.07%

containerized-data-importer's Issues

Containerized only build breaks downstream

Containerizing the build process removed the ability to do a non-containerized build. Can the ability to do a non-containerized build be put back into the makefile so that the option is available?

need to clean up some general flow and error recovery scenarios

  1. If the importer pod fails to create, we don't want to forget the key in processItem, so that we try to process it again

  2. importer race condition - multiple imports on a single pvc

$ kubectl logs importer-golden-pvcmzqld
I0425 17:36:37.917054       5 importer.go:35] main: Starting importer
I0425 17:36:37.919191       5 importer.go:50] main: importing file "tinyCore.qcow2.gz"
W0425 17:36:37.919256       5 dataStream.go:42] NewDataStream: IMPORTER_ACCESS_KEY_ID and/or IMPORTER_SECRET_KEY env variables are empty
I0425 17:36:37.919470       5 dataStream.go:71] Using S3 client to get data
I0425 17:36:37.919657       5 dataStream.go:78] Attempting to get object "s3://kubevirt-images/tinyCore.qcow2.gz" via S3 client
I0425 17:36:37.919850       5 importer.go:58] Beginning import from "/tinyCore.qcow2.gz"
I0425 17:36:37.919871       5 decompress.go:26] UnpackData: checking compressed and/or archive for file "tinyCore.qcow2.gz"
I0425 17:36:37.919875       5 decompress.go:45] DecompressData: checking if "tinyCore.qcow2.gz" is compressed
I0425 17:36:38.268947       5 decompress.go:59] DecompressData: decompressed "tinyCore.qcow2.gz"
I0425 17:36:38.268995       5 decompress.go:70] DearchiveData: checking if "tinyCore.qcow2" is an archive file
I0425 17:36:38.269281       5 util.go:44] StreamDataToFile: begin import...
I0425 17:36:52.916024       5 importer.go:88] main: converting qcow2 image to raw
I0425 17:36:52.949694       5 importer.go:98] main: Import complete, exiting
jcope@jonMBP | ~/.../kubevirt/manifests
$ kubectl logs importer-golden-pvcs9ntq
I0425 17:36:39.276378       7 importer.go:35] main: Starting importer
I0425 17:36:39.276943       7 importer.go:50] main: importing file "tinyCore.qcow2.gz"
W0425 17:36:39.276975       7 dataStream.go:42] NewDataStream: IMPORTER_ACCESS_KEY_ID and/or IMPORTER_SECRET_KEY env variables are empty
I0425 17:36:39.277832       7 dataStream.go:71] Using S3 client to get data
I0425 17:36:39.277960       7 dataStream.go:78] Attempting to get object "s3://kubevirt-images/tinyCore.qcow2.gz" via S3 client
I0425 17:36:39.278006       7 importer.go:58] Beginning import from "/tinyCore.qcow2.gz"
I0425 17:36:39.278050       7 decompress.go:26] UnpackData: checking compressed and/or archive for file "tinyCore.qcow2.gz"
I0425 17:36:39.278063       7 decompress.go:45] DecompressData: checking if "tinyCore.qcow2.gz" is compressed
I0425 17:36:39.536339       7 decompress.go:59] DecompressData: decompressed "tinyCore.qcow2.gz"
I0425 17:36:39.536358       7 decompress.go:70] DearchiveData: checking if "tinyCore.qcow2" is an archive file
I0425 17:36:39.536531       7 util.go:44] StreamDataToFile: begin import...
I0425 17:36:41.136349       7 importer.go:88] main: converting qcow2 image to raw
I0425 17:36:41.166394       7 importer.go:98] main: Import complete, exiting

Versioning releases

It's high time that releases of CDI be versioned in a format that conforms with OpenShift and KubeVirt (v#.#.#). A versioning scheme needs to be agreed upon to mark major, minor, and patch milestones. Some automation should be implemented for incrementing each component to avoid human error.
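As a rough illustration of the automation mentioned above, the following sketch parses a v#.#.# tag and increments one component, resetting the lower ones. The version type and the parseVersion and bump names are made up for this example; no agreed scheme or existing tool is implied.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// version holds the three components of a v#.#.# tag (hypothetical type).
type version struct{ Major, Minor, Patch int }

// parseVersion accepts exactly "v<major>.<minor>.<patch>" with unsigned numbers.
func parseVersion(s string) (version, error) {
	if !strings.HasPrefix(s, "v") {
		return version{}, fmt.Errorf("version %q must start with 'v'", s)
	}
	parts := strings.Split(strings.TrimPrefix(s, "v"), ".")
	if len(parts) != 3 {
		return version{}, fmt.Errorf("version %q must have three components", s)
	}
	var nums [3]int
	for i, p := range parts {
		n, err := strconv.ParseUint(p, 10, 16) // rejects signs and non-digits
		if err != nil {
			return version{}, fmt.Errorf("bad component %q in %q", p, s)
		}
		nums[i] = int(n)
	}
	return version{nums[0], nums[1], nums[2]}, nil
}

func (v version) String() string {
	return fmt.Sprintf("v%d.%d.%d", v.Major, v.Minor, v.Patch)
}

// bump increments the named component and zeroes the components below it.
func (v version) bump(part string) (version, error) {
	switch part {
	case "major":
		return version{v.Major + 1, 0, 0}, nil
	case "minor":
		return version{v.Major, v.Minor + 1, 0}, nil
	case "patch":
		return version{v.Major, v.Minor, v.Patch + 1}, nil
	}
	return v, fmt.Errorf("unknown component %q", part)
}

func main() {
	v, err := parseVersion("v0.5.2")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, part := range []string{"patch", "minor", "major"} {
		next, _ := v.bump(part)
		fmt.Printf("%s bump of %s gives %s\n", part, v, next)
	}
}
```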

Streaming Data Conversion test fails

Test failure after the travis fix:

• Failure [0.037 seconds]
Streaming Data Conversion
/home/travis/gopath/src/github.com/kubevirt/containerized-data-importer/test/datastream/datastream_test.go:26
  when data is in a supported file format
  /home/travis/gopath/src/github.com/kubevirt/containerized-data-importer/test/datastream/datastream_test.go:28
    should convert .qcow2 [It]
    /home/travis/gopath/src/github.com/kubevirt/containerized-data-importer/test/datastream/datastream_test.go:79
    Test data filename doesn't match expected file name.
    Expected
        <string>: tinyCore.qcow2
    to equal
        <string>: tinyCore.iso.qcow2

DataVolume CRD Implementation Tasks

Tasks for Initial PR: #189

  • add new package dependencies
  • define DataVolume API and add client/informer generators
  • refactor cdi controller component in preparation for multiple controller loops
  • introduce DataVolume Controller
  • add DataVolume controller unit tests

Followup Doc Tasks

  • add DataVolume documentation to CDI README

Followup Dev Tasks

  • add filesystem source
  • add DataVolume events
  • add DataVolume functional tests
  • add autogen tests to travis (verifies autogen code is up to date)
  • Autogenerated openapiv3 crd validation
  • validating webhook
  • autogenerated swagger (similar to what we have for kubevirt)

Add access to secrets for RBAC

rules:
- apiGroups: [""]
  resources: ["persistentvolumeclaims"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch", "create"]
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get", "list", "watch", "create"]

determine image type w/o relying on file extension names

Today VM images are validated only by their filename extension. We could improve this by peeking inside the image and verifying headers, checksums (if any), etc., to be more certain that we have a supported image type.
Something to consider: should we use the file's extension (suffix) at all? What if an image is named foo.tar.gz but it is not a gzip file? Is this an error, or do we ignore the extension?
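One way to peek inside the image, shown here only as a sketch, is to compare the leading bytes against well-known file signatures (gzip, xz, qcow2, and the "ustar" marker at offset 257 of a tar header). The detectFormat function and the list of formats are assumptions for illustration; which formats CDI should accept is not settled in this issue.

```go
package main

import (
	"bytes"
	"fmt"
)

// Well-known file signatures. The set of formats is an assumption.
var (
	gzipMagic  = []byte{0x1f, 0x8b}
	xzMagic    = []byte{0xfd, '7', 'z', 'X', 'Z', 0x00}
	qcow2Magic = []byte{'Q', 'F', 'I', 0xfb}
	tarMagic   = []byte("ustar")
)

// tarMagicOffset is where the "ustar" marker sits in a tar header block.
const tarMagicOffset = 257

// detectFormat inspects the first bytes of a file and returns a format
// name, or "unknown" when no signature matches. The filename is ignored.
func detectFormat(header []byte) string {
	switch {
	case bytes.HasPrefix(header, gzipMagic):
		return "gzip"
	case bytes.HasPrefix(header, xzMagic):
		return "xz"
	case bytes.HasPrefix(header, qcow2Magic):
		return "qcow2"
	case len(header) >= tarMagicOffset+len(tarMagic) &&
		bytes.Equal(header[tarMagicOffset:tarMagicOffset+len(tarMagic)], tarMagic):
		return "tar"
	}
	return "unknown"
}

func main() {
	// Content of a file named foo.tar.gz that is not actually gzip.
	notGzip := []byte("plain text, despite the .tar.gz name")
	fmt.Println("foo.tar.gz content:", detectFormat(notGzip))
	fmt.Println("gzip header:", detectFormat([]byte{0x1f, 0x8b, 0x08, 0x00}))
	fmt.Println("qcow2 header:", detectFormat([]byte{'Q', 'F', 'I', 0xfb, 0, 0, 0, 3}))
}
```

With a check like this, a mismatch between the detected format and the extension could either be reported as an error or simply override the extension; the sketch leaves that policy to the caller.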

controller needs to track pvc updates

Consider:

  1. user creates a pvc but forgets the endpoint annotation
  2. controller sees the pvc but calls q.forget()
  3. user edits the pvc and adds the endpoint annotation
  4. controller never sees the pvc update.
    Therefore the user has to delete the pvc, edit it, then re-create it.
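A possible shape for the fix is to react to pvc update events and enqueue the claim when the endpoint annotation appears or changes. The sketch below shows only the decision logic; in a client-go informer it would typically be called from an UpdateFunc handler. The pvc struct and the endpointAnno key are placeholders invented for this example, not the real CDI types or annotation name.

```go
package main

import "fmt"

// endpointAnno is a hypothetical annotation key used only for illustration.
const endpointAnno = "example.io/import.endpoint"

// pvc is a pared-down stand-in for a PersistentVolumeClaim (assumption).
type pvc struct {
	Namespace   string
	Name        string
	Annotations map[string]string
}

// key returns the namespace/name form commonly used as a work queue key.
func (p pvc) key() string { return p.Namespace + "/" + p.Name }

// shouldEnqueueOnUpdate reports whether an update makes the claim worth
// processing: the endpoint annotation is now set and differs from the
// previous value (which includes the case where it was absent before).
func shouldEnqueueOnUpdate(oldPVC, newPVC pvc) bool {
	newEndpoint, ok := newPVC.Annotations[endpointAnno]
	if !ok || newEndpoint == "" {
		return false
	}
	return oldPVC.Annotations[endpointAnno] != newEndpoint
}

func main() {
	before := pvc{Namespace: "default", Name: "golden-pvc"}
	after := before
	after.Annotations = map[string]string{endpointAnno: "http://example.com/image.qcow2"}

	if shouldEnqueueOnUpdate(before, after) {
		fmt.Println("enqueue", after.key())
	}
	if !shouldEnqueueOnUpdate(after, after) {
		fmt.Println("no change, skip", after.key())
	}
}
```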

CDI version issues

CDI "releases" are arbitrary, do not represent any functional composition, and only serve to keep track of the latest kubevirt version string. There really is no CDI version; instead we have obscured "latest" with kubevirt's version tag.

Per a recent email from @davidvossel:
CDI's releases are overwritten with every commit. This means that someone using CDI v0.5.2 will get different code each time they deploy, depending on what happens upstream.

I propose that we either:

  1. make CDI releases real and immutable. This means CDI updates its release tag when we have collected a group of prs worthy of being released, and we support that release for some reasonable number of months. It also requires that kubevirt-ansible be able to consume a specific CDI release that may differ from the target kubevirt release tag. Or,
  2. stop pretending that CDI has releases and use "latest".

Interested in everyone's thoughts on this. @copejon @screeley44 @erinboyd @davidvossel @aglitke

Do we have a plan to integrate this repo into kubevirt/kubevirt ?

I'm researching how to set up e2e tests, and there are two options: 1. set up CI in this repo, or 2. integrate the e2e tests into kubevirt/kubevirt. Integrating into kubevirt/kubevirt has some pros, but it would be meaningless unless the code base is also integrated there, since PRs in this repo would not trigger the tests. So I'm asking: is there a long-term plan to do that?

@jeffvance @copejon @aglitke

(travis) All tests fail due to minikube error

Here's an example:
https://travis-ci.org/kubevirt/containerized-data-importer/builds/365227336#L518

Setting environment variables from .travis.yml
$ export CHANGE_MINIKUBE_NONE_USER=true
$ export K8S_VER=1.9.0
$ export K6T_VER=0.3.0
$ export SRC="http://www.tinycorelinux.net/9.x/x86/release/Core-current.iso"
4.79s$ GIMME_OUTPUT="$(gimme 1.10 | tee -a $HOME/.bashrc)" && eval "$GIMME_OUTPUT"
go version go1.10 linux/amd64
$ export GOPATH=$HOME/gopath
$ export PATH=$HOME/gopath/bin:$PATH
$ mkdir -p $HOME/gopath/src/github.com/kubevirt/containerized-data-importer
$ rsync -az ${TRAVIS_BUILD_DIR}/ $HOME/gopath/src/github.com/kubevirt/containerized-data-importer/
$ export TRAVIS_BUILD_DIR=$HOME/gopath/src/github.com/kubevirt/containerized-data-importer
$ cd $HOME/gopath/src/github.com/kubevirt/containerized-data-importer
0.01s
$ gimme version
v1.3.0
$ go version
go version go1.10 linux/amd64
go.env
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/travis/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/travis/gopath"
GORACE=""
GOROOT="/home/travis/.gimme/versions/go1.10.linux.amd64"
GOTMPDIR=""
GOTOOLDIR="/home/travis/.gimme/versions/go1.10.linux.amd64/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build731311504=/tmp/go-build -gno-record-gcc-switches"
Using Go 1.5 Vendoring, not checking for Godeps
install
0.00s$ true
before_script.1
1.79s$ curl -Lo kubectl https://storage.googleapis.com/kubernetes-release/release/v$K8S_VER/bin/linux/amd64/kubectl && chmod +x kubectl && sudo mv kubectl /usr/local/bin/
before_script.2
0.49s$ curl -Lo minikube https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 && chmod +x minikube && sudo mv minikube /usr/local/bin/
41.27s$ sudo minikube start --vm-driver=none --kubernetes-version=v$K8S_VER
Starting local Kubernetes v1.9.0 cluster...
Starting VM...
Getting VM IP address...
Moving files into cluster...
Downloading kubeadm v1.9.0
Downloading kubelet v1.9.0
Finished Downloading kubelet v1.9.0
Finished Downloading kubeadm v1.9.0
E0411 17:20:38.620011    4114 start.go:234] Error updating cluster:  starting kubelet: running command: 
sudo systemctl daemon-reload &&
sudo systemctl enable kubelet &&
sudo systemctl start kubelet
: exit status 1

cdi label names

We need to consider removing the "kubevirt" prefix from all cdi labels, because we eventually want to decouple cdi from kubevirt.

dockerhub image tag latest needs to be the latest image

While working on the CDI e2e tests in the kubevirt-ansible PR kubevirt/kubevirt-ansible#246, I found that the docker images tagged latest here haven't been updated for a month; the actual latest image is v0.5.0-alpha.0.

This bothers me because the old image I was using doesn't put any label on the importer pod, so it's not convenient to filter for the results you want to check (a regex would work, but I think using labels is the better practice).

So my understanding is: latest == the latest usable image we pushed == v0.5.0-alpha.0

IIUC, could someone help update those images? This is also related to the PR kubevirt/kubevirt-ansible#243.

https://hub.docker.com/r/kubevirt/cdi-controller/
https://hub.docker.com/r/kubevirt/cdi-importer/

cdi rbac roles

The advice in the primary README.md to bind the default service account for a namespace with cluster-admin privileges isn't wise. Even with the disclaimer, we don't want that accidentally replicated.

For example, I already see that kubevirt-ansible is using this method to deploy CDI right now. I'd hate for something like this to accidentally make its way into production someday.

short term fix

As a short term method, we'd be better off giving the CDI deployment a service-account with cluster-admin roles rather than binding cluster-admin to the default account in a namespace.

I'd recommend making this change asap before CDI gains any more traction.

long term fix

Add a ServiceAccount, RBAC ClusterRole, and ClusterRoleBinding to a manifest, and make cdi-controller-deployment.yaml reference the service account.

The kubevirt manifest has some examples of how we do this for our controllers.
https://github.com/kubevirt/kubevirt/blob/master/manifests/release/kubevirt.yaml.in

import-pod error will conflict with next creation

Say I create a pvc but provide a wrong endpoint url. The pod created by cdi then errors out. Once I figure out that my url was wrong, I have to manually delete the pod, delete the pvc, fix the pvc, and create it again.

Ideally, I would just run oc edit pvc, change to a valid url, and let cdi do the rest of the job.

add label to pvc if user did not create it

We should add a label to the pvc object if the user has not added it. It's not a requirement for processing, but it's a nice UX touch that makes it easy to filter for CDI resources using kubectl, i.e.

   kubectl get pvc -l app=containerized-data-importer --all-namespaces
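A minimal sketch of the labeling step follows, using the app=containerized-data-importer label from the kubectl command above. The ensureCDILabel function is hypothetical, and the choice not to overwrite an app label the user already set is an assumption; the issue does not say what should happen in that case.

```go
package main

import "fmt"

// Label taken from the kubectl selector in the issue text.
const (
	cdiLabelKey   = "app"
	cdiLabelValue = "containerized-data-importer"
)

// ensureCDILabel returns a label set that includes the CDI label and
// reports whether anything changed, so the caller can skip the update
// call when the pvc already carries the key. An existing value for the
// key is left untouched (assumption). The input map is never mutated.
func ensureCDILabel(labels map[string]string) (map[string]string, bool) {
	if _, ok := labels[cdiLabelKey]; ok {
		return labels, false
	}
	out := make(map[string]string, len(labels)+1)
	for k, v := range labels {
		out[k] = v
	}
	out[cdiLabelKey] = cdiLabelValue
	return out, true
}

func main() {
	userLabels := map[string]string{"tier": "storage"}
	labels, changed := ensureCDILabel(userLabels)
	fmt.Println(labels[cdiLabelKey], changed)

	// A second pass is a no-op because the label is already present.
	_, changedAgain := ensureCDILabel(labels)
	fmt.Println(changedAgain)
}
```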

Improve controller unit tests

The current controller unit tests are more like functional tests in that they call higher-level functions (NewController, ProcessNextItem) rather than some of the lower-level funcs (which may need to be exported?).

  1. We should consider moving the current unit tests to /test/ so they are treated as functional tests and create real controller unit tests.
  2. We probably should remove the tests using an empty namespace since this is not a supported condition.

New containerized `make` requires --privileged or `setenforce 0`

make fails with a perms error on a rhel 7.4 vm:
Eg: make controller

stat /go/src/github.com/kubevirt/containerized-data-importer/cmd/controller/controller.go: permission denied

Also, shelling into the golang container shows the perms issue:

: docker run -it --rm -w /go/src/github.com/kubevirt/containerized-data-importer -v $PWD:/go/src/github.com/kubevirt/containerized-data-importer 3f30f1fc3c43 sh
# pwd
/go/src/github.com/kubevirt/containerized-data-importer
# ls -l
ls: cannot open directory '.': Permission denied
# id
uid=0(root) gid=0(root) groups=0(root)
# mount|grep import
/dev/mapper/rhel-root on /go/src/github.com/kubevirt/containerized-data-importer type xfs (rw,relatime,seclabel,attr2,inode64,noquota)

Running the golang container with --privileged, or running setenforce 0 on the host, fixes the perms problem.

Do not drop items from the queue if they error during processing

The current logic should be reversed so that Forget() is called on the key only if processItem() does not return an error. Errors that occur in processing may be ephemeral and only require that the process be retried. As such, keys that error should be requeued.
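The intended behavior can be sketched roughly as follows. The retryQueue interface is a simplified, string-keyed stand-in modeled on the Forget, AddRateLimited, and Done methods of client-go's rate-limiting work queue; recordingQueue, handleKey, and errTransient are names invented for this example and are not the controller's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// errTransient is a hypothetical ephemeral failure used by the demo.
var errTransient = errors.New("transient failure")

// retryQueue is the small subset of queue behavior the sketch needs.
type retryQueue interface {
	AddRateLimited(key string)
	Forget(key string)
	Done(key string)
}

// recordingQueue is a fake queue that records what was requeued or forgotten.
type recordingQueue struct {
	requeued  []string
	forgotten []string
}

func (q *recordingQueue) AddRateLimited(key string) { q.requeued = append(q.requeued, key) }
func (q *recordingQueue) Forget(key string)         { q.forgotten = append(q.forgotten, key) }
func (q *recordingQueue) Done(key string)           {}

// handleKey processes one key. On error the key is requeued with rate
// limiting; Forget is called only when processing succeeded.
func handleKey(q retryQueue, key string, processItem func(string) error) {
	defer q.Done(key)
	if err := processItem(key); err != nil {
		q.AddRateLimited(key)
		return
	}
	q.Forget(key)
}

func main() {
	q := &recordingQueue{}
	handleKey(q, "default/bad-pvc", func(string) error { return errTransient })
	handleKey(q, "default/good-pvc", func(string) error { return nil })
	fmt.Println("requeued:", q.requeued)
	fmt.Println("forgotten:", q.forgotten)
}
```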

Validate image after copy

We are likely exposed to vulnerabilities by allowing users to import any image into the kubernetes cluster.

Need ginkgo and gomega vendored

:  cd pkg/controller/
: ls
controller.go  controller_suite_test.go  controller_test.go  util.go
: go test -c
# github.com/kubevirt/containerized-data-importer/pkg/controller
controller_suite_test.go:4:2: cannot find package "github.com/onsi/ginkgo" in any of:
	/root/go/src/github.com/kubevirt/containerized-data-importer/vendor/github.com/onsi/ginkgo (vendor tree)
	/usr/local/go/src/github.com/onsi/ginkgo (from $GOROOT)
	/root/go/src/github.com/onsi/ginkgo (from $GOPATH)
FAIL	github.com/kubevirt/containerized-data-importer/pkg/controller [setup failed]
