Azure / acs-engine
WE HAVE MOVED: Please join us at Azure/aks-engine!
Home Page: https://github.com/Azure/aks-engine
License: MIT License
Currently, custom data is embedded in the project; this task suggests pulling it from https://github.com/dcos/dcos instead.
Unable to stop kubelet.service via systemctl on k8s master node.
> systemctl stop kubelet.service
> systemctl status kubelet.service
Nov 22 01:29:19 k8s-master-1479775666-0 systemd[1]: Stopping Kubelet...
Nov 22 01:29:19 k8s-master-1479775666-0 docker[26894]: Error response from daemon: No such container: kubelet
Should be a fairly easy fix if you name the Docker container when starting it:
https://github.com/Azure/acs-engine/blob/master/parts/kubernetesmastercustomdata.yml#L323
Before we can support clusters with both linux and windows nodes, we need to have os label constraints on the addons.
RE: moby/moby#23793
Just hit this on one of my deployments:
● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/docker.service.d
└─clear_mount_propagation_flags.conf, overlay.conf
Active: failed (Result: exit-code) since Thu 2016-11-03 04:56:35 UTC; 8min ago
Docs: https://docs.docker.com
Main PID: 4061 (code=exited, status=1/FAILURE)
Nov 03 04:56:35 k8s-agent-13086297-1 systemd[1]: Starting Docker Application Container Engine...
Nov 03 04:56:35 k8s-agent-13086297-1 docker[4061]: time="2016-11-03T04:56:35.613750839Z" level=fatal msg="no sockets found via socket activation: make sure the service was started by systemd"
Nov 03 04:56:35 k8s-agent-13086297-1 systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Nov 03 04:56:35 k8s-agent-13086297-1 systemd[1]: Failed to start Docker Application Container Engine.
Nov 03 04:56:35 k8s-agent-13086297-1 systemd[1]: docker.service: Unit entered failed state.
Nov 03 04:56:35 k8s-agent-13086297-1 systemd[1]: docker.service: Failed with result 'exit-code'.
Obviously kubelet didn't come up either.
We now have external dependencies in this project.
We should strongly consider vendoring them into the repo.
This re-presents the problem of this repo being used as both a package and a standalone CLI.
Generally you want to vendor for binaries, but not vendor for packages. We may want to revisit splitting the CLI part out of this into a separate tool.
Let's consider a Windows node as it would be deployed by Anthony's Windows branch: https://github.com/Azure/acs-engine/tree/anhowe-wink8s.
This node is identified in Azure by k8s-windowspool2-250945250.
This node has a hostname of 25094acs0901 (i.e., the MachineName in Windows is 25094acs0901).
The kubelet is overriding the hostname so that the apiserver identifies it as k8s-windowspool2-250945250.
However, the Kubernetes control plane expects to be able to resolve the registered hostname back to the actual machine (this is used for logs and remote pod connectivity). Unfortunately, since the hostname in Windows is 25094acs0901, there is never an internal DNS record created for k8s-windowspool2-250945250, hence the errors we observed during KubeCon that look like this:
Get https://k8s-agentpool2-299742071:10250/containerLogs/default/wordpressapp-ext-h9t8c/wordpressapp-ext?timestamps=true: dial tcp: lookup k8s-agentpool2-299742071 on 168.63.129.16:53: no such host
The first fix that comes to mind is just not overriding the hostname, but then the Routes won't be created properly because the cloudprovider expects to be able to look up the VM instance via ARM using the hostname.
Why is Azure so incredibly limiting when it comes to Windows hostnames? The 15 character limit stems from NetBIOS which, afaik, hasn't been used in Windows for a very long time.
I use the xplat CLI 0.10.0. I followed the README and successfully created an ACS cluster with Docker Swarm using the command below:
$ azure group deployment create
--name=""
--resource-group="<RESOURCE_GROUP_NAME>"
--template-file="./_output//azuredeploy.json"
--parameters-file="./_output//azuredeploy.parameters.json"
The problem is that when I run azure acs list -s {subscription id} -g {resource group name}, I get an empty response.
But when I use portal->new->containers->Azure Container Service to create a new Docker Swarm cluster with the UI, I do get a response from the same azure acs list -s {subscription id} -g {resource group name} command.
Are there any differences between an ACS cluster created with the portal and one created from the command line?
The nameSuffix field is missing from the azuredeploy.parameters.json file following template generation. It is in the azuredeploy.json file and causes the following error when following the documentation:
info: Executing command group deployment create
error: Template and Deployment "parameters" objects lengths do not match
Deployment Parameter file does not have { nameSuffix } defined.
error: Error information has been recorded to <path>/.azure/azure.err
error: group deployment create command failed
The error is solved by manually adding the nameSuffix field to the azuredeploy.parameters.json file.
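The mismatch described above can be caught before deploying by comparing the parameter names in the two files. The following is a minimal sketch, assuming the usual ARM layout in which both files carry a top-level "parameters" object; the function name missingParameters and the sample JSON are hypothetical, and the sketch deliberately ignores default values, so it may be stricter than the CLI itself.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// parametersDoc models only the top-level "parameters" object, which is
// present in both azuredeploy.json and azuredeploy.parameters.json.
type parametersDoc struct {
	Parameters map[string]json.RawMessage `json:"parameters"`
}

// missingParameters (hypothetical helper) returns the names declared in the
// template that have no entry in the parameters file, in sorted order.
func missingParameters(templateJSON, parametersJSON []byte) ([]string, error) {
	var tmpl, params parametersDoc
	if err := json.Unmarshal(templateJSON, &tmpl); err != nil {
		return nil, fmt.Errorf("parsing template: %w", err)
	}
	if err := json.Unmarshal(parametersJSON, &params); err != nil {
		return nil, fmt.Errorf("parsing parameters file: %w", err)
	}
	var missing []string
	for name := range tmpl.Parameters {
		if _, ok := params.Parameters[name]; !ok {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing, nil
}

func main() {
	// Illustrative stand-ins for the two generated files.
	template := []byte(`{"parameters": {"dnsPrefix": {"type": "string"}, "nameSuffix": {"type": "string"}}}`)
	parameters := []byte(`{"parameters": {"dnsPrefix": {"value": "mycluster"}}}`)

	missing, err := missingParameters(template, parameters)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(missing) // [nameSuffix]
}
```

Running a check like this against the generated _output files would point at nameSuffix before the deployment command is attempted.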
For my use case I would like to create a single ARM template which deploys an ACS cluster (DC/OS orchestrator) and some additional services directly provisioned on top of Marathon.
I've tried adding a Custom Script Extension to the Mesos master. This works, although there is no way to resolve the VM's resource name because of the generated 8-character hex string that makes the resources unique.
Issue #53 would solve this particular problem, but I think it would be cleaner if it were somehow possible to provide a reference to a script.
This has two benefits:
We would warn if the user is generating a template with a dnsPrefix that already exists. That means they're probably going to have a conflict at deploy time, and/or they forgot to update the DNS prefix in the model to be different for their next cluster.
It's a lot more intuitive. I still don't like the fact that we expose this randomly generated number to users in the first place.
cc: @anhowe what do you think?
Hi guys,
I used acs-engine to generate an ACS template and deploy it to China, but at the last step it showed the following information:
info: Resource 'dcos-agentpublic-nsg-31045692' of type 'Microsoft.Network/networkSecurityGroups' provisioning status is Succeeded
info: Resource 'dcos-agentprivate-nsg-31045692' of type 'Microsoft.Network/networkSecurityGroups' provisioning status is Succeeded
error: connect EAGAIN 42.159.198.85:443 - Local (0.0.0.0:17141)
error: Error information has been recorded to /home/steven/.azure/azure.err
error: group deployment create command failed
/home/steven/.azure/azure.err file details:
2016-11-19T16:13:31.024Z:
{ [Error: connect EAGAIN 42.159.198.85:443 - Local (0.0.0.0:17141)]
stack: [Getter/Setter],
code: 'EAGAIN',
errno: 'EAGAIN',
syscall: 'connect',
address: '42.159.198.85',
port: 443,
__frame:
{ name: '__1',
line: 79,
file: '/usr/local/lib/node_modules/azure-cli/lib/commands/arm/group/group.deployment.js',
prev:
{ name: '__1',
line: 52,
file: '/usr/local/lib/node_modules/azure-cli/lib/commands/arm/group/group.deployment.js',
prev: undefined,
calls: 2,
active: false,
offset: 26,
col: 24 },
calls: 96,
active: false,
offset: 2,
col: 44 },
rawStack: [Getter] }
Error: connect EAGAIN 42.159.198.85:443 - Local (0.0.0.0:17141)
<<< async stack >>>
at __1 (/usr/local/lib/node_modules/azure-cli/lib/commands/arm/group/group.deployment.js:81:45)
at __1 (/usr/local/lib/node_modules/azure-cli/lib/commands/arm/group/group.deployment.js:78:25)
<<< raw stack >>>
at Object.exports._errnoException (util.js:907:11)
at exports._exceptionWithHostPort (util.js:930:20)
at connect (net.js:865:16)
Could you please share any ideas to help on that? Thanks a lot!
Hi,
It would be helpful if acs-engine allowed parametrizing the ephemeral disk layout for dcos. Currently we can manually change the values inside the file dcoscustomdata184.t
layout:
-50
-50
and then rebuild the project and generate the deploy templates. But we end up with all nodes having the same layout. One of the disks (ephemeral0.1) is used as the Mesos directory (/var/lib/mesos) and the second (ephemeral0.2) is used as the Docker directory (/var/lib/docker). Sometimes there can be more optimal layouts.
For example in our scenario (cassandra on dcos) we would like to have
/var/lib/mesos
for storing data)
Currently the workaround is to manually edit the generated deploy templates.
After rebasing on Azure/master I had two problems doing a Kubernetes generation/deployment:
I had to add "isStateful" to the agentpool definitions to avoid this error:
error while loading /tmp/tmp.aEZwUfrU0K: error validating acs cluster from file /tmp/tmp.aEZwUfrU0K: stateless (VMSS) deployments are not supported with Kubernetes, Kubernetes requires the ability to attach/detach disks. To fix specify 'isStateful=true'
After fixing that, I get:
+ ./acstgen /tmp/tmp.t5g0tSw0i4
cert creation took 9.584222059s
error generating template /tmp/tmp.t5g0tSw0i4: template: masteroutputs.t:6: function "RequiresFakeAgentOutput" not defined
Ref: kubernetes/kubernetes#34526
Need to output this field in the azure.json file that is written out via CustomScriptExtension.
@anhowe: I'll need your advice on how to do this. Should we just assume that the first availability set is the one we want loadbalanced? Should we require the user to specify the primaryAvailabilitySet to point at a specific agentpool as a field in the OrchestratorProfile?
Issue to track the addition of HA to Kubernetes in ACS-Engine.
Depending on where kubeadm is at whenever we get to this, we may want to adopt it to minimize our maintenance burden.
Hi,
I want to deploy Kubernetes on Azure and wanted SSL termination at load balancer level (typical as ingress object in Kubernetes).
As of now, only Azure Load Balancers (L4) are provisioned by ACS. Is there a roadmap for supporting Application Gateways?
In the meantime, what is my best option to implement SSL offloading? E.g., could I put an Application Gateway in front of the load balancers? Or should I implement SSL termination myself with dedicated nginx container(s), or use something like the 'dynamic SSL' plugin of Kong?
Thx !
I managed to deploy a Kubernetes cluster successfully on the first try. It was actually great. I just noticed that acs-engine created a kubeconfig folder with a server configuration for each supported region. I did not see these files referenced anywhere in your docs. Can you please point to the docs that explain what these files are used for?
Use MustCompile in an init function. Removes the need to compile the regex every time and removes the need for error checking.
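A minimal sketch of the suggestion above: the pattern is compiled once in init with regexp.MustCompile, which panics at startup on an invalid pattern, so callers have no error to check. The pattern itself and the names dnsPrefixRegex and validateDNSPrefix are hypothetical and are not taken from the acs-engine source.

```go
package main

import (
	"fmt"
	"regexp"
)

// dnsPrefixRegex is compiled once when the package is initialized.
var dnsPrefixRegex *regexp.Regexp

func init() {
	// MustCompile panics on an invalid pattern, so a bad regex fails fast at
	// startup instead of surfacing as an error on every call.
	// The pattern below is illustrative only.
	dnsPrefixRegex = regexp.MustCompile(`^[a-z][a-z0-9-]{1,43}[a-z0-9]$`)
}

// validateDNSPrefix reports whether name matches the precompiled pattern.
func validateDNSPrefix(name string) bool {
	return dnsPrefixRegex.MatchString(name)
}

func main() {
	fmt.Println(validateDNSPrefix("mycluster01")) // true
	fmt.Println(validateDNSPrefix("-bad-prefix")) // false
}
```

A package-level `var re = regexp.MustCompile(...)` declaration would behave the same way; the init form is shown only because it is what the issue proposes.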
When I try to scale up the VM instances on Kubernetes via Container Service, it does not allow me to create a new node and returns the error below. It also breaks my cluster: I can no longer reach the apiserver.
Failed to save container service 'containerservice-[name]'. Error: Provisioning of resource(s) for container service 'containerservice-[name]' in resource group '[name]' failed with errors: Resource type: Microsoft.Compute/virtualMachines, name: k8s-agent-1F791599-2, id: /subscriptions/[subscriptionid]/resourceGroups/[name]/providers/Microsoft.Compute/virtualMachines/k8s-agent-1F791599-2, StatusCode: BadRequest, StatusMessage: \n {
"error": {
"code": "InvalidParameter",
"target": "vmSize",
"message": "The value of parameter vmSize is invalid."
}
}
I'm not a huge fan of the b64+gzip trick we're using to shove the addons and customscript into the cloudconfig file.
Unfortunately, the alternative is raising the string expression limit in ARM again (not sure if they're willing). It would also require figuring out how to indent the content correctly within the YAML file, since the contents of the file would need to be inserted with each line having the proper indentation.
Might not be worth the effort, if the only benefit is increasing the readability of the template outputs ever so slightly, but worth consideration.
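For readers unfamiliar with the trick being discussed, here is a minimal sketch of gzip-compressing a payload and base64-encoding the result, plus the reverse step. It is an illustration only: the function names encode and decode are hypothetical, and the sketch does not reproduce how acs-engine actually embeds the result in the cloud-config.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/base64"
	"fmt"
	"io"
)

// encode gzips content and returns it as a single-line base64 string, which
// avoids any YAML indentation concerns for the embedded payload.
func encode(content string) (string, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write([]byte(content)); err != nil {
		return "", err
	}
	// Close flushes the remaining compressed data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes()), nil
}

// decode reverses encode: base64-decode first, then gunzip.
func decode(encoded string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return "", err
	}
	zr, err := gzip.NewReader(bytes.NewReader(raw))
	if err != nil {
		return "", err
	}
	defer zr.Close()
	out, err := io.ReadAll(zr)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	addon := "apiVersion: v1\nkind: Namespace\nmetadata:\n  name: kube-system\n"
	enc, err := encode(addon)
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	dec, err := decode(enc)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(dec == addon) // true
}
```

The encoded form is opaque, which is the readability cost the issue is weighing against the ARM string limit and the indentation problem.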
What happened:
Creating a k8s cluster using an existing vnet, the cluster is unable to create routes in the Azure Route table, and is therefore unable to schedule any pods.
How to reproduce it:
When the cluster is up, the nodes report as ready:
gfadmin@k8s-master-35738843-0:~$ kubectl get nodes
NAME STATUS AGE
k8s-agentpool1-35738843-0 Ready 16h
k8s-agentpool1-35738843-1 Ready 16h
k8s-agentpool1-35738843-2 Ready 16h
k8s-master-35738843-0 Ready,SchedulingDisabled 16h
With a NetworkUnavailable message of "RouteController failed tocreate a route":
gfadmin@k8s-master-35738843-0:~$ kubectl describe node k8s-master-35738843-0
Name: k8s-master-35738843-0
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/instance-type=Standard_D2_v2
beta.kubernetes.io/os=linux
failure-domain.beta.kubernetes.io/region=westus
failure-domain.beta.kubernetes.io/zone=0
kubernetes.io/hostname=k8s-master-35738843-0
Taints: <none>
CreationTimestamp: Wed, 23 Nov 2016 18:40:52 +0000
Phase:
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
OutOfDisk False Thu, 24 Nov 2016 11:02:41 +0000 Wed, 23 Nov 2016 18:40:52 +0000 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Thu, 24 Nov 2016 11:02:41 +0000 Wed, 23 Nov 2016 18:40:52 +0000 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Thu, 24 Nov 2016 11:02:41 +0000 Wed, 23 Nov 2016 18:40:52 +0000 KubeletHasNoDiskPressure kubelet has no disk pressure
Ready True Thu, 24 Nov 2016 11:02:41 +0000 Wed, 23 Nov 2016 18:40:52 +0000 KubeletReady kubelet is posting ready status
NetworkUnavailable True Thu, 24 Nov 2016 11:02:47 +0000 Thu, 24 Nov 2016 11:02:47 +0000 NoRouteCreated RouteController failed tocreate a route
Looking at the kube-controller logs (/var/log/containers):
routecontroller.go:132] Could not create route 5cb8901d-b1ac-11e6-89eb-000d3a32ff9f 10.244.2.0/24 for node k8s-master-35738843-0 after 38.691596ms: network.SubnetsClient#Get: Failure responding to request: StatusCode=404 -- Original Error: autorest/azure: Service returned an error. Status=404 Code=\"ResourceNotFound\" Message=\"The Resource 'Microsoft.Network/virtualNetworks/subscriptions' under resource group 'ACSRG2' was not found.\"\n","stream":"stderr","time":"2016-11-23T18:51:29.914307462Z"}
Notice that the error message contains a malformed resource: Microsoft.Network/virtualNetworks/subscriptions.
Workaround
We've traced this to /etc/kubernetes/azure.json expecting unqualified names for both the vnet and subnet. Instead, the fully-qualified names are present:
{
...
"subnetName": "/subscriptions/76aabf62-fa6e-41ac-a2f3-5532b22811b5/resourceGroups/ACSRG2/providers/Microsoft.Network/virtualNetworks/k8s-vnet-test/subnets/k8s-subnet-test",
"securityGroupName": "...",
"vnetName": "/subscriptions/76aabf62-fa6e-41ac-a2f3-5532b22811b5/resourceGroups/ACSRG2/providers/Microsoft.Network/virtualNetworks/k8s-vnet-test",
...
}
After changing the subnet and vnet to unqualified names and restarting kubelet, we see the routes as being created and things are back to normal.
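The manual workaround amounts to keeping only the last path segment of each resource ID. A minimal sketch of that conversion follows; the helper name unqualifiedName is hypothetical, and this is not the fix that was applied in acs-engine.

```go
package main

import (
	"fmt"
	"strings"
)

// unqualifiedName (hypothetical helper) returns the last path segment of an
// ARM resource ID. A name that is already unqualified is returned unchanged.
func unqualifiedName(id string) string {
	id = strings.TrimRight(id, "/")
	if i := strings.LastIndex(id, "/"); i >= 0 {
		return id[i+1:]
	}
	return id
}

func main() {
	vnet := "/subscriptions/76aabf62-fa6e-41ac-a2f3-5532b22811b5/resourceGroups/ACSRG2/providers/Microsoft.Network/virtualNetworks/k8s-vnet-test"
	subnet := vnet + "/subnets/k8s-subnet-test"

	fmt.Println(unqualifiedName(vnet))   // k8s-vnet-test
	fmt.Println(unqualifiedName(subnet)) // k8s-subnet-test
}
```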
Much of the credit in debugging this goes to @jamesbak.
I tried to install ACS Engine on a Linux machine that already had Docker installed, following the document below:
https://github.com/Azure/acs-engine/blob/master/docs/acsengine.md
Linux
For Linux, ensure Docker is installed, and follow the developer instructions at https://github.com/Azure/acs-engine#development-docker to build and use the ACS Engine.
After clicking the above link, the instructions for installing ACS Engine on Linux are not very clear.
Development (Docker)
The easiest way to get started developing on acs-engine is to use Docker. If you already have Docker or "Docker for {Windows,Mac}" then you can get started without needing to install anything extra.
•Windows (PowerShell): .\scripts\devenv.ps1
•Linux (bash): ./scripts/devenv.sh
This setup mounts the acs-engine source directory as a volume into the Docker container. This means that you can edit your source code normally in your favorite editor on your machine, while still being able to compile and test inside of the Docker container (the same environment used in our Continuous Integration system).
Here's a quick demo video showing the dev/build/test cycle with this setup.
Do we need to export any path or parameters on Linux? Where is devenv.sh located? It would be great to have a step-by-step installation guide like the ones for Windows or OSX. The current installation guide for Linux is not very clear.
Since gcr.io is not reachable from mainland China, I noticed that all acs-engine guides in China suggest that users replace gcr.io in both code and configuration files with something else and, what's worse, re-compile acs-engine to make this work.
This is a very bad experience.
If we can not solve the URL problem, we can try to remove all image URLs in code at least, for example:
acs-engine/pkg/acsengine/const.go
Line 22 in ef10fce
README says:
Generated templates can be deployed using the Azure XPlat CLI (v0.10 only)
I have v0.10.5; however, the examples/kubernetes.json template output I got does not have the "nameSuffix" parameter in the _output/Kubernetes-*/azuredeploy.parameters.json file, and therefore it fails with this error:
azure group deployment create --template-file azuredeploy.json --parameters-file azuredeploy.parameters.json --resource-group ahmetb-k8s
info: Executing command group deployment create
error: Template and Deployment "parameters" objects lengths do not match
Deployment Parameter file does not have { nameSuffix } defined.
error: Error information has been recorded to /Users/alp/.azure/azure.err
error: group deployment create command failed
I am pretty new here. I followed the instructions, and for some reason the template deployment failed.
{
"error": {
"code": "PropertyChangeNotAllowed",
"target": "customData",
"message": "Changing property 'customData' is not allowed."
}
}
attached template files
ocdc.zip
azure portal errors
The DC/OS template currently generates Ubuntu virtual machines. This is currently not standard in DC/OS. This request is to modify the virtual machine OS to support CoreOS as well as Ubuntu.
As I can see, this will require modification of the cloud-config.yaml file that is parsed into the customData fields found in the template.
Hi Guys,
Great to see this open sourced (and in Go)! I'm looking at using this package to deploy production Kubernetes clusters; however, the inability to deploy to a custom VNET in another resource group is a bit of a problem for us. I was wondering why this isn't possible. Is it a constraint inherited from the resources deployed by the generated template, or a feature that needs adding in code?
If it's a feature request in acs-engine, then I'm happy to muck in and add this if you can point me at the problem.
Cheers,
Morgan
I'm trying to create a service principal using the instructions at
https://github.com/timfpark/acs-engine/blob/master/docs/serviceprincipal.md
and am running into a number of difficulties with both the Azure CLI and the xplat Azure CLI.
With the Azure CLI, the following happens when I attempt to create a principal:
$ az ad sp create-for-rbac --role="Contributor" --scopes="/subscriptions/1d3bc944-c31f-41a9-a1ac-cafea961eba5"
Error loading command module 'storage'
Resource '2bc558c2-4e9f-41fa-bcff-3c57a5a2fce4' does not exist or one of its queried reference-property objects are not present.
While on the xplat CLI, something similar happens with both the one step method:
$ azure ad sp create -n app -p password
info: Executing command ad sp create
+ Creating application rhom
+ Creating service principal for application 83276f3b-8652-468c-95f9-3bf91982273b
error: {"odata.error":{"code":"Request_ResourceNotFound","message":{"lang":"en","value":"Resource 'ServicePrincipal_e0d3c5bd-8ab3-400e-829b-910f562b6a23' does not exist or one of its queried reference-property objects are not present."}}}
error: Error information has been recorded to /Users/tim/.azure/azure.err
error: ad sp create command failed
and two step method:
$ azure ad sp create -a cea1c793-70b1-4681-aa2a-66688eadc271
info: Executing command ad sp create
+ Creating service principal for application cea1c793-70b1-4681-aa2a-66688eadc271
error: Error "Unexpected token in JSON at position 0" occurred in deserializing the responseBody - "{"odata.error":{"code":"Request_ResourceNotFound","message":{"lang":"en","value":"Resource 'ServicePrincipal_b3bbe085-5ae8-41e1-939f-eeeb68450ac0' does not exist or one of its queried reference-property objects are not present."},"requestId":"373a6fcd-3257-44c6-b433-45f098f4c329","date":"2016-11-15T15:21:58"}}" for the default response.
error: Error information has been recorded to /Users/tim/.azure/azure.err
error: ad sp create command failed
The go import paths were not updated. Will send a PR shortly.
I took a quick look at acs-engine and the generated RG templates. First thing: Thumbs up for supporting Kubernetes :) Now I have some open questions. I hope this is the right place to ask them.
How is HA with Kubernetes handled in ACS/acs-engine? All the docs speak about a single Kubernetes master and from what I understand from kubernetesmastercustomdata.yml, only a single non-HA master is set up. In the cluster definition doc however, there is the count field for the master profile, which suggests that HA is supported.
How can we later scale the number of nodes in the cluster? Should we change the number of worker nodes in the generated ARM template? Or should we regenerate the template with a new cluster definition and update the RG with the new template? Does the ACS portal provide this? If we do apply custom ARM templates, can we be sure that Azure does NOT kill/recreate anything important (e.g. route table or LB rules)?
How can we later upgrade to newer versions of Kubernetes? Will there be any support from ACS/acs-engine or will this be a completely manual process? Will there be a difference between major/minor/patch level upgrades of Kubernetes?
How can we ensure the distro has its latest security patches and bug fixes installed?
If the caller of acs-engine does not specify an optional string, ACS engine will generate one to create unique resource names. This will also be used by ACS RP to follow its own resource naming format.
Log here: https://gist.github.com/8a8aa97bb2ecb20f99dc567dc763ca3c
Leaving this for investigation later. The IaaS resources seemed to all be deployed...
Using the Dockerfile with the build command does not generate the executable.
You have to compile manually.
Hi Guys,
I would like to ask for a little help. I tried to provision a Kubernetes cluster with the acs-engine CLI tool. The tool said provisioning ended successfully.
kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
heapster-v1.2.0-194960081-8xrxh 0/2 Pending 0 8m
kube-addon-manager-k8s-master-39229988-0 1/1 Running 0 8m
kube-apiserver-k8s-master-39229988-0 1/1 Running 0 9m
kube-controller-manager-k8s-master-39229988-0 1/1 Running 0 9m
kube-dns-v19-dmvvc 0/3 Pending 0 8m
kube-dns-v19-hnrqp 0/3 Pending 0 8m
kube-proxy-7mqm5 1/1 Running 0 8m
kube-proxy-ebtw9 1/1 Running 0 8m
kube-proxy-lxlbu 1/1 Running 0 8m
kube-proxy-r5bez 1/1 Running 0 5m
kube-scheduler-k8s-master-39229988-0 1/1 Running 0 9m
kubernetes-dashboard-1872324879-kqdsy 0/1 Pending 0 8m
sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
176cbfdf5235 gcr.io/google_containers/hyperkube-amd64:v1.4.5 "/hyperkube proxy --k" 11 minutes ago Up 11 minutes k8s_kube-proxy.6534244d_kube-proxy-lxlbu_kube-system_a62e7361-abd5-11e6-80f0-000d3a2613fc_4ae5ff30
25f91309aa05 gcr.io/google_containers/pause-amd64:3.0 "/pause" 11 minutes ago Up 11 minutes k8s_POD.d8dbe16c_kube-proxy-lxlbu_kube-system_a62e7361-abd5-11e6-80f0-000d3a2613fc_1cbcf764
0243775c437b gcr.io/google_containers/kube-addon-manager-amd64:v5.1 "/opt/kube-addons.sh" 11 minutes ago Up 11 minutes k8s_kube-addon-manager.ed858faf_kube-addon-manager-k8s-master-39229988-0_kube-system_c0133a504dee133427d4802c1f2c3314_8e1e67dc
72ca17c7edb0 gcr.io/google_containers/hyperkube-amd64:v1.4.5 "/hyperkube scheduler" 11 minutes ago Up 11 minutes k8s_kube-scheduler.22257f8_kube-scheduler-k8s-master-39229988-0_kube-system_6203373493987263d369756729453b5f_9bf5a243
1119a6276383 gcr.io/google_containers/pause-amd64:3.0 "/pause" 11 minutes ago Up 11 minutes k8s_POD.d8dbe16c_kube-scheduler-k8s-master-39229988-0_kube-system_6203373493987263d369756729453b5f_79ff11bf
158391dcd7cc gcr.io/google_containers/hyperkube-amd64:v1.4.5 "/hyperkube controlle" 11 minutes ago Up 11 minutes k8s_kube-controller-manager.954cbc53_kube-controller-manager-k8s-master-39229988-0_kube-system_ee5fb6e3d925965b0048e6cc77534a6e_e0ba6b18
af6c5c83eeef gcr.io/google_containers/hyperkube-amd64:v1.4.5 "/hyperkube apiserver" 11 minutes ago Up 11 minutes k8s_kube-apiserver.e54c022a_kube-apiserver-k8s-master-39229988-0_kube-system_1b3fae831a29391607f2e670f7f1e21a_21cb5974
cb14c133721d gcr.io/google_containers/pause-amd64:3.0 "/pause" 11 minutes ago Up 11 minutes k8s_POD.d8dbe16c_kube-controller-manager-k8s-master-39229988-0_kube-system_ee5fb6e3d925965b0048e6cc77534a6e_aecd055e
16d5a7e41944 gcr.io/google_containers/pause-amd64:3.0 "/pause" 11 minutes ago Up 11 minutes k8s_POD.d8dbe16c_kube-apiserver-k8s-master-39229988-0_kube-system_1b3fae831a29391607f2e670f7f1e21a_122a8fd7
33b25ada02b6 gcr.io/google_containers/pause-amd64:3.0 "/pause" 11 minutes ago Up 11 minutes k8s_POD.d8dbe16c_kube-addon-manager-k8s-master-39229988-0_kube-system_c0133a504dee133427d4802c1f2c3314_70fac706
5fce078d090b gcr.io/google_containers/hyperkube-amd64:v1.4.5 "/hyperkube kubelet -" 12 minutes ago Up 12 minutes jovial_hoover
I found tons of the following lines in the log of the hyperkube Docker container:
5728 status_manager.go:450] Failed to update status for pod "_()": Get https://10.240.255.5:443/api/v1/namespaces/kube-system/pods/kube-apiserver-k8s-master-39229988-0: dial tcp 10.240.255.5:443: getsockopt: connection refused
If I try this request myself, I get this error:
wget https://10.240.255.5:443/api/v1/namespaces/kube-system/pods/kube-apiserver-k8s-master-39229988-0
--2016-11-16 08:36:25-- https://10.240.255.5/api/v1/namespaces/kube-system/pods/kube-apiserver-k8s-master-39229988-0
Connecting to 10.240.255.5:443... connected.
ERROR: cannot verify 10.240.255.5's certificate, issued by 'CN=ca':
Unable to locally verify the issuer's authority.
Could you help me, what is wrong with my setup?
Thanks in advance.
Will make it easier for contributions, CI, issue tracking, etc.
In the documentation, there was a reference on how to use the API JSON file to scale the deployment up \ down. I did not see further details. How can I add \ remove nodes in an existing agent pool or the masters, or add \ remove agent pools?
In the startup scripts somewhere, perhaps this should be executed
usermod -a -G docker $ADMIN_USER
Hi,
I used acs-engine to generate the DC/OS ARM JSON file. The default package download addresses are as follows:
https://dcosio.azureedge.net/dcos/testing/bootstrap/${BOOTSTRAP_ID}.bootstrap.tar.xz
https://az837203.vo.msecnd.net/dcos-deps/docker-engine_1.11.2-0~xenial_amd64.deb
https://az837203.vo.msecnd.net/dcos-deps/ipset_6.29-1_amd64.deb
https://az837203.vo.msecnd.net/dcos-deps/libltdl7_2.4.6-0.1_amd64.deb
https://az837203.vo.msecnd.net/dcos-deps/unzip_6.0-20ubuntu1_amd64.deb
In China, I have to use local Chinese addresses to download these packages, but even after I updated the URLs in the parts/dcosprovision.sh file, the generated DC/OS JSON file still keeps the original dcosio and az837203 addresses.
What is the proper way to automatically generate custom URLs in the DC/OS JSON file? Currently I update the JSON file manually.
Filing this for ongoing and future discussion/consideration.
This could dramatically reduce the amount of code we maintain to deploy Kubernetes.
Advantages:
Disadvantages:
kubeconfig out (yet)
It would be beneficial to add a check routine to acs-engine that validates the Service Principal prior to generating the templates. This would reduce issues, since the template currently builds successfully even if the SP is incorrect.
I created an SP through az ad sp create-for-rbac --role contributor --scopes /subscriptions/xxx-yyy-zzz, then I deployed a k8s cluster through the portal UI. After the boxes were up, I ssh'ed into the master node and:
Unable to connect to the server: dial tcp 40.68.165.173:443: i/o timeout
but az login was working fine with my SP account! Confused, I tried restarting the k8s API server using docker restart foo, and suddenly the k8s API server was responding, albeit not all nodes were ready!
NAME STATUS AGE
k8s-agent-a21727d1-0 NotReady 27s
k8s-agent-a21727d1-1 NotReady 30s
k8s-agent-a21727d1-2 NotReady 31s
k8s-master-a21727d1-0 Ready,SchedulingDisabled 27s
I rebooted agent-1 from the web portal UI .. a minute later
NAME STATUS AGE
k8s-agent-a21727d1-0 NotReady 6m
k8s-agent-a21727d1-1 Ready 6m
k8s-agent-a21727d1-2 NotReady 7m
k8s-master-a21727d1-0 Ready,SchedulingDisabled 6m
I haven't rebooted the rest of the nodes yet in case anyone wants to take a look. If I were to guess, it seems the k8s cluster was up before AAD had fully replicated the SP account, and surprisingly, k8s does not auto-retry but somehow gets stuck!
Hi,
I'm trying to run ACS on AzureGermanCloud and found some missing parts:
"cloud": "${CLOUD_NAME}", essential to get kubelet working
All of this could be done by hand, but fixing this would be nice.
Thanks!
We need to update the mounts for kubelet for acsengine. It's missing /dev, causing disk attachment to fail.
Fix the flow in the templates so there is not so much jumping between the logic in the templates and the Go code.