I got the following error when I tried to deploy a qvisor-based nginx pod on a kubeadm cluster:
yaoxin@master:~/Quark$ kubectl describe pod
Name: nginx-quark
Namespace: default
Priority: 0
Runtime Class Name: quark
Service Account: default
Node: worker/192.168.122.12
Start Time: Thu, 01 Sep 2022 09:36:55 +0000
Labels: <none>
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Containers:
nginx:
Container ID:
Image: nginx
Image ID:
Port: <none>
Host Port: <none>
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fs9jj (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-fs9jj:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 26s default-scheduler Successfully assigned default/nginx-quark to worker
Warning FailedCreatePodSandBox 21s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: Others("Other: Common(\"ttrpc error is Other(\\\"IOError(\\\\\\\"WriteFile \\\\\\\\\\\\\\\"/sys/fs/cgroup/cpu/kubepods-besteffort-podc6afbe65_7732_4026_8075_8b677c2ecbba.slice:cri-containerd:81e21487a6ae2bef9ec479fc1bba62403d0cb914c3b8a10cc271b6477f36d58e/cpu.shares\\\\\\\\\\\\\\\" io::error is Os { code: 13, kind: PermissionDenied, message: \\\\\\\\\\\\\\\"Permission denied\\\\\\\\\\\\\\\" }\\\\\\\")\\\")\")"): unknown
Warning FailedCreatePodSandBox 6s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: Others("Other: Common(\"ttrpc error is Other(\\\"IOError(\\\\\\\"WriteFile \\\\\\\\\\\\\\\"/sys/fs/cgroup/cpu/kubepods-besteffort-podc6afbe65_7732_4026_8075_8b677c2ecbba.slice:cri-containerd:db9ab5d94833c4795b7d6029f004056d81b1fc34d42b543b05e03e55350d7796/cpu.shares\\\\\\\\\\\\\\\" io::error is Os { code: 13, kind: PermissionDenied, message: \\\\\\\\\\\\\\\"Permission denied\\\\\\\\\\\\\\\" }\\\\\\\")\\\")\")"): unknown
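The nested escaping in the events above makes the failing file hard to spot. A minimal sketch for pulling the cgroup path out of the message follows; `failing_cgroup_path` is a hypothetical helper name, not part of kubectl or Quark, and it assumes the path always starts with `/sys/fs/cgroup/` and ends at the first backslash or double quote.

```shell
#!/bin/sh
# Hypothetical helper: print the first /sys/fs/cgroup/... path found in an
# escaped kubelet event message read from stdin.
failing_cgroup_path() {
    # Inside a bracket expression a backslash is literal, so [^\"] stops the
    # match at the first backslash or double quote that follows the path.
    grep -o '/sys/fs/cgroup/[^\"]*' | head -n 1
}

# Usage (on a machine with access to the cluster):
#   kubectl describe pod nginx-quark | failing_cgroup_path
```

The extracted path has the form `<slice>:cri-containerd:<container id>`, which resembles the `slice:prefix:name` cgroups path format used with the systemd cgroup driver. Whether that is related to the permission error is an assumption that the logs alone do not confirm.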
Environment
Kernel version:
yaoxin@master:~/Quark$ uname -r
5.15.0-46-generic
Kubeadm version:
yaoxin@master:~/Quark$ sudo kubeadm version
[sudo] password for yaoxin:
kubeadm version: &version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.0", GitCommit:"a866cbe2e5bbaa01cfd5e969aa3e033f3282a8a2", GitTreeState:"clean", BuildDate:"2022-08-23T17:43:25Z", GoVersion:"go1.19", Compiler:"gc", Platform:"linux/amd64"}
Check nested virtualization on master VM/Node:
yaoxin@master:~/Quark$ sudo virt-host-validate
QEMU: Checking for hardware virtualization : PASS
QEMU: Checking if device /dev/kvm exists : PASS
QEMU: Checking if device /dev/kvm is accessible : PASS
QEMU: Checking if device /dev/vhost-net exists : PASS
QEMU: Checking if device /dev/net/tun exists : PASS
QEMU: Checking for cgroup 'cpu' controller support : PASS
QEMU: Checking for cgroup 'cpuacct' controller support : PASS
QEMU: Checking for cgroup 'cpuset' controller support : PASS
QEMU: Checking for cgroup 'memory' controller support : PASS
QEMU: Checking for cgroup 'devices' controller support : PASS
QEMU: Checking for cgroup 'blkio' controller support : PASS
QEMU: Checking for device assignment IOMMU support : WARN (No ACPI DMAR table found, IOMMU either disabled in BIOS or not supported by this hardware platform)
QEMU: Checking for secure guest support : WARN (Unknown if this platform has Secure Guest support)
LXC: Checking for Linux >= 2.6.26 : PASS
LXC: Checking for namespace ipc : PASS
LXC: Checking for namespace mnt : PASS
LXC: Checking for namespace pid : PASS
LXC: Checking for namespace uts : PASS
LXC: Checking for namespace net : PASS
LXC: Checking for namespace user : PASS
LXC: Checking for cgroup 'cpu' controller support : PASS
LXC: Checking for cgroup 'cpuacct' controller support : PASS
LXC: Checking for cgroup 'cpuset' controller support : PASS
LXC: Checking for cgroup 'memory' controller support : PASS
LXC: Checking for cgroup 'devices' controller support : PASS
LXC: Checking for cgroup 'freezer' controller support : PASS
LXC: Checking for cgroup 'blkio' controller support : PASS
LXC: Checking if device /sys/fs/fuse/connections exists : PASS
Check nested virtualization on worker VM/Node:
yaoxin@worker:~/Quark$ sudo virt-host-validate
[sudo] password for yaoxin:
QEMU: Checking for hardware virtualization : PASS
QEMU: Checking if device /dev/kvm exists : PASS
QEMU: Checking if device /dev/kvm is accessible : PASS
QEMU: Checking if device /dev/vhost-net exists : PASS
QEMU: Checking if device /dev/net/tun exists : PASS
QEMU: Checking for cgroup 'cpu' controller support : PASS
QEMU: Checking for cgroup 'cpuacct' controller support : PASS
QEMU: Checking for cgroup 'cpuset' controller support : PASS
QEMU: Checking for cgroup 'memory' controller support : PASS
QEMU: Checking for cgroup 'devices' controller support : PASS
QEMU: Checking for cgroup 'blkio' controller support : PASS
QEMU: Checking for device assignment IOMMU support : WARN (No ACPI DMAR table found, IOMMU either disabled in BIOS or not supported by this hardware platform)
QEMU: Checking for secure guest support : WARN (Unknown if this platform has Secure Guest support)
LXC: Checking for Linux >= 2.6.26 : PASS
LXC: Checking for namespace ipc : PASS
LXC: Checking for namespace mnt : PASS
LXC: Checking for namespace pid : PASS
LXC: Checking for namespace uts : PASS
LXC: Checking for namespace net : PASS
LXC: Checking for namespace user : PASS
LXC: Checking for cgroup 'cpu' controller support : PASS
LXC: Checking for cgroup 'cpuacct' controller support : PASS
LXC: Checking for cgroup 'cpuset' controller support : PASS
LXC: Checking for cgroup 'memory' controller support : PASS
LXC: Checking for cgroup 'devices' controller support : PASS
LXC: Checking for cgroup 'freezer' controller support : FAIL (Enable 'freezer' in kernel Kconfig file or mount/enable cgroup controller in your system)
LXC: Checking for cgroup 'blkio' controller support : PASS
LXC: Checking if device /sys/fs/fuse/connections exists : PASS
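Comparing the two outputs above, the only check that differs between the nodes is the LXC 'freezer' controller, which passes on the master and fails on the worker. A small sketch for reducing the output to the checks that did not pass, so the two nodes are easier to compare; `non_passing_checks` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: keep only the virt-host-validate lines whose result
# is not PASS (for example WARN or FAIL), reading the report from stdin.
non_passing_checks() {
    grep -Ev ': PASS[[:space:]]*$'
}

# Usage on each node:
#   sudo virt-host-validate | non_passing_checks
```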
Master VM libvirt xml:
<domain type='kvm'>
<name>master</name>
<memory unit='G'>3</memory>
<currentMemory unit='G'>3</currentMemory>
<vcpu>2</vcpu>
<os>
<type arch='x86_64' machine='pc'>hvm</type>
<boot dev='hd'/> <!-- i.e. hard disk; boot from disk -->
</os>
<cpu mode='host-passthrough'/>
<features>
<acpi/>
<apic/>
<pae/>
</features>
<clock offset='localtime'/>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<devices>
<emulator>/usr/bin/qemu-system-x86_64</emulator>
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2'/>
<source file='/home/yaoxin/project/useful_script/test/master_img.qcow2'/> <!-- destination image path -->
<target dev='hda' bus='ide'/>
</disk>
<disk type='file' device='cdrom'>
<source file='/home/yaoxin/project/useful_script/test/ubuntu-22.04.1-live-server-amd64.iso'/> <!-- CD-ROM image path -->
<target dev='hdb' bus='ide'/>
</disk>
<interface type='bridge'>
<source bridge='virbr0'/>
<mac address="00:16:3e:5d:aa:a9"/>
</interface>
<input type='mouse' bus='ps2'/>
<graphics type='vnc' port='-1' autoport='yes' keymap='en-us'/>
</devices>
</domain>
Worker VM libvirt xml:
<domain type='kvm'>
<name>master</name>
<memory unit='G'>3</memory>
<currentMemory unit='G'>3</currentMemory>
<vcpu>2</vcpu>
<os>
<type arch='x86_64' machine='pc'>hvm</type>
<boot dev='hd'/> <!-- i.e. hard disk; boot from disk -->
</os>
<cpu mode='host-passthrough'/>
<features>
<acpi/>
<apic/>
<pae/>
</features>
<clock offset='localtime'/>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<devices>
<emulator>/usr/bin/qemu-system-x86_64</emulator>
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2'/>
<source file='/home/yaoxin/project/useful_script/test/master_img.qcow2'/> <!-- destination image path -->
<target dev='hda' bus='ide'/>
</disk>
<disk type='file' device='cdrom'>
<source file='/home/yaoxin/project/useful_script/test/ubuntu-22.04.1-live-server-amd64.iso'/> <!-- CD-ROM image path -->
<target dev='hdb' bus='ide'/>
</disk>
<interface type='bridge'>
<source bridge='virbr0'/>
<mac address="00:16:3e:5d:aa:a9"/>
</interface>
<input type='mouse' bus='ps2'/>
<graphics type='vnc' port='-1' autoport='yes' keymap='en-us'/>
</devices>
</domain>
Containerd version
yaoxin@master:~/Quark$ containerd version
INFO[2022-09-01T11:16:48.013602390Z] starting containerd revision=9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6 version=1.6.8
CPU model on VM
yaoxin@master:~/Quark$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 69
model name : Intel(R) Core(TM) i5-4210U CPU @ 1.70GHz
stepping : 1
microcode : 0x26
cpu MHz : 2394.456
cache size : 16384 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt arat umip md_clear arch_capabilities
vmx flags : vnmi preemption_timer invvpid ept_x_only ept_ad ept_1gb flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest shadow_vmcs pml
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs srbds
bogomips : 4788.91
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 69
model name : Intel(R) Core(TM) i5-4210U CPU @ 1.70GHz
stepping : 1
microcode : 0x26
cpu MHz : 2394.456
cache size : 16384 KB
physical id : 1
siblings : 1
core id : 0
cpu cores : 1
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt arat umip md_clear arch_capabilities
vmx flags : vnmi preemption_timer invvpid ept_x_only ept_ad ept_1gb flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest shadow_vmcs pml
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs srbds
bogomips : 4788.91
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
How to reproduce?
After booting the master and worker VMs, I did the following on both of them:
- Download qvisor
- Compile it with "ShimMode" : true, then install it:
make install
- Modify /etc/containerd/config.toml:
cat <<EOF | sudo tee /etc/containerd/config.toml
version = 2
[plugins."io.containerd.runtime.v1.linux"]
shim_debug = true
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runsc]
runtime_type = "io.containerd.runsc.v1"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.quark]
runtime_type = "io.containerd.quark.v1"
EOF
- Start a k8s cluster
# Execute on master node
sudo kubeadm init --cri-socket=/var/run/containerd/containerd.sock --pod-network-cidr=10.244.0.0/16
sudo rm $HOME/.kube/config
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Execute on worker node
sudo kubeadm join 10.218.233.29:6443 --cri-socket=/var/run/containerd/containerd.sock --token qy2r1j.t0y5ekx71t0tcfiq \
--discovery-token-ca-cert-hash sha256:78a23762652befd90bbcd3506ca9309c5243371360d7a66fc131cb1a4b25555
- Add CNI to K8S
kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
- Add quark as a RuntimeClass resource to K8S
cat <<EOF | kubectl apply -f -
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
name: quark
handler: quark
EOF
- Use Quark to run nginx
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
name: nginx-quark
spec:
runtimeClassName: quark
containers:
- name: nginx
image: nginx
EOF
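For the containerd configuration step above, containerd resolves a `runtime_type` of the form `io.containerd.<name>.<version>` to a shim binary named `containerd-shim-<name>-<version>`. A minimal sketch for checking that the shim is present on each node; `shim_binary_for` is a hypothetical helper name, and the check assumes the shell's PATH matches the one the containerd service sees.

```shell
#!/bin/sh
# Hypothetical helper: map a containerd runtime_type to the shim binary name
# (io.containerd.<name>.<version> -> containerd-shim-<name>-<version>).
shim_binary_for() {
    printf '%s\n' "$1" |
        sed -E 's/^io\.containerd\.([^.]+)\.([^.]+)$/containerd-shim-\1-\2/'
}

# Usage on each node:
#   command -v "$(shim_binary_for io.containerd.quark.v1)"
#   sudo containerd config dump | grep quark
```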
Then I checked the pod status:
yaoxin@master:~/Quark$ kubectl get pod
NAME READY STATUS RESTARTS AGE
nginx-quark 0/1 ContainerCreating 0 109m
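Since the failing write targets `cpu.shares`, which is a cgroup v1 file (cgroup v2 uses `cpu.weight` instead), it may be worth checking which cgroup version the worker node mounts. The sketch below classifies the filesystem type reported for `/sys/fs/cgroup/`; `cgroup_version_for_fstype` is a hypothetical helper name, and treating the cgroup version as relevant to this failure is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: classify the filesystem type that stat reports for
# /sys/fs/cgroup/. cgroup2fs means the unified (v2) hierarchy; tmpfs means
# a v1 layout with per-controller mounts underneath.
cgroup_version_for_fstype() {
    case "$1" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1" ;;
        *)         echo "unknown" ;;
    esac
}

# Usage on the worker node:
#   cgroup_version_for_fstype "$(stat -fc %T /sys/fs/cgroup/)"
```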