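MindX DL (the Ascend deep learning component) is a reference design for deep learning components that supports Atlas 800 training servers and Atlas 800 inference servers. It provides basic functions such as Ascend AI processor resource management and monitoring, optimized scheduling of Ascend AI processors, and generation of collective communication configuration for distributed training, helping partners quickly develop deep learning platforms.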

    The operating system used here is Ubuntu 18.04, and the CPU uses Huawei's self-developed ARM architecture.

I. Pre-installation preparation

    1. Configure the apt network sources
hello@ubuntu:/etc/apt$ sudo cp sources.list~ sources.list
hello@ubuntu:/etc/apt$ cat sources.list
#
# deb cdrom:[Ubuntu-Server 18.04.5 LTS _Bionic Beaver_ - Release arm64 (20200810)]/ bionic main restricted
#deb cdrom:[Ubuntu-Server 18.04.5 LTS _Bionic Beaver_ - Release arm64 (20200810)]/ bionic main restricted
# See http://help.ubuntu.com/community/UpgradeNotes for how to upgrade to
# newer versions of the distribution.
deb http://cn.ports.ubuntu.com/ubuntu-ports/ bionic main restricted
# deb-src http://cn.ports.ubuntu.com/ubuntu-ports/ bionic main restricted
## Major bug fix updates produced after the final release of the
## distribution.
deb http://cn.ports.ubuntu.com/ubuntu-ports/ bionic-updates main restricted
# deb-src http://cn.ports.ubuntu.com/ubuntu-ports/ bionic-updates main restricted
## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
## team. Also, please note that software in universe WILL NOT receive any
## review or updates from the Ubuntu security team.
deb http://cn.ports.ubuntu.com/ubuntu-ports/ bionic universe
# deb-src http://cn.ports.ubuntu.com/ubuntu-ports/ bionic universe
deb http://cn.ports.ubuntu.com/ubuntu-ports/ bionic-updates universe
# deb-src http://cn.ports.ubuntu.com/ubuntu-ports/ bionic-updates universe
## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
## team, and may not be under a free licence. Please satisfy yourself as to
## your rights to use the software. Also, please note that software in
## multiverse WILL NOT receive any review or updates from the Ubuntu
## security team.
deb http://cn.ports.ubuntu.com/ubuntu-ports/ bionic multiverse
# deb-src http://cn.ports.ubuntu.com/ubuntu-ports/ bionic multiverse
deb http://cn.ports.ubuntu.com/ubuntu-ports/ bionic-updates multiverse
# deb-src http://cn.ports.ubuntu.com/ubuntu-ports/ bionic-updates multiverse
## N.B. software from this repository may not have been tested as
## extensively as that contained in the main release, although it includes
## newer versions of some applications which may provide useful features.
## Also, please note that software in backports WILL NOT receive any review
## or updates from the Ubuntu security team.
deb http://cn.ports.ubuntu.com/ubuntu-ports/ bionic-backports main restricted universe multiverse
# deb-src http://cn.ports.ubuntu.com/ubuntu-ports/ bionic-backports main restricted universe multiverse
## Uncomment the following two lines to add software from Canonical's
## 'partner' repository.
## This software is not part of Ubuntu, but is offered by Canonical and the
## respective vendors as a service to Ubuntu users.
# deb http://archive.canonical.com/ubuntu bionic partner
# deb-src http://archive.canonical.com/ubuntu bionic partner
deb http://ports.ubuntu.com/ubuntu-ports bionic-security main restricted
# deb-src http://ports.ubuntu.com/ubuntu-ports bionic-security main restricted
deb http://ports.ubuntu.com/ubuntu-ports bionic-security universe
# deb-src http://ports.ubuntu.com/ubuntu-ports bionic-security universe
deb http://ports.ubuntu.com/ubuntu-ports bionic-security multiverse
# deb-src http://ports.ubuntu.com/ubuntu-ports bionic-security multiverse
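The active `deb` lines in the listing above follow a regular pattern, so they can be generated rather than edited by hand. The following is a minimal sketch, assuming the same mirrors and the `bionic` codename as above; the function name `gen_ports_sources` is hypothetical and not part of MindX DL.

```shell
#!/usr/bin/env bash
# Print the active "deb" lines of the sources.list shown above.
# gen_ports_sources is a hypothetical helper, not part of MindX DL.
gen_ports_sources() {
    local mirror="${1:-http://cn.ports.ubuntu.com/ubuntu-ports/}"
    local codename="${2:-bionic}"
    local security="http://ports.ubuntu.com/ubuntu-ports"

    # Release and updates pockets come from the regional ports mirror.
    echo "deb ${mirror} ${codename} main restricted"
    echo "deb ${mirror} ${codename}-updates main restricted"
    echo "deb ${mirror} ${codename} universe"
    echo "deb ${mirror} ${codename}-updates universe"
    echo "deb ${mirror} ${codename} multiverse"
    echo "deb ${mirror} ${codename}-updates multiverse"
    echo "deb ${mirror} ${codename}-backports main restricted universe multiverse"
    # Security updates come from the primary ports server.
    echo "deb ${security} ${codename}-security main restricted"
    echo "deb ${security} ${codename}-security universe"
    echo "deb ${security} ${codename}-security multiverse"
}
```

One possible use is `gen_ports_sources | sudo tee /etc/apt/sources.list` followed by `sudo apt update`; back up the existing file first.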

    2. Configure the Kubernetes network source

root@ubuntu:~/123/offline-pkg-arm64# cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
> deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
> EOF
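If this step is scripted and may run more than once, the source line can be written idempotently. This is a minimal sketch; `write_k8s_source` is a hypothetical helper, and the target path is a parameter so that it can be tried on a scratch file first.

```shell
#!/usr/bin/env bash
# Append the Kubernetes apt source line to a list file, but only once.
# write_k8s_source is a hypothetical helper name.
write_k8s_source() {
    local target="${1:-/etc/apt/sources.list.d/kubernetes.list}"
    local line="deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main"

    # Do nothing when the exact line is already present.
    if [ -f "$target" ] && grep -qxF "$line" "$target"; then
        return 0
    fi
    echo "$line" >> "$target"
}
```

If `apt update` afterwards reports a missing public key for this repository, the repository signing key has to be imported first; that step is not shown in the original session.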

    3. Create a directory and download the base packages

root@ubuntu:~/123# mkdir offline-pkg-arm64
root@ubuntu:~/123# cd offline-pkg-arm64/
root@ubuntu:~/123/offline-pkg-arm64# sudo apt update
root@ubuntu:~/123/offline-pkg-arm64# apt-get download conntrack cri-tools haveged keyutils libhavege1 libltdl7 libnfsidmap2 libtirpc-dev libtirpc1 nfs-common nfs-kernel-server rpcbind socat sshpass
root@ubuntu:~/123/offline-pkg-arm64# wget --no-check-certificate https://download.docker.com/linux/ubuntu/dists/bionic/pool/stable/arm64/docker-ce_18.06.3~ce~3-0~ubuntu_arm64.deb
root@ubuntu:~/123/offline-pkg-arm64# apt-get download kubelet=1.17.3-00 kubeadm=1.17.3-00 kubectl=1.17.3-00 kubernetes-cni=0.8.6-00
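Because an offline install fails late if a single package is missing, it may help to check the download directory before copying it to the target machine. This is a minimal sketch; `missing_debs` is a hypothetical helper that relies only on the `<package>_<version>_<arch>.deb` file naming visible in the directory listing later in this article.

```shell
#!/usr/bin/env bash
# missing_debs DIR PKG... : print every package name that has no
# matching PKG_*.deb file in DIR. Hypothetical helper, not part of MindX DL.
missing_debs() {
    local dir="$1"
    shift
    local pkg
    for pkg in "$@"; do
        # The underscore keeps "libtirpc1" from matching "libtirpc-dev_...".
        if ! ls "${dir}/${pkg}"_*.deb > /dev/null 2>&1; then
            echo "$pkg"
        fi
    done
}
```

For example, `missing_debs . conntrack socat sshpass kubelet kubeadm kubectl` prints nothing when all six packages are present in the current directory.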

    4. Pull the Docker images and export them

root@ubuntu:~/123# mkdir docker_images
root@ubuntu:~/123# cd docker_images/
root@ubuntu:~/123/docker_images# docker pull calico/node:v3.11.3
root@ubuntu:~/123/docker_images# docker save -o calico-node_arm64.tar.gz calico/node:v3.11.3
root@ubuntu:~/123/docker_images# docker pull calico/pod2daemon-flexvol:v3.11.3
root@ubuntu:~/123/docker_images# docker save -o calico-pod2daemon-flexvol_arm64.tar.gz calico/pod2daemon-flexvol:v3.11.3
root@ubuntu:~/123/docker_images# docker pull calico/cni:v3.11.3
root@ubuntu:~/123/docker_images# docker save -o calico-cni_arm64.tar.gz calico/cni:v3.11.3
root@ubuntu:~/123/docker_images# docker pull calico/kube-controllers:v3.11.3
root@ubuntu:~/123/docker_images# docker save -o calico-kube-controllers_arm64.tar.gz calico/kube-controllers:v3.11.3
root@ubuntu:~/123/docker_images# docker pull coredns/coredns:1.6.5
root@ubuntu:~/123/docker_images# docker save -o coredns_arm64.tar.gz coredns/coredns:1.6.5
root@ubuntu:~/123/docker_images# docker pull cruse/etcd-arm64:3.4.3-0
root@ubuntu:~/123/docker_images# docker save -o etcd_arm64.tar.gz cruse/etcd-arm64:3.4.3-0
root@ubuntu:~/123/docker_images# docker pull cruse/kube-apiserver-arm64:v1.17.3
root@ubuntu:~/123/docker_images# docker save -o kube-apiserver_arm64.tar.gz cruse/kube-apiserver-arm64:v1.17.3
root@ubuntu:~/123/docker_images# docker pull cruse/kube-controller-manager-arm64:v1.17.3
root@ubuntu:~/123/docker_images# docker save -o kube-controller-manager_arm64.tar.gz cruse/kube-controller-manager-arm64:v1.17.3
root@ubuntu:~/123/docker_images# docker pull cruse/kube-proxy-arm64:v1.17.3-beta.0
root@ubuntu:~/123/docker_images# docker save -o kube-proxy_arm64.tar.gz cruse/kube-proxy-arm64:v1.17.3-beta.0
root@ubuntu:~/123/docker_images# docker pull cruse/kube-scheduler-arm64:v1.17.3-beta.0
root@ubuntu:~/123/docker_images# docker save -o kube-scheduler_arm64.tar.gz cruse/kube-scheduler-arm64:v1.17.3-beta.0
root@ubuntu:~/123/docker_images# docker pull cruse/pause-arm64:3.1
root@ubuntu:~/123/docker_images# docker save -o pause_arm64.tar.gz cruse/pause-arm64:3.1
root@ubuntu:~/123/docker_images# docker login -u xxxxxx -p xxxxxxxx ascendhub.huawei.com
root@ubuntu:~/123/docker_images# docker pull ascendhub.huawei.com/public-ascendhub/vc-controller-manager_arm64:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker pull ascendhub.huawei.com/public-ascendhub/vc-scheduler_arm64:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker pull ascendhub.huawei.com/public-ascendhub/vc-webhook-manager_arm64:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker pull ascendhub.huawei.com/public-ascendhub/vc-webhook-manager-base_arm64:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker pull ascendhub.huawei.com/public-ascendhub/hccl-controller_arm64:v20.2.0
root@ubuntu:~/123/docker_images# docker pull ascendhub.huawei.com/public-ascendhub/ascend-k8sdeviceplugin_arm64:v20.2.0
root@ubuntu:~/123/docker_images# docker pull ascendhub.huawei.com/public-ascendhub/cadvisor_arm64:v0.34.0-r40
root@ubuntu:~/123/docker_images# docker tag ascendhub.huawei.com/public-ascendhub/vc-controller-manager_arm64:v1.0.1-r40 volcanosh/vc-controller-manager:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker tag ascendhub.huawei.com/public-ascendhub/vc-scheduler_arm64:v1.0.1-r40 volcanosh/vc-scheduler:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker tag ascendhub.huawei.com/public-ascendhub/vc-webhook-manager_arm64:v1.0.1-r40 volcanosh/vc-webhook-manager:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker tag ascendhub.huawei.com/public-ascendhub/vc-webhook-manager-base_arm64:v1.0.1-r40 volcanosh/vc-webhook-manager-base:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker tag ascendhub.huawei.com/public-ascendhub/hccl-controller_arm64:v20.2.0 hccl-controller:v20.2.0
root@ubuntu:~/123/docker_images# docker tag ascendhub.huawei.com/public-ascendhub/ascend-k8sdeviceplugin_arm64:v20.2.0 ascend-k8sdeviceplugin:v20.2.0
root@ubuntu:~/123/docker_images# docker tag ascendhub.huawei.com/public-ascendhub/cadvisor_arm64:v0.34.0-r40 google/cadvisor:v0.34.0-r40
root@ubuntu:~/123/docker_images# docker rmi ascendhub.huawei.com/public-ascendhub/vc-controller-manager_arm64:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker rmi ascendhub.huawei.com/public-ascendhub/vc-scheduler_arm64:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker rmi ascendhub.huawei.com/public-ascendhub/vc-webhook-manager_arm64:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker rmi ascendhub.huawei.com/public-ascendhub/vc-webhook-manager-base_arm64:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker rmi ascendhub.huawei.com/public-ascendhub/hccl-controller_arm64:v20.2.0
root@ubuntu:~/123/docker_images# docker rmi ascendhub.huawei.com/public-ascendhub/ascend-k8sdeviceplugin_arm64:v20.2.0
root@ubuntu:~/123/docker_images# docker rmi ascendhub.huawei.com/public-ascendhub/cadvisor_arm64:v0.34.0-r40
root@ubuntu:~/123/docker_images# docker save -o Ascend-K8sDevicePlugin-v20.2.0-arm64-Docker.tar.gz ascend-k8sdeviceplugin:v20.2.0
root@ubuntu:~/123/docker_images# docker save -o hccl-controller-v20.2.0-arm64.tar.gz hccl-controller:v20.2.0
root@ubuntu:~/123/docker_images# docker save -o huawei-cadvisor-v0.34.0-r40-arm64.tar.gz google/cadvisor:v0.34.0-r40
root@ubuntu:~/123/docker_images# docker save -o vc-controller-manager-v1.0.1-r40-arm64.tar.gz volcanosh/vc-controller-manager:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker save -o vc-scheduler-v1.0.1-r40-arm64.tar.gz volcanosh/vc-scheduler:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker save -o vc-webhook-manager-base-v1.0.1-r40-arm64.tar.gz volcanosh/vc-webhook-manager-base:v1.0.1-r40
root@ubuntu:~/123/docker_images# docker save -o vc-webhook-manager-v1.0.1-r40-arm64.tar.gz volcanosh/vc-webhook-manager:v1.0.1-r40
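The pull and save steps above repeat the same two commands for each image, so they can be driven from a list. The following is a minimal sketch, not the author's procedure: `gen_pull_save` is a hypothetical helper that only prints the commands (a dry run), and the three sample pairs are taken from the session above.

```shell
#!/usr/bin/env bash
# Read "image tarball" pairs on stdin and print the matching
# docker pull / docker save commands. Hypothetical helper: it is a dry run
# and executes nothing by itself.
gen_pull_save() {
    local image tarball
    while read -r image tarball; do
        # Skip blank lines and comment lines.
        case "$image" in
            ''|'#'*) continue ;;
        esac
        echo "docker pull ${image}"
        echo "docker save -o ${tarball} ${image}"
    done
}

# Sample input; extend the list with the remaining images as needed.
gen_pull_save <<'EOF'
calico/node:v3.11.3 calico-node_arm64.tar.gz
calico/cni:v3.11.3 calico-cni_arm64.tar.gz
coredns/coredns:1.6.5 coredns_arm64.tar.gz
EOF
```

Piping the output to `sh` runs the commands. Note that `docker save` writes a plain tar archive; the `.tar.gz` suffix is only a file name here and is kept unchanged from the session above.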

Note: some of these images can only be downloaded after obtaining permission on Huawei's Ascend Hub.

https://support.huaweicloud.c...\_03\_0047.html

    5. Directory layout after completion

root@ubuntu:~/123# tree
.
├── docker_images
│   ├── Ascend-K8sDevicePlugin-v20.2.0-arm64-Docker.tar.gz
│   ├── calico-cni_arm64.tar.gz
│   ├── calico-kube-controllers_arm64.tar.gz
│   ├── calico-node_arm64.tar.gz
│   ├── calico-pod2daemon-flexvol_arm64.tar.gz
│   ├── coredns_arm64.tar.gz
│   ├── etcd_arm64.tar.gz
│   ├── hccl-controller-v20.2.0-arm64.tar.gz
│   ├── huawei-cadvisor-v0.34.0-r40-arm64.tar.gz
│   ├── kube-apiserver_arm64.tar.gz
│   ├── kube-controller-manager_arm64.tar.gz
│   ├── kube-proxy_arm64.tar.gz
│   ├── kube-scheduler_arm64.tar.gz
│   ├── pause_arm64.tar.gz
│   ├── vc-controller-manager-v1.0.1-r40-arm64.tar.gz
│   ├── vc-scheduler-v1.0.1-r40-arm64.tar.gz
│   ├── vc-webhook-manager-base-v1.0.1-r40-arm64.tar.gz
│   └── vc-webhook-manager-v1.0.1-r40-arm64.tar.gz
├── offline-pkg-arm64
│   ├── conntrack_1%3a1.4.4+snapshot20161117-6ubuntu2_arm64.deb
│   ├── cri-tools_1.13.0-01_arm64.deb
│   ├── docker-ce_18.06.3~ce~3-0~ubuntu_arm64.deb
│   ├── haveged_1.9.1-6_arm64.deb
│   ├── keyutils_1.5.9-9.2ubuntu2_arm64.deb
│   ├── kubeadm_1.17.3-00_arm64.deb
│   ├── kubectl_1.17.3-00_arm64.deb
│   ├── kubelet_1.17.3-00_arm64.deb
│   ├── kubernetes-cni_0.8.6-00_arm64.deb
│   ├── libhavege1_1.9.1-6_arm64.deb
│   ├── libltdl7_2.4.6-2_arm64.deb
│   ├── libnfsidmap2_0.25-5.1_arm64.deb
│   ├── libtirpc1_0.2.5-1.2ubuntu0.1_arm64.deb
│   ├── libtirpc-dev_0.2.5-1.2ubuntu0.1_arm64.deb
│   ├── nfs-common_1%3a1.3.4-2.1ubuntu5.5_arm64.deb
│   ├── nfs-kernel-server_1%3a1.3.4-2.1ubuntu5.5_arm64.deb
│   ├── rpcbind_0.2.3-0.6ubuntu0.18.04.4_arm64.deb
│   ├── socat_1.7.3.2-2ubuntu2_arm64.deb
│   └── sshpass_1.06-1_arm64.deb
├── offline-pkg-arm64.zip
└── yamls
    ├── ascendplugin-310-v20.2.0.yaml
    ├── ascendplugin-volcano-v20.2.0.yaml
    ├── cadvisor-v0.34.0-r40.yaml
    ├── calico.yaml
    ├── hccl-controller-v20.2.0.yaml
    ├── npu-exporter-v20.2.0.yaml
    └── volcano-v1.0.1-r40.yaml

3 directories, 46 files

Note: the files in the yamls directory can be downloaded from the link below.

https://gitee.com/ascend/mind...
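Before moving on, the three directories shown in the layout above can be checked in one step. This is a minimal sketch; `check_layout` is a hypothetical helper, and it checks only that the directories exist, not their contents.

```shell
#!/usr/bin/env bash
# check_layout ROOT : report which of the three expected subdirectories
# are missing under ROOT; the return status is non-zero if any is missing.
# Hypothetical helper, not part of MindX DL.
check_layout() {
    local root="$1"
    local name
    local status=0
    for name in docker_images offline-pkg-arm64 yamls; do
        if [ ! -d "${root}/${name}" ]; then
            echo "missing: ${root}/${name}"
            status=1
        fi
    done
    return "$status"
}
```

With the layout above, `check_layout /root/123` prints nothing and returns success.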

    6. Configure passwordless SSH login

root@ubuntu:~# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:07dTbsAycQqT2w7HdCwjIyJig5T20FQ/eHZGxWg7pbY root@ubuntu
The key's randomart image is:
+---[RSA 2048]----+
| .+...   .+.     |
|o+ .  o .+ +     |
|+o+ ...=BoO +    |
|...o .o.+/ O     |
|        S @ + .  |
|         E + =   |
|          . o o  |
|             o   |
|                 |
+----[SHA256]-----+
root@ubuntu:~# ssh-copy-id -i 127.0.0.1

    7. Install and configure Ansible

root@ubuntu:~# apt install ansible
root@ubuntu:~# vim /etc/ansible/hosts
# The configuration is as follows
[all:vars]
# default shared directory, you can change it as yours
nfs_shared_dir=/data/atlas_dls
# NFS service IP
nfs_service_ip=192.168.1.110
# Master IP
master_ip=192.168.1.110
# dls install package dir
dls_root_dir=/root/123
# set proxy
proxy=""
# Command for logging in to the Ascend hub
ascendhub_login_command="login_command"
# Generally, you do not need to change the value or delete it.
ascendhub_prefix="ascendhub.huawei.com/public-ascendhub"
# versions
deviceplugin_version="v20.2.0"
cadvisor_version="v0.34.0-r40"
volcano_version="v1.0.1-r40"
hccl_version="v20.2.0"
[nfs_server]
ubuntu ansible_host=192.168.1.110 ansible_ssh_user="root" ansible_ssh_pass="123123"
[localnode]
ubuntu ansible_host=192.168.1.110 ansible_ssh_user="root" ansible_ssh_pass="123123"
[training_node]
ubuntu ansible_host=192.168.1.110 ansible_ssh_user="root" ansible_ssh_pass="123123"
[inference_node]
[A300T_node]
[arm]
ubuntu ansible_host=192.168.1.110 ansible_ssh_user="root" ansible_ssh_pass="123123"
[x86]
[workers:children]
training_node
inference_node
A300T_node
root@ubuntu:~/mindxdl/deploy/offline/steps# vim /etc/ansible/ansible.cfg
log_path = /var/log/ansible.log
host_key_checking = False
deprecation_warnings = False

Note: parameter descriptions. Fill in the values according to your actual environment:

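nfs-host-ip: IP address of the NFS node server, that is, the IP address of this server. If NFS is not installed, it can be set to an empty string, for example "".
master-host-ip: IP address of the management node server, that is, the IP address of this server.
install_dir: the directory to which the base software packages, the image packages and the yamls folder are uploaded.
proxy_address: proxy address. Configure it according to your environment; if no proxy is needed, set it to an empty string, for example "".
login_command: the login command required to pull images from Ascend Hub. It only needs to be configured for online installation, for example "docker login -u xxxxxx@xxxxxx -p xxxxxxxx ascendhub.huawei.com". Take care not to omit the quotation marks around the command. For how to obtain it, see steps 1 to 2 of "Obtaining the MindX DL images". For offline installation it can be set to an empty string, for example "".
single-node-host-name: use the host name of the single node, which can be checked with the hostname command.
IP: IP address of the server.
username: user name for logging in to the server. The root user is recommended to avoid insufficient permissions.
passwd: password of the user for logging in to the server.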
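Since a wrong IP address or directory in the inventory only shows up once the installer runs, it can be useful to read the values back first. This is a minimal sketch, assuming the `key=value` lines of the `/etc/ansible/hosts` file shown above; `inventory_var` is a hypothetical helper and not an Ansible command.

```shell
#!/usr/bin/env bash
# inventory_var FILE KEY : print the value of the first "KEY=value" line
# in an INI-style inventory, with surrounding double quotes removed.
# Hypothetical helper, not an Ansible command.
inventory_var() {
    local file="$1"
    local key="$2"
    sed -n "s/^${key}=//p" "$file" | head -n 1 | tr -d '"'
}
```

For example, `inventory_var /etc/ansible/hosts dls_root_dir` prints `/root/123` for the file above, and the result of `inventory_var /etc/ansible/hosts master_ip` can be compared with the addresses reported by `hostname -I`.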

II. One-click installation

root@ubuntu:~/sshpass# apt install sshpass
root@ubuntu:~/mindxdl/deploy/offline/steps# dos2unix *
root@ubuntu:~/mindxdl/deploy/offline/steps# chmod 500 entry.sh
root@ubuntu:~/mindxdl/deploy/offline/steps# bash -x entry.sh

III. Post-installation verification

    1. Check the Docker information

root@ubuntu:~# docker info
Containers: 35
 Running: 30
 Paused: 0
 Stopped: 5
Images: 18
Server Version: 18.06.3-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: systemd
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: ascend runc
Default Runtime: ascend
Init Binary: docker-init
containerd version: 468a545b9edcd5932818eb9de8e72413e616e86e
runc version: a592beb5bc4c4092b1b1bac971afed27687340c5
init version: fec3683
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.15.0-112-generic
Operating System: Ubuntu 18.04.5 LTS
OSType: linux
Architecture: aarch64
CPUs: 192
Total Memory: 503.6GiB
Name: ubuntu
ID: MUTU:QOYU:2P6F:P2QB:4JKZ:QNKE:PPMQ:PQLL:3PDG:QEYU:LMDK:KNMF
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 docker.mirrors.ustc.edu.cn
 127.0.0.0/8
Registry Mirrors:
 https://dockerhub.azk8s.cn/
 https://docker.mirrors.ustc.edu.cn/
 http://hub-mirror.c.163.com/
Live Restore Enabled: false

WARNING: No swap limit support
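The line of interest in the output above is `Default Runtime: ascend`, which shows that the Ascend runtime is the Docker default on this node. This is a minimal sketch for picking that value out of the text; `default_runtime` is a hypothetical helper that reads `docker info` output on standard input.

```shell
#!/usr/bin/env bash
# default_runtime : read "docker info" text on stdin and print the value
# of the "Default Runtime" line. Hypothetical helper.
default_runtime() {
    sed -n 's/^ *Default Runtime: *//p' | head -n 1
}
```

On the node above, `docker info 2>/dev/null | default_runtime` would be expected to print `ascend`.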

    2. Check the pod and node information with kubectl

root@ubuntu:~# kubectl get pod --all-namespaces
NAMESPACE        NAME                                       READY   STATUS      RESTARTS   AGE
cadvisor         cadvisor-nsn4r                             1/1     Running     0          5m23s
default          hccl-controller-645bb466f-5fqq6            1/1     Running     0          5m34s
kube-system      ascend-device-plugin-daemonset-vxj8s       1/1     Running     0          5m23s
kube-system      calico-kube-controllers-8464785d6b-bnjdn   1/1     Running     0          5m50s
kube-system      calico-node-blshl                          1/1     Running     0          5m51s
kube-system      coredns-6955765f44-5jr59                   1/1     Running     0          5m50s
kube-system      coredns-6955765f44-wbzvz                   1/1     Running     0          5m50s
kube-system      etcd-ubuntu                                1/1     Running     0          5m43s
kube-system      kube-apiserver-ubuntu                      1/1     Running     0          5m43s
kube-system      kube-controller-manager-ubuntu             1/1     Running     0          5m43s
kube-system      kube-proxy-b78fm                           1/1     Running     0          5m51s
kube-system      kube-scheduler-ubuntu                      1/1     Running     0          5m43s
volcano-system   volcano-admission-74776688c8-g9p9q         1/1     Running     0          5m31s
volcano-system   volcano-admission-init-sbktn               0/1     Completed   0          5m31s
volcano-system   volcano-controllers-6786db54f-vn797        1/1     Running     0          5m31s
volcano-system   volcano-scheduler-844f9b547b-xxjm7         1/1     Running     0          5m31s
root@ubuntu:~# kubectl describe node ubuntu
Name:               ubuntu
Roles:              master,worker
Labels:             accelerator=huawei-Ascend910
                    beta.kubernetes.io/arch=arm64
                    beta.kubernetes.io/os=linux
                    host-arch=huawei-arm
                    kubernetes.io/arch=arm64
                    kubernetes.io/hostname=ubuntu
                    kubernetes.io/os=linux
                    masterselector=dls-master-node
                    node-role.kubernetes.io/master=
                    node-role.kubernetes.io/worker=worker
                    workerselector=dls-worker-node
Annotations:        huawei.com/Ascend910: Ascend910-1,Ascend910-2,Ascend910-3,Ascend910-0
                    kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    projectcalico.org/IPv4Address: 192.168.1.110/24
                    projectcalico.org/IPv4IPIPTunnelAddr: 10.30.243.192
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Thu, 05 Aug 2021 16:34:33 +0800
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  ubuntu
  AcquireTime:     <unset>
  RenewTime:       Thu, 05 Aug 2021 16:41:29 +0800
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Thu, 05 Aug 2021 16:35:06 +0800   Thu, 05 Aug 2021 16:35:06 +0800   CalicoIsUp                   Calico is running on this node
  MemoryPressure       False   Thu, 05 Aug 2021 16:40:30 +0800   Thu, 05 Aug 2021 16:34:27 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Thu, 05 Aug 2021 16:40:30 +0800   Thu, 05 Aug 2021 16:34:27 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Thu, 05 Aug 2021 16:40:30 +0800   Thu, 05 Aug 2021 16:34:27 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Thu, 05 Aug 2021 16:40:30 +0800   Thu, 05 Aug 2021 16:35:19 +0800   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:  192.168.1.110
  Hostname:    ubuntu
Capacity:
  cpu:                   192
  ephemeral-storage:     920422204Ki
  huawei.com/Ascend910:  4
  hugepages-2Mi:         0
  memory:                528101392Ki
  pods:                  110
Allocatable:
  cpu:                   192
  ephemeral-storage:     848261101802
  huawei.com/Ascend910:  4
  hugepages-2Mi:         0
  memory:                527998992Ki
  pods:                  110
System Info:
  Machine ID:                 3996e745414f461b9e0e990f6d0b597e
  System UUID:                CD56756C-607E-BD02-EB11-5292EAFB068C
  Boot ID:                    adb96127-7fdc-4d84-8867-a13005f9b535
  Kernel Version:             4.15.0-112-generic
  OS Image:                   Ubuntu 18.04.5 LTS
  Operating System:           linux
  Architecture:               arm64
  Container Runtime Version:  docker://18.6.3
  Kubelet Version:            v1.17.3
  Kube-Proxy Version:         v1.17.3
PodCIDR:                      10.30.0.0/24
PodCIDRs:                     10.30.0.0/24
Non-terminated Pods:          (15 in total)
  Namespace                   Name                                        CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                   ----                                        ------------  ----------  ---------------  -------------  ---
  cadvisor                    cadvisor-nsn4r                              500m (0%)     1 (0%)      300Mi (0%)       2000Mi (0%)    6m17s
  default                     hccl-controller-645bb466f-5fqq6             500m (0%)     500m (0%)   300Mi (0%)       300Mi (0%)     6m28s
  kube-system                 ascend-device-plugin-daemonset-vxj8s        500m (0%)     500m (0%)   500Mi (0%)       500Mi (0%)     6m17s
  kube-system                 calico-kube-controllers-8464785d6b-bnjdn    0 (0%)        0 (0%)      0 (0%)           0 (0%)         6m44s
  kube-system                 calico-node-blshl                           250m (0%)     0 (0%)      0 (0%)           0 (0%)         6m45s
  kube-system                 coredns-6955765f44-5jr59                    100m (0%)     0 (0%)      70Mi (0%)        170Mi (0%)     6m44s
  kube-system                 coredns-6955765f44-wbzvz                    100m (0%)     0 (0%)      70Mi (0%)        170Mi (0%)     6m44s
  kube-system                 etcd-ubuntu                                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         6m37s
  kube-system                 kube-apiserver-ubuntu                       250m (0%)     0 (0%)      0 (0%)           0 (0%)         6m37s
  kube-system                 kube-controller-manager-ubuntu              200m (0%)     0 (0%)      0 (0%)           0 (0%)         6m37s
  kube-system                 kube-proxy-b78fm                            0 (0%)        0 (0%)      0 (0%)           0 (0%)         6m45s
  kube-system                 kube-scheduler-ubuntu                       100m (0%)     0 (0%)      0 (0%)           0 (0%)         6m37s
  volcano-system              volcano-admission-74776688c8-g9p9q          500m (0%)     500m (0%)   300Mi (0%)       300Mi (0%)     6m25s
  volcano-system              volcano-controllers-6786db54f-vn797         500m (0%)     500m (0%)   300Mi (0%)       300Mi (0%)     6m25s
  volcano-system              volcano-scheduler-844f9b547b-xxjm7          500m (0%)     500m (0%)   300Mi (0%)       300Mi (0%)     6m25s
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource              Requests     Limits
  --------              --------     ------
  cpu                   4 (2%)       3500m (1%)
  memory                2140Mi (0%)  4040Mi (0%)
  ephemeral-storage     0 (0%)       0 (0%)
  huawei.com/Ascend910  0            0
Events:
  Type    Reason                   Age                    From                Message
  ----    ------                   ----                   ----                -------
  Normal  NodeHasSufficientMemory  7m10s (x8 over 7m11s)  kubelet, ubuntu     Node ubuntu status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    7m10s (x7 over 7m11s)  kubelet, ubuntu     Node ubuntu status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     7m10s (x6 over 7m11s)  kubelet, ubuntu     Node ubuntu status is now: NodeHasSufficientPID
  Normal  Starting                 6m37s                  kubelet, ubuntu     Starting kubelet.
  Normal  NodeHasSufficientMemory  6m37s                  kubelet, ubuntu     Node ubuntu status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    6m37s                  kubelet, ubuntu     Node ubuntu status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     6m37s                  kubelet, ubuntu     Node ubuntu status is now: NodeHasSufficientPID
  Normal  NodeAllocatableEnforced  6m37s                  kubelet, ubuntu     Updated Node Allocatable limit across pods
  Normal  Starting                 6m33s                  kube-proxy, ubuntu  Starting kube-proxy.
  Normal  NodeReady                6m17s                  kubelet, ubuntu     Node ubuntu status is now: NodeReady
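In the pod listing above every pod is either `Running` or, for the one-shot `volcano-admission-init` job, `Completed`. This is a minimal sketch for flagging anything else; `not_ready_pods` is a hypothetical helper that reads the `kubectl get pod --all-namespaces` table on standard input and relies on STATUS being the fourth column, as in the output above.

```shell
#!/usr/bin/env bash
# not_ready_pods : read "kubectl get pod --all-namespaces" output on stdin
# and print "namespace/name status" for every pod whose STATUS column is
# neither Running nor Completed. Hypothetical helper.
not_ready_pods() {
    # NR > 1 skips the header row; $4 is the STATUS column.
    awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}
```

A possible use is `kubectl get pod --all-namespaces | not_ready_pods`; empty output means no pod needs attention.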

Note: this output shows the CPU and accelerator card information:

Capacity:
  cpu:                   192
  ephemeral-storage:     920422204Ki
  huawei.com/Ascend910:  4
  hugepages-2Mi:         0
  memory:                528101392Ki
  pods:                  110
Allocatable:
  cpu:                   192
  ephemeral-storage:     848261101802
  huawei.com/Ascend910:  4
  hugepages-2Mi:         0
  memory:                527998992Ki
  pods:                  110
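The `huawei.com/Ascend910` value under `Capacity` and `Allocatable` is what confirms that the device plugin has registered the accelerator cards with Kubernetes. This is a minimal sketch for extracting it; `ascend_count` is a hypothetical helper that reads `kubectl describe node` output on standard input and assumes the indentation shown above.

```shell
#!/usr/bin/env bash
# ascend_count SECTION : read "kubectl describe node" output on stdin and
# print the huawei.com/Ascend910 value listed under SECTION
# (for example Capacity or Allocatable). Hypothetical helper.
ascend_count() {
    awk -v section="$1" '
        # An unindented line starts a new top-level section.
        /^[A-Za-z]/ { current = $1; sub(/:$/, "", current) }
        # Indented resource line inside the requested section.
        current == section && $1 == "huawei.com/Ascend910:" { print $2; exit }
    '
}
```

On the node above, `kubectl describe node ubuntu | ascend_count Allocatable` would be expected to print `4`.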

**For details, see the official Huawei documentation:**

https://support.huaweicloud.c...