[TOC]

0x00 Preface

Description: Earlier posts on my blog walked through building Kubernetes cluster environments, but as K8S and its related components iterate, the versions there may differ from what readers use today. So as of this point in time [2022-04-26 10:08:29], I am using ubuntu 20.04, haproxy, keepalived, containerd, etcd, kubeadm, kubectl and related tools (latest or stable versions) to practice building a highly available Kubernetes cluster. This article does not cover k8s fundamentals again; if you are new to it, please work through the posts at my [blog](https://blog.weiyigeek.top/tags/k8s/) or the [Bilibili column](https://www.bilibili.com/read/readlist/rl520875?spm_id_from=333.999.0.0) in order.

Overview
Kubernetes (hereafter k8s) is a container orchestration engine open-sourced by Google in June 2014 and written in Go. It supports automated deployment, large-scale scalability, and management of containerized applications across multiple hosts in a cloud platform. Its goal is to make deploying containerized applications simple and efficient, providing a complete feature set covering resource scheduling, deployment management, service discovery, scaling out and in, status monitoring, and maintenance, aiming to be the platform for automated deployment, scaling, and operation of application containers across host clusters. It works with a range of CNCF ecosystem projects, including containerd and Calico.


0x01 Environment Preparation

Host plan

| Host address | Hostname | Host spec | Role |
| --- | --- | --- | --- |
| 10.10.107.223 | master-223 | 4C/4G | control-plane node |
| 10.10.107.224 | master-224 | 4C/4G | control-plane node |
| 10.10.107.225 | master-225 | 4C/8G | control-plane node |
| 10.10.107.226 | node-1 | 4C/2G | worker node |
| 10.10.107.227 | node-2 | 4C/2G | worker node |
| 10.10.107.222 | weiyigeek.cluster.k8s | - | virtual VIP (virtual NIC address) |

Friendly note: The operating system used here is Ubuntu 20.04, already security-hardened and kernel-tuned to meet the MLPS 2.0 baseline [SecOpsDev/Ubuntu-InitializeSecurity.sh at master · WeiyiGeek/SecOpsDev (github.com)]. If your Linux has not had the equivalent configuration, your environment may differ slightly from mine. To harden Windows Server, Ubuntu or CentOS, refer to the hardening script below, and please star the repo generously.

Hardening script: https://github.com/WeiyiGeek/SecOpsDev/blob/master/OS-%E6%93%8D%E4%BD%9C%E7%B3%BB%E7%BB%9F/Linux/Ubuntu/Ubuntu-InitializeSecurity.sh


Software versions

Operating system

  • Ubuntu 20.04 LTS - 5.4.0-107-generic

TLS certificate issuance

  • cfssl - v1.6.1
  • cfssl-certinfo - v1.6.1
  • cfssljson - v1.6.1

High-availability software

  • ipvsadm - 1:1.31-1
  • haproxy - 2.0.13-2
  • keepalived - 1:2.0.19-2

ETCD database

  • etcd - v3.5.4

Container runtime

  • containerd.io - 1.6.4

Kubernetes

  • kubeadm - v1.23.6
  • kube-apiserver - v1.23.6
  • kube-controller-manager - v1.23.6
  • kubectl - v1.23.6
  • kubelet - v1.23.6
  • kube-proxy - v1.23.6
  • kube-scheduler - v1.23.6

Network plugin & auxiliary software

  • calico - v3.22
  • coredns - v1.9.1
  • kubernetes-dashboard - v2.5.1
  • k9s - v0.25.18


Network plan

| Subnet | CIDR | Note |
| --- | --- | --- |
| nodeSubnet | 10.10.107.0/24 | C1 |
| ServiceSubnet | 10.96.0.0/16 | C2 |
| PodSubnet | 10.128.0.0/16 | C3 |


Friendly note: I have packaged the software and plugins used in the environment above for easy download; use the link below (for the access password, follow the WeiyiGeek official account and reply with [k8s二进制]).

Download: http://share.weiyigeek.top/f/36158960-578443238-a1a5fa (access password: reply [k8s二进制] to the WeiyiGeek official account)

/kubernetes-cluster-binary-install# tree .
.
├── calico
│   └── calico-v3.22.yaml
├── certificate
│   ├── admin-csr.json
│   ├── apiserver-csr.json
│   ├── ca-config.json
│   ├── ca-csr.json
│   ├── cfssl
│   ├── cfssl-certinfo
│   ├── cfssljson
│   ├── controller-manager-csr.json
│   ├── etcd-csr.json
│   ├── kube-scheduler-csr.json
│   ├── proxy-csr.json
│   └── scheduler-csr.json
├── containerd.io
│   └── config.toml
├── coredns
│   ├── coredns.yaml
│   ├── coredns.yaml.sed
│   └── deploy.sh
├── cri-containerd-cni-1.6.4-linux-amd64.tar.gz
├── etcd-v3.5.4-linux-amd64.tar.gz
├── k9s
├── kubernetes-dashboard
│   ├── kubernetes-dashboard.yaml
│   └── rbac-dashboard-admin.yaml
├── kubernetes-server-linux-amd64.tar.gz
└── nginx.yaml

0x02 Installation and Deployment

1. Basic host environment preparation

Step 01. [All hosts] Set hostnames according to the host plan above.

# For example, run on the 10.10.107.223 host:
hostnamectl set-hostname master-223

# For example, run on the 10.10.107.227 host:
hostnamectl set-hostname node-2
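The per-host commands above can be driven from one IP-to-name list; a dry-run sketch (the `echo` prints each command instead of executing it, and the name/IP pairs are this plan's values):

```shell
# Dry run: print the hostnamectl command each host would run.
# Drop the echo to execute for real (each host sets only its own name).
while read -r ip name; do
  echo "on ${ip}: hostnamectl set-hostname ${name}"
done <<'EOF'
10.10.107.223 master-223
10.10.107.224 master-224
10.10.107.225 master-225
10.10.107.226 node-1
10.10.107.227 node-2
EOF
```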

Step 02. [All hosts] Hard-code the planned hostname-to-IP mappings in /etc/hosts.

sudo tee -a /etc/hosts <<'EOF'
10.10.107.223 master-223
10.10.107.224 master-224
10.10.107.225 master-225
10.10.107.226 node-1
10.10.107.227 node-2
10.10.107.222 weiyigeek.cluster.k8s
EOF

Step 03. Verify that the IP, MAC address and product_uuid are unique on every node, and that the nodes can reach each other.

# Use ip link or ifconfig -a to get the MAC addresses of the network interfaces
ifconfig -a
# Check the product_uuid
sudo cat /sys/class/dmi/id/product_uuid

Step 04. [All hosts] Synchronize the system time and set the timezone.

date -R
sudo ntpdate ntp.aliyun.com
sudo timedatectl set-timezone Asia/Shanghai
# or:
# sudo dpkg-reconfigure tzdata
sudo timedatectl set-local-rtc 0
timedatectl

Step 05. [All hosts] Disable the system swap partition.

swapoff -a && sed -i 's|^/swap.img|#/swap.img|g' /etc/fstab
# Verify that swap is disabled
free | grep "Swap:"
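The sed expression comments out a `/swap.img` entry specifically (adjust the pattern if your swap entry differs). A quick sanity check of that edit against a throwaway file:

```shell
# Exercise the fstab edit on a temp copy instead of the real /etc/fstab.
fstab=$(mktemp)
echo "/swap.img none swap sw 0 0" > "$fstab"
sed -i 's|^/swap.img|#/swap.img|g' "$fstab"
cat "$fstab"   # the swap line is now commented out
rm -f "$fstab"
```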

Step 06. [All hosts] Tune kernel parameters.

# Minimize swap usage
egrep -q "^(#)?vm.swappiness.*" /etc/sysctl.conf && sed -ri "s|^(#)?vm.swappiness.*|vm.swappiness = 0|g" /etc/sysctl.conf || echo "vm.swappiness = 0" >> /etc/sysctl.conf
# Allow IP forwarding
egrep -q "^(#)?net.ipv4.ip_forward.*" /etc/sysctl.conf && sed -ri "s|^(#)?net.ipv4.ip_forward.*|net.ipv4.ip_forward = 1|g" /etc/sysctl.conf || echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf

# - Let iptables see bridged traffic
egrep -q "^(#)?net.bridge.bridge-nf-call-iptables.*" /etc/sysctl.conf && sed -ri "s|^(#)?net.bridge.bridge-nf-call-iptables.*|net.bridge.bridge-nf-call-iptables = 1|g" /etc/sysctl.conf || echo "net.bridge.bridge-nf-call-iptables = 1" >> /etc/sysctl.conf
egrep -q "^(#)?net.bridge.bridge-nf-call-ip6tables.*" /etc/sysctl.conf && sed -ri "s|^(#)?net.bridge.bridge-nf-call-ip6tables.*|net.bridge.bridge-nf-call-ip6tables = 1|g" /etc/sysctl.conf || echo "net.bridge.bridge-nf-call-ip6tables = 1" >> /etc/sysctl.conf
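Each line above repeats the same "replace the key if present (even commented out), otherwise append it" idiom. Factored into a helper and exercised on a throwaway file (the `set_kv` function is my own sketch, not part of the original):

```shell
# set_kv KEY VALUE FILE: update a sysctl-style key, appending if absent.
set_kv() {
  key="$1"; val="$2"; file="$3"
  if egrep -q "^(#)?${key}.*" "$file"; then
    sed -ri "s|^(#)?${key}.*|${key} = ${val}|g" "$file"
  else
    echo "${key} = ${val}" >> "$file"
  fi
}

conf=$(mktemp)
echo "#vm.swappiness = 60" > "$conf"
set_kv vm.swappiness 0 "$conf"        # rewrites the commented-out line
set_kv net.ipv4.ip_forward 1 "$conf"  # appends a new line
cat "$conf"
rm -f "$conf"
```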

Step 07. [All hosts] Disable the system firewall.

ufw disable && systemctl disable ufw && systemctl stop ufw

Step 08. [master-225 host] Set up passwordless public-key login from master-225 to the other hosts (optional), to make uploading and downloading files between hosts easier.

# Generate an ed25519 key pair
ssh-keygen -t ed25519

# For example, on master-225, set up key-based login to master-223 (same for the other hosts)
ssh-copy-id -p 20211 weiyigeek@10.10.107.223
# /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_ed25519.pub"
# Are you sure you want to continue connecting (yes/no/[fingerprint])? yes # type yes
# weiyigeek@10.10.107.223s password: # enter the host password
# Number of key(s) added: 1
# Now try logging into the machine, with: "ssh -p '20211' 'weiyigeek@10.10.107.223'"
# and check to make sure that only the key(s) you wanted were added.
ssh-copy-id -p 20211 weiyigeek@10.10.107.224
ssh-copy-id -p 20211 weiyigeek@10.10.107.226
ssh-copy-id -p 20211 weiyigeek@10.10.107.227
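The four ssh-copy-id calls can be collapsed into one loop. Shown as a dry run (the `echo` only prints each command; the host list and port are this lab's values):

```shell
# Dry run of key distribution; remove the echo to execute for real.
for host in 10.10.107.223 10.10.107.224 10.10.107.226 10.10.107.227; do
  echo ssh-copy-id -p 20211 "weiyigeek@${host}"
done
```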


2. Load-balancer tooling installation and kernel module loading

Step 01. Install the ipvs module and the load-balancing dependencies.

# Check available versions
sudo apt-cache madison ipvsadm
# ipvsadm | 1:1.31-1 | http://mirrors.aliyun.com/ubuntu focal/main amd64 Packages

# Install
sudo apt -y install ipvsadm ipset sysstat conntrack

# Pin the version
apt-mark hold ipvsadm
# ipvsadm set on hold.


Step 02. Register the modules to be loaded into the kernel at boot (takes effect after a reboot).

tee /etc/modules-load.d/k8s.conf <<'EOF'
# netfilter
br_netfilter

# containerd
overlay

# nf_conntrack
nf_conntrack

# ipvs
ip_vs
ip_vs_lc
ip_vs_lblc
ip_vs_lblcr
ip_vs_rr
ip_vs_wrr
ip_vs_sh
ip_vs_dh
ip_vs_fo
ip_vs_nq
ip_vs_sed
ip_vs_ftp
ip_tables
ip_set
ipt_set
ipt_rpfilter
ipt_REJECT
ipip
xt_set
EOF


Step 03. Load the modules into the kernel by hand.

mkdir -vp /etc/modules.d/
tee /etc/modules.d/k8s.modules <<'EOF'
#!/bin/bash
# netfilter module: lets iptables see bridged traffic
modprobe -- br_netfilter
# containerd
modprobe -- overlay
# nf_conntrack
modprobe -- nf_conntrack
# ipvs
modprobe -- ip_vs
modprobe -- ip_vs_lc
modprobe -- ip_vs_lblc
modprobe -- ip_vs_lblcr
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- ip_vs_dh
modprobe -- ip_vs_fo
modprobe -- ip_vs_nq
modprobe -- ip_vs_sed
modprobe -- ip_vs_ftp
modprobe -- ip_tables
modprobe -- ip_set
modprobe -- ipt_set
modprobe -- ipt_rpfilter
modprobe -- ipt_REJECT
modprobe -- ipip
modprobe -- xt_set
EOF

chmod 755 /etc/modules.d/k8s.modules && bash /etc/modules.d/k8s.modules && lsmod | grep -e ip_vs -e nf_conntrack
# ip_vs_sh 16384 0
# ip_vs_wrr 16384 0
# ip_vs_rr 16384 0
# ip_vs 155648 6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
# nf_conntrack 139264 1 ip_vs
# nf_defrag_ipv6 24576 2 nf_conntrack,ip_vs
# nf_defrag_ipv4 16384 1 nf_conntrack
# libcrc32c 16384 5 nf_conntrack,btrfs,xfs,raid456,ip_vs

sysctl --system

Friendly note: kernels 4.19 and above use the nf_conntrack module; on kernels below 4.19 you need the nf_conntrack_ipv4 module instead.
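The version cut-over above can be scripted; a minimal sketch that prints which conntrack module the running kernel needs (it only parses `uname -r`, so unusual version strings may need extra handling):

```shell
# Print nf_conntrack for kernels >= 4.19, nf_conntrack_ipv4 otherwise.
kver=$(uname -r)
major=$(echo "$kver" | cut -d. -f1)
minor=$(echo "$kver" | cut -d. -f2)
if [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 19 ]; }; then
  echo nf_conntrack
else
  echo nf_conntrack_ipv4
fi
```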


3. High-availability HAProxy and Keepalived installation and configuration

Description: Since this is a test/learning environment, I did not prepare two dedicated HA servers; the master node machines double as HA nodes. In a production environment they should be separated out.

Step 01. [Master node machines] Install haproxy (HA proxying with health checks) and keepalived (VRRP-based master/backup failover).

# Check available versions
sudo apt-cache madison haproxy keepalived
# haproxy | 2.0.13-2ubuntu0.5 | http://mirrors.aliyun.com/ubuntu focal-security/main amd64 Packages
# keepalived | 1:2.0.19-2ubuntu0.2 | http://mirrors.aliyun.com/ubuntu focal-updates/main amd64 Packages

# Install
sudo apt -y install haproxy keepalived

# Pin the versions
apt-mark hold haproxy keepalived


Step 02. [Master node machines] Configure HAProxy; its configuration lives under /etc/haproxy/ and is identical on all nodes.

sudo cp /etc/haproxy/haproxy.cfg{,.bak}
tee /etc/haproxy/haproxy.cfg<<'EOF'
global
user haproxy
group haproxy
maxconn 2000
daemon
log /dev/log local0
log /dev/log local1 err
chroot /var/lib/haproxy
stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
stats timeout 30s
# Default SSL material locations
ca-base /etc/ssl/certs
crt-base /etc/ssl/private
# See: https://ssl-config.mozilla.org/#server=haproxy&server-version=2.0.3&config=intermediate
ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets

defaults
log global
mode http
option httplog
option dontlognull
timeout connect 5000
timeout client 50000
timeout server 50000
timeout http-request 15s
timeout http-keep-alive 15s
# errorfile 400 /etc/haproxy/errors/400.http
# errorfile 403 /etc/haproxy/errors/403.http
# errorfile 408 /etc/haproxy/errors/408.http
# errorfile 500 /etc/haproxy/errors/500.http
# errorfile 502 /etc/haproxy/errors/502.http
# errorfile 503 /etc/haproxy/errors/503.http
# errorfile 504 /etc/haproxy/errors/504.http

# Note: HAProxy monitoring endpoint (optional)
# frontend monitor-in
# bind *:33305
# mode http
# option httplog
# monitor-uri /monitor

# Note: layer-4 proxying; 16443 is the ApiServer control-plane port on the VIP. Since HAProxy shares the master nodes, it cannot use 6443 itself.
frontend k8s-master
bind 0.0.0.0:16443
bind 127.0.0.1:16443
mode tcp
option tcplog
tcp-request inspect-delay 5s
default_backend k8s-master

# Note: the default Apiserver port on the master nodes is 6443
backend k8s-master
mode tcp
option tcplog
option tcp-check
balance roundrobin
default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
server master-223 10.10.107.223:6443 check
server master-224 10.10.107.224:6443 check
server master-225 10.10.107.225:6443 check
EOF


Step 03. [Master node machines] Configure Keepalived; its configuration lives under /etc/keepalived/.

# Create the config directory; run on each master node.
mkdir -vp /etc/keepalived
# __ROLE__ role: MASTER or BACKUP
# __NETINTERFACE__ physical NIC name of the host, e.g. ens32 in my case
# __IP__ physical IP address of the host
# __VIP__ virtual VIP address
sudo tee /etc/keepalived/keepalived.conf <<'EOF'
! Configuration File for keepalived
global_defs {
router_id LVS_DEVEL
script_user root
enable_script_security
}
vrrp_script chk_apiserver {
script "/etc/keepalived/check_apiserver.sh"
interval 5
weight -5
fall 2
rise 1
}
vrrp_instance VI_1 {
state __ROLE__
interface __NETINTERFACE__
mcast_src_ip __IP__
virtual_router_id 51
priority 101
advert_int 2
authentication {
auth_type PASS
auth_pass K8SHA_KA_AUTH
}
virtual_ipaddress {
__VIP__
}
# HA health check
# track_script {
# chk_apiserver
# }
}
EOF

# master-225 has the best specs, so it is configured as MASTER (run on master-225)
# master-225 10.10.107.225 => MASTER
sed -i -e 's#__ROLE__#MASTER#g' \
-e 's#__NETINTERFACE__#ens32#g' \
-e 's#__IP__#10.10.107.225#g' \
-e 's#__VIP__#10.10.107.222#g' /etc/keepalived/keepalived.conf

# master-224 10.10.107.224 => BACKUP (run on master-224)
sed -i -e 's#__ROLE__#BACKUP#g' \
-e 's#__NETINTERFACE__#ens32#g' \
-e 's#__IP__#10.10.107.224#g' \
-e 's#__VIP__#10.10.107.222#g' /etc/keepalived/keepalived.conf

# master-223 10.10.107.223 => BACKUP (run on master-223)
sed -i -e 's#__ROLE__#BACKUP#g' \
-e 's#__NETINTERFACE__#ens32#g' \
-e 's#__IP__#10.10.107.223#g' \
-e 's#__VIP__#10.10.107.222#g' /etc/keepalived/keepalived.conf
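The same placeholder substitution can be sanity-checked against a throwaway copy before touching /etc/keepalived (values here are master-225's):

```shell
# Exercise the sed template substitution on a temp file.
tmpl=$(mktemp)
printf 'state __ROLE__\ninterface __NETINTERFACE__\nmcast_src_ip __IP__\n  __VIP__\n' > "$tmpl"
sed -i -e 's#__ROLE__#MASTER#g' \
  -e 's#__NETINTERFACE__#ens32#g' \
  -e 's#__IP__#10.10.107.225#g' \
  -e 's#__VIP__#10.10.107.222#g' "$tmpl"
cat "$tmpl"   # every placeholder should be gone
rm -f "$tmpl"
```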

Friendly note: the health check above is commented out; only enable it after the K8S cluster has been fully built:

track_script {
chk_apiserver
}


Step 04. [Master node machines] Create the Keepalived health-check script.

sudo tee /etc/keepalived/check_apiserver.sh <<'EOF'
#!/bin/bash
err=0
for k in $(seq 1 3)
do
check_code=$(pgrep haproxy)
if [[ $check_code == "" ]]; then
err=$(expr $err + 1)
sleep 1
continue
else
err=0
break
fi
done

if [[ $err != "0" ]]; then
echo "systemctl stop keepalived"
/usr/bin/systemctl stop keepalived
exit 1
else
exit 0
fi
EOF
sudo chmod +x /etc/keepalived/check_apiserver.sh
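The retry logic can be dry-run without stopping keepalived. A sketch with the same loop, parameterised on the process name (the `check_proc` helper is mine for illustration; the real script hard-codes haproxy and stops keepalived on failure):

```shell
# check_proc NAME: report whether NAME is running after up to 3 pgrep attempts.
check_proc() {
  name="$1"; err=0
  for k in 1 2 3; do
    if pgrep -x "$name" >/dev/null 2>&1; then err=0; break; fi
    err=$((err + 1))
  done
  if [ "$err" -ne 0 ]; then
    echo "$name not running"
  else
    echo "$name running"
  fi
}
check_proc nosuchprocxyz   # prints "nosuchprocxyz not running"
```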


Step 05. [Master node machines] Start the haproxy and keepalived services and test VIP failover.

# Reload systemd, enable haproxy and keepalived at boot, and start them now
sudo systemctl daemon-reload
sudo systemctl enable --now haproxy && sudo systemctl enable --now keepalived
# Synchronizing state of haproxy.service with SysV service script with /lib/systemd/systemd-sysv-install.
# Executing: /lib/systemd/systemd-sysv-install enable haproxy
# Synchronizing state of keepalived.service with SysV service script with /lib/systemd/systemd-sysv-install.
# Executing: /lib/systemd/systemd-sysv-install enable keepalived

# On the master-223 host, the VIP is found on this machine.
root@master-223:~$ ip addr
# 2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
# link/ether 00:0c:29:00:0f:8f brd ff:ff:ff:ff:ff:ff
# inet 10.10.107.223/24 brd 10.10.107.255 scope global ens32
# valid_lft forever preferred_lft forever
# inet 10.10.107.222/32 scope global ens32
# valid_lft forever preferred_lft forever
# Verify connectivity from the other two master hosts.
root@master-224:~$ ping 10.10.107.222
root@master-225:~$ ping 10.10.107.222

# Manually verify VIP failover by stopping keepalived on this server.
root@master-223:~$ pgrep haproxy
# 6320
# 6321
root@master-223:~$ /usr/bin/systemctl stop keepalived

# The VIP has now floated to the master-225 host
root@master-225:~$ ip addr show ens32
# 2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
# link/ether 00:0c:29:93:28:61 brd ff:ff:ff:ff:ff:ff
# inet 10.10.107.225/24 brd 10.10.107.255 scope global ens32
# valid_lft forever preferred_lft forever
# inet 10.10.107.222/32 scope global ens32
# valid_lft forever preferred_lft forever

At this point the HAProxy and Keepalived configuration is done; next we turn to the ETCD cluster setup and certificate issuance.


4. Deploying the etcd cluster and issuing etcd certificates

Description: We create a highly available ETCD cluster; the operations here are performed on the [master-225] machine.

Step 01. [master-225] Create a directory to hold configuration and related files, then download the cfssl tools for CA certificate creation and signing (for background on cfssl see: https://blog.weiyigeek.top/2019/10-21-12.html#3-CFSSL-生成 ).

# Create the working directory
mkdir -vp /app/k8s-init-work && cd /app/k8s-init-work

# Latest cfssl downloads: https://github.com/cloudflare/cfssl/releases
# Fetch the cfssl tools (if the download is slow, download locally with an accelerator and upload to the server)
curl -L https://github.com/cloudflare/cfssl/releases/download/v1.6.1/cfssl_1.6.1_linux_amd64 -o /usr/local/bin/cfssl
curl -L https://github.com/cloudflare/cfssl/releases/download/v1.6.1/cfssljson_1.6.1_linux_amd64 -o /usr/local/bin/cfssljson
curl -L https://github.com/cloudflare/cfssl/releases/download/v1.6.1/cfssl-certinfo_1.6.1_linux_amd64 -o /usr/local/bin/cfssl-certinfo

# Make them executable
chmod +x /usr/local/bin/cfssl*

/app# cfssl version
# Version: 1.2.0
# Revision: dev
# Runtime: go1.6

Friendly note:

  • cfssl : the CFSSL command-line tool
  • cfssljson : takes the JSON output from cfssl and writes certificates, keys, certificate signing requests (CSRs) and bundles to files


Step 02. Use the cfssl tools above to create the CA certificate.

# - CA certificate signing request (CSR) configuration file
cfssl print-defaults csr > ca-csr.json
tee ca-csr.json <<'EOF'
{
"CN": "kubernetes",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "ChongQing",
"ST": "ChongQing",
"O": "k8s",
"OU": "System"
}
],
"ca": {
"expiry": "87600h"
}
}
EOF

# Key fields:
CN: Common Name. Browsers use this field to validate whether a site is legitimate; usually the domain name. Very important.
key: the algorithm used to generate the certificate
hosts: which hostnames (domains) or IPs may use certificates issued from this CSR; empty or "" means any (this example has no `"hosts": [""]` field)
names: common attributes
* C: Country
* ST: State or province
* L: Locality name, i.e. city
* O: Organization name, i.e. company (in k8s, commonly used to set the Group for RBAC bindings)
* OU: Organization Unit name, i.e. department

# - CA signing policy configuration file
cfssl print-defaults config > ca-config.json
tee ca-config.json <<'EOF'
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"kubernetes": {
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
},
"etcd": {
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
}
}
}
}
EOF

# Key fields:
default: the default policy; sets the default certificate lifetime to 10 years (87600h)
profile: custom policy configurations
* kubernetes: this profile is used to issue certificates for kubernetes and the related verification
* signing: the certificate may sign other certificates; the generated ca.pem has CA=TRUE
* server auth: this CA can verify certificates presented by servers
* client auth: this CA can verify certificates presented by clients
* expiry: lifetime; falls back to the default if unset

# - Run cfssl gencert to generate the CA certificate
# Use the CSR configuration ca-csr.json to generate the CA certificate, CA private key, and CSR:
cfssl gencert -initca ca-csr.json | cfssljson -bare ca
# 2022/04/27 16:49:37 [INFO] generating a new CA key and certificate from CSR
# 2022/04/27 16:49:37 [INFO] generate received request
# 2022/04/27 16:49:37 [INFO] received CSR
# 2022/04/27 16:49:37 [INFO] generating key: rsa-2048
# 2022/04/27 16:49:37 [INFO] encoded CSR
# 2022/04/27 16:49:37 [INFO] signed certificate with serial number 245643466964695827922023924375276493244980966303
$ ls
# ca-config.json ca.csr ca-csr.json ca-key.pem ca.pem
$ openssl x509 -in ca.pem -text -noout | grep "Not"
# Not Before: Apr 27 08:45:00 2022 GMT
# Not After : Apr 24 08:45:00 2032 GMT

Friendly note: setting expiry to 87600h means the certificate is valid for ten years.
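The 87600h-to-ten-years conversion is quick arithmetic (ignoring leap days):

```shell
# 87600 hours / 24 hours per day / 365 days per year = 10 years
echo $((87600 / 24 / 365))
```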


Step 03. Create the ETCD certificate request file and generate its certificate.

# etcd certificate request file
tee etcd-csr.json <<'EOF'
{
"CN": "etcd",
"hosts": [
"127.0.0.1",
"10.10.107.223",
"10.10.107.224",
"10.10.107.225",
"etcd1",
"etcd2",
"etcd3"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "ChongQing",
"ST": "ChongQing",
"O": "etcd",
"OU": "System"
}
]
}
EOF

# Issue the etcd certificate signed by the CA
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=etcd etcd-csr.json | cfssljson -bare etcd

$ ls etcd*
etcd.csr etcd-csr.json etcd-key.pem etcd.pem
$ openssl x509 -in etcd.pem -text -noout | grep "X509v3 Subject Alternative Name" -A 1
# X509v3 Subject Alternative Name:
# DNS:etcd1, DNS:etcd2, DNS:etcd3, IP Address:127.0.0.1, IP Address:10.10.107.223, IP Address:10.10.107.224, IP Address:10.10.107.225


Step 04. [All master nodes] Download and deploy the ETCD cluster. First download the etcd package; the latest release can be found on the GitHub releases page (https://github.com/etcd-io/etcd/releases/).

# Download
wget -L https://github.com/etcd-io/etcd/releases/download/v3.5.4/etcd-v3.5.4-linux-amd64.tar.gz
tar -zxvf etcd-v3.5.4-linux-amd64.tar.gz
cp -a etcd* /usr/local/bin/

# Version
etcd --version
# etcd Version: 3.5.4
# Git SHA: 08407ff76
# Go Version: go1.16.15
# Go OS/Arch: linux/amd64

# Copy the tarball to the other master hosts
scp -P 20211 ./etcd-v3.5.4-linux-amd64.tar.gz weiyigeek@master-223:~
scp -P 20211 ./etcd-v3.5.4-linux-amd64.tar.gz weiyigeek@master-224:~

# On master-223 and master-224 respectively: extract to /usr/local/ and likewise copy the binaries to /usr/local/bin/
tar -zxvf /home/weiyigeek/etcd-v3.5.4-linux-amd64.tar.gz -C /usr/local/
cp -a /usr/local/etcd-v3.5.4-linux-amd64/etcd* /usr/local/bin/

Friendly note: etcd official site ( https://etcd.io/ )


Step 05. Create the configuration files needed by the etcd cluster.

# Prepare the certificates
mkdir -vp /etc/etcd/pki/
cp *.pem /etc/etcd/pki/
ls /etc/etcd/pki/
# ca-key.pem ca.pem etcd-key.pem etcd.pem

# Upload to the ~ home directory; they still need to be copied into /etc/etcd/pki/ on each host
scp -P 20211 *.pem weiyigeek@master-224:~
scp -P 20211 *.pem weiyigeek@master-223:~
# ****************** [ Security Login ] *****************
# Authorized only. All activity will be monitored and reported.By Security Center.
# ca-key.pem 100% 1675 3.5MB/s 00:00
# ca.pem 100% 1375 5.2MB/s 00:00
# etcd-key.pem 100% 1679 7.0MB/s 00:00
# etcd.pem 100% 1399 5.8MB/s 00:00

# Run on master-225
tee /etc/etcd/etcd.conf <<'EOF'
# [Member configuration]
# member name
ETCD_NAME=etcd1
# data directory (must be created beforehand)
ETCD_DATA_DIR="/var/lib/etcd/data"
# listen address for clients (etcdctl or curl)
ETCD_LISTEN_CLIENT_URLS="https://10.10.107.225:2379,https://127.0.0.1:2379"
# listen address for connections from the other cluster members
ETCD_LISTEN_PEER_URLS="https://10.10.107.225:2380"

# [Certificate configuration]
# ETCD_CERT_FILE=/etc/etcd/pki/etcd.pem
# ETCD_KEY_FILE=/etc/etcd/pki/etcd-key.pem
# ETCD_TRUSTED_CA_FILE=/etc/kubernetes/pki/ca.pem
# ETCD_CLIENT_CERT_AUTH=true
# ETCD_PEER_CLIENT_CERT_AUTH=true
# ETCD_PEER_CERT_FILE=/etc/etcd/pki/etcd.pem
# ETCD_PEER_KEY_FILE=/etc/etcd/pki/etcd-key.pem
# ETCD_PEER_TRUSTED_CA_FILE=/etc/kubernetes/pki/ca.pem

# [Cluster configuration]
# address advertised to clients; clients use these IPs to reach the cluster
ETCD_ADVERTISE_CLIENT_URLS="https://10.10.107.225:2379"
# address advertised for member-to-member communication
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.10.107.225:2380"
# all cluster members; this member uses it to contact the others
ETCD_INITIAL_CLUSTER="etcd1=https://10.10.107.225:2380,etcd2=https://10.10.107.224:2380,etcd3=https://10.10.107.223:2380"
# cluster state: new when bootstrapping a cluster, existing when joining one
ETCD_INITIAL_CLUSTER_STATE=new
EOF


# Run on master-224
tee /etc/etcd/etcd.conf <<'EOF'
# [Member configuration]
# member name
ETCD_NAME=etcd2
# data directory (must be created beforehand)
ETCD_DATA_DIR="/var/lib/etcd/data"
# listen address for clients (etcdctl or curl)
ETCD_LISTEN_CLIENT_URLS="https://10.10.107.224:2379,https://127.0.0.1:2379"
# listen address for connections from the other cluster members
ETCD_LISTEN_PEER_URLS="https://10.10.107.224:2380"

# [Cluster configuration]
# address advertised to clients; clients use these IPs to reach the cluster
ETCD_ADVERTISE_CLIENT_URLS="https://10.10.107.224:2379"
# address advertised for member-to-member communication
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.10.107.224:2380"
# all cluster members; this member uses it to contact the others
ETCD_INITIAL_CLUSTER="etcd1=https://10.10.107.225:2380,etcd2=https://10.10.107.224:2380,etcd3=https://10.10.107.223:2380"
# cluster state: new when bootstrapping a cluster, existing when joining one
ETCD_INITIAL_CLUSTER_STATE=new
EOF


# Run on master-223
tee /etc/etcd/etcd.conf <<'EOF'
# [Member configuration]
# member name
ETCD_NAME=etcd3
# data directory (must be created beforehand)
ETCD_DATA_DIR="/var/lib/etcd/data"
# listen address for clients (etcdctl or curl)
ETCD_LISTEN_CLIENT_URLS="https://10.10.107.223:2379,https://127.0.0.1:2379"
# listen address for connections from the other cluster members
ETCD_LISTEN_PEER_URLS="https://10.10.107.223:2380"

# [Cluster configuration]
# address advertised to clients; clients use these IPs to reach the cluster
ETCD_ADVERTISE_CLIENT_URLS="https://10.10.107.223:2379"
# address advertised for member-to-member communication
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.10.107.223:2380"
# all cluster members; this member uses it to contact the others
ETCD_INITIAL_CLUSTER="etcd1=https://10.10.107.225:2380,etcd2=https://10.10.107.224:2380,etcd3=https://10.10.107.223:2380"
# cluster state: new when bootstrapping a cluster, existing when joining one
ETCD_INITIAL_CLUSTER_STATE=new
EOF
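The three files differ only in the member name and the local IP. A hedged sketch of rendering them from one function (`render_etcd_conf` is my own helper, not part of the original; the commented-out certificate block is omitted for brevity):

```shell
# render_etcd_conf NAME IP: emit the etcd.conf for one member on stdout.
render_etcd_conf() {
  name="$1"; ip="$2"
  cat <<EOF
ETCD_NAME=${name}
ETCD_DATA_DIR="/var/lib/etcd/data"
ETCD_LISTEN_CLIENT_URLS="https://${ip}:2379,https://127.0.0.1:2379"
ETCD_LISTEN_PEER_URLS="https://${ip}:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://${ip}:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://${ip}:2380"
ETCD_INITIAL_CLUSTER="etcd1=https://10.10.107.225:2380,etcd2=https://10.10.107.224:2380,etcd3=https://10.10.107.223:2380"
ETCD_INITIAL_CLUSTER_STATE=new
EOF
}

# e.g. on master-225:
render_etcd_conf etcd1 10.10.107.225
```

On each host, pipe the matching invocation into `sudo tee /etc/etcd/etcd.conf`.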


Step 06. [All master nodes] Create the systemd unit for etcd and start the service.

mkdir -vp /var/lib/etcd/
cat > /usr/lib/systemd/system/etcd.service <<EOF
[Unit]
Description=Etcd Server
Documentation=https://github.com/etcd-io/etcd
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
ExecStart=/usr/local/bin/etcd \
--client-cert-auth \
--trusted-ca-file /etc/etcd/pki/ca.pem \
--cert-file /etc/etcd/pki/etcd.pem \
--key-file /etc/etcd/pki/etcd-key.pem \
--peer-client-cert-auth \
--peer-trusted-ca-file /etc/etcd/pki/ca.pem \
--peer-cert-file /etc/etcd/pki/etcd.pem \
--peer-key-file /etc/etcd/pki/etcd-key.pem
Restart=on-failure
RestartSec=5
LimitNOFILE=65535
LimitNPROC=65535

[Install]
WantedBy=multi-user.target
EOF

# Reload systemd, enable at boot, and start the etcd service now
systemctl daemon-reload && systemctl enable --now etcd.service


Step 07. [All master nodes] Verify on each master node that the etcd cluster service is running and healthy.

# Check the service
systemctl status etcd.service

# Use etcdctl to list the cluster members
export ETCDCTL_API=3
etcdctl --endpoints=https://10.10.107.225:2379,https://10.10.107.224:2379,https://10.10.107.223:2379 \
--cacert="/etc/etcd/pki/ca.pem" --cert="/etc/etcd/pki/etcd.pem" --key="/etc/etcd/pki/etcd-key.pem" \
--write-out=table member list
# +------------------+---------+-------+----------------------------+----------------------------+------------+
# | ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
# +------------------+---------+-------+----------------------------+----------------------------+------------+
# | 144934d02ad45ec7 | started | etcd3 | https://10.10.107.223:2380 | https://10.10.107.223:2379 | false |
# | 2480d95a2df867a4 | started | etcd2 | https://10.10.107.224:2380 | https://10.10.107.224:2379 | false |
# | 2e8fddd3366a3d88 | started | etcd1 | https://10.10.107.225:2380 | https://10.10.107.225:2379 | false |
# +------------------+---------+-------+----------------------------+----------------------------+------------+

# Cluster node status
etcdctl --endpoints=https://10.10.107.225:2379,https://10.10.107.224:2379,https://10.10.107.223:2379 \
--cacert="/etc/etcd/pki/ca.pem" --cert="/etc/etcd/pki/etcd.pem" --key="/etc/etcd/pki/etcd-key.pem" \
--write-out=table endpoint status
# +----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
# | ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
# +----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
# | https://10.10.107.225:2379 | 2e8fddd3366a3d88 | 3.5.4 | 20 kB | false | false | 3 | 12 | 12 | |
# | https://10.10.107.224:2379 | 2480d95a2df867a4 | 3.5.4 | 20 kB | true | false | 3 | 12 | 12 | |
# | https://10.10.107.223:2379 | 144934d02ad45ec7 | 3.5.4 | 20 kB | false | false | 3 | 12 | 12 | |
# +----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

# Cluster node health
etcdctl --endpoints=https://10.10.107.225:2379,https://10.10.107.224:2379,https://10.10.107.223:2379 \
--cacert="/etc/etcd/pki/ca.pem" --cert="/etc/etcd/pki/etcd.pem" --key="/etc/etcd/pki/etcd-key.pem" \
--write-out=table endpoint health
# +----------------------------+--------+-------------+-------+
# | ENDPOINT | HEALTH | TOOK | ERROR |
# +----------------------------+--------+-------------+-------+
# | https://10.10.107.225:2379 | true | 9.151813ms | |
# | https://10.10.107.224:2379 | true | 10.965914ms | |
# | https://10.10.107.223:2379 | true | 11.165228ms | |
# +----------------------------+--------+-------------+-------+

# Cluster performance check
etcdctl --endpoints=https://10.10.107.225:2379,https://10.10.107.224:2379,https://10.10.107.223:2379 \
--cacert="/etc/etcd/pki/ca.pem" --cert="/etc/etcd/pki/etcd.pem" --key="/etc/etcd/pki/etcd-key.pem" \
--write-out=table endpoint check perf
# 59 / 60 Boom ! 98.33% PASS: Throughput is 148 writes/s
# Slowest request took too long: 1.344053s
# Stddev too high: 0.143059s
# FAIL


5. Containerd runtime installation and deployment

Step 01. [All nodes] Install the binary distribution of the containerd.io runtime on each host. Kubernetes connects to the containerd service through the CRI plugin to control the container lifecycle.

# Download the latest cri-containerd-cni release from GitHub
wget -L https://github.com/containerd/containerd/releases/download/v1.6.4/cri-containerd-cni-1.6.4-linux-amd64.tar.gz

# Extract into a cri-containerd-cni directory
mkdir -vp cri-containerd-cni
tar -zxvf cri-containerd-cni-1.6.4-linux-amd64.tar.gz -C cri-containerd-cni


Step 02. Inspect the extracted files and configuration paths.

$ tree ./cri-containerd-cni/
.
├── etc
│   ├── cni
│   │   └── net.d
│   │   └── 10-containerd-net.conflist
│   ├── crictl.yaml
│   └── systemd
│   └── system
│   └── containerd.service
├── opt
│   ├── cni
│   │   └── bin
│   │   ├── bandwidth
│   │   ├── bridge
│   │   ├── dhcp
│   │   ├── firewall
│   │   ├── host-device
│   │   ├── host-local
│   │   ├── ipvlan
│   │   ├── loopback
│   │   ├── macvlan
│   │   ├── portmap
│   │   ├── ptp
│   │   ├── sbr
│   │   ├── static
│   │   ├── tuning
│   │   ├── vlan
│   │   └── vrf
│   └── containerd
│   └── cluster
│   ├── gce
│   │   ├── cloud-init
│   │   │   ├── master.yaml
│   │   │   └── node.yaml
│   │   ├── cni.template
│   │   ├── configure.sh
│   │   └── env
│   └── version
└── usr
└── local
├── bin
│   ├── containerd
│   ├── containerd-shim
│   ├── containerd-shim-runc-v1
│   ├── containerd-shim-runc-v2
│   ├── containerd-stress
│   ├── crictl
│   ├── critest
│   ├── ctd-decoder
│   └── ctr
└── sbin
└── runc

# Then, on every node, copy the folders above into the corresponding system directories
cd ./cri-containerd-cni/
cp -r etc/ /
cp -r opt/ /
cp -r usr/ /


Step 03. [All nodes] Create and adjust the containerd config.toml.

mkdir -vp /etc/containerd

# Generate the default configuration
containerd config default >/etc/containerd/config.toml
ls /etc/containerd/config.toml
# /etc/containerd/config.toml

# pause image source
sed -i "s#k8s.gcr.io/pause#registry.cn-hangzhou.aliyuncs.com/google_containers/pause#g" /etc/containerd/config.toml

# Use SystemdCgroup
sed -i 's#SystemdCgroup = false#SystemdCgroup = true#g' /etc/containerd/config.toml

# docker.io mirror
sed -i '/registry.mirrors]/a\ \ \ \ \ \ \ \ [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]' /etc/containerd/config.toml
sed -i '/registry.mirrors."docker.io"]/a\ \ \ \ \ \ \ \ \ \ endpoint = ["https://xlx9erfu.mirror.aliyuncs.com","https://docker.mirrors.ustc.edu.cn"]' /etc/containerd/config.toml

# gcr.io mirror
sed -i '/registry.mirrors]/a\ \ \ \ \ \ \ \ [plugins."io.containerd.grpc.v1.cri".registry.mirrors."gcr.io"]' /etc/containerd/config.toml
sed -i '/registry.mirrors."gcr.io"]/a\ \ \ \ \ \ \ \ \ \ endpoint = ["https://gcr.mirrors.ustc.edu.cn"]' /etc/containerd/config.toml

# k8s.gcr.io mirror
sed -i '/registry.mirrors]/a\ \ \ \ \ \ \ \ [plugins."io.containerd.grpc.v1.cri".registry.mirrors."k8s.gcr.io"]' /etc/containerd/config.toml
sed -i '/registry.mirrors."k8s.gcr.io"]/a\ \ \ \ \ \ \ \ \ \ endpoint = ["https://gcr.mirrors.ustc.edu.cn/google-containers/","https://registry.cn-hangzhou.aliyuncs.com/google_containers/"]' /etc/containerd/config.toml

# quay.io mirror
sed -i '/registry.mirrors]/a\ \ \ \ \ \ \ \ [plugins."io.containerd.grpc.v1.cri".registry.mirrors."quay.io"]' /etc/containerd/config.toml
sed -i '/registry.mirrors."quay.io"]/a\ \ \ \ \ \ \ \ \ \ endpoint = ["https://quay.mirrors.ustc.edu.cn"]' /etc/containerd/config.toml


Step 04. Configure the runtime and image endpoints for the crictl client tool:

# Set temporarily by hand:
# crictl config runtime-endpoint /run/containerd/containerd.sock
# /run/containerd/containerd.sock

# Or set permanently via the config file:
cat <<EOF > /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF


Step 05. Reload systemd, then enable at boot and start the containerd.io service.

systemctl daemon-reload && systemctl enable --now containerd.service
systemctl status containerd.service
ctr version
# Client:
# Version: 1.5.11
# Revision: 3df54a852345ae127d1fa3092b95168e4a88e2f8
# Go version: go1.17.8

# Server:
# Version: 1.5.11
# Revision: 3df54a852345ae127d1fa3092b95168e4a88e2f8
# UUID: 71a28bbb-6ed6-408d-a873-e394d48b35d8


Step 06. Check the version of runc, the CLI tool that creates and runs containers according to the OCI spec.

runc -v
# runc version 1.1.1
# commit: v1.1.1-0-g52de29d7
# spec: 1.0.2-dev
# go: go1.17.9
# libseccomp: 2.5.1

Tip: if the bundled runc fails with `runc: symbol lookup error: runc: undefined symbol: seccomp_notify_respond`, the runc shipped in the package has too many system dependencies; download and install the runc binary separately from the project releases (https://github.com/opencontainers/runc/).

wget https://github.com/opencontainers/runc/releases/download/v1.1.1/runc.amd64

# Grant execute permission
chmod +x runc.amd64

# Replace the runc shipped by the package under /usr/local/sbin/
mv runc.amd64 /usr/local/sbin/runc


6. Kubernetes Cluster Installation and Deployment

1) Download and install the binary packages (manually, as practiced here)

Step 01. [master-225] Manually download the Kubernetes package from k8s.io and unpack/install it; the latest version at this time is v1.23.6.

# If the download is slow, use a download accelerator (e.g. Thunder)
wget https://dl.k8s.io/v1.23.6/kubernetes-server-linux-amd64.tar.gz

# Unpack
/app/k8s-init-work/tools# tar -zxf kubernetes-server-linux-amd64.tar.gz

# Install the binaries into /usr/local/bin
/app/k8s-init-work/tools/kubernetes/server/bin# ls
# apiextensions-apiserver kube-apiserver.docker_tag kube-controller-manager.tar kube-log-runner kube-scheduler
# kubeadm kube-apiserver.tar kubectl kube-proxy kube-scheduler.docker_tag
# kube-aggregator kube-controller-manager kubectl-convert kube-proxy.docker_tag kube-scheduler.tar
# kube-apiserver kube-controller-manager.docker_tag kubelet kube-proxy.tar mounter

cp kube-apiserver kube-controller-manager kube-scheduler kube-proxy kubelet kubeadm kubectl /usr/local/bin

# Verify the installation
kubectl version
# Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.6", GitCommit:"ad3338546da947756e8a88aa6822e9c11e7eac22", GitTreeState:"clean", BuildDate:"2022-04-14T08:49:13Z", GoVersion:"go1.17.9", Compiler:"gc", Platform:"linux/amd64"}


Step 02. [master-225] Distribute the Kubernetes binaries to the other machines with scp.

scp -P 20211 kube-apiserver kube-controller-manager kube-scheduler kube-proxy kubelet kubeadm kubectl weiyigeek@master-223:/tmp
scp -P 20211 kube-apiserver kube-controller-manager kube-scheduler kube-proxy kubelet kubeadm kubectl weiyigeek@master-224:/tmp
scp -P 20211 kube-apiserver kube-controller-manager kube-scheduler kube-proxy kubelet kubeadm kubectl weiyigeek@node-1:/tmp
scp -P 20211 kube-apiserver kube-controller-manager kube-scheduler kube-proxy kubelet kubeadm kubectl weiyigeek@node-2:/tmp

# Copy into the target bin directory (node machines only need kube-proxy and kubelet; kubeadm is optional here)
cp kube-apiserver kube-controller-manager kube-scheduler kube-proxy kubelet kubeadm kubectl /usr/local/bin/


Step 03. [all nodes] Create the following directories on every node in the cluster

mkdir -vp /etc/kubernetes/{manifests,pki,ssl,cfg} /var/log/kubernetes /var/lib/kubelet


2) Deploy and configure kube-apiserver

Description: kube-apiserver is the entry point for all requests to the cluster; every resource is operated on through its API.

Step 01. [master-225] Create the apiserver certificate signing request file and sign the certificate with the CA generated in the previous chapter.

# Create the CSR file. The hosts field must list all master/LB/VIP IPs; add a few
# spare IPs now to make future scale-out easier.
# It must also include the first IP of the service network (the first address of the
# kube-apiserver --service-cluster-ip-range, e.g. 10.96.0.1).
tee apiserver-csr.json <<'EOF'
{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "10.10.107.223",
    "10.10.107.224",
    "10.10.107.225",
    "10.10.107.222",
    "10.96.0.1",
    "weiyigeek.cluster.k8s",
    "master-223",
    "master-224",
    "master-225",
    "kubernetes",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "ChongQing",
      "ST": "ChongQing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF

# Sign the kube-apiserver HTTPS certificate with the self-signed CA
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes apiserver-csr.json | cfssljson -bare apiserver

$ ls apiserver*
apiserver.csr apiserver-csr.json apiserver-key.pem apiserver.pem

# Copy into the target directory
cp *.pem /etc/kubernetes/ssl/
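As a quick sanity check for the hosts list above, the first service IP can be derived from the service CIDR in shell. This is a minimal sketch, assuming the network address ends in `.0` (as `10.96.0.0/16` does); it only bumps the last octet:

```shell
# Derive the first Service IP (the kubernetes.default ClusterIP) from the
# --service-cluster-ip-range network address, e.g. 10.96.0.0/16 -> 10.96.0.1.
SERVICE_CIDR="10.96.0.0/16"
NET="${SERVICE_CIDR%/*}"                    # strip the prefix length -> 10.96.0.0
FIRST_IP="${NET%.*}.$(( ${NET##*.} + 1 ))"  # increment the last octet -> 10.96.0.1
echo "first service IP: ${FIRST_IP}"
```

If this IP is missing from the certificate's hosts list, in-cluster clients talking to `kubernetes.default` will get TLS verification errors.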


Step 02. [master-225] Create the token required by the TLS bootstrap mechanism

cat > /etc/kubernetes/bootstrap-token.csv << EOF
$(head -c 16 /dev/urandom | od -An -t x | tr -d ' '),kubelet-bootstrap,10001,"system:bootstrappers"
EOF

Tip: with TLS bootstrapping enabled, every node's kubelet and kube-proxy must present a valid CA-signed certificate to talk to kube-apiserver. Issuing those client certificates by hand does not scale as the node count grows and complicates cluster expansion.
To simplify this, Kubernetes introduced TLS bootstrapping to issue client certificates automatically: kubelet registers with the apiserver as a low-privilege user, and the apiserver signs its certificate dynamically. This approach is strongly recommended on nodes; it currently applies to kubelet, while kube-proxy still uses a certificate we issue centrally.
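The token written above is a plain 16-byte hex string; kubeadm-style bootstrap tokens use an `id.secret` form instead. A minimal sketch of generating and format-checking such a token (variable names are illustrative, not part of the setup above):

```shell
# Generate a kubeadm-style bootstrap token: 6-char id, 16-char secret, both [a-z0-9].
TOKEN_ID=$(head -c 512 /dev/urandom | tr -dc 'a-z0-9' | head -c 6)
TOKEN_SECRET=$(head -c 512 /dev/urandom | tr -dc 'a-z0-9' | head -c 16)
BOOTSTRAP_TOKEN="${TOKEN_ID}.${TOKEN_SECRET}"

# Verify the token matches the expected format before using it anywhere.
echo "${BOOTSTRAP_TOKEN}" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$' && echo "token format ok"
```

Either form works as a static token in the CSV; the `id.secret` form is only required when the token is also stored as a bootstrap-token Secret in kube-system.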


Step 03. [master-225] Create the kube-apiserver configuration file

cat > /etc/kubernetes/cfg/kube-apiserver.conf <<'EOF'
KUBE_APISERVER_OPTS="--apiserver-count=3 \
--advertise-address=10.10.107.225 \
--allow-privileged=true \
--authorization-mode=RBAC,Node \
--bind-address=0.0.0.0 \
--enable-aggregator-routing=true \
--enable-admission-plugins=NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \
--enable-bootstrap-token-auth=true \
--token-auth-file=/etc/kubernetes/bootstrap-token.csv \
--secure-port=6443 \
--service-node-port-range=30000-32767 \
--service-cluster-ip-range=10.96.0.0/16 \
--client-ca-file=/etc/kubernetes/ssl/ca.pem \
--tls-cert-file=/etc/kubernetes/ssl/apiserver.pem \
--tls-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem \
--kubelet-client-certificate=/etc/kubernetes/ssl/apiserver.pem \
--kubelet-client-key=/etc/kubernetes/ssl/apiserver-key.pem \
--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname \
--etcd-cafile=/etc/kubernetes/ssl/ca.pem \
--etcd-certfile=/etc/kubernetes/ssl/etcd.pem \
--etcd-keyfile=/etc/kubernetes/ssl/etcd-key.pem \
--etcd-servers=https://10.10.107.225:2379,https://10.10.107.224:2379,https://10.10.107.223:2379 \
--service-account-issuer=https://kubernetes.default.svc.cluster.local \
--service-account-key-file=/etc/kubernetes/ssl/ca-key.pem \
--service-account-signing-key-file=/etc/kubernetes/ssl/ca-key.pem \
--proxy-client-cert-file=/etc/kubernetes/ssl/apiserver.pem \
--proxy-client-key-file=/etc/kubernetes/ssl/apiserver-key.pem \
--requestheader-allowed-names=kubernetes \
--requestheader-extra-headers-prefix=X-Remote-Extra- \
--requestheader-group-headers=X-Remote-Group \
--requestheader-username-headers=X-Remote-User \
--requestheader-client-ca-file=/etc/kubernetes/ssl/ca.pem \
--v=2 \
--event-ttl=1h \
--feature-gates=TTLAfterFinished=true \
--logtostderr=false \
--log-dir=/var/log/kubernetes"
EOF

# Audit logging (optional)
# --audit-log-maxage=30
# --audit-log-maxbackup=3
# --audit-log-maxsize=100
# --audit-log-path=/var/log/kubernetes/kube-apiserver.log

# --logtostderr: log to stderr
# --v: log verbosity
# --log-dir: log directory
# --etcd-servers: etcd cluster endpoints
# --bind-address: listen address
# --secure-port: HTTPS port
# --advertise-address: address advertised to the cluster
# --allow-privileged: allow privileged containers
# --service-cluster-ip-range: Service virtual IP range
# --enable-admission-plugins: admission control plugins
# --authorization-mode: authorization; enables RBAC and Node self-management
# --enable-bootstrap-token-auth: enable the TLS bootstrap mechanism
# --token-auth-file: bootstrap token file
# --service-node-port-range: default port range for NodePort Services
# --kubelet-client-xxx: client certificate the apiserver uses to reach kubelet
# --tls-xxx-file: apiserver HTTPS certificate
# --etcd-xxxfile: certificates for connecting to the etcd cluster
# --audit-log-xxx: audit logging

# Tip: do not use the following flags on 1.23.* and later:
# Flag --enable-swagger-ui has been deprecated
# Flag --insecure-port has been deprecated
# Flag --alsologtostderr has been deprecated
# Flag --logtostderr has been deprecated, will be removed in a future release
# Flag --log-dir has been deprecated, will be removed in a future release
# Flag --feature-gates TTLAfterFinished=true is deprecated and will be removed in a future release (still usable for now)


Step 04. [master-225] Create the kube-apiserver systemd service unit

cat > /lib/systemd/system/kube-apiserver.service << "EOF"
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
After=etcd.service
Wants=etcd.service

[Service]
EnvironmentFile=-/etc/kubernetes/cfg/kube-apiserver.conf
ExecStart=/usr/local/bin/kube-apiserver $KUBE_APISERVER_OPTS
Restart=on-failure
RestartSec=5
Type=notify
LimitNOFILE=65536
LimitNPROC=65535

[Install]
WantedBy=multi-user.target
EOF


Step 05. Sync the files generated above to the other master nodes.

# /etc/kubernetes/ directory layout on master-225
tree /etc/kubernetes/
/etc/kubernetes/
├── bootstrap-token.csv
├── cfg
│   └── kube-apiserver.conf
├── manifests
├── pki
└── ssl
    ├── apiserver-key.pem
    ├── apiserver.pem
    ├── ca-key.pem
    ├── ca.pem
    ├── etcd-key.pem
    └── etcd.pem

# Certificates and the kube-apiserver.conf configuration file
scp -P 20211 -r /etc/kubernetes/ weiyigeek@master-223:/tmp
scp -P 20211 -r /etc/kubernetes/ weiyigeek@master-224:/tmp

# kube-apiserver service unit file
scp -P 20211 /lib/systemd/system/kube-apiserver.service weiyigeek@master-223:/tmp
scp -P 20211 /lib/systemd/system/kube-apiserver.service weiyigeek@master-224:/tmp

# [master-223] [master-224] Move the files uploaded to /tmp into place
cd /tmp/ && cp -r /tmp/kubernetes/ /etc/
mv kube-apiserver.service /lib/systemd/system/kube-apiserver.service
mv kube-apiserver kubectl kube-proxy kube-scheduler kubeadm kube-controller-manager kubelet /usr/local/bin


Step 06. On master-223 and master-224, change --advertise-address=10.10.107.225 in /etc/kubernetes/cfg/kube-apiserver.conf to the node's own address

# 【master-223】
sed -i 's#--advertise-address=10.10.107.225#--advertise-address=10.10.107.223#g' /etc/kubernetes/cfg/kube-apiserver.conf
# 【master-224】
sed -i 's#--advertise-address=10.10.107.225#--advertise-address=10.10.107.224#g' /etc/kubernetes/cfg/kube-apiserver.conf

WeiyiGeek.kube-apiserver.conf


Step 07. [master nodes] After the above, start the apiserver service on all three master nodes.

# Reload systemd and enable the service at boot
systemctl daemon-reload
systemctl enable --now kube-apiserver
systemctl status kube-apiserver

# Test the api-server
curl --insecure https://10.10.107.222:16443/
curl --insecure https://10.10.107.223:6443/
curl --insecure https://10.10.107.224:6443/
curl --insecure https://10.10.107.225:6443/

# Expected response
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {},
"status": "Failure",
"message": "forbidden: User \"system:anonymous\" cannot get path \"/\"",
"reason": "Forbidden",
"details": {},
"code": 403
}

WeiyiGeek. apiserver service status on the three master nodes

Done!


3) Deploy and configure kubectl

Description: kubectl is the cluster management client; it talks to the API server to view and manage resources.

Step 01. [master-225] Create the kubectl CSR file and generate the certificate

tee admin-csr.json <<'EOF'
{
  "CN": "admin",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "ChongQing",
      "ST": "ChongQing",
      "O": "system:masters",
      "OU": "System"
    }
  ]
}
EOF

# Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin

# Copy the certificate files into the target directory
ls admin*
# admin.csr admin-csr.json admin-key.pem admin.pem
cp admin*.pem /etc/kubernetes/ssl/


Step 02. [master-225] Generate the kubeconfig file admin.conf for kubectl; it contains everything needed to reach the apiserver: its address, the CA certificate, and the client's own certificate.

cd /etc/kubernetes 

# Configure the cluster entry
# The domain form also works here (https://weiyigeek.cluster.k8s:16443)
kubectl config set-cluster kubernetes --certificate-authority=/etc/kubernetes/ssl/ca.pem --embed-certs=true --server=https://10.10.107.222:16443 --kubeconfig=admin.conf
# Cluster "kubernetes" set.

# Configure the cluster user credentials
kubectl config set-credentials admin --client-certificate=/etc/kubernetes/ssl/admin.pem --client-key=/etc/kubernetes/ssl/admin-key.pem --embed-certs=true --kubeconfig=admin.conf
# User "admin" set.

# Configure the context
kubectl config set-context kubernetes --cluster=kubernetes --user=admin --kubeconfig=admin.conf
# Context "kubernetes" created.

# Use the context
kubectl config use-context kubernetes --kubeconfig=admin.conf
# Switched to context "kubernetes".


Step 03. [master-225] Put the kubectl config in place and create the role binding.

mkdir /root/.kube && cp /etc/kubernetes/admin.conf ~/.kube/config
kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user kubernetes --kubeconfig=/root/.kube/config
# clusterrolebinding.rbac.authorization.k8s.io/kube-apiserver:kubelet-apis created

# Contents of the config
$ cat /root/.kube/config
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: base64(CA certificate)
    server: https://10.10.107.222:16443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: admin
  name: kubernetes
current-context: kubernetes
kind: Config
preferences: {}
users:
- name: admin
  user:
    client-certificate-data: base64(client certificate)
    client-key-data: base64(client key)
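The `*-data` fields come from `--embed-certs=true`, which simply inlines the PEM files as base64. A small self-contained sketch of that round trip (the scratch path and file content are illustrative, not real certificates):

```shell
# --embed-certs=true stores base64(PEM) in certificate-authority-data etc.
printf 'FAKE-PEM-CONTENT\n' > /tmp/demo-ca.pem
CA_DATA=$(base64 -w0 /tmp/demo-ca.pem)   # what ends up in the kubeconfig field
echo "certificate-authority-data: ${CA_DATA}"

# Decoding recovers the original file byte for byte.
echo "${CA_DATA}" | base64 -d | diff -q - /tmp/demo-ca.pem && echo "round-trip ok"
```

This is why a kubeconfig with embedded certs is portable across machines: no separate PEM files need to be copied along with it.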


Step 04. [master-225] Check cluster status

export KUBECONFIG=$HOME/.kube/config

# View cluster info
kubectl cluster-info
# Kubernetes control plane is running at https://10.10.107.222:16443
# To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

# View component status
kubectl get componentstatuses
# NAME STATUS MESSAGE ERROR
# controller-manager Unhealthy Get "https://127.0.0.1:10257/healthz": dial tcp 127.0.0.1:10257: connect: connection refused
# scheduler Unhealthy Get "https://127.0.0.1:10259/healthz": dial tcp 127.0.0.1:10259: connect: connection refused
# etcd-0 Healthy {"health":"true","reason":""}
# etcd-1 Healthy {"health":"true","reason":""}
# etcd-2 Healthy {"health":"true","reason":""}

# List namespaces and all resources across namespaces
kubectl get all --all-namespaces
# NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
# default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 8h
kubectl get ns
# NAME STATUS AGE
# default Active 8h
# kube-node-lease Active 8h
# kube-public Active 8h
# kube-system Active 8h

Tip: controller-manager and scheduler report Unhealthy because they have not been deployed yet.


Step 05. [master-225] Sync the kubectl config to the other master nodes

ssh -p 20211 weiyigeek@master-223 'mkdir ~/.kube/'
ssh -p 20211 weiyigeek@master-224 'mkdir ~/.kube/'

scp -P 20211 $HOME/.kube/config weiyigeek@master-223:~/.kube/
scp -P 20211 $HOME/.kube/config weiyigeek@master-224:~/.kube/

# 【master-223】
weiyigeek@master-223:~$ kubectl get services
# NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
# kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 8h


Step 06. Configure kubectl command completion (newcomers may prefer to wait until they are familiar with the commands)

# Install the completion helper
apt install -y bash-completion
source /usr/share/bash-completion/bash_completion

# Option 1
source <(kubectl completion bash)

# Option 2
kubectl completion bash > ~/.kube/completion.bash.inc
source ~/.kube/completion.bash.inc

# Load automatically on login
tee $HOME/.bash_profile <<'EOF'
# source <(kubectl completion bash)
source ~/.kube/completion.bash.inc
EOF

This completes the kubectl client setup.


4) Deploy and configure kube-controller-manager

Description: kube-controller-manager is the cluster's controller component; it bundles multiple controllers and serves as the automation control center for objects.

Step 01. [master-225] Create the kube-controller-manager CSR and generate the certificate

tee controller-manager-csr.json <<'EOF'
{
  "CN": "system:kube-controller-manager",
  "hosts": [
    "127.0.0.1",
    "10.10.107.223",
    "10.10.107.224",
    "10.10.107.225"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "ChongQing",
      "ST": "ChongQing",
      "O": "system:kube-controller-manager",
      "OU": "System"
    }
  ]
}
EOF

# Notes:
# * hosts lists the IPs of all kube-controller-manager nodes;
# * CN is system:kube-controller-manager;
# * O is system:kube-controller-manager; the built-in ClusterRoleBinding system:kube-controller-manager grants it the permissions it needs.

# Sign the kube-controller-manager certificate with the CA
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes controller-manager-csr.json | cfssljson -bare controller-manager

$ ls controller*
controller-manager.csr controller-manager-csr.json controller-manager-key.pem controller-manager.pem

# Copy the certificates
cp controller* /etc/kubernetes/ssl


Step 02. Create the controller-manager.conf kubeconfig for kube-controller-manager.

cd /etc/kubernetes/

# Set the cluster; the domain form (https://weiyigeek.cluster.k8s:16443) also works.
kubectl config set-cluster kubernetes --certificate-authority=/etc/kubernetes/ssl/ca.pem --embed-certs=true --server=https://10.10.107.222:16443 --kubeconfig=controller-manager.conf

# Set the credentials
kubectl config set-credentials system:kube-controller-manager --client-certificate=/etc/kubernetes/ssl/controller-manager.pem --client-key=/etc/kubernetes/ssl/controller-manager-key.pem --embed-certs=true --kubeconfig=controller-manager.conf

# Set the context
kubectl config set-context system:kube-controller-manager --cluster=kubernetes --user=system:kube-controller-manager --kubeconfig=controller-manager.conf

# Switch context
kubectl config use-context system:kube-controller-manager --kubeconfig=controller-manager.conf


Step 03. Create the kube-controller-manager configuration file.

cat > /etc/kubernetes/cfg/kube-controller-manager.conf << "EOF"
KUBE_CONTROLLER_MANAGER_OPTS="--allocate-node-cidrs=true \
--bind-address=127.0.0.1 \
--secure-port=10257 \
--authentication-kubeconfig=/etc/kubernetes/controller-manager.conf \
--authorization-kubeconfig=/etc/kubernetes/controller-manager.conf \
--cluster-name=kubernetes \
--client-ca-file=/etc/kubernetes/ssl/ca.pem \
--cluster-signing-cert-file=/etc/kubernetes/ssl/ca.pem \
--cluster-signing-key-file=/etc/kubernetes/ssl/ca-key.pem \
--controllers=*,bootstrapsigner,tokencleaner \
--cluster-cidr=10.128.0.0/16 \
--service-cluster-ip-range=10.96.0.0/16 \
--use-service-account-credentials=true \
--root-ca-file=/etc/kubernetes/ssl/ca.pem \
--service-account-private-key-file=/etc/kubernetes/ssl/ca-key.pem \
--tls-cert-file=/etc/kubernetes/ssl/controller-manager.pem \
--tls-private-key-file=/etc/kubernetes/ssl/controller-manager-key.pem \
--leader-elect=true \
--cluster-signing-duration=87600h \
--v=2 \
--logtostderr=false \
--log-dir=/var/log/kubernetes \
--kubeconfig=/etc/kubernetes/controller-manager.conf"
EOF

# Tips:
# Flag --logtostderr has been deprecated, will be removed in a future release
# Flag --log-dir has been deprecated, will be removed in a future release
# --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt


Step 04. Create the kube-controller-manager service unit

cat > /lib/systemd/system/kube-controller-manager.service << "EOF"
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=-/etc/kubernetes/cfg/kube-controller-manager.conf
ExecStart=/usr/local/bin/kube-controller-manager $KUBE_CONTROLLER_MANAGER_OPTS
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF


Step 05. [master-225] Distribute the files above to the other master nodes as before.

# controller-manager certificates, the controller-manager.conf kubeconfig, the kube-controller-manager.conf options file, and the kube-controller-manager.service unit
scp -P 20211 /etc/kubernetes/ssl/controller-manager.pem /etc/kubernetes/ssl/controller-manager-key.pem /etc/kubernetes/controller-manager.conf /etc/kubernetes/cfg/kube-controller-manager.conf /lib/systemd/system/kube-controller-manager.service weiyigeek@master-223:/tmp
scp -P 20211 /etc/kubernetes/ssl/controller-manager.pem /etc/kubernetes/ssl/controller-manager-key.pem /etc/kubernetes/controller-manager.conf /etc/kubernetes/cfg/kube-controller-manager.conf /lib/systemd/system/kube-controller-manager.service weiyigeek@master-224:/tmp

# [master-223] [master-224] Move the files uploaded to /tmp into place
mv controller-manager*.pem /etc/kubernetes/ssl/
mv controller-manager.conf /etc/kubernetes/controller-manager.conf
mv kube-controller-manager.conf /etc/kubernetes/cfg/kube-controller-manager.conf
mv kube-controller-manager.service /lib/systemd/system/kube-controller-manager.service


Step 06. Reload systemd and enable the kube-controller-manager service.

systemctl daemon-reload 
systemctl enable --now kube-controller-manager
systemctl status kube-controller-manager


Step 07. After starting kube-controller-manager, check component status again: controller-manager has changed from Unhealthy to Healthy.

kubectl get componentstatuses
# NAME STATUS MESSAGE ERROR
# scheduler Unhealthy Get "https://127.0.0.1:10259/healthz": dial tcp 127.0.0.1:10259: connect: connection refused
# controller-manager Healthy ok
# etcd-2 Healthy {"health":"true","reason":""}
# etcd-0 Healthy {"health":"true","reason":""}
# etcd-1 Healthy {"health":"true","reason":""}

This completes installation and configuration of the kube-controller-manager service!


5) Deploy and configure kube-scheduler

Description: kube-scheduler is the cluster's scheduler; it selects a suitable node for each workload it assigns.

Step 01. [master-225] Create the kube-scheduler CSR and generate the certificate

tee scheduler-csr.json <<'EOF'
{
  "CN": "system:kube-scheduler",
  "hosts": [
    "127.0.0.1",
    "10.10.107.223",
    "10.10.107.224",
    "10.10.107.225"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "ChongQing",
      "ST": "ChongQing",
      "O": "system:kube-scheduler",
      "OU": "System"
    }
  ]
}
EOF

# Sign the kube-scheduler certificate with the CA
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes scheduler-csr.json | cfssljson -bare scheduler

$ ls scheduler*
scheduler-csr.json scheduler.csr scheduler-key.pem scheduler.pem

# Copy the certificates
cp scheduler*.pem /etc/kubernetes/ssl


Step 02. Create the kubeconfig file for kube-scheduler.

cd /etc/kubernetes/

# Set the cluster; the domain form (https://weiyigeek.cluster.k8s:16443) also works.
kubectl config set-cluster kubernetes --certificate-authority=/etc/kubernetes/ssl/ca.pem --embed-certs=true --server=https://10.10.107.222:16443 --kubeconfig=scheduler.conf

# Set the credentials
kubectl config set-credentials system:kube-scheduler --client-certificate=/etc/kubernetes/ssl/scheduler.pem --client-key=/etc/kubernetes/ssl/scheduler-key.pem --embed-certs=true --kubeconfig=scheduler.conf

# Set the context
kubectl config set-context system:kube-scheduler --cluster=kubernetes --user=system:kube-scheduler --kubeconfig=scheduler.conf

# Switch context
kubectl config use-context system:kube-scheduler --kubeconfig=scheduler.conf


Step 03. Create the kube-scheduler options file

cat > /etc/kubernetes/cfg/kube-scheduler.conf << "EOF"
KUBE_SCHEDULER_OPTS="--address=127.0.0.1 \
--secure-port=10259 \
--kubeconfig=/etc/kubernetes/scheduler.conf \
--authentication-kubeconfig=/etc/kubernetes/scheduler.conf \
--authorization-kubeconfig=/etc/kubernetes/scheduler.conf \
--client-ca-file=/etc/kubernetes/ssl/ca.pem \
--tls-cert-file=/etc/kubernetes/ssl/scheduler.pem \
--tls-private-key-file=/etc/kubernetes/ssl/scheduler-key.pem \
--leader-elect=true \
--alsologtostderr=true \
--logtostderr=false \
--log-dir=/var/log/kubernetes \
--v=2"
EOF


Step 04. Create the kube-scheduler service unit

cat > /lib/systemd/system/kube-scheduler.service << "EOF"
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=-/etc/kubernetes/cfg/kube-scheduler.conf
ExecStart=/usr/local/bin/kube-scheduler $KUBE_SCHEDULER_OPTS
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF


Step 05. [master-225] Distribute the files above to the other master nodes as before.

# scheduler certificates, the scheduler.conf kubeconfig, the kube-scheduler.conf options file, and the kube-scheduler.service unit
scp -P 20211 /etc/kubernetes/ssl/scheduler.pem /etc/kubernetes/ssl/scheduler-key.pem /etc/kubernetes/scheduler.conf /etc/kubernetes/cfg/kube-scheduler.conf /lib/systemd/system/kube-scheduler.service weiyigeek@master-223:/tmp
scp -P 20211 /etc/kubernetes/ssl/scheduler.pem /etc/kubernetes/ssl/scheduler-key.pem /etc/kubernetes/scheduler.conf /etc/kubernetes/cfg/kube-scheduler.conf /lib/systemd/system/kube-scheduler.service weiyigeek@master-224:/tmp

# [master-223] [master-224] Move the files uploaded to /tmp into place
mv scheduler*.pem /etc/kubernetes/ssl/
mv scheduler.conf /etc/kubernetes/scheduler.conf
mv kube-scheduler.conf /etc/kubernetes/cfg/kube-scheduler.conf
mv kube-scheduler.service /lib/systemd/system/kube-scheduler.service


Step 06. [all master nodes] Reload systemd and enable the kube-scheduler service.

systemctl daemon-reload 
systemctl enable --now kube-scheduler
systemctl status kube-scheduler


Step 07. [all master nodes] Verify component status on every master; everything should look as below, and any errors must be resolved before moving on.

kubectl get componentstatuses
# Warning: v1 ComponentStatus is deprecated in v1.19+
# NAME STATUS MESSAGE ERROR
# controller-manager Healthy ok
# scheduler Healthy ok
# etcd-0 Healthy {"health":"true","reason":""}
# etcd-1 Healthy {"health":"true","reason":""}
# etcd-2 Healthy {"health":"true","reason":""}

WeiyiGeek. Component status on all master nodes


6) Deploy and configure kubelet

Step 01. [master-225] Read BOOTSTRAP_TOKEN and create the kubelet bootstrap kubeconfig file kubelet-bootstrap.kubeconfig.

cd /etc/kubernetes/

# Read the bootstrap-token value
BOOTSTRAP_TOKEN=$(awk -F "," '{print $1}' /etc/kubernetes/bootstrap-token.csv)

# BOOTSTRAP_TOKEN="123456.httpweiyigeektop"

# Set the cluster; the domain form (https://weiyigeek.cluster.k8s:16443) also works.
kubectl config set-cluster kubernetes --certificate-authority=/etc/kubernetes/ssl/ca.pem --embed-certs=true --server=https://weiyigeek.cluster.k8s:16443 --kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig

# Set the credentials
kubectl config set-credentials kubelet-bootstrap --token=${BOOTSTRAP_TOKEN} --kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig

# Set the context
kubectl config set-context default --cluster=kubernetes --user=kubelet-bootstrap --kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig

# Switch context
kubectl config use-context default --kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig
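The awk above just takes the first comma-separated field of the token CSV. A self-contained check against a sample line (the path and token value here are illustrative, not the real cluster token):

```shell
# The bootstrap CSV format is: token,user,uid,"groups" — field 1 is the token.
echo 'c47ffb939f5ca36231d9e3121a252940,kubelet-bootstrap,10001,"system:bootstrappers"' > /tmp/demo-token.csv
DEMO_TOKEN=$(awk -F "," '{print $1}' /tmp/demo-token.csv)
echo "token: ${DEMO_TOKEN}"
```

If the token passed to `--token=` does not exactly match the first field of the CSV loaded by the apiserver's `--token-auth-file`, kubelet bootstrap will fail with an authentication error.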

Step 02. Then create the cluster role bindings.

# Role authorization
kubectl create clusterrolebinding cluster-system-anonymous --clusterrole=cluster-admin --user=kubelet-bootstrap
kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap --kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig

# Authorize kubelet to create CSRs
kubectl create clusterrolebinding create-csrs-for-bootstrapping --clusterrole=system:node-bootstrapper --group=system:bootstrappers

# CSR approval
# Allow kubelet to request and receive new certificates
kubectl create clusterrolebinding auto-approve-csrs-for-group --clusterrole=system:certificates.k8s.io:certificatesigningrequests:nodeclient --group=system:bootstrappers

# Allow kubelet to renew its client certificate
kubectl create clusterrolebinding auto-approve-renewals-for-nodes --clusterrole=system:certificates.k8s.io:certificatesigningrequests:selfnodeclient --group=system:bootstrappers

Step 03. Create the kubelet configuration file; reference: https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/

# YAML format
# Recommended here for portability
cat > /etc/kubernetes/cfg/kubelet-config.yaml <<'EOF'
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
address: 0.0.0.0
port: 10250
readOnlyPort: 10255
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/ssl/ca.pem
authorization:
  mode: AlwaysAllow
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
cgroupDriver: systemd
cgroupsPerQOS: true
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
containerLogMaxFiles: 5
containerLogMaxSize: 10Mi
contentType: application/vnd.kubernetes.protobuf
cpuCFSQuota: true
cpuManagerPolicy: none
cpuManagerReconcilePeriod: 10s
enableControllerAttachDetach: true
enableDebuggingHandlers: true
enforceNodeAllocatable:
- pods
eventBurst: 10
eventRecordQPS: 5
evictionHard:
  imagefs.available: 15%
  memory.available: 100Mi
  nodefs.available: 10%
  nodefs.inodesFree: 5%
evictionPressureTransitionPeriod: 5m0s
failSwapOn: true
fileCheckFrequency: 20s
hairpinMode: promiscuous-bridge
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 20s
imageGCHighThresholdPercent: 85
imageGCLowThresholdPercent: 80
imageMinimumGCAge: 2m0s
iptablesDropBit: 15
iptablesMasqueradeBit: 14
kubeAPIBurst: 10
kubeAPIQPS: 5
makeIPTablesUtilChains: true
maxOpenFiles: 1000000
maxPods: 110
nodeStatusUpdateFrequency: 10s
oomScoreAdj: -999
podPidsLimit: -1
registryBurst: 10
registryPullQPS: 5
resolvConf: /etc/resolv.conf
rotateCertificates: true
runtimeRequestTimeout: 2m0s
serializeImagePulls: true
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 4h0m0s
syncFrequency: 1m0s
volumeStatsAggPeriod: 1m0s
EOF
# With address: 0.0.0.0 (listen on all interfaces, as above) no per-node edit is needed.
# If you set address: __IP__ instead, substitute the host's own address on each node:
# sed -i 's#__IP__#10.10.107.225#g' /etc/kubernetes/cfg/kubelet-config.yaml

# JSON format (not recommended; for reference only)
cat > /etc/kubernetes/cfg/kubelet.json << "EOF"
{
  "kind": "KubeletConfiguration",
  "apiVersion": "kubelet.config.k8s.io/v1beta1",
  "authentication": {
    "x509": {
      "clientCAFile": "/etc/kubernetes/ssl/ca.pem"
    },
    "webhook": {
      "enabled": true,
      "cacheTTL": "2m0s"
    },
    "anonymous": {
      "enabled": false
    }
  },
  "authorization": {
    "mode": "Webhook",
    "webhook": {
      "cacheAuthorizedTTL": "5m0s",
      "cacheUnauthorizedTTL": "30s"
    }
  },
  "address": "__IP__",
  "port": 10250,
  "readOnlyPort": 10255,
  "cgroupDriver": "systemd",
  "hairpinMode": "promiscuous-bridge",
  "serializeImagePulls": false,
  "clusterDomain": "cluster.local.",
  "clusterDNS": ["10.96.0.10"]
}
EOF

Tip: the address in kubelet.json must be changed to the host's own IP, e.g. 10.10.107.225 on master-225. Here 0.0.0.0 is used to listen on all interfaces; specify an IP if you need to bind a particular one.


Step 04. [master-225] Create the kubelet service unit

# Create the working directory
mkdir -vp /var/lib/kubelet

# Create kubelet.service
cat > /lib/systemd/system/kubelet.service << "EOF"
[Unit]
Description=kubelet: The Kubernetes Node Agent
Documentation=https://kubernetes.io/docs/home/
Wants=network-online.target
After=network-online.target

[Service]
WorkingDirectory=/var/lib/kubelet
ExecStart=/usr/local/bin/kubelet \
--bootstrap-kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig \
--config=/etc/kubernetes/cfg/kubelet-config.yaml \
--cert-dir=/etc/kubernetes/ssl \
--kubeconfig=/etc/kubernetes/kubelet.conf \
--network-plugin=cni \
--root-dir=/etc/cni/net.d \
--cni-bin-dir=/opt/cni/bin \
--cni-conf-dir=/etc/cni/net.d \
--container-runtime=remote \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \
--rotate-certificates \
--pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6 \
--image-pull-progress-deadline=15m \
--alsologtostderr=true \
--logtostderr=false \
--log-dir=/var/log/kubernetes \
--v=2 \
--node-labels=node.kubernetes.io/node=''
StartLimitInterval=0
RestartSec=10
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF


Step 05. Copy the kubelet files from master-225 to all nodes.

# Copy the config files to /tmp on the other nodes
scp -P 20211 /etc/kubernetes/kubelet-bootstrap.kubeconfig /etc/kubernetes/cfg/kubelet-config.yaml /lib/systemd/system/kubelet.service weiyigeek@master-223:/tmp
scp -P 20211 /etc/kubernetes/kubelet-bootstrap.kubeconfig /etc/kubernetes/cfg/kubelet-config.yaml /lib/systemd/system/kubelet.service weiyigeek@master-224:/tmp
scp -P 20211 /etc/kubernetes/kubelet-bootstrap.kubeconfig /etc/kubernetes/cfg/kubelet-config.yaml /lib/systemd/system/kubelet.service weiyigeek@node-1:/tmp
scp -P 20211 /etc/kubernetes/kubelet-bootstrap.kubeconfig /etc/kubernetes/cfg/kubelet-config.yaml /lib/systemd/system/kubelet.service weiyigeek@node-2:/tmp

# On each node, move the kubelet files from /tmp into their target locations
sudo mkdir -vp /var/lib/kubelet /etc/kubernetes/cfg/
cp /tmp/kubelet-bootstrap.kubeconfig /etc/kubernetes/kubelet-bootstrap.kubeconfig
cp /tmp/kubelet-config.yaml /etc/kubernetes/cfg/kubelet-config.yaml
cp /tmp/kubelet.service /lib/systemd/system/kubelet.service


Step 06. [all nodes] Reload systemd and enable the kubelet service.

systemctl daemon-reload
systemctl enable --now kubelet.service
systemctl status -l kubelet.service
# systemctl restart kubelet.service


Step 07. List all cluster nodes with kubectl.

$ kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master-223 NotReady <none> 17h v1.23.6 10.10.107.223 <none> Ubuntu 20.04.3 LTS 5.4.0-92-generic containerd://1.6.4
master-224 NotReady <none> 19s v1.23.6 10.10.107.224 <none> Ubuntu 20.04.3 LTS 5.4.0-92-generic containerd://1.6.4
master-225 NotReady <none> 3d22h v1.23.6 10.10.107.225 <none> Ubuntu 20.04.3 LTS 5.4.0-109-generic containerd://1.6.4
node-1 NotReady <none> 17h v1.23.6 10.10.107.226 <none> Ubuntu 20.04.2 LTS 5.4.0-96-generic containerd://1.6.4
node-2 NotReady <none> 17h v1.23.6 10.10.107.227 <none> Ubuntu 20.04.2 LTS 5.4.0-96-generic containerd://1.6.4

Tip: In the output above each node's STATUS is NotReady, because the network plugin is still missing and Pods cannot communicate across nodes; once kube-proxy and calico are installed the nodes will become Ready.


Step 08. After confirming the kubelet service started successfully, run the following command to view the certificates issued via kubelet bootstrap; if CONDITION is not Approved,Issued you need to troubleshoot for errors.

$ kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
csr-b949p 7m55s kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Approved,Issued
csr-c9hs4 3m34s kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Approved,Issued
csr-r8vhp 5m50s kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Approved,Issued
csr-zb4sr 3m40s kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Approved,Issued

7) Deploying and configuring kube-proxy

Description: The kube-proxy component maintains the network rules on each node so that traffic from inside or outside the cluster can reach Pods correctly; it is also the centerpiece of in-cluster load balancing, forwarding traffic to backend Pods.

Step 01. [master-225] Create the kube-proxy certificate signing request (CSR) file and generate the certificates

tee proxy-csr.json <<'EOF'
{
  "CN": "system:kube-proxy",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "ChongQing",
      "ST": "ChongQing",
      "O": "system:kube-proxy",
      "OU": "System"
    }
  ]
}
EOF

# Issue the kube-proxy certificate with the cluster CA
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes proxy-csr.json | cfssljson -bare proxy

$ ls proxy*
proxy.csr proxy-csr.json proxy-key.pem proxy.pem

# Copy the certificates
cp proxy*.pem /etc/kubernetes/ssl


Step 02. Next, create the kubeconfig configuration file for kube-proxy.

cd /etc/kubernetes/

# Set the cluster; the domain form (https://weiyigeek.cluster.k8s:16443) may also be used.
kubectl config set-cluster kubernetes --certificate-authority=/etc/kubernetes/ssl/ca.pem --embed-certs=true --server=https://10.10.107.222:16443 --kubeconfig=kube-proxy.kubeconfig

# Set the credentials
kubectl config set-credentials kube-proxy --client-certificate=/etc/kubernetes/ssl/proxy.pem --client-key=/etc/kubernetes/ssl/proxy-key.pem --embed-certs=true --kubeconfig=kube-proxy.kubeconfig

# Set the context
kubectl config set-context default --cluster=kubernetes --user=kube-proxy --kubeconfig=kube-proxy.kubeconfig

# Switch to the context
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig


Step 03. Create the kube-proxy service configuration file

cat > /etc/kubernetes/cfg/kube-proxy.yaml << "EOF"
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
bindAddress: 0.0.0.0
healthzBindAddress: 0.0.0.0:10256
metricsBindAddress: 0.0.0.0:10249
hostnameOverride: __HOSTNAME__
clusterCIDR: 10.128.0.0/16
clientConnection:
  kubeconfig: /etc/kubernetes/kube-proxy.kubeconfig
mode: ipvs
ipvs:
  excludeCIDRs:
  - 10.128.0.1/32
EOF

Tip: ProxyMode selects the proxy mode the Kubernetes proxy server runs in.

  • On Linux three modes are available: 'userspace' (legacy, being phased out), 'iptables' (newer and faster), and 'ipvs' (newest, with the best performance and scalability)
  • On Windows two modes are available: 'userspace' (older but stable) and 'kernelspace' (newer and faster)


Step 04. Create the kube-proxy.service systemd unit file

mkdir -vp /var/lib/kube-proxy
cat > /lib/systemd/system/kube-proxy.service << "EOF"
[Unit]
Description=Kubernetes Kube-proxy Server
Documentation=https://github.com/kubernetes/kubernetes
After=network.target

[Service]
WorkingDirectory=/var/lib/kube-proxy
ExecStart=/usr/local/bin/kube-proxy \
--config=/etc/kubernetes/cfg/kube-proxy.yaml \
--alsologtostderr=true \
--logtostderr=false \
--log-dir=/var/log/kubernetes \
--v=2
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF


Step 05. Sync the following files from the [master-225] node into the corresponding directories on the other nodes.

scp -P 20211 /etc/kubernetes/kube-proxy.kubeconfig  /etc/kubernetes/ssl/proxy.pem /etc/kubernetes/ssl/proxy-key.pem /etc/kubernetes/cfg/kube-proxy.yaml /lib/systemd/system/kube-proxy.service weiyigeek@master-223:/tmp
scp -P 20211 /etc/kubernetes/kube-proxy.kubeconfig /etc/kubernetes/ssl/proxy.pem /etc/kubernetes/ssl/proxy-key.pem /etc/kubernetes/cfg/kube-proxy.yaml /lib/systemd/system/kube-proxy.service weiyigeek@master-224:/tmp
scp -P 20211 /etc/kubernetes/kube-proxy.kubeconfig /etc/kubernetes/ssl/proxy.pem /etc/kubernetes/ssl/proxy-key.pem /etc/kubernetes/cfg/kube-proxy.yaml /lib/systemd/system/kube-proxy.service weiyigeek@node-1:/tmp
scp -P 20211 /etc/kubernetes/kube-proxy.kubeconfig /etc/kubernetes/ssl/proxy.pem /etc/kubernetes/ssl/proxy-key.pem /etc/kubernetes/cfg/kube-proxy.yaml /lib/systemd/system/kube-proxy.service weiyigeek@node-2:/tmp

# On each node, copy the files into place and create the working directory
cd /tmp
cp kube-proxy.kubeconfig /etc/kubernetes/kube-proxy.kubeconfig
cp proxy*.pem /etc/kubernetes/ssl/
cp kube-proxy.yaml /etc/kubernetes/cfg/kube-proxy.yaml
cp kube-proxy.service /lib/systemd/system/kube-proxy.service
mkdir -vp /var/lib/kube-proxy

# Tip: pay very close attention to the node hostnames — hostnameOverride in kube-proxy.yaml must be set to the hostname of the node it runs on.
sed -i "s#__HOSTNAME__#$(hostname)#g" /etc/kubernetes/cfg/kube-proxy.yaml
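The hostnameOverride substitution above is easy to verify locally before touching the live config. A minimal illustration against a throwaway copy (`node-1` stands in for `$(hostname)` here):

```shell
# Render the __HOSTNAME__ placeholder the same way the real command does,
# against a scratch file so nothing under /etc/kubernetes is touched.
demo=$(mktemp)
printf 'hostnameOverride: __HOSTNAME__\n' > "$demo"
sed -i "s#__HOSTNAME__#node-1#g" "$demo"   # on a real node: $(hostname) instead of node-1
cat "$demo"
```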


Step 06. Likewise reload systemd, enable kube-proxy to start automatically, and check the service status after starting it

systemctl daemon-reload
systemctl enable --now kube-proxy
# systemctl restart kube-proxy
systemctl status kube-proxy

WeiyiGeek: checking the kube-proxy service status


8) Deploying the Calico plugin

Description: Earlier, nodes joining the cluster showed status NotReady; as noted, deploying the calico plugin gives Pods normal cluster-wide networking.

Step 01. On the [master-225] node fetch the latest calico release, currently v3.22; official project address: (https://github.com/projectcalico/calico)

# Download the calico deployment manifest
wget https://docs.projectcalico.org/v3.22/manifests/calico.yaml

Step 02. In calico.yaml change the following key/value — the IP address pool Pods draw from — which our network plan sets to 10.128.0.0/16; note that these lines are commented out by default and the default pool is 192.168.0.0/16

$ vim calico.yaml
            # The default IPv4 pool to create on startup if none exists. Pod IPs will be
            # chosen from this range. Changing this value after installation will have
            # no effect. This should fall within `--cluster-cidr`.
            - name: CALICO_IPV4POOL_CIDR
              value: "10.128.0.0/16"

Step 03. Deploy calico into the cluster.

kubectl apply -f calico.yaml
# configmap/calico-config created
# ....
# poddisruptionbudget.policy/calico-kube-controllers created

Step 04. Check the calico Pods on each node; STATUS Running means the deployment succeeded.

$ kubectl get pod -A -o wide
# NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
# kube-system calico-kube-controllers-6f7b5668f7-m55gf 1/1 Running 0 56m <none> master-224 <none> <none>
# kube-system calico-node-6cmgl 1/1 Running 0 48m 10.10.107.226 node-1 <none> <none>
# kube-system calico-node-bx59c 1/1 Running 0 50m 10.10.107.227 node-2 <none> <none>
# kube-system calico-node-ft29c 1/1 Running 0 50m 10.10.107.225 master-225 <none> <none>
# kube-system calico-node-gkz76 1/1 Running 0 49m 10.10.107.223 master-223 <none> <none>
# kube-system calico-node-nbnwx 1/1 Running 0 47m 10.10.107.224 master-224 <none> <none>

Step 05. Check whether every node in the cluster is now Ready; a Ready status means the node is healthy.

kubectl get node
# NAME STATUS ROLES AGE VERSION
# master-223 Ready <none> 3d v1.23.6
# master-224 Ready <none> 2d6h v1.23.6
# master-225 Ready <none> 6d5h v1.23.6
# node-1 Ready <none> 2d23h v1.23.6
# node-2 Ready <none> 2d23h v1.23.6

Step 06. A brief digression: you may have noticed that with a binary installation the master nodes' ROLES column shows <none> rather than the familiar master. Also, from Kubernetes 1.24 onwards, kubeadm no longer labels nodes running control-plane components as "master", simply because the word is considered offensive. In recent years, computer systems using master-slave for primary/secondary roles have been renaming those terms — "slave" disappeared a couple of years ago and now "master" is going the same way. Either way, we can set our own node role names as follows.

# Label the main nodes as control-plane and the worker nodes as work
# Syntax: node-role.kubernetes.io/<role-name>=
kubectl label nodes master-223 node-role.kubernetes.io/control-plane=
kubectl label nodes master-224 node-role.kubernetes.io/control-plane=
kubectl label nodes master-225 node-role.kubernetes.io/control-plane=
kubectl label nodes node-1 node-role.kubernetes.io/work=
kubectl label nodes node-2 node-role.kubernetes.io/work=
node/node-2 labeled

# Query the nodes again
$ kubectl get node
NAME STATUS ROLES AGE VERSION
master-223 Ready control-plane 3d v1.23.6
master-224 Ready control-plane 2d6h v1.23.6
master-225 Ready control-plane 6d5h v1.23.6
node-1 Ready work 3d v1.23.6
node-2 Ready work 3d v1.23.6


9) Deploying the CoreDNS plugin

Description: As its name suggests, the CoreDNS plugin provides DNS and service discovery, letting workloads in the cluster reach backend Pods by service name: however the backend Pod addresses change, the binding to the service domain name is always updated immediately. CoreDNS is also a CNCF graduated project.

Website: https://coredns.io/
Project: https://github.com/coredns/coredns

Step 01. Following the Kubernetes deployment notes of the coredns project on GitHub (https://github.com/coredns/deployment/tree/master/kubernetes), download the deploy.sh and coredns.yaml.sed files.

wget -L https://github.com/coredns/deployment/raw/master/kubernetes/deploy.sh
wget -L https://github.com/coredns/deployment/raw/master/kubernetes/coredns.yaml.sed
chmod +x deploy.sh

Step 02. Generate the coredns deployment manifest

# -i DNS-IP sets the cluster DNS IP; it must match the clusterDNS field in the earlier kubelet-config.yaml
# -d CLUSTER-DOMAIN sets the cluster domain name; it must match the clusterDomain field in the earlier kubelet-config.yaml
./deploy.sh -i 10.96.0.10 > coredns.yaml

Tip: alternatively, we can substitute the placeholders in coredns.yaml.sed by hand.

CLUSTER_DNS_IP=10.96.0.10
CLUSTER_DOMAIN=cluster.local
REVERSE_CIDRS="in-addr.arpa ip6.arpa"
STUBDOMAINS=""
UPSTREAMNAMESERVER="/etc/resolv.conf"
YAML_TEMPLATE=coredns.yaml.sed
orig=$'\n'
replace='\\n'

sed -e "s/CLUSTER_DNS_IP/$CLUSTER_DNS_IP/g" \
-e "s/CLUSTER_DOMAIN/$CLUSTER_DOMAIN/g" \
-e "s?REVERSE_CIDRS?$REVERSE_CIDRS?g" \
-e "s@STUBDOMAINS@${STUBDOMAINS//$orig/$replace}@g" \
-e "s@UPSTREAMNAMESERVER@$UPSTREAMNAMESERVER@g" \
"${YAML_TEMPLATE}"
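One detail worth noting in the substitution above: sed expressions can use alternative delimiters (`s?…?`, `s@…@`) for values that may contain `/`, such as CIDRs or file paths, which would otherwise terminate a plain `s/…/…/` expression early. A minimal illustration (PLACEHOLDER and the one-line input are purely for demonstration):

```shell
# A replacement value containing '/' breaks "s/…/…/" but works with "s?…?…?".
CIDR="10.128.0.0/16"
RENDERED=$(printf 'clusterCIDR: PLACEHOLDER\n' | sed -e "s?PLACEHOLDER?$CIDR?g")
echo "$RENDERED"
```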

Step 03. Review the generated coredns deployment manifest (coredns.yaml)

apiVersion: v1
kind: ServiceAccount
metadata:
  name: coredns
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:coredns
rules:
  - apiGroups:
    - ""
    resources:
    - endpoints
    - services
    - pods
    - namespaces
    verbs:
    - list
    - watch
  - apiGroups:
    - discovery.k8s.io
    resources:
    - endpointslices
    verbs:
    - list
    - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:coredns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:coredns
subjects:
- kind: ServiceAccount
  name: coredns
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf {
            max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/name: "CoreDNS"
spec:
  # replicas: not specified here:
  # 1. Default is 1.
  # 2. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      priorityClassName: system-cluster-critical
      serviceAccountName: coredns
      tolerations:
        - key: "CriticalAddonsOnly"
          operator: "Exists"
      nodeSelector:
        kubernetes.io/os: linux
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
                - key: k8s-app
                  operator: In
                  values: ["kube-dns"]
            topologyKey: kubernetes.io/hostname
      containers:
      - name: coredns
        image: coredns/coredns:1.9.1
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        args: [ "-conf", "/etc/coredns/Corefile" ]
        volumeMounts:
        - name: config-volume
          mountPath: /etc/coredns
          readOnly: true
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - all
          readOnlyRootFilesystem: true
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /ready
            port: 8181
            scheme: HTTP
      dnsPolicy: Default
      volumes:
        - name: config-volume
          configMap:
            name: coredns
            items:
            - key: Corefile
              path: Corefile
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "CoreDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.96.0.10
  ports:
    - name: dns
      port: 53
      protocol: UDP
    - name: dns-tcp
      port: 53
      protocol: TCP
    - name: metrics
      port: 9153
      protocol: TCP

Step 04. Deploy coredns into the cluster and check its deployment status.

kubectl apply -f coredns.yaml
# serviceaccount/coredns created
# clusterrole.rbac.authorization.k8s.io/system:coredns created
# clusterrolebinding.rbac.authorization.k8s.io/system:coredns created
# configmap/coredns created
# deployment.apps/coredns created
# service/kube-dns created

# Tip: to speed up the rollout you can pre-pull the coredns image onto the master nodes first
ctr -n k8s.io i pull docker.io/coredns/coredns:1.9.1


0x03 Application Deployment Tests

1. Deploying an Nginx web service

Method 1. Use kubectl create to quickly deploy an nginx application managed by a Deployment

# Create an Nginx service managed by a Deployment resource controller in the default namespace
$ kubectl create deployment --image=nginx:latest --port=5701 --replicas 1 hello-nginx
deployment.apps/hello-nginx created

# The Pod that was created
$ kubectl get pod -o wide --show-labels
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS
hello-nginx-7f4ff84cb-mjw79 1/1 Running 0 65s 10.128.17.203 master-225 <none> <none> app=hello-nginx,pod-template-hash=7f4ff84cb

# Create a service for the hello-nginx deployment, serving port 80 and connecting to port 80 on the Pod container
$ kubectl expose deployment hello-nginx --port=80 --target-port=80
service/hello-nginx exposed

$ kubectl get svc hello-nginx
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
hello-nginx ClusterIP 10.96.122.135 <none> 80/TCP 23s

# Verify the service
$ curl -I 10.96.122.135
HTTP/1.1 200 OK
Server: nginx/1.21.5
Date: Sun, 08 May 2022 09:58:27 GMT
Content-Type: text/html
Content-Length: 615
Last-Modified: Tue, 28 Dec 2021 15:28:38 GMT
Connection: keep-alive
ETag: "61cb2d26-267"
Accept-Ranges: bytes


Method 2. Alternatively, create the nginx service from a resource manifest, here used to deploy my personal homepage
Step 01. Prepare a manifest managed by a StatefulSet controller

tee nginx.yaml <<'EOF'
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx-web
  namespace: default
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  serviceName: "nginx-service"

  template:
    metadata:
      labels:
        app: nginx
    spec:
      volumes:
      - name: workdir
        emptyDir: {}
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - name: http
          protocol: TCP
          containerPort: 80
        volumeMounts:
        - name: workdir
          mountPath: "/usr/share/nginx/html"
      initContainers:
      - name: init-html
        image: bitnami/git:2.36.1
        command: ['sh', '-c', "git clone --depth=1 https://github.com/WeiyiGeek/WeiyiGeek.git /app/html"]
        volumeMounts:
        - name: workdir
          mountPath: "/app/html"
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30001
    protocol: TCP
  selector:
    app: nginx
EOF

Step 02. Apply the resources and inspect the Pod information

$ kubectl apply -f nginx.yaml
statefulset.apps/nginx-web created
service/nginx-service created

# Pod still running its init container
$ kubectl get pod nginx-web-0
NAME READY STATUS RESTARTS AGE
nginx-web-0 0/1 Init:0/1 0 3m19s

# Now running normally
$ kubectl get pod nginx-web-0 -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-web-0 1/1 Running 0 12m 10.128.251.72 master-224 <none> <none>

# Inspect the nginx-service backends
$ kubectl describe svc nginx-service
Name: nginx-service
Namespace: default
Labels: <none>
Annotations: <none>
Selector: app=nginx
Type: NodePort # exposed via NodePort here
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.96.14.218
IPs: 10.96.14.218
Port: <unset> 80/TCP
TargetPort: 80/TCP
NodePort: <unset> 30001/TCP # the externally exposed port
Endpoints: 10.128.251.72:80
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>

Step 03. Open 10.10.107.225:30001 in a client browser to reach the deployed nginx application; you can also follow the Pod logs with kubectl logs -f nginx-web-0.

WeiyiGeek: the deployed personal homepage


2. Deploying the native Kubernetes Dashboard UI

Description: Kubernetes Dashboard is a general-purpose, web-based UI for Kubernetes clusters. It allows users to manage and troubleshoot the applications running in the cluster, as well as manage the cluster itself.

Project: https://github.com/kubernetes/dashboard/

Step 01. Download the dashboard deployment manifest from GitHub; the latest version is currently v2.5.1

# Download and deploy
wget -L https://raw.githubusercontent.com/kubernetes/dashboard/v2.5.1/aio/deploy/recommended.yaml
# wget -L https://raw.githubusercontent.com/kubernetes/dashboard/v2.6.0/aio/deploy/recommended.yaml -O dashboard.yaml
kubectl apply -f recommended.yaml
grep "image:" recommended.yaml
# image: kubernetesui/dashboard:v2.5.1
# image: kubernetesui/metrics-scraper:v1.0.7

# Or deploy with a single command
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.5.1/aio/deploy/recommended.yaml
# serviceaccount/kubernetes-dashboard created
# service/kubernetes-dashboard created
# secret/kubernetes-dashboard-certs created
# secret/kubernetes-dashboard-csrf created
# secret/kubernetes-dashboard-key-holder created
# configmap/kubernetes-dashboard-settings created
# role.rbac.authorization.k8s.io/kubernetes-dashboard created
# clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created
# rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
# clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
# deployment.apps/kubernetes-dashboard created
# service/dashboard-metrics-scraper created
# deployment.apps/dashboard-metrics-scraper created


Step 02. Verify that the deployed dashboard resources are healthy.

$ kubectl get deploy,svc -n kubernetes-dashboard  -o wide
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
deployment.apps/dashboard-metrics-scraper 1/1 1 1 7m45s dashboard-metrics-scraper kubernetesui/metrics-scraper:v1.0.7 k8s-app=dashboard-metrics-scraper
deployment.apps/kubernetes-dashboard 1/1 1 1 7m45s kubernetes-dashboard kubernetesui/dashboard:v2.5.1 k8s-app=kubernetes-dashboard

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/dashboard-metrics-scraper ClusterIP 10.96.37.134 <none> 8000/TCP 7m45s k8s-app=dashboard-metrics-scraper
service/kubernetes-dashboard ClusterIP 10.96.26.57 <none> 443/TCP 7m45s k8s-app=kubernetes-dashboard

# Edit service/kubernetes-dashboard to expose the service via nodePort 30443.
$ kubectl edit svc -n kubernetes-dashboard kubernetes-dashboard
# service/kubernetes-dashboard edited
apiVersion: v1
kind: Service
.....
spec:
  .....
  ports:
  - port: 443
    protocol: TCP
    targetPort: 8443
    nodePort: 30443 # added
  selector:
    k8s-app: kubernetes-dashboard
  sessionAffinity: None
  type: NodePort # changed


Step 03. The default dashboard deployment ships with only the minimal RBAC permissions it needs to run; to actually manage cluster resources from the dashboard we usually also create a custom administrator role for it.
Access control reference: https://github.com/kubernetes/dashboard/blob/master/docs/user/access-control/README.md

# The token created by default has minimal permissions (it can only operate on resources in the kubernetes-dashboard namespace)
kubectl get sa -n kubernetes-dashboard kubernetes-dashboard
kubectl describe secrets -n kubernetes-dashboard kubernetes-dashboard-token-jhdpb | grep '^token:'|awk '{print $2}'
WeiyiGeek: the two default Dashboard authentication methods

Kubernetes Dashboard supports several different user authentication methods:

  • Authorization header
  • Bearer Token (default)
  • Username/password
  • Kubeconfig file (default)

Tip: here we use the Bearer Token method. For ease of demonstration we grant the Dashboard's service account admin privileges; this is generally not recommended in production — instead, scope its access to resources in one or a few specific namespaces.

tee rbac-dashboard-admin.yaml <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: dashboard-admin
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dashboard-admin
  namespace: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: dashboard-admin
  namespace: kubernetes-dashboard
EOF

kubectl apply -f rbac-dashboard-admin.yaml
# serviceaccount/dashboard-admin created
# clusterrolebinding.rbac.authorization.k8s.io/dashboard-admin created

Step 04. Get the name of the secret created for the dashboard-admin service account, then extract its authentication token for logging in to the dashboard built above.

kubectl get sa -n kubernetes-dashboard dashboard-admin -o yaml | grep "\- name" | awk '{print $3}'
# dashboard-admin-token-crh7v
kubectl describe secrets -n kubernetes-dashboard dashboard-admin-token-crh7v | grep "^token:" | awk '{print $2}'
# The authentication token obtained
eyJhbGciOiJSUzI1NiIsImtpZCI6IkJXdm1YSGNSQ3VFSEU3V0FTRlJKcU10bWxzUDZPY3lfU0lJOGJjNGgzRXMifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkYXNoYm9hcmQtYWRtaW4tdG9rZW4tY3JoN3YiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGFzaGJvYXJkLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiNDY3ODEwMDMtM2MzNi00NWE1LTliYTQtNDY3MTQ0OWE2N2E0Iiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmVybmV0ZXMtZGFzaGJvYXJkOmRhc2hib2FyZC1hZG1pbiJ9.X10AzWBxaHObYGoOqjfw3IYkhn8L5E7najdGSeLavb94LX5BY8_rCGizkWgNgNyvUe39NRP8r8YBU5sy9F2K-kN9_5cxUX125cj1drLDmgPJ-L-1m9-fs-luKnkDLRE5ENS_dgv7xsFfhtN7s9prgdqLw8dIrhshHVwflM_VOXW5D26QR6izy2AgPNGz9cRh6x2znrD-dpUNHO1enzvGzlWj7YhaOUFl310V93hh6EEc57gAwmDQM4nWP44KiaAiaW1cnC38Xs9CbWYxjsfxd3lObWShOd3knFk5PUVSBHo0opEv3HQ_-gwu6NGV6pLMY52p_JO1ECPSDnblVbVtPQ

Step 05. Log in to the Kubernetes dashboard UI using the token above.

WeiyiGeek: the dashboard with admin privileges


3. Managing the cluster with the K9s tool

Description: k9s is a CLI for managing Kubernetes clusters, providing a terminal UI to interact with them. By wrapping kubectl functionality, k9s continuously watches the cluster for changes and offers follow-up commands to act on the resources you are observing — put plainly, k9s lets developers quickly inspect and resolve the day-to-day issues of running Kubernetes.
Website: https://k9scli.io/
Reference: https://github.com/derailed/k9s

Here we install from the binary release as an example.

# 1. Download with wget: -c resumes interrupted transfers, -b downloads in the background
wget -b -c https://github.com/derailed/k9s/releases/download/v0.25.18/k9s_Linux_x86_64.tar.gz

# 2. Extract and remove the extra files
tar -zxf k9s_Linux_x86_64.tar.gz
rm k9s_Linux_x86_64.tar.gz LICENSE README.md

# 3. Copy the kubernetes admin kubeconfig into the home directory
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
export KUBECONFIG=/etc/kubernetes/admin.conf

# 4. Run it directly; if you are comfortable with vim keybindings you will pick up k9s quickly.
/nfsdisk-31/newK8s-Backup/tools# ./k9s

# 5. Quit k9s
:quit

WeiyiGeek: managing the cluster with K9s

More usage tips: [K9s: hands-on with the K8s cluster management tool](https://blog.weiyigeek.top/2022/1-1-582.html)


0x04 Pitfalls and Fixes

When deploying the calico network plugin on a binary-built K8S cluster, one calico-node-xxx Pod stays in STATUS Running but its READY column never leaves 0/1

  • Error message:

    kubectl get pod -A
    NAMESPACE NAME READY STATUS RESTARTS AGE
    kube-system calico-kube-controllers-6f7b5668f7-52247 1/1 Running 0 13m
    kube-system calico-node-kb27z 0/1 Running 0 13m

    $ kubectl describe pod -n kube-system calico-node-kb27z
    Warning Unhealthy 14m kubelet Readiness probe failed: 2022-05-07 13:15:12.460 [INFO][204] confd/health.go 180: Number of node(s) with BGP peering established = 0
    calico/node is not ready: BIRD is not ready: BGP not established with 10.10.107.223,10.10.107.224,10.10.107.225,10.10.107.226 # the key clue
  • Cause: by default the calico-node daemonset takes the IP of the first NIC it finds as the calico node IP; since NIC names are not uniform across the cluster, calico may pick the wrong interface's IP, in which case the only remedy is to set the IP_AUTODETECTION_METHOD field to a NIC-name wildcard or an IP address.

  • Fix:

    # Identify the NIC carrying cluster traffic (here it is ens32)
    ip addr
    # 2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    # link/ether 00:0c:29:a5:66:de brd ff:ff:ff:ff:ff:ff
    # inet 10.10.107.227/24 brd 10.10.107.255 scope global ens32

    # Edit the calico daemonset and add the IP_AUTODETECTION_METHOD key/value; interface takes a regular expression, set here to en.* so it matches our NICs
    kubectl edit daemonsets.apps -n kube-system calico-node
    - name: IP
      value: autodetect
    - name: IP_AUTODETECTION_METHOD
      value: interface=en.*
  • Further reading: the official calico documentation at https://docs.projectcalico.org/reference/node/configuration#ip-autodetection-methods

    # IP_AUTODETECTION_METHOD	
    # The method to use to autodetect the IPv4 address for this host. This is only used when the IPv4 address is being autodetected. See IP Autodetection methods for details of the valid methods. [Default: first-found]

    - name: IP_AUTODETECTION_METHOD
    value: can-reach=114.114.114.114
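Before editing the daemonset it is worth sanity-checking that a pattern like `en.*` actually matches your interface names. A quick local sketch against a sample NIC list (`ens32` is taken from the `ip addr` output above; on a real node you would feed in `ls /sys/class/net` instead of the printf):

```shell
# Filter a NIC name list with the same kind of pattern calico will use.
MATCHED=$(printf 'lo\nens32\ndocker0\n' | grep -E '^en.*')
echo "$MATCHED"
```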

Binary-deployed containerd 1.6.3 raises the loopback CNI config version to 1.0.0, which breaks the CNI loopback plugin

  • Error message:

    Warning  FailedCreatePodSandBox  4m19s (x1293 over 34m)  kubelet            (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "ae7a5d4c6906fac5114679c4b375ecc02e0cebcf4406962f01b40220064a8c1c": plugin type="loopback" failed (add): incompatible CNI versions; config is "1.0.0", plugin supports ["0.1.0" "0.2.0" "0.3.0" "0.3.1" "0.4.0"]
  • Cause: a bug in containerd 1.6.3

  • Fix: reinstall using cri-containerd-cni-1.6.4-linux-amd64

containerd --version
containerd github.com/containerd/containerd v1.6.4 212e8b6fa2f44b9c21b2798135fc6fb7c53efc16


Deploying coredns in the cluster results in CrashLoopBackOff, with Readiness probe failed errors reporting connection refused on port 8181

  • Error message:

    kubectl get pod -n kube-system -o wide
    # coredns-659648f898-kvgrl 0/1 CrashLoopBackOff 5 (73s ago) 4m14s 10.128.17.198 master-225 <none> <none>

    kubectl describe pod -n kube-system coredns-659648f898-crttw
    # Warning Unhealthy 2m34s kubelet Readiness probe failed: HTTP probe failed with statuscode: 503
    # Warning Unhealthy 95s kubelet Readiness probe failed: Get "http://10.128.251.66:8181/ready": dial tcp 10.128.251.66:8181: connect: connection refused
  • Cause: coredns mounts the host's /etc/resolv.conf into the container, and the stub resolver address nameserver 127.0.0.53 in it is unreachable from inside the Pod, so the coredns container cannot start normally. Moreover, systemd-resolved.service periodically regenerates /etc/resolv.conf, overwriting any manual edits — which is why the common workarounds found online only help temporarily: the Pod falls back into this broken state on its next restart.

    $ ls -alh /etc/resolv.conf
    # lrwxrwxrwx 1 root root 39 Feb 2 2021 /etc/resolv.conf -> ../run/systemd/resolve/stub-resolv.conf

    $ cat /etc/resolv.conf
    # nameserver 127.0.0.53
    # options edns0 trust-ad
  • Fix: the recommended approach is to re-point the /etc/resolv.conf symlink at /run/systemd/resolve/resolv.conf, or drop the symlink entirely and use a regular file.

    # Setting DNS via systemd-resolved here prevents /etc/resolv.conf from being accidentally overwritten later.
    tee -a /etc/systemd/resolved.conf <<'EOF'
    DNS=223.6.6.6
    DNS=10.96.0.10
    EOF
    systemctl restart systemd-resolved.service
    systemd-resolve --status

    # Delete the symbolic link
    sudo rm -f /etc/resolv.conf
    ln -s /run/systemd/resolve/resolv.conf /etc/resolv.conf

    # Delete the failing coredns Pods, selecting them by label
    kubectl delete pod -n kube-system -l k8s-app=kube-dns

    # Scale coredns to 3 replicas and check that they come up normally.
    kubectl scale deployment -n kube-system coredns --replicas 3
    # deployment.apps/coredns scaled

    $ cat /etc/resolv.conf
    nameserver 223.6.6.6
    nameserver 10.96.0.10
    nameserver 192.168.10.254
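The symlink situation described above can be reproduced and inspected without touching the real /etc/resolv.conf. A small sandbox sketch (all paths here are temporary stand-ins for the systemd-resolved layout):

```shell
# Recreate the systemd-resolved layout in a temp dir: a stub file holding
# the 127.0.0.53 resolver, and resolv.conf as a symlink pointing at it.
SANDBOX=$(mktemp -d)
printf 'nameserver 127.0.0.53\n' > "$SANDBOX/stub-resolv.conf"
ln -s "$SANDBOX/stub-resolv.conf" "$SANDBOX/resolv.conf"
TARGET=$(readlink "$SANDBOX/resolv.conf")   # where the symlink points
echo "$TARGET"
```

`readlink` is the quickest way to confirm where a node's /etc/resolv.conf actually points before and after applying the fix.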


When logging in to kubernetes-dashboard with a token after granting admin privileges, authentication always fails with Unauthorized (401): Invalid credentials provided

  • Error message: Unauthorized (401): Invalid credentials provided
    $ kubectl logs -f --tail 50 -n kubernetes-dashboard kubernetes-dashboard-fb8648fd9-8pl5c
    2022/05/09 00:48:50 [2022-05-09T00:48:50Z] Incoming HTTP/2.0 POST /api/v1/login request from 10.128.17.194:54676: { contents hidden }
    2022/05/09 00:48:50 Non-critical error occurred during resource retrieval: the server has asked for the client to provide credentials
  • Cause: the RBAC objects themselves are fine; the token obtained via kubectl get secrets is base64-encoded, so we should run kubectl describe secrets to get the decoded token instead.
  • Fix:
    kubectl get sa -n kubernetes-dashboard dashboard-admin -o yaml | grep "\- name" | awk '{print $3}'
    # dashboard-admin-token-crh7v
    kubectl describe secrets -n kubernetes-dashboard dashboard-admin-token-crh7v | grep "^token:" | awk '{print $2}'
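The difference between the two commands is easy to see offline: `.data` fields returned by `kubectl get -o yaml` are base64-encoded, while `kubectl describe` prints them decoded. A small illustration (the token value is a made-up stand-in):

```shell
# Encode a fake token the way a Secret stores it, then decode it back;
# pasting the *encoded* form into the dashboard is what yields the 401 above.
RAW='eyJhbGciOi-fake-token'
ENCODED=$(printf '%s' "$RAW" | base64)        # what `kubectl get -o yaml` shows
DECODED=$(printf '%s' "$ENCODED" | base64 -d) # what the dashboard expects
echo "$DECODED"
```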