Installing a Highly Available Kubernetes 1.30 Cluster on Ubuntu 24.04
1. Overview
At the time of writing the newest K8S release is 1.31.0; we install the next-most-recent minor version, 1.30 (v1.30.4 from the apt repository), and thanks to the guidance of XiaoHH Superme there were essentially no pitfalls and the build went smoothly. Every machine runs the latest long-term-support release, Ubuntu-Server-24.04.
The machines were prepared in advance with the basics: vim and ssh installed, sshd enabled at boot, and a static IP configured on each host (static-IP setup is covered in a separate guide). All of them run as PVE virtual machines, rock solid.
There are 3 master nodes (4C8G), 3 worker nodes (4C8G), and 2 LoadBalancer machines (4C8G), with the following IPs:
hostname | IP | function | OS |
---|---|---|---|
hep-kubernetes-master-prd-01 | 192.168.31.41 | Control plane | Ubuntu-Server-24.04 |
hep-kubernetes-master-prd-02 | 192.168.31.42 | Control plane | Ubuntu-Server-24.04 |
hep-kubernetes-master-prd-03 | 192.168.31.43 | Control plane | Ubuntu-Server-24.04 |
hep-kubernetes-apiserver-lb-prd-01 | 192.168.31.44 | LoadBalancer | Ubuntu-Server-24.04 |
hep-kubernetes-apiserver-lb-prd-02 | 192.168.31.45 | LoadBalancer | Ubuntu-Server-24.04 |
hep-kubernetes-worker-prd-01 | 192.168.31.46 | worker node | Ubuntu-Server-24.04 |
hep-kubernetes-worker-prd-02 | 192.168.31.47 | worker node | Ubuntu-Server-24.04 |
hep-kubernetes-worker-prd-03 | 192.168.31.48 | worker node | Ubuntu-Server-24.04 |
Role assignment:
- hep-kubernetes-apiserver-lb-prd-01 and hep-kubernetes-apiserver-lb-prd-02 run keepalived and haproxy, acting as the load balancer in front of the masters' apiserver
- hep-kubernetes-master-prd-01, hep-kubernetes-master-prd-02, and hep-kubernetes-master-prd-03 are the three master nodes
- hep-kubernetes-worker-prd-01, hep-kubernetes-worker-prd-02, and hep-kubernetes-worker-prd-03 are the three worker nodes
When deploying Kubernetes across this many machines, online write-ups often never say which machine a given command should run on, which leaves beginners lost. This tutorial is therefore organized per machine: every step for every host is listed in full. If that feels repetitive, especially under a hypervisor such as PVE, you can finish the setup on one machine and then clone the VM, which is far more convenient.
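If you would rather script the repetition instead of cloning VMs, here is a minimal sketch. It assumes passwordless root SSH from your workstation; common-setup.sh is a hypothetical file into which you have collected the shared per-host steps from the sections below (hostname-specific steps still need doing by hand):
# Hypothetical helper: replay the shared bootstrap steps on every node over SSH.
# Assumes passwordless root SSH and a local common-setup.sh assembled from the
# per-machine sections below.
for ip in 192.168.31.{41..48}; do
  ssh "root@${ip}" 'bash -s' < common-setup.sh
done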
2. Configure the container runtime
The container runtime of choice is containerd, version 1.7.20. Download it with the command below, or fetch https://github.com/containerd/containerd/releases/download/v1.7.20/cri-containerd-cni-1.7.20-linux-amd64.tar.gz directly.
curl -LO https://github.com/containerd/containerd/releases/download/v1.7.20/cri-containerd-cni-1.7.20-linux-amd64.tar.gz
2.1 Configure containerd on hep-kubernetes-master-prd-01
#Set the hostname
hostnamectl set-hostname hep-kubernetes-master-prd-01
#Edit the hosts file and add every machine in the cluster, so the nodes can resolve each other's hostnames to IP addresses
vim /etc/hosts
#Add the following entries
192.168.31.41 hep-kubernetes-master-prd-01
192.168.31.42 hep-kubernetes-master-prd-02
192.168.31.43 hep-kubernetes-master-prd-03
192.168.31.44 hep-kubernetes-apiserver-lb-prd-01
192.168.31.45 hep-kubernetes-apiserver-lb-prd-02
192.168.31.46 hep-kubernetes-worker-prd-01
192.168.31.47 hep-kubernetes-worker-prd-02
192.168.31.48 hep-kubernetes-worker-prd-03
#Set the timezone to Asia/Shanghai
timedatectl set-timezone Asia/Shanghai
#Install a time-synchronization tool
apt install -y ntpdate
#Synchronize the clock
ntpdate ntp.aliyun.com
#Disable swap for the current session, then edit /etc/fstab and comment out the /swap.img line so swap stays off after a reboot (kubeadm's preflight check fails while swap is on)
swapoff -a
vim /etc/fstab
# Set the required sysctl parameter; it persists across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
EOF
# Apply the sysctl parameters without rebooting
sysctl --system
#Verify that net.ipv4.ip_forward is set to 1
sysctl net.ipv4.ip_forward
#The container runtime is containerd 1.7.20; download the bundle with this command
curl -LO https://github.com/containerd/containerd/releases/download/v1.7.20/cri-containerd-cni-1.7.20-linux-amd64.tar.gz
#Alternatively, as I do, copy the tarball over scp from another machine on the LAN (I manage mine through webmin)
scp root@192.168.31.2:/usr/software/cri-containerd-cni-1.7.20-linux-amd64.tar.gz /usr/software/
#Extract the archive into the root directory
tar -zxvf cri-containerd-cni-1.7.20-linux-amd64.tar.gz -C /
#Check the version
containerd --version
#No configuration file ships by default, so create the directory and generate one
mkdir /etc/containerd
#Generate the default configuration
containerd config default | sudo tee /etc/containerd/config.toml
vim /etc/containerd/config.toml
#The default pause image lives on Google's registry, which is unreachable from mainland China, so in /etc/containerd/config.toml point the pause image at registry.aliyuncs.com/google_containers/pause:3.9
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"
#Search for plugins."io.containerd.grpc.v1.cri".registry.mirrors and add a few Docker Hub mirrors. Note the mirror key is the registry being mirrored (docker.io); the endpoints are the mirror URLs
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
    endpoint = ["https://docker.m.daocloud.io", "https://noohub.ru", "https://huecker.io", "https://dockerhub.timeweb.cloud", "https://docker.rainbond.cc"]
#Then switch to the systemd cgroup driver (cgroups enforce the resource limits given to processes): under [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options], set SystemdCgroup to true
SystemdCgroup = true
#Start containerd and enable it at boot
systemctl enable --now containerd
#Open the ports the cluster components communicate on (a list compiled from a quick Baidu search)
ufw allow 6443
ufw allow 10248
ufw allow 10250
ufw allow 10251
ufw allow 10252
ufw allow 10255
ufw allow 10256
ufw allow 10257
ufw allow 10259
ufw allow 2375
ufw allow 8472
ufw allow 4789
ufw allow 9099
ufw allow 9796
ufw allow 2379
ufw allow 2380
ufw allow 80
ufw allow 443
ufw allow 9443
#Enable the firewall (it stays enabled across reboots)
ufw enable
#Or take the lazy route and disable the firewall entirely; fine if you are just playing locally, but not recommended in production
ufw disable
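Equivalently, the whole list can be applied in one loop; a small convenience sketch over exactly the ports above, deduplicated:
#Open every required port in one pass
for port in 80 443 2375 2379 2380 4789 6443 8472 9099 9443 9796 10248 10250 10251 10252 10255 10256 10257 10259; do
  ufw allow "${port}"
done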
2.2 Configure containerd on hep-kubernetes-master-prd-02
#Set the hostname
hostnamectl set-hostname hep-kubernetes-master-prd-02
#Edit the hosts file and add every machine in the cluster, so the nodes can resolve each other's hostnames to IP addresses
vim /etc/hosts
#Add the following entries
192.168.31.41 hep-kubernetes-master-prd-01
192.168.31.42 hep-kubernetes-master-prd-02
192.168.31.43 hep-kubernetes-master-prd-03
192.168.31.44 hep-kubernetes-apiserver-lb-prd-01
192.168.31.45 hep-kubernetes-apiserver-lb-prd-02
192.168.31.46 hep-kubernetes-worker-prd-01
192.168.31.47 hep-kubernetes-worker-prd-02
192.168.31.48 hep-kubernetes-worker-prd-03
#Set the timezone to Asia/Shanghai
timedatectl set-timezone Asia/Shanghai
#Install a time-synchronization tool
apt install -y ntpdate
#Synchronize the clock
ntpdate ntp.aliyun.com
#Disable swap for the current session, then edit /etc/fstab and comment out the /swap.img line so swap stays off after a reboot (kubeadm's preflight check fails while swap is on)
swapoff -a
vim /etc/fstab
# Set the required sysctl parameter; it persists across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
EOF
# Apply the sysctl parameters without rebooting
sysctl --system
#Verify that net.ipv4.ip_forward is set to 1
sysctl net.ipv4.ip_forward
#The container runtime is containerd 1.7.20; download the bundle with this command
curl -LO https://github.com/containerd/containerd/releases/download/v1.7.20/cri-containerd-cni-1.7.20-linux-amd64.tar.gz
#Alternatively, as I do, copy the tarball over scp from another machine on the LAN (I manage mine through webmin)
scp root@192.168.31.2:/usr/software/cri-containerd-cni-1.7.20-linux-amd64.tar.gz /usr/software/
#Extract the archive into the root directory
tar -zxvf cri-containerd-cni-1.7.20-linux-amd64.tar.gz -C /
#Check the version
containerd --version
#No configuration file ships by default, so create the directory and generate one
mkdir /etc/containerd
#Generate the default configuration
containerd config default | sudo tee /etc/containerd/config.toml
vim /etc/containerd/config.toml
#The default pause image lives on Google's registry, which is unreachable from mainland China, so in /etc/containerd/config.toml point the pause image at registry.aliyuncs.com/google_containers/pause:3.9
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"
#Search for plugins."io.containerd.grpc.v1.cri".registry.mirrors and add a few Docker Hub mirrors. Note the mirror key is the registry being mirrored (docker.io); the endpoints are the mirror URLs
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
    endpoint = ["https://docker.m.daocloud.io", "https://noohub.ru", "https://huecker.io", "https://dockerhub.timeweb.cloud", "https://docker.rainbond.cc"]
#Then switch to the systemd cgroup driver (cgroups enforce the resource limits given to processes): under [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options], set SystemdCgroup to true
SystemdCgroup = true
#Start containerd and enable it at boot
systemctl enable --now containerd
#Open the ports the cluster components communicate on (a list compiled from a quick Baidu search)
ufw allow 6443
ufw allow 10248
ufw allow 10250
ufw allow 10251
ufw allow 10252
ufw allow 10255
ufw allow 10256
ufw allow 10257
ufw allow 10259
ufw allow 2375
ufw allow 8472
ufw allow 4789
ufw allow 9099
ufw allow 9796
ufw allow 2379
ufw allow 2380
ufw allow 80
ufw allow 443
ufw allow 9443
#Enable the firewall (it stays enabled across reboots)
ufw enable
#Or take the lazy route and disable the firewall entirely; fine if you are just playing locally, but not recommended in production
ufw disable
2.3 Configure containerd on hep-kubernetes-master-prd-03
#Set the hostname
hostnamectl set-hostname hep-kubernetes-master-prd-03
#Edit the hosts file and add every machine in the cluster, so the nodes can resolve each other's hostnames to IP addresses
vim /etc/hosts
#Add the following entries
192.168.31.41 hep-kubernetes-master-prd-01
192.168.31.42 hep-kubernetes-master-prd-02
192.168.31.43 hep-kubernetes-master-prd-03
192.168.31.44 hep-kubernetes-apiserver-lb-prd-01
192.168.31.45 hep-kubernetes-apiserver-lb-prd-02
192.168.31.46 hep-kubernetes-worker-prd-01
192.168.31.47 hep-kubernetes-worker-prd-02
192.168.31.48 hep-kubernetes-worker-prd-03
#Set the timezone to Asia/Shanghai
timedatectl set-timezone Asia/Shanghai
#Install a time-synchronization tool
apt install -y ntpdate
#Synchronize the clock
ntpdate ntp.aliyun.com
#Disable swap for the current session, then edit /etc/fstab and comment out the /swap.img line so swap stays off after a reboot (kubeadm's preflight check fails while swap is on)
swapoff -a
vim /etc/fstab
# Set the required sysctl parameter; it persists across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
EOF
# Apply the sysctl parameters without rebooting
sysctl --system
#Verify that net.ipv4.ip_forward is set to 1
sysctl net.ipv4.ip_forward
#The container runtime is containerd 1.7.20; download the bundle with this command
curl -LO https://github.com/containerd/containerd/releases/download/v1.7.20/cri-containerd-cni-1.7.20-linux-amd64.tar.gz
#Alternatively, as I do, copy the tarball over scp from another machine on the LAN (I manage mine through webmin)
scp root@192.168.31.2:/usr/software/cri-containerd-cni-1.7.20-linux-amd64.tar.gz /usr/software/
#Extract the archive into the root directory
tar -zxvf cri-containerd-cni-1.7.20-linux-amd64.tar.gz -C /
#Check the version
containerd --version
#No configuration file ships by default, so create the directory and generate one
mkdir /etc/containerd
#Generate the default configuration
containerd config default | sudo tee /etc/containerd/config.toml
vim /etc/containerd/config.toml
#The default pause image lives on Google's registry, which is unreachable from mainland China, so in /etc/containerd/config.toml point the pause image at registry.aliyuncs.com/google_containers/pause:3.9
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"
#Search for plugins."io.containerd.grpc.v1.cri".registry.mirrors and add a few Docker Hub mirrors. Note the mirror key is the registry being mirrored (docker.io); the endpoints are the mirror URLs
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
    endpoint = ["https://docker.m.daocloud.io", "https://noohub.ru", "https://huecker.io", "https://dockerhub.timeweb.cloud", "https://docker.rainbond.cc"]
#Then switch to the systemd cgroup driver (cgroups enforce the resource limits given to processes): under [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options], set SystemdCgroup to true
SystemdCgroup = true
#Start containerd and enable it at boot
systemctl enable --now containerd
#Open the ports the cluster components communicate on (a list compiled from a quick Baidu search)
ufw allow 6443
ufw allow 10248
ufw allow 10250
ufw allow 10251
ufw allow 10252
ufw allow 10255
ufw allow 10256
ufw allow 10257
ufw allow 10259
ufw allow 2375
ufw allow 8472
ufw allow 4789
ufw allow 9099
ufw allow 9796
ufw allow 2379
ufw allow 2380
ufw allow 80
ufw allow 443
ufw allow 9443
#Enable the firewall (it stays enabled across reboots)
ufw enable
#Or take the lazy route and disable the firewall entirely; fine if you are just playing locally, but not recommended in production
ufw disable
3. Install keepalived and haproxy
3.1 hep-kubernetes-apiserver-lb-prd-01
The steps below are run on both hep-kubernetes-apiserver-lb-prd-01 and hep-kubernetes-apiserver-lb-prd-02; apart from the hostname and the keepalived state/priority they are identical. This section covers lb-prd-01.
#Set the hostname
hostnamectl set-hostname hep-kubernetes-apiserver-lb-prd-01
#Edit the hosts file and add every machine in the cluster, so the nodes can resolve each other's hostnames to IP addresses
vim /etc/hosts
#Add the following entries
192.168.31.41 hep-kubernetes-master-prd-01
192.168.31.42 hep-kubernetes-master-prd-02
192.168.31.43 hep-kubernetes-master-prd-03
192.168.31.44 hep-kubernetes-apiserver-lb-prd-01
192.168.31.45 hep-kubernetes-apiserver-lb-prd-02
192.168.31.46 hep-kubernetes-worker-prd-01
192.168.31.47 hep-kubernetes-worker-prd-02
192.168.31.48 hep-kubernetes-worker-prd-03
#Update the package index and upgrade
apt update
apt upgrade
#Install keepalived and haproxy
apt install -y keepalived haproxy
#Edit the keepalived configuration file /etc/keepalived/keepalived.conf
vim /etc/keepalived/keepalived.conf
#Set its contents as follows
! /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script check_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 3
    weight -2
    fall 10
    rise 2
}
vrrp_instance VI_1 {
    state MASTER
    interface ens18
    virtual_router_id 51
    priority 101
    authentication {
        auth_type PASS
        auth_pass XiaoHH
    }
    virtual_ipaddress {
        192.168.31.49
    }
    track_script {
        check_apiserver
    }
}
Tips:
- state: MASTER on the primary node, BACKUP on the standby
- interface: the name of the physical NIC; run ip a to find yours
- priority: 101 on the primary, 100 on the standby
- virtual_ipaddress: the virtual IP, 192.168.31.49 in my case. The K8S cluster machines occupy 192.168.31.41 through 192.168.31.48, so 192.168.31.49 is unused.
#keepalived also needs a health-check script at /etc/keepalived/check_apiserver.sh. The load-balanced port planned here is 6443; change it if yours differs. The content is identical on primary and standby
vim /etc/keepalived/check_apiserver.sh
#!/bin/sh
errorExit() {
    echo "*** $*" 1>&2
    exit 1
}
curl -sfk --max-time 2 https://localhost:6443/healthz -o /dev/null || errorExit "Error GET https://localhost:6443/healthz"
#Make the script executable
chmod +x /etc/keepalived/check_apiserver.sh
#Edit the haproxy configuration file
vim /etc/haproxy/haproxy.cfg
#Add two sections, frontend apiserver and backend apiserverbackend
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon
    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private
    # See: https://ssl-config.mozilla.org/#server=haproxy&server-version=2.0.3&config=intermediate
    ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
    ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
    ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets

defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

#---------------------------------------------------------------------
# apiserver frontend which proxys to the control plane nodes
#---------------------------------------------------------------------
frontend apiserver
    # Load-balancer port
    bind *:6443
    mode tcp
    option tcplog
    default_backend apiserverbackend

#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend apiserverbackend
    option httpchk
    http-check connect ssl
    http-check send meth GET uri /healthz
    http-check expect status 200
    mode tcp
    balance roundrobin
    # List of control-plane nodes; change the IP addresses to your own
    server hep-kubernetes-master-prd-01 192.168.31.41:6443 check verify none
    server hep-kubernetes-master-prd-02 192.168.31.42:6443 check verify none
    server hep-kubernetes-master-prd-03 192.168.31.43:6443 check verify none
#Start keepalived and haproxy, and enable them at boot
systemctl enable --now keepalived
systemctl enable --now haproxy
#Check their status
systemctl status keepalived
#Open the ports the cluster components communicate on (a list compiled from a quick Baidu search)
ufw allow 6443
ufw allow 10248
ufw allow 10250
ufw allow 10251
ufw allow 10252
ufw allow 10255
ufw allow 10256
ufw allow 10257
ufw allow 10259
ufw allow 2375
ufw allow 8472
ufw allow 4789
ufw allow 9099
ufw allow 9796
ufw allow 2379
ufw allow 2380
ufw allow 80
ufw allow 443
ufw allow 9443
#Enable the firewall (it stays enabled across reboots)
ufw enable
#Or take the lazy route and disable the firewall entirely; fine if you are just playing locally, but not recommended in production
ufw disable
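With both load balancers configured, you can sanity-check the VIP and the health path. A minimal sketch, using the ens18 interface and 192.168.31.49 VIP configured above (the health checks only succeed once a control plane exists, i.e. after section 5):
#The node currently holding the VIP should show it on ens18
ip addr show ens18 | grep 192.168.31.49
#The keepalived check script exits 0 once an apiserver answers behind haproxy
/etc/keepalived/check_apiserver.sh && echo "apiserver healthy"
#Or query the VIP directly, ignoring the self-signed certificate
curl -k https://192.168.31.49:6443/healthz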
3.2 hep-kubernetes-apiserver-lb-prd-02
#Set the hostname
hostnamectl set-hostname hep-kubernetes-apiserver-lb-prd-02
#Edit the hosts file and add every machine in the cluster, so the nodes can resolve each other's hostnames to IP addresses
vim /etc/hosts
#Add the following entries
192.168.31.41 hep-kubernetes-master-prd-01
192.168.31.42 hep-kubernetes-master-prd-02
192.168.31.43 hep-kubernetes-master-prd-03
192.168.31.44 hep-kubernetes-apiserver-lb-prd-01
192.168.31.45 hep-kubernetes-apiserver-lb-prd-02
192.168.31.46 hep-kubernetes-worker-prd-01
192.168.31.47 hep-kubernetes-worker-prd-02
192.168.31.48 hep-kubernetes-worker-prd-03
#Update the package index and upgrade
apt update
apt upgrade
#Install keepalived and haproxy
apt install -y keepalived haproxy
#Edit the keepalived configuration file /etc/keepalived/keepalived.conf
vim /etc/keepalived/keepalived.conf
#Set its contents as follows (note state BACKUP and priority 100 on this standby node)
! /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script check_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 3
    weight -2
    fall 10
    rise 2
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens18
    virtual_router_id 51
    priority 100
    authentication {
        auth_type PASS
        auth_pass XiaoHH
    }
    virtual_ipaddress {
        192.168.31.49
    }
    track_script {
        check_apiserver
    }
}
Tips:
- state: MASTER on the primary node, BACKUP on the standby
- interface: the name of the physical NIC; run ip a to find yours
- priority: 101 on the primary, 100 on the standby
- virtual_ipaddress: the virtual IP, 192.168.31.49 in my case. The K8S cluster machines occupy 192.168.31.41 through 192.168.31.48, so 192.168.31.49 is unused.
#keepalived also needs a health-check script at /etc/keepalived/check_apiserver.sh. The load-balanced port planned here is 6443; change it if yours differs. The content is identical on primary and standby
vim /etc/keepalived/check_apiserver.sh
#!/bin/sh
errorExit() {
    echo "*** $*" 1>&2
    exit 1
}
curl -sfk --max-time 2 https://localhost:6443/healthz -o /dev/null || errorExit "Error GET https://localhost:6443/healthz"
#Make the script executable
chmod +x /etc/keepalived/check_apiserver.sh
#Edit the haproxy configuration file
vim /etc/haproxy/haproxy.cfg
#Add two sections, frontend apiserver and backend apiserverbackend
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon
    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private
    # See: https://ssl-config.mozilla.org/#server=haproxy&server-version=2.0.3&config=intermediate
    ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
    ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
    ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets

defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

#---------------------------------------------------------------------
# apiserver frontend which proxys to the control plane nodes
#---------------------------------------------------------------------
frontend apiserver
    # Load-balancer port
    bind *:6443
    mode tcp
    option tcplog
    default_backend apiserverbackend

#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend apiserverbackend
    option httpchk
    http-check connect ssl
    http-check send meth GET uri /healthz
    http-check expect status 200
    mode tcp
    balance roundrobin
    # List of control-plane nodes; change the IP addresses to your own
    server hep-kubernetes-master-prd-01 192.168.31.41:6443 check verify none
    server hep-kubernetes-master-prd-02 192.168.31.42:6443 check verify none
    server hep-kubernetes-master-prd-03 192.168.31.43:6443 check verify none
#Start keepalived and haproxy, and enable them at boot
systemctl enable --now keepalived
systemctl enable --now haproxy
#Check their status
systemctl status keepalived
#Open the ports the cluster components communicate on (a list compiled from a quick Baidu search)
ufw allow 6443
ufw allow 10248
ufw allow 10250
ufw allow 10251
ufw allow 10252
ufw allow 10255
ufw allow 10256
ufw allow 10257
ufw allow 10259
ufw allow 2375
ufw allow 8472
ufw allow 4789
ufw allow 9099
ufw allow 9796
ufw allow 2379
ufw allow 2380
ufw allow 80
ufw allow 443
ufw allow 9443
#Enable the firewall (it stays enabled across reboots)
ufw enable
#Or take the lazy route and disable the firewall entirely; fine if you are just playing locally, but not recommended in production
ufw disable
If systemctl status keepalived reports the service as active (running) on both nodes, the load balancers started successfully.
4. Install kubelet, kubeadm, and kubectl
Run the following on each of hep-kubernetes-master-prd-01, hep-kubernetes-master-prd-02, and hep-kubernetes-master-prd-03.
apt update
#apt-transport-https may be a dummy package; if so, you can skip installing it
apt install -y apt-transport-https ca-certificates curl gpg
# Download the public signing key for the Kubernetes package repositories. If the /etc/apt/keyrings directory does not exist, create it before running curl (see the commented command below)
# sudo mkdir -p -m 755 /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
#Add the Kubernetes apt repository. This overwrites any existing configuration in /etc/apt/sources.list.d/kubernetes.list
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
#Refresh the apt package index, then install kubelet, kubeadm, and kubectl
apt update
apt install -y kubelet kubeadm kubectl
#Pin their versions
apt-mark hold kubelet kubeadm kubectl
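A quick way to confirm the three binaries installed and agree on versions; a minimal check sketch:
#All three should report a 1.30.x version
kubeadm version -o short
kubelet --version
kubectl version --client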
5. Initialize the cluster
Pull the images on each of the three masters hep-kubernetes-master-prd-01, hep-kubernetes-master-prd-02, and hep-kubernetes-master-prd-03.
#Run on hep-kubernetes-master-prd-01, hep-kubernetes-master-prd-02, and hep-kubernetes-master-prd-03
kubeadm config images pull --image-repository=registry.aliyuncs.com/google_containers --kubernetes-version=v1.30.4
Run the initialization command below on any one master; I chose hep-kubernetes-master-prd-01.
kubeadm init --apiserver-advertise-address=192.168.31.41 --control-plane-endpoint="192.168.31.49:6443" --upload-certs --image-repository=registry.aliyuncs.com/google_containers --kubernetes-version=v1.30.4 --service-cidr=10.96.0.0/12 --pod-network-cidr=10.244.0.0/16 --cri-socket=unix:///run/containerd/containerd.sock
Flag reference:
- --apiserver-advertise-address: the IP of the master running this command; change it to your own
- --control-plane-endpoint: the virtual IP and port of the load-balanced apiserver; change them to your own
- --upload-certs: uploads the certificates shared by all control-plane instances to the cluster
- --image-repository: the official images live on Google's registry, unreachable from mainland China, so we use the Alibaba Cloud mirror
- --kubernetes-version: the Kubernetes version
- --service-cidr: the Service network range
- --pod-network-cidr: the pod network range
- --cri-socket: selects containerd as the container runtime
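Before running it for real, kubeadm can rehearse the whole thing without touching the node; a quick sketch using the same flags plus the standard --dry-run option:
#Optional: preview what kubeadm init would do without changing anything
kubeadm init --apiserver-advertise-address=192.168.31.41 --control-plane-endpoint="192.168.31.49:6443" --upload-certs --image-repository=registry.aliyuncs.com/google_containers --kubernetes-version=v1.30.4 --service-cidr=10.96.0.0/12 --pod-network-cidr=10.244.0.0/16 --cri-socket=unix:///run/containerd/containerd.sock --dry-run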
root@hep-kubernetes-master-prd-01:/# kubeadm init --apiserver-advertise-address=192.168.31.41 --control-plane-endpoint="192.168.31.49:6443" --upload-certs --image-repository=registry.aliyuncs.com/google_containers --kubernetes-version=v1.30.4 --service-cidr=10.96.0.0/12 --pod-network-cidr=10.244.0.0/16 --cri-socket=unix:///run/containerd/containerd.sock
[init] Using Kubernetes version: v1.30.4
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [hep-kubernetes-master-prd-01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.31.41 192.168.31.49]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [hep-kubernetes-master-prd-01 localhost] and IPs [192.168.31.41 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [hep-kubernetes-master-prd-01 localhost] and IPs [192.168.31.41 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.002423766s
[api-check] Waiting for a healthy API server. This can take up to 4m0s
[api-check] The API server is healthy after 8.031481885s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
32a341ee000f200b411d5cdd0ddacc2d1813e968119b66795eb37eb8257f3e43
[mark-control-plane] Marking the node hep-kubernetes-master-prd-01 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node hep-kubernetes-master-prd-01 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: 2vqrer.gd62n98hnn8sllft
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
kubeadm join 192.168.31.49:6443 --token 2vqrer.gd62n98hnn8sllft \
--discovery-token-ca-cert-hash sha256:bcc83fdffbb24d51576c2e7a37dd0c07d0068b3eb43327f254f326426f961fbe \
--control-plane --certificate-key 32a341ee000f200b411d5cdd0ddacc2d1813e968119b66795eb37eb8257f3e43
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.31.49:6443 --token 2vqrer.gd62n98hnn8sllft \
--discovery-token-ca-cert-hash sha256:bcc83fdffbb24d51576c2e7a37dd0c07d0068b3eb43327f254f326426f961fbe
root@hep-kubernetes-master-prd-01:/#
If the init fails, run kubeadm reset, adjust the flags, and try again; if it succeeds, kubectl get nodes -o wide shows the node information.
#List the nodes
kubectl get nodes -o wide
#Reset after a failed attempt
kubeadm reset
Run the following locally on the master where the initialization succeeded:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
The remaining masters join the control plane with the command below; note the --cri-socket flag appended at the end. Your token and certificate key will differ from mine: copy the control-plane and worker join commands printed by your own successful init output above.
kubeadm join 192.168.31.49:6443 --token 2vqrer.gd62n98hnn8sllft \
--discovery-token-ca-cert-hash sha256:bcc83fdffbb24d51576c2e26f961fbe7a37dd0c07d0068b3eb43327f254f3264 \
--control-plane --certificate-key 32a341ee000f200b411d5cdd0ddb37eb8257f3e43acc2d1813e968119b66795e \
--cri-socket=unix:///run/containerd/containerd.sock
#If this errors out, the line continuations may be at fault; the same command works written on a single line
kubeadm join 192.168.31.49:6443 --token 2vqrer.gd62n98hnn8sllft --discovery-token-ca-cert-hash sha256:bcc83fdffbb24d51576c2e26f961fbe7a37dd0c07d0068b3eb43327f254f3264 --control-plane --certificate-key 32a341ee000f200b411d5cdd0ddb37eb8257f3e43acc2d1813e968119b66795e --cri-socket=unix:///run/containerd/containerd.sock
After initialization succeeds, you can inspect the pods:
kubectl get pods --all-namespaces
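Note that the coredns pods will sit in Pending at this point: they need a pod network, which we only install in section 7. A quick check, for reassurance:
#coredns stays Pending until the CNI plugin (Calico, section 7) is installed
kubectl get pods -n kube-system -o wide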
6. Join the worker nodes
6.1 hep-kubernetes-worker-prd-01
#Set the hostname
hostnamectl set-hostname hep-kubernetes-worker-prd-01
#Edit the hosts file and add every machine in the cluster, so the nodes can resolve each other's hostnames to IP addresses
vim /etc/hosts
#Add the following entries
192.168.31.41 hep-kubernetes-master-prd-01
192.168.31.42 hep-kubernetes-master-prd-02
192.168.31.43 hep-kubernetes-master-prd-03
192.168.31.44 hep-kubernetes-apiserver-lb-prd-01
192.168.31.45 hep-kubernetes-apiserver-lb-prd-02
192.168.31.46 hep-kubernetes-worker-prd-01
192.168.31.47 hep-kubernetes-worker-prd-02
192.168.31.48 hep-kubernetes-worker-prd-03
#Set the timezone to Asia/Shanghai
timedatectl set-timezone Asia/Shanghai
#Install a time-synchronization tool
apt install -y ntpdate
#Synchronize the clock
ntpdate ntp.aliyun.com
#Disable swap for the current session, then edit /etc/fstab and comment out the /swap.img line so swap stays off after a reboot (kubeadm's preflight check fails while swap is on)
swapoff -a
vim /etc/fstab
# Set the required sysctl parameter; it persists across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
EOF
# Apply the sysctl parameters without rebooting
sysctl --system
#Verify that net.ipv4.ip_forward is set to 1
sysctl net.ipv4.ip_forward
#The container runtime is containerd 1.7.20; download the bundle with this command
curl -LO https://github.com/containerd/containerd/releases/download/v1.7.20/cri-containerd-cni-1.7.20-linux-amd64.tar.gz
#Alternatively, as I do, copy the tarball over scp from another machine on the LAN (I manage mine through webmin)
scp root@192.168.31.2:/usr/software/cri-containerd-cni-1.7.20-linux-amd64.tar.gz /usr/software/
#Extract the archive into the root directory
tar -zxvf cri-containerd-cni-1.7.20-linux-amd64.tar.gz -C /
#Check the version
containerd --version
#No configuration file ships by default, so create the directory and generate one
mkdir /etc/containerd
#Generate the default configuration
containerd config default | sudo tee /etc/containerd/config.toml
vim /etc/containerd/config.toml
#The default pause image lives on Google's registry, which is unreachable from mainland China, so in /etc/containerd/config.toml point the pause image at registry.aliyuncs.com/google_containers/pause:3.9
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"
#Search for plugins."io.containerd.grpc.v1.cri".registry.mirrors and add a few Docker Hub mirrors. Note the mirror key is the registry being mirrored (docker.io); the endpoints are the mirror URLs
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
    endpoint = ["https://docker.m.daocloud.io", "https://noohub.ru", "https://huecker.io", "https://dockerhub.timeweb.cloud", "https://docker.rainbond.cc"]
#Then switch to the systemd cgroup driver (cgroups enforce the resource limits given to processes): under [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options], set SystemdCgroup to true
SystemdCgroup = true
#Start containerd and enable it at boot
systemctl enable --now containerd
#Open the ports the cluster components communicate on (a list compiled from a quick Baidu search)
ufw allow 6443
ufw allow 10248
ufw allow 10250
ufw allow 10251
ufw allow 10252
ufw allow 10255
ufw allow 10256
ufw allow 10257
ufw allow 10259
ufw allow 2375
ufw allow 8472
ufw allow 4789
ufw allow 9099
ufw allow 9796
ufw allow 2379
ufw allow 2380
ufw allow 80
ufw allow 443
ufw allow 9443
#Enable the firewall (it stays enabled across reboots)
ufw enable
#Or take the lazy route and disable the firewall entirely; fine if you are just playing locally, but not recommended in production
ufw disable
apt update
#apt-transport-https may be a dummy package; if so, you can skip installing it
apt install -y apt-transport-https ca-certificates curl gpg
# Download the public signing key for the Kubernetes package repositories. If the /etc/apt/keyrings directory does not exist, create it before running curl (see the commented command below)
# sudo mkdir -p -m 755 /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
#Add the Kubernetes apt repository. This overwrites any existing configuration in /etc/apt/sources.list.d/kubernetes.list
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
#Refresh the apt package index, then install kubelet, kubeadm, and kubectl
apt update
apt install -y kubelet kubeadm kubectl
#Pin their versions
apt-mark hold kubelet kubeadm kubectl
#Pull the container images Kubernetes v1.30.4 needs from the registry.aliyuncs.com/google_containers mirror
kubeadm config images pull --image-repository=registry.aliyuncs.com/google_containers --kubernetes-version=v1.30.4
#Join the k8s cluster (use the worker join command printed by your own kubeadm init)
kubeadm join 192.168.31.49:6443 --token 2vqrer.gd62n98hnn8sllft --discovery-token-ca-cert-hash sha256:bcc83fdffbb24d51576c2e26f961fbe7a37dd0c07d0068b3eb43327f254f3264 --cri-socket=unix:///run/containerd/containerd.sock
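Back on a master you can confirm the worker registered; it will show NotReady until Calico is installed in section 7. A quick check:
#Run on any master (or wherever your kubeconfig lives)
kubectl get nodes -o wide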
6.2 hep-kubernetes-worker-prd-02
#Set the hostname
hostnamectl set-hostname hep-kubernetes-worker-prd-02
#Edit the hosts file and add every machine in the cluster, so the nodes can resolve each other's hostnames to IP addresses
vim /etc/hosts
#Add the following entries
192.168.31.41 hep-kubernetes-master-prd-01
192.168.31.42 hep-kubernetes-master-prd-02
192.168.31.43 hep-kubernetes-master-prd-03
192.168.31.44 hep-kubernetes-apiserver-lb-prd-01
192.168.31.45 hep-kubernetes-apiserver-lb-prd-02
192.168.31.46 hep-kubernetes-worker-prd-01
192.168.31.47 hep-kubernetes-worker-prd-02
192.168.31.48 hep-kubernetes-worker-prd-03
#Set the timezone to Asia/Shanghai
timedatectl set-timezone Asia/Shanghai
#Install a time-synchronization tool
apt install -y ntpdate
#Synchronize the clock
ntpdate ntp.aliyun.com
#Disable swap for the current session, then edit /etc/fstab and comment out the /swap.img line so swap stays off after a reboot (kubeadm's preflight check fails while swap is on)
swapoff -a
vim /etc/fstab
# Set the required sysctl parameter; it persists across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
EOF
# Apply the sysctl parameters without rebooting
sysctl --system
#Verify that net.ipv4.ip_forward is set to 1
sysctl net.ipv4.ip_forward
#The container runtime is containerd 1.7.20; download the bundle with this command
curl -LO https://github.com/containerd/containerd/releases/download/v1.7.20/cri-containerd-cni-1.7.20-linux-amd64.tar.gz
#Alternatively, as I do, copy the tarball over scp from another machine on the LAN (I manage mine through webmin)
scp root@192.168.31.2:/usr/software/cri-containerd-cni-1.7.20-linux-amd64.tar.gz /usr/software/
#Extract the archive into the root directory
tar -zxvf cri-containerd-cni-1.7.20-linux-amd64.tar.gz -C /
#Check the version
containerd --version
#No configuration file ships by default, so create the directory and generate one
mkdir /etc/containerd
#Generate the default configuration
containerd config default | sudo tee /etc/containerd/config.toml
vim /etc/containerd/config.toml
#The default pause image lives on Google's registry, which is unreachable from mainland China, so in /etc/containerd/config.toml point the pause image at registry.aliyuncs.com/google_containers/pause:3.9
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"
#Search for plugins."io.containerd.grpc.v1.cri".registry.mirrors and add a few Docker Hub mirrors. Note the mirror key is the registry being mirrored (docker.io); the endpoints are the mirror URLs
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
    endpoint = ["https://docker.m.daocloud.io", "https://noohub.ru", "https://huecker.io", "https://dockerhub.timeweb.cloud", "https://docker.rainbond.cc"]
#Then switch to the systemd cgroup driver (cgroups enforce the resource limits given to processes): under [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options], set SystemdCgroup to true
SystemdCgroup = true
#Start containerd and enable it at boot
systemctl enable --now containerd
#Open the ports the cluster components communicate on (a list compiled from a quick Baidu search)
ufw allow 6443
ufw allow 10248
ufw allow 10250
ufw allow 10251
ufw allow 10252
ufw allow 10255
ufw allow 10256
ufw allow 10257
ufw allow 10259
ufw allow 2375
ufw allow 8472
ufw allow 4789
ufw allow 9099
ufw allow 9796
ufw allow 2379
ufw allow 2380
ufw allow 80
ufw allow 443
ufw allow 9443
#Enable the firewall (it stays enabled across reboots)
ufw enable
#Or take the lazy route and disable the firewall entirely; fine if you are just playing locally, but not recommended in production
ufw disable
apt update
#apt-transport-https may be a dummy package; if so, you can skip installing it
apt install -y apt-transport-https ca-certificates curl gpg
# Download the public signing key for the Kubernetes package repositories. If the /etc/apt/keyrings directory does not exist, create it before running curl (see the commented command below)
# sudo mkdir -p -m 755 /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
#Add the Kubernetes apt repository. This overwrites any existing configuration in /etc/apt/sources.list.d/kubernetes.list
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
#Refresh the apt package index, then install kubelet, kubeadm, and kubectl
apt update
apt install -y kubelet kubeadm kubectl
#Pin their versions
apt-mark hold kubelet kubeadm kubectl
#Pull the container images Kubernetes v1.30.4 needs from the registry.aliyuncs.com/google_containers mirror
kubeadm config images pull --image-repository=registry.aliyuncs.com/google_containers --kubernetes-version=v1.30.4
#Join the k8s cluster (use the worker join command printed by your own kubeadm init)
kubeadm join 192.168.31.49:6443 --token 75zmv3.h2hus7ym9b5lhsym \
    --discovery-token-ca-cert-hash sha256:cb65e5d203864392463f630741beea2be3f0453cbf119536c0454560d754939d \
    --cri-socket=unix:///run/containerd/containerd.sock
6.3 hep-kubernetes-worker-prd-03
#Set the hostname
hostnamectl set-hostname hep-kubernetes-worker-prd-03
#Edit the hosts file and add every machine in the cluster, so the nodes can resolve each other's hostnames to IP addresses
vim /etc/hosts
#Add the following entries
192.168.31.41 hep-kubernetes-master-prd-01
192.168.31.42 hep-kubernetes-master-prd-02
192.168.31.43 hep-kubernetes-master-prd-03
192.168.31.44 hep-kubernetes-apiserver-lb-prd-01
192.168.31.45 hep-kubernetes-apiserver-lb-prd-02
192.168.31.46 hep-kubernetes-worker-prd-01
192.168.31.47 hep-kubernetes-worker-prd-02
192.168.31.48 hep-kubernetes-worker-prd-03
#Set the timezone to Asia/Shanghai
timedatectl set-timezone Asia/Shanghai
#Install a time-synchronization tool
apt install -y ntpdate
#Synchronize the clock
ntpdate ntp.aliyun.com
#Disable swap for the current session, then edit /etc/fstab and comment out the /swap.img line so swap stays off after a reboot (kubeadm's preflight check fails while swap is on)
swapoff -a
vim /etc/fstab
# Set the required sysctl parameter; it persists across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
EOF
# Apply the sysctl parameters without rebooting
sysctl --system
#Verify that net.ipv4.ip_forward is set to 1
sysctl net.ipv4.ip_forward
#The container runtime is containerd 1.7.20; download the bundle with this command
curl -LO https://github.com/containerd/containerd/releases/download/v1.7.20/cri-containerd-cni-1.7.20-linux-amd64.tar.gz
#Alternatively, as I do, copy the tarball over scp from another machine on the LAN (I manage mine through webmin)
scp root@192.168.31.2:/usr/software/cri-containerd-cni-1.7.20-linux-amd64.tar.gz /usr/software/
#Extract the archive into the root directory
tar -zxvf cri-containerd-cni-1.7.20-linux-amd64.tar.gz -C /
#Check the version
containerd --version
#No configuration file ships by default, so create the directory and generate one
mkdir /etc/containerd
#Generate the default configuration
containerd config default | sudo tee /etc/containerd/config.toml
vim /etc/containerd/config.toml
#The default pause image lives on Google's registry, which is unreachable from mainland China, so in /etc/containerd/config.toml point the pause image at registry.aliyuncs.com/google_containers/pause:3.9
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"
#Search for plugins."io.containerd.grpc.v1.cri".registry.mirrors and add a few Docker Hub mirrors. Note the mirror key is the registry being mirrored (docker.io); the endpoints are the mirror URLs
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
    endpoint = ["https://docker.m.daocloud.io", "https://noohub.ru", "https://huecker.io", "https://dockerhub.timeweb.cloud", "https://docker.rainbond.cc"]
#Then switch to the systemd cgroup driver (cgroups enforce the resource limits given to processes): under [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options], set SystemdCgroup to true
SystemdCgroup = true
#Start containerd and enable it at boot
systemctl enable --now containerd
#Open the ports the cluster components communicate on (a list compiled from a quick Baidu search)
ufw allow 6443
ufw allow 10248
ufw allow 10250
ufw allow 10251
ufw allow 10252
ufw allow 10255
ufw allow 10256
ufw allow 10257
ufw allow 10259
ufw allow 2375
ufw allow 8472
ufw allow 4789
ufw allow 9099
ufw allow 9796
ufw allow 2379
ufw allow 2380
ufw allow 80
ufw allow 443
ufw allow 9443
#Enable the firewall (it stays enabled across reboots)
ufw enable
#Or take the lazy route and disable the firewall entirely; fine if you are just playing locally, but not recommended in production
ufw disable
apt update
#apt-transport-https may be a dummy package; if so, you can skip installing it
apt install -y apt-transport-https ca-certificates curl gpg
# Download the public signing key for the Kubernetes package repositories. If the /etc/apt/keyrings directory does not exist, create it before running curl (see the commented command below)
# sudo mkdir -p -m 755 /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
#Add the Kubernetes apt repository. This overwrites any existing configuration in /etc/apt/sources.list.d/kubernetes.list
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
#Refresh the apt package index, then install kubelet, kubeadm, and kubectl
apt update
apt install -y kubelet kubeadm kubectl
#Pin their versions
apt-mark hold kubelet kubeadm kubectl
#Pull the container images Kubernetes v1.30.4 needs from the registry.aliyuncs.com/google_containers mirror
kubeadm config images pull --image-repository=registry.aliyuncs.com/google_containers --kubernetes-version=v1.30.4
#Join the k8s cluster (use the worker join command printed by your own kubeadm init)
kubeadm join 192.168.31.49:6443 --token 75zmv3.h2hus7ym9b5lhsym \
    --discovery-token-ca-cert-hash sha256:cb65e5d203864392463f630741beea2be3f0453cbf119536c0454560d754939d \
    --cri-socket=unix:///run/containerd/containerd.sock
7. Install the Calico network plugin
Per the guide: if your cluster has 50 nodes or fewer, install Calico with this command.
kubectl create -f https://raw.githubusercontent.com/xiaohh-me/kubernetes-yaml/main/network/calico/calico-v3.28.1.yaml
If you have more than 50 nodes, install Calico with this one instead:
kubectl create -f https://raw.githubusercontent.com/xiaohh-me/kubernetes-yaml/main/network/calico/calico-typha-v3.28.1.yaml
My machine still could not reach that URL directly, so do it by hand:
#Download the file with wget, or open the URL and save the contents; either way you end up with a local calico-v3.28.1.yaml
wget https://raw.githubusercontent.com/xiaohh-me/kubernetes-yaml/main/network/calico/calico-v3.28.1.yaml
#With calico-v3.28.1.yaml in the current directory, install the Calico network plugin
kubectl create -f calico-v3.28.1.yaml
#Check the pods
kubectl get pod -A
#Check the nodes
kubectl get nodes -o wide
If the install did not succeed, the Calico pods will show a status other than Running. For fixes, see the troubleshooting section (9.2).
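Rather than polling by hand, you can block until everything settles; a small sketch using kubectl's built-in wait:
#Block until every node reports Ready (up to 5 minutes)
kubectl wait --for=condition=Ready nodes --all --timeout=300s
#Block until the calico-node pods are Ready
kubectl wait --for=condition=Ready pods -l k8s-app=calico-node -n kube-system --timeout=300s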
8. My first K8S application
Create a file nginx-deployment.yaml with the following content
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx-container
        image: docker.m.daocloud.io/library/nginx:latest
        ports:
        - containerPort: 80
#kubectl apply -f nginx-deployment.yaml reads the configuration in nginx-deployment.yaml and creates a Deployment named nginx-deployment, which in turn creates 1 pod replica running an Nginx container and keeps it in the desired state. If the Deployment already exists, kubectl apply updates it to match the new configuration
kubectl apply -f nginx-deployment.yaml
Create a file nginx-service.yaml with the following content
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
    nodePort: 30080
#kubectl apply -f nginx-service.yaml creates (or updates) a NodePort Service named nginx-service that selects the pods labeled app: nginx, maps Service port 80 to container port 80, and exposes node port 30080 on every node
kubectl apply -f nginx-service.yaml
#Open the NodePort in the firewall
ufw allow 30080
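You can verify the NodePort answers before opening a browser; any node IP works, since kube-proxy forwards NodePorts cluster-wide. A quick sketch:
#Fetch the start of the nginx welcome page through the NodePort
curl -s http://192.168.31.41:30080 | head -n 5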
Our hep-kubernetes-master-prd-01 machine has the IP 192.168.31.41, so entering the master's IP plus the exposed port 30080 in any browser on the LAN brings up the Nginx welcome page served from inside the K8S cluster.
And with that, our first k8s application is happily deployed; from here we can keep pushing and build bigger, stronger services.
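Before moving on, one trick worth knowing: the Deployment above runs a single replica, and scaling it out is a single command. A quick sketch:
#Scale the nginx Deployment to 3 replicas and watch where they land
kubectl scale deployment nginx-deployment --replicas=3
kubectl get pods -o wide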
9. Final notes
9.1 Shutdown and restart order
9.1.1 A sensible shutdown sequence
Start with the worker nodes (hep-kubernetes-worker-prd-01, hep-kubernetes-worker-prd-02, hep-kubernetes-worker-prd-03): mark each one unschedulable and safely evict its pods with kubectl drain, then stop the kubelet and containerd on it. Note that in a kubeadm cluster kube-proxy runs as a DaemonSet rather than a systemd service, so there is no kube-proxy unit to stop.
kubectl drain hep-kubernetes-worker-prd-01 --ignore-daemonsets
systemctl stop kubelet
systemctl stop containerd
Next the master nodes (hep-kubernetes-master-prd-01, hep-kubernetes-master-prd-02, hep-kubernetes-master-prd-03). In a kubeadm cluster, kube-apiserver, kube-scheduler, and kube-controller-manager run as static pods managed by the kubelet and have no systemd units of their own, so stopping kubelet and containerd stops them too.
kubectl drain hep-kubernetes-master-prd-01 --ignore-daemonsets
systemctl stop kubelet
systemctl stop containerd
Finally, on the load-balancer nodes (hep-kubernetes-apiserver-lb-prd-01 and hep-kubernetes-apiserver-lb-prd-02), stop haproxy and keepalived. Do this last: kubectl reaches the apiserver through the VIP, so the drain commands above need the load balancer alive.
systemctl stop haproxy
systemctl stop keepalived
9.1.2 A sensible restart sequence
Bring things back in the reverse order. First the load-balancer nodes (hep-kubernetes-apiserver-lb-prd-01 and hep-kubernetes-apiserver-lb-prd-02), so the apiserver VIP is reachable again:
systemctl start haproxy
systemctl start keepalived
Next the master nodes (hep-kubernetes-master-prd-01, hep-kubernetes-master-prd-02, hep-kubernetes-master-prd-03): starting containerd and kubelet brings the control-plane static pods back up.
systemctl start containerd
systemctl start kubelet
Finally the three worker nodes (hep-kubernetes-worker-prd-01, hep-kubernetes-worker-prd-02, hep-kubernetes-worker-prd-03):
systemctl start containerd
systemctl start kubelet
#Wait a while (check the logs or the service status to confirm everything is fully up), then mark each drained node schedulable again with kubectl uncordon
kubectl uncordon hep-kubernetes-worker-prd-01
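A short verification pass afterwards; a minimal sketch using the node names from this cluster:
#All nodes should return to Ready
kubectl get nodes
#And every system pod should come back Running
kubectl get pods -A
#Re-enable scheduling on every node that was drained (uncordon is idempotent)
for node in hep-kubernetes-master-prd-01 hep-kubernetes-worker-prd-01 hep-kubernetes-worker-prd-02 hep-kubernetes-worker-prd-03; do
  kubectl uncordon "${node}"
done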
9.2 Troubleshooting
#If a master or worker errored while joining the cluster, reset it before retrying
#Delete the kubeconfig
rm -rf $HOME/.kube/config
#Reset the node
kubeadm reset
#If the first Calico install failed, the following cleans Calico out completely so you can install it again.
kubectl delete crd bgpconfigurations.crd.projectcalico.org
kubectl delete crd bgppeers.crd.projectcalico.org
kubectl delete crd blockaffinities.crd.projectcalico.org
kubectl delete crd caliconodestatuses.crd.projectcalico.org
kubectl delete crd clusterinformations.crd.projectcalico.org
kubectl delete crd felixconfigurations.crd.projectcalico.org
kubectl delete crd globalnetworkpolicies.crd.projectcalico.org
kubectl delete crd globalnetworksets.crd.projectcalico.org
kubectl delete crd hostendpoints.crd.projectcalico.org
kubectl delete crd ipamblocks.crd.projectcalico.org
kubectl delete crd ipamconfigs.crd.projectcalico.org
kubectl delete crd ipamhandles.crd.projectcalico.org
kubectl delete crd ippools.crd.projectcalico.org
kubectl delete crd kubecontrollersconfigurations.crd.projectcalico.org
kubectl delete crd networkpolicies.crd.projectcalico.org
kubectl delete crd networksets.crd.projectcalico.org
kubectl delete deployment calico-kube-controllers -n kube-system
kubectl delete daemonset calico-node -n kube-system
#删除所有calico的pod
kubectl delete pod -l k8s-app=calico-node -n kube-system
rm -rf /etc/calico/
kubectl -n kube-system delete poddisruptionbudgets.policy calico-kube-controllers
kubectl -n kube-system delete poddisruptionbudgets.policy calico-node
kubectl -n kube-system delete serviceaccount calico-kube-controllers
kubectl -n kube-system delete serviceaccount calico-node
kubectl -n kube-system delete serviceaccount calico-cni-plugin
kubectl -n kube-system delete configmap calico-config
kubectl -n kube-system delete clusterrole calico-kube-controllers
kubectl -n kube-system delete clusterrole calico-node
kubectl -n kube-system delete clusterrole calico-cni-plugin
kubectl -n kube-system delete clusterrolebinding calico-kube-controllers
kubectl -n kube-system delete clusterrolebinding calico-node
kubectl -n kube-system delete clusterrolebinding calico-cni-plugin
kubectl delete crd bgpfilters.crd.projectcalico.org
kubectl delete crd ipreservations.crd.projectcalico.org
#If the Nginx deployment failed, inspect it, delete it, and redeploy
kubectl describe pod nginx-deployment-67b449cd77-qwvw2
kubectl describe deployment nginx-deployment
kubectl delete deployment -l app=nginx -n default
kubectl apply -f nginx-deployment.yaml
#A set of working Docker registry mirrors (this JSON belongs in /etc/docker/daemon.json on hosts running Docker)
{
  "registry-mirrors": [
    "https://docker.m.daocloud.io",
    "https://noohub.ru",
    "https://huecker.io",
    "https://dockerhub.timeweb.cloud",
    "https://docker.rainbond.cc"
  ]
}
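To apply the mirror list on a Docker host, a short sketch following standard Docker practice (note this file configures Docker, not containerd; the cluster nodes themselves use the containerd mirror config from section 2):
#After saving the JSON above as /etc/docker/daemon.json, reload Docker
sudo systemctl restart docker
#Confirm the mirrors are active
docker info | grep -A 5 'Registry Mirrors'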
#Test-pull an image with containerd directly
ctr -n k8s.io images pull docker.m.daocloud.io/library/nginx:latest
Reference:
Installing a multi-master, highly available Kubernetes cluster on Ubuntu 24.04: https://blog.csdn.net/m0_51510236/article/details/141671652