Resolving Certificate Expiry in a Kubernetes HA Cluster

Environment

  • Kubernetes version: 1.13.5
  • kubeadm version: 1.14.1
  • Kubernetes HA cluster with 3 master nodes

Summary

The summary first, for quick reference.

  1. Back up the /etc/kubernetes directory;
  2. Renew the certificates: kubeadm alpha certs renew all --config=kubernetes/kubeadm-init.yaml;
  3. Regenerate the kubeconfig files (delete them first, then regenerate): kubeadm init phase kubeconfig all --config=kubernetes/kubeadm-init.yaml;
  4. The two commands above live under different kubeadm subcommands depending on the Kubernetes version; see the details below;
  5. Copy the kubectl config: cd ~/.kube && cp config{,.bak} && cp /etc/kubernetes/admin.conf ~/.kube/;
  6. Run all of the above on every master node individually; do not simply copy the certificates or config files between nodes.
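
Assuming the kubeadm init config sits at kubernetes/kubeadm-init.yaml (as on this cluster), the six steps above can be sketched as a single script. DRY_RUN defaults to 1 here so it only prints the commands; set DRY_RUN=0 to actually execute them:

```shell
#!/usr/bin/env bash
# Sketch of the renewal flow for Kubernetes 1.13+; run on EVERY master node.
# DRY_RUN=1 (the default) only prints the commands; DRY_RUN=0 executes them.
set -u
DRY_RUN="${DRY_RUN:-1}"
INIT_CFG="kubernetes/kubeadm-init.yaml"   # assumption: path to your init config

run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# 1. back up /etc/kubernetes with a timestamp
run cp -a /etc/kubernetes "etc.kubernetes.backup.$(date +%Y%m%d%H%M%S)"

# 2. renew all leaf certificates (the CAs are left alone)
run kubeadm alpha certs renew all --config="$INIT_CFG"

# 3. delete the old kubeconfig files, then regenerate them
run rm -f /etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf \
          /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf
run kubeadm init phase kubeconfig all --config="$INIT_CFG"

# 4. refresh the kubectl config from the new admin.conf
run cp ~/.kube/config ~/.kube/config.bak
run cp /etc/kubernetes/admin.conf ~/.kube/config
```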

Kubernetes 1.13 and Later

Locating the Problem

Near the end of the workday, a colleague reported that the Rancher Server UI would not open; I checked and indeed could not connect. Strange: Rancher Server runs as a cluster, fronted by three nginx ingress controllers with an AWS NLB in front of those; it should not simply become unreachable.

I then logged on to the Kubernetes masters and ran kubectl, and there was the problem: it could not connect to Kubernetes. All three master nodes were down. docker ps showed etcd and kube-apiserver stuck in a restart loop, never staying up.

The etcd logs, via docker logs:

2020-04-20 13:37:31.471260 W | rafthttp: health check for peer 4898f4a4483db9e3 could not connect: dial tcp 172.17.21.60:2380: getsockopt: connection refused
2020-04-20 13:37:32.273336 I | raft: 923b8cf0a7704360 is starting a new election at term 33044
2020-04-20 13:37:32.273355 I | raft: 923b8cf0a7704360 became candidate at term 33045
2020-04-20 13:37:32.273363 I | raft: 923b8cf0a7704360 received MsgVoteResp from 923b8cf0a7704360 at term 33045
2020-04-20 13:37:32.273371 I | raft: 923b8cf0a7704360 [logterm: 11084, index: 167884379] sent MsgVote request to 4898f4a4483db9e3 at term 33045
2020-04-20 13:37:33.973358 I | raft: 923b8cf0a7704360 is starting a new election at term 33045
2020-04-20 13:37:33.973388 I | raft: 923b8cf0a7704360 became candidate at term 33046
2020-04-20 13:37:33.973395 I | raft: 923b8cf0a7704360 received MsgVoteResp from 923b8cf0a7704360 at term 33046
2020-04-20 13:37:33.973404 I | raft: 923b8cf0a7704360 [logterm: 11084, index: 167884379] sent MsgVote request to 4898f4a4483db9e3 at term 33046
2020-04-20 13:37:34.420803 E | etcdserver: publish error: etcdserver: request timed out
2020-04-20 13:37:35.573354 I | raft: 923b8cf0a7704360 is starting a new election at term 33046
2020-04-20 13:37:35.573381 I | raft: 923b8cf0a7704360 became candidate at term 33047
2020-04-20 13:37:35.573389 I | raft: 923b8cf0a7704360 received MsgVoteResp from 923b8cf0a7704360 at term 33047
2020-04-20 13:37:35.573397 I | raft: 923b8cf0a7704360 [logterm: 11084, index: 167884379] sent MsgVote request to 4898f4a4483db9e3 at term 33047

The kube-apiserver logs (again via docker logs) showed certificate-expiry errors:

E0420 13:49:46.438819       1 authentication.go:65] Unable to authenticate the request due to an error: [x509: certificate has expired or is not yet valid, x509: certificate has expired or is not yet valid]
E0420 13:49:46.444011 1 authentication.go:65] Unable to authenticate the request due to an error: [x509: certificate has expired or is not yet valid, x509: certificate has expired or is not yet valid]
E0420 13:49:46.448161 1 authentication.go:65] Unable to authenticate the request due to an error: [x509: certificate has expired or is not yet valid, x509: certificate has expired or is not yet valid]

The kubelet logs in /var/log/messages showed certificate-expiry errors as well:

Apr 20 18:37:33 k8s01 systemd: Unit kubelet.service entered failed state.
Apr 20 18:37:33 k8s01 systemd: kubelet.service failed.
Apr 20 18:37:40 k8s01 dhclient[3397]: XMT: Solicit on eth0, interval 131610ms.
Apr 20 18:37:43 k8s01 systemd: kubelet.service holdoff time over, scheduling restart.
Apr 20 18:37:43 k8s01 systemd: Started kubelet: The Kubernetes Node Agent.
Apr 20 18:37:43 k8s01 systemd: Starting kubelet: The Kubernetes Node Agent...
Apr 20 18:37:43 k8s01 kubelet: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Apr 20 18:37:43 k8s01 kubelet: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Apr 20 18:37:43 k8s01 systemd: Started Kubernetes systemd probe.
Apr 20 18:37:43 k8s01 systemd: Starting Kubernetes systemd probe.
Apr 20 18:37:43 k8s01 kubelet: I0420 18:37:43.785502 18338 server.go:407] Version: v1.13.5
Apr 20 18:37:43 k8s01 kubelet: I0420 18:37:43.785637 18338 plugins.go:103] No cloud provider specified.
Apr 20 18:37:43 k8s01 kubelet: E0420 18:37:43.787102 18338 bootstrap.go:209] Part of the existing bootstrap client certificate is expired: 2020-04-19 16:11:32 +0000 UTC
Apr 20 18:37:43 k8s01 kubelet: F0420 18:37:43.787209 18338 server.go:261] failed to run Kubelet: unable to load bootstrap kubeconfig: Error loading config file "/etc/kubernetes/bootstrap-kubelet.conf": yaml: line 6: could not find expected ':'
Apr 20 18:37:43 k8s01 systemd: kubelet.service: main process exited, code=exited, status=255/n/a

Thinking about it, the timing fit: the cluster had been installed about a year earlier, and the default certificates are issued with a one-year lifetime. Let's verify:

# Current date
# date
Mon Apr 20 18:18:28 CST 2020

$ sudo su -
# cd /etc/kubernetes/pki

# All the files have a last-modified time of Apr 21 2019
# ll
total 56
-rw-r--r-- 1 root root 1245 Apr 21 2019 apiserver.crt
-rw-r--r-- 1 root root 1090 Apr 21 2019 apiserver-etcd-client.crt
-rw------- 1 root root 1679 Apr 21 2019 apiserver-etcd-client.key
-rw------- 1 root root 1675 Apr 21 2019 apiserver.key
-rw-r--r-- 1 root root 1099 Apr 21 2019 apiserver-kubelet-client.crt
-rw------- 1 root root 1675 Apr 21 2019 apiserver-kubelet-client.key
-rw-r--r-- 1 root root 1025 Apr 21 2019 ca.crt
-rw------- 1 root root 1675 Apr 21 2019 ca.key
drwxr-xr-x 2 root root 162 Apr 21 2019 etcd
-rw-r--r-- 1 root root 1038 Apr 21 2019 front-proxy-ca.crt
-rw------- 1 root root 1675 Apr 21 2019 front-proxy-ca.key
-rw-r--r-- 1 root root 1058 Apr 21 2019 front-proxy-client.crt
-rw------- 1 root root 1679 Apr 21 2019 front-proxy-client.key
drwxr-xr-x 2 root root 210 Apr 29 2019 helm
-rw------- 1 root root 1675 Apr 21 2019 sa.key
-rw------- 1 root root 451 Apr 21 2019 sa.pub
# ll etcd/
total 32
-rw-r--r-- 1 root root 1017 Apr 21 2019 ca.crt
-rw------- 1 root root 1679 Apr 21 2019 ca.key
-rw-r--r-- 1 root root 1094 Apr 21 2019 healthcheck-client.crt
-rw------- 1 root root 1679 Apr 21 2019 healthcheck-client.key
-rw-r--r-- 1 root root 1159 Apr 21 2019 peer.crt
-rw------- 1 root root 1679 Apr 21 2019 peer.key
-rw-r--r-- 1 root root 1159 Apr 21 2019 server.crt
-rw------- 1 root root 1675 Apr 21 2019 server.key
# ll helm/
total 48
-rw-r--r-- 1 root root 2106 Apr 28 2019 ca.cert.pem
-rw-r--r-- 1 root root 3243 Apr 28 2019 ca.key.pem
-rw-r--r-- 1 root root 17 Apr 28 2019 ca.srl
-rw-r--r-- 1 root root 1988 Apr 28 2019 helm.cert.pem
-rw-r--r-- 1 root root 12008 Apr 29 2019 helm.certs.tar.gz
-rw-r--r-- 1 root root 1744 Apr 28 2019 helm.csr.pem
-rw-r--r-- 1 root root 3243 Apr 28 2019 helm.key.pem
-rw-r--r-- 1 root root 2004 Apr 28 2019 tiller.cert.pem
-rw-r--r-- 1 root root 1760 Apr 28 2019 tiller.csr.pem
-rw-r--r-- 1 root root 3243 Apr 28 2019 tiller.key.pem

# Command to check a certificate's expiry date
openssl x509 -noout -enddate -in /etc/kubernetes/pki/ca.crt

# The full picture:
# tree /etc/kubernetes/pki/
/etc/kubernetes/pki/
├── apiserver.crt # notAfter=Apr 19 16:11:28 2020 GMT expired
├── apiserver-etcd-client.crt # notAfter=Apr 19 16:11:30 2020 GMT expired
├── apiserver-etcd-client.key
├── apiserver.key
├── apiserver-kubelet-client.crt # notAfter=Apr 19 16:11:29 2020 GMT expired
├── apiserver-kubelet-client.key
├── ca.crt # notAfter=Apr 17 16:11:28 2029 GMT
├── ca.key
├── etcd
│   ├── ca.crt # notAfter=Apr 17 16:11:29 2029 GMT
│   ├── ca.key
│   ├── healthcheck-client.crt # notAfter=Apr 19 16:11:31 2020 GMT expired
│   ├── healthcheck-client.key
│   ├── peer.crt # notAfter=Apr 19 16:11:31 2020 GMT expired
│   ├── peer.key
│   ├── server.crt # notAfter=Apr 19 16:11:30 2020 GMT expired
│   └── server.key
├── front-proxy-ca.crt # notAfter=Apr 17 16:11:29 2029 GMT
├── front-proxy-ca.key
├── front-proxy-client.crt # notAfter=Apr 19 16:11:29 2020 GMT expired
├── front-proxy-client.key
├── helm
│   ├── ca.cert.pem # notAfter=Apr 23 02:59:43 2039 GMT
│   ├── ca.key.pem
│   ├── ca.srl
│   ├── helm.cert.pem # notAfter=Apr 25 03:06:11 2029 GMT
│   ├── helm.certs.tar.gz
│   ├── helm.csr.pem
│   ├── helm.key.pem
│   ├── tiller.cert.pem # notAfter=Apr 25 03:05:49 2029 GMT
│   ├── tiller.csr.pem
│   └── tiller.key.pem
├── sa.key
└── sa.pub

2 directories, 32 files

Indeed, exactly 7 certificates had expired.
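
To get the same overview without walking the tree by hand, a small read-only loop over the PKI directory can print every certificate's notAfter date (a sketch; openssl is the only dependency):

```shell
#!/usr/bin/env bash
# Print the notAfter date of every certificate under a PKI directory.
# Read-only: it never modifies anything.
print_cert_expiry() {
  local dir="$1" cert
  find "$dir" \( -name '*.crt' -o -name '*.pem' \) 2>/dev/null | sort | while read -r cert; do
    # openssl x509 fails on private keys and CSRs, so only real certs print
    if openssl x509 -noout -in "$cert" 2>/dev/null; then
      printf '%s  %s\n' "$(openssl x509 -noout -enddate -in "$cert")" "$cert"
    fi
  done
}

print_cert_expiry "${PKI_DIR:-/etc/kubernetes/pki}"
```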

Renewing the Certificates

The cluster was installed with kubeadm, and kubeadm has certificate renewal built in.

As usual, start with the kubeadm help:

# kubeadm here is 1.14.1, which was used to install Kubernetes 1.13.5.
# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:08:49Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}

# kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5", GitCommit:"2166946f41b36dea2c4626f90a77706f426cdea2", GitTreeState:"clean", BuildDate:"2019-03-25T15:26:52Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
Unable to connect to the server: EOF

# kubeadm alpha -h
Kubeadm experimental sub-commands

Usage:
kubeadm alpha [command]

Available Commands:
certs Commands related to handling kubernetes certificates # certificate-related
kubeconfig Kubeconfig file utilities
kubelet Commands related to handling the kubelet
selfhosting Makes a kubeadm cluster self-hosted

Flags:
-h, --help help for alpha

Global Flags:
--log-file string If non-empty, use this log file
--rootfs string [EXPERIMENTAL] The path to the 'real' host root filesystem.
--skip-headers If true, avoid header prefixes in the log messages
-v, --v Level number for the log level verbosity

Additional help topics:
kubeadm alpha phase Invoke subsets of kubeadm functions separately for a manual install

Use "kubeadm alpha [command] --help" for more information about a command.

# kubeadm alpha certs -h
Commands related to handling kubernetes certificates

Usage:
kubeadm alpha certs [command]

Aliases:
certs, certificates

Available Commands:
renew Renews certificates for a Kubernetes cluster # this is the renewal command

Flags:
-h, --help help for certs

Global Flags:
--log-file string If non-empty, use this log file
--rootfs string [EXPERIMENTAL] The path to the 'real' host root filesystem.
--skip-headers If true, avoid header prefixes in the log messages
-v, --v Level number for the log level verbosity

Use "kubeadm alpha certs [command] --help" for more information about a command.


# The 7 expired certificates correspond exactly to the 7 renew subcommands below.
# kubeadm alpha certs renew -h
This command is not meant to be run on its own. See list of available subcommands.

Usage:
kubeadm alpha certs renew [flags]
kubeadm alpha certs renew [command]

Available Commands:
all renew all available certificates
apiserver Generates the certificate for serving the Kubernetes API
apiserver-etcd-client Generates the client apiserver uses to access etcd
apiserver-kubelet-client Generates the Client certificate for the API server to connect to kubelet
etcd-healthcheck-client Generates the client certificate for liveness probes to healtcheck etcd
etcd-peer Generates the credentials for etcd nodes to communicate with each other
etcd-server Generates the certificate for serving etcd
front-proxy-client Generates the client for the front proxy

Flags:
-h, --help help for renew

Global Flags:
--log-file string If non-empty, use this log file
--rootfs string [EXPERIMENTAL] The path to the 'real' host root filesystem.
--skip-headers If true, avoid header prefixes in the log messages
-v, --v Level number for the log level verbosity

Use "kubeadm alpha certs renew [command] --help" for more information about a command.

Back Up

Don't rush in; always back up before operating, in case something goes wrong.

mkdir etc.kubernetes.backup
cp -a /etc/kubernetes etc.kubernetes.backup/

Renew the Certificates

Find the original kubeadm init config file:

# cat kubeadm-init.yaml 
apiVersion: kubeadm.k8s.io/v1beta1
bootstrapTokens:
kind: InitConfiguration
nodeRegistration:
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
  extraArgs:
    runtime-config: "api/all=true"
    audit-log-path: /var/log/kubernetes/audit.log
apiVersion: kubeadm.k8s.io/v1beta1
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager:
  extraArgs:
    horizontal-pod-autoscaler-use-rest-clients: "true"
    horizontal-pod-autoscaler-sync-period: "10s"
    node-monitor-grace-period: "10s"
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.13.5
controlPlaneEndpoint: 172.17.0.180:6443
networking:
  dnsDomain: cluster.local
  podSubnet: 10.101.0.0/16
  serviceSubnet: 10.100.0.0/16
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"

If it is no longer around, creating a minimal one works too:

apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.13.5 # --> change this to your cluster's version
imageRepository: k8s.gcr.io

# This mainly tells kubeadm the cluster's Kubernetes version, so that it does not
# try to fetch it online; from behind the firewall that lookup fails with:
# could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt"

Run the renewal:

$ kubeadm alpha certs renew all --config=kubernetes/kubeadm-init.yaml

Check the certificate files:

# The expired certificates now have fresh timestamps:
[root@k8s01.dev.awsbj.cn ~]# ll /etc/kubernetes/pki/
total 56
-rw-r--r-- 1 root root 1245 Apr 20 18:35 apiserver.crt # renewed
-rw-r--r-- 1 root root 1090 Apr 20 18:35 apiserver-etcd-client.crt # renewed
-rw------- 1 root root 1675 Apr 20 18:35 apiserver-etcd-client.key # renewed
-rw------- 1 root root 1675 Apr 20 18:35 apiserver.key # renewed
-rw-r--r-- 1 root root 1099 Apr 20 18:35 apiserver-kubelet-client.crt # renewed
-rw------- 1 root root 1679 Apr 20 18:35 apiserver-kubelet-client.key # renewed
-rw-r--r-- 1 root root 1025 Apr 21 2019 ca.crt
-rw------- 1 root root 1675 Apr 21 2019 ca.key
drwxr-xr-x 2 root root 162 Apr 21 2019 etcd
-rw-r--r-- 1 root root 1038 Apr 21 2019 front-proxy-ca.crt
-rw------- 1 root root 1675 Apr 21 2019 front-proxy-ca.key
-rw-r--r-- 1 root root 1058 Apr 20 18:35 front-proxy-client.crt # renewed
-rw------- 1 root root 1679 Apr 20 18:35 front-proxy-client.key # renewed
drwxr-xr-x 2 root root 210 Apr 29 2019 helm
-rw------- 1 root root 1675 Apr 21 2019 sa.key
-rw------- 1 root root 451 Apr 21 2019 sa.pub
[root@k8s01.dev.awsbj.cn ~]# ll /etc/kubernetes/pki/etcd/
total 32
-rw-r--r-- 1 root root 1017 Apr 21 2019 ca.crt
-rw------- 1 root root 1679 Apr 21 2019 ca.key
-rw-r--r-- 1 root root 1094 Apr 20 18:35 healthcheck-client.crt # renewed
-rw------- 1 root root 1679 Apr 20 18:35 healthcheck-client.key # renewed
-rw-r--r-- 1 root root 1159 Apr 20 18:35 peer.crt # renewed
-rw------- 1 root root 1679 Apr 20 18:35 peer.key # renewed
-rw-r--r-- 1 root root 1159 Apr 20 18:35 server.crt # renewed
-rw------- 1 root root 1675 Apr 20 18:35 server.key # renewed
[root@k8s01.dev.awsbj.cn ~]# ll /etc/kubernetes/pki/helm/
total 48
-rw-r--r-- 1 root root 2106 Apr 28 2019 ca.cert.pem
-rw-r--r-- 1 root root 3243 Apr 28 2019 ca.key.pem
-rw-r--r-- 1 root root 17 Apr 28 2019 ca.srl
-rw-r--r-- 1 root root 1988 Apr 28 2019 helm.cert.pem
-rw-r--r-- 1 root root 12008 Apr 29 2019 helm.certs.tar.gz
-rw-r--r-- 1 root root 1744 Apr 28 2019 helm.csr.pem
-rw-r--r-- 1 root root 3243 Apr 28 2019 helm.key.pem
-rw-r--r-- 1 root root 2004 Apr 28 2019 tiller.cert.pem
-rw-r--r-- 1 root root 1760 Apr 28 2019 tiller.csr.pem
-rw-r--r-- 1 root root 3243 Apr 28 2019 tiller.key.pem

# Verify with openssl:
$ openssl x509 -noout -enddate -in apiserver.crt
notAfter=Apr 20 10:35:36 2021 GMT

The certificates are renewed, but the kubeconfig files for kubectl, the kubelet, the controller-manager, and the scheduler also need to be regenerated, because they embed certificate data as well.
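
To see for yourself that a kubeconfig really does embed a certificate, you can pull out the base64-encoded client-certificate-data field and feed it to openssl (a sketch; the field name is standard kubeconfig syntax, but check your file if the layout differs):

```shell
#!/usr/bin/env bash
# Print the expiry of the client certificate embedded in a kubeconfig file.
kubeconfig_cert_expiry() {
  # The certificate is stored base64-encoded under the client-certificate-data key.
  grep 'client-certificate-data:' "$1" | head -n1 | awk '{print $2}' \
    | base64 -d | openssl x509 -noout -enddate
}

cfg="${1:-/etc/kubernetes/admin.conf}"
if [ -f "$cfg" ]; then
  kubeconfig_cert_expiry "$cfg"
fi
```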

Regenerating the Kubeconfig Files

kubeadm covers this as well; as usual, check the help:

# kubeadm init phase kubeconfig -h
This command is not meant to be run on its own. See list of available subcommands.

Usage:
kubeadm init phase kubeconfig [flags]
kubeadm init phase kubeconfig [command]

Available Commands: # exactly the four kubeconfig files we need
admin Generates a kubeconfig file for the admin to use and for kubeadm itself
all Generates all kubeconfig files
controller-manager Generates a kubeconfig file for the controller manager to use
kubelet Generates a kubeconfig file for the kubelet to use *only* for cluster bootstrapping purposes
scheduler Generates a kubeconfig file for the scheduler to use

Flags:
-h, --help help for kubeconfig

Global Flags:
--log-file string If non-empty, use this log file
--rootfs string [EXPERIMENTAL] The path to the 'real' host root filesystem.
--skip-headers If true, avoid header prefixes in the log messages
-v, --v Level number for the log level verbosity

Use "kubeadm init phase kubeconfig [command] --help" for more information about a command.

Back Up

Remember to back up; since the entire /etc/kubernetes directory was already backed up above, this step is skipped here.

Delete the existing conf files first, otherwise the command reuses the existing files instead of regenerating them:

$ cd /etc/kubernetes
$ rm -f admin.conf kubelet.conf controller-manager.conf scheduler.conf

Regenerate the Config Files

# kubeadm init phase kubeconfig all --config=kubernetes/kubeadm-init.yaml 
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file

Copy the kubectl Config

$ cd ~/.kube
$ cp config{,.bak}
$ cp /etc/kubernetes/admin.conf ~/.kube/

After a short wait, the long-missed node list finally came back:

$ kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s01.dev.awsbj.cn Ready master 366d v1.13.5 172.17.20.238 <none> Amazon Linux 2 4.14.106-97.85.amzn2.x86_64 docker://18.6.3
k8s02.dev.awsbj.cn Ready master 366d v1.13.5 172.17.21.60 <none> Amazon Linux 2 4.14.109-99.92.amzn2.x86_64 docker://18.6.3
k8s03.dev.awsbj.cn Ready master 366d v1.13.5 172.17.20.52 <none> Amazon Linux 2 4.14.109-99.92.amzn2.x86_64 docker://18.6.3
k8s04.dev.awsbj.cn Ready node 366d v1.13.5 172.17.20.162 <none> Amazon Linux 2 4.14.109-99.92.amzn2.x86_64 docker://18.6.3
k8s05.dev.awsbj.cn Ready node 350d v1.13.5 172.17.21.178 <none> Amazon Linux 2 4.14.114-103.97.amzn2.x86_64 docker://18.6.3
k8s06.dev.awsbj.cn Ready node 322d v1.13.5 172.17.21.146 <none> Amazon Linux 2 4.14.114-105.126.amzn2.x86_64 docker://18.6.3

Rancher Server was finally back to normal as well.

Kubernetes Versions Below 1.13

Back Up

mkdir etc.kubernetes.backup
cp -a /etc/kubernetes etc.kubernetes.backup/

For Kubernetes versions below 1.13.0, the commands change as follows:

Renew the Certificates

# The expired certificates must be deleted first, otherwise the existing ones
# are reused instead of regenerated
$ cd /etc/kubernetes/pki
# Delete the 7 expired certificates (make sure they are backed up first)
$ rm -rf apiserver.* apiserver-etcd-client.* apiserver-kubelet-client.* front-proxy-client.* etcd/healthcheck-client.* etcd/peer.* etcd/server.*
# Renew the certificates
$ kubeadm alpha phase certs all --apiserver-advertise-address=${MASTER_API_SERVER_IP} --apiserver-cert-extra-sans=Master1ip,Master2ip,MasterNip

Regenerate the Config Files

Delete the existing config files:

$ cd /etc/kubernetes
$ rm -f admin.conf kubelet.conf controller-manager.conf scheduler.conf

Regenerate them:

$ kubeadm alpha phase kubeconfig all --apiserver-advertise-address=MasterIP

If you hit this error:

unable to get URL "https://dl.k8s.io/release/stable-1.9.txt":Get https://storage.googleapis.com/kubernetes-release/release/stable-1.9.txt: dial tcp 172.217.160.112:443: i/o timeout

then use a kubeadm config file instead:

# config.yaml
kind: MasterConfiguration
apiVersion: kubeadm.k8s.io/v1alpha1
kubernetesVersion: v1.9.6
api:
  advertiseAddress: MasterIP

Regenerate the config files using it:

kubeadm alpha phase kubeconfig all --config=config.yaml

Copy the kubectl Config

$ cd ~/.kube
$ cp config{,.bak}
$ cp /etc/kubernetes/admin.conf ~/.kube/

Notes

  • When renewing, do not renew the cluster CA or the etcd CA. If a CA changes, considerably more manual work is required than what is described here; by default the CA certificates are valid for 10 years anyway.

  • Always back up first.

  • Note: the command that renews the certificates is:

# under the kubeadm alpha command
$ kubeadm alpha certs renew all --config=kubernetes/kubeadm-init.yaml

not:

# not under the kubeadm init command
$ kubeadm init phase certs apiserver [flags]

The command that regenerates the config files, however, is:

# this one is under the kubeadm init command
$ kubeadm init phase kubeconfig

Reference: https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init-phase/#cmd-phase-certs

  • For a Kubernetes HA setup with multiple masters, all of these steps must be run on every master node; the certificates cannot simply be copied across, because each certificate embeds node-specific Subject Alternative Names (SANs):
$ openssl x509 -text -in /etc/kubernetes/pki/etcd/server.crt 
......
# the default SANs in etcd/peer.crt and etcd/server.crt
X509v3 Subject Alternative Name:
DNS:k8s01.dev.awsbj.cn, DNS:localhost, IP Address:172.17.20.238, IP Address:127.0.0.1, IP Address:0:0:0:0:0:0:0:1
......

If you copy them anyway to save effort, the etcd cluster will fail to start, because the local IP or hostname will not be in the copied certificate's SAN list.

Production Advice

In production, never wait until the certificates have expired; renew them before expiry to avoid an outage.
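
openssl's -checkend flag makes this easy to automate: it exits non-zero when a certificate will expire within the given number of seconds. A sketch that could run from cron and warn, say, 30 days ahead (the paths and threshold are assumptions; adjust to taste):

```shell
#!/usr/bin/env bash
# Warn about any certificate under a PKI directory that expires within N days.
# openssl x509 -checkend SECONDS exits 0 if the cert is still valid at that
# point in the future, and non-zero if it will have expired by then.
expires_within_days() {  # usage: expires_within_days DAYS CERT
  ! openssl x509 -checkend "$(( $1 * 86400 ))" -noout -in "$2" >/dev/null 2>&1
}

check_pki() {  # usage: check_pki DAYS DIR
  local days="$1" dir="$2" rc=0 cert
  while read -r cert; do
    if expires_within_days "$days" "$cert"; then
      echo "WARNING: $cert expires within $days days"
      rc=1
    fi
  done < <(find "$dir" -name '*.crt' 2>/dev/null)
  return "$rc"
}

check_pki "${DAYS:-30}" "${PKI_DIR:-/etc/kubernetes/pki}"
```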

If you install Kubernetes by hand with kubeadm, you can also patch the Kubernetes source to extend the default one-year certificate lifetime; see here for details.
