Deploying a production-grade k8s cluster with kubeadm

51reboot

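Overview

kubeadm supports cluster deployment and reached GA in v1.13, including clustered deployments with multiple masters and multiple etcd members. It is also the deployment method most recommended upstream: first, it is driven by its own SIG; second, kubeadm makes good use of many Kubernetes features. Over the next few articles we put it into practice and look at what makes it attractive.

Goals

1. Build a highly available kubernetes cluster with kubeadm and create an admin user

2. To set up a version-upgrade demonstration later, this article uses v1.13.1; the next article upgrades to v1.14

3. An explanation of how kubeadm works

This article focuses on deploying a highly available cluster with kubeadm.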

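Deploying a highly available k8s v1.13 cluster with kubeadm

There are two approaches

  • Stacked etcd topology
  • The etcd members run on the 3 master nodes themselves, and each kube-apiserver talks only to its local etcd member. The advantage is simplicity; the drawback is weaker redundancy, since losing a master also loses an etcd member

  • Requires at least 4 machines (3 combined master and etcd nodes, 1 worker node)

  • External etcd topology

  • etcd runs as a separate cluster outside the control plane. Redundancy is better, but at least 7 machines are needed (3 masters, 3 etcd nodes, 1 worker node)

  • Recommended for production environments

  • This article also uses this topology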

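Steps

  • Environment preparation

  • Install components: docker, kubelet, kubeadm (all nodes)

  • Deploy a highly available etcd cluster with these components

  • Deploy the masters

  • Join the worker nodes

  • Install the network

  • Verify

  • Summary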

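Host environment preparation

  • System environment
# OS version (not required; just what this example uses)
$ cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
# Kernel version (not required; just what this example uses)
$ uname -r
4.17.8-1.el7.elrepo.x86_64

# Enable ftype on the data disk (run on every node; reformatting erases the disk)
umount /data
mkfs.xfs -n ftype=1 -f /dev/vdb
# Disable swap
swapoff -a
sed -i "s#^/swapfile#\#/swapfile#g" /etc/fstab
mount -a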
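
Before moving on, it can help to confirm that the two prerequisites above actually took effect. The following is a minimal sketch: the helper names `swap_active` and `ftype_enabled` are made up for illustration, and it assumes that /proc/swaps starts with its usual one-line header and that `xfs_info` reports `ftype=1` for a correctly formatted file system.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# swap_active [FILE]: succeed if FILE (default /proc/swaps) lists any swap area.
# The first line of /proc/swaps is a header, so any later line is an active swap area.
swap_active() {
    swaps_file="${1:-/proc/swaps}"
    [ "$(tail -n +2 "$swaps_file" | grep -c .)" -gt 0 ]
}

# ftype_enabled TEXT: succeed if xfs_info-style TEXT reports ftype=1.
ftype_enabled() {
    printf '%s\n' "$1" | grep -q 'ftype=1'
}

# Example use on a real node (commented out so the sketch has no side effects):
#   swap_active && echo "swap is still on"
#   ftype_enabled "$(xfs_info /data)" || echo "/data lacks ftype=1"
```

Both helpers only read state, so they are safe to run repeatedly on every node.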

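Installing docker, kubelet, and kubeadm (all nodes)

  • Install the container runtime (docker)

    • For k8s 1.13, the upstream recommendation is not to adopt the latest 18.09 yet, so we use 18.06 here; the version must be specified at install time

    • Source: kubeadm now properly recognizes Docker 18.09.0 and newer, but still treats 18.06 as the default supported version.

    • Installation script (run on every node):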

yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager \
        --add-repo \
        https://download.docker.com/linux/centos/docker-ce.repo

yum makecache fast

yum install -y --setopt=obsoletes=0 \
      docker-ce-18.06.1.ce-3.el7

systemctl start docker
systemctl enable docker
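  • Install kubeadm, kubelet, and kubectl

    • The official Google yum repository cannot be reached directly from servers in mainland China, so download the packages through another channel first and then upload them to the servers
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
 * Local installation
# Disable SELinux
 * Network fix: on CentOS 7, traffic is known to be routed incorrectly because iptables is bypassed, so make sure net.bridge.bridge-nf-call-iptables is set to 1 in the sysctl configuration
cat <<EOF > /etc/sysctl.d/k8s.conf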
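
The contents of the sysctl file are cut off in this copy of the article. As a minimal sketch, and assuming only the single setting named in the text, the file could be written as follows. The `SYSCTL_DIR` variable is an illustrative addition so that the snippet can be tried against a scratch directory before it is pointed at /etc/sysctl.d.

```shell
#!/bin/sh
# Sketch only: writes the one setting named in the text.
# SYSCTL_DIR is a hypothetical override; on a real node it would be /etc/sysctl.d.
SYSCTL_DIR="${SYSCTL_DIR:-$(mktemp -d)}"

cat <<EOF > "${SYSCTL_DIR}/k8s.conf"
net.bridge.bridge-nf-call-iptables = 1
EOF

echo "wrote ${SYSCTL_DIR}/k8s.conf"
# On a real node (as root), reload all sysctl files afterwards:
#   sysctl --system
```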

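Deploying a highly available etcd cluster with the components above

1. On the etcd nodes, configure the etcd service to be started and managed by kubelet

cat << EOF > /etc/systemd/system/kubelet.service.d/20-etcd-service-manager.conf
  2. Generate a kubeadm configuration file for each etcd host so that every host runs one etcd instance. Run the following commands on etcd1 (HOST0 below); afterwards, /tmp contains one directory per host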

# Update HOST0, HOST1, and HOST2 with the IPs or resolvable names of your hosts
export HOST0=10.10.184.226
export HOST1=10.10.213.222
export HOST2=10.10.239.108

# Create temp directories to store files that will end up on other hosts.
mkdir -p /tmp/${HOST0}/ /tmp/${HOST1}/ /tmp/${HOST2}/

ETCDHOSTS=(${HOST0} ${HOST1} ${HOST2})
NAMES=("infra0" "infra1" "infra2")

for i in "${!ETCDHOSTS[@]}"; do
HOST=${ETCDHOSTS[$i]}
NAME=${NAMES[$i]}
cat << EOF > /tmp/${HOST}/kubeadmcfg.yaml
apiVersion: "kubeadm.k8s.io/v1beta1"
kind: ClusterConfiguration
etcd:
    local:
        serverCertSANs:
        - "${HOST}"
        peerCertSANs:
        - "${HOST}"
        extraArgs:
            initial-cluster: ${NAMES[0]}=https://${ETCDHOSTS[0]}:2380,${NAMES[1]}=https://${ETCDHOSTS[1]}:2380,${NAMES[2]}=https://${ETCDHOSTS[2]}:2380
            initial-cluster-state: new
            name: ${NAME}
            listen-peer-urls: https://${HOST}:2380
            listen-client-urls: https://${HOST}:2379
            advertise-client-urls: https://${HOST}:2379
            initial-advertise-peer-urls: https://${HOST}:2380
EOF
done
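
The `initial-cluster` value is the part of this template that is easiest to get wrong, because it must list every member as `name=https://host:2380`, joined by commas, on a single line. A small sketch of how the string is assembled, using a hypothetical helper `initial_cluster` and the member names and addresses from above:

```shell
#!/bin/sh
# Hypothetical helper: build the etcd initial-cluster string from "name=host" pairs.
initial_cluster() {
    result=""
    for pair in "$@"; do
        name="${pair%%=*}"   # text before the first '='
        host="${pair#*=}"    # text after the first '='
        # Prefix a comma only when result already holds an earlier member.
        result="${result:+${result},}${name}=https://${host}:2380"
    done
    printf '%s\n' "$result"
}

initial_cluster infra0=10.10.184.226 infra1=10.10.213.222 infra2=10.10.239.108
```

Every etcd member receives the same string; only `name` and the listen and advertise URLs differ per host.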
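  3. Create the CA: run the following command on HOST0 to generate the certificate. It creates two files, /etc/kubernetes/pki/etcd/ca.crt and /etc/kubernetes/pki/etcd/ca.key (this step needs access to Google, for example through a proxy)
[root@10-10-184-226 ~]# kubeadm init phase certs etcd-ca
[certs] Generating "etcd/ca" certificate and key
  4. On HOST0, generate the certificates for each etcd node: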
export HOST0=10.10.184.226
export HOST1=10.10.213.222
export HOST2=10.10.239.108
kubeadm init phase certs etcd-server --config=/tmp/${HOST2}/kubeadmcfg.yaml
kubeadm init phase certs etcd-peer --config=/tmp/${HOST2}/kubeadmcfg.yaml
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST2}/kubeadmcfg.yaml
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST2}/kubeadmcfg.yaml
cp -R /etc/kubernetes/pki /tmp/${HOST2}/
# cleanup non-reusable certificates
find /etc/kubernetes/pki -not -name ca.crt -not -name ca.key -type f -delete

kubeadm init phase certs etcd-server --config=/tmp/${HOST1}/kubeadmcfg.yaml
kubeadm init phase certs etcd-peer --config=/tmp/${HOST1}/kubeadmcfg.yaml
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST1}/kubeadmcfg.yaml
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST1}/kubeadmcfg.yaml
cp -R /etc/kubernetes/pki /tmp/${HOST1}/
find /etc/kubernetes/pki -not -name ca.crt -not -name ca.key -type f -delete

kubeadm init phase certs etcd-server --config=/tmp/${HOST0}/kubeadmcfg.yaml
kubeadm init phase certs etcd-peer --config=/tmp/${HOST0}/kubeadmcfg.yaml
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST0}/kubeadmcfg.yaml
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST0}/kubeadmcfg.yaml
# No need to move the certs because they are for HOST0

# clean up certs that should not be copied off this host
find /tmp/${HOST2} -name ca.key -type f -delete
find /tmp/${HOST1} -name ca.key -type f -delete
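
The three near-identical blocks above can also be written as a loop. The following is a sketch rather than the article's own script: `gen_etcd_certs` and the `RUN` variable are illustrative names, and with `RUN=echo` (the default here) the commands are only printed, not executed. HOST0 is handled last and without the copy-and-clean step, as in the original, since its certificates stay in place.

```shell
#!/bin/sh
# RUN=echo prints each command instead of running it; set RUN= (empty) to execute.
RUN="${RUN-echo}"

# gen_etcd_certs HOST MODE: issue the four etcd-related certificates for HOST.
# If MODE is "move", copy them to /tmp/HOST and delete the non-reusable ones locally.
gen_etcd_certs() {
    host="$1"
    mode="$2"
    for cert in etcd-server etcd-peer etcd-healthcheck-client apiserver-etcd-client; do
        $RUN kubeadm init phase certs "$cert" --config="/tmp/${host}/kubeadmcfg.yaml"
    done
    if [ "$mode" = "move" ]; then
        $RUN cp -R /etc/kubernetes/pki "/tmp/${host}/"
        $RUN find /etc/kubernetes/pki -not -name ca.crt -not -name ca.key -type f -delete
        # The CA private key must not be copied off this host.
        $RUN find "/tmp/${host}" -name ca.key -type f -delete
    fi
}

gen_etcd_certs 10.10.239.108 move   # HOST2
gen_etcd_certs 10.10.213.222 move   # HOST1
gen_etcd_certs 10.10.184.226 stay   # HOST0: certificates stay where they are
```

Reviewing the printed commands first is a cheap way to catch a wrong host address before any certificate is issued.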
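  • Distribute the certificates and kubeadmcfg.yaml to each etcd node. The resulting layout on a node is shown below (ca.key stays on HOST0 only, since the script above deletes it from the copies for HOST1 and HOST2):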
/root/
└── kubeadmcfg.yaml
---
/etc/kubernetes/pki
├── apiserver-etcd-client.crt
├── apiserver-etcd-client.key
└── etcd
        ├── ca.crt
        ├── ca.key
        ├── healthcheck-client.crt
        ├── healthcheck-client.key
        ├── peer.crt
        ├── peer.key
        ├── server.crt
        └── server.key
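
Before generating the static pod manifest, it is worth checking that each etcd node really has this layout. The sketch below uses a hypothetical helper, `check_etcd_layout`, that takes a root directory (normally `/`) and reports any missing file. `ca.key` is deliberately left out of the check because the cleanup step above removes it from the copies for HOST1 and HOST2.

```shell
#!/bin/sh
# check_etcd_layout ROOT: print each expected file that is missing under ROOT
# and return non-zero if at least one is absent. ROOT is "/" on a real node.
check_etcd_layout() {
    root="${1%/}"
    missing=0
    for f in \
        root/kubeadmcfg.yaml \
        etc/kubernetes/pki/apiserver-etcd-client.crt \
        etc/kubernetes/pki/apiserver-etcd-client.key \
        etc/kubernetes/pki/etcd/ca.crt \
        etc/kubernetes/pki/etcd/healthcheck-client.crt \
        etc/kubernetes/pki/etcd/healthcheck-client.key \
        etc/kubernetes/pki/etcd/peer.crt \
        etc/kubernetes/pki/etcd/peer.key \
        etc/kubernetes/pki/etcd/server.crt \
        etc/kubernetes/pki/etcd/server.key
    do
        if [ ! -f "${root}/${f}" ]; then
            echo "missing: /${f}"
            missing=1
        fi
    done
    return "$missing"
}

# Example use on an etcd node:
#   check_etcd_layout / && echo "layout looks complete"
```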
  5. Generate the static pod manifest; run this on each of the 3 etcd nodes (needs access to Google, for example through a proxy):
$ kubeadm init phase etcd local --config=/root/kubeadmcfg.yaml
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"

6. Check the etcd cluster status; this completes the etcd cluster setup

docker run --rm -it --net host -v /etc/kubernetes:/etc/kubernetes \
    k8s.gcr.io/etcd:3.2.24 etcdctl \
    --cert-file /etc/kubernetes/pki/etcd/peer.crt \
    --key-file /etc/kubernetes/pki/etcd/peer.key \
    --ca-file /etc/kubernetes/pki/etcd/ca.crt \
    --endpoints https://${HOST0}:2379 cluster-health
member 9969ee7ea515cbd2 is healthy: got healthy result from https://10.10.213.222:2379
member cad4b939d8dfb250 is healthy: got healthy result from https://10.10.239.108:2379
member e6e86b3b5b495dfb is healthy: got healthy result from https://10.10.184.226:2379
cluster is healthy
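
For scripted checks, the output above can be reduced to a pass/fail result. A minimal sketch, assuming the `cluster-health` output format shown above; `etcd_healthy` is a hypothetical helper that reads that output on standard input and takes the expected member count as its argument:

```shell
#!/bin/sh
# etcd_healthy EXPECTED: read `etcdctl cluster-health` output on stdin and succeed
# only if EXPECTED members report healthy and the final "cluster is healthy" line is present.
etcd_healthy() {
    expected="$1"
    output="$(cat)"
    healthy="$(printf '%s\n' "$output" | grep -c '^member .* is healthy')"
    printf '%s\n' "$output" | grep -q '^cluster is healthy$' && [ "$healthy" -eq "$expected" ]
}

# Example use on an etcd node. Drop -it from the docker run command when piping,
# since a pipe is not a terminal:
#   docker run --rm --net host ... cluster-health | etcd_healthy 3 && echo OK
```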

Deploying the masters with kubeadm

  • Copy the certificates from any one of the etcd nodes to the master1 node
export CONTROL_PLANE="ubuntu@10.0.0.7"
scp /etc/kubernetes/pki/etcd/ca.crt "${CONTROL_PLANE}":
scp /etc/kubernetes/pki/apiserver-etcd-client.crt "${CONTROL_PLANE}":
scp /etc/kubernetes/pki/apiserver-etcd-client.key "${CONTROL_PLANE}":
  • On the first master, write the configuration file kubeadm-config.yaml and initialize
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
#kubernetesVersion: stable
kubernetesVersion: v1.13.1
# Note: we want to install 1.13.1 first and upgrade to 1.13.4 later, so the version is pinned explicitly
apiServer:
  certSANs:
  - "k8s.paas.test"
controlPlaneEndpoint: "k8s.paas.test:6443"
etcd:
  external:
    endpoints:
    - https://10.10.184.226:2379
    - https://10.10.213.222:2379
    - https://10.10.239.108:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
networking:
  podSubnet: "10.244.0.0/16"

Note: k8s.paas.test:6443 here is a load balancer; if you do not have one, a virtual IP can be used instead

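  • Using a private registry (custom images): kubeadm lets you tailor cluster initialization through parameters in the configuration file. For example, imageRepository sets the image prefix, so after pushing the images to an internal registry we can edit this parameter in kubeadm-config.yaml and then run init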
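
As a sketch of that customization, the snippet below prints a `kubeadm-config.yaml` that adds `imageRepository` to the configuration used in this article. The registry name `registry.example.com/k8s` is a placeholder, not a value from the original, and `render_kubeadm_config` is likewise an illustrative helper.

```shell
#!/bin/sh
# render_kubeadm_config REGISTRY: print a ClusterConfiguration that pulls the
# control-plane images from REGISTRY instead of the default k8s.gcr.io.
render_kubeadm_config() {
    cat <<EOF
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.13.1
imageRepository: "$1"
apiServer:
  certSANs:
  - "k8s.paas.test"
controlPlaneEndpoint: "k8s.paas.test:6443"
etcd:
  external:
    endpoints:
    - https://10.10.184.226:2379
    - https://10.10.213.222:2379
    - https://10.10.239.108:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
networking:
  podSubnet: "10.244.0.0/16"
EOF
}

# registry.example.com/k8s is a placeholder for an internal registry.
render_kubeadm_config registry.example.com/k8s
```

Redirecting the output to `kubeadm-config.yaml` gives a file that can be passed to `kubeadm init --config`.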

  • On master1, run: kubeadm init --config kubeadm-config.yaml

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
    https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join k8s.paas.test:6443 --token f1oygc.3zlc31yjcut46prf --discovery-token-ca-cert-hash xx
  
  [root@k8s-m1 ~]
  $kubectl get node
  NAME STATUS ROLES AGE VERSION
  k8s-m1 NotReady master 4m54s v1.13.1

Installing the other 2 masters

  • Copy the admin.conf configuration file and the pki certificates from master1 to the same directories on the other 2 masters, for example:
/etc/kubernetes/pki/ca.crt
/etc/kubernetes/pki/ca.key
/etc/kubernetes/pki/sa.key
/etc/kubernetes/pki/sa.pub
/etc/kubernetes/pki/front-proxy-ca.crt
/etc/kubernetes/pki/front-proxy-ca.key
/etc/kubernetes/pki/etcd/ca.crt
/etc/kubernetes/pki/etcd/ca.key (the official documentation says to copy this file, but in practice it is not needed)
/etc/kubernetes/admin.conf

Note: the official documentation omits two files, /etc/kubernetes/pki/apiserver-etcd-client.crt and /etc/kubernetes/pki/apiserver-etcd-client.key. Without them, the apiserver fails to start with this error:

Unable to create storage backend: config (&{ /registry [] /etc/kubernetes/pki/apiserver-etcd-client.key /etc/kubernetes/pki/apiserver-etcd-client.crt /etc/kubernetes/pki/etcd/ca.crt true 0xc000133c20 <nil> 5m0s 1m0s}), err (open /etc/kubernetes/pki/apiserver-etcd-client.crt: no such file or directory)
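
Copying these files by hand to two machines is error-prone, so a loop helps. The sketch below prints the `ssh` and `scp` commands rather than running them (`RUN=echo`). The `root@` login is an assumption, as is the use of the k8s-m2 and k8s-m3 addresses that appear in the pod listing later in this article. The file list includes the two `apiserver-etcd-client` files noted above and omits `etcd/ca.key`, which turned out not to be needed.

```shell
#!/bin/sh
# RUN=echo prints the commands; set RUN= (empty) to actually execute them.
RUN="${RUN-echo}"

FILES="
/etc/kubernetes/pki/ca.crt
/etc/kubernetes/pki/ca.key
/etc/kubernetes/pki/sa.key
/etc/kubernetes/pki/sa.pub
/etc/kubernetes/pki/front-proxy-ca.crt
/etc/kubernetes/pki/front-proxy-ca.key
/etc/kubernetes/pki/etcd/ca.crt
/etc/kubernetes/pki/apiserver-etcd-client.crt
/etc/kubernetes/pki/apiserver-etcd-client.key
/etc/kubernetes/admin.conf
"

# copy_to_master HOST: copy every file in FILES to the same path on HOST.
copy_to_master() {
    host="$1"
    # The target directories may not exist yet on a fresh master.
    $RUN ssh "root@${host}" mkdir -p /etc/kubernetes/pki/etcd
    for f in $FILES; do
        $RUN scp "$f" "root@${host}:${f}"
    done
}

copy_to_master 10.10.76.80   # k8s-m2 (assumed address)
copy_to_master 10.10.56.27   # k8s-m3 (assumed address)
```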
  • Run the join on masters 2 and 3:
  kubeadm join k8s.paas.test:6443 --token f1oygc.3zlc31yjcut46prf \
      --discovery-token-ca-cert-hash sha256:078b63e29378fb6dcbedd80dd830b83e37521f294b4e3416cd77e854041d912f \
      --experimental-control-plane

Joining worker nodes

[root@k8s-n1 ~]
$ kubeadm join k8s.paas.test:6443 --token f1oygc.3zlc31yjcut46prf --discovery-token-ca-cert-hash sha256:078b63e29378fb6dcbedd80dd830b83e37521f294b4e3416cd77e854041d912f
[preflight] Running pre-flight checks
[discovery] Trying to connect to API Server "k8s.paas.test:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://k8s.paas.test:6443"
[discovery] Requesting info from "https://k8s.paas.test:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "k8s.paas.test:6443"
[discovery] Successfully established connection with API Server "k8s.paas.test:6443"
[join] Reading configuration from the cluster...
[join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.13" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "k8s-n1" as an annotation

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

Network installation

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/a70459be0084506e4ec919aa1c114638878db11b/Documentation/kube-flannel.yml
kubectl get pod --all-namespaces -o wide
NAMESPACE     NAME                             READY   STATUS    RESTARTS   AGE     IP              NODE     NOMINATED NODE   READINESS GATES
kube-system   coredns-86c58d9df4-dc4t2         1/1     Running   0          14m     172.17.0.3      k8s-m1   <none>           <none>
kube-system   coredns-86c58d9df4-jxv6v         1/1     Running   0          14m     172.17.0.2      k8s-m1   <none>           <none>
kube-system   kube-apiserver-k8s-m1            1/1     Running   0          13m     10.10.119.128   k8s-m1   <none>           <none>
kube-system   kube-apiserver-k8s-m2            1/1     Running   0          5m      10.10.76.80     k8s-m2   <none>           <none>
kube-system   kube-apiserver-k8s-m3            1/1     Running   0          4m58s   10.10.56.27     k8s-m3   <none>           <none>
kube-system   kube-controller-manager-k8s-m1   1/1     Running   0          13m     10.10.119.128   k8s-m1   <none>           <none>
kube-system   kube-controller-manager-k8s-m2   1/1     Running   0          5m      10.10.76.80     k8s-m2   <none>           <none>
kube-system   kube-controller-manager-k8s-m3   1/1     Running   0          4m58s   10.10.56.27     k8s-m3   <none>           <none>
kube-system   kube-flannel-ds-amd64-nvmtk      1/1     Running   0          44s     10.10.56.27     k8s-m3   <none>           <none>
kube-system   kube-flannel-ds-amd64-pct2g      1/1     Running   0          44s     10.10.76.80     k8s-m2   <none>           <none>
kube-system   kube-flannel-ds-amd64-ptv9z      1/1     Running   0          44s     10.10.119.128   k8s-m1   <none>           <none>
kube-system   kube-flannel-ds-amd64-zcv49      1/1     Running   0          44s     10.10.175.146   k8s-n1   <none>           <none>
kube-system   kube-proxy-9cmg2                 1/1     Running   0          2m34s   10.10.175.146   k8s-n1   <none>           <none>
kube-system   kube-proxy-krlkf                 1/1     Running   0          4m58s   10.10.56.27     k8s-m3   <none>           <none>
kube-system   kube-proxy-p9v66                 1/1     Running   0          14m     10.10.119.128   k8s-m1   <none>           <none>
kube-system   kube-proxy-wcgg6                 1/1     Running   0          5m      10.10.76.80     k8s-m2   <none>           <none>
kube-system   kube-scheduler-k8s-m1            1/1     Running   0          13m     10.10.119.128   k8s-m1   <none>           <none>
kube-system   kube-scheduler-k8s-m2            1/1     Running   0          5m      10.10.76.80     k8s-m2   <none>           <none>
kube-system   kube-scheduler-k8s-m3            1/1     Running   0          4m58s   10.10.56.27     k8s-m3   <none>           <none>

Installation complete

Verification

  • First verify that kube-apiserver, kube-controller-manager, kube-scheduler, and the pod network are working:
$ kubectl create deployment nginx --image=nginx:alpine
$ kubectl get pods -l app=nginx -o wide
NAME                     READY   STATUS    RESTARTS   AGE     IP           NODE     NOMINATED NODE   READINESS GATES
nginx-54458cd494-r6hqm   1/1     Running   0          5m24s   10.244.4.2   k8s-n1   <none>           <none>
  • Verify kube-proxy
$kubectl expose deployment nginx --port=80 --type=NodePort
service/nginx exposed

[root@k8s-m1 ~]
$kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 122m
nginx NodePort 10.108.192.221 <none> 80:30992/TCP 4s
[root@k8s-m1 ~]
$ kubectl get pods -l app=nginx -o wide
NAME                     READY   STATUS    RESTARTS   AGE     IP           NODE     NOMINATED NODE   READINESS GATES
nginx-54458cd494-r6hqm   1/1     Running   0          6m53s   10.244.4.2   k8s-n1   <none>           <none>

$curl -I k8s-n1:30992
HTTP/1.1 200 OK
  • Verify DNS and the pod network
kubectl run --generator=run-pod/v1 -it curl --image=radial/busyboxplus:curl
If you don't see a command prompt, try pressing enter.
[ root@curl:/ ]$ nslookup nginx
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name: nginx
Address 1: 10.108.192.221 nginx.default.svc.cluster.local
  • High availability

Shut down master1, then access nginx from an arbitrary pod

while true;do curl -I nginx && sleep 1 ;done
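
The loop above runs until interrupted and leaves it to the reader to spot failures. As a sketch, a bounded variant that counts failed probes could look like this; `probe_loop` is a hypothetical helper, and the probe command is passed in as arguments so that the same function works with `curl -sI nginx` inside a pod:

```shell
#!/bin/sh
# probe_loop COUNT DELAY CMD...: run CMD COUNT times, DELAY seconds apart,
# and print how many runs failed. Returns non-zero if any run failed.
probe_loop() {
    count="$1"; delay="$2"; shift 2
    failures=0
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
        # Skip the pause after the final probe.
        [ "$i" -lt "$count" ] && sleep "$delay"
    done
    echo "failed probes: ${failures}/${count}"
    [ "$failures" -eq 0 ]
}

# Inside a pod, while master1 is powered off:
#   probe_loop 60 1 curl -sI nginx
```

If the remaining masters keep the cluster serving, the summary line should report zero failures for the whole window.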

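Summary

  • About versions

    • Kernel 4.19 is more stable; the 4.17 used here is not recommended (anything newer is already 5.x)

    • Docker 17.12 is the latest release considered stable here, while this article uses 18.06. Upstream Kubernetes has also confirmed compatibility with 18.09, but 17.12 is still recommended for production

  • About networking: choices differ between organizations, and flannel is common in small and medium-sized companies. Pick the network plugin before deploying and set it in the configuration file up front (the official blog leaves it out of the initial kubeadm configuration, then requires it later in the networking section)

  • Error handling

    • kubeadm reset is a handy tool for resetting an environment, but it does not reset everything: some data in etcd (such as ConfigMaps and Secrets) is not cleared. If the whole environment really needs to be reset, remember to reset etcd as well after running reset.

    • To reset etcd, empty /var/lib/etcd on the etcd nodes and restart the docker service

  • Network access from mainland China

    • Images: kubeadm supports a custom image prefix; set imageRepository in kubeadm-config.yaml

    • yum: packages can be downloaded in advance and imported, or http_proxy can be set

    • init: issuing certificates and running init need to reach Google; http_proxy can be set for this as well.

  • More

    • Certificates and upgrades are covered in the next article.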

Source: Jianshu
