Debugging Locally and Releasing the Controller

Preface

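The previous posts covered writing the initial code, but it still has to go through debugging and testing and then a formal release into the K8s cluster. The official documentation barely touches on this step, which makes it a common source of confusion. Code is rarely right on the first pass and there are always errors to debug, yet it is impractical to rebuild the image, push it, and restart the pod every time you change a line or two or add a print statement. This post explains how to connect a local development environment to a K8s cluster for debugging, and how to release the controller into the cluster once testing is done.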

Debugging / Testing

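As is well known, communication with the APIServer must be TLS-encrypted, and the traffic between a CRD controller and the APIServer is no exception. The first step is therefore to obtain a certificate issued by the K8s CA; only then can the two sides communicate properly.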

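The official kubebuilder documentation describes deploying Cert Manager, a third-party component dedicated to helping applications obtain CA-issued certificates and inject them, so that injection is automated and needs no manual work: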

Deploying the cert manager

Cert Manager official documentation

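To my surprise, however, I followed both sets of documentation to obtain a CA-issued certificate, and no matter what I tried, the resulting certificate always failed with x509 certificate signed by unknown authority. My guess is that, since the two projects come from different teams on different schedules, there may be a version mismatch between them. This held me up for several days before I gave up and switched to issuing the certificate manually. Readers may want to try the Cert Manager approach first; if it works for you, please let me know.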

Modifying the configuration files

Because the webhook is enabled, the default configuration files need a few changes. Go to the config directory; its core is the config/default directory.

Modify config/default/kustomization.yaml

The namespace and name prefix can be customized:

Following the comments in the file, uncomment the sections circled in the figure below

Modify config/crd/kustomization.yaml

Following the comments in the file, uncomment the sections circled in the figure below:

make

The command that deploys and runs the controller and its related resources is make deploy IMG=${IMAGE}. Let's look at the Makefile:

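As you can see, this command uses kustomize to customize all the configuration files under config/default and generate every resource manifest, then deploys them with kubectl apply. Applying directly can fail on some K8s versions, though. To see more clearly which resources kustomize generates, I made a small change: instead of applying directly, the output is redirected into an all_in_one.yaml file.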
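The change described above can be sketched as a shell function rather than as the Makefile target itself. This is a minimal sketch, assuming kustomize is on the PATH and the project uses the default kubebuilder layout (config/manager and config/default); the function name render_all_in_one is hypothetical.

```shell
# Hypothetical helper: render the kustomize output to a file instead of
# piping it straight into `kubectl apply`.
# Usage: render_all_in_one <image> [output-file]
render_all_in_one() {
  local img="$1"
  local out="${2:-all_in_one.yaml}"
  # Point the manager Deployment at the image to deploy. The subshell keeps
  # the caller's working directory unchanged.
  (cd config/manager && kustomize edit set image "controller=${img}") || return 1
  # Build every resource under config/default into one reviewable file.
  kustomize build config/default > "${out}"
}
```

Writing to a file first makes it possible to inspect and hand-edit the manifests before anything reaches the cluster.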

After the change, run it again:

# Replace with your own registry
mbp-16in:Unit ywq$ export IMAGE="my.registry.com:5000/unit-controller:tmp"
mbp-16in:Unit ywq$ make deploy IMG=${IMAGE}

all_in_one

Analysis

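The generated all_in_one.yaml file is more than 6,000 lines long, and the CustomResourceDefinition accounts for the vast majority of it. Overall it contains roughly the following types of resources: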


# CRD definition; it describes every field of Unit, so it is very long.
kind: CustomResourceDefinition

# admission webhook
kind: MutatingWebhookConfiguration
kind: ValidatingWebhookConfiguration

# RBAC authorization
kind: Role
kind: ClusterRole
kind: RoleBinding
kind: ClusterRoleBinding

# prometheus metric service
kind: Service

# unit-webhook-service, receives callbacks from the APIServer
kind: Service

# unit controller deployment
kind: Deployment
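To get the same overview from your own generated file, something like the following may help. It is a minimal sketch; the function name count_kinds is hypothetical. It counts only top-level kind: lines, so the indented kind: fields nested inside the CRD schema are ignored.

```shell
# Count how many resources of each kind a multi-document manifest contains.
# Only unindented `kind:` lines are matched, i.e. one per YAML document.
# Usage: count_kinds all_in_one.yaml
count_kinds() {
  grep -E '^kind:' "$1" | awk '{print $2}' | sort | uniq -c
}
```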

Modifying the yaml file

  1. Add a new field under CustomResourceDefinition.spec in the yaml file: preserveUnknownFields: false

Without this field, kubectl apply reports an error. The bug is known to exist in K8s 1.15-1.17 and earlier versions; see: Generated Metadata breaks crd

  2. MutatingWebhookConfiguration and ValidatingWebhookConfiguration

What needs to change in these two webhook configurations? Let's look at the configuration, taking ValidatingWebhookConfiguration as an example:

apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  creationTimestamp: null
  name: unit-validating-webhook-configuration
webhooks:
- clientConfig:
    caBundle: Cg==
    service:
      name: unit-webhook-service
      namespace: default
      path: /validate-custom-my-crd-com-v1-unit
  failurePolicy: Fail
  name: vunit.kb.io
  rules:
  - apiGroups:
    - custom.my.crd.com
    apiVersions:
    - v1
    operations:
    - CREATE
    - UPDATE
    resources:
    - units

Two things need to change here:

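  • caBundle is currently empty and must be filled in
  • clientConfig currently directs the CA-trusted callback to the Service unit-webhook-service, which forwards it to the Deployment's pod. Since we want to debug locally, it must point at the local environment instead.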

The following sections explain how to configure these two points.

Issuing certificates from the CA

This takes several steps:

1.ca.cert

First, obtain the K8s CA's ca.cert file:

kubectl config view --raw -o json | jq -r '.clusters[0].cluster."certificate-authority-data"' | tr -d '"' > ca.cert

The content of ca.cert can then be copied into webhooks.clientConfig.caBundle of the MutatingWebhookConfiguration and ValidatingWebhookConfiguration above, replacing the original Cg== value.
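Before pasting the value, it can be worth checking that the file really holds a base64-encoded PEM certificate. The placeholder Cg== is just the base64 encoding of a single newline, which is why it has to be replaced. The sketch below assumes a base64 command that accepts --decode; the function name is_pem_bundle is hypothetical.

```shell
# Succeeds if the given file holds base64 text that decodes to a PEM certificate.
# Usage: is_pem_bundle ca.cert
is_pem_bundle() {
  base64 --decode < "$1" 2>/dev/null \
    | head -n 1 \
    | grep -q -- '-----BEGIN CERTIFICATE-----'
}
```

For example, `is_pem_bundle ca.cert && echo "looks like a CA bundle"` prints the message only when the first decoded line is a PEM certificate header.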

2.csr

Create the JSON configuration file for the certificate signing request:

Note that hosts should contain two kinds of entries:

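  • The in-cluster domain names of the Unit controller's Service, since the Unit controller will eventually run inside K8s.
  • The IP address of a network interface on the local development machine. This address is used to connect to the K8s cluster for debugging, so it must be reachable from the cluster.
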
cat > unit-csr.json << EOF
{
  "hosts": [
    "unit-webhook-service.default.svc",
    "unit-webhook-service.default.svc.cluster.local",
    "192.168.254.1"
  ],
  "CN": "unit-webhook-service",
  "key": {
    "algo": "rsa",
    "size": 2048
  }
}
EOF

3. Generate the csr and the pem private key file:

[root@vm254011 unit]# cat unit-csr.json | cfssl genkey - | cfssljson -bare unit
2020/05/23 17:44:39 [INFO] generate received request
2020/05/23 17:44:39 [INFO] received CSR
2020/05/23 17:44:39 [INFO] generating key: rsa-2048
2020/05/23 17:44:39 [INFO] encoded CSR
[root@vm254011 unit]#
[root@vm254011 unit]# ls unit*
unit.csr  unit-csr.json  unit-key.pem

4. Create the CertificateSigningRequest resource

cat > csr.yaml << EOF 
apiVersion: certificates.k8s.io/v1beta1
kind: CertificateSigningRequest
metadata:
  name: unit
spec:
  request: $(cat unit.csr | base64 | tr -d '\n')
  usages:
  - digital signature
  - key encipherment
  - server auth
EOF

# apply
kubectl apply -f csr.yaml

5. Submit this CertificateSigningRequest to the cluster.

Check its status:

[root@vm254011 unit]# kubectl apply -f csr.yaml
certificatesigningrequest.certificates.k8s.io/unit created
[root@vm254011 unit]# kubectl describe csr unit
Name:         unit
Labels:       <none>
...
CreationTimestamp:  Sat, 23 May 2020 17:56:14 +0800
Requesting User:    kubernetes-admin
Status:             Pending
Subject:
  Common Name:    unit-webhook-service
  Serial Number:
Subject Alternative Names:
         DNS Names:     unit-webhook-service.default.svc
                        unit-webhook-service.default.svc.cluster.local
         IP Addresses:  192.168.254.1
Events:  <none>

It is still in the Pending state, so the request needs to be approved:

[root@vm254011 unit]# kubectl certificate approve unit
certificatesigningrequest.certificates.k8s.io/unit approved
[root@vm254011 unit]#
[root@vm254011 unit]# kubectl get csr unit
NAME   AGE    REQUESTOR          CONDITION
unit   111s   kubernetes-admin   Approved,Issued
# Save the issued certificate to a crt file
[root@vm254011 unit]# kubectl get csr unit -o jsonpath='{.status.certificate}' | base64 --decode > unit.crt

As you can see, the certificate has now been signed.

To summarize:

  • The ca.cert file from step 1 goes into the caBundle field
  • The unit-key.pem private key from step 3 and the unit.crt file from step 5 are used by the unit controller's HTTPS webhook server
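Since the failure mode to avoid is x509 certificate signed by unknown authority, it may be worth confirming locally that unit.crt chains to the CA that goes into caBundle. The following is a minimal sketch, assuming openssl is installed and that the cluster signs CSRs with the same CA found in the kubeconfig (this is an assumption and may not hold on every cluster); the function name verify_serving_cert is hypothetical.

```shell
# Verify that a serving certificate chains to the CA stored (base64) in ca.cert.
# Usage: verify_serving_cert ca.cert unit.crt
verify_serving_cert() {
  local ca_b64="$1" crt="$2" ca_pem rc=0
  ca_pem="$(mktemp)" || return 1
  # ca.cert holds base64 text; openssl needs the decoded PEM form.
  base64 --decode < "${ca_b64}" > "${ca_pem}" || { rm -f "${ca_pem}"; return 1; }
  openssl verify -CAfile "${ca_pem}" "${crt}" || rc=$?
  rm -f "${ca_pem}"
  return "${rc}"
}
```

If verification fails here, the APIServer would most likely reject the webhook's certificate as well, so it is cheaper to find out before applying the manifests.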

Updating the WebhookConfiguration

Using the certificate material generated above, replace the relevant parts of the WebhookConfiguration in all_in_one.yaml. After the replacement:

apiVersion: admissionregistration.k8s.io/v1beta1
kind: MutatingWebhookConfiguration
metadata:
  creationTimestamp: null
  name: unit-mutating-webhook-configuration
webhooks:
- clientConfig:
    caBundle: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJd01EVXhNakEzTkRNeE0xb1hEVE13TURVeE1EQTNORE14TTFvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTG5CCmRvZVRHNTlYMkZsYXRoN1RhRnYrZ2hjbGxsV0NLbkxuT1hQLzZydE0wdE92U0RCQjV2UVJsNUF0L3BWMEJucmQKZGtyOWRnMWRKSHp1T05WamkxTml6QVdUbWtSbDBKczMrdjFMUzBCY2xLeU5XbWRQM0NNUWl2M1BDbjNISG9rcgoveDZncnFaa3RxeUo2ck5JMXFocmkzbjNLSWFQWFBtYUJIeW1zWCt1UjQyMk1kaGNhU3dBUDQwUktzcUtWcS81CkRodzdHdVZzdFZHNG5GZUZ2dlFuYU1jVm13WUpyellFQWxNRitlSyswM3IyWEFLQUZxQnBEWXBaZlg1Wi9tUEsKVXlxNlIwcEJUaG9adXlwSUhQekwwMkJGazlDbmU3eTBXd1d6L1VleDJSN2toOVJhendNeVVTNlJKYU4wT2hRaQpsTTZyM2lZcnIzVWIxSW1ieE5NQ0F3RUFBYU1qTUNFd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFENHVNaVZpL28zSkVhVi9UZzVKRWhQK2tQZm8KVzBLeUtaT3FNVlZzRVZsM1l2aFdYdGxOaCtwT0ZHSTlPQVFZdE5NKzZDeEJLVm9Xd1NzSUpyYkpZeVR2bGFlYgpHZnJGZWRkL2NkM0N5M2N1UDQ0ZjRPQ3VabTZWckJUVy8wUms3LzVKMHlLTmlSSDVqelRJL0szZGtKWkNERktOCjRGdWZxZ3Y0QTNxdVYwQXJaNFNOV2poVEx2SlM1VVdaOUpxUndyU3NqNlpvenRJRVhiU1d2aWhyS2FGQmtoWWwKRG5KM2N4cFljYXJ0aVZqS1g3SUNQQTJxdmw1azF4ZEMwVldTQWlLdTVFR24zZkFmdkQwN2poeVBub3lkMjVmWApQeDlkaGlzaDgwaFl4Nm9pbHpHdUppMGZDNjgxZ0VRRTQzUGhNRHRCZHNKMTBEejRQYTdrL2QvY3hETT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
    url: https://192.168.254.1:9443/mutate-custom-my-crd-com-v1-unit
#    service:
#      name: unit-webhook-service
#      namespace: default
#      path: /mutate-custom-my-crd-com-v1-unit
  failurePolicy: Fail
  name: munit.kb.io
  rules:
  - apiGroups:
    - custom.my.crd.com
    apiVersions:
    - v1
    operations:
    - CREATE
    - UPDATE
    resources:
    - units
---
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  creationTimestamp: null
  name: unit-validating-webhook-configuration
webhooks:
- clientConfig:
    caBundle: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJd01EVXhNakEzTkRNeE0xb1hEVE13TURVeE1EQTNORE14TTFvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTG5CCmRvZVRHNTlYMkZsYXRoN1RhRnYrZ2hjbGxsV0NLbkxuT1hQLzZydE0wdE92U0RCQjV2UVJsNUF0L3BWMEJucmQKZGtyOWRnMWRKSHp1T05WamkxTml6QVdUbWtSbDBKczMrdjFMUzBCY2xLeU5XbWRQM0NNUWl2M1BDbjNISG9rcgoveDZncnFaa3RxeUo2ck5JMXFocmkzbjNLSWFQWFBtYUJIeW1zWCt1UjQyMk1kaGNhU3dBUDQwUktzcUtWcS81CkRodzdHdVZzdFZHNG5GZUZ2dlFuYU1jVm13WUpyellFQWxNRitlSyswM3IyWEFLQUZxQnBEWXBaZlg1Wi9tUEsKVXlxNlIwcEJUaG9adXlwSUhQekwwMkJGazlDbmU3eTBXd1d6L1VleDJSN2toOVJhendNeVVTNlJKYU4wT2hRaQpsTTZyM2lZcnIzVWIxSW1ieE5NQ0F3RUFBYU1qTUNFd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFENHVNaVZpL28zSkVhVi9UZzVKRWhQK2tQZm8KVzBLeUtaT3FNVlZzRVZsM1l2aFdYdGxOaCtwT0ZHSTlPQVFZdE5NKzZDeEJLVm9Xd1NzSUpyYkpZeVR2bGFlYgpHZnJGZWRkL2NkM0N5M2N1UDQ0ZjRPQ3VabTZWckJUVy8wUms3LzVKMHlLTmlSSDVqelRJL0szZGtKWkNERktOCjRGdWZxZ3Y0QTNxdVYwQXJaNFNOV2poVEx2SlM1VVdaOUpxUndyU3NqNlpvenRJRVhiU1d2aWhyS2FGQmtoWWwKRG5KM2N4cFljYXJ0aVZqS1g3SUNQQTJxdmw1azF4ZEMwVldTQWlLdTVFR24zZkFmdkQwN2poeVBub3lkMjVmWApQeDlkaGlzaDgwaFl4Nm9pbHpHdUppMGZDNjgxZ0VRRTQzUGhNRHRCZHNKMTBEejRQYTdrL2QvY3hETT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
    url: https://192.168.254.1:9443/validate-custom-my-crd-com-v1-unit
#    service:
#      name: unit-webhook-service
#      namespace: default
#      path: /validate-custom-my-crd-com-v1-unit
  failurePolicy: Fail
  name: vunit.kb.io
  rules:
  - apiGroups:
    - custom.my.crd.com
    apiVersions:
    - v1
    operations:
    - CREATE
    - UPDATE
    resources:
    - units

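Note: the IP address in url must be that of the local development machine, and it must be reachable from the K8s cluster. The URI path is the value of service.path.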
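The url value is simply the local IP, the webhook port, and the original service.path joined together. A small sketch of that composition follows; the function name webhook_url is hypothetical, and the default port 9443 is taken from the configuration shown above.

```shell
# Build the clientConfig.url value for a webhook served outside the cluster.
# Usage: webhook_url <host-ip> <service-path> [port]
webhook_url() {
  local host="$1" path="$2" port="${3:-9443}"
  # Strip one leading slash from the path so the result has exactly one.
  printf 'https://%s:%s/%s\n' "${host}" "${port}" "${path#/}"
}
```

Calling `webhook_url 192.168.254.1 /mutate-custom-my-crd-com-v1-unit` yields the url used in the MutatingWebhookConfiguration above.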

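With both WebhookConfigurations modified, the next step is to deploy the all_in_one.yaml file. Because the controller will run locally for debugging at this stage, remember to comment out the Deployment resource in all_in_one.yaml (the command below applies a copy named all_in_one.local.yaml).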

[root@vm254011 unit]# kubectl apply -f all_in_one.local.yaml  --validate=false

namespace/unit-system created
customresourcedefinition.apiextensions.k8s.io/units.custom.my.crd.com created
mutatingwebhookconfiguration.admissionregistration.k8s.io/unit-mutating-webhook-configuration created
role.rbac.authorization.k8s.io/unit-leader-election-role created
clusterrole.rbac.authorization.k8s.io/unit-manager-role created
clusterrole.rbac.authorization.k8s.io/unit-proxy-role created
clusterrole.rbac.authorization.k8s.io/unit-metrics-reader created
rolebinding.rbac.authorization.k8s.io/unit-leader-election-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/unit-manager-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/unit-proxy-rolebinding created
service/unit-controller-manager-metrics-service created
service/unit-webhook-service created
validatingwebhookconfiguration.admissionregistration.k8s.io/unit-validating-webhook-configuration created

The CRD, webhook, and RBAC resources on the K8s side are now all in place. The next step is to start the controller locally for debugging.

Starting the controller locally

Before starting, place the certificate and private key prepared above in the expected directory. The default is /tmp/k8s-webhook-server/serving-certs/

 # linux
 # export TMPDIR=/tmp

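 # On macOS the default temp directory is set dynamically through the TMPDIR environment variable instead of being fixed at /tmp, and TMPDIR changes after each system reboot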
 mbp-16in:kubebuilder ywq$ echo $TMPDIR
 /var/folders/7w/x9fbfc6d5mn3942fmdpwy3yr0000gn/T/

 mbp-16in:kubebuilder ywq$ mkdir -pv $TMPDIR/k8s-webhook-server/serving-certs/
 mbp-16in:kubebuilder ywq$ cp unit-key.pem $TMPDIR/k8s-webhook-server/serving-certs/tls.key
 mbp-16in:kubebuilder ywq$ cp unit.crt $TMPDIR/k8s-webhook-server/serving-certs/tls.crt
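Because the directory depends on TMPDIR, a quick preflight check before launching the controller can save a confusing startup failure. This is a minimal sketch; the function name check_serving_certs is hypothetical, and the default path simply mirrors the directory used above.

```shell
# Report whether the webhook serving certificate and key are in place.
# Usage: check_serving_certs [cert-dir]
check_serving_certs() {
  local dir="${1:-${TMPDIR:-/tmp}/k8s-webhook-server/serving-certs}"
  local f missing=0
  for f in tls.crt tls.key; do
    # -s requires the file to exist and be non-empty.
    if [ ! -s "${dir}/${f}" ]; then
      echo "missing: ${dir}/${f}" >&2
      missing=1
    fi
  done
  return "${missing}"
}
```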

Once the certificates are in place, the controller can be started from the IDE:

Now you can happily start debugging.

Deploying the sample

Assuming debugging is finished, we can try deploying a Unit instance.

The default sample is at config/samples/custom_v1_unit.yaml. Its Group, Version, and Kind are already filled in, so you only need to add the spec content, for example sample.yaml:

apiVersion: custom.my.crd.com/v1
kind: Unit
metadata:
  name: unit-sample
spec:
  category: "Deployment"
  template:
    spec:
      containers:
        - image: my.registry.com:5000/nginx
          imagePullPolicy: IfNotPresent
          name: unit-sample
          resources:
            limits:
              cpu: 110m
              memory: 256Mi
            requests:
              cpu: 100m
              memory: 128Mi
  relationResource:
    serviceInfo:
      ports:
      - name: http
        port: 80
        protocol: TCP
        targetPort: 80
    ingressInfo:
      domain:
      - "unit.sample.com"
    pvcInfo:
      spec:
        accessModes:
        - ReadWriteMany
        resources:
          requests:
            storage: 10Gi
        storageClassName: cephfs

mbp-16in:kubebuilder ywq$ kubectl apply -f sample.yaml
unit.custom.my.crd.com/unit-sample created

mbp-16in:kubebuilder ywq$ kubectl get all 
NAME                              READY   STATUS    RESTARTS   AGE
pod/unit-sample-b554d9594-btcck   1/1     Running   0          112s

NAME                                              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/unit-sample                               ClusterIP   10.252.254.85    <none>        80/TCP     6m44s

NAME                          READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/unit-sample   1/1     1            1           6m44s

NAME                                     DESIRED   CURRENT   READY   AGE
replicaset.apps/unit-sample-b554d9594    1         1         1       112s

mbp-16in:kubebuilder ywq$
mbp-16in:kubebuilder ywq$ kubectl get ing
NAME          HOSTS             ADDRESS   PORTS   AGE
unit-sample   unit.sample.com             80      6m56s
mbp-16in:kubebuilder ywq$
## The storageClass and its backing storage service are not deployed yet, so the Pending pvc is expected; ignore it
mbp-16in:kubebuilder ywq$ kubectl get pvc
NAME          STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
unit-sample   Pending                                      cephfs         7m

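As you can see, the short unit-sample yaml file, only a few dozen lines long, gives centralized control over several kinds of resources, which achieves the goal Unit was originally designed for.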

Release

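Once debugging and testing are complete, you can move on to the formal release. With the steps above understood, releasing is fairly simple.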

1. Build and push the docker image

make docker-build docker-push IMG=my.registry.com:5000/unit-controller:tmp

2. make deploy

make deploy IMG=my.registry.com:5000/unit-controller:tmp

3. Modify the deployment

Why modify the deployment? Again because of certificates: the Deployment also needs the certificate at runtime, so package it as a Secret and mount it into the pod.

Create the secret
kubectl create secret generic unit-cert --from-file=./tls.crt --from-file=./tls.key
Modify the Deployment to mount the Secret
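What the mount might look like is sketched below as a shell function that prints a Deployment fragment to merge into all_in_one.yaml by hand. The container name manager and the volume name cert are assumptions based on the default kubebuilder scaffold, and the function name print_cert_mount_fragment is hypothetical; the mount path is the default certificate directory used earlier.

```shell
# Print a Deployment fragment that mounts the unit-cert Secret at the
# directory where the webhook server looks for tls.crt and tls.key.
print_cert_mount_fragment() {
  cat <<'EOF'
spec:
  template:
    spec:
      containers:
      - name: manager            # assumed container name (kubebuilder default)
        volumeMounts:
        - name: cert
          mountPath: /tmp/k8s-webhook-server/serving-certs
          readOnly: true
      volumes:
      - name: cert
        secret:
          secretName: unit-cert  # the Secret created above
EOF
}
```

A Secret can only be mounted by pods in its own namespace, so if the Deployment runs in a namespace other than the current one, create the Secret there with -n.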

Also, don't forget these two points:

  • Add CustomResourceDefinition.spec.preserveUnknownFields: false

  • Set webhooks.clientConfig.caBundle to the CA value

After the changes, clean up the previously applied resources and the sample, then run kubectl apply -f all_in_one.yaml --validate=false again. As you can see, the deployment succeeds!

mbp-16in:~ ywq$ kubectl apply -f all_in_one.yaml  --validate=false
mbp-16in:~ ywq$ kubectl apply -f sample.yaml
mbp-16in:~ ywq$ kubectl get pods
NAME                                       READY   STATUS    RESTARTS   AGE
unit-controller-manager-76dc8bff8f-jw42r   2/2     Running   0          91s
unit-sample-b554d9594-k6md8                1/1     Running   0          47s

Summary

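The official documentation glosses over debugging and deployment, and without guidance it is easy to stumble while figuring things out. The cert manager problem was also a headache and left manual certificate issuance as the only option. That adds some hands-on steps, but it also helps in understanding how the components and resources work together. I hope this post offers some help and guidance to readers. If you later manage to issue a working certificate with cert manager by following the official documentation, please let me know. Thanks~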
