Prometheus-Operator 手动入门实战

prometheus operator 手动入门实战

虽然生产中我们大部分都是通过Helm的方式来安装Prometheus-Operator，但是我们也是需要明白Prometheus Operator的原理，以及其中5个CRD的YAML定义方式，这里我们以手动的方式来安装部署Prometheus Operator以及Prometheus Server、Alertmanager集群，配置我们需要监控的对象。

Prometheus Operator GitHub

本文采用的环境以及版本：

Kubernetes 1.15.5 + Calico v3.6.5
Prometheus-Operator v0.34.0

Prometheus Operator简介

Prometheus Operator 安装后提供如下功能：

很容易的在kubernetes集群中特定的Namespace中创建和销毁prometheus实例；
配置简单：使用kubernetes原生的方式（CRD）对prometheus的版本，持久化，数据保留，实例数进行配置；
通过labels来指定生成对目标监控的配置；

架构图如下：

Operator 监控Kubernetes集群中自定义资源Prometheus、Alertmanager的变化，部署和管理Prometheus Server以及Alertmanager集群，使其状态以及配置达到用户的期望状态，同时监控k8s集群中自定义资源ServiceMonitor（它是用来指定需要收集Metrics的目标的服务），来生成Prometheus的配置信息。

Prometheus Operator vs. kube-prometheus vs. community helm chart 都是什么？

prometheus operator使用kubernetes原生的方式来管理和操作prometheus和alertmanager集群。
kube-prometheus 联合prometheus operator和一些manifests来帮助监控kubernetes集群本身以及跑在kubernetes上面的应用。
stable/prometheus-operator Helm chart 提供简单的功能集来创建kube-prometheus。

前提依赖以及兼容性：

推荐Kubernetes集群版本在1.8以上。

快速安装Kubernetes测试集群：

兼容的prometheus版本：v1.4.0–v2.10.0，参考

兼容的alertmanager版本：>= v0.15，参考

CRD（Custom Resource Definitions 自定义资源）

Prometheus：定义期望的Prometheus实例，同时保证任何时候有期望的Prometheus实例在运行。
ServiceMonitor：通过声明式的方式指定哪些服务需要被监控，它自动生成Prometheus 的scrape配置。
PodMonitor：通过声明式的方式指定哪些pod需要被监控，它自动生成Prometheus 的scrape配置。
PrometheusRule：配置Prometheus rule文件，包括recording rules和alerting，它能够自动被Prometheus加载。
Alertmanager：定义期望的Alertmanager实例，同时保证任何时候有期望的Alertmanager实例在运行，对于指定多台Alertmanager，prometheus operator会自动将它们配置成集群。

安装prometheus-operator

下载prometheus-operator YAML文件：

$ wget -c "https://github.com/coreos/prometheus-operator/raw/master/bundle.yaml"

更改bundel.yaml 文件中的image地址，默认是使用quay.io的镜像，这里修改成quay.azk8s.cn的地址，可以快速的下载镜像，同时更改默认的namespace为prometheus

$ sed -i 's#quay.io#quay.azk8s.cn#g' bundle.yaml
$ sed -i 's#namespace: default#namespace: prometheus#g' bundle.yaml

更改后的显示如下：

containers:
- args:
  - --kubelet-service=kube-system/kubelet
  - --logtostderr=true
  - --config-reloader-image=quay.azk8s.cn/coreos/configmap-reload:v0.0.1
  - --prometheus-config-reloader=quay.azk8s.cn/coreos/prometheus-config-reloader:v0.34.0
  image: quay.azk8s.cn/coreos/prometheus-operator:v0.34.0

安装prometheus-operator 到prometheus namespace中：

$ kubectl create ns prometheus
namespace/prometheus created
$ kubectl get ns
NAME              STATUS   AGE
default           Active   4d6h
kube-node-lease   Active   4d6h
kube-public       Active   4d6h
kube-system       Active   4d6h
prometheus        Active   7s
$ kubectl apply -f bundle.yaml
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator configured
clusterrole.rbac.authorization.k8s.io/prometheus-operator unchanged
deployment.apps/prometheus-operator created
serviceaccount/prometheus-operator created
service/prometheus-operator created

检查安装状态：

$ kubectl -n prometheus get all
NAME                                       READY   STATUS    RESTARTS   AGE
pod/prometheus-operator-5748cc95dd-ssn7s   1/1     Running   0          2m45s


NAME                          TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
service/prometheus-operator   ClusterIP   None         <none>        8080/TCP   2m46s


NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/prometheus-operator   1/1     1            1           2m46s

NAME                                             DESIRED   CURRENT   READY   AGE
replicaset.apps/prometheus-operator-5748cc95dd   1         1         1       2m46s

查看prometheus operator的启动参数，以及命令帮助：

# 进入prometheus operator容器内：
$ kubectl -n prometheus exec -it prometheus-operator-5748cc95dd-ssn7s -- sh

# 查看operator启动参数：
$ ps -ef|more
PID   USER     TIME  COMMAND
    1 nobody    0:04 /bin/operator --kubelet-service=kube-system/kubelet --logtostderr=true --config-reloader-image=quay.azk8s.cn/coreos/configmap-reload:v0.0.1 --prometheus-config-reloader=quay.azk8s.cn/coreos/prometheus-config-reloader:v0.34.0
    
# 查看operator的命令帮助：
$ operator --help
Usage of operator:
  -alertmanager-default-base-image string
    	Alertmanager default base image (default "quay.io/prometheus/alertmanager")
  -alertmanager-instance-namespaces value
    	Namespaces where Alertmanager custom resources and corresponding StatefulSets are watched/created. If set this takes precedence over --namespaces or --deny-namespaces for Alertmanager custom resources.
  -alertmanager-instance-selector string
    	Label selector to filter AlertManager CRDs to manage
  -alsologtostderr
    	log to standard error as well as files
  -apiserver string
    	API Server addr, e.g. ' - NOT RECOMMENDED FOR PRODUCTION - http://127.0.0.1:8080'. Omit parameter to run in on-cluster mode and utilize the service account token.
  -ca-file string
    	- NOT RECOMMENDED FOR PRODUCTION - Path to TLS CA file.
  -cert-file string
    	 - NOT RECOMMENDED FOR PRODUCTION - Path to public TLS certificate file.
  -config-reloader-cpu string
    	Config Reloader CPU. Value "0" disables it and causes no limit to be configured. (default "100m")
  -config-reloader-image string
    	Reload Image (default "quay.io/coreos/configmap-reload:v0.0.1")
  -config-reloader-memory string
    	Config Reloader Memory. Value "0" disables it and causes no limit to be configured. (default "25Mi")
  -crd-kinds value
    	 - EXPERIMENTAL (could be removed in future releases) - customize CRD kind names
  -deny-namespaces value
    	Namespaces not to scope the interaction of the Prometheus Operator (deny list). This is mutually exclusive with --namespaces.
  -key-file string
    	- NOT RECOMMENDED FOR PRODUCTION - Path to private TLS certificate file.
  -kubelet-service string
    	Service/Endpoints object to write kubelets into in format "namespace/name"
  -labels value
    	Labels to be add to all resources created by the operator
  -localhost string
    	EXPERIMENTAL (could be removed in future releases) - Host used to communicate between local services on a pod. Fixes issues where localhost resolves incorrectly. (default "localhost")
  -log-format string
    	Log format to use. Possible values: logfmt, json (default "logfmt")
  -log-level string
    	Log level to use. Possible values: all, debug, info, warn, error, none (default "info")
  -log_backtrace_at value
    	when logging hits line file:N, emit a stack trace
  -log_dir string
    	If non-empty, write log files in this directory
  -log_file string
    	If non-empty, use this log file
  -log_file_max_size uint
    	Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800)
  -logtostderr
    	log to standard error instead of files (default true)
  -manage-crds
    	Manage all CRDs with the Prometheus Operator. (default true)
  -namespaces value
    	Namespaces to scope the interaction of the Prometheus Operator and the apiserver (allow list). This is mutually exclusive with --deny-namespaces.
  -prometheus-config-reloader string
    	Prometheus config reloader image (default "quay.io/coreos/prometheus-config-reloader:v0.34.0")
  -prometheus-default-base-image string
    	Prometheus default base image (default "quay.io/prometheus/prometheus")
  -prometheus-instance-namespaces value
    	Namespaces where Prometheus custom resources and corresponding Secrets, Configmaps and StatefulSets are watched/created. If set this takes precedence over --namespaces or --deny-namespaces for Prometheus custom resources.
  -prometheus-instance-selector string
    	Label selector to filter Prometheus CRDs to manage
  -skip_headers
    	If true, avoid header prefixes in the log messages
  -skip_log_headers
    	If true, avoid headers when opening log files
  -stderrthreshold value
    	logs at or above this threshold go to stderr (default 2)
  -thanos-default-base-image string
    	Thanos default base image (default "quay.io/thanos/thanos")
  -tls-insecure
    	- NOT RECOMMENDED FOR PRODUCTION - Don't verify API server's CA certificate.
  -v value
    	number for the log level verbosity
  -vmodule value
    	comma-separated list of pattern=N settings for file-filtered logging
  -with-validation
    	Include the validation spec in the CRD (default true)

安装Prometheus Operator CRDs ??

安装完prometheus operator 我们还需要手动的安装Prometheus-Operator所需的CRD嘛？

至少GitHub上没有写安装，但是如果不需要手动安装对应的CRD，那么是谁安装的呢？这是我的疑问，相信你们也会有；那么我们先检查一下是否已经有了这些CRD呢？

$ kubectl get crd | grep monitoring.coreos.com
alertmanagers.monitoring.coreos.com           2019-11-16T12:19:01Z
podmonitors.monitoring.coreos.com             2019-11-16T12:19:02Z
prometheuses.monitoring.coreos.com            2019-11-16T12:19:01Z
prometheusrules.monitoring.coreos.com         2019-11-16T12:19:02Z
servicemonitors.monitoring.coreos.com         2019-11-16T12:19:02Z

我们发现这些CRD已经安装上了，那么是谁安装的呢？答案只有一个，那就是Prometheus Operator了，我们来检查一下，首先检查一下安装Operator的yaml文件：bundle.yaml

# 其中的ClusterRole，我们发现它是有权限去安装CRD的，而且对应的权限都是*，也就是可以对需要的资源做任何操作。

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/name: prometheus-operator
    app.kubernetes.io/version: v0.34.0
  name: prometheus-operator
rules:
- apiGroups:
  - apiextensions.k8s.io
  resources:
  - customresourcedefinitions
  verbs:
  - '*'
- apiGroups:
  - monitoring.coreos.com
  resources:
  - alertmanagers
  - prometheuses
  - prometheuses/finalizers
  - alertmanagers/finalizers
  - servicemonitors
  - podmonitors
  - prometheusrules
  verbs:
  - '*'
- apiGroups:
  - apps
  resources:
  - statefulsets
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - configmaps
  - secrets
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - list
  - delete
- apiGroups:
  - ""
  resources:
  - services
  - services/finalizers
  - endpoints
  verbs:
  - get
  - create
  - update
  - delete
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - namespaces
  verbs:
  - get
  - list
  - watch

检查一下Operator的启动日志：

$ kubectl -n prometheus logs prometheus-operator-5748cc95dd-wgkhn 
ts=2019-11-17T01:21:50.319966698Z caller=main.go:199 msg="Starting Prometheus Operator version '0.34.0'."
ts=2019-11-17T01:21:50.324753193Z caller=main.go:96 msg="Staring insecure server on :8080"
level=info ts=2019-11-17T01:21:50.417372629Z caller=operator.go:441 component=prometheusoperator msg="connection established" cluster-version=v1.15.5
level=info ts=2019-11-17T01:21:50.417486093Z caller=operator.go:219 component=alertmanageroperator msg="connection established" cluster-version=v1.15.5
level=info ts=2019-11-17T01:21:51.315610123Z caller=operator.go:641 component=alertmanageroperator msg="CRD updated" crd=Alertmanager
level=info ts=2019-11-17T01:21:51.415575543Z caller=operator.go:1870 component=prometheusoperator msg="CRD updated" crd=Prometheus
level=info ts=2019-11-17T01:21:51.427497667Z caller=operator.go:1870 component=prometheusoperator msg="CRD updated" crd=ServiceMonitor
level=info ts=2019-11-17T01:21:51.439926809Z caller=operator.go:1870 component=prometheusoperator msg="CRD updated" crd=PodMonitor
level=info ts=2019-11-17T01:21:51.449866666Z caller=operator.go:1870 component=prometheusoperator msg="CRD updated" crd=PrometheusRule
level=info ts=2019-11-17T01:21:54.415708201Z caller=operator.go:235 component=alertmanageroperator msg="CRD API endpoints ready"
level=info ts=2019-11-17T01:21:54.616157837Z caller=operator.go:190 component=alertmanageroperator msg="successfully synced all caches"
level=info ts=2019-11-17T01:22:03.528619385Z caller=operator.go:457 component=prometheusoperator msg="CRD API endpoints ready"
level=info ts=2019-11-17T01:22:04.330326124Z caller=operator.go:387 component=prometheusoperator msg="successfully synced all caches"

我们发现有如下几个CRD被创建了：

msg="CRD updated" crd=Alertmanager
msg="CRD updated" crd=Prometheus
msg="CRD updated" crd=ServiceMonitor
msg="CRD updated" crd=PodMonitor
msg="CRD updated" crd=PrometheusRule

正式我们需要的5个。

我们查看一下Kubernetes中的API：

# 方式1
$ kubectl api-versions 
admissionregistration.k8s.io/v1beta1
apiextensions.k8s.io/v1beta1
apiregistration.k8s.io/v1
apiregistration.k8s.io/v1beta1
apps/v1
apps/v1beta1
apps/v1beta2
authentication.k8s.io/v1
authentication.k8s.io/v1beta1
authorization.k8s.io/v1
authorization.k8s.io/v1beta1
autoscaling/v1
autoscaling/v2beta1
autoscaling/v2beta2
batch/v1
batch/v1beta1
certificates.k8s.io/v1beta1
coordination.k8s.io/v1
coordination.k8s.io/v1beta1
crd.projectcalico.org/v1
events.k8s.io/v1beta1
extensions/v1beta1
monitoring.coreos.com/v1
networking.k8s.io/v1
networking.k8s.io/v1beta1
node.k8s.io/v1beta1
policy/v1beta1
rbac.authorization.k8s.io/v1
rbac.authorization.k8s.io/v1beta1
scheduling.k8s.io/v1
scheduling.k8s.io/v1beta1
storage.k8s.io/v1
storage.k8s.io/v1beta1
v1

# 方式2：
$ kubectl get APIService
NAME                                   SERVICE   AVAILABLE   AGE
v1.                                    Local     True        4d20h
v1.apps                                Local     True        4d20h
v1.authentication.k8s.io               Local     True        4d20h
v1.authorization.k8s.io                Local     True        4d20h
v1.autoscaling                         Local     True        4d20h
v1.batch                               Local     True        4d20h
v1.coordination.k8s.io                 Local     True        4d20h
v1.crd.projectcalico.org               Local     True        4d16h
v1.monitoring.coreos.com               Local     True        13h
v1.networking.k8s.io                   Local     True        4d20h
v1.rbac.authorization.k8s.io           Local     True        4d20h
v1.scheduling.k8s.io                   Local     True        4d20h
v1.storage.k8s.io                      Local     True        4d20h
v1beta1.admissionregistration.k8s.io   Local     True        4d20h
v1beta1.apiextensions.k8s.io           Local     True        4d20h
v1beta1.apps                           Local     True        4d20h
v1beta1.authentication.k8s.io          Local     True        4d20h
v1beta1.authorization.k8s.io           Local     True        4d20h
v1beta1.batch                          Local     True        4d20h
v1beta1.certificates.k8s.io            Local     True        4d20h
v1beta1.coordination.k8s.io            Local     True        4d20h
v1beta1.events.k8s.io                  Local     True        4d20h
v1beta1.extensions                     Local     True        4d20h
v1beta1.networking.k8s.io              Local     True        4d20h
v1beta1.node.k8s.io                    Local     True        4d20h
v1beta1.policy                         Local     True        4d20h
v1beta1.rbac.authorization.k8s.io      Local     True        4d20h
v1beta1.scheduling.k8s.io              Local     True        4d20h
v1beta1.storage.k8s.io                 Local     True        4d20h
v1beta2.apps                           Local     True        4d20h
v2beta1.autoscaling                    Local     True        4d20h
v2beta2.autoscaling                    Local     True        4d20h

我们发现有monitoring.coreos.com这个API：v1.monitoring.coreos.com，我们来具体的查看一下：

$ kubectl get --raw /apis/monitoring.coreos.com/v1 | python -m json.tool
{
    "apiVersion": "v1",
    "groupVersion": "monitoring.coreos.com/v1",
    "kind": "APIResourceList",
    "resources": [
        {
            "kind": "PodMonitor",
            "name": "podmonitors",
            "namespaced": true,
            "singularName": "podmonitor",
            "storageVersionHash": "t6BHpUAzPig=",
            "verbs": [
                "delete",
                "deletecollection",
                "get",
                "list",
                "patch",
                "create",
                "update",
                "watch"
            ]
        },
        {
            "kind": "PrometheusRule",
            "name": "prometheusrules",
            "namespaced": true,
            "singularName": "prometheusrule",
            "storageVersionHash": "RSJ8iG+KDOo=",
            "verbs": [
                "delete",
                "deletecollection",
                "get",
                "list",
                "patch",
                "create",
                "update",
                "watch"
            ]
        },
        {
            "kind": "Alertmanager",
            "name": "alertmanagers",
            "namespaced": true,
            "singularName": "alertmanager",
            "storageVersionHash": "NshW3zg1K7o=",
            "verbs": [
                "delete",
                "deletecollection",
                "get",
                "list",
                "patch",
                "create",
                "update",
                "watch"
            ]
        },
        {
            "kind": "ServiceMonitor",
            "name": "servicemonitors",
            "namespaced": true,
            "singularName": "servicemonitor",
            "storageVersionHash": "JLhPcfa+5xE=",
            "verbs": [
                "delete",
                "deletecollection",
                "get",
                "list",
                "patch",
                "create",
                "update",
                "watch"
            ]
        },
        {
            "kind": "Prometheus",
            "name": "prometheuses",
            "namespaced": true,
            "singularName": "prometheus",
            "storageVersionHash": "C8naPY4eojU=",
            "verbs": [
                "delete",
                "deletecollection",
                "get",
                "list",
                "patch",
                "create",
                "update",
                "watch"
            ]
        }
    ]
}

所以呢：在安装完Prometheus-Operator后不需要我们在手动的方式安装所需的CRD了。

来看一下GitHub上对应的CRD目录：

https://github.com/coreos/prometheus-operator/tree/master/example/prometheus-operator-crd

注意：再使用Helm的方式安装Operator的时候，可能需要提前的手动安装一下，可以通过下面的方式直接安装，总共5个CRD：参考

$ kubectl apply -f https://github.com/coreos/prometheus-operator/raw/master/example/prometheus-operator-crd/servicemonitor.crd.yaml
$ kubectl apply -f https://github.com/coreos/prometheus-operator/raw/master/example/prometheus-operator-crd/prometheusrule.crd.yaml
$ kubectl apply -f https://github.com/coreos/prometheus-operator/raw/master/example/prometheus-operator-crd/prometheus.crd.yaml
$ kubectl apply -f https://github.com/coreos/prometheus-operator/raw/master/example/prometheus-operator-crd/podmonitor.crd.yaml
$ kubectl apply -f https://github.com/coreos/prometheus-operator/raw/master/example/prometheus-operator-crd/alertmanager.crd.yaml

Prometheus Operator 以及对应的CRD都已经安装并且成功启动，下面我们来通过kubernetes原生的方式来创建prometheus，alertmanager实例，以及配置监控对象和规则等。

Prometheus Operator 实战

参考：

https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md

创建Prometheus实例：

不知道怎么创建，直接看上面的参考吧，我摘录下来：

Prometheus

Prometheus defines a Prometheus deployment.

Field	Description	Scheme	Required
metadata	Standard object’s metadata. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#metadata	metav1.ObjectMeta	false
spec	Specification of the desired behavior of the Prometheus cluster. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status	PrometheusSpec	true
status	Most recent observed status of the Prometheus cluster. Read-only. Not included when requesting from the apiserver, only from the Prometheus Operator API itself. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status	*PrometheusStatus	false

发现只有PrometheusSpec是必须的，看下面：

PrometheusSpec

PrometheusSpec is a specification of the desired behavior of the Prometheus cluster. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status

Field	Description	Scheme	Required
podMetadata	Standard object’s metadata. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#metadata Metadata Labels and Annotations gets propagated to the prometheus pods.	*metav1.ObjectMeta	false
serviceMonitorSelector	ServiceMonitors to be selected for target discovery.	*metav1.LabelSelector	false
serviceMonitorNamespaceSelector	Namespaces to be selected for ServiceMonitor discovery. If nil, only check own namespace.	*metav1.LabelSelector	false
podMonitorSelector	Experimental PodMonitors to be selected for target discovery.	*metav1.LabelSelector	false
podMonitorNamespaceSelector	Namespaces to be selected for PodMonitor discovery. If nil, only check own namespace.	*metav1.LabelSelector	false
version	Version of Prometheus to be deployed.	string	false
tag	Tag of Prometheus container image to be deployed. Defaults to the value of `version`. Version is ignored if Tag is set.	string	false
sha	SHA of Prometheus container image to be deployed. Defaults to the value of `version`. Similar to a tag, but the SHA explicitly deploys an immutable container image. Version and Tag are ignored if SHA is set.	string	false
paused	When a Prometheus deployment is paused, no actions except for deletion will be performed on the underlying objects.	bool	false
image	Image if specified has precedence over baseImage, tag and sha combinations. Specifying the version is still necessary to ensure the Prometheus Operator knows what version of Prometheus is being configured.	*string	false
baseImage	Base image to use for a Prometheus deployment.	string	false
imagePullSecrets	An optional list of references to secrets in the same namespace to use for pulling prometheus and alertmanager images from registries see http://kubernetes.io/docs/user-guide/images#specifying-imagepullsecrets-on-a-pod	[]v1.LocalObjectReference	false
replicas	Number of instances to deploy for a Prometheus deployment.	*int32	false
replicaExternalLabelName	Name of Prometheus external label used to denote replica name. Defaults to the value of `prometheus_replica`. External label will not be added when value is set to empty string (`\"\"`).	*string	false
prometheusExternalLabelName	Name of Prometheus external label used to denote Prometheus instance name. Defaults to the value of `prometheus`. External label will not be added when value is set to empty string (`\"\"`).	*string	false
retention	Time duration Prometheus shall retain data for. Default is ‘24h’, and must match the regular expression `[0-9]+(ms	s	m
retentionSize	Maximum amount of disk space used by blocks.	string	false
walCompression	Enable compression of the write-ahead log using Snappy. This flag is only available in versions of Prometheus >= 2.11.0.	*bool	false
logLevel	Log level for Prometheus to be configured with.	string	false
logFormat	Log format for Prometheus to be configured with.	string	false
scrapeInterval	Interval between consecutive scrapes.	string	false
evaluationInterval	Interval between consecutive evaluations.	string	false
rules	/–rules.*/ command-line arguments.	Rules	false
externalLabels	The labels to add to any time series or alerts when communicating with external systems (federation, remote storage, Alertmanager).	map[string]string	false
enableAdminAPI	Enable access to prometheus web admin API. Defaults to the value of `false`. WARNING: Enabling the admin APIs enables mutating endpoints, to delete data, shutdown Prometheus, and more. Enabling this should be done with care and the user is advised to add additional authentication authorization via a proxy to ensure only clients authorized to perform these actions can do so. For more information see https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-admin-apis	bool	false
externalUrl	The external URL the Prometheus instances will be available under. This is necessary to generate correct URLs. This is necessary if Prometheus is not served from root of a DNS name.	string	false
routePrefix	The route prefix Prometheus registers HTTP handlers for. This is useful, if using ExternalURL and a proxy is rewriting HTTP routes of a request, and the actual ExternalURL is still true, but the server serves requests under a different route prefix. For example for use with `kubectl proxy`.	string	false
query	QuerySpec defines the query command line flags when starting Prometheus.	*QuerySpec	false
storage	Storage spec to specify how storage shall be used.	*StorageSpec	false
volumes	Volumes allows configuration of additional volumes on the output StatefulSet definition. Volumes specified will be appended to other volumes that are generated as a result of StorageSpec objects.	[]v1.Volume	false
ruleSelector	A selector to select which PrometheusRules to mount for loading alerting rules from. Until (excluding) Prometheus Operator v0.24.0 Prometheus Operator will migrate any legacy rule ConfigMaps to PrometheusRule custom resources selected by RuleSelector. Make sure it does not match any config maps that you do not want to be migrated.	*metav1.LabelSelector	false
ruleNamespaceSelector	Namespaces to be selected for PrometheusRules discovery. If unspecified, only the same namespace as the Prometheus object is in is used.	*metav1.LabelSelector	false
alerting	Define details regarding alerting.	*AlertingSpec	false
resources	Define resources requests and limits for single Pods.	v1.ResourceRequirements	false
nodeSelector	Define which Nodes the Pods are scheduled on.	map[string]string	false
serviceAccountName	ServiceAccountName is the name of the ServiceAccount to use to run the Prometheus Pods.	string	false
secrets	Secrets is a list of Secrets in the same namespace as the Prometheus object, which shall be mounted into the Prometheus Pods. The Secrets are mounted into /etc/prometheus/secrets/.	[]string	false
configMaps	ConfigMaps is a list of ConfigMaps in the same namespace as the Prometheus object, which shall be mounted into the Prometheus Pods. The ConfigMaps are mounted into /etc/prometheus/configmaps/.	[]string	false
affinity	If specified, the pod’s scheduling constraints.	*v1.Affinity	false
tolerations	If specified, the pod’s tolerations.	[]v1.Toleration	false
remoteWrite	If specified, the remote_write spec. This is an experimental feature, it may change in any upcoming release in a breaking way.	[]RemoteWriteSpec	false
remoteRead	If specified, the remote_read spec. This is an experimental feature, it may change in any upcoming release in a breaking way.	[]RemoteReadSpec	false
securityContext	SecurityContext holds pod-level security attributes and common container settings. This defaults to the default PodSecurityContext.	*v1.PodSecurityContext	false
listenLocal	ListenLocal makes the Prometheus server listen on loopback, so that it does not bind against the Pod IP.	bool	false
containers	Containers allows injecting additional containers or modifying operator generated containers. This can be used to allow adding an authentication proxy to a Prometheus pod or to change the behavior of an operator generated container. Containers described here modify an operator generated container if they share the same name and modifications are done via a strategic merge patch. The current container names are: `prometheus`, `prometheus-config-reloader`, `rules-configmap-reloader`, and `thanos-sidecar`. Overriding containers is entirely outside the scope of what the maintainers will support and by doing so, you accept that this behaviour may break at any time without notice.	[]v1.Container	false
initContainers	InitContainers allows adding initContainers to the pod definition. Those can be used to e.g. fetch secrets for injection into the Prometheus configuration from external sources. Any errors during the execution of an initContainer will lead to a restart of the Pod. More info: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/ Using initContainers for any use case other then secret fetching is entirely outside the scope of what the maintainers will support and by doing so, you accept that this behaviour may break at any time without notice.	[]v1.Container	false
additionalScrapeConfigs	AdditionalScrapeConfigs allows specifying a key of a Secret containing additional Prometheus scrape configurations. Scrape configurations specified are appended to the configurations generated by the Prometheus Operator. Job configurations specified must have the form as specified in the official Prometheus documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config. As scrape configs are appended, the user is responsible to make sure it is valid. Note that using this feature may expose the possibility to break upgrades of Prometheus. It is advised to review Prometheus release notes to ensure that no incompatible scrape configs are going to break Prometheus after the upgrade.	*v1.SecretKeySelector	false
additionalAlertRelabelConfigs	AdditionalAlertRelabelConfigs allows specifying a key of a Secret containing additional Prometheus alert relabel configurations. Alert relabel configurations specified are appended to the configurations generated by the Prometheus Operator. Alert relabel configurations specified must have the form as specified in the official Prometheus documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#alert_relabel_configs. As alert relabel configs are appended, the user is responsible to make sure it is valid. Note that using this feature may expose the possibility to break upgrades of Prometheus. It is advised to review Prometheus release notes to ensure that no incompatible alert relabel configs are going to break Prometheus after the upgrade.	*v1.SecretKeySelector	false
additionalAlertManagerConfigs	AdditionalAlertManagerConfigs allows specifying a key of a Secret containing additional Prometheus AlertManager configurations. AlertManager configurations specified are appended to the configurations generated by the Prometheus Operator. Job configurations specified must have the form as specified in the official Prometheus documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#alertmanager_config. As AlertManager configs are appended, the user is responsible to make sure it is valid. Note that using this feature may expose the possibility to break upgrades of Prometheus. It is advised to review Prometheus release notes to ensure that no incompatible AlertManager configs are going to break Prometheus after the upgrade.	*v1.SecretKeySelector	false
apiserverConfig	APIServerConfig allows specifying a host and auth methods to access apiserver. If left empty, Prometheus is assumed to run inside of the cluster and will discover API servers automatically and use the pod’s CA certificate and bearer token file at /var/run/secrets/kubernetes.io/serviceaccount/.	*APIServerConfig	false
thanos	Thanos configuration allows configuring various aspects of a Prometheus server in a Thanos environment.\n\nThis section is experimental, it may change significantly without deprecation notice in any release.\n\nThis is experimental and may change significantly without backward compatibility in any release.	*ThanosSpec	false
priorityClassName	Priority class assigned to the Pods	string	false
portName	Port name used for the pods and governing service. This defaults to web	string	false
arbitraryFSAccessThroughSMs	ArbitraryFSAccessThroughSMs configures whether configuration based on a service monitor can access arbitrary files on the file system of the Prometheus container e.g. bearer token files.	ArbitraryFSAccessThroughSMsConfig	false
overrideHonorLabels	OverrideHonorLabels if set to true overrides all user configured honor_labels. If HonorLabels is set in ServiceMonitor or PodMonitor to true, this overrides honor_labels to false.	bool	false
overrideHonorTimestamps	OverrideHonorTimestamps allows to globally enforce honoring timestamps in all scrape configs.	bool	false
ignoreNamespaceSelectors	IgnoreNamespaceSelectors if set to true will ignore NamespaceSelector settings from the podmonitor and servicemonitor configs, and they will only discover endpoints within their current namespace. Defaults to false.	bool	false
enforcedNamespaceLabel	EnforcedNamespaceLabel enforces adding a namespace label of origin for each alert and metric that is user created. The label value will always be the namespace of the object that is being created.	string	false

我们发现也没有必须要填的，那就先创建一个默认的吧：prometheus.yaml

kind: Prometheus
apiVersion: monitoring.coreos.com/v1
metadata:
  name: prometheus
  namespace: prometheus
spec: {}

# appley一下：
$ kubectl apply -f prometheus.yaml

查看一下：

$ kubectl -n prometheus get all
NAME                                       READY   STATUS              RESTARTS   AGE
pod/prometheus-operator-5748cc95dd-wgkhn   1/1     Running             0          31h
pod/prometheus-prometheus-0                0/3     ContainerCreating   0          4m11s


NAME                          TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
service/prometheus-operated   ClusterIP   None         <none>        9090/TCP   4m11s
service/prometheus-operator   ClusterIP   None         <none>        8080/TCP   31h


NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/prometheus-operator   1/1     1            1           31h

NAME                                             DESIRED   CURRENT   READY   AGE
replicaset.apps/prometheus-operator-5748cc95dd   1         1         1       31h

NAME                                     READY   AGE
statefulset.apps/prometheus-prometheus   0/1     4m11s

# 我们看到operator 创建了一个statefulSet 为prometheus，查看一下对应的pod
$ kubectl -n prometheus get po
NAME                                   READY   STATUS              RESTARTS   AGE
prometheus-operator-5748cc95dd-wgkhn   1/1     Running             0          31h
prometheus-prometheus-0                0/3     ContainerCreating   0          3m11s

# 查看一下描述信息
$ kubectl -n prometheus describe po prometheus-prometheus-0
...
Events:
  Type    Reason     Age   From                          Message
  ----    ------     ----  ----                          -------
  Normal  Scheduled  56s   default-scheduler             Successfully assigned prometheus/prometheus-prometheus-0 to k8s03.test.awsbj.cn
  Normal  Pulling    55s   kubelet, k8s03.test.awsbj.cn  Pulling image "quay.io/prometheus/prometheus:v2.7.1"

# 还在拉取镜像，而且版本还是v2.7.1的，截止到目前官方最新版本是：v2.14.0，operator兼容的最高prometheus版本是：v2.10.0

# 稍等片刻即可：
$ kubectl -n prometheus get po
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-operator-5748cc95dd-wgkhn   1/1     Running   0          32h
prometheus-prometheus-0                3/3     Running   1          53m

我们创建prometheus实例的service：

$ cat prometheus-svc.yaml 
kind: Service
apiVersion: v1
metadata:
  name: prometheus
  namespace: prometheus
spec:
  selector:
    app: prometheus
    prometheus: prometheus
  ports:
  - name: http
    port: 9090
    targetPort: 9090
    protocol: TCP
  type: NodePort

# appley一下：
$ kubectl apply -f prometheus-svc.yaml 

# 查看具体的svc的nodeport
$ kubectl -n prometheus get svc
NAME                  TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
prometheus            NodePort    10.100.56.198   <none>        9090:32766/TCP   4s
prometheus-operated   ClusterIP   None            <none>        9090/TCP         65m
prometheus-operator   ClusterIP   None            <none>        8080/TCP         32h

我们访问测试一下，看看具体的配置文件吧：

指定prometheus server的版本

默认创建的prometheus server版本太低了，我们更换成operator兼容列表中最高的prometheus server版本：

我们注意到在上面的：PrometheusSpec跟版本有关的总共有三个：

Field	Description	Scheme	Required
version	Version of Prometheus to be deployed. 部署的prometheus server版本	string	false
tag	Tag of Prometheus container image to be deployed. Defaults to the value of `version`. Version is ignored if Tag is set. 部署的prometheus server版本，优先级高于version	string	false
image	Image if specified has precedence over baseImage, tag and sha combinations. Specifying the version is still necessary to ensure the Prometheus Operator knows what version of Prometheus is being configured. 当指定image时优先级高于baseImage，tag 和sha。但是仍然需要指定version来告诉operator是哪个版本的prometheus需要被配置	*string	false

所以呢：这里我们使用image+version的方式：

$ cat prometheus-image.yaml 
kind: Prometheus
apiVersion: monitoring.coreos.com/v1
metadata:
  name: prometheus
  namespace: prometheus
spec:
  version: v2.10.0
  image: quay.azk8s.cn/prometheus/prometheus:v2.10.0

# 应用
$ kubectl apply -f prometheus-image.yaml

注意：image中需要指定prometheus的版本v2.10.0*，否则就会拉取latest的版本;

这里的version只是告诉operator哪个prometheus版本需要被配置的，对于使用哪个版本是由image指定的。

version中的版本要和image中的版本一致，否则可能会出现未知的错误。

apply后，operator会根据我们定义的声明式yaml文件重新部署prometheus server，稍等片刻我们来验证一下：

$ kubectl -n prometheus get po prometheus-prometheus-0 -o yaml --export | grep image
Flag --export has been deprecated, This flag is deprecated and will be removed in future.
    image: quay.azk8s.cn/prometheus/prometheus:v2.10.0
    imagePullPolicy: IfNotPresent
    image: quay.azk8s.cn/coreos/prometheus-config-reloader:v0.34.0
    imagePullPolicy: IfNotPresent
    image: quay.azk8s.cn/coreos/configmap-reload:v0.0.1
    imagePullPolicy: IfNotPresent

我们发现连基础的镜像都是用了我们配置的image的镜像源：quay.azk8s.cn，所以呢，如果使用了自建的docker registry也需要提前拉取好这些镜像。

因为我们使用的Azure的镜像，它就是quay的镜像，所以默认都已经存在了，所以不需要手动再去拉取的。

所以推荐使用Azure中国镜像，使用方式请参考：docker.io gcr.io k8s.gcr.io quay.io 中国区加速

我们继续，我们看到prometheus-prometheus-0 这个statefulSet创建的pod中，另外两个关于重载prometheus配置的pod都已经配置了资源限制，只有prometheus server这个没有配置，显然不符合生产要求。

指定prometheus server的资源限制：

从上面的PrometheusSpec中我们看到有resource字段，内用和原声的kubernetes resource是一样的。

$ cat prometheus-image-resource.yaml 
kind: Prometheus
apiVersion: monitoring.coreos.com/v1
metadata:
  name: prometheus
  namespace: prometheus
spec:
  version: v2.10.0
  image: quay.azk8s.cn/prometheus/prometheus:v2.10.0
  resources:
    requests:
      memory: 400Mi
      cpu: 200m
    limits:
      memory: 512Mi
      cpu: 500m

# 应用一下：
$ kubectl apply -f prometheus-image-resource.yaml

检查一下：

$ kubectl -n prometheus get po prometheus-prometheus-0 -o yaml --export
...
spec:
  containers:
  - args:
    - --web.console.templates=/etc/prometheus/consoles
    - --web.console.libraries=/etc/prometheus/console_libraries
    - --config.file=/etc/prometheus/config_out/prometheus.env.yaml
    - --storage.tsdb.path=/prometheus
    - --storage.tsdb.retention.time=24h
    - --web.enable-lifecycle
    - --storage.tsdb.no-lockfile
    - --web.route-prefix=/
    image: quay.azk8s.cn/prometheus/prometheus:v2.10.0
    imagePullPolicy: IfNotPresent
  ...
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
      requests:
        cpu: 200m
        memory: 400Mi
...

配置prometheus server的存储以及数据保留期限

查看具体的api：

Field	Description	Scheme	Required
retention	Time duration Prometheus shall retain data for. Default is ‘24h’, and must match the regular expression `[0-9]+(ms	s	m
storage	Storage spec to specify how storage shall be used.	*StorageSpec	false

StorageSpec

StorageSpec defines the configured storage for a group Prometheus servers. If neither emptyDir nor volumeClaimTemplate is specified, then by default an EmptyDir will be used.

Field	Description	Scheme	Required
emptyDir	EmptyDirVolumeSource to be used by the Prometheus StatefulSets. If specified, used in place of any volumeClaimTemplate. More info: https://kubernetes.io/docs/concepts/storage/volumes/#emptydir	*v1.EmptyDirVolumeSource	false
volumeClaimTemplate	A PVC spec to be used by the Prometheus StatefulSets.	v1.PersistentVolumeClaim	false

volumeClaimTemplate 中和kubernetes statefulSet 中定义的是一样的。

参考

$ cat prometheus.yaml 
kind: Prometheus
apiVersion: monitoring.coreos.com/v1
metadata:
  name: prometheus
  namespace: prometheus
spec:
  version: v2.10.0
  image: quay.azk8s.cn/prometheus/prometheus:v2.10.0
  retention: 1d # 保留1天
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: ssd
        resources:
          requests:
            storage: 40Gi
  resources:
    requests:
      memory: 400Mi
      cpu: 200m
    limits:
      memory: 512Mi
      cpu: 500m

# 准备storageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ssd
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2

由于我这里还没有storageClass的环境，所以不再测试，后期搭建一个Rook-Ceph测试一下。

配置prometheus server ingress，并配置basicAuth 用户名密码认证

Prometheus Operator参考

Nginx Ingress Basic Auth参考

由于默认的prometheus dashboard没有认证的，对于生产来说公网访问不安全，所以需要加上认证。

安装htpasswd命令：

$ sudo yum install httpd-tools -y

# 查看一下htpasswd命令帮助：
$ htpasswd --help
Usage:
4htpasswd [-cimBdpsDv] [-C cost] passwordfile username
4htpasswd -b[cmBdpsDv] [-C cost] passwordfile username password

4htpasswd -n[imBdps] [-C cost] username
4htpasswd -nb[mBdps] [-C cost] username password
 -c  Create a new file.
 -n  Don't update file; display results on stdout.
 -b  Use the password from the command line rather than prompting for it.
 -i  Read password from stdin without verification (for script usage).
 -m  Force MD5 encryption of the password (default).
 -B  Force bcrypt encryption of the password (very secure).
 -C  Set the computing time used for the bcrypt algorithm
     (higher is more secure but slower, default: 5, valid: 4 to 17).
 -d  Force CRYPT encryption of the password (8 chars max, insecure).
 -s  Force SHA encryption of the password (insecure).
 -p  Do not encrypt the password (plaintext, insecure).
 -D  Delete the specified user.
 -v  Verify password for the specified user.
On other systems than Windows and NetWare the '-p' flag will probably not work.
The SHA algorithm does not use a salt and is less secure than the MD5 algorithm.

创建nginx ingress所需的密码文件：

# auth 为创建的文件名，prometheus为用户名，然后输入两次一样的密码
$ htpasswd -c auth prometheus
New password: xxx
Re-type new password: xxx
Adding password for user prometheus

$ cat auth 
prometheus:$apr1$JhroPTnp$dzelC968rK0QkCnq1NnrY1

注意：文件名必须为auth，否则报503错误，用户名无所谓。

创建对应的kubernetes secret：

$ $ kubectl -n prometheus create secret generic basic-auth --from-file=auth

注意：secret必须要和ingress在同一个namespace，注意是ingress yaml文件的资源所在的namespace，不是nginx-ingress-controller所在的namespace。否则也会报503错误。

我这里ingress是准备在prometheus这个namespace下的，所以这个namespace也是在prometheus这个namespace。

创建prometheus的ingress，在prometheus这个namespace：我这里没有证书，所以把tls的部分注释掉了。

$ cat prometheus-ingress.yaml 
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus
  namespace: prometheus
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/auth-type: basic
    nginx.ingress.kubernetes.io/auth-secret: basic-auth
    nginx.ingress.kubernetes.io/auth-realm: "Authentication Required Prometheus"
spec:
  #tls:
  #  - hosts:
  #    - prometheus.test.aws.test.com
  #    secretName: prometheus.test.aws.test.com
  rules:
  - host: prometheus.test.aws.test.com
    http:
      paths:
      - path: /
        backend:
          serviceName: prometheus
          servicePort: http

配置好域名解析访问就会提示输入我们设定的用户名密码。

配置ingress的prometheus path路径为 /prometheus

有的朋友可能会问到：

这里是直接把域名的/根路径给了prometheus，后期的alertmanager还得需要域名，不太好，建议用prometheus.test.aws.test.com/promethues 访问prometheus，prometheus.test.aws.test.com/alertmanager来访问alertmanager。

要实现这种：配置如下：

首先看一下prometheus的参数：

--web.external-url=<URL>   The URL under which Prometheus is externally reachable (for example, if Prometheus is served via a reverse proxy). Used for generating relative and absolute links back to Prometheus itself. If the URL has a path portion, it will be used to prefix all HTTP endpoints served by Prometheus. If omitted, relevant URL components will be derived automatically.

--web.route-prefix=<path>  Prefix for the internal routes of web endpoints. Defaults to path of --web.external-url.

直接在非docker环境启动测试OK，如下：

./prometheus --config.file=prometheus.yml --web.enable-lifecycle --web.enable-admin-api --web.route-prefix="/prometheus" --web.external-url="http://prometheus.test.aws.test.com/prometheus"

测试：发现都加上了我们的 /prometheus 这个path：

$ curl http://prometheus.test.aws.test.com
<a href="/prometheus">Found</a>.

$ curl http://prometheus.test.aws.test.com/prometheus
<a href="/prometheus/">Moved Permanently</a>.

$ curl http://prometheus.test.aws.test.com/prometheus -L
<a href="/prometheus/graph">Found</a>.

那么k8s环境中prometheus-operator如何配置呢？

查看一下：PrometheusSpec

Field	Description	Scheme	Required
externalUrl	The external URL the Prometheus instances will be available under. This is necessary to generate correct URLs. This is necessary if Prometheus is not served from root of a DNS name.	string	false
routePrefix	The route prefix Prometheus registers HTTP handlers for. This is useful, if using ExternalURL and a proxy is rewriting HTTP routes of a request, and the actual ExternalURL is still true, but the server serves requests under a different route prefix. For example for use with `kubectl proxy`.	string	false

$ cat prometheus-path.yaml 
kind: Prometheus
apiVersion: monitoring.coreos.com/v1
metadata:
  name: prometheus
  namespace: prometheus
spec:
  version: v2.10.0
  image: quay.azk8s.cn/prometheus/prometheus:v2.10.0
  retention: 1d
  externalUrl: http://prometheus.test.aws.microoak.cn/prometheus
  routePrefix: /prometheus
  resources:
    requests:
      memory: 400Mi
      cpu: 200m
    limits:
      memory: 512Mi
      cpu: 500m

# appley 一下
$ kubectl apply -f prometheus-path.yaml

# 记得把prometheus ingress中的path更换成
path: /prometheus

测试一下吧：

# curl -L 表示跟谁301/302.返回正常
$ curl -L http://prometheus.test.aws.test.com/prometheus -uprometheus -p 
Enter host password for user 'prometheus':
<!DOCTYPE html>
<html lang="en">
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
        <meta name="robots" content="noindex,nofollow">
        <title>Prometheus Time Series Collection and Processing Server</title>
        <link rel="shortcut icon" href="/prometheus/static/img/favicon.ico?v=d20e84d0fb64aff2f62a977adc8cfb656da4e286">
        <script src="/prometheus/static/vendor/js/jquery-3.3.1.min.js?v=d20e84d0fb64aff2f62a977adc8cfb656da4e286"></script>    
        <script src="/prometheus/static/vendor/js/popper.min.js?v=d20e84d0fb64aff2f62a977adc8cfb656da4e286"></script>
        <script src="/prometheus/static/vendor/bootstrap-4.3.1/js/bootstrap.min.js?v=d20e84d0fb64aff2f62a977adc8cfb656da4e286"></script>

注意：如果你使用的是Helm安装prometheus服务的，或者在k8s上手动安装prometheus，而不是通过prometheus-operator的，需要更改一些参数。

prometheus-operator只需要配置这两个参数即可，它自动的把下面的都配置好了。

需要更改如下：

readinessProbe路径为：/prometheus/-/ready
livenessProbe路径为：/prometheus/-/healthy
prometheus sidecar用于发现配置更改重载prometheus的服务的参数：- –webhook-url=http://127.0.0.1:9090/prometheus/-/reload
如果你的prometheus也收集了本身，需要配置收集的路径为：/prometheus/metircs

好了，关于operator的Prometheus资源配置就到这里，如果有其他的配置，建议参考：PrometheusSpec

ServiceMonitor 配置

参考

我们首先部署一下我们要搜集的目标服务：注意是在default名称空间下：

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: example-app
        image: fabxc/instrumented_app
        ports:
        - name: web
          containerPort: 8080
---
kind: Service
apiVersion: v1
metadata:
  name: example-app
  labels:
    app: example-app
spec:
  selector:
    app: example-app
  ports:
  - name: web
    port: 8080

检查这个服务：

$ kubectl get svc,po
NAME                  TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/example-app   ClusterIP   10.100.151.33   <none>        8080/TCP   45s
service/kubernetes    ClusterIP   10.100.0.1      <none>        443/TCP    7d2h
service/nginx         ClusterIP   10.100.54.82    <none>        80/TCP     5d23h

NAME                               READY   STATUS    RESTARTS   AGE
pod/example-app-66db748757-lmtsk   1/1     Running   0          45s
pod/example-app-66db748757-pzc6w   1/1     Running   0          45s
pod/example-app-66db748757-tchgl   1/1     Running   0          45s
pod/nginx-9cb7f8c7d-wk89s          1/1     Running   0          5d23h

# 查看其metrics：
$ curl 10.100.151.33:8080/metrics
# HELP codelab_api_http_requests_in_progress The current number of API HTTP requests in progress.
# TYPE codelab_api_http_requests_in_progress gauge
codelab_api_http_requests_in_progress 1
# HELP codelab_api_request_duration_seconds A histogram of the API HTTP request durations in seconds.
# TYPE codelab_api_request_duration_seconds histogram
codelab_api_request_duration_seconds_bucket{method="GET",path="/api/bar",status="200",le="0.0001"} 0
codelab_api_request_duration_seconds_bucket{method="GET",path="/api/bar",status="200",le="0.00015000000000000001"} 0
codelab_api_request_duration_seconds_bucket{method="GET",path="/api/bar",status="200",le="0.00022500000000000002"} 0
codelab_api_request_duration_seconds_bucket{method="GET",path="/api/bar",status="200",le="0.0003375"} 0
codelab_api_request_duration_seconds_bucket{method="GET",path="/api/bar",status="200",le="0.00050625"} 0
codelab_api_request_duration_seconds_bucket{method="GET",path="/api/bar",status="200",le="0.000759375"} 0
codelab_api_request_duration_seconds_bucket{method="GET",path="/api/bar",status="200",le="0.0011390624999999999"} 0
codelab_api_request_duration_seconds_bucket{method="GET",path="/api/bar",status="200",le="0.0017085937499999998"} 0
codelab_api_request_duration_seconds_bucket{method="GET",path="/api/bar",status="200",le="0.0025628906249999996"} 0
codelab_api_request_duration_seconds_bucket{method="GET",path="/api/bar",status="200",le="0.0038443359374999994"} 0
codelab_api_request_duration_seconds_bucket{method="GET",path="/api/bar",status="200",le="0.00576650390625"} 0

好了，让我们配置ServiceMonitor，来让prometheus搜集吧。

老规矩，先看ServiceMonitor的api吧，我也不知道怎么配置。

ServiceMonitor

ServiceMonitor defines monitoring for a set of services.

Field	Description	Scheme	Required
metadata	Standard object’s metadata. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#metadata	metav1.ObjectMeta	false
spec	Specification of desired Service selection for target discrovery by Prometheus.	ServiceMonitorSpec	true

ServiceMonitorSpec

ServiceMonitorSpec contains specification parameters for a ServiceMonitor.

Field	Description	Scheme	Required
jobLabel	The label to use to retrieve the job name from.	string	false
targetLabels	TargetLabels transfers labels on the Kubernetes Service onto the target.	[]string	false
podTargetLabels	PodTargetLabels transfers labels on the Kubernetes Pod onto the target.	[]string	false
endpoints	A list of endpoints allowed as part of this ServiceMonitor.	[]Endpoint	true
selector	Selector to select Endpoints objects.	metav1.LabelSelector	true
namespaceSelector	Selector to select which namespaces the Endpoints objects are discovered from.	NamespaceSelector	false
sampleLimit	SampleLimit defines per-scrape limit on number of scraped samples that will be accepted.	uint64	false

Endpoint

Endpoint defines a scrapeable endpoint serving Prometheus metrics.

Field	Description	Scheme	Required
port	Name of the service port this endpoint refers to. Mutually exclusive with targetPort.	string	false
targetPort	Name or number of the target port of the endpoint. Mutually exclusive with port.	*intstr.IntOrString	false
path	HTTP path to scrape for metrics.	string	false
scheme	HTTP scheme to use for scraping.	string	false
params	Optional HTTP URL parameters	map[string][]string	false
interval	Interval at which metrics should be scraped	string	false
scrapeTimeout	Timeout after which the scrape is ended	string	false
tlsConfig	TLS configuration to use when scraping the endpoint	*TLSConfig	false
bearerTokenFile	File to read bearer token for scraping targets.	string	false
bearerTokenSecret	Secret to mount to read bearer token for scraping targets. The secret needs to be in the same namespace as the service monitor and accessible by the Prometheus Operator.	v1.SecretKeySelector	false
honorLabels	HonorLabels chooses the metric’s labels on collisions with target labels.	bool	false
honorTimestamps	HonorTimestamps controls whether Prometheus respects the timestamps present in scraped data.	*bool	false
basicAuth	BasicAuth allow an endpoint to authenticate over basic authentication More info: https://prometheus.io/docs/operating/configuration/#endpoints	*BasicAuth	false
metricRelabelings	MetricRelabelConfigs to apply to samples before ingestion.	[]*RelabelConfig	false
relabelings	RelabelConfigs to apply to samples before scraping. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config	[]*RelabelConfig	false
proxyUrl	ProxyURL eg http://proxyserver:2195 Directs scrapes to proxy through this endpoint.	*string	false

NamespaceSelector

NamespaceSelector is a selector for selecting either all namespaces or a list of namespaces.

Field	Description	Scheme	Required
any	Boolean describing whether all namespaces are selected in contrast to a list restricting them.	bool	false
matchNames	List of namespace names.	[]string	false

有两个必须要提供的：

ServiceMonitor.spec.endpoints

ServiceMonitor.spec.selector

我们来定义serviceMonitor：

$ cat example-app-ServiceMonitor.yaml 
kind: ServiceMonitor
apiVersion: monitoring.coreos.com/v1
metadata:
  name: example-app
  namespace: prometheus # 我把ServiceMonitor资源放到了prometheus 空间下了
  labels: # 需要配置自身的标签，以便prometheus能够通过标签找到自己
    team: frontend
spec:
  namespaceSelector: # 而目标的service在default 空间下，需要指定；如果不指定那么就默认找自己所在的空间
    matchNames:
    - default
  selector:
    matchLabels:
      app: example-app # 需要搜集对象service的labels，通过service的lables找到server对应的endpoint。
  endpoints:
  - port: web # 需要搜集对象service中对应的端口名称

# 应用一下：
$ kubectl apply -f example-app-ServiceMonitor.yaml

如果希望ServiceMonitor可以关联任何空间下的标签，可以通过下面的方式定义：

spec:
  namespaceSelector:
    any: true

如果监控的目标对应中有basicAuth认证的话，需要配置如下：

kind: ServiceMonitor
apiVersion: monitoring.coreos.com/v1
metadata:
  name: example-app
  namespace: prometheus
  labels:
    team: frontend
spec:
  namespaceSelector:
    matchNames:
    - default
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: web
    basicAuth:
      password:
        name: basic-auth
        key: password
      username:
        name: basic-auth
        key: username

# basicAuth 关联了目标对象的secret，需要手动创建
apiVersion: v1
kind: Secret
metadata:
  name: basic-auth
data:
  password: cHJvbWV0aGV1c0AxMjM=
  username: cHJvbWV0aGV1cw==
type: Opaque

检查一下：

$ kubectl -n prometheus get servicemonitors.monitoring.coreos.com 
NAME          AGE
example-app   7m58s

OK了，刷新一下prometheus dashboard看一下吧。呀，啥也没有呢，没生效？

其实是这样的：

ServiceMonitor定义了目标收集相关的，比如端口啊，哪个service啊等，但是还没有指定是由哪个prometheus去搜集呢？

Prometheus,Alertmanager,ServiceMonitor等自定义的资源在一个k8s集群中可能都有多个，比如定义了两台prometheus server，并且定义了多个ServiceMonitor，那么哪个prometheus通过哪个ServiceMonitor去搜集呢？是不是？

所以还需要更改之前定义的Prometheus自定义资源，因为prometheus server是通过kubernetes_sd_configs自动发现的，所以需要给prometheus server权限去发现：所以呢，首先定义prometheus server所需的RBAC：

参考

$ wget -c "https://github.com/coreos/prometheus-operator/raw/master/example/rbac/prometheus/prometheus-cluster-role-binding.yaml"
$ wget -c "https://github.com/coreos/prometheus-operator/raw/master/example/rbac/prometheus/prometheus-cluster-role.yaml"
$ wget -c "https://github.com/coreos/prometheus-operator/raw/master/example/rbac/prometheus/prometheus-service-account.yaml"

# 因为我把prometheus安装到了prometheus空间中，所以需要修改上面的namespace：这两个文件需要修改
prometheus-service-account.yaml 和 prometheus-cluster-role-binding.yaml

# 然后appley一下：
$ kubectl apply -f ./

serviceAccount 配置OK了，prometheus怎么用呢？以及怎么选择对应的serviceMonitor呢？当然是通过强大的labels选择器了啊：

$ cat prometheus-path.yaml 
kind: Prometheus
apiVersion: monitoring.coreos.com/v1
metadata:
  name: prometheus
  namespace: prometheus
spec:
  version: v2.10.0
  image: quay.azk8s.cn/prometheus/prometheus:v2.10.0
  retention: 1d
  externalUrl: http://prometheus.test.aws.microoak.cn/prometheus
  routePrefix: /prometheus
  serviceAccountName: prometheus # 指定需要serviceAccount
  serviceMonitorSelector: # 通过label找到具体的serviceMonitor
    matchLabels:
      team: frontend
  resources:
    requests:
      memory: 400Mi
      cpu: 200m
    limits:
      memory: 512Mi
      cpu: 500m

# 再次应用一下：
$ kubectl apply -f prometheus-path.yaml

好了，再次刷新一下prometheus dashboard：

target：

UI 上查看具体的prometheus 配置：

global:
  scrape_interval: 30s
  scrape_timeout: 10s
  evaluation_interval: 30s
  external_labels:
    prometheus: prometheus/prometheus
    prometheus_replica: prometheus-prometheus-0
alerting:
  alert_relabel_configs:
  - separator: ;
    regex: prometheus_replica
    replacement: $1
    action: labeldrop
rule_files:
- /etc/prometheus/rules/prometheus-prometheus-rulefiles-0/*.yaml
scrape_configs:
- job_name: prometheus/example-app/0
  honor_timestamps: true
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - default
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_app]
    separator: ;
    regex: example-app
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: web
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Node;(.*)
    target_label: node
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_endpoint_address_target_kind, __meta_kubernetes_endpoint_address_target_name]
    separator: ;
    regex: Pod;(.*)
    target_label: pod
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: web
    action: replace

command-line flags 截图：

创建Alertmanager实例：

老规矩先看其api：

Alertmanager

Alertmanager describes an Alertmanager cluster.

Field	Description	Scheme	Required
metadata	Standard object’s metadata. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#metadata	metav1.ObjectMeta	false
spec	Specification of the desired behavior of the Alertmanager cluster. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status	AlertmanagerSpec	true
status	Most recent observed status of the Alertmanager cluster. Read-only. Not included when requesting from the apiserver, only from the Prometheus Operator API itself. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status	*AlertmanagerStatus	false

AlertmanagerSpec

AlertmanagerSpec is a specification of the desired behavior of the Alertmanager cluster. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status

Field	Description	Scheme	Required
podMetadata	Standard object’s metadata. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#metadata Metadata Labels and Annotations gets propagated to the prometheus pods.	*metav1.ObjectMeta	false
image	Image if specified has precedence over baseImage, tag and sha combinations. Specifying the version is still necessary to ensure the Prometheus Operator knows what version of Alertmanager is being configured.	*string	false
version	Version the cluster should be on.	string	false
tag	Tag of Alertmanager container image to be deployed. Defaults to the value of `version`. Version is ignored if Tag is set.	string	false
sha	SHA of Alertmanager container image to be deployed. Defaults to the value of `version`. Similar to a tag, but the SHA explicitly deploys an immutable container image. Version and Tag are ignored if SHA is set.	string	false
baseImage	Base image that is used to deploy pods, without tag.	string	false
imagePullSecrets	An optional list of references to secrets in the same namespace to use for pulling prometheus and alertmanager images from registries see http://kubernetes.io/docs/user-guide/images#specifying-imagepullsecrets-on-a-pod	[]v1.LocalObjectReference	false
secrets	Secrets is a list of Secrets in the same namespace as the Alertmanager object, which shall be mounted into the Alertmanager Pods. The Secrets are mounted into /etc/alertmanager/secrets/.	[]string	false
configMaps	ConfigMaps is a list of ConfigMaps in the same namespace as the Alertmanager object, which shall be mounted into the Alertmanager Pods. The ConfigMaps are mounted into /etc/alertmanager/configmaps/.	[]string	false
configSecret	ConfigSecret is the name of a Kubernetes Secret in the same namespace as the Alertmanager object, which contains configuration for this Alertmanager instance. Defaults to ‘alertmanager-‘ The secret is mounted into /etc/alertmanager/config.	string	false
logLevel	Log level for Alertmanager to be configured with.	string	false
logFormat	Log format for Alertmanager to be configured with.	string	false
replicas	Size is the expected size of the alertmanager cluster. The controller will eventually make the size of the running cluster equal to the expected size.	*int32	false
retention	Time duration Alertmanager shall retain data for. Default is ‘120h’, and must match the regular expression `[0-9]+(ms	s	m
storage	Storage is the definition of how storage will be used by the Alertmanager instances.	*StorageSpec	false
volumes	Volumes allows configuration of additional volumes on the output StatefulSet definition. Volumes specified will be appended to other volumes that are generated as a result of StorageSpec objects.	[]v1.Volume	false
volumeMounts	VolumeMounts allows configuration of additional VolumeMounts on the output StatefulSet definition. VolumeMounts specified will be appended to other VolumeMounts in the alertmanager container, that are generated as a result of StorageSpec objects.	[]v1.VolumeMount	false
externalUrl	The external URL the Alertmanager instances will be available under. This is necessary to generate correct URLs. This is necessary if Alertmanager is not served from root of a DNS name.	string	false
routePrefix	The route prefix Alertmanager registers HTTP handlers for. This is useful, if using ExternalURL and a proxy is rewriting HTTP routes of a request, and the actual ExternalURL is still true, but the server serves requests under a different route prefix. For example for use with `kubectl proxy`.	string	false
paused	If set to true all actions on the underlaying managed objects are not goint to be performed, except for delete actions.	bool	false
nodeSelector	Define which Nodes the Pods are scheduled on.	map[string]string	false
resources	Define resources requests and limits for single Pods.	v1.ResourceRequirements	false
affinity	If specified, the pod’s scheduling constraints.	*v1.Affinity	false
tolerations	If specified, the pod’s tolerations.	[]v1.Toleration	false
securityContext	SecurityContext holds pod-level security attributes and common container settings. This defaults to the default PodSecurityContext.	*v1.PodSecurityContext	false
serviceAccountName	ServiceAccountName is the name of the ServiceAccount to use to run the Prometheus Pods.	string	false
listenLocal	ListenLocal makes the Alertmanager server listen on loopback, so that it does not bind against the Pod IP. Note this is only for the Alertmanager UI, not the gossip communication.	bool	false
containers	Containers allows injecting additional containers. This is meant to allow adding an authentication proxy to an Alertmanager pod.	[]v1.Container	false
initContainers	InitContainers allows adding initContainers to the pod definition. Those can be used to e.g. fetch secrets for injection into the Alertmanager configuration from external sources. Any errors during the execution of an initContainer will lead to a restart of the Pod. More info: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/ Using initContainers for any use case other then secret fetching is entirely outside the scope of what the maintainers will support and by doing so, you accept that this behaviour may break at any time without notice.	[]v1.Container	false
priorityClassName	Priority class assigned to the Pods	string	false
additionalPeers	AdditionalPeers allows injecting a set of additional Alertmanagers to peer with to form a highly available cluster.	[]string	false
portName	Port name used for the pods and governing service. This defaults to web	string	false

好吧，既然没有必须的那就参考上面配置的Prometheus，指定如下配置：

replicas为3，创建以3台alertmanager组成的集群
更改image资源为Azure中国的
配置一下resource限制
同时采用类似prometheus 的ingress path为 /alertmanager
配置数据保留期限

$ cat prometheus-alertmanager.yaml 
kind: Alertmanager
apiVersion: monitoring.coreos.com/v1
metadata:
  name: alertmanager
  namespace: prometheus
spec:
  replicas: 3
  image: quay.azk8s.cn/prometheus/alertmanager:v0.17.0
  version: v0.17.0
  retention: 24h # 注意这里最大的单位是h，没有d。
  externalUrl: http://prometheus.test.aws.microoak.cn/alertmanager
  routePrefix: /alertmanager
  resources:
    requests:
      memory: 256Mi
      cpu: 100m
    limits:
      memory: 256Mi
      cpu: 100m

# apply一下
$ kubectl apply -f prometheus-alertmanager.yaml

# 检查一下：
$ kubectl -n prometheus get po
NAME                                   READY   STATUS              RESTARTS   AGE
alertmanager-alertmanager-0            0/2     ContainerCreating   0          109s
alertmanager-alertmanager-1            0/2     ContainerCreating   0          109s
alertmanager-alertmanager-2            0/2     ContainerCreating   0          109s
prometheus-operator-5748cc95dd-wgkhn   1/1     Running             0          2d11h
prometheus-prometheus-0                3/3     Running             1          3h47m

# 发现需要创建配置文件：alertmanager-alertmanager 而且是secret
$ kubectl -n prometheus describe po alertmanager-alertmanager-0
Events:
  Type     Reason       Age                From                          Message
  ----     ------       ----               ----                          -------
  Normal   Scheduled    97s                default-scheduler             Successfully assigned prometheus/alertmanager-alertmanager-1 to k8s03.test.awsbj.cn
  Warning  FailedMount  33s (x8 over 96s)  kubelet, k8s03.test.awsbj.cn  MountVolume.SetUp failed for volume "config-volume" : secret "alertmanager-alertmanager" not found

准备alertmanager的配置文件：名为：alertmanager-<Alertmanager> 格式的secret，其中<Alertmanager>为创建Alertmanager资源对象的名称：

这里配置了email以及企业微信的告警：

$ cat alertmanager.yaml
global:
  resolve_timeout: 5m
  # smtp
  smtp_from: alert@xxx.xx
  smtp_smarthost: smtphm.qiye.163.com:465
  smtp_hello: alert@xxx.xx
  smtp_auth_username: alert@xxx.xx
  smtp_auth_password: xxx
  smtp_require_tls: false
  # wechat
  wechat_api_secret: PxxxxxxxxxxxxxxxxxxxxSc
  wechat_api_corp_id: wxxxxxxxxxxx7

route:
  group_by:
  - alertname
  - instance
  group_wait: 30s
  group_interval: 1m
  repeat_interval: 5m
  receiver: 'email-receiver'
  routes:
  - receiver: "wechat-receiver"
    match_re:
      job: pushgw|grafana

receivers:
- name: 'email-receiver'
  email_configs:
    - to: xxx@xxx.xx
      send_resolved: true

- name: 'wechat-receiver'
  wechat_configs:
    - send_resolved: true  
      agent_id: 1xxxx3
      to_user: '@all'

inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'service', 'instance']

创建secret，注意是在和Alertmanager所在的namespace，我这里是prometheus：

$ kubectl -n prometheus create secret generic alertmanager-alertmanager --from-file=alertmanager.yaml

注意alertmanager的配置文件名称必须是alertmanager.yaml，为啥呢？

$ kubectl -n prometheus get statefulsets.apps alertmanager-alertmanager -o yaml
apiVersion: apps/v1
kind: StatefulSet
\......
    spec:
      containers:
      - args:
        - --config.file=/etc/alertmanager/config/alertmanager.yaml # 这里指定的就是alertmanager.yaml
        - --cluster.listen-address=[$(POD_IP)]:9094
        - --storage.path=/alertmanager
        - --data.retention=24h
        - --web.listen-address=:9093
        - --web.external-url=http://prometheus.test.aws.microoak.cn/alertmanager
        - --web.route-prefix=/alertmanager
        - --cluster.peer=alertmanager-alertmanager-0.alertmanager-operated.prometheus.svc:9094
        - --cluster.peer=alertmanager-alertmanager-1.alertmanager-operated.prometheus.svc:9094
        - --cluster.peer=alertmanager-alertmanager-2.alertmanager-operated.prometheus.svc:9094
        image: quay.azk8s.cn/prometheus/alertmanager:v0.17.0
        imagePullPolicy: IfNotPresent
        name: alertmanager
        \......
        volumeMounts:
        - mountPath: /etc/alertmanager/config
          name: config-volume
        - mountPath: /alertmanager
          name: alertmanager-alertmanager-db
      - args:
        - -webhook-url=http://localhost:9093/-/reload
        - -volume-dir=/etc/alertmanager/config
        image: quay.azk8s.cn/coreos/configmap-reload:v0.0.1
        imagePullPolicy: IfNotPresent
        name: config-reloader
       \......
        volumeMounts:
        - mountPath: /etc/alertmanager/config
          name: config-volume
          readOnly: true
     \......
      volumes:
      - name: config-volume
        secret: # secret类型的
          defaultMode: 420
          secretName: alertmanager-alertmanager # secret的名称
      - emptyDir: {}
        name: alertmanager-alertmanager-db

创建alertmanager的service：

$ cat alertmanager-svc.yaml 
kind: Service
apiVersion: v1
metadata:
  name: alertmanager
  namespace: prometheus
spec:
  selector:
    app: alertmanager
    alertmanager: alertmanager
  ports:
  - name: web
    port: 9093
    targetPort: 9093
    protocol: TCP
  type: ClusterIP

# apply
$ kubectl apply -f alertmanager-svc.yaml

创建alertmanager的ingress：用户名密码直接复用prometheus的

$ cat alertmanager-ingress.yaml 
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: alertmanager
  namespace: prometheus
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/auth-type: basic
    nginx.ingress.kubernetes.io/auth-secret: basic-auth
    nginx.ingress.kubernetes.io/auth-realm: "Authentication Required Prometheus"
spec:
  #tls:
  #  - hosts:
  #    - prometheus.test.aws.test.com
  #    secretName: prometheus.test.aws.test.com
  rules:
  - host: prometheus.test.aws.test.com
    http:
      paths:
      - path: /alertmanager
        backend:
          serviceName: alertmanager
          servicePort: web

# apply
$ kubectl apply -f alertmanager-ingress.yaml

好了，访问测试一下：http://prometheus.test.aws.test.com

$ curl -L http://prometheus.test.aws.test.com/alertmanager -uprometheus -p
Enter host password for user 'prometheus':
<!DOCTYPE html>
<html lang="en">
    <head>
        <meta charset="utf-8">
        <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
        <title>Alertmanager</title>
    </head>
    <body>
    ......

下图可以看到三台alertmanager已经组成了集群。

好了，alertmanager完事了，如果告知前面创建的prometheus呢？

还是查看一下PrometheusSpec：

Field	Description	Scheme	Required
alerting	Define details regarding alerting.	*AlertingSpec	false

AlertingSpec

AlertingSpec defines parameters for alerting configuration of Prometheus servers.

Field	Description	Scheme	Required
alertmanagers	AlertmanagerEndpoints Prometheus should fire alerts against.	[]AlertmanagerEndpoints	true

AlertmanagerEndpoints

AlertmanagerEndpoints defines a selection of a single Endpoints object containing alertmanager IPs to fire alerts against.

Field	Description	Scheme	Required
namespace	Namespace of Endpoints object.	string	true
name	Name of Endpoints object in Namespace.	string	true
port	Port the Alertmanager API is exposed on.	intstr.IntOrString	true
scheme	Scheme to use when firing alerts.	string	false
pathPrefix	Prefix for the HTTP path alerts are pushed to.	string	false
tlsConfig	TLS Config to use for alertmanager connection.	*TLSConfig	false
bearerTokenFile	BearerTokenFile to read from filesystem to use when authenticating to Alertmanager.	string	false

Prometheus定义如下：

$ cat prometheus-path.yaml 
kind: Prometheus
apiVersion: monitoring.coreos.com/v1
metadata:
  name: prometheus
  namespace: prometheus
spec:
  version: v2.10.0
  image: quay.azk8s.cn/prometheus/prometheus:v2.10.0
  retention: 1d
  externalUrl: http://prometheus.test.aws.microoak.cn/prometheus
  routePrefix: /prometheus
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      team: frontend
  alerting: # 这里定义alerting
    alertmanagers:
    - namespace: prometheus # alertmanager endpoint对象所在namespace
      name: alertmanager # alertmanager endpoint名称，等同于service名称
      port: web # alertmanager端口
      pathPrefix: /alertmanager # alertmanager path的前缀，因为我们定义了/alertmanager
  resources:
    requests:
      memory: 400Mi
      cpu: 200m
    limits:
      memory: 512Mi
      cpu: 500m

注意：

如果你没有配置：routePrefix和externalUrl，那么这里也不需要配置pathPrefix

如果你配置了如上的：routePrefix和externalUrl，那么pathPrefix需要配置成和routePrefix一致：

pathPrefix: /alertmanager：

否则prometheus去通知alertmanager的时候会报404错误：

level=error ts=2019-11-20T03:01:24.934Z caller=notifier.go:487 component=notifier alertmanager=http://10.101.217.208:9093/api/v1/alerts count=0 msg=”Error sending alert” err=”bad response status 404 Not Found”
level=error ts=2019-11-20T03:01:24.934Z caller=notifier.go:487 component=notifier alertmanager=http://10.101.253.80:9093/api/v1/alerts count=0 msg=”Error sending alert” err=”bad response status 404 Not Found”
level=error ts=2019-11-20T03:01:24.934Z caller=notifier.go:487 component=notifier alertmanager=http://10.101.253.81:9093/api/v1/alerts count=0 msg=”Error sending alert” err=”bad response status 404 Not Found”

检查一下prometheus dashboard 中alerting的配置吧：

alerting:
  alert_relabel_configs:
  - separator: ;
    regex: prometheus_replica
    replacement: $1
    action: labeldrop
  alertmanagers:
  - kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
        - prometheus
    scheme: http
    path_prefix: /alertmanager
    timeout: 10s
    relabel_configs:
    - source_labels: [__meta_kubernetes_service_name]
      separator: ;
      regex: alertmanager
      replacement: $1
      action: keep
    - source_labels: [__meta_kubernetes_endpoint_port_name]
      separator: ;
      regex: web
      replacement: $1
      action: keep

查看prometheus是否发现了alertmanager实例呢：prometheus dashboard—Status—Runtime & BUild Information：

好了alertmanager 实例已经创建了，现在可以创建告警以及规则了。

通过PrometheusRule 创建rule，以及alert告警

老规矩，查看api：

PrometheusRule

PrometheusRule defines alerting rules for a Prometheus instance

Field	Description	Scheme	Required
metadata	Standard object’s metadata. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#metadata	metav1.ObjectMeta	false
spec	Specification of desired alerting rule definitions for Prometheus.	PrometheusRuleSpec	true

PrometheusRuleSpec 这里的定义是跟prometheus中定义rules，alert是一样的。具体可以alerting-rules，rules

PrometheusRuleSpec contains specification parameters for a Rule.

Field	Description	Scheme	Required
groups	Content of Prometheus rule file	[]RuleGroup	false

RuleGroup

RuleGroup is a list of sequentially evaluated recording and alerting rules.

Field	Scheme	Required
name	string	true
interval	string	false
rules	[]Rule	true

Rule

Rule describes an alerting or recording rule.

Field	Scheme	Required
record	string	false
alert	string	false
expr	intstr.IntOrString	true
for	string	false
labels	map[string]string	false
annotations	map[string]string	false

定义prometheusrule如下：

$ cat prometheusrule.yaml 
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: rules-alerts
  namespace: prometheus
  labels: # 需要定义labels，方便prometheus进行筛选。
    prometheus: prometheus
    role: rules-alerts
spec:
  groups:
  - name: alerting-group
    interval: 13s
    rules:
    - alert: InstanceDown
      expr: up == 0
      for: 1m
      labels:
        severity: error
      annotations:
        summary: "Instance {{ $labels.instance }} down"
        description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 3 minutes."
    - alert: WatchDog
      expr: vector(1)
      for: 10m
      labels:
        severity: info
      annotations:
        summary: "watchdog service every 10m send message."
  - name: rules-group
    interval: 30s
    rules:
    - record: http_requests_total:avg_rate1m
      expr: avg(rate(http_requests_total[1m])) by (instance,job,method,service,code)

# apply一下
$ kubectl apply -f prometheusrule.yaml

定义完了prometheusrule其实还不能用：类似与serviceMonitor、Alertmanager都需要在Prometheus资源中通过lables进行筛选prometheus需要的。

$ cat prometheus-path.yaml 
kind: Prometheus
apiVersion: monitoring.coreos.com/v1
metadata:
  name: prometheus
  namespace: prometheus
spec:
  version: v2.10.0
  image: quay.azk8s.cn/prometheus/prometheus:v2.10.0
  retention: 1d
  externalUrl: http://prometheus.test.aws.microoak.cn/prometheus
  routePrefix: /prometheus
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      team: frontend
  alerting:
    alertmanagers:
    - namespace: prometheus
      name: alertmanager
      port: web
      pathPrefix: /alertmanager
  ruleSelector: # 通过labels筛选prometheusRule
    matchLabels:
      prometheus: prometheus
      role: rules-alerts
  resources:
    requests:
      memory: 400Mi
      cpu: 200m
    limits:
      memory: 512Mi
      cpu: 500m
      
# apply一下
$ kubectl apply -f prometheus-path.yaml

等待prometheus的自动reload，然后检查如下：

查询一下定义的rule：

查看alert：

等待10分钟再次查看：

在Alertmanager上查看：

查看邮件通知：

点击上面的Source能够跳转到Prometheus：

http://prometheus.test.aws.test.com/prometheus/graph?g0.expr=vector%281%29&g0.tab=1

总结：

看到上面的Prometheus、Alertmanager、ServiceMonitor、PrometheusRule、PodMonitor，以及其中的依赖关系，是不是觉得很懵逼呢？看个图就明白了很多：

Prometheus、Alertmanager、ServiceMonitor、PrometheusRule、PodMonitor在一个Kubernetes集群中都可能会有多个。

Alertmanager定义alertmanager server，负责等待prometheus server推送过来的告警，然后根据告警规则通知最终用户。

ServiceMonitor、PodMonitor都是定义要搜集的目标，以及怎么搜集，都是通过prometheus的kubernetes自动发现配置的，所以需要为prometheus配置RBAC权限，让它有权限去发现对应的目标。

然后就是Prometheus去部署一个prometheus server，然后通过标签选择要搜集的对象（ServiceMonitor、PodMonitor）、告警和规则（PrometheusRule）、以及当发生告警时要通知那些alertmanager（Alertmanager）。

所有的这些操作都是通过kubernetes的声明式API配置的，operator定义了如上几个CRD，用户定义期望的值（配置的这些的CRD），operator负责根据Prometheus、Alertmanager部署prometheus server，alertmanager server，然后根据ServiceMonitor、PrometheusRule、PodMonitor 来生成prometheus的配置文件，prometheus pod内的sidecar负责重载配置，最终尽可能的达到用户的期望值。

移除Prometheus Operator

我们以手动的方式测试完prometheus operator，了解其中的原理后，我们需要将其卸载掉，因为我们后续会使用Helm的方式来安装Prometheus-Operator（后面会有博文去介绍，欢迎期待），这样会帮助我们安装kube-prometheus、监控Kubernetes集群所需的Manifests，以及对应的Grafana和Dashboard，这样会大大的解放我们双手。

移除的方法参考GitHub

要移除prometheus operator首先要移除在每个namespace下面的定义的CRD（5个），operator会自动的的关闭prometheus和alertmanager pod以及其关联的ConfigMaps。

$ for n in $(kubectl get namespaces -o jsonpath={..metadata.name}); do
  kubectl delete --all --namespace=$n prometheus,servicemonitor,podmonitor,alertmanager
done

然后再移除operator本身：

$ kubectl delete -f bundle.yaml

清理operator自动在每个namespace下面创建的service，以及5个CRD：

for n in $(kubectl get namespaces -o jsonpath={..metadata.name}); do
  kubectl delete --ignore-not-found --namespace=$n service prometheus-operated alertmanager-operated
done

kubectl delete --ignore-not-found customresourcedefinitions \
  prometheuses.monitoring.coreos.com \
  servicemonitors.monitoring.coreos.com \
  podmonitors.monitoring.coreos.com \
  alertmanagers.monitoring.coreos.com \
  prometheusrules.monitoring.coreos.com

参考：

https://coreos.com/blog/the-prometheus-operator.html

https://github.com/coreos/prometheus-operator/blob/master/Documentation/troubleshooting.md

https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md # 重要

https://zhangguanzhang.github.io/2018/10/12/prometheus-operator/

https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/

https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/

本文到这里就结束了，欢迎期待后面的文章。您可以关注下方的公众号二维码，在第一时间查看新文章。