Prometheus Operator: A Hands-On Manual Walkthrough
Although in production we mostly install Prometheus-Operator via Helm, we still need to understand how Prometheus Operator works and how its five CRDs are defined in YAML. In this article we manually install and deploy Prometheus Operator, a Prometheus server, and an Alertmanager cluster, and configure the targets we want to monitor.
Environment and versions used in this article:
- Kubernetes 1.15.5 + Calico v3.6.5
- Prometheus-Operator v0.34.0
Introduction to Prometheus Operator
Once installed, Prometheus Operator provides the following features:
- easily create and destroy prometheus instances in a specific namespace of a Kubernetes cluster;
- simple configuration: use Kubernetes-native resources (CRDs) to configure the prometheus version, persistence, data retention, and replica count;
- use labels to generate the scrape configuration for the monitoring targets.
The architecture is shown below:
The Operator watches for changes to the custom resources Prometheus and Alertmanager in the Kubernetes cluster, deploys and manages the Prometheus server and Alertmanager cluster so that their state and configuration match the user's desired state, and also watches the ServiceMonitor custom resource (which specifies the services whose metrics should be collected) in order to generate the Prometheus configuration.
Prometheus Operator vs. kube-prometheus vs. the community Helm chart: what are they?
- prometheus operator manages and operates Prometheus and Alertmanager clusters in a Kubernetes-native way.
- kube-prometheus combines the Prometheus Operator with a set of manifests to help monitor the Kubernetes cluster itself and the applications running on it.
- The stable/prometheus-operator Helm chart provides a simple feature set for creating kube-prometheus.
Prerequisites and compatibility:
A Kubernetes cluster of version 1.8 or later is recommended.
To quickly set up a Kubernetes test cluster:
Compatible Prometheus versions: v1.4.0–v2.10.0 (see reference)
Compatible Alertmanager versions: >= v0.15 (see reference)
CRDs (Custom Resource Definitions)
- Prometheus: defines the desired Prometheus instances, and ensures that the desired number of Prometheus instances is running at all times.
- ServiceMonitor: declaratively specifies which services should be monitored; the Operator automatically generates the Prometheus scrape configuration from it.
- PodMonitor: declaratively specifies which pods should be monitored; the Operator automatically generates the Prometheus scrape configuration from it.
- PrometheusRule: configures Prometheus rule files, including recording rules and alerting rules, which Prometheus can load automatically.
- Alertmanager: defines the desired Alertmanager instances, and ensures that the desired number of Alertmanager instances is running at all times; when multiple Alertmanagers are specified, the prometheus operator automatically configures them as a cluster.
Installing prometheus-operator
Download the prometheus-operator YAML file:
$ wget -c "https://github.com/coreos/prometheus-operator/raw/master/bundle.yaml"
Change the image addresses in bundle.yaml. By default it uses images from quay.io; here we change them to quay.azk8s.cn so the images can be pulled quickly, and we also change the default namespace to prometheus:
$ sed -i 's#quay.io#quay.azk8s.cn#g' bundle.yaml
After the change it looks like this:
containers:
Install prometheus-operator into the prometheus namespace:
$ kubectl create ns prometheus
Check the installation status:
$ kubectl -n prometheus get all
View the prometheus operator's startup flags and command help:
# Exec into the prometheus operator container:
Installing the Prometheus Operator CRDs??
After installing the prometheus operator, do we still need to install the CRDs required by Prometheus-Operator by hand?
At least the GitHub docs do not mention installing them. But if we don't need to install the CRDs manually, who installs them? That was my question, and you probably have it too. Let's first check whether these CRDs already exist:
$ kubectl get crd | grep monitoring.coreos.com
We find that these CRDs are already installed. So who installed them? There is only one answer: the Prometheus Operator itself. Let's verify that. First, check the YAML file used to install the Operator, bundle.yaml:
# In its ClusterRole we can see that it has permission to install CRDs, and the verbs are all *, i.e. it can perform any operation on the resources it needs.
Check the Operator's startup log:
$ kubectl -n prometheus logs prometheus-operator-5748cc95dd-wgkhn
We find that the following CRDs were created:
msg="CRD updated" crd=Alertmanager
Exactly the five we need.
Let's look at the APIs registered in Kubernetes:
# Option 1
We find the monitoring.coreos.com API group, v1.monitoring.coreos.com. Let's look at it in more detail:
$ kubectl get --raw /apis/monitoring.coreos.com/v1 | python -m json.tool
So: after installing Prometheus-Operator, we do not need to install the required CRDs manually.
Here is the corresponding CRD directory on GitHub:
https://github.com/coreos/prometheus-operator/tree/master/example/prometheus-operator-crd
Note: when installing the Operator with Helm, the CRDs may need to be installed manually in advance. They can be applied directly as follows (five CRDs in total); see the reference:
$ kubectl apply -f https://github.com/coreos/prometheus-operator/raw/master/example/prometheus-operator-crd/servicemonitor.crd.yaml
Prometheus Operator and its CRDs are now installed and running successfully. Next, we use the Kubernetes-native way to create Prometheus and Alertmanager instances, and to configure monitoring targets and rules.
Prometheus Operator in Practice
Reference:
https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md
Creating a Prometheus instance:
If you don't know how to create one, look at the reference above; I have excerpted the relevant parts below:
Prometheus
Prometheus defines a Prometheus deployment.
Field | Description | Scheme | Required |
---|---|---|---|
metadata | Standard object’s metadata. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#metadata | metav1.ObjectMeta | false |
spec | Specification of the desired behavior of the Prometheus cluster. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status | PrometheusSpec | true |
status | Most recent observed status of the Prometheus cluster. Read-only. Not included when requesting from the apiserver, only from the Prometheus Operator API itself. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status | *PrometheusStatus | false |
We see that only the spec (PrometheusSpec) is required; let's look at it:
PrometheusSpec
PrometheusSpec is a specification of the desired behavior of the Prometheus cluster. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status
Field | Description | Scheme | Required |
---|---|---|---|
podMetadata | Standard object’s metadata. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#metadata Metadata Labels and Annotations gets propagated to the prometheus pods. | *metav1.ObjectMeta | false |
serviceMonitorSelector | ServiceMonitors to be selected for target discovery. | *metav1.LabelSelector | false |
serviceMonitorNamespaceSelector | Namespaces to be selected for ServiceMonitor discovery. If nil, only check own namespace. | *metav1.LabelSelector | false |
podMonitorSelector | Experimental PodMonitors to be selected for target discovery. | *metav1.LabelSelector | false |
podMonitorNamespaceSelector | Namespaces to be selected for PodMonitor discovery. If nil, only check own namespace. | *metav1.LabelSelector | false |
version | Version of Prometheus to be deployed. | string | false |
tag | Tag of Prometheus container image to be deployed. Defaults to the value of `version`. Version is ignored if Tag is set. | string | false |
sha | SHA of Prometheus container image to be deployed. Defaults to the value of `version`. Similar to a tag, but the SHA explicitly deploys an immutable container image. Version and Tag are ignored if SHA is set. | string | false |
paused | When a Prometheus deployment is paused, no actions except for deletion will be performed on the underlying objects. | bool | false |
image | Image if specified has precedence over baseImage, tag and sha combinations. Specifying the version is still necessary to ensure the Prometheus Operator knows what version of Prometheus is being configured. | *string | false |
baseImage | Base image to use for a Prometheus deployment. | string | false |
imagePullSecrets | An optional list of references to secrets in the same namespace to use for pulling prometheus and alertmanager images from registries see http://kubernetes.io/docs/user-guide/images#specifying-imagepullsecrets-on-a-pod | []v1.LocalObjectReference | false |
replicas | Number of instances to deploy for a Prometheus deployment. | *int32 | false |
replicaExternalLabelName | Name of Prometheus external label used to denote replica name. Defaults to the value of `prometheus_replica`. External label will not be added when value is set to empty string (`""`). | *string | false |
prometheusExternalLabelName | Name of Prometheus external label used to denote Prometheus instance name. Defaults to the value of `prometheus`. External label will not be added when value is set to empty string (`""`). | *string | false |
retention | Time duration Prometheus shall retain data for. Default is '24h', and must match the regular expression `[0-9]+(ms|s|m|h|d|w|y)` (milliseconds seconds minutes hours days weeks years). | string | false |
retentionSize | Maximum amount of disk space used by blocks. | string | false |
walCompression | Enable compression of the write-ahead log using Snappy. This flag is only available in versions of Prometheus >= 2.11.0. | *bool | false |
logLevel | Log level for Prometheus to be configured with. | string | false |
logFormat | Log format for Prometheus to be configured with. | string | false |
scrapeInterval | Interval between consecutive scrapes. | string | false |
evaluationInterval | Interval between consecutive evaluations. | string | false |
rules | /--rules.*/ command-line arguments. | Rules | false |
externalLabels | The labels to add to any time series or alerts when communicating with external systems (federation, remote storage, Alertmanager). | map[string]string | false |
enableAdminAPI | Enable access to prometheus web admin API. Defaults to the value of `false`. WARNING: Enabling the admin APIs enables mutating endpoints, to delete data, shutdown Prometheus, and more. Enabling this should be done with care and the user is advised to add additional authentication/authorization via a proxy to ensure only clients authorized to perform these actions can do so. For more information see https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-admin-apis | bool | false |
externalUrl | The external URL the Prometheus instances will be available under. This is necessary to generate correct URLs. This is necessary if Prometheus is not served from root of a DNS name. | string | false |
routePrefix | The route prefix Prometheus registers HTTP handlers for. This is useful, if using ExternalURL and a proxy is rewriting HTTP routes of a request, and the actual ExternalURL is still true, but the server serves requests under a different route prefix. For example for use with `kubectl proxy`. | string | false |
query | QuerySpec defines the query command line flags when starting Prometheus. | *QuerySpec | false |
storage | Storage spec to specify how storage shall be used. | *StorageSpec | false |
volumes | Volumes allows configuration of additional volumes on the output StatefulSet definition. Volumes specified will be appended to other volumes that are generated as a result of StorageSpec objects. | []v1.Volume | false |
ruleSelector | A selector to select which PrometheusRules to mount for loading alerting rules from. Until (excluding) Prometheus Operator v0.24.0 Prometheus Operator will migrate any legacy rule ConfigMaps to PrometheusRule custom resources selected by RuleSelector. Make sure it does not match any config maps that you do not want to be migrated. | *metav1.LabelSelector | false |
ruleNamespaceSelector | Namespaces to be selected for PrometheusRules discovery. If unspecified, only the same namespace as the Prometheus object is in is used. | *metav1.LabelSelector | false |
alerting | Define details regarding alerting. | *AlertingSpec | false |
resources | Define resources requests and limits for single Pods. | v1.ResourceRequirements | false |
nodeSelector | Define which Nodes the Pods are scheduled on. | map[string]string | false |
serviceAccountName | ServiceAccountName is the name of the ServiceAccount to use to run the Prometheus Pods. | string | false |
secrets | Secrets is a list of Secrets in the same namespace as the Prometheus object, which shall be mounted into the Prometheus Pods. The Secrets are mounted into /etc/prometheus/secrets/. | []string | false |
configMaps | ConfigMaps is a list of ConfigMaps in the same namespace as the Prometheus object, which shall be mounted into the Prometheus Pods. The ConfigMaps are mounted into /etc/prometheus/configmaps/. | []string | false |
affinity | If specified, the pod’s scheduling constraints. | *v1.Affinity | false |
tolerations | If specified, the pod’s tolerations. | []v1.Toleration | false |
remoteWrite | If specified, the remote_write spec. This is an experimental feature, it may change in any upcoming release in a breaking way. | []RemoteWriteSpec | false |
remoteRead | If specified, the remote_read spec. This is an experimental feature, it may change in any upcoming release in a breaking way. | []RemoteReadSpec | false |
securityContext | SecurityContext holds pod-level security attributes and common container settings. This defaults to the default PodSecurityContext. | *v1.PodSecurityContext | false |
listenLocal | ListenLocal makes the Prometheus server listen on loopback, so that it does not bind against the Pod IP. | bool | false |
containers | Containers allows injecting additional containers or modifying operator generated containers. This can be used to allow adding an authentication proxy to a Prometheus pod or to change the behavior of an operator generated container. Containers described here modify an operator generated container if they share the same name and modifications are done via a strategic merge patch. The current container names are: `prometheus`, `prometheus-config-reloader`, `rules-configmap-reloader`, and `thanos-sidecar`. Overriding containers is entirely outside the scope of what the maintainers will support and by doing so, you accept that this behaviour may break at any time without notice. | []v1.Container | false |
initContainers | InitContainers allows adding initContainers to the pod definition. Those can be used to e.g. fetch secrets for injection into the Prometheus configuration from external sources. Any errors during the execution of an initContainer will lead to a restart of the Pod. More info: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/ Using initContainers for any use case other then secret fetching is entirely outside the scope of what the maintainers will support and by doing so, you accept that this behaviour may break at any time without notice. | []v1.Container | false |
additionalScrapeConfigs | AdditionalScrapeConfigs allows specifying a key of a Secret containing additional Prometheus scrape configurations. Scrape configurations specified are appended to the configurations generated by the Prometheus Operator. Job configurations specified must have the form as specified in the official Prometheus documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config. As scrape configs are appended, the user is responsible to make sure it is valid. Note that using this feature may expose the possibility to break upgrades of Prometheus. It is advised to review Prometheus release notes to ensure that no incompatible scrape configs are going to break Prometheus after the upgrade. | *v1.SecretKeySelector | false |
additionalAlertRelabelConfigs | AdditionalAlertRelabelConfigs allows specifying a key of a Secret containing additional Prometheus alert relabel configurations. Alert relabel configurations specified are appended to the configurations generated by the Prometheus Operator. Alert relabel configurations specified must have the form as specified in the official Prometheus documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#alert_relabel_configs. As alert relabel configs are appended, the user is responsible to make sure it is valid. Note that using this feature may expose the possibility to break upgrades of Prometheus. It is advised to review Prometheus release notes to ensure that no incompatible alert relabel configs are going to break Prometheus after the upgrade. | *v1.SecretKeySelector | false |
additionalAlertManagerConfigs | AdditionalAlertManagerConfigs allows specifying a key of a Secret containing additional Prometheus AlertManager configurations. AlertManager configurations specified are appended to the configurations generated by the Prometheus Operator. Job configurations specified must have the form as specified in the official Prometheus documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#alertmanager_config. As AlertManager configs are appended, the user is responsible to make sure it is valid. Note that using this feature may expose the possibility to break upgrades of Prometheus. It is advised to review Prometheus release notes to ensure that no incompatible AlertManager configs are going to break Prometheus after the upgrade. | *v1.SecretKeySelector | false |
apiserverConfig | APIServerConfig allows specifying a host and auth methods to access apiserver. If left empty, Prometheus is assumed to run inside of the cluster and will discover API servers automatically and use the pod’s CA certificate and bearer token file at /var/run/secrets/kubernetes.io/serviceaccount/. | *APIServerConfig | false |
thanos | Thanos configuration allows configuring various aspects of a Prometheus server in a Thanos environment.\n\nThis section is experimental, it may change significantly without deprecation notice in any release.\n\nThis is experimental and may change significantly without backward compatibility in any release. | *ThanosSpec | false |
priorityClassName | Priority class assigned to the Pods | string | false |
portName | Port name used for the pods and governing service. This defaults to web | string | false |
arbitraryFSAccessThroughSMs | ArbitraryFSAccessThroughSMs configures whether configuration based on a service monitor can access arbitrary files on the file system of the Prometheus container e.g. bearer token files. | ArbitraryFSAccessThroughSMsConfig | false |
overrideHonorLabels | OverrideHonorLabels if set to true overrides all user configured honor_labels. If HonorLabels is set in ServiceMonitor or PodMonitor to true, this overrides honor_labels to false. | bool | false |
overrideHonorTimestamps | OverrideHonorTimestamps allows to globally enforce honoring timestamps in all scrape configs. | bool | false |
ignoreNamespaceSelectors | IgnoreNamespaceSelectors if set to true will ignore NamespaceSelector settings from the podmonitor and servicemonitor configs, and they will only discover endpoints within their current namespace. Defaults to false. | bool | false |
enforcedNamespaceLabel | EnforcedNamespaceLabel enforces adding a namespace label of origin for each alert and metric that is user created. The label value will always be the namespace of the object that is being created. | string | false |
We see that none of the spec fields are strictly required, so let's start by creating a default one, prometheus.yaml:
kind: Prometheus
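Since the manifest is only partially shown above, here is a minimal sketch of what prometheus.yaml could look like; the names (prometheus, in the prometheus namespace) are assumptions based on the pod names that appear later in this walkthrough. An empty spec lets the Operator deploy a Prometheus server with its defaults.

```yaml
# Minimal sketch of prometheus.yaml (names are assumptions).
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: prometheus
spec: {}
```

After `kubectl -n prometheus apply -f prometheus.yaml`, the Operator creates a StatefulSet named prometheus-prometheus.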
Check it:
$ kubectl -n prometheus get all
Next, create a Service for the prometheus instance:
$ cat prometheus-svc.yaml
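The Service manifest is also only hinted at above, so here is a hedged sketch. It assumes the Operator's default pod label prometheus: <name> and the default container port name web (9090), and exposes the instance as a NodePort for quick testing.

```yaml
# Sketch of prometheus-svc.yaml (selector and port name are assumptions).
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: prometheus
spec:
  type: NodePort
  selector:
    prometheus: prometheus
  ports:
  - name: web
    port: 9090
    targetPort: web
```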
Let's access it to test, and take a look at the generated configuration file:
Pinning the prometheus server version
The prometheus server created by default is too old; let's switch to the highest prometheus server version in the Operator's compatibility list.
We noticed that in the PrometheusSpec above there are three fields related to the version:
Field | Description | Scheme | Required |
---|---|---|---|
version | Version of Prometheus to be deployed. | string | false |
tag | Tag of Prometheus container image to be deployed. Defaults to the value of `version`. Version is ignored if Tag is set, i.e. the tag takes precedence over version. | string | false |
image | Image if specified has precedence over baseImage, tag and sha combinations. Specifying the version is still necessary to ensure the Prometheus Operator knows what version of Prometheus is being configured. | *string | false |
So here we use the image + version approach:
$ cat prometheus-image.yaml
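A sketch of what prometheus-image.yaml presumably contains, following the image + version approach described above; the exact repository path under quay.azk8s.cn is an assumption.

```yaml
# Sketch of prometheus-image.yaml.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: prometheus
spec:
  # image pins the actual image; note the explicit v2.10.0 tag (see the note below)
  image: quay.azk8s.cn/prometheus/prometheus:v2.10.0
  # version tells the operator which Prometheus version it is configuring
  version: v2.10.0
```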
Note: the image must include the prometheus version tag v2.10.0, otherwise the latest tag will be pulled.
The version field here only tells the operator which prometheus version is being configured; which version actually runs is determined by image.
The version in version should match the version in image, otherwise unknown errors may occur.
After apply, the operator re-deploys the prometheus server according to our declarative YAML. Wait a moment and then verify:
$ kubectl -n prometheus get po prometheus-prometheus-0 -o yaml --export | grep image
We find that even the auxiliary images come from the registry we configured, quay.azk8s.cn. So if you use a self-hosted docker registry, you need to pull these images in advance as well.
Because we use the Azure China mirror, which mirrors quay, those images already exist there and do not need to be pulled manually.
So the Azure China mirrors are recommended; for usage see: docker.io gcr.io k8s.gcr.io quay.io China mirror acceleration
Moving on: in the pod created by the prometheus-prometheus-0 StatefulSet, the two sidecar containers that reload the prometheus configuration already have resource limits, but the prometheus server container does not, which is clearly not acceptable for production.
Setting resource limits for the prometheus server:
In the PrometheusSpec above there is a resources field; its content is the same as native kubernetes resources.
$ cat prometheus-image-resource.yaml
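A sketch of prometheus-image-resource.yaml; the request and limit values below are placeholders for illustration, not the author's actual numbers.

```yaml
# Sketch of prometheus-image-resource.yaml (resource values are placeholders).
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: prometheus
spec:
  image: quay.azk8s.cn/prometheus/prometheus:v2.10.0
  version: v2.10.0
  resources:
    requests:
      cpu: 200m
      memory: 512Mi
    limits:
      cpu: "1"
      memory: 2Gi
```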
Check it:
$ kubectl -n prometheus get po prometheus-prometheus-0 -o yaml --export
Configuring storage and data retention for the prometheus server
Look at the relevant API fields:
Field | Description | Scheme | Required |
---|---|---|---|
retention | Time duration Prometheus shall retain data for. Default is '24h', and must match the regular expression `[0-9]+(ms|s|m|h|d|w|y)` (milliseconds seconds minutes hours days weeks years). | string | false |
storage | Storage spec to specify how storage shall be used. | *StorageSpec | false |
StorageSpec
StorageSpec defines the configured storage for a group Prometheus servers. If neither `emptyDir` nor `volumeClaimTemplate` is specified, then by default an EmptyDir will be used.
Field | Description | Scheme | Required |
---|---|---|---|
emptyDir | EmptyDirVolumeSource to be used by the Prometheus StatefulSets. If specified, used in place of any volumeClaimTemplate. More info: https://kubernetes.io/docs/concepts/storage/volumes/#emptydir | *v1.EmptyDirVolumeSource | false |
volumeClaimTemplate | A PVC spec to be used by the Prometheus StatefulSets. | v1.PersistentVolumeClaim | false |
The volumeClaimTemplate is defined the same way as in a kubernetes StatefulSet.
$ cat prometheus.yaml
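A sketch of the storage and retention part of prometheus.yaml; the storageClassName rook-ceph-block and the sizes are assumptions, chosen to match the Rook-Ceph plan mentioned below.

```yaml
# Sketch of the storage/retention section (storageClassName and sizes are assumptions).
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: prometheus
spec:
  retention: 30d
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: rook-ceph-block
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi
```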
Since I don't have a storageClass in this environment yet, I won't test this here; I'll try it later after setting up Rook-Ceph.
Configuring an ingress for the prometheus server with basicAuth username/password authentication
The default prometheus dashboard has no authentication, which is not safe for public access in production, so we need to add authentication.
Install the htpasswd command:
$ sudo yum install httpd-tools -y
Create the password file required by the nginx ingress:
# auth is the name of the file to create, prometheus is the username; then enter the same password twice
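The htpasswd invocation would look roughly like this; the file name auth and the user prometheus follow the note below.

```bash
$ htpasswd -c auth prometheus
New password:
Re-type new password:
Adding password for user prometheus
```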
Note: the file must be named auth, otherwise you will get a 503 error; the username can be anything.
Create the corresponding kubernetes secret:
$ kubectl -n prometheus create secret generic basic-auth --from-file=auth
Note: the secret must be in the same namespace as the ingress, meaning the namespace of the Ingress resource defined in the YAML, not the namespace where the nginx-ingress-controller runs. Otherwise you will also get a 503 error.
My ingress lives in the prometheus namespace, so the secret is also created in the prometheus namespace.
Create the prometheus ingress in the prometheus namespace. I don't have a certificate here, so the tls section is commented out.
$ cat prometheus-ingress.yaml
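A sketch of prometheus-ingress.yaml, assuming an nginx ingress controller, the basic-auth secret created above, and the hostname used later in this article; the tls section is omitted.

```yaml
# Sketch of prometheus-ingress.yaml (host and annotations assume nginx ingress).
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: prometheus
  namespace: prometheus
  annotations:
    nginx.ingress.kubernetes.io/auth-type: basic
    nginx.ingress.kubernetes.io/auth-secret: basic-auth
    nginx.ingress.kubernetes.io/auth-realm: "Authentication Required"
spec:
  rules:
  - host: prometheus.test.aws.test.com
    http:
      paths:
      - path: /
        backend:
          serviceName: prometheus
          servicePort: web
```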
Once the DNS record is configured, accessing the domain prompts for the username and password we set.
Configuring the ingress path /prometheus for prometheus
Some readers may ask:
giving the / root path of the domain entirely to prometheus is not ideal, because alertmanager would later need its own domain. It is better to access prometheus at prometheus.test.aws.test.com/prometheus and alertmanager at prometheus.test.aws.test.com/alertmanager.
To achieve this, configure as follows:
First look at the relevant prometheus flag:
--web.external-url=<URL> The URL under which Prometheus is externally reachable (for example, if Prometheus is served via a reverse proxy). Used for generating relative and absolute links back to Prometheus itself. If the URL has a path portion, it will be used to prefix all HTTP endpoints served by Prometheus. If omitted, relevant URL components will be derived automatically.
Starting it directly outside of a container environment works fine, as follows:
./prometheus --config.file=prometheus.yml --web.enable-lifecycle --web.enable-admin-api --web.route-prefix="/prometheus" --web.external-url="http://prometheus.test.aws.test.com/prometheus"
Test: all paths now carry our /prometheus prefix:
$ curl http://prometheus.test.aws.test.com
So how is this configured with prometheus-operator in a k8s environment?
Look at PrometheusSpec again:
Field | Description | Scheme | Required |
---|---|---|---|
externalUrl | The external URL the Prometheus instances will be available under. This is necessary to generate correct URLs. This is necessary if Prometheus is not served from root of a DNS name. | string | false |
routePrefix | The route prefix Prometheus registers HTTP handlers for. This is useful, if using ExternalURL and a proxy is rewriting HTTP routes of a request, and the actual ExternalURL is still true, but the server serves requests under a different route prefix. For example for use with `kubectl proxy`. | string | false |
$ cat prometheus-path.yaml
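A sketch of the routePrefix and externalUrl additions in prometheus-path.yaml; the hostname follows the ingress above. The ingress path would also need to route /prometheus to the service.

```yaml
# Sketch of prometheus-path.yaml (only the path-related fields are new).
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: prometheus
spec:
  image: quay.azk8s.cn/prometheus/prometheus:v2.10.0
  version: v2.10.0
  externalUrl: http://prometheus.test.aws.test.com/prometheus
  routePrefix: /prometheus
```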
Let's test it:
# curl -L follows 301/302 redirects; the response is normal
Note: if you installed prometheus with Helm, or installed prometheus on k8s manually rather than through prometheus-operator, you need to change a few more settings.
With prometheus-operator, only those two fields are needed; it configures all of the following automatically.
需要更改如下:
- readinessProbe路径为:/prometheus/-/ready
- livenessProbe路径为:/prometheus/-/healthy
- prometheus sidecar用于发现配置更改重载prometheus的服务的参数:- –webhook-url=http://127.0.0.1:9090/prometheus/-/reload
- 如果你的prometheus也收集了本身,需要配置收集的路径为:/prometheus/metircs
That's it for the Prometheus resource configuration of the operator; for any other settings, see PrometheusSpec.
Configuring a ServiceMonitor
First, deploy the target service we want to scrape. Note that it goes in the default namespace; a sketch of the manifest follows.
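The workload manifest is omitted above, so here is a sketch borrowed from the upstream prometheus-operator getting-started guide; the image fabxc/instrumented_app and the port layout are assumptions about what the author deployed.

```yaml
# Sketch of the example-app Deployment and Service in the default namespace.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: example-app
        image: fabxc/instrumented_app
        ports:
        - name: web
          containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: example-app
  namespace: default
  labels:
    app: example-app
spec:
  selector:
    app: example-app
  ports:
  - name: web
    port: 8080
```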
Check this service:
$ kubectl get svc,po
Now let's configure a ServiceMonitor so that prometheus can scrape it.
As usual, look at the ServiceMonitor API first; I didn't know how to configure it either.
ServiceMonitor
ServiceMonitor defines monitoring for a set of services.
Field | Description | Scheme | Required |
---|---|---|---|
metadata | Standard object’s metadata. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#metadata | metav1.ObjectMeta | false |
spec | Specification of desired Service selection for target discrovery by Prometheus. | ServiceMonitorSpec | true |
ServiceMonitorSpec
ServiceMonitorSpec contains specification parameters for a ServiceMonitor.
Field | Description | Scheme | Required |
---|---|---|---|
jobLabel | The label to use to retrieve the job name from. | string | false |
targetLabels | TargetLabels transfers labels on the Kubernetes Service onto the target. | []string | false |
podTargetLabels | PodTargetLabels transfers labels on the Kubernetes Pod onto the target. | []string | false |
endpoints | A list of endpoints allowed as part of this ServiceMonitor. | []Endpoint | true |
selector | Selector to select Endpoints objects. | metav1.LabelSelector | true |
namespaceSelector | Selector to select which namespaces the Endpoints objects are discovered from. | NamespaceSelector | false |
sampleLimit | SampleLimit defines per-scrape limit on number of scraped samples that will be accepted. | uint64 | false |
Endpoint
Endpoint defines a scrapeable endpoint serving Prometheus metrics.
Field | Description | Scheme | Required |
---|---|---|---|
port | Name of the service port this endpoint refers to. Mutually exclusive with targetPort. | string | false |
targetPort | Name or number of the target port of the endpoint. Mutually exclusive with port. | *intstr.IntOrString | false |
path | HTTP path to scrape for metrics. | string | false |
scheme | HTTP scheme to use for scraping. | string | false |
params | Optional HTTP URL parameters | map[string][]string | false |
interval | Interval at which metrics should be scraped | string | false |
scrapeTimeout | Timeout after which the scrape is ended | string | false |
tlsConfig | TLS configuration to use when scraping the endpoint | *TLSConfig | false |
bearerTokenFile | File to read bearer token for scraping targets. | string | false |
bearerTokenSecret | Secret to mount to read bearer token for scraping targets. The secret needs to be in the same namespace as the service monitor and accessible by the Prometheus Operator. | v1.SecretKeySelector | false |
honorLabels | HonorLabels chooses the metric’s labels on collisions with target labels. | bool | false |
honorTimestamps | HonorTimestamps controls whether Prometheus respects the timestamps present in scraped data. | *bool | false |
basicAuth | BasicAuth allow an endpoint to authenticate over basic authentication More info: https://prometheus.io/docs/operating/configuration/#endpoints | *BasicAuth | false |
metricRelabelings | MetricRelabelConfigs to apply to samples before ingestion. | []*RelabelConfig | false |
relabelings | RelabelConfigs to apply to samples before scraping. More info: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config | []*RelabelConfig | false |
proxyUrl | ProxyURL eg http://proxyserver:2195 Directs scrapes to proxy through this endpoint. | *string | false |
NamespaceSelector
NamespaceSelector is a selector for selecting either all namespaces or a list of namespaces.
Field | Description | Scheme | Required |
---|---|---|---|
any | Boolean describing whether all namespaces are selected in contrast to a list restricting them. | bool | false |
matchNames | List of namespace names. | []string | false |
Two fields are required:
ServiceMonitor.spec.endpoints
ServiceMonitor.spec.selector
Let's define the serviceMonitor:
$ cat example-app-ServiceMonitor.yaml
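A sketch of example-app-ServiceMonitor.yaml; it assumes the example-app Service above exposes a port named web, and the team: frontend label is an assumption that the Prometheus selector will match later.

```yaml
# Sketch of example-app-ServiceMonitor.yaml (labels are assumptions).
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: prometheus
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  namespaceSelector:
    matchNames:
    - default
  endpoints:
  - port: web
    interval: 30s
```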
If you want the ServiceMonitor to match labeled Services in any namespace, define it like this:
spec:
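The truncated block above presumably uses the any namespace selector; a sketch:

```yaml
# Sketch: match Services with the given label in any namespace.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app-any-ns
  namespace: prometheus
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  namespaceSelector:
    any: true
  endpoints:
  - port: web
```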
If the monitored target requires basicAuth authentication, configure it as follows:
kind: ServiceMonitor
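A sketch of a ServiceMonitor endpoint with basic auth; the Secret example-app-auth and its keys are assumptions. The secret has to live in the same namespace as the ServiceMonitor.

```yaml
# Sketch: scraping a target protected by basic auth (secret name/keys are assumptions).
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: prometheus
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: web
    basicAuth:
      username:
        name: example-app-auth
        key: username
      password:
        name: example-app-auth
        key: password
```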
Check it:
$ kubectl -n prometheus get servicemonitors.monitoring.coreos.com
OK, now refresh the prometheus dashboard and take a look. Huh, nothing there; did it not take effect?
Here is why:
The ServiceMonitor defines what to collect, such as the port and which Service, but it does not say which prometheus should do the collecting.
In a single k8s cluster there may be multiple Prometheus, Alertmanager and ServiceMonitor custom resources; for example, two prometheus servers and several ServiceMonitors. So which prometheus scrapes through which ServiceMonitor?
Therefore we also need to change the Prometheus resource we defined earlier. Because the prometheus server discovers targets through kubernetes_sd_configs, it needs permission to perform the discovery. So first define the RBAC required by the prometheus server:
$ wget -c "https://github.com/coreos/prometheus-operator/raw/master/example/rbac/prometheus/prometheus-cluster-role-binding.yaml"
The ServiceAccount is ready; how does prometheus use it, and how does it select the matching serviceMonitors? Through the powerful label selectors, of course:
$ cat prometheus-path.yaml
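A sketch of the selection-related additions in prometheus-path.yaml; serviceAccountName prometheus matches the upstream RBAC example, and the team: frontend selector matches the ServiceMonitor sketch above (both are assumptions).

```yaml
# Sketch: use the RBAC ServiceAccount and select ServiceMonitors by label.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: prometheus
spec:
  image: quay.azk8s.cn/prometheus/prometheus:v2.10.0
  version: v2.10.0
  externalUrl: http://prometheus.test.aws.test.com/prometheus
  routePrefix: /prometheus
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      team: frontend
```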
Now refresh the prometheus dashboard again:
Targets:
View the generated prometheus configuration in the UI:
global:
Screenshot of the command-line flags:
Creating an Alertmanager instance:
As usual, look at its API first:
Alertmanager
Alertmanager describes an Alertmanager cluster.
Field | Description | Scheme | Required |
---|---|---|---|
metadata | Standard object’s metadata. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#metadata | metav1.ObjectMeta | false |
spec | Specification of the desired behavior of the Alertmanager cluster. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status | AlertmanagerSpec | true |
status | Most recent observed status of the Alertmanager cluster. Read-only. Not included when requesting from the apiserver, only from the Prometheus Operator API itself. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status | *AlertmanagerStatus | false |
AlertmanagerSpec
AlertmanagerSpec is a specification of the desired behavior of the Alertmanager cluster. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#spec-and-status
Field | Description | Scheme | Required |
---|---|---|---|
podMetadata | Standard object’s metadata. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#metadata Metadata Labels and Annotations gets propagated to the prometheus pods. | *metav1.ObjectMeta | false |
image | Image if specified has precedence over baseImage, tag and sha combinations. Specifying the version is still necessary to ensure the Prometheus Operator knows what version of Alertmanager is being configured. | *string | false |
version | Version the cluster should be on. | string | false |
tag | Tag of Alertmanager container image to be deployed. Defaults to the value of `version`. Version is ignored if Tag is set. | string | false |
sha | SHA of Alertmanager container image to be deployed. Defaults to the value of `version`. Similar to a tag, but the SHA explicitly deploys an immutable container image. Version and Tag are ignored if SHA is set. | string | false |
baseImage | Base image that is used to deploy pods, without tag. | string | false |
imagePullSecrets | An optional list of references to secrets in the same namespace to use for pulling prometheus and alertmanager images from registries see http://kubernetes.io/docs/user-guide/images#specifying-imagepullsecrets-on-a-pod | []v1.LocalObjectReference | false |
secrets | Secrets is a list of Secrets in the same namespace as the Alertmanager object, which shall be mounted into the Alertmanager Pods. The Secrets are mounted into /etc/alertmanager/secrets/. | []string | false |
configMaps | ConfigMaps is a list of ConfigMaps in the same namespace as the Alertmanager object, which shall be mounted into the Alertmanager Pods. The ConfigMaps are mounted into /etc/alertmanager/configmaps/. | []string | false |
configSecret | ConfigSecret is the name of a Kubernetes Secret in the same namespace as the Alertmanager object, which contains configuration for this Alertmanager instance. Defaults to ‘alertmanager-‘ The secret is mounted into /etc/alertmanager/config. | string | false |
logLevel | Log level for Alertmanager to be configured with. | string | false |
logFormat | Log format for Alertmanager to be configured with. | string | false |
replicas | Size is the expected size of the alertmanager cluster. The controller will eventually make the size of the running cluster equal to the expected size. | *int32 | false |
retention | Time duration Alertmanager shall retain data for. Default is '120h', and must match the regular expression `[0-9]+(ms|s|m|h|d|w|y)` (milliseconds seconds minutes hours days weeks years). | string | false |
storage | Storage is the definition of how storage will be used by the Alertmanager instances. | *StorageSpec | false |
volumes | Volumes allows configuration of additional volumes on the output StatefulSet definition. Volumes specified will be appended to other volumes that are generated as a result of StorageSpec objects. | []v1.Volume | false |
volumeMounts | VolumeMounts allows configuration of additional VolumeMounts on the output StatefulSet definition. VolumeMounts specified will be appended to other VolumeMounts in the alertmanager container, that are generated as a result of StorageSpec objects. | []v1.VolumeMount | false |
externalUrl | The external URL the Alertmanager instances will be available under. This is necessary to generate correct URLs. This is necessary if Alertmanager is not served from root of a DNS name. | string | false |
routePrefix | The route prefix Alertmanager registers HTTP handlers for. This is useful, if using ExternalURL and a proxy is rewriting HTTP routes of a request, and the actual ExternalURL is still true, but the server serves requests under a different route prefix. For example for use with `kubectl proxy`. | string | false |
paused | If set to true all actions on the underlaying managed objects are not goint to be performed, except for delete actions. | bool | false |
nodeSelector | Define which Nodes the Pods are scheduled on. | map[string]string | false |
resources | Define resources requests and limits for single Pods. | v1.ResourceRequirements | false |
affinity | If specified, the pod’s scheduling constraints. | *v1.Affinity | false |
tolerations | If specified, the pod’s tolerations. | []v1.Toleration | false |
securityContext | SecurityContext holds pod-level security attributes and common container settings. This defaults to the default PodSecurityContext. | *v1.PodSecurityContext | false |
serviceAccountName | ServiceAccountName is the name of the ServiceAccount to use to run the Prometheus Pods. | string | false |
listenLocal | ListenLocal makes the Alertmanager server listen on loopback, so that it does not bind against the Pod IP. Note this is only for the Alertmanager UI, not the gossip communication. | bool | false |
containers | Containers allows injecting additional containers. This is meant to allow adding an authentication proxy to an Alertmanager pod. | []v1.Container | false |
initContainers | InitContainers allows adding initContainers to the pod definition. Those can be used to e.g. fetch secrets for injection into the Alertmanager configuration from external sources. Any errors during the execution of an initContainer will lead to a restart of the Pod. More info: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/ Using initContainers for any use case other then secret fetching is entirely outside the scope of what the maintainers will support and by doing so, you accept that this behaviour may break at any time without notice. | []v1.Container | false |
priorityClassName | Priority class assigned to the Pods | string | false |
additionalPeers | AdditionalPeers allows injecting a set of additional Alertmanagers to peer with to form a highly available cluster. | []string | false |
portName | Port name used for the pods and governing service. This defaults to web | string | false |
OK, since nothing in the spec is required, let's mirror the Prometheus configuration above and specify the following (a sketch follows the cat command below):
- replicas of 3, to form a cluster of three alertmanagers
- change the image to the Azure China mirror
- set resource limits
- use an ingress path of /alertmanager, like prometheus
- configure the data retention period
$ cat prometheus-alertmanager.yaml
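A sketch of prometheus-alertmanager.yaml reflecting the bullet list above; the image path, the v0.19.0 tag and the resource values are placeholders, not the author's exact choices.

```yaml
# Sketch of prometheus-alertmanager.yaml (image tag and resources are placeholders).
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: alertmanager
  namespace: prometheus
spec:
  replicas: 3
  image: quay.azk8s.cn/prometheus/alertmanager:v0.19.0
  version: v0.19.0
  retention: 120h
  externalUrl: http://prometheus.test.aws.test.com/alertmanager
  routePrefix: /alertmanager
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 500m
      memory: 512Mi
```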
Prepare the alertmanager configuration: a secret named alertmanager-<Alertmanager>, where <Alertmanager> is the name of the Alertmanager resource object.
The configuration here sets up alerting via email and WeChat Work (企业微信):
$ cat alertmanager.yaml
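A sketch of alertmanager.yaml with one email receiver and one WeChat Work receiver; every address, credential and SMTP setting below is a placeholder, not the author's real configuration.

```yaml
# Sketch of alertmanager.yaml (all credentials and addresses are placeholders).
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.example.com:465'
  smtp_from: 'alert@example.com'
  smtp_auth_username: 'alert@example.com'
  smtp_auth_password: 'password'
  smtp_require_tls: false
route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 12h
  receiver: 'default'
receivers:
- name: 'default'
  email_configs:
  - to: 'ops@example.com'
    send_resolved: true
  wechat_configs:
  - corp_id: 'your-corp-id'
    agent_id: 'your-agent-id'
    api_secret: 'your-api-secret'
    to_user: '@all'
    send_resolved: true
```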
Create the secret; note that it must be in the same namespace as the Alertmanager, which for me is prometheus:
$ kubectl -n prometheus create secret generic alertmanager-alertmanager --from-file=alertmanager.yaml
Note that the alertmanager configuration file must be named alertmanager.yaml. Why? Look at the generated StatefulSet:
$ kubectl -n prometheus get statefulsets.apps alertmanager-alertmanager -o yaml
Create the alertmanager service:
$ cat alertmanager-svc.yaml
Create the alertmanager ingress, reusing the prometheus username and password:
$ cat alertmanager-ingress.yaml
Now access and test it: http://prometheus.test.aws.test.com
$ curl -L http://prometheus.test.aws.test.com/alertmanager -uprometheus -p
In the screenshot below you can see that the three alertmanagers have formed a cluster.
Alertmanager is done. But how do we tell the Prometheus we created earlier about it?
Look at PrometheusSpec again:
Field | Description | Scheme | Required |
---|---|---|---|
alerting | Define details regarding alerting. | *AlertingSpec | false |
AlertingSpec
AlertingSpec defines parameters for alerting configuration of Prometheus servers.
Field | Description | Scheme | Required |
---|---|---|---|
alertmanagers | AlertmanagerEndpoints Prometheus should fire alerts against. | []AlertmanagerEndpoints | true |
AlertmanagerEndpoints
AlertmanagerEndpoints defines a selection of a single Endpoints object containing alertmanager IPs to fire alerts against.
Field | Description | Scheme | Required |
---|---|---|---|
namespace | Namespace of Endpoints object. | string | true |
name | Name of Endpoints object in Namespace. | string | true |
port | Port the Alertmanager API is exposed on. | intstr.IntOrString | true |
scheme | Scheme to use when firing alerts. | string | false |
pathPrefix | Prefix for the HTTP path alerts are pushed to. | string | false |
tlsConfig | TLS Config to use for alertmanager connection. | *TLSConfig | false |
bearerTokenFile | BearerTokenFile to read from filesystem to use when authenticating to Alertmanager. | string | false |
The Prometheus resource is defined as follows:
$ cat prometheus-path.yaml
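A sketch of the alerting section added to prometheus-path.yaml; the Endpoints name alertmanager and the port name web follow the Alertmanager service created above (assumptions).

```yaml
# Sketch: point Prometheus at the Alertmanager cluster.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: prometheus
spec:
  # other fields shown earlier are unchanged and omitted here
  alerting:
    alertmanagers:
    - namespace: prometheus
      name: alertmanager
      port: web
      pathPrefix: /alertmanager
```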
Note:
If you did not configure routePrefix and externalUrl, you do not need to configure pathPrefix here either.
If you configured routePrefix and externalUrl as above, then pathPrefix must be set to the same value as routePrefix:
pathPrefix: /alertmanager
Otherwise prometheus will get a 404 error when notifying alertmanager:
level=error ts=2019-11-20T03:01:24.934Z caller=notifier.go:487 component=notifier alertmanager=http://10.101.217.208:9093/api/v1/alerts count=0 msg="Error sending alert" err="bad response status 404 Not Found"
level=error ts=2019-11-20T03:01:24.934Z caller=notifier.go:487 component=notifier alertmanager=http://10.101.253.80:9093/api/v1/alerts count=0 msg="Error sending alert" err="bad response status 404 Not Found"
level=error ts=2019-11-20T03:01:24.934Z caller=notifier.go:487 component=notifier alertmanager=http://10.101.253.81:9093/api/v1/alerts count=0 msg="Error sending alert" err="bad response status 404 Not Found"
Check the alerting configuration on the prometheus dashboard:
alerting:
Check whether prometheus has discovered the alertmanager instances: prometheus dashboard > Status > Runtime & Build Information:
The alertmanager instances are up; now we can create alerts and rules.
Creating rules and alerts with PrometheusRule
As usual, look at the API:
PrometheusRule
PrometheusRule defines alerting rules for a Prometheus instance
Field | Description | Scheme | Required |
---|---|---|---|
metadata | Standard object’s metadata. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#metadata | metav1.ObjectMeta | false |
spec | Specification of desired alerting rule definitions for Prometheus. | PrometheusRuleSpec | true |
PrometheusRuleSpec is defined the same way as rules and alerts in plain prometheus; see alerting-rules and rules for details.
PrometheusRuleSpec contains specification parameters for a Rule.
Field | Description | Scheme | Required |
---|---|---|---|
groups | Content of Prometheus rule file | []RuleGroup | false |
RuleGroup
RuleGroup is a list of sequentially evaluated recording and alerting rules.
Field | Description | Scheme | Required |
---|---|---|---|
name | | string | true |
interval | | string | false |
rules | | []Rule | true |
Rule
Rule describes an alerting or recording rule.
Field | Description | Scheme | Required |
---|---|---|---|
record | | string | false |
alert | | string | false |
expr | | intstr.IntOrString | true |
for | | string | false |
labels | | map[string]string | false |
annotations | | map[string]string | false |
Define the prometheusrule as follows:
$ cat prometheusrule.yaml
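A sketch of prometheusrule.yaml; the always-firing vector(1) example alert (matching the /graph link shown at the end of this section) and the labels are assumptions.

```yaml
# Sketch of prometheusrule.yaml (labels and the example alert are assumptions).
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: prometheus-rules
  namespace: prometheus
  labels:
    prometheus: prometheus
    role: alert-rules
spec:
  groups:
  - name: example.rules
    rules:
    - alert: ExampleAlwaysFiring
      expr: vector(1)
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "An example alert that always fires"
```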
Defining the prometheusrule alone is not enough: just like ServiceMonitor and Alertmanager, it has to be selected through labels in the Prometheus resource.
$ cat prometheus-path.yaml
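A sketch of the ruleSelector added to prometheus-path.yaml; the labels match the PrometheusRule sketch above (assumptions).

```yaml
# Sketch: select PrometheusRules by label.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: prometheus
spec:
  # other fields shown earlier are unchanged and omitted here
  ruleSelector:
    matchLabels:
      prometheus: prometheus
      role: alert-rules
```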
Wait for prometheus to reload automatically, then check:
Query the defined rule:
View the alert:
Check again after 10 minutes:
View it in Alertmanager:
Check the email notification:
Clicking Source above jumps back to Prometheus:
http://prometheus.test.aws.test.com/prometheus/graph?g0.expr=vector%281%29&g0.tab=1
Summary:
Seeing Prometheus, Alertmanager, ServiceMonitor, PrometheusRule and PodMonitor above, and the dependencies between them, are you a bit lost? A diagram makes it much clearer:
There may be multiple Prometheus, Alertmanager, ServiceMonitor, PrometheusRule and PodMonitor resources in a single Kubernetes cluster.
Alertmanager defines an alertmanager server, which waits for alerts pushed by the prometheus server and then notifies the end users according to the alert routing rules.
ServiceMonitor and PodMonitor both define the targets to collect and how to collect them; both rely on prometheus's kubernetes service discovery, so prometheus needs RBAC permissions to discover the corresponding targets.
The Prometheus resource then deploys a prometheus server and selects, via labels, which targets to collect (ServiceMonitor, PodMonitor), which alerts and rules to load (PrometheusRule), and which alertmanagers (Alertmanager) to notify when an alert fires.
All of this is configured through the Kubernetes declarative API: the operator defines the CRDs above, the user declares the desired state (these CRD objects), and the operator deploys the prometheus server and alertmanager server according to Prometheus and Alertmanager, generates the prometheus configuration from ServiceMonitor, PrometheusRule and PodMonitor, and the sidecar inside the prometheus pod reloads the configuration, so that the actual state converges as closely as possible to the desired state.
Removing Prometheus Operator
Now that we have tested prometheus operator manually and understood how it works, we need to uninstall it, because later we will install Prometheus-Operator with Helm (covered in an upcoming post). That approach also installs kube-prometheus, the manifests needed to monitor the Kubernetes cluster, and the corresponding Grafana dashboards, which saves a great deal of manual work.
To remove prometheus operator, first delete the custom resources (of the five CRD kinds) defined in every namespace; the operator then automatically shuts down the prometheus and alertmanager pods and their associated ConfigMaps (a sketch of the full loop follows):
$ for n in $(kubectl get namespaces -o jsonpath={..metadata.name}); do
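A sketch of what the full loop presumably looks like; the exact list of resource types deleted here is an assumption.

```bash
for n in $(kubectl get namespaces -o jsonpath={..metadata.name}); do
  # delete all custom resources of the five CRD kinds in every namespace
  kubectl delete --all --namespace=$n prometheus,servicemonitor,podmonitor,alertmanager,prometheusrule
done
```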
Then remove the operator itself:
$ kubectl delete -f bundle.yaml
Finally, clean up the services the operator automatically created in each namespace, as well as the five CRDs:
for n in $(kubectl get namespaces -o jsonpath={..metadata.name}); do
References:
https://coreos.com/blog/the-prometheus-operator.html
https://github.com/coreos/prometheus-operator/blob/master/Documentation/troubleshooting.md
https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md # important
https://zhangguanzhang.github.io/2018/10/12/prometheus-operator/
https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/
https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/
That's all for this article; stay tuned for the follow-up posts. You can follow the official account via the QR code below to see new articles as soon as they are published.