Setting up a SkyWalking monitoring platform on K8S
2023-09-03 20:10:56

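Purpose

This setup mainly monitors Java applications running on k8s, including the time spent along each API call chain, the call relationships between applications, and SQL execution time.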

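Preparation

Prepare the following images, pulled from the official Docker registry.

  • apache-skywalking-apm-bin:8.5: the Java agent (probe); the application's startup command needs to reference the jar file shipped in this image.
  • skywalking-oap-server:9.4.0: the SkyWalking core service, which collects data and serves monitoring queries, and also sends the collected data to ES for storage.
  • skywalking-ui:9.4.0: the SkyWalking web UI, used to display the monitoring data.
  • nginx: any nginx image, with the htpasswd component added in advance; it provides access authentication for the UI above.

In addition, an ES cluster is needed to store the collected data.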

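K8S configuration files

YAML for the application

The jar files that the application needs for the agent are supplied through initContainers: an init container copies the agent folder from the image into a shared volume, which the application container then mounts.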

kind: Deployment
apiVersion: apps/v1
metadata:
  name: app1
  namespace: default
  labels:
    k8s-app: app1
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: app1
  template:
    metadata:
      name: app1
      creationTimestamp: null
      labels:
        k8s-app: app1
    spec:
      volumes:
        - name: agent
          emptyDir: {}
      initContainers:
        - name: init-agent
          image: >-
            apache-skywalking-apm-bin:8.5
          command:
            - sh
            - '-c'
            - >-
              set -ex;mkdir -p /skywalking/agent;cp -r /opt/skywalking/agent/*
              /skywalking/agent;
          resources: {}
          volumeMounts:
            - name: agent
              mountPath: /skywalking/agent
          imagePullPolicy: IfNotPresent
      containers:
        - name: app1
          image: 'app1:0.1'
          command:
            - /bin/bash
            - '-c'
            - sh bin/start.sh
          args:
            - while true; do sleep 30; done;
          ports:
            - containerPort: 7101
              protocol: TCP
          env:
            - name: SW_AGENT_COLLECTOR_BACKEND_SERVICES
              value: 'skywalking-oap-server:11800'
            - name: SW_AGENT_NAME
              value: app1
          resources:
            limits:
              cpu: '1'
              memory: 1Gi
            requests:
              cpu: '1'
              memory: 1Gi
          volumeMounts:
            - name: agent
              mountPath: /opt/skywalking/agent
          imagePullPolicy: IfNotPresent
      restartPolicy: Always

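Key parameters:

  1. app1: the image name of the example application.
  2. initContainers: specifies the image that holds the agent jar files.
  3. volumeMounts: mounts the agent folder into the application container.
  4. env: SW_AGENT_COLLECTOR_BACKEND_SERVICES is the service address of the skywalking-oap application, and SW_AGENT_NAME is the name under which this application appears in SkyWalking.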
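
The Deployment above places the agent files in the application container, but the place where the JVM is told to load them is not shown; presumably bin/start.sh passes -javaagent itself. If the start script does not, one possible approach is the standard JAVA_TOOL_OPTIONS environment variable, which the JVM reads at startup. The fragment below is a minimal sketch under two assumptions: the jar keeps its default name skywalking-agent.jar, and it sits directly under the mount path used above.

```yaml
# Fragment of the app1 container spec; only the first env entry is new.
env:
  - name: JAVA_TOOL_OPTIONS
    # Assumption: the agent jar is named skywalking-agent.jar and lives
    # directly under the /opt/skywalking/agent mount path.
    value: '-javaagent:/opt/skywalking/agent/skywalking-agent.jar'
  - name: SW_AGENT_COLLECTOR_BACKEND_SERVICES
    value: 'skywalking-oap-server:11800'
  - name: SW_AGENT_NAME
    value: app1
```

If start.sh already adds -javaagent, this entry should be left out so the agent is not loaded twice.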

YAML for the skywalking-oap server

kind: Deployment
apiVersion: apps/v1
metadata:
  name: skywalking-oap-server
  namespace: default
  labels:
    app: skywalking-oap-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: skywalking-oap-server
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: skywalking-oap-server
    spec:
      containers:
        - name: skywalking-oap-server
          image: 'skywalking-oap-server:9.4.0'
          ports:
            - name: grpc
              containerPort: 11800
              protocol: TCP
            - name: tcp
              containerPort: 12800
              protocol: TCP
          env:
            - name: SW_STORAGE
              value: elasticsearch
            - name: SW_STORAGE_ES_CLUSTER_NODES
              value: '10.10.10.10:34095'
            - name: SW_ES_USER
              value: elastic
            - name: SW_ES_PASSWORD
              value: 'elastic'
            - name: TZ
              value: CST-8
            - name: SW_STORAGE_ES_RECORD_DATA_TTL
              value: '1'
            - name: SW_STORAGE_ES_OTHER_METRIC_DATA_TTL
              value: '1'
            - name: SW_STORAGE_ES_MONTH_METRIC_DATA_TTL
              value: '1'
            - name: JAVA_OPTS
              value: '-Xms3072m -Xmx3072m'
            - name: SW_RECEIVER_SHARING_GRPC_THREAD_POOL_QUEUE_SIZE
              value: '20000'
            - name: SW_RECEIVER_SHARING_GRPC_THREAD_POOL_SIZE
              value: '10'
            - name: SW_CORE_RECORD_DATA_TTL
              value: '2'
          resources:
            limits:
              cpu: '2'
              memory: 4Gi
            requests:
              cpu: '2'
              memory: 4Gi
          livenessProbe:
            tcpSocket:
              port: 12800
            initialDelaySeconds: 30
            timeoutSeconds: 2
            periodSeconds: 20
            successThreshold: 1
            failureThreshold: 3
          readinessProbe:
            tcpSocket:
              port: 12800
            initialDelaySeconds: 30
            timeoutSeconds: 2
            periodSeconds: 20
            successThreshold: 2
            failureThreshold: 3
          imagePullPolicy: IfNotPresent
      restartPolicy: Always
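
  1. Information collected by the agents in the applications is sent to this service, which automatically stores it in the configured database.
  2. Pay particular attention to the environment variables and to the settings under livenessProbe and readinessProbe; otherwise the service cannot start properly in k8s.
  3. SW_CORE_RECORD_DATA_TTL: the number of days that "detail" records are kept; it is set here to the minimum value of two days.
  4. With the storage settings here, the monitoring data is sent to ES.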
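
The ES password above is written directly into the Deployment. As an optional variation, it could be kept in a Kubernetes Secret and referenced from the env entry. The sketch below assumes a hypothetical Secret named skywalking-es-credentials with a key named password; neither name comes from the original setup.

```yaml
# Hypothetical Secret holding the ES password (the name is an assumption).
kind: Secret
apiVersion: v1
metadata:
  name: skywalking-es-credentials
  namespace: default
type: Opaque
stringData:
  password: elastic
---
# Fragment of the skywalking-oap-server container spec (not a full manifest):
# the literal value is replaced by a reference to the Secret above.
env:
  - name: SW_ES_PASSWORD
    valueFrom:
      secretKeyRef:
        name: skywalking-es-credentials
        key: password
```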

YAML for the skywalking-oap Service

kind: Service
apiVersion: v1
metadata:
  name: skywalking-oap-server
  namespace: default
  labels:
    app: skywalking-oap-server-service
spec:
  ports:
    - name: grpc
      protocol: TCP
      port: 11800
      targetPort: 11800
    - name: tcp
      protocol: TCP
      port: 12800
      targetPort: 12800
  selector:
    app: skywalking-oap-server
  type: ClusterIP

YAML for the skywalking-ui service

kind: Deployment
apiVersion: apps/v1
metadata:
  name: skywalking-ui
  namespace: default
  labels:
    app: skywalking-ui
spec:
  replicas: 1
  selector:
    matchLabels:
      app: skywalking-ui
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: skywalking-ui
    spec:
      containers:
        - name: skywalking-ui
          image: 'skywalking-ui:9.4.0'
          ports:
            - name: tcp
              containerPort: 8080
              protocol: TCP
          env:
            - name: SW_OAP_ADDRESS
              value: 'http://skywalking-oap-server:12800'
            - name: TZ
              value: CST-8
          resources:
            limits:
              cpu: '2'
              memory: 1Gi
            requests:
              cpu: '1'
              memory: 1Gi
          livenessProbe:
            tcpSocket:
              port: 8080
            initialDelaySeconds: 15
            timeoutSeconds: 2
            periodSeconds: 2
            successThreshold: 1
            failureThreshold: 3
          readinessProbe:
            tcpSocket:
              port: 8080
            initialDelaySeconds: 15
            timeoutSeconds: 2
            periodSeconds: 2
            successThreshold: 2
            failureThreshold: 3
          imagePullPolicy: IfNotPresent
      restartPolicy: Always

  1. The skywalking-oap-server address in env requires a Service for that workload to be configured in k8s; only then can the containers communicate with each other.
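
The same reasoning applies to anything that needs to reach the UI by name, such as a reverse proxy in front of it. The post does not show a Service for skywalking-ui, so the following is only a sketch of what one might look like; the Service name skywalking-ui and the port name http are assumptions, while the selector and port 8080 follow the Deployment above.

```yaml
# Hypothetical Service exposing the skywalking-ui pods inside the cluster.
kind: Service
apiVersion: v1
metadata:
  name: skywalking-ui
  namespace: default
  labels:
    app: skywalking-ui
spec:
  ports:
    - name: http
      protocol: TCP
      port: 8080
      targetPort: 8080
  selector:
    app: skywalking-ui
  type: ClusterIP
```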

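Proxying the UI with nginx and adding login authentication

The SkyWalking web UI has no login feature, so nginx's authentication is used here in front of the UI.

This requires an nginx image that includes htpasswd support.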

kind: Deployment
apiVersion: apps/v1
metadata:
  name: nginx-htpasswd
  namespace: default
  labels:
    k8s-app: nginx-htpasswd
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: nginx-htpasswd
  template:
    metadata:
      name: nginx-htpasswd
      creationTimestamp: null
      labels:
        k8s-app: nginx-htpasswd
    spec:
      volumes:
        - name: nginx-config
          configMap:
            name: nginx-htpasswd
            items:
              - key: nginx-config
                path: nginx.conf
            defaultMode: 420
        - name: nginx-pwd
          configMap:
            name: nginx-pwd
            items:
              - key: nginx-htpasswd
                path: htpasswd
            defaultMode: 420
      containers:
        - name: nginx-htpasswd
          image: 'nginx:htpasswd'
          env:
            - name: TZ
              value: Asia/Shanghai
          resources:
            limits:
              cpu: '1'
              memory: 1Gi
            requests:
              cpu: 300m
              memory: 1Gi
          volumeMounts:
            - name: nginx-config
              mountPath: /etc/nginx/nginx.conf
              subPath: nginx.conf
            - name: nginx-pwd
              mountPath: /etc/nginx/htpasswd
              subPath: htpasswd
          imagePullPolicy: IfNotPresent
      restartPolicy: Always

The login credentials are configured through a ConfigMap.

kind: ConfigMap
apiVersion: v1
metadata:
  name: nginx-pwd
  namespace: default
data:
  nginx-htpasswd: |
    test:$apr1$xCml6c7T$5gtf25owVGxqYAA1SBGBO.
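
The nginx Deployment also mounts a second ConfigMap, nginx-htpasswd, with the key nginx-config, whose contents are not shown in the post. The following is a minimal sketch of what it might contain, not the original configuration. It assumes a Service named skywalking-ui on port 8080 exists in the same namespace, that nginx listens on port 80, and it uses an arbitrary realm string; the htpasswd path matches the mount path in the Deployment.

```yaml
# Hypothetical contents for the nginx-htpasswd ConfigMap referenced by the Deployment.
kind: ConfigMap
apiVersion: v1
metadata:
  name: nginx-htpasswd
  namespace: default
data:
  nginx-config: |
    worker_processes 1;
    events {
      worker_connections 1024;
    }
    http {
      server {
        listen 80;
        location / {
          # Prompt for the credentials stored in the mounted htpasswd file.
          auth_basic "SkyWalking";
          auth_basic_user_file /etc/nginx/htpasswd;
          # Assumption: a Service named skywalking-ui exposes the UI on 8080.
          proxy_pass http://skywalking-ui:8080;
          proxy_set_header Host $host;
        }
      }
    }
```

Because this file replaces /etc/nginx/nginx.conf as a whole, it has to carry its own events and http blocks rather than only a server block.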
