# How I set up Node Exporter and Prometheus in K8s

## K3s

I have made a K3s cluster with 1 control node and 3 worker nodes, all running on my Proxmox server. Each node is set up with 4 vCPU, 4 GB RAM and a 16 GB disk. Everything here is inspired by [How to Setup Prometheus Node Exporter on Kubernetes](https://devopscube.com/node-exporter-kubernetes/) and [How to Setup Prometheus Monitoring On Kubernetes Cluster](https://devopscube.com/setup-prometheus-monitoring-on-kubernetes/), so I take no credit for this. I simply put the information together and installed it on my Kubernetes cluster.

## Prep

Just make a namespace for the monitoring to reside in:

```kubectl create namespace monitoring```
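
A quick check that the namespace exists before moving on (this verification step is mine, not from the original guides):

```
kubectl get namespace monitoring
```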

## Node Exporter

First of all, you need a DaemonSet to make sure a Node Exporter is running on every node. Create the file with ```nano daemonset.yaml``` and copy this into it:

```
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: node-exporter
  name: node-exporter
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app.kubernetes.io/component: exporter
      app.kubernetes.io/name: node-exporter
  template:
    metadata:
      labels:
        app.kubernetes.io/component: exporter
        app.kubernetes.io/name: node-exporter
    spec:
      containers:
      - args:
        - --path.sysfs=/host/sys
        - --path.rootfs=/host/root
        - --no-collector.wifi
        - --no-collector.hwmon
        - --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/)
        - --collector.netclass.ignored-devices=^(veth.*)$
        name: node-exporter
        image: prom/node-exporter
        ports:
        - containerPort: 9100
          protocol: TCP
        resources:
          limits:
            cpu: 250m
            memory: 180Mi
          requests:
            cpu: 102m
            memory: 180Mi
        volumeMounts:
        - mountPath: /host/sys
          mountPropagation: HostToContainer
          name: sys
          readOnly: true
        - mountPath: /host/root
          mountPropagation: HostToContainer
          name: root
          readOnly: true
      volumes:
      - hostPath:
          path: /sys
        name: sys
      - hostPath:
          path: /
        name: root
```

Then you have to deploy this into your Kubernetes cluster. Run this from the control node:

```kubectl create -f daemonset.yaml```

If you want to monitor the daemonset while it's being created, you can run:

```kubectl get daemonset -n monitoring```
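
To sanity-check one of the exporters, you can port-forward a pod and curl its metrics endpoint. This is my own quick test; the pod name below is a placeholder, so substitute a real one from the first command:

```
kubectl get pods -n monitoring -o wide
# substitute a real pod name from the list above
kubectl port-forward -n monitoring pod/node-exporter-xxxxx 9100:9100 &
curl -s http://localhost:9100/metrics | head
kill %1   # stop the port-forward again
```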

You need a Service so that the pods can expose their data. Create a file ```nano service.yaml``` with the contents:

```
---
kind: Service
apiVersion: v1
metadata:
  name: node-exporter
  namespace: monitoring
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '9100'
spec:
  selector:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: node-exporter
  ports:
  - name: node-exporter
    protocol: TCP
    port: 9100
    targetPort: 9100
```

Save, and run:

```kubectl create -f service.yaml```

(You can check the service by running ```kubectl get endpoints -n monitoring```.)
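
You can also hit the service from inside the cluster with a throwaway pod; this one-liner is my own addition, using the service's in-namespace DNS name:

```
kubectl run tmp --rm -i --restart=Never --image=busybox -n monitoring \
  -- wget -qO- http://node-exporter:9100/metrics | head
```
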
## Prometheus

Prometheus needs read access to the Kubernetes API (nodes, services, endpoints, pods and ingresses) to discover its scrape targets. For that to be possible you have to make a ClusterRole with the required RBAC permissions and bind it to the service account Prometheus runs under.

Create a file ```nano clusterRole.yaml``` and add the contents:
```
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: default
  namespace: monitoring
```

Then you create it:

```kubectl create -f clusterRole.yaml```
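
To verify that the binding works, you can impersonate the service account and ask the API server directly; both commands should answer yes:

```
kubectl auth can-i list pods --as=system:serviceaccount:monitoring:default
kubectl auth can-i watch endpoints --as=system:serviceaccount:monitoring:default
```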

Then you need a ConfigMap for Prometheus. That way you don't have to fiddle around with rules and config in external files, which would force you to rebuild the pod every time you make a config change.

Create a file ```nano config-map.yaml``` and add the contents:
```
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-server-conf
  labels:
    name: prometheus-server-conf
  namespace: monitoring
data:
  prometheus.rules: |-
    groups:
    - name: devopscube demo alert
      rules:
      - alert: High Pod Memory
        expr: sum(container_memory_usage_bytes) > 1
        for: 1m
        labels:
          severity: slack
        annotations:
          summary: High Memory Usage
  prometheus.yml: |-
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
    rule_files:
      - /etc/prometheus/prometheus.rules
    alerting:
      alertmanagers:
      - scheme: http
        static_configs:
        - targets:
          - "alertmanager.monitoring.svc:9093"
    scrape_configs:
      - job_name: 'node-exporter'
        kubernetes_sd_configs:
        - role: endpoints
        relabel_configs:
        - source_labels: [__meta_kubernetes_endpoints_name]
          regex: 'node-exporter'
          action: keep
      - job_name: 'kubernetes-apiservers'
        kubernetes_sd_configs:
        - role: endpoints
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
        - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
          action: keep
          regex: default;kubernetes;https
      - job_name: 'kubernetes-nodes'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
        - role: node
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - target_label: __address__
          replacement: kubernetes.default.svc:443
        - source_labels: [__meta_kubernetes_node_name]
          regex: (.+)
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
        - role: pod
        relabel_configs:
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
          action: replace
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
          target_label: __address__
        - action: labelmap
          regex: __meta_kubernetes_pod_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_pod_name]
          action: replace
          target_label: kubernetes_pod_name
      - job_name: 'kube-state-metrics'
        static_configs:
        - targets: ['kube-state-metrics.kube-system.svc.cluster.local:8080']
      - job_name: 'kubernetes-cadvisor'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
        - role: node
        relabel_configs:
        - action: labelmap
          regex: __meta_kubernetes_node_label_(.+)
        - target_label: __address__
          replacement: kubernetes.default.svc:443
        - source_labels: [__meta_kubernetes_node_name]
          regex: (.+)
          target_label: __metrics_path__
          replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
      - job_name: 'kubernetes-service-endpoints'
        kubernetes_sd_configs:
        - role: endpoints
        relabel_configs:
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
          action: keep
          regex: true
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
          action: replace
          target_label: __scheme__
          regex: (https?)
        - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
          action: replace
          target_label: __metrics_path__
          regex: (.+)
        - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
          action: replace
          target_label: __address__
          regex: ([^:]+)(?::\d+)?;(\d+)
          replacement: $1:$2
        - action: labelmap
          regex: __meta_kubernetes_service_label_(.+)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: kubernetes_namespace
        - source_labels: [__meta_kubernetes_service_name]
          action: replace
          target_label: kubernetes_name
```

After that you create it:

```kubectl create -f config-map.yaml```
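
A note on config changes: when you later edit this ConfigMap, applying it updates the mounted files after a short delay, but Prometheus only re-reads them on a reload. A sketch of that workflow, assuming you add the --web.enable-lifecycle flag to the Prometheus args (it is not set in the deployment below) and use the NodePort service created at the end of this post:

```
kubectl apply -f config-map.yaml   # push the edited config
sleep 60                           # give the kubelet time to sync the mounted files
# ask Prometheus to reload; requires the --web.enable-lifecycle flag
curl -X POST http://<control-node-ip>:30000/-/reload
```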

Now you are ready to deploy Prometheus. Create a file ```nano prometheus-deployment.yaml``` and add the contents:
```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-deployment
  namespace: monitoring
  labels:
    app: prometheus-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-server
  template:
    metadata:
      labels:
        app: prometheus-server
    spec:
      containers:
      - name: prometheus
        image: prom/prometheus
        args:
        - "--storage.tsdb.retention.time=12h"
        - "--config.file=/etc/prometheus/prometheus.yml"
        - "--storage.tsdb.path=/prometheus/"
        ports:
        - containerPort: 9090
        resources:
          requests:
            cpu: 500m
            memory: 500M
          limits:
            cpu: 1
            memory: 1Gi
        volumeMounts:
        - name: prometheus-config-volume
          mountPath: /etc/prometheus/
        - name: prometheus-storage-volume
          mountPath: /prometheus/
      volumes:
      - name: prometheus-config-volume
        configMap:
          defaultMode: 420
          name: prometheus-server-conf
      - name: prometheus-storage-volume
        emptyDir: {}
```

And create it:

```kubectl create -f prometheus-deployment.yaml```
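
Two commands I find useful for following the rollout and checking that Prometheus starts cleanly:

```
kubectl rollout status deployment/prometheus-deployment -n monitoring
kubectl logs deployment/prometheus-deployment -n monitoring | tail
```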

For you to be able to access it remotely, you have a couple of solutions available:

- Port-forward with kubectl (see the sketch after this list).
- Ingress (then you have to fiddle with SSL).
- Expose it as a service, which is what I ended up choosing.
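
The port-forward option is the quickest for a one-off look; a minimal sketch:

```
# forward local port 9090 to the Prometheus pod, then browse http://localhost:9090
kubectl port-forward -n monitoring deployment/prometheus-deployment 9090:9090
```
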
Create a file ```nano prometheus-service.yaml``` and copy this content into it:

```
apiVersion: v1
kind: Service
metadata:
  name: prometheus-service
  namespace: monitoring
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '9090'
spec:
  selector:
    app: prometheus-server
  type: NodePort
  ports:
  - port: 8080
    targetPort: 9090
    nodePort: 30000
```

And create it:

```kubectl create -f prometheus-service.yaml --namespace=monitoring```

Now you should be able to access Prometheus by using the URL ```http://<control-node-ip>:30000``` (a NodePort is open on every node, so any node IP will do).
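
As a final check, you can ask the Prometheus HTTP API which scrape targets were discovered; jq is optional here but makes the output readable:

```
curl -s http://<control-node-ip>:30000/api/v1/targets \
  | jq '.data.activeTargets[] | {job: .labels.job, health: .health}'
```
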
Happy monitoring!
# Kernel 6.2.x on an Intel system

When I updated my laptop today, I discovered an error message during the `apt upgrade`:

`W: Possible missing firmware /lib/firmware/i915/skl_guc_ver6.bin for module i915_bpo`.

This led me to a bit of searching around. Most solutions were quite rustic: manually downloading firmware files, copying them into the correct directories and rebuilding the initramfs by hand.

But after some frustration and more drilling down in the search results, I came upon this solution in a post on askubuntu.com: a simple script which handles everything. Just run it and you're done.

```
#!/bin/bash

# Regex that extracts the missing .bin filename from the update-initramfs warnings
WARNING_PATTERN='(?<=W: Possible missing firmware /lib/firmware/i915/)[\w.]+\.bin(?= for module i915)'
# Raw-file URL in the linux-firmware repo; {} is replaced by xargs with each filename
DOWNLOAD_URL='https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/plain/i915/{}'
FIRMWARE_DIR='/lib/firmware/i915'

shopt -s nullglob

WORKDIR="$(mktemp -d)"
cd "$WORKDIR" || exit 1
echo "Will check for missing i915 firmware and download blobs in '$WORKDIR'."

# Rebuild the initramfs, scrape the warnings for missing blobs, download each one
sudo update-initramfs -u |&
    grep -Po "$WARNING_PATTERN" |
    xargs -t -I {} curl -O "$DOWNLOAD_URL"

if [[ -n "$(shopt -s nullglob; echo ./*.bin)" ]] ; then
    # Install the downloaded blobs with sane ownership/permissions, then rebuild again
    sudo chown root: ./*.bin
    sudo chmod 644 ./*.bin
    sudo mv ./*.bin "$FIRMWARE_DIR"
    sudo update-initramfs -u
else
    echo 'No missing firmware found/downloaded.'
fi

rmdir "$WORKDIR"
```
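
Save it as, say, fix-i915-firmware.sh (the filename is my own choice), make it executable and run it as a normal user; it prompts for sudo where needed:

```
chmod +x fix-i915-firmware.sh
./fix-i915-firmware.sh
```
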
The script was shamelessly "stolen" from [here](https://askubuntu.com/questions/811453/w-possible-missing-firmware-for-module-i915-bpo-when-updating-initramfs/811487).