Add a generic Plugin for Prometheus Exporter metrics
There are several applications that provide their metrics as Prometheus metrics.
Please create a generic check plugin that can read these Prometheus metrics and map them as a service in CMK. This would allow a large number of applications to be included in CMK quickly and easily.
Comments: 20
-
04 Apr, '23
Wouter: Definitely wanted. This would be much appreciated for these applications (and probably many others):
- Sonatype Nexus
- Jenkins with Prometheus plugin
- SonarQube
- GitLab
- RabbitMQ with rabbitmq_prometheus plugin
- KeyCloak
See also:
https://forum.checkmk.com/t/datasource-agent-for-prometheus-node-exporter/34069 -
29 Jun, '23
Daniel: This would not only take the agent and CMK to the next level, but most likely the CMK community as well.
With one feature you would support ALL Prometheus exporters, which means not having to worry so much about the agent side anymore, as there is huge support for exporters.
On the plugin side, everyone can write checks to support the exporters, for the CMK Exchange.
https://prometheus.io/docs/instrumenting/exporters/
(many self-written ones can be found on GitHub)
With all the requests for plugins and the limited amount of time you have for development, this would give hundreds of CMK users a faster and maybe more flexible way to monitor their applications -
11 Jul, '23
Daniel: Looks like even the snclient will get it
https://github.com/ConSol-Monitoring/snclient
- add basic prometheus exporters
- exporter_exporter
- windows_exporter
- node_exporter
- add time support in threshold, ex.: warn=time > 18:00 && load > 10
- add config include folder -
24 Jan, '24
Christian Wirtz: The snclient only covers basic measurements. These are already part of the Checkmk agent (except where the agent cannot be installed).
nice:
go_goroutines 7
not so nice:
go_gc_duration_seconds{quantile="0.75"} 0.000276884
uuhh:
node_filesystem_files{device="zroot/ROOT/default",fstype="zfs",mountpoint="/"} 1.069231e+07
I think it would be much more useful in cases where an exporter provides data where we do not have a plugin for the agent.
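The three sample lines above differ only in how many labels they carry, so a generic reader can treat them the same way. A minimal sketch, assuming the plain Prometheus text format; the function name `parse_sample` is made up for illustration, and escaped characters inside label values are not unescaped:

```python
import re

# Matches: metric_name{optional labels} value
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# Matches one label pair: key="value"
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

def parse_sample(line):
    """Split one exposition line into (name, labels, value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    name, raw_labels, raw_value = match.groups()
    labels = dict(LABEL_RE.findall(raw_labels or ""))
    return name, labels, float(raw_value)

nice = parse_sample('go_goroutines 7')
not_so_nice = parse_sample('go_gc_duration_seconds{quantile="0.75"} 0.000276884')
uuhh = parse_sample(
    'node_filesystem_files{device="zroot/ROOT/default",fstype="zfs",mountpoint="/"} 1.069231e+07'
)
```

The "uuhh" line is not harder to read than the "nice" one; it just yields a larger label dictionary and a value in scientific notation, which `float()` handles.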
I'm dreaming about a solution where one can enter a keyword (a line in the exporter data), a unit, and a field describing how to calculate things with the given value.
This might not be useful when the exporter provides JSON data or something else. But...
Next dream:
If the community would have a kind of interface somewhere (github?) where these pairs (keyword, unit, command) can be entered, we could build up a fast growing database which can be used in a Checkmk datasource for exporters.
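One way to picture such (keyword, unit, command) entries is a small lookup table. This is only a sketch: the table `RULES`, the units, and the conversions are invented for illustration and are not taken from any existing Checkmk or community database.

```python
# Hypothetical rule table: keyword (metric name) -> (unit, calculation).
RULES = {
    "go_goroutines": ("count", lambda v: v),
    "go_gc_duration_seconds": ("ms", lambda v: v * 1000.0),
    "node_filesystem_files": ("k files", lambda v: v / 1000.0),
}

def apply_rule(name, value):
    """Return (converted value, unit), or None if no rule is known."""
    rule = RULES.get(name)
    if rule is None:
        return None
    unit, calc = rule
    return calc(value), unit

converted = apply_rule("go_gc_duration_seconds", 0.000276884)
unknown = apply_rule("some_unlisted_metric", 1.0)
```

Metrics without an entry return `None`, so a caller could fall back to showing the raw value.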
Is that understandable?
Why not start with the easy ones? -
18 Mar, '24
Neil Bingham: Another app for the list would be Keycloak, which exposes all its metrics via a Prometheus exporter.
-
08 May, '24
Oliver: Exposing metrics in the Prometheus format is also becoming a thing in the Microsoft world... just saying (finally getting rid of perfmon and the like)
-
13 May, '24
Martin Hirschvogel (Admin): Hello,
Thank you for your idea. After thorough internal discussion, we’ve decided to plan its implementation for one of the next releases of our software.
We look forward to keeping you updated on progress.
At Checkmk we work on ideas based on business needs, customer demand, and resource availability. For strategic reasons, we reserve the right to re-evaluate the priority and/or scope of this feature as new information becomes available. We therefore ask for your understanding that we do not guarantee its implementation.
Warm regards,
Your Checkmk team -
16 May, '24
Adem: This is exactly what we need: actuator -> Prometheus format -> check_mk
Could you share the timeline for this feature, please?
Regards Adem -
17 May, '24
Marcel Arentz (Admin): Hi,
At the moment we cannot share a more detailed timeline, as we need to evaluate the dependencies and the feasibility of such a feature. We will keep you updated once we can commit to a more precise timeline and details. -
20 May, '24
Martin Hirschvogel (Admin): @Daniel: SNclient+ can handle Prometheus metrics, but that is not connected to Naemon/Thruk at all. It just provides an interface to pull the Prometheus metrics from a Prometheus server + some nice config around it + exporter_exporter functionality.
So these are two different paths:
- SNclient+ & classic monitoring plug-ins -> Naemon/Thruk
- SNclient+ & Prometheus exporters -> Prometheus
Within OMDLabs, they bundle Prometheus into their stack, but each runs as its own application. The only connection is an external link from Thruk to Prometheus, with everything in the end going to Grafana, as is also possible with Checkmk.
What we plan is a proper integration that can digest Prometheus-formatted data with basic query operations. So, quite a bit more. -
21 May, '24
Daniel: I'm aware of what SNclient+ can handle. And we had talked about what's needed as well.
So we hope you will share a first draft soon, so that we do not run into obstacles and can make this a great and very flexible way to open up much more potential for CMK -
08 Jun, '24
Andy: We have quite extensive Prometheus integrations and consume a lot of data in Grafana. However, we use Checkmk for IT monitoring, and setting up alerts, thresholds, etc. is quite complicated (and Alertmanager is not easy to work with).
For reading Prometheus metrics we use PromQL. It's powerful, as we can join multiple metrics using labels, and we can use automation (REST API) to create rules.
Our biggest hurdle at the moment is that one PromQL query must produce exactly one metric. In Prometheus that is almost impossible, as most labels are related to multiple metrics and we might want them all. In one of our use cases we would need to create 10,000 rules to get the data we want.
Here we would like to be able to use labels in the service name, and/or produce performance data for each metric when retrieving multiple metrics.
We would also like to be able to add some or all labels from Prometheus as Checkmk labels.
I think having that functionality would take us far. -
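The "labels in the service name" idea can be sketched as a grouping step over a multi-series query result. This is a rough illustration only: the function `group_into_services`, the metric names, and the label values are all made up, and nothing here reflects how Checkmk actually handles PromQL results.

```python
from collections import defaultdict

def group_into_services(samples, template):
    """Group (metric, labels, value) samples into services.

    The service name is built by filling `template` with label values, so
    all metrics that share those label values end up in one service.
    A label missing from a sample raises KeyError.
    """
    services = defaultdict(dict)
    for metric, labels, value in samples:
        services[template.format(**labels)][metric] = value
    return dict(services)

# Made-up query result: several metrics per queue, distinguished by labels.
samples = [
    ("queue_depth", {"instance": "mq1", "queue": "mail"}, 3.0),
    ("queue_consumers", {"instance": "mq1", "queue": "mail"}, 2.0),
    ("queue_depth", {"instance": "mq1", "queue": "billing"}, 0.0),
]
services = group_into_services(samples, "Queue {queue} on {instance}")
```

With this shape, one rule could yield one service per label combination, each carrying several performance values, instead of one rule per metric.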
28 Aug, '24
Daniel Roettgermann: We have already spent some time on this and would like to share some ideas.
Basically, for us mk_jolokia is a good example and already has some of the necessary plugin functionality.
This will be split into several parts, as the limit is 1000 characters.
######
CMK Prometheus generic metric exporter collector
######
Two variants:
1. 100% generic
- By default it only outputs what it gets, metric for metric, 100% generic
- There is the option to create rules that bring in more logic (see Discovery)
2. Generic with the possibility of individually developing checks
Specific checks can be written for specific applications and their Prometheus metric exporters (puts logic into the metrics, all Grafana/PromQL)
(Example of merging metrics into a graph) -
28 Aug, '24
Daniel Roettgermann: Part 2:
Plugin:
mk_prom_metr_exp
- timeout option has to be available
- parallelization as for example with mk_oracle - otherwise it will run into timeouts or cause the agent to run for more than 60s
- caching function
- set prefilter - due to the size we can already prefilter lines, helpful for example for variant 2. - less data transferred
Check logic
- Metrics Type definition needs to be distinguished (counter/gauge/histogram/...)
- generate Labels from Prometheus labels
- should understand what "values" there are and cut them off for the service name and include them in the service metric/graph accordingly
-- vendor:memory_pool_g1_survivor_space_usage_max_bytes
-- vendor:memory_pool_g1_old_gen_usage_bytes
-- wildfly_datasources_pool_xaforget_average_time
-- solr_metrics_core_searcher_cumulative_cache_total -
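Distinguishing metric types matters because a counter is only meaningful as a rate, while a gauge can be shown as-is. A minimal sketch, assuming the exporter emits the usual `# TYPE <name> <type>` comment lines; the helper names `read_types` and `counter_rate` are hypothetical:

```python
def read_types(text):
    """Collect metric types from '# TYPE <name> <type>' comment lines."""
    types = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            types[parts[2]] = parts[3]
    return types

def counter_rate(prev_value, prev_time, value, time):
    """Per-second rate between two counter readings; None on reset."""
    elapsed = time - prev_time
    if elapsed <= 0 or value < prev_value:
        return None  # no time passed, or the counter was reset
    return (value - prev_value) / elapsed

exposition = """\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
# TYPE node_load1 gauge
node_load1 0.24
"""
types = read_types(exposition)
rate = counter_rate(100.0, 0.0, 160.0, 60.0)  # 60 units in 60 seconds
```

A check could look up the type first and only call `counter_rate` for counters, which requires keeping the previous reading between agent runs.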
28 Aug, '24
Daniel Roettgermann: Part 3
Discovery / Rules
- Metrics exclude/include (regexp)
- Metrics grouping (regexp)
- Grouping by Label (regexp)
Example: (shortened due to character limit)
node_cpu_seconds_total{cpu="0",mode="idle"} 57464.83
node_cpu_seconds_total{cpu="0",mode="iowait"} 0.87
node_cpu_seconds_total{cpu="0",mode="irq"} 0
node_cpu_seconds_total{cpu="1",mode="steal"} 0
node_cpu_seconds_total{cpu="3",mode="nice"} 0.21
node_cpu_seconds_total{cpu="3",mode="softirq"} 0
node_cpu_seconds_total{cpu="3",mode="steal"} 0
node_cpu_seconds_total{cpu="3",mode="system"} 37.25
node_cpu_seconds_total{cpu="3",mode="user"} 32.26
Possible Output
node_cpu_seconds_cpu_{0,1,2,3}
node_cpu_seconds_cpu_{0,1,2,3}_$mode
node_cpu_seconds_total_cpu_{$num}_mode_$mode_value
- Grouping by Name (regexp)
-- node_load1 0.24
-- node_load15 0.02
-- node_load5 0.09
- Metrics Type definition (counter/gauge/histogram/...)
- Metrics renaming
- Service Names with Label/without Label consideration -
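The two grouping ideas above (by label and by name) can be sketched as follows. This is an illustration under assumptions: the functions `group_by_label` and `group_by_name` are hypothetical, samples are `(name, labels, value)` tuples, and stripping a trailing `_total` is just one way to arrive at names like `node_cpu_seconds_cpu_0`.

```python
import re
from collections import defaultdict

def group_by_label(samples, label):
    """One service per value of `label`; remaining labels name the metric."""
    services = defaultdict(dict)
    for name, labels, value in samples:
        if label not in labels:
            continue  # sample does not carry the grouping label
        base = re.sub(r"_total$", "", name)
        service = f"{base}_{label}_{labels[label]}"
        metric = "_".join(v for k, v in sorted(labels.items()) if k != label)
        services[service][metric or "value"] = value
    return dict(services)

def group_by_name(samples, pattern):
    """One service per captured prefix; the captured suffix names the metric."""
    services = defaultdict(dict)
    for name, _labels, value in samples:
        match = re.fullmatch(pattern, name)
        if match:
            services[match.group(1)][match.group(2)] = value
    return dict(services)

cpu_samples = [
    ("node_cpu_seconds_total", {"cpu": "0", "mode": "idle"}, 57464.83),
    ("node_cpu_seconds_total", {"cpu": "0", "mode": "iowait"}, 0.87),
    ("node_cpu_seconds_total", {"cpu": "3", "mode": "system"}, 37.25),
]
load_samples = [
    ("node_load1", {}, 0.24),
    ("node_load15", {}, 0.02),
    ("node_load5", {}, 0.09),
]
by_cpu = group_by_label(cpu_samples, "cpu")
by_load = group_by_name(load_samples, r"(node_load)(\d+)")
```

Here `by_cpu` holds one service per CPU with the modes as metrics, and `by_load` merges the three load values into a single `node_load` service.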
28 Aug, '24
Daniel Roettgermann: Happy to share and discuss the full draft and ideas with your development team
-
08 Oct, '24
Andy Marsh: It looks like the newer HPE Alletra storage will be moving toward Prometheus in the next version (v4) of its firmware. The 3Par connector currently works partially with these devices, but the results don't always make sense because some of the metrics are out of context. This comes down to HPE moving to cloud management for its products and supporting local management less and less. Prometheus integration for this would be excellent. (I'm very aware that "potential vapourware" doesn't really count as an idea, as it would be impossible to implement, but HPE currently also has a separate exporter that only integrates with Prometheus: https://hpe-storage.github.io/array-exporter/deployment/index.html)
-
21 Oct, '24
Adem: Give us an update on the current status, please...
-
28 Oct, '24
Marcel Arentz (Admin): Hi Adem,
We're actively working on a more native solution to use Prometheus endpoints. I cannot give details right now, but we will communicate more officially very soon. -
06 Dec, '24
Martin Hirschvogel (Admin): Hello,
Good news: We have started the development process.
So what’s next? Depending on the scope of your idea, this process may include various steps such as detailed requirements research or a specification phase before it is passed on to development. During this process we may contact you for further information.
Warm regards,
Your Checkmk team