Scraping Traefik metrics from Prometheus

Asked by: 小点点

I'm trying to scrape Traefik metrics from Prometheus.

Traefik (latest version) is hosted as a service on a Swarm cluster, with Prometheus metrics enabled. The corresponding endpoint is 10.200.1.1:8088/metrics

When I open the endpoint in my browser, I see the expected metrics:

...
# HELP traefik_config_last_reload_failure Last config reload failure
# TYPE traefik_config_last_reload_failure gauge
traefik_config_last_reload_failure 0
# HELP traefik_config_last_reload_success Last config reload success
# TYPE traefik_config_last_reload_success gauge
traefik_config_last_reload_success 1.53633684e+09
# HELP traefik_config_reloads_failure_total Config failure reloads
# TYPE traefik_config_reloads_failure_total counter
traefik_config_reloads_failure_total 0
# HELP traefik_config_reloads_total Config reloads
# TYPE traefik_config_reloads_total counter
traefik_config_reloads_total 76
...
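A quick way to confirm which metric names the endpoint exposes is to parse the text output. The following is a minimal Python sketch, not part of the original post; the function name `metric_names` and the `SAMPLE` string (a shortened copy of the output above) are illustrative assumptions.

```python
def metric_names(exposition_text):
    """Return the sorted, de-duplicated sample names in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and '# HELP' / '# TYPE' comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like 'name value' or 'name{labels} value'.
        name = line.split("{", 1)[0].split()[0]
        names.add(name)
    return sorted(names)


# Shortened copy of the output shown above, used only for illustration.
SAMPLE = """\
# HELP traefik_config_last_reload_failure Last config reload failure
# TYPE traefik_config_last_reload_failure gauge
traefik_config_last_reload_failure 0
# HELP traefik_config_reloads_total Config reloads
# TYPE traefik_config_reloads_total counter
traefik_config_reloads_total 76
"""
```

Calling `metric_names(SAMPLE)` returns the two sample names in the string. If the names listed by the endpoint never show up in Prometheus, the problem is more likely on the scraping side than in Traefik itself.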

So, as I understand it, editing prometheus.yml as follows (and POSTing to /-/reload) should be enough to add these metrics.

global:
  scrape_interval:     15s

rule_files:
  - "targets.rules"
  - "host.rules"
  - "containers.rules"

scrape_configs:

...

  - job_name: 'traefik'
    metrics_path: '/metrics'
    static_configs:
      - targets: ['10.200.1.2:8088']
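For reference, the reload step can be sketched in Python as below. This is an assumption-laden illustration rather than the asker's actual procedure: the base URL `http://localhost:9090` and both function names are hypothetical. Note that Prometheus 2.x only serves `/-/reload` when it is started with the `--web.enable-lifecycle` flag.

```python
import urllib.request


def build_reload_request(prometheus_url):
    """Build (but do not send) the POST request for the /-/reload endpoint."""
    url = prometheus_url.rstrip("/") + "/-/reload"
    # An empty body is enough; the endpoint only cares about the HTTP method.
    return urllib.request.Request(url, data=b"", method="POST")


def reload_config(prometheus_url, timeout=5.0):
    """Send the reload request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, for example
    when the lifecycle API is disabled or the new config is invalid.
    """
    request = build_reload_request(prometheus_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Usage, assuming Prometheus listens on localhost:9090: `reload_config("http://localhost:9090")`. A successful reload only means the file was parsed; it does not mean the new target can actually be reached.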

Unfortunately, none of them appear in the metric drop-down list in Prometheus.
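When metrics are missing from the drop-down, it usually helps to look at the target's scrape state, which Prometheus reports on its Targets page and through `GET /api/v1/targets`. Here is a small sketch that summarizes that response for one job. The function name and the example body are hypothetical; in particular, the `lastError` value is made up for illustration and is not something the asker reported.

```python
import json


def job_target_health(targets_payload, job_name):
    """Return (scrape URL, health, last error) for each active target of a job."""
    active = targets_payload.get("data", {}).get("activeTargets", [])
    return [
        (t.get("scrapeUrl", ""), t.get("health", "unknown"), t.get("lastError", ""))
        for t in active
        if t.get("labels", {}).get("job") == job_name
    ]


# Hypothetical response body, shaped like the output of GET /api/v1/targets.
EXAMPLE_BODY = json.loads("""
{
  "status": "success",
  "data": {
    "activeTargets": [
      {
        "labels": {"instance": "10.200.1.2:8088", "job": "traefik"},
        "scrapeUrl": "http://10.200.1.2:8088/metrics",
        "lastError": "context deadline exceeded",
        "health": "down"
      }
    ]
  }
}
""")
```

An empty result for `job_target_health(body, "traefik")` would suggest the job was never loaded, while a `down` entry with an error points to a connectivity problem between Prometheus and the target.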

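Since I'm new to both Traefik and Prometheus, I'm fairly sure I've misunderstood something. I tried following a few guides (such as this one), but couldn't get it to work (they may have been written for earlier versions).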

So... does anyone know what I'm doing wrong and/or what the right approach is?

1 answer

Anonymous user

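After a while, many attempts, and a few related questions, I finally concluded that this had nothing to do with my configuration. Since I was also seeing some random odd behavior (for example, occasional 503 errors on remote /providers calls), I began to suspect that the problem was with access to my machine.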

So I tried demoting the manager and promoting another node of the swarm instead... and it worked! My Traefik metrics now show up in Prometheus!

I still need to figure out what was wrong with my former manager node, but at least I'm moving forward!

Thanks to @AlinSínp lean