Unable to start Prometheus Server

Problem description

I installed Prometheus on an Amazon Linux 2 instance. This is the configuration I use in the user data:

cat << EOF > /etc/systemd/system/prometheus.service 
[Unit] 
Description=Prometheus Server 
Documentation=https://prometheus.io/docs/introduction/overview/ 
Wants=network-online.target
After=network-online.target

[Service] 
User=prometheus 
Restart=on-failure 

# Change this line if you installed Prometheus
# under a different path or user
ExecStart=/home/prometheus/prometheus/prometheus --config.file=/home/prometheus/prometheus/prometheus.yml --storage.tsdb.path=/app/prometheus/data

[Install] 
WantedBy=multi-user.target 
EOF

cat << EOF > /home/prometheus/prometheus/prometheus.yml 
# my global config 
global: 
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute. 
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute. 
  # scrape_timeout is set to the global default (10s). 

# Alertmanager configuration 
alerting: 
  alertmanagers: 
  - static_configs: 
    - targets: 
      # - alertmanager:9093 

# Load rules once and periodically evaluate them according to the global evaluation_interval. 
rule_files: 
  # - "first_rules.yml" 
  # - "second_rules.yml" 

# A scrape configuration containing exactly one endpoint to scrape: 
# Here it's Prometheus itself. 
scrape_configs: 
  # The job name is added as a label job=<job_name> to any timeseries scraped from this config. 
  - job_name: 'prometheus' 

    # metrics_path defaults to '/metrics' 
    # scheme defaults to 'http'. 

    static_configs: 
    - targets: ['localhost:9090'] 
  - job_name: 'node_prometheus' 

    # metrics_path defaults to '/metrics' 
    # scheme defaults to 'http'. 

    static_configs: 
    - targets: ['localhost:9100'] 
  - job_name: 'grafana' 

    # metrics_path defaults to '/metrics' 
    # scheme defaults to 'http'. 

    static_configs: 
# put the Grafana ALB DNS name here
    - targets: ['${grafana_dns}'] 

  - job_name: 'sqs_exporter' 
    scrape_interval: 30s 
    scrape_timeout: 30s 
    static_configs: 
    - targets: ['localhost:9434'] 

  - job_name: 'cloudwatch_exporter' 
    scrape_interval: 5m 
    scrape_timeout: 60s 
    static_configs: 
    - targets: ['localhost:9106'] 

  - job_name: '_metrics' 
    metric_relabel_configs: 
    relabel_configs: 
     - source_labels: 
       - __meta_ec2_platform 
       action: keep 
       regex: .*windows.* 
     - action: labelmap 
       regex: __meta_ec2_tag_(.*) 
       replacement: \$1 
    ec2_sd_configs: 
      - region: eu-west-1 
        port: 9543 

  - job_name: 'cadvisor' 
    static_configs: 
    - targets: ['localhost:8080'] 

  - job_name: 'elasticbeanstalk_exporter' 
    static_configs: 
    - targets: ['localhost:9552'] 

EOF



systemctl daemon-reload 
systemctl enable prometheus
systemctl start prometheus
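
The unit runs as User=prometheus and points --storage.tsdb.path at /app/prometheus/data, but the user data shown above does not create that directory or set its owner. Whether that happens elsewhere is not visible here, so the following is only a sketch of a preflight check, not a confirmed diagnosis. The function name check_data_dir and the file name preflight.sh are made up for illustration.

```shell
#!/bin/sh
# Sketch: report whether a directory exists and is writable by the user
# running this script. To test the service user, run it as that user:
#   sudo -u prometheus sh preflight.sh /app/prometheus/data
# (preflight.sh is a hypothetical file name.)

check_data_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    # Both write and search (x) permission are needed to create files inside.
    if [ ! -w "$dir" ] || [ ! -x "$dir" ]; then
        echo "not writable by $(id -un): $dir"
        return 1
    fi
    echo "ok: $dir"
}

# Only run the check when a path is given on the command line.
if [ "$#" -gt 0 ]; then
    check_data_dir "$1"
fi
```

If this reports a missing or unwritable directory, creating it and giving it to the prometheus user before systemctl start would be the first thing to try.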

When I check whether Prometheus is running, I get the following:

[ec2-user@ip-10-193-192-49 ~]$  sudo systemctl status prometheus
● prometheus.service - Prometheus Server
   Loaded: loaded (/etc/systemd/system/prometheus.service; enabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Mon 2019-12-02 11:12:33 UTC; 4s ago
     Docs: https://prometheus.io/docs/introduction/overview/
  Process: 22507 ExecStart=/home/prometheus/prometheus/prometheus --config.file=/home/prometheus/prometheus/prometheus.yml --storage.tsdb.path=/app/prometheus/data (code=exited, status=2)
 Main PID: 22507 (code=exited, status=2)

Dec 02 11:12:33 ip-10-193-192-49.service.app systemd[1]: Unit prometheus.service entered failed state.
Dec 02 11:12:33 ip-10-193-192-49.service.app systemd[1]: prometheus.service failed.
Dec 02 11:12:33 ip-10-193-192-49.service.app systemd[1]: prometheus.service holdoff time over, scheduling restart.
Dec 02 11:12:33 ip-10-193-192-49.service.app systemd[1]: start request repeated too quickly for prometheus.service
Dec 02 11:12:33 ip-10-193-192-49.service.app systemd[1]: Failed to start Prometheus Server.
Dec 02 11:12:33 ip-10-193-192-49.service.app systemd[1]: Unit prometheus.service entered failed state.
Dec 02 11:12:33 ip-10-193-192-49.service.app systemd[1]: prometheus.service failed.
[ec2-user@ip-10-193-192-49 ~]$
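
Every line in the status output above comes from systemd[1]. "start request repeated too quickly" is systemd's restart rate limit kicking in, so it says nothing about why the process exited with status 2. A minimal sketch for pulling only the lines written by the prometheus process out of the journal follows. The helper name prometheus_lines is hypothetical, and the pattern assumes the default short journal format shown above.

```shell
#!/bin/sh
# Keep only journal lines written by the prometheus process itself,
# dropping systemd's own status messages. Assumes the default short
# journal format: "<date> <host> prometheus[<pid>]: <message>".
prometheus_lines() {
    grep -E ' prometheus\[[0-9]+\]: '
}

# Usage on the instance (needs a real systemd journal, so it is left as
# a comment here):
#   journalctl -u prometheus -l --no-pager -n 200 | prometheus_lines
```

If the filter prints nothing, the process wrote nothing to the journal within the last 200 lines, so raise -n or start the service again first.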

I installed Prometheus version 2.14.0. Any help?

I commented out the line Restart=on-failure in the file /etc/systemd/system/prometheus.service, then ran:

systemctl daemon-reload 
systemctl status prometheus

I got:

Dec 02 12:57:52 ip-10-193-192-58.service.app systemd[1]: start request repeated too quickly for prometheus.service
Dec 02 12:57:52 ip-10-193-192-58.service.app systemd[1]: Failed to start Prometheus Server.
Dec 02 12:57:52 ip-10-193-192-58.service.app systemd[1]: Unit prometheus.service entered failed state.
Dec 02 12:57:52 ip-10-193-192-58.service.app systemd[1]: prometheus.service failed.
Dec 02 12:58:03 ip-10-193-192-58.service.app systemd[1]: Started Prometheus Server.
Dec 02 12:58:03 ip-10-193-192-58.service.app systemd[1]: Starting Prometheus Server...
Dec 02 12:58:03 ip-10-193-192-58.service.app prometheus[23391]: level=info ts=2019-12-02T12:58:03.686Z caller=main.go:296 msg="no time or size retention was set so
Dec 02 12:58:03 ip-10-193-192-58.service.app prometheus[23391]: level=info ts=2019-12-02T12:58:03.687Z caller=main.go:332 msg="Starting Prometheus" version="(versio
Dec 02 12:58:03 ip-10-193-192-58.service.app prometheus[23391]: level=info ts=2019-12-02T12:58:03.687Z caller=main.go:333 build_context="(go=go1.13.4, user=root@df2
Dec 02 12:58:03 ip-10-193-192-58.service.app prometheus[23391]: level=info ts=2019-12-02T12:58:03.687Z caller=main.go:334 host_details="(Linux 4.14.77-81.59.amzn2.x
Dec 02 12:58:03 ip-10-193-192-58.service.app prometheus[23391]: level=info ts=2019-12-02T12:58:03.687Z caller=main.go:335 fd_limits="(soft=1024, hard=4096)"
Dec 02 12:58:03 ip-10-193-192-58.service.app prometheus[23391]: level=info ts=2019-12-02T12:58:03.687Z caller=main.go:336 vm_limits="(soft=unlimited, hard=unlimited
Dec 02 12:58:03 ip-10-193-192-58.service.app prometheus[23391]: level=error ts=2019-12-02T12:58:03.692Z caller=query_logger.go:85 component=activeQueryTracker msg="
Dec 02 12:58:03 ip-10-193-192-58.service.app systemd[1]: prometheus.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Dec 02 12:58:03 ip-10-193-192-58.service.app systemd[1]: Unit prometheus.service entered failed state.
Dec 02 12:58:03 ip-10-193-192-58.service.app systemd[1]: prometheus.service failed.
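
The only level=error line above, from component=activeQueryTracker, is cut off at the terminal width, so the message that explains the exit is not visible. INVALIDARGUMENT is just the name systemd gives to exit code 2, not something Prometheus printed. One way to see the full message is to run the same command in the foreground as the service user. The sketch below assumes the unit file has a single-line ExecStart=, and the helper name exec_start_of is hypothetical.

```shell
#!/bin/sh
# Print the command line from the first ExecStart= line of a unit file.
# Assumes ExecStart is written on a single line, as in the unit above.
exec_start_of() {
    sed -n 's/^ExecStart=//p' "$1" | head -n 1
}

# Usage on the instance (assumes sudo and the prometheus user exist, so
# it is left as a comment here):
#   cmd=$(exec_start_of /etc/systemd/system/prometheus.service)
#   sudo -u prometheus sh -c "$cmd"
```

Running it this way prints the complete log lines to the terminal with the same user and flags the service uses, which should show the rest of the truncated activeQueryTracker error.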

No solution found yet
