Bug #16271

Monitoring check for Postfix mail queue never reaches warning/critical state

Added by intrigeri about 2 months ago. Updated 21 days ago.

Status: In Progress
Priority: Elevated
Assignee:
Category: Infrastructure
Target version:
Start date: 01/04/2019
Due date:
% Done: 50%
QA Check: Ready for QA
Feature Branch:
Type of work: Sysadmin
Blueprint:
Starter:
Affected tool:

Description

There were dozens of messages in the queue today and I don't think we've received any alert.


Related issues

Related to Tails - Bug #12086: Monitor the size of WhisperBack SMTP relay's queue (Resolved, 12/26/2016)
Blocks Tails - Feature #13242: Core work 2017Q4 → 2019Q2: Sysadmin (Maintain our already existing services) (Confirmed, 06/29/2017)

History

#1 Updated by intrigeri about 2 months ago

  • Target version set to Tails_3.12
  • Type of work changed from Security Audit to Sysadmin

#2 Updated by intrigeri 25 days ago

  • Blocks Feature #13242: Core work 2017Q4 → 2019Q2: Sysadmin (Maintain our already existing services) added

#3 Updated by intrigeri 25 days ago

I'm not sure why I assigned this to myself: arguably this is about keeping things working, i.e. weekly shifts. But IIRC I implemented these checks initially, so I feel kinda responsible, and I'm curious, so I'll give it a try :)

#4 Updated by intrigeri 25 days ago

  • Related to Bug #12086: Monitor the size of WhisperBack SMTP relay's queue added

#5 Updated by intrigeri 25 days ago

  • Subject changed from Monitoring check for WhisperBack mail queue is broken to Monitoring check for Postfix mail queue never reaches warning/critical state
  • Status changed from Confirmed to In Progress
  • % Done changed from 0 to 10

The check works just fine as far as collecting data (that we don't store) goes, but a comment there states: "By default, all thresholds are 0 except corrupt_crit". So for the past 2 years, Icinga has never told us about anything going wrong on the email front.

#6 Updated by intrigeri 25 days ago

To be clear, that comment is wrong: all thresholds are empty by default. I had mistakenly believed the comment and concluded that, with a threshold of 0, any value greater than 0 would trigger a notification, i.e. that the default settings were good for us. Reading the code shows otherwise: with empty thresholds, the check never reaches the warning or critical state.
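
To illustrate the difference between a threshold of 0 and an empty threshold, here is a minimal Python sketch. It is not the actual check's code: the `evaluate` function and its parameters are hypothetical, and it only assumes the usual plugin convention that 0 means OK, 1 means WARNING and 2 means CRITICAL.

```python
from typing import Optional

# Conventional Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def evaluate(queue_size: int, warn: Optional[int], crit: Optional[int]) -> int:
    """Return a plugin-style status for a mail queue size.

    Hypothetical helper: an unset (None) threshold is skipped entirely,
    so it can never raise the status above OK.
    """
    if crit is not None and queue_size > crit:
        return CRITICAL
    if warn is not None and queue_size > warn:
        return WARNING
    return OK


# Thresholds left empty: dozens of queued messages still report OK.
print(evaluate(36, None, None))  # prints 0

# Warning threshold explicitly set to 0: any queued message is flagged.
print(evaluate(36, 0, None))  # prints 1

# Both thresholds set to 0: any queued message is critical.
print(evaluate(36, 0, 0))  # prints 2
```

If the real check behaves like this sketch, the misleading comment would explain why relying on the defaults looked safe while in fact no alert could ever fire.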

#7 Updated by intrigeri 25 days ago

  • Assignee changed from intrigeri to bertagaz
  • % Done changed from 10 to 50
  • QA Check set to Ready for QA

Fix deployed. Please review:

It would be nice to somehow trigger an actual email delivery problem and make sure we get notifications. Alternatively, we could take advantage of the fact that some of our Postfix instances currently have deferred email, set the threshold low enough, and profit :)

#8 Updated by intrigeri 23 days ago

We just received "Subject: PROBLEM - jenkins.lizard - is CRITICAL" so it looks like it's working :)

#9 Updated by anonym 21 days ago

  • Target version changed from Tails_3.12 to Tails_3.13
