Feature #16215

Bug #15071: Make our server backup process more usable

Add monitoring to stone

Added by groente 9 months ago. Updated about 2 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Sysadmins
Category:
Infrastructure
Target version:
-
Start date:
12/10/2018
Due date:
% Done:

60%

Feature Branch:
Type of work:
Sysadmin
Blueprint:
Starter:
Affected tool:

Description

Stone should be monitored. Of particular importance are:

- disk space check
- check that no deletes have been issued in append-only mode
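
For illustration only, here is a minimal sketch of what the disk space check could look like. It assumes a Nagios-style plugin convention (exit code 0 for OK, 1 for WARNING, 2 for CRITICAL); the ticket does not say which monitoring system or check plugin is actually used on stone, and the function names, the default path and the 80%/90% thresholds are all hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical disk space check using Nagios-style plugin exit codes."""
import shutil
import sys

OK, WARNING, CRITICAL = 0, 1, 2  # conventional plugin exit codes


def classify(used_percent, warn=80.0, crit=90.0):
    """Map a usage percentage to a plugin state (thresholds are assumptions)."""
    if used_percent >= crit:
        return CRITICAL
    if used_percent >= warn:
        return WARNING
    return OK


def check_disk(path="/", warn=80.0, crit=90.0):
    """Return (state, message) for the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    used_percent = 100.0 * usage.used / usage.total
    state = classify(used_percent, warn, crit)
    label = ("OK", "WARNING", "CRITICAL")[state]
    return state, f"DISK {label} - {path} is {used_percent:.1f}% full"


if __name__ == "__main__":
    # Optional first argument: the mount point to check (defaults to "/").
    state, message = check_disk(sys.argv[1] if len(sys.argv) > 1 else "/")
    print(message)
    sys.exit(state)
```

The second item (detecting deletes in append-only mode) is not sketched here, since it depends on the backup tool and on access to the repository's key material.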


Related issues

Related to Tails - Feature #16934: Cleanup monitoring in puppet-tails Confirmed
Blocks Tails - Feature #13284: Core work: Sysadmin (Adapt our infrastructure) Confirmed 06/30/2017
Blocked by Tails - Feature #16214: Add stone to our VPN Resolved 12/10/2018

History

#1 Updated by intrigeri 9 months ago

  • Blocks Feature #13284: Core work: Sysadmin (Adapt our infrastructure) added

#2 Updated by intrigeri 8 months ago

#3 Updated by groente 2 months ago

  • Status changed from Confirmed to Needs Validation
  • Assignee changed from groente to bertagaz
  • % Done changed from 0 to 60

hey berta,

with a fair bit of hack and slash i managed to get the masterless stone integrated into our monitoring. i'm not particularly excited about the code quality, but i thought to leave cleanup for the upcoming sprints in which we'll have to find generic design answers to integrating masterless nodes into our puppet setup.

either way, a review would be most welcome. thanks!

#4 Updated by groente about 2 months ago

  • Assignee changed from bertagaz to Sysadmins

#5 Updated by intrigeri about 2 months ago

  • Assignee changed from Sysadmins to intrigeri

I'll take it as part of reviewing #15071!

#6 Updated by intrigeri about 2 months ago

  • Status changed from Needs Validation to In Progress
  • Assignee changed from intrigeri to groente

Hi!

with a fair bit of hack and slash i managed to get the masterless stone integrated into our monitoring. i'm not particularly excited about the code quality, but i thought to leave cleanup for the upcoming sprints in which we'll have to find generic design answers to integrating masterless nodes into our puppet setup.

Fair enough!

Food for thought, this might work to get rid of the duplication:

  1. Have tails::monitoring::config::common_services declare its resources directly (as opposed to exporting them). Turn it from a class into a defined resource.
  2. In tails::monitoring::config, instead of including tails::monitoring::config::common_services, export it with @@.
  3. Then we should be able to use tails::monitoring::config::common_services as-is for stone.

Meanwhile, I've added a comment in both files (9d2e64ed) that might help us keep these files in sync. Let's call it good enough for the moment, but maybe capture the problem (and my suggestion above if you'd like) in another ticket so we have it in mind when we redesign stuff?

Finally, am I correct that this part of the ticket description was not implemented:

- check that no deletes have been issued in append-only mode

? IIRC that was required by our security design but I might be mis-remembering so an update/clarification from your side would be much welcome :)

#7 Updated by groente about 2 months ago

#8 Updated by groente about 2 months ago

  • Status changed from In Progress to Needs Validation
  • Assignee changed from groente to Sysadmins

Meanwhile, I've added a comment in both files (9d2e64ed) that might help us keep these files in sync. Let's call it good enough for the moment, but maybe capture the problem (and my suggestion above if you'd like) in another ticket so we have it in mind when we redesign stuff?

I've created #16934 for this.

Finally, am I correct that this part of the ticket description was not implemented:

- check that no deletes have been issued in append-only mode

? IIRC that was required by our security design but I might be mis-remembering so an update/clarification from your side would be much welcome :)

I've been puzzling on an automated way to do this, but these checks would require both access to stone and access to the backup repo's key material.

In theory, lizard has both, but running these checks from lizard would be suboptimal since lizard being compromised is the exact scenario for which we want these checks, so we'd never be able to rely on them.

One option could be to give ecours this kind of access, but i'm pretty uncomfortable with the level of escalation a compromise of ecours would then imply.

So I ended up simply stating in our documentation to always manually check the integrity of the backups before pruning anything. Since pruning is something we'd only have to do once every few years with our current growth, I don't think this will be too much of a burden.

#9 Updated by intrigeri about 2 months ago

  • Status changed from Needs Validation to Resolved

Finally, am I correct that this part of the ticket description was not implemented:

- check that no deletes have been issued in append-only mode

? IIRC that was required by our security design but I might be mis-remembering so an update/clarification from your side would be much welcome :)

I've been puzzling on an automated way to do this, but these checks would require both access to stone and access to the backup repo's key material.

In theory, lizard has both, but running these checks from lizard would be suboptimal since lizard being compromised is the exact scenario for which we want these checks, so we'd never be able to rely on them.

Ah :/ Indeed!

One option could be to give ecours this kind of access, but i'm pretty uncomfortable with the level of escalation a compromise of ecours would then imply.

Gah, no, let's not do this, indeed.

So I ended up simply stating in our documentation to always manually check the integrity of the backups before pruning anything. Since pruning is something we'd only have to do once every few years with our current growth, I don't think this will be too much of a burden.

OK! Closing, then \o/
