Bug #16020

acngtool shrink is insufficient to maintain acng cache size

Added by anonym 12 months ago. Updated 5 months ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: Build system
Target version:
Start date: 10/02/2018
Due date:
% Done: 100%
Feature Branch: bugfix/16020-fix-cache-shrinking
Type of work: Code
Blueprint:
Starter:
Affected tool:

Description

In build-tails we call acngtool shrink 10G before each build to prevent the cache from running out of disk space. From what I can tell it doesn't clean APT indices correctly, e.g. in my /var/cache/apt-cacher-ng/time-based.snapshots.deb.tails.boum.org/debian I have snapshots dating back to January 2017. Each such snapshot takes 30-120 MB (the old ones with multiarch are especially large), so it adds up: for me, to 8 GB. :S

Either we need to improve acngtool (for everyone's benefit) or we manually find snapshots older than six months (or whatever) and purge them from acng's cache.
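For the manual option, a rough sketch of what an age-based purge could look like. The function name `purge_old_snapshots` is hypothetical, and the sketch assumes that each snapshot is a direct subdirectory of the given path and that the directory's mtime is a usable proxy for the snapshot's age, neither of which has been checked against acng's actual layout. It relies on GNU find (`-mindepth`, `-maxdepth`) and defaults to a dry run that only lists candidates.

```shell
#!/bin/sh
# Hypothetical helper, not part of build-tails or apt-cacher-ng.
# Usage: purge_old_snapshots DIR MAX_AGE_DAYS [list|delete]
# Assumption: snapshots are direct subdirectories of DIR and their
# mtime reflects their age.
purge_old_snapshots() {
    cache_dir=$1
    max_age_days=$2
    mode=${3:-list}   # default is a dry run that only prints candidates

    [ -d "$cache_dir" ] || return 1

    if [ "$mode" = delete ]; then
        # -maxdepth 1 keeps find from descending into what it removes
        find "$cache_dir" -mindepth 1 -maxdepth 1 -type d \
            -mtime +"$max_age_days" -exec rm -rf -- {} +
    else
        find "$cache_dir" -mindepth 1 -maxdepth 1 -type d \
            -mtime +"$max_age_days" -print
    fi
}
```

With six months taken as roughly 180 days, one would first run `purge_old_snapshots <snapshot dir> 180` to inspect the list, then repeat with `delete` as a user allowed to write to the cache.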


Related issues

Blocks Tails - Feature #16209: Core work: Foundations Team Confirmed

Associated revisions

Revision f68a97af (diff)
Added by CyrilBrulebois 6 months ago

Fix apt-cacher-ng cache shrinking (refs: #16020).

The “acngtool shrink” command never fails (at least with apt-cacher-ng
2-2, in stretch), which might explain why the missing sudo call was
missed for so long: insufficient rights as an unprivileged user
lead neither to a non-zero exit code nor to a warning message.

Revision 3269cb58
Added by anonym 6 months ago

Merge remote-tracking branch 'origin/bugfix/16020-fix-cache-shrinking' into stable

Fix-committed: #16020

History

#1 Updated by bertagaz 12 months ago

I've noticed that too locally: acng is quite sloppy at shrinking to the maximum size it is given. It always ends up at a somewhat higher value. I "solved" that by just lowering the number to get closer to my needs.

#2 Updated by anonym 12 months ago

That didn't work for me -- no matter the size argument I gave nothing was removed. It is as if a too high ratio of non-debs (APT indices, TBB tarballs) makes its calculations flip out and nothing is freed.

#3 Updated by anonym 12 months ago

intri suggested that acng's daily cronjob should be able to clean it up, but that it takes a long time: "(it fetches all dists again to identify obsolete packages) so I doubt we can do that at every build".

#4 Updated by segfault 12 months ago

I had the same issue (see #16032)

#5 Updated by anonym 12 months ago

anonym wrote:

intri suggested that acng's daily cronjob should be able to clean it up, but that it takes a long time: "(it fetches all dists again to identify obsolete packages) so I doubt we can do that at every build".

Or maybe not:

(17:25:53) intrigeri: ah ah on lizard, we do
"rm -rf /var/cache/apt-cacher-ng/*.tails.boum.org" 
weekly, so no, the cronjob is not what saves us there.

#6 Updated by intrigeri 6 months ago

#7 Updated by intrigeri 6 months ago

  • Assignee deleted (anonym)

After segfault & anonym, @CyrilBrulebois was affected by this problem yesterday ⇒ added to the FT's radar.

No progress here for a while ⇒ deassigning anonym for now, let's make it clear that this ticket is up for grabs and could be tackled by whoever else has time for it :)

#8 Updated by CyrilBrulebois 6 months ago

  • Assignee set to CyrilBrulebois

Hit this yesterday or the day before, it's next on my list.

#9 Updated by CyrilBrulebois 6 months ago

Let's look at our code calling acngtool shrink (vagrant/provision/assets/build-tails):

if [ "${TAILS_PROXY_TYPE}" = "vmproxy" ]; then
    # The apt-cacher-ng cache disk is 15G, so let's ensure at most 10G
    # of it is used, leaving 5G free before each build, which should be
    # enough for any build, even if we have to download a complete set
    # of new packages for a new Debian release.
    /usr/lib/apt-cacher-ng/acngtool shrink 10G -f || \
        echo "The clean-up of apt-cacher-ng's cache failed: this is" \
             "not fatal and most likely just means that some disk" \
             "space could not be reclaimed -- in order to fix that" \
             "situation you need to manually investigate " \
             "/var/cache/apt-cacher-ng/apt-cacher-ng-log/main_*.html" >&2
fi

It seems pretty straightforward, catching issues and displaying an error message when that happens (while carrying on).

But upstream code (source/acngtool.cc in apt-cacher-ng 2-2) has:

        if(verbose)
                cout << "Found " << totalSize << " bytes of relevant data, reducing to " << wantedSize << endl;
        while(!delQ.empty())
        {
                bool todel = (totalSize > wantedSize);
                totalSize -= delQ.top().size;
                const char *msg = 0;
                if(verbose || dryrun)
                        msg = (todel ? "Delete: " : "Keep: " );
                auto& delpath(delQ.top().path);
                if(msg)
                        cout << msg << delpath << endl << msg << delpath << ".head" << endl;
                if(todel && apply)
                {
                        unlink(delpath.c_str());
                        unlink(mstring(delpath + ".head").c_str());
                }
                delQ.pop();
        }
        return 0;

Notice the utter lack of error handling and the mandatory return 0;? That's why we have been missing this for so long: that command never fails. In verbose mode, it seems there's much work going on, with many “Delete:” and a few “Keep:” entries. But the filesystem is left untouched.
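Since the exit status tells us nothing, a caller that wants to notice this kind of silent no-op would have to look at the filesystem itself. A minimal sketch, assuming plain `du` is an acceptable measure; the function names `cache_usage_kb` and `cache_within_limit` are made up for illustration. Note that acngtool only counts what it considers "relevant data", so the `du` total can legitimately end up above the requested size.

```shell
#!/bin/sh
# Hypothetical helpers, not part of build-tails or apt-cacher-ng.

# Print the disk usage of a directory in KiB (empty output if du fails).
cache_usage_kb() {
    du -sk -- "$1" 2>/dev/null | cut -f1
}

# Succeed only if the directory uses at most LIMIT_KB KiB.
# Usage: cache_within_limit DIR LIMIT_KB
cache_within_limit() {
    used_kb=$(cache_usage_kb "$1")
    [ -n "$used_kb" ] && [ "$used_kb" -le "$2" ]
}
```

Run after the shrink, something like `cache_within_limit /var/cache/apt-cacher-ng 10485760 || echo "cache still above 10G" >&2` would at least surface the problem, with the limit given some slack for the reason above.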

Prepending the command with as_root_do fixes the shrinking…
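Because acngtool itself stays quiet when it lacks the rights to delete anything, the calling script could also refuse to pretend. The following is only a sketch of such a guard: `is_root` and `shrink_cache_checked` are hypothetical names, not functions from build-tails, and the check assumes that root is the user expected to have write access to the cache. The acngtool invocation is the one already used in build-tails.

```shell
#!/bin/sh
# Hypothetical guard, not part of build-tails.

# Succeed if the given UID (default: the current user's) is 0.
is_root() {
    [ "${1:-$(id -u)}" -eq 0 ]
}

# Run the shrink only when we can expect unlink() to work;
# otherwise warn loudly instead of silently doing nothing.
shrink_cache_checked() {
    if ! is_root; then
        echo "acngtool shrink would not be able to delete cache files" \
             "as an unprivileged user; skipping" >&2
        return 1
    fi
    /usr/lib/apt-cacher-ng/acngtool shrink 10G -f
}
```

The optional argument to `is_root` exists only so the check can be exercised without actually being root.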

#10 Updated by CyrilBrulebois 6 months ago

  • Status changed from Confirmed to In Progress

#11 Updated by CyrilBrulebois 6 months ago

  • Status changed from In Progress to Confirmed
  • Assignee deleted (CyrilBrulebois)
  • QA Check set to Ready for QA
  • Feature Branch set to bugfix/16020-fix-cache-shrinking

#12 Updated by intrigeri 6 months ago

  • Assignee set to segfault
  • Target version set to Tails_3.14

Thanks a lot, kibi, for your work here :)

Hi @segfault! Last time I checked you used the apt-cacher-ng maintained by our build system. Could you please review this branch?

#13 Updated by anonym 6 months ago

  • Status changed from Confirmed to Fix committed
  • % Done changed from 0 to 100

#14 Updated by anonym 6 months ago

  • Assignee deleted (segfault)
  • % Done changed from 100 to 0
  • QA Check changed from Ready for QA to Pass

I tested it, and it worked perfectly for me! Woohoo!

I've merged into stable and devel but skipped feature/buster so as not to force us all to build a new basebox, given our bandwidth limitations during the sprint.

#15 Updated by anonym 6 months ago

  • % Done changed from 0 to 100

#16 Updated by intrigeri 5 months ago

  • Target version changed from Tails_3.14 to Tails_3.13.2

#17 Updated by anonym 5 months ago

  • Status changed from Fix committed to Resolved

#18 Updated by anonym 5 months ago

  • Target version changed from Tails_3.13.2 to Tails_3.14

#19 Updated by intrigeri 5 months ago

  • Target version changed from Tails_3.14 to Tails_3.13.2
