guix/nix/libstore
Reepca Russelstein 78da695178
daemon: Explicitly unlock output path in the has-become-valid case.
Fixes <https://issues.guix.gnu.org/31785>.

Similar to <https://github.com/NixOS/nix/issues/178>, fixed in
<29cde917fe>.

We can't rely on Goal deletion to release our locks in a timely manner.  In
the case in which multiple guix-daemon processes simultaneously try producing
an output path path1, the one that gets there first (P1) will get the lock,
and the second one (P2) will continue trying to acquire the lock until it is
released.  Once it has acquired the lock, it checks to see whether the path
has already become valid in the meantime, and if so it reports success to
those Goals waiting on its completion and finishes.  Unfortunately, it fails
to release the locks it holds first, so those stay held until that Goal gets
deleted.

Suppose we have the following store path dependency graph:

          path4
      /     |     \
   path1   path2   path3

P2 is now sitting on path1's lock, and will continue to do so until path4 is
completed.  Suppose there is also a P3, and it has been blocked while P1
builds path2.  Now P3 is sitting on path2's lock, and can't acquire path1's
lock to determine that it has been completed.  Likewise P2 is sitting on
path1's lock, and now can't acquire path2's lock to determine that it has been
completed.  Finally, P3 completes path3 while P2 is blocked.

Now:

- P1 knows that path1 and path2 are complete, and holds no locks, but can't
  determine that path3 is complete
- P2 knows that path1 and path3 are complete, and holds locks on path1 and
  path3, but can't determine that path2 is complete
- P3 knows that path2 and path3 are complete, and holds a lock on path2, but
  can't determine that path1 is complete

And none of these locks will be released until path4 is complete.  Thus, we
have a deadlock.

To resolve this, we should explicitly release these locks as soon as they
should be released.

* nix/libstore/build.cc
  (DerivationGoal::tryToBuild, SubstitutionGoal::tryToRun):
  Explicitly release locks in the has-become-valid case.
* tests/store-deadlock.scm: New file.
* Makefile.am (SCM_TESTS): Add it.

Change-Id: Ie510f84828892315fe6776c830db33d0f70bcef8
Signed-off-by: Ludovic Courtès <ludo@gnu.org>
2024-12-30 00:51:57 +01:00
..
.gitignore
build.cc daemon: Explicitly unlock output path in the has-become-valid case. 2024-12-30 00:51:57 +01:00
builtins.cc
builtins.hh
derivations.cc
derivations.hh
gc.cc
globals.cc
globals.hh
local-store.cc
local-store.hh
misc.cc
misc.hh
optimise-store.cc
pathlocks.cc
pathlocks.hh
references.cc
references.hh
sqlite.cc
sqlite.hh
store-api.cc daemon: Improve error message in ‘checkStoreName’. 2024-11-17 23:15:49 +01:00
store-api.hh
worker-protocol.hh