Lockfile detection failure on Debian #115

Closed
twdragon opened this issue Jan 12, 2021 · 2 comments · Fixed by #117
Labels
kind/bug A bug in existing code (including security flaws)

Comments

@twdragon

Using the latest snapshot of the source code, I compiled the tool on Debian Buster x86. But it is now impossible to migrate the repository, because the locking code reports that the lockfile size is not zero. This also happens with the LevelDB datastore selected for the repository. So the only working one is BadgerDS but I am not able to use it on such an old machine. Is there a known workaround?

@twdragon twdragon added the need/triage Needs initial labeling and prioritization label Jan 12, 2021
@djdv

djdv commented Jan 12, 2021

I ran into this problem recently as well, on both Solaris and Windows, when trying to migrate from 10 to 11.
On my Solaris machine I looked through the code, concluded that the check is mistaken, patched it out, and everything seemed to work fine.

This is the diff I applied; it just comments out the check in the lock dependency.
(The first change is only relevant to my OS; the second change is the one relevant here.)
I don't recommend doing this, though, since it circumvents the entire point of the lock mechanism.
The real solution is to find out why the first lock isn't released before the second one is requested. It probably has something to do with two instances of Repo.Open somewhere. Not sure.

```diff
diff --git a/ipfs-10-to-11/_vendor/github.com/libp2p/go-reuseport/control_unix.go b/ipfs-10-to-11/_vendor/github.com/libp2p/go-reuseport/control_unix.go
index fa16e01..2c9bd64 100644
--- a/ipfs-10-to-11/_vendor/github.com/libp2p/go-reuseport/control_unix.go
+++ b/ipfs-10-to-11/_vendor/github.com/libp2p/go-reuseport/control_unix.go
@@ -8,15 +8,18 @@ import (
        "github.com/ipfs/fs-repo-migrations/ipfs-10-to-11/_vendor/golang.org/x/sys/unix"
 )

+const SO_REUSEPORT = 0x2004
+const SO_REUSEADDR = 0x0004
+
 func Control(network, address string, c syscall.RawConn) error {
        var err error
        c.Control(func(fd uintptr) {
-               err = unix.SetsockoptInt(int(fd), unix.SOL_SOCKET, unix.SO_REUSEADDR, 1)
+               err = unix.SetsockoptInt(int(fd), unix.SOL_SOCKET, SO_REUSEADDR, 1)
                if err != nil {
                        return
                }

-               err = unix.SetsockoptInt(int(fd), unix.SOL_SOCKET, unix.SO_REUSEPORT, 1)
+               err = unix.SetsockoptInt(int(fd), unix.SOL_SOCKET, SO_REUSEPORT, 1)
                if err != nil {
                        return
                }
diff --git a/ipfs-10-to-11/_vendor/go4.org/lock/lock_unix.go b/ipfs-10-to-11/_vendor/go4.org/lock/lock_unix.go
index 9e932ca..de178e5 100644
--- a/ipfs-10-to-11/_vendor/go4.org/lock/lock_unix.go
+++ b/ipfs-10-to-11/_vendor/go4.org/lock/lock_unix.go
@@ -32,10 +32,14 @@ func init() {
 }

 func lockFcntl(name string) (io.Closer, error) {
+/*
        fi, err := os.Stat(name)
        if err == nil && fi.Size() > 0 {
                return nil, fmt.Errorf("can't Lock file %q: has non-zero size", name)
        }
+*/

        f, err := os.Create(name)
        if err != nil {
```
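For readers who want to see the commented-out check in isolation, here is a minimal Go sketch of the same size test. The names `checkLockable` and `demoSizeCheck` and the file name used are hypothetical; only the `os.Stat` size comparison and the error text follow the vendored `lockFcntl` shown in the diff. The sketch takes no real fcntl lock, it only shows which lockfile states the check rejects.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// checkLockable mirrors the size test from the diff: a lockfile that already
// exists with content is rejected before any lock is attempted.
// The function name is hypothetical.
func checkLockable(name string) error {
	fi, err := os.Stat(name)
	if err == nil && fi.Size() > 0 {
		return fmt.Errorf("can't Lock file %q: has non-zero size", name)
	}
	return nil
}

// demoSizeCheck runs the check against a missing, an empty, and a non-empty
// file in a throwaway directory and returns the three results.
func demoSizeCheck() (missing, empty, nonEmpty error) {
	dir, err := os.MkdirTemp("", "lockdemo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	name := filepath.Join(dir, "repo.lock") // illustrative file name
	missing = checkLockable(name)

	if err := os.WriteFile(name, nil, 0o644); err != nil {
		panic(err)
	}
	empty = checkLockable(name)

	// Assumption: something wrote data (for example a PID) into the lockfile.
	if err := os.WriteFile(name, []byte("12345"), 0o644); err != nil {
		panic(err)
	}
	nonEmpty = checkLockable(name)
	return missing, empty, nonEmpty
}

func main() {
	missing, empty, nonEmpty := demoSizeCheck()
	fmt.Println("missing file:  ", missing)
	fmt.Println("empty file:    ", empty)
	fmt.Println("non-empty file:", nonEmpty)
}
```

Only the non-empty case returns an error, which matches the "non-zero size" message from the report. Whether the lockfile in the failing migration is non-empty because an earlier lock holder wrote to it is an assumption here, not something this thread confirms.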

On my Windows machine I forewent the upgrade entirely, since I just use that repo for testing (I deleted the v10 repo and initialized v11 with no problems).

Edit:
Also worth noting:

> So the only working one is BadgerDS but I am not able to use it on such an old machine.

I was not able to get the migration from 10 to 11 to work without the patch, even though my Solaris node uses Badger. It also failed on Windows with whatever the default datastore is (flatfs?).
So I don't think the datastore format is the cause here; it looks more like an issue with the migration tool's locking sequence.

@aschmahmann
Contributor

Thanks for the report. I'm pretty sure this is a bug in the 10-11 migration, where we end up locking twice since we both take a manual lock and call fsrepo.Open.

@gammazero do you mind taking a look at this? I suspect that if you compare 10-11 with 6-7, which also uses fsrepo.Open, the fix (which is likely just removing the manual lock) will pop out.
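The double-locking pattern described above can be sketched in a few lines of Go. This is an illustration under stated assumptions, not the actual migration or fsrepo code: `lockRepo`, `openRepo`, `migrateDoubleLock`, and `migrateSingleLock` are hypothetical names, and the lock is simulated with an exclusively created file instead of a real fcntl lock.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// repoLock is a stand-in lock: it is held for as long as the lockfile exists.
type repoLock struct{ path string }

// Close releases the lock by removing the lockfile.
func (l *repoLock) Close() error { return os.Remove(l.path) }

// lockRepo is a hypothetical, simplified lock. It fails if the lockfile
// already exists, which is enough to show a second acquisition failing.
func lockRepo(dir string) (*repoLock, error) {
	p := filepath.Join(dir, "repo.lock")
	f, err := os.OpenFile(p, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o644)
	if err != nil {
		if errors.Is(err, os.ErrExist) {
			return nil, fmt.Errorf("repo at %q is already locked", dir)
		}
		return nil, err
	}
	if err := f.Close(); err != nil {
		return nil, err
	}
	return &repoLock{path: p}, nil
}

// openRepo stands in for a repo-opening call that takes the lock itself
// (an assumption based on the comment above) and returns it for release.
func openRepo(dir string) (*repoLock, error) { return lockRepo(dir) }

// migrateDoubleLock takes a manual lock and then opens the repo, so the
// open fails because the manual lock is still held.
func migrateDoubleLock(dir string) error {
	manual, err := lockRepo(dir)
	if err != nil {
		return err
	}
	defer manual.Close()

	repo, err := openRepo(dir)
	if err != nil {
		return fmt.Errorf("open: %w", err)
	}
	return repo.Close()
}

// migrateSingleLock relies on openRepo alone for locking.
func migrateSingleLock(dir string) error {
	repo, err := openRepo(dir)
	if err != nil {
		return fmt.Errorf("open: %w", err)
	}
	defer repo.Close()
	// Migration work would go here.
	return nil
}

// runDemo runs both variants against a throwaway directory.
func runDemo() (doubleErr, singleErr error) {
	dir, err := os.MkdirTemp("", "migratedemo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	doubleErr = migrateDoubleLock(dir)
	singleErr = migrateSingleLock(dir)
	return doubleErr, singleErr
}

func main() {
	doubleErr, singleErr := runDemo()
	fmt.Println("manual lock + open:", doubleErr)
	fmt.Println("open only:         ", singleErr)
}
```

In this sketch the variant with the manual lock fails at the open step, while the variant that lets the open call own the lock succeeds, which is the shape of the fix suggested above.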

@gammazero gammazero added kind/bug A bug in existing code (including security flaws) and removed need/triage Needs initial labeling and prioritization labels Jan 13, 2021