This is git-annex's bug list. Link bugs to done when done.

wishlist: more descriptive commit messages in git-annex branch
Posted Sat Sep 17 09:24:24 2011

--git-dir and --work-tree options
Posted Sat Sep 17 09:24:24 2011

Prevent accidental merges
Posted Sat Sep 17 09:24:24 2011

Cabal dependency monadIO missing
Posted Sat Sep 17 09:24:24 2011

git annex fsck is a no-op in bare repos
Posted Sat Sep 17 09:10:11 2011

making annex-merge try a fast-forward
Posted Sat Sep 17 09:10:11 2011

annexed symlink mtime matching code is disabled on non-linux systems; needs testing
Posted Sat Sep 17 09:10:11 2011

unannex and uninit do not work when git index is broken
Posted Sat Sep 17 09:10:11 2011

unannex command doesn't all files
Posted Sat Sep 17 09:10:11 2011

Unfortunate interaction with Calibre
Posted Sat Sep 17 09:10:11 2011

softlink mtime
Posted Sat Sep 17 09:10:11 2011

minor bug: errors are not verbose enough
Posted Sat Sep 17 09:10:11 2011

git annex unused seems to check for current path
Posted Sat Sep 17 09:10:11 2011

git rename detection on file move
Posted Sat Sep 17 09:10:11 2011

S3 memory leaks
Posted Sat Sep 17 09:10:11 2011

The key is the basename of the symlink target.
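
A minimal sketch of reading the key that way, assuming readlink is available on the system. The helper name annex_key_of and the demo key are made up for illustration; neither is part of git-annex.

```shell
# Print the key of an annexed file: the basename of its symlink target.
# annex_key_of is a hypothetical helper name, not a git-annex command.
annex_key_of() {
    target=$(readlink "$1") || return 1
    basename "$target"
}

# Demo against a throwaway symlink that mimics an annexed file.
# The key "SHA1-s20--demo" is a placeholder, not a real key.
demo_dir=$(mktemp -d)
ln -s ".git/annex/objects/xx/yy/SHA1-s20--demo/SHA1-s20--demo" "$demo_dir/foo"
demo_key=$(annex_key_of "$demo_dir/foo")
echo "$demo_key"
rm -rf "$demo_dir"
```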
Comment by joey Sun May 15 12:47:53 2011
Maybe I will run into issues myself somewhere down the road, but generally speaking, I really really like the fact that files are immutable by default.
Comment by Richard Mon Mar 21 09:15:03 2011
I've been digging around the trace and code, and used Google to see whether the forkProcess issue was a Haskell thing or an OSX thing. It seems that someone may have run into a similar issue in http://hackage.haskell.org/trac/ghc/ticket/4493, though I am not sure if it's related.
Comment by Jimmy Sat Feb 12 17:19:24 2011

To re-inject new content for a file, you really want to get a new key for the file. Otherwise, other repos that have the old file will never get the new content. So:

git rm file
mv ~/newcontent file
git annex add file
Comment by joey Sat May 14 12:28:36 2011

a0826293 fixed the last problem. There is a coreutils package available in macports; if it is installed you get the GNU equivalents, but they are prefixed with a g (e.g. gchmod instead of chmod). I guess not everyone will have these installed, or prefer them, on OSX.

Some more tests fail now...

Testing 1:blackbox:3:git-annex unannex:1:with content
### Failure in: 1:blackbox:3:git-annex unannex:1:with content
foo is not a symlink
Testing 1:blackbox:4:git-annex drop:0:no remotes
### Failure in: 1:blackbox:4:git-annex drop:0:no remotes
drop wrongly succeeded with no known copy of file
Testing 1:blackbox:4:git-annex drop:1:with remote
Testing 1:blackbox:4:git-annex drop:2:untrusted remote
Testing 1:blackbox:5:git-annex get
Testing 1:blackbox:6:git-annex move
Testing 1:blackbox:7:git-annex copy
### Failure in: 1:blackbox:7:git-annex copy
move --to of file already there failed
Testing 1:blackbox:8:git-annex unlock/lock
### Error in:   1:blackbox:8:git-annex unlock/lock
forkProcess: resource exhausted (Resource temporarily unavailable)
Testing 1:blackbox:9:git-annex edit/commit:0
### Error in:   1:blackbox:9:git-annex edit/commit:0
forkProcess: resource exhausted (Resource temporarily unavailable)
Testing 1:blackbox:9:git-annex edit/commit:1
### Error in:   1:blackbox:9:git-annex edit/commit:1
forkProcess: resource exhausted (Resource temporarily unavailable)
Testing 1:blackbox:10:git-annex fix
### Error in:   1:blackbox:10:git-annex fix
forkProcess: resource exhausted (Resource temporarily unavailable)
Testing 1:blackbox:11:git-annex trust/untrust/semitrust
### Error in:   1:blackbox:11:git-annex trust/untrust/semitrust
forkProcess: resource exhausted (Resource temporarily unavailable)
Testing 1:blackbox:12:git-annex fsck:0
### Error in:   1:blackbox:12:git-annex fsck:0
forkProcess: resource exhausted (Resource temporarily unavailable)
Testing 1:blackbox:12:git-annex fsck:1
### Error in:   1:blackbox:12:git-annex fsck:1
forkProcess: resource exhausted (Resource temporarily unavailable)
Testing 1:blackbox:12:git-annex fsck:2
### Error in:   1:blackbox:12:git-annex fsck:2
forkProcess: resource exhausted (Resource temporarily unavailable)
Testing 1:blackbox:13:git-annex migrate:0
### Error in:   1:blackbox:13:git-annex migrate:0
forkProcess: resource exhausted (Resource temporarily unavailable)
Testing 1:blackbox:13:git-annex migrate:1
### Error in:   1:blackbox:13:git-annex migrate:1
forkProcess: resource exhausted (Resource temporarily unavailable)
Testing 1:blackbox:14:git-annex unused/dropunused
### Error in:   1:blackbox:14:git-annex unused/dropunused
forkProcess: resource exhausted (Resource temporarily unavailable)
Cases: 30  Tried: 30  Errors: 11  Failures: 3
test: failed
make: *** [test] Error 1

On a side note, I think I found another bug in the testing. I had tested in a virtual machine running archlinux (a very recently updated version). Please see the report here: "tests fail when there is no global .gitconfig for the user".

Comment by Jimmy Wed Feb 9 05:12:52 2011

Ah, great, thanks very much for the quick fix!

Yes, when I mentioned three defunct git processes, there were three processes shown as "git [defunct]", plus the three git processes I listed, plus two "git-annex" processes. Upon cancel/resume, there were no defunct git processes when I checked, but by the time I found the bug report on the forum and commented I'd already successfully upgraded my annex (by repeatedly attaching strace) and couldn't really easily get at either additional 'ps' info or a fuller strace than what I posted (that was just the log from one of the attach/detach cycles), so it's a relief you managed to pinpoint the problem.

Comment by pavel Wed Jul 6 04:14:26 2011
comment on the output of 'git-annex version' (from my last comment): now I get the right version 3.20110707. But I checked in my console that the three commands "git checkout 3.20110707", "make" and "./git-annex version" gave me before 3.20110702, I don't know why...
Comment by Rafaël Thu Jul 7 20:45:30 2011

Or, even better, wouldn't it make sense to have SHA backends always default to --fast and, only when a snag is hit, fall back to non-fast mode for that file?

Though if we continue here, we should probably move this to its own page.

Comment by Richard Sun May 15 16:50:26 2011

Outside the test suite, git-annex's actual use of cp puts fairly low demands on it. It tries to use cp -a or cp -p if available just to preserve whatever attributes it can preserve, but the worst case is that you have a symlink pointing to a file that doesn't have the original timestamp or whatever. And there's little expectation git preserves that stuff anyway.
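
The "use cp -a or cp -p if available" idea can be sketched in shell as a chain of fallbacks. This is only an illustration of the approach, not git-annex's own code; copy_preserving is a hypothetical helper name.

```shell
# Copy src to dest, preserving as many attributes as the local cp allows.
# copy_preserving is a hypothetical helper, not git-annex's implementation.
copy_preserving() {
    cp -a "$1" "$2" 2>/dev/null && return 0   # most attributes, if supported
    cp -p "$1" "$2" 2>/dev/null && return 0   # mode, ownership, timestamps
    cp "$1" "$2"                              # last resort: content only
}

# Demo on a throwaway file.
demo_dir=$(mktemp -d)
echo "content" > "$demo_dir/src"
copy_preserving "$demo_dir/src" "$demo_dir/dest"
if cmp -s "$demo_dir/src" "$demo_dir/dest"; then
    copy_result=identical
else
    copy_result=different
fi
echo "$copy_result"
rm -rf "$demo_dir"
```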

I will probably try to make the test suite entirely use git clone rather than cp.

Comment by joey Sun Feb 13 13:54:09 2011

Joey, sorry, I got it wrong. I thought upgrading git didn't help and you adjusted things in git-annex instead.

Anyway, can I get around upgrading on all hosts by reformatting the drive to case-sensitive HFS+? Or will I have to upgrade git (currently version 1.7.2.5) eventually anyway?

Comment by gernot Sun Apr 3 15:46:16 2011

On second thought, and after some messing (trying most of the options and combinations of options on OSX for).... I tried replacing cp with GNU cp from coreutils on my OSX install, and all the tests passed. sigh. cp -a is preserving some permissions and attributes but not all; it's not behaving in the same way as GNU cp does... The closest thing that I have found on OSX that behaves in the same way as GNU "cp -pr" is "ditto".

Just doing a "ditto SOURCE DEST" in the tests passes everything. I'm not sure if it's a good idea to use this even though it works. Though this is just the tests, does it affect CopyFile.hs where "cp" is called?

Comment by Jimmy Sun Feb 13 11:55:47 2011

It seems the objects are in the remote after all, but the remote is unaware of this fact. No idea where/why the remote lost that info, but.. Anyway, with the SHA backends, wouldn't it make sense to simply return "OK" and update the annex logs accordingly, no?

Local:

% ls -l foo
lrwxrwxrwx 1 richih richih 312 Apr  3 01:18 foo -> .git/annex/objects/gG/VW/SHA512-s80781--cef3966a19c7435acceb8fbfbff1feebe6decab7c81a0c197f00932cf9ef0eac330784cc3f0d211bd4acf56a6d16daaebe9b598aa4dfd5bfec73f4e6df3f0491/SHA512-s80781--cef3966a19c7435acceb8fbfbff1feebe6decab7c81a0c197f00932cf9ef0eac330784cc3f0d211bd4acf56a6d16daaebe9b598aa4dfd5bfec73f4e6df3f0491
% 

Remote:

% git-annex-shell recvkey <remote> SHA512-s80781--cef3966a19c7435acceb8fbfbff1feebe6decab7c81a0c197f00932cf9ef0eac330784cc3f0d211bd4acf56a6d16daaebe9b598aa4dfd5bfec73f4e6df3f0491
git-annex-shell: key is already present in annex
% strace git-annex-shell recvkey /base/git-annex/fun SHA512-s80781--cef3966a19c7435acceb8fbfbff1feebe6decab7c81a0c197f00932cf9ef0eac330784cc3f0d211bd4acf56a6d16daaebe9b598aa4dfd5bfec73f4e6df3f0491 2>&1 | grep SHA512-s80781--cef3966a19c7435acceb8fbfbff1feebe6decab7c81a0c197f00932cf9ef0eac330784cc3f0d211bd4acf56a6d16daaebe9b598aa4dfd5bfec73f4e6df3f0491
stat64("/base/git-annex/fun/annex/objects/gG/VW/SHA512-s80781--cef3966a19c7435acceb8fbfbff1feebe6decab7c81a0c197f00932cf9ef0eac330784cc3f0d211bd4acf56a6d16daaebe9b598aa4dfd5bfec73f4e6df3f0491/SHA512-s80781--cef3966a19c7435acceb8fbfbff1feebe6decab7c81a0c197f00932cf9ef0eac330784cc3f0d211bd4acf56a6d16daaebe9b598aa4dfd5bfec73f4e6df3f0491", {st_mode=S_IFREG|0444, st_size=80781, ...}) = 0
% ls -l /base/git-annex/fun/annex/objects/gG/VW/SHA512-s80781--cef3966a19c7435acceb8fbfbff1feebe6decab7c81a0c197f00932cf9ef0eac330784cc3f0d211bd4acf56a6d16daaebe9b598aa4dfd5bfec73f4e6df3f0491/SHA512-s80781--cef3966a19c7435acceb8fbfbff1feebe6decab7c81a0c197f00932cf9ef0eac330784cc3f0d211bd4acf56a6d16daaebe9b598aa4dfd5bfec73f4e6df3f0491
-r--r--r-- 1 richih richih 80781 2011-04-01 12:44 /base/git-annex/fun/annex/objects/gG/VW/SHA512-s80781--cef3966a19c7435acceb8fbfbff1feebe6decab7c81a0c197f00932cf9ef0eac330784cc3f0d211bd4acf56a6d16daaebe9b598aa4dfd5bfec73f4e6df3f0491/SHA512-s80781--cef3966a19c7435acceb8fbfbff1feebe6decab7c81a0c197f00932cf9ef0eac330784cc3f0d211bd4acf56a6d16daaebe9b598aa4dfd5bfec73f4e6df3f0491
% 
Comment by Richard Sun May 15 14:53:26 2011

So, there is evidence here of a circumstance caused by the other bug, as I suspected.

I don't think that manual git commit -a caused the problem. I suspect it was a subsequent git add that caused git to follow the wrong case paths and add the files in the wrong place. Ie, when you run "git add .git-annex", it recurses into .git-annex/Gm/, and adds files using that case, that were previously added from .git-annex/GM/.
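
One rough way to look for this in an affected repository is to list tracked paths that differ only by case. The sketch below assumes standard tr, sort, and uniq; case_collisions is a hypothetical helper, and the demo paths are made up.

```shell
# Read paths on stdin and print, lowercased, each path that occurs more than
# once when case is ignored. case_collisions is a hypothetical helper name.
case_collisions() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

# In a real repository one might run:
#   git ls-files .git-annex | case_collisions
# Demo with made-up paths:
demo_collisions=$(printf '%s\n' \
    '.git-annex/GM/example.log' \
    '.git-annex/Gm/example.log' \
    '.git-annex/1G/other.log' | case_collisions)
echo "$demo_collisions"
```

Any output means the same file is tracked under two spellings of a directory name.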

For completeness, can you verify this repo's core.ignorecase setting?


I hate that you are stuck using loop filesystems to work around this bug. If my guess is correct, you don't need to, as long as you avoid manually running "git add .git-annex". I take this bug seriously. While I'm currently very involved in adding Amazon S3 support to git-annex (which will take days more of solid work), I do plan to make a loop filesystem of my own, probably vfat, so I can try and reproduce this on a case-insensitive filesystem. If you could confirm my above hypothesis, that would speed things up for me.

It's possible I will have to tweak the hash directories. Hopefully if so, I will only tweak them for new keys; if I had to do a v3 backend just to fix this stupid thing, I'd be sad -- upgrading all my offline disks from v1 to v2 took me many days.

Comment by joey Mon Mar 28 11:25:18 2011

I forgot to mention that the statfs64 stuff in OSX seems to be deprecated, see http://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man2/statfs64.2.html

On a slightly different note, is anonymous pushing to the "wiki" over git allowed? I'd prefer to be able to edit stuff inline for updating some of my own comments if I can :P

Comment by Jimmy Wed Mar 23 12:23:56 2011

Try the changes I've pushed to use statfs64 on apple.

There is actually a standardized statvfs that I'd rather use, but after the last time that I tried going with the POSIX option first only to find it was not broadly implemented, I was happy to find some already existing code that worked for some OSs.

(While ikiwiki supports anonymous git push, it's a feature we have not rolled out on Branchable.com yet, and anyway, ikiwiki disallows editing existing comments that way. I would, however, be happy to git pull changes from somewhere.)

Comment by joey Wed Mar 23 12:57:56 2011
Just to make sure: How do I get $key? What I did was look at the path in the object store of the local repo and see if that exact same path & file existed in the remote.
Comment by Richard Sun May 15 05:16:49 2011

That's odd, I have the md5sha1sum package installed and it still fails with pretty much the same error

Testing 1:blackbox:0:git-annex init
Cases: 30  Tried: 7  Errors: 0  Failures: 0chmod: -R: No such file or directory
### Error in:   1:blackbox:0:git-annex init
.t/repo/.git/annex/objects/SHA1:ee80d2cec57a3810db83b80e1b320df3a3721ffa/SHA1:ee80d2cec57a3810db83b80e1b320df3a3721ffa: removeLink: permission denied (Permission denied)
Testing 1:blackbox:1:git-annex add:0
### Error in:   1:blackbox:1:git-annex add:0
foo: openFile: permission denied (Permission denied)

< and so on >

the configure script finds sha1sum, builds and starts to run.

Comment by Jimmy Tue Feb 8 20:45:31 2011
Fixed that, and removed the impossible cast so it can be built with #if 1
Comment by joey Sun Mar 20 18:06:25 2011
And what about emitting a warning, as git does, that some files were not annex-added (when not using --force)?
Comment by Rafaël Sun Jul 3 07:56:45 2011
For example, if the file is owned by root, I guess git-annex fails when it tries to remove write permissions (I retested with the latest version as of today, whose "version" subcommand still outputs 3.20110702). By the way, it would be nice to have a log file created containing the list of all failures, to avoid having to manually scan all the output of a long git-annex operation.
Comment by Rafaël Thu Jul 7 20:21:31 2011

I have pushed out a preliminary fix. The old mixed-case directories will be left where they are, and still read from by git-annex. New data will be written to new, lower-case directories. I think that once git stops seeing changes being made to mixed-case, colliding directories, the bugs you ran into won't manifest any more.

You will need to find a way to get your git repository out of the state where it complains about uncommitted files (and won't let you commit them). I have not found a reliable way to do that; git reset --hard worked in one case but not in another. May need to clone a fresh git repository.

Let me know how it works out.

Comment by joey Sat Apr 2 13:53:58 2011

What an evil little bug. In retrospect, this probably bit my own test upgrades, but I ran git annex fsck everywhere and so avoided the location log breakage.

I've fixed the bug, which also involved files with other punctuation in their names [&:%] when using the WORM backend.

The only way I have to recover repos that have already been upgraded is to run git annex fsck --fast in each clone of such a repo, which will let it rebuild the location log information. I think that is the best way to recover; ie I can't think of a way to recover that doesn't need to do everything fsck does anyway.

Comment by joey Thu Jul 7 17:04:23 2011

So, it appears that you're using git annex copy --fast. As documented that assumes the location log is correct. So it avoids directly checking if the bare repo contains the file, and tries to upload it, and the bare repo is all like "but I've already got this file!". The only way to improve that behavior might be to let rsync go ahead and retransfer the file, which, with recovery, should require sending little data etc. But I can't say I like the idea much, as the repo already has the content, so unlocking it and letting rsync mess with it is an unnecessary risk. I think it's ok for --fast to blow up if its assumptions turn out to be wrong.

If you use git annex copy without --fast in this situation, it will do the right thing.

Comment by joey Sun May 15 15:40:47 2011

Version: 0.20110503

My local non-bare repo is copying to a remote bare repo.

I have been recovering in a non-bare repo.

If there is anything I can send you to help... If I removed said files and went through http://git-annex.branchable.com/bugs/No_easy_way_to_re-inject_a_file_into_an_annex/ -- would that help?

Comment by Richard Sat May 14 15:03:43 2011
Indeed, uninit needed to be improved. I've done so. Also, unannex --fast can be used to make hard links to content left in the annex.
Comment by Joey Mon Jul 4 16:25:38 2011

In the meantime, would it be acceptable to split the pre-commit hook into two discrete parts?

This would allow deferring "git annex fix" until post-commit (if preferred) while still keeping the safety net for unlocked files.

Comment by praet Mon Mar 21 15:58:34 2011

Alternatively, you can just load it up in ghci and see if it reports numbers that make sense:

joey@gnu:~/src/git-annex>make StatFS.hs
hsc2hs StatFS.hsc
perl -i -pe 's/^{-# INCLUDE.*//' StatFS.hs
joey@gnu:~/src/git-annex>ghci StatFS.hs
GHCi, version 6.12.1: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
[1 of 1] Compiling StatFS           ( StatFS.hs, interpreted )
Ok, modules loaded: StatFS.
*StatFS> s <- getFileSystemStats "."
Loading package bytestring-0.9.1.5 ... linking ... done.
*StatFS> s
Just (FileSystemStats {fsStatBlockSize = 4096, fsStatBlockCount = 7427989, fsStatByteCount = 30425042944, fsStatBytesFree = 2528489472, fsStatBytesAvailable = 2219384832, fsStatBytesUsed = 27896553472})
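
A quick sanity check on those numbers, assuming the byte count is block size times block count and bytes used is byte count minus bytes free (the output above is consistent with both). It needs a shell with 64-bit arithmetic.

```shell
# Values copied from the ghci output above.
block_size=4096
block_count=7427989
bytes_free=2528489472

# Assumed relations between the fields; compare with the reported values.
byte_count=$((block_size * block_count))
bytes_used=$((byte_count - bytes_free))
echo "byte count: $byte_count"
echo "bytes used: $bytes_used"
```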
Comment by joey Wed Mar 23 11:13:33 2011

Hi,

(I'm new to git and git annex, so please forgive any mistakes I make...)

My repo is messed up right now. The fact that I copied the repo with rsync -a back and forth from a case insensitive filesystem to a case sensitive one, probably didn't help.

I believe the annexed files in .git/annex/objects/ are still using a mixed case directory hashing scheme. That's the problem I'm having. The symlinks point to the wrong case and are now broken. I don't think the latest versions of git-annex changed that (it only changed the hashing under .git-annex, right?).

Even if I clean up my repo, I think I'm still going to have a problem because I have one repo on an OS X case insensitive filesystem and my other repos on case sensitive Linux filesystems. Potentially the directory name under .git/annex/objects will have a different case. Then the symlink might have a different case than my Linux FS. Does git-annex track changes in git by the contents of the symlink? In which case the case difference would show up as a change even though there is no change?

Is it possible to change the directory hashing scheme under .git/annex/objects to use lowercase names?

Comment by ssqq Thu Jun 2 16:31:55 2011

Seems like you probably have files in git with nearly as long filenames as the key files. Course, you can rename those yourself.

This couldn't be changed directly in WORM without some ugly transition, but it would be possible to implement it as a WORM100 or so. OTOH, if you're going to git annex migrate, you might as well use SHA1.

Comment by joey Fri Apr 8 13:14:25 2011

Hey @fmarier. Well, this bug report is closed because you can already get rid of the symlinks. Just put a bare git repo on your fat filesystem, and use git-annex copy --to/--from there.

Now, that puts all the files that are on the device in .git/annex/objects/xx/yy/blah.mp3 -- how well rockbox would support that I don't know. And if it tries to modify or delete those files, git annex also can't help you manage those changes.

Another recent option is the directory special remote type, which again uses "xx/yy/blah.mp3" and can't track changes made to the files. This could perhaps be extended in the direction you suggest, although trying to fit this into the special remote infrastructure might not be a good fit really.

The most likely way this has to get dealt with is really by using smudge filters, which would eliminate the symlinks and allow copying a non-bare git repo onto vfat.

Comment by joey Mon Apr 4 14:20:45 2011

Yeap, that did the trick. I just tested a few separate OSX 10.6.6 systems and the tests are better behaved now, only 3 failures now.

So the tests behave better (at least we don't get resource fork errors any more)

  • after the commit c319a3 without modifying the system limits (of 266 procs per user)
  • without the commit c319a3 and when I increase the system process limits to as much as OSX allows

On all the systems I tested on, I'm down to 3 failures now.

### Failure in: 1:blackbox:3:git-annex unannex:1:with content
foo is not a symlink
### Failure in: 1:blackbox:4:git-annex drop:0:no remotes
drop wrongly succeeded with no known copy of file
Cases: 30  Tried: 20  Errors: 0  Failures: 2add foo ok
ok
Cases: 30  Tried: 24  Errors: 0  Failures: 2  Only 1 of 2 trustworthy copies of foo exist.
  Back it up with git-annex copy.
  Only 1 of 2 trustworthy copies of sha1foo exist.
  Back it up with git-annex copy.
  Bad file size; moved to /Users/jtang/develop/git-annex/.t/tmprepo/.git/annex/bad/WORM:1297594011:20:foo
  Bad file content; moved to /Users/jtang/develop/git-annex/.t/tmprepo/.git/annex/bad/SHA1:ee80d2cec57a3810db83b80e1b320df3a3721ffa
### Failure in: 1:blackbox:12:git-annex fsck:1
fsck failed to fail with content only available in untrusted (current) repository
Cases: 30  Tried: 26  Errors: 0  Failures: 3  Only 1 of 2 trustworthy copies of foo exist.
  Back it up with git-annex copy.
  The following untrusted locations may also have copies: 
    90d63906-375e-11e0-8867-abb8a6368269  -- test repo
  Only 1 of 2 trustworthy copies of sha1foo exist.
  Back it up with git-annex copy.
  The following untrusted locations may also have copies: 
    90d63906-375e-11e0-8867-abb8a6368269  -- test repo
Cases: 30  Tried: 30  Errors: 0  Failures: 3

It's the same set of failures across all the OSX systems that I have tested on. Now I just need to figure out why there are still these three failures.

Comment by Jimmy Sun Feb 13 06:46:54 2011
Indeed, I've made it even more robust now, handling the case where the file has weird permissions too, and undoing the failed add so the file is always back at the start state. Had to add a dependency on another haskell module to allow this, so it took some time to figure out how to do it..
Comment by joey Thu Jul 7 21:32:30 2011

It exists locally; whereis tells me it exists locally, and locally only.

The object is not in the bare repo.

The file might have gone missing before I upgraded my annex backend version to 2. Could this be a factor?

Comment by Richard Sat May 14 19:13:15 2011

Hm, if path's ok, guess there's no way around git-bisect indeed. Wonder if there's some kind of ccache for haskell...

OS is linux, amd64 on "host1" and i386 on "host2" where git-annex-shell is crashing. I'll try to come up with a commit, thanks for clarifications.

Comment by fraggod [pip.verisignlabs.com.pip.verisignlabs.com] Sun Apr 3 00:45:49 2011
The chmod errors are because your chmod does not understand the -R argument. Only the test suite uses chmod -R. I've fixed it to modify modes manually.
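
For anyone hitting the same thing in their own scripts, a shell analogue of changing modes without chmod -R might look like this. It is a sketch only, not the test suite's actual fix; make_writable is a hypothetical helper, and find with "-exec ... {} +" is assumed to be available.

```shell
# Add owner write permission to a whole tree without relying on chmod -R.
# Symlinks are skipped so that dangling links do not cause errors.
# make_writable is a hypothetical helper name.
make_writable() {
    find "$1" ! -type l -exec chmod u+w {} +
}

# Demo: a read-only tree, as left behind by an annex, becomes removable.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/objects"
echo data > "$demo_dir/objects/key"
chmod 444 "$demo_dir/objects/key"
chmod 555 "$demo_dir/objects"
make_writable "$demo_dir"
rm -rf "$demo_dir" && cleanup_status=ok
echo "$cleanup_status"
```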
Comment by joey Wed Feb 9 00:10:27 2011

Actually I may have just been stupid and should have read the man page on statfs...

jtang@x00:~/develop/git-annex $ git diff
diff --git a/StatFS.hsc b/StatFS.hsc
index 8b453dc..e10b2dd 100644
--- a/StatFS.hsc
+++ b/StatFS.hsc
@@ -53,7 +53,7 @@ import Foreign.C.String
 import Data.ByteString (useAsCString)
 import Data.ByteString.Char8 (pack)

-#if defined (__FreeBSD__)
+#if defined (__FreeBSD__) || defined (__APPLE__)
 # include 
 # include 
 #else
@@ -84,8 +84,8 @@ data CStatfs
 #ifdef UNKNOWN
 #warning free space checking code not available for this OS
 #else
-#if defined(__FreeBSD__)
-foreign import ccall unsafe "sys/mount.h statfs"
+#if defined(__FreeBSD__) || defined (__APPLE__)
+foreign import ccall unsafe "sys/mount.h statfs64"
 #else
 foreign import ccall unsafe "sys/vfs.h statfs64"
 #endif

yields this...

jtang@x00:~/develop/git-annex $ ghci StatFS.hs                                                                                                                                    
GHCi, version 6.12.3: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package ffi-1.0 ... linking ... done.
[1 of 1] Compiling StatFS           ( StatFS.hs, interpreted )
Ok, modules loaded: StatFS.
*StatFS> s <- getFileSystemStats "."
Loading package bytestring-0.9.1.7 ... linking ... done.
*StatFS> s
Just (FileSystemStats {fsStatBlockSize = 4096, fsStatBlockCount = 244106668, fsStatByteCount = 999860912128, fsStatBytesFree = 423097798656, fsStatBytesAvailable = 422835654656, fsStatBytesUsed = 576763113472})
*StatFS> 

We could just stick in another "if defined (__APPLE__)" instead of what I previously had, and it looks like it will do the right thing on OSX.

Comment by Jimmy Wed Mar 23 12:14:22 2011

Repeated bisect with -j1, just to be sure it's not a random error, and it gave me 828a84ba3341d4b7a84292d8b9002a8095dd2382 again. Guess I'll look through the changes there a bit later and try to revert these until it works.

Not sure if it's repeatable by anyone but me (and hence worth fixing), but here's a bit more info about the system:

Exherbo linux
Linux sacrilege 2.6.38.2-fg.roam #4 SMP PREEMPT Mon Mar 28 21:08:47 YEKST 2011 i686 GNU/Linux

dev-lang/ghc-7.0.2:7.0.2::installed
dev-haskell/HUnit-1.2.2.3:1.2.2.3::installed
dev-haskell/MissingH-1.1.0.3:1.1.0.3::installed
dev-haskell/QuickCheck-2.4.0.1:2.4.0.1::installed
dev-haskell/array-0.3.0.2:0.3.0.2::installed
dev-haskell/bytestring-0.9.1.7:0.9.1.7::installed
dev-haskell/containers-0.4.0.0:0.4.0.0::installed
dev-haskell/extensible-exceptions-0.1.1.2:0.1.1.2::installed
dev-haskell/filepath-1.2.0.0:1.2.0.0::installed
dev-haskell/hslogger-1.1.3:0::installed
dev-haskell/mtl-2.0.1.0:2.0.1.0::installed
dev-haskell/network-2.3.0.1:2.3.0.1::installed
dev-haskell/old-locale-1.0.0.2:1.0.0.2::installed
dev-haskell/parsec-3.1.0:3.1.0::installed
dev-haskell/pcre-light-0.4:0::installed
dev-haskell/regex-base-0.93.2:0.93.2::installed
dev-haskell/regex-compat-0.93.1:0.93.1::installed
dev-haskell/regex-posix-0.94.4:0.94.4::installed
dev-haskell/syb-0.3:0.3::installed
dev-haskell/transformers-0.2.2.0:0.2.2.0::installed
dev-haskell/utf8-string-0.3.6:0.3.6::installed

(some stuff listed here as ::installed, but contains no files, since these packages detect whether ghc-7.0.2 already comes with the same/newer package version)

Comment by fraggod [pip.verisignlabs.com.pip.verisignlabs.com] Sun Apr 3 02:57:02 2011
Given that the softlinks contain all needed information (if the object exists, locally), an emergency way to get files "out" of git-annex would be nice. I am aware that one can script it, but a canonical way is always better, especially when things go south.
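
Such a script might look roughly like the sketch below. It is an illustration only, not a git-annex command: annex_extract is a hypothetical helper, it assumes readlink is available and that file names contain no newlines, and it is best tried on a copy of the tree first.

```shell
# Replace every symlink under a tree whose target lies in .git/annex/objects
# with a plain, writable copy of the content it points to.
# Links whose content is not present locally are left untouched.
# annex_extract is a hypothetical helper name.
annex_extract() {
    find "$1" -type l ! -path '*/.git/*' | while IFS= read -r link; do
        case $(readlink "$link") in
            *.git/annex/objects/*) ;;
            *) continue ;;
        esac
        [ -e "$link" ] || continue            # dangling: no local content
        tmp="$link.extract.$$"
        cp "$link" "$tmp" && chmod u+w "$tmp" && mv -f "$tmp" "$link"
    done
}

# Demo on a throwaway tree that mimics an annex layout.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/.git/annex/objects/xx/yy/KEY"
echo "hello" > "$demo_dir/.git/annex/objects/xx/yy/KEY/KEY"
ln -s ".git/annex/objects/xx/yy/KEY/KEY" "$demo_dir/foo"
annex_extract "$demo_dir"
if [ ! -L "$demo_dir/foo" ] && [ "$(cat "$demo_dir/foo")" = "hello" ]; then
    extract_result=extracted
else
    extract_result=failed
fi
echo "$extract_result"
rm -rf "$demo_dir"
```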
Comment by Richard Sun Apr 3 04:55:18 2011
Sounds like you probably didn't commit after the fsck, or didn't push so the other repository did not know the first had the content again -- but I'm not 100% sure.
Comment by joey Wed May 11 21:01:34 2011
joey@kitenet.net (hope I can make sense of dtruss output)
Comment by joey Wed Feb 9 15:47:30 2011
@gernot step 0 is to upgrade git-annex to current git, on all systems where you use it, in case that wasn't clear.
Comment by joey Sun Apr 3 12:53:51 2011

I meant to say it wasn't reliable when I was following the instructions for "Comment 12". I did find just doing a "git annex copy -t externalusb ." then a "git annex drop ." from the root of my cloned and "none trusted" annexed repos to be more reliable; it just means I temporarily need a load of space to get myself out of my earlier mess.

On testing this bug fix, I found a minor behavioural issue; see "git annex copy -f REMOTE . doesn't work as expected".

Comment by Jimmy Sun Apr 3 04:24:17 2011

I also failed to mention that, in the case where I have stray log files after what happened in comment 2, I get this left over after a commit when git is confused...

jtang@x00:~/sources $ git status
# On branch master
# Your branch is ahead of 'origin/master' by 1 commit.
#
# Changes not staged for commit:
#   (use "git add ..." to update what will be committed)
#   (use "git checkout -- ..." to discard changes in working directory)
#
#   modified:   .git-annex/1G/X3/WORM-s309910751-m1301311322--l_fcompxe_ia32_2011.2.137.tgz.log
#   modified:   .git-annex/3W/Xf/WORM-s805764902-m1301312756--l_cproc_p_11.1.075_intel64.log
#   modified:   .git-annex/9Q/Wz/WORM-s1234430253-m1301311891--l_ccompxe_2011.2.137.log
#   modified:   .git-annex/FQ/4z/WORM-s318168323-m1301310848--l_cprof_p_11.1.075_ia64.log
#   modified:   .git-annex/FV/0P/WORM-s710135470-m1301311835--l_ccompxe_intel64_2011.2.137.log
#   modified:   .git-annex/Jk/zK/WORM-s374617670-m1301312705--l_ipp_7.0.2.137_intel64.log
#   modified:   .git-annex/Jx/qM/WORM-s599386592-m1301310731--l_fcompxe_2011.2.137.tgz.log
#   modified:   .git-annex/KX/w1/WORM-s35976002-m1301312193--l_tbb_3.0.6.174.log
#   modified:   .git-annex/VK/kv/WORM-s584342291-m1301312669--l_cproc_p_11.1.075_ia64.log
#   modified:   .git-annex/Vw/jK/WORM-s15795178-m1301310913--w_flm_p_1.0.011_intel64.zip.log
#   modified:   .git-annex/Zq/7X/WORM-s343075585-m1301312233--l_ipp_7.0.2.137_ia32.log
#   modified:   .git-annex/vW/v1/WORM-s736986678-m1301312794--l_cproc_p_11.1.075_ia32.log
#
no changes added to commit (use "git add" and/or "git commit -a")

Up until now I have just been updating the status of the staged files by hand and committing it on my mac x00; this probably isn't helping. I'd rather not lose the tracking information.

Comment by Jimmy Mon Mar 28 11:51:11 2011

Currently fsck silently ignores --to/--from. It should at least complain if it is not supported.

Comment by npouillard Sat Jun 25 12:20:44 2011

Thanks to your feedback, I got it going.

Maybe those two should be added to the 'OSX how-to' in the forum

[realizes pcre-light is needed but pcre not installed on my mac]
sudo port install pcre
sudo cabal install pcre-light

[tests are failing, need haskell's quickcheck]
sudo cabal install quickcheck

Comment by Antoine Sun Feb 6 02:02:57 2011

I think I know how I got myself into this mess... I was on my mac workstation and I had just pulled in a change set from another repo on a linux workstation after I had made a bunch of moves. Here's a bit of a log of what happened...

jtang@x00:~/sources $ git pull cports-devel master
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
Warning: No xauth data; using fake authentication data for X11 forwarding.
remote: Counting objects: 4195, done.
remote: Compressing objects: 100% (1135/1135), done.
remote: Total 2582 (delta 866), reused 2576 (delta 860)
Receiving objects: 100% (2582/2582), 229.42 KiB | 111 KiB/s, done.
Resolving deltas: 100% (866/866), completed with 9 local objects.
From cports-devel:/home/people/jtang/sources
 * branch            master     -> FETCH_HEAD
Updating 319df99..ab0a98c
error: Your local changes to the following files would be overwritten by merge:
    .git-annex/09/5X/WORM-s361516678-m1301310614--l_fcompxe_intel64_2011.2.137.tgz.log
    .git-annex/43/2g/WORM-s19509673-m1301310496--l_fcompxe_2011.2.137_redist.tgz.log
    .git-annex/4J/qF/WORM-s18891115-m1301310934--w_flm_p_1.0.011_ia64.zip.log
    .git-annex/87/w1/WORM-s12212473-m1301310909--w_flm_p_1.0.011_ia32.zip.log
    .git-annex/99/Jq/WORM-s194345957-m1301310926--l_mkl_10.3.2.137_ia32.log
    .git-annex/99/kf/WORM-s9784531-m1301311680--l_ccompxe_2011.2.137_redist.log
    .git-annex/FF/f3/WORM-s93033394-m1301311706--l_gen_ipp_7.0.2.137.log
    .git-annex/MF/xZ/WORM-s515140733-m1301310936--l_cprof_p_11.1.075.log
    .git-annex/XW/X8/WORM-s355559731-m1301310797--l_mkl_10.3.2.137.log
    .git-annex/fJ/mZ/WORM-s1372886477-m1301313368--l_cproc_p_11.1.075.log
    .git-annex/j7/Q9/WORM-s44423202-m1301310622--l_cprof_p_11.1.075_redist.log
    .git-annex/k4/K7/WORM-s239539070-m1301310760--l_mkl_10.3.2.137_intel64.log
    .git-annex/kz/01/WORM-s279573314-m1301310783--l_cprof_p_11.1.075_ia32.log
    .git-annex/p6/Kq/WORM-s31199343-m1301311829--l_cproc_p_11.1.075_redist.log
    .git-annex/pz/J5/WORM-s626995277-m1301312301--l_ccompxe_ia32_2011.2.137.log
    .git-annex/v3/kX/WORM-s339693045-m1301310851--l_cprof_p_11.1.075_intel64.log
Please, commit your changes or stash them before you can merge.
error: Your local changes to the following files would be overwritten by merge:
    .git-annex/12/3W/WORM-s3058814-m1276699694--Botan-1.8.9.tgz.log
    .git-annex/1G/qV/WORM-s9122-m1251558854--Array-Compare-2.01.tar.gz.log
    .git-annex/3W/W5/WORM-s231523-m1270740744--DBD-Pg-2.17.1.tar.gz.log
    .git-annex/3x/PX/WORM-s380310-m1293025187--HTSeq-0.4.7.tar.gz.log
    .git-annex/45/gk/WORM-s67337-m1248732018--ExtUtils-Install-1.54.tar.gz.log
    .git-annex/4J/7Q/WORM-s8608-m1224694862--Algorithm-Munkres-0.08.tar.gz.log
    .git-annex/4g/XQ/WORM-s89208-m1278682033--HTML-Parser-3.66.tar.gz.log
    .git-annex/54/jw/WORM-s300163-m1226422051--AcePerl-1.92.tar.gz.log
    .git-annex/63/kj/WORM-s1213460-m1262942058--DBD-SQLite-1.29.tar.gz.log
    .git-annex/6Z/42/WORM-s4074-m943766010--File-Sync-0.09.tar.gz.log
    .git-annex/8F/M5/WORM-s6989-m1263161127--Digest-HMAC-1.02.tar.gz.log
    .git-annex/G2/FK/WORM-s3309-m1163872981--Bundle-BioPerl-2.1.8.tar.gz.log
    .git-annex/Gk/XF/WORM-s23572243-m1279546902--EMBOSS-6.3.1.tar.gz.log
    .git-annex/Jk/X6/WORM-s566429-m1279309002--DBI-1.612.tar.gz.log
    .git-annex/K6/fV/WORM-s1561451-m1240055295--Convert-Binary-C-0.74.tar.gz.log
    .git-annex/KM/4q/WORM-s146959-m1268515086--Graph-0.94.tar.gz.log
    .git-annex/MF/m2/WORM-s425766-m1212514609--Data-Stag-0.11.tar.gz.log
    .git-annex/QJ/P6/WORM-s1045868-m1282215033--9base-6.tar.gz.log
    .git-annex/Qm/WG/WORM-s39078-m1278163547--Digest-SHA1-2.13.tar.gz.log
    .git-annex/Wq/Fj/WORM-s45680640-m1297862101--BclConverter-1.7.1.tar.log
    .git-annex/Wq/Wm/WORM-s263536640-m1295025537--CASAVA_v1.7.0.tar.log
    .git-annex/XW/qm/WORM-s36609-m1276050470--Bio-ASN1-EntrezGene-1.10-withoutworldwriteables.tar.gz.log
    .git-annex/f7/g0/WORM-s40872-m1278273227--ExtUtils-ParseXS-2.2206.tar.gz.log
    .git-annex/j3/JF/WORM-s11753-m1232427595--Clone-0.31.tar.gz.log
    .git-annex/kX/9g/WORM-s84690-m1229117599--GraphViz-2.04.tar.gz.log
    .git-annex/km/z5/WORM-s44634-m1275505134--Authen-SASL-2.15.tar.gz.log
    .git-annex/kw/J3/WORM-s132396-m1278780649--DBD-mysql-4.016.tar.gz.log
    .git-annex/p5/1P/WORM-s53736-m1278673485--Archive-Tar-1.64.tar.gz.log
    .git-annex/wv/zG/WORM-s30584-m1268774021--ExtUtils-CBuilder-0.2703.tar.gz.log
    .git-annex/x5/7v/WORM-s10462526-m1254242591--BioPerl-1.6.1.tar.gz.log
Please, commit your changes or stash them before you can merge.
error: The following untracked working tree files would be overwritten by merge:
    .git-annex/1g/X3/WORM-s309910751-m1301311322--l_fcompxe_ia32_2011.2.137.tgz.log
    .git-annex/3w/Xf/WORM-s805764902-m1301312756--l_cproc_p_11.1.075_intel64.log
    .git-annex/9Q/Wz/WORM-s1234430253-m1301311891--l_ccompxe_2011.2.137.log
    .git-annex/FQ/4z/WORM-s318168323-m1301310848--l_cprof_p_11.1.075_ia64.log
    .git-annex/FV/0P/WORM-s710135470-m1301311835--l_ccompxe_intel64_2011.2.137.log
    .git-annex/Jx/qM/WORM-s599386592-m1301310731--l_fcompxe_2011.2.137.tgz.log
    .git-annex/KX/w1/WORM-s35976002-m1301312193--l_tbb_3.0.6.174.log
    .git-annex/Vw/jK/WORM-s15795178-m1301310913--w_flm_p_1.0.011_intel64.zip.log
    .git-annex/jK/zK/WORM-s374617670-m1301312705--l_ipp_7.0.2.137_intel64.log
    .git-annex/vK/kv/WORM-s584342291-m1301312669--l_cproc_p_11.1.075_ia64.log
    .git-annex/vw/v1/WORM-s736986678-m1301312794--l_cproc_p_11.1.075_ia32.log
    .git-annex/zq/7X/WORM-s343075585-m1301312233--l_ipp_7.0.2.137_ia32.log
Please move or remove them before you can merge.
Aborting
1|jtang@x00:~/sources $ git status
# On branch master
# Your branch is ahead of 'origin/master' by 2 commits.
#
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#   modified:   .git-annex/09/5X/WORM-s361516678-m1301310614--l_fcompxe_intel64_2011.2.137.tgz.log
#   modified:   .git-annex/43/2g/WORM-s19509673-m1301310496--l_fcompxe_2011.2.137_redist.tgz.log
#   modified:   .git-annex/4J/qF/WORM-s18891115-m1301310934--w_flm_p_1.0.011_ia64.zip.log
#   modified:   .git-annex/87/w1/WORM-s12212473-m1301310909--w_flm_p_1.0.011_ia32.zip.log
#   modified:   .git-annex/99/Jq/WORM-s194345957-m1301310926--l_mkl_10.3.2.137_ia32.log
#   modified:   .git-annex/99/kf/WORM-s9784531-m1301311680--l_ccompxe_2011.2.137_redist.log
#   modified:   .git-annex/FF/f3/WORM-s93033394-m1301311706--l_gen_ipp_7.0.2.137.log
#   modified:   .git-annex/MF/xZ/WORM-s515140733-m1301310936--l_cprof_p_11.1.075.log
#   modified:   .git-annex/XW/X8/WORM-s355559731-m1301310797--l_mkl_10.3.2.137.log
#   modified:   .git-annex/fJ/mZ/WORM-s1372886477-m1301313368--l_cproc_p_11.1.075.log
#   modified:   .git-annex/j7/Q9/WORM-s44423202-m1301310622--l_cprof_p_11.1.075_redist.log
#   modified:   .git-annex/k4/K7/WORM-s239539070-m1301310760--l_mkl_10.3.2.137_intel64.log
#   modified:   .git-annex/kz/01/WORM-s279573314-m1301310783--l_cprof_p_11.1.075_ia32.log
#   modified:   .git-annex/p6/Kq/WORM-s31199343-m1301311829--l_cproc_p_11.1.075_redist.log
#   modified:   .git-annex/pz/J5/WORM-s626995277-m1301312301--l_ccompxe_ia32_2011.2.137.log
#   modified:   .git-annex/v3/kX/WORM-s339693045-m1301310851--l_cprof_p_11.1.075_intel64.log
#
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#   modified:   .git-annex/12/3W/WORM-s3058814-m1276699694--Botan-1.8.9.tgz.log
#   modified:   .git-annex/1G/qV/WORM-s9122-m1251558854--Array-Compare-2.01.tar.gz.log
#   modified:   .git-annex/3W/W5/WORM-s231523-m1270740744--DBD-Pg-2.17.1.tar.gz.log
#   modified:   .git-annex/3x/PX/WORM-s380310-m1293025187--HTSeq-0.4.7.tar.gz.log
#   modified:   .git-annex/45/gk/WORM-s67337-m1248732018--ExtUtils-Install-1.54.tar.gz.log
#   modified:   .git-annex/4J/7Q/WORM-s8608-m1224694862--Algorithm-Munkres-0.08.tar.gz.log
#   modified:   .git-annex/4g/XQ/WORM-s89208-m1278682033--HTML-Parser-3.66.tar.gz.log
#   modified:   .git-annex/54/jw/WORM-s300163-m1226422051--AcePerl-1.92.tar.gz.log
#   modified:   .git-annex/63/kj/WORM-s1213460-m1262942058--DBD-SQLite-1.29.tar.gz.log
#   modified:   .git-annex/6Z/42/WORM-s4074-m943766010--File-Sync-0.09.tar.gz.log
#   modified:   .git-annex/8F/M5/WORM-s6989-m1263161127--Digest-HMAC-1.02.tar.gz.log
#   modified:   .git-annex/G2/FK/WORM-s3309-m1163872981--Bundle-BioPerl-2.1.8.tar.gz.log
#   modified:   .git-annex/Gk/XF/WORM-s23572243-m1279546902--EMBOSS-6.3.1.tar.gz.log
#   modified:   .git-annex/Jk/X6/WORM-s566429-m1279309002--DBI-1.612.tar.gz.log
#   modified:   .git-annex/K6/fV/WORM-s1561451-m1240055295--Convert-Binary-C-0.74.tar.gz.log
#   modified:   .git-annex/KM/4q/WORM-s146959-m1268515086--Graph-0.94.tar.gz.log
#   modified:   .git-annex/MF/m2/WORM-s425766-m1212514609--Data-Stag-0.11.tar.gz.log
#   modified:   .git-annex/QJ/P6/WORM-s1045868-m1282215033--9base-6.tar.gz.log
#   modified:   .git-annex/Qm/WG/WORM-s39078-m1278163547--Digest-SHA1-2.13.tar.gz.log
#   modified:   .git-annex/Wq/Fj/WORM-s45680640-m1297862101--BclConverter-1.7.1.tar.log
#   modified:   .git-annex/Wq/Wm/WORM-s263536640-m1295025537--CASAVA_v1.7.0.tar.log
#   modified:   .git-annex/XW/qm/WORM-s36609-m1276050470--Bio-ASN1-EntrezGene-1.10-withoutworldwriteables.tar.gz.log
#   modified:   .git-annex/Zq/7X/WORM-s343075585-m1301312233--l_ipp_7.0.2.137_ia32.log
#   modified:   .git-annex/f7/g0/WORM-s40872-m1278273227--ExtUtils-ParseXS-2.2206.tar.gz.log
#   modified:   .git-annex/j3/JF/WORM-s11753-m1232427595--Clone-0.31.tar.gz.log
#   modified:   .git-annex/kX/9g/WORM-s84690-m1229117599--GraphViz-2.04.tar.gz.log
#   modified:   .git-annex/km/z5/WORM-s44634-m1275505134--Authen-SASL-2.15.tar.gz.log
#   modified:   .git-annex/kw/J3/WORM-s132396-m1278780649--DBD-mysql-4.016.tar.gz.log
#   modified:   .git-annex/p5/1P/WORM-s53736-m1278673485--Archive-Tar-1.64.tar.gz.log
#   modified:   .git-annex/wv/zG/WORM-s30584-m1268774021--ExtUtils-CBuilder-0.2703.tar.gz.log
#   modified:   .git-annex/x5/7v/WORM-s10462526-m1254242591--BioPerl-1.6.1.tar.gz.log
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#   .git-annex/1G/X3/
#   .git-annex/3W/Xf/
#   .git-annex/9q/Wz/
#   .git-annex/Fq/4z/
#   .git-annex/Jk/zK/
#   .git-annex/Kx/w1/
#   .git-annex/VK/kv/
#   .git-annex/fv/0P/
#   .git-annex/jX/qM/
#   .git-annex/vW/jK/
#   .git-annex/vW/v1/
jtang@x00:~/sources $ git commit -a -m "snap"
[master 45f254a] snap
 47 files changed, 64 insertions(+), 30 deletions(-)
jtang@x00:~/sources $ git status
# On branch master
# Your branch is ahead of 'origin/master' by 3 commits.
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#   .git-annex/1G/X3/
#   .git-annex/3W/Xf/
#   .git-annex/9q/Wz/
#   .git-annex/Fq/4z/
#   .git-annex/Jk/zK/
#   .git-annex/Kx/w1/
#   .git-annex/VK/kv/
#   .git-annex/fv/0P/
#   .git-annex/jX/qM/
#   .git-annex/vW/jK/
#   .git-annex/vW/v1/
nothing added to commit but untracked files present (use "git add" to track)
jtang@x00:~/sources $ git pull
Comment by Jimmy Mon Mar 28 11:09:45 2011

If you try to clone a git repo that has a symlink over to a VFAT filesystem, you get (in its place) a regular file that contains the name of the symlink target. So why can't git-annex use that? I could still do git annex get on this file, git annex would still "know" that it's a symlink, and could replace it with a copy of the real file (instead of putting it in .git/annex).

I know if it were that simple, someone would have done it already, so what am I missing? I guess trying to get the file FROM the repository would fail because it wouldn't find the file in .git/annex? Couldn't you store a reverse mapping? You wouldn't be able to move the file around, but you already lose that once you give up symlinks. It would also be a little harder to tell which symlinks were "dangling"; I don't see an easy way to get around that. It would still be better than a bare repo..

Comment by ethan.glasser.camp Wed Jun 8 16:59:38 2011
I don't know what these problems forking could be. Can you strace it?
Comment by joey Wed Feb 9 11:04:50 2011
In my "sources" repo on x00, the current setting is "ignorecase = true". It was the first repo that I created, before I cloned it elsewhere and pulled my changes back. It is on an HFS+ partition, which is case-insensitive, and it is replicated on a portable HDD with a bare repo on an exFAT partition. I wonder if my portable disk has a partially borked repo :P
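
To see which paths would clash on a case-insensitive file system, here is a short Python sketch. The helper name case_collisions is made up for illustration, and str.casefold only approximates what HFS+ actually does when comparing names.

```python
from collections import defaultdict


def case_collisions(paths):
    """Group paths that differ only by letter case.

    Returns a dict mapping the case-folded path to the sorted list of
    distinct spellings, keeping only groups with more than one spelling.
    """
    groups = defaultdict(set)
    for path in paths:
        groups[path.casefold()].add(path)
    return {folded: sorted(spellings)
            for folded, spellings in groups.items()
            if len(spellings) > 1}


if __name__ == "__main__":
    # Directory names taken from the pull error and the untracked list
    # in the log earlier in this thread.
    sample = [".git-annex/1g/X3", ".git-annex/1G/X3", ".git-annex/KX/w1"]
    for folded, spellings in case_collisions(sample).items():
        print(folded, "<-", spellings)
```

Feeding it the output of git ls-files plus the untracked directories from git status should show which hash directories git treats as different while the file system treats them as the same.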
Comment by Jimmy Mon Mar 28 11:41:56 2011
Yes, I've moved it to the OSX page where anyone can update it in this wiki, and added your improvements.
Comment by joey Sun Feb 6 13:39:52 2011
Just tried it out on my mac and it's working again. I guess this issue could be closed for now.
Comment by Jimmy Wed Mar 16 16:32:01 2011
Picking up the automagic encryption idea for annex remotes, this would allow you to host a branchable-esque git-annex hosting service. (Nexenta with ZFS is a cheap and reliable option until btrfs becomes stable in a year or five).
Comment by Richard Wed Mar 30 14:20:56 2011

Finally got around to reporting the issue to the GHC tracker.

Looks quite alike (at least to the haskell-illiterate person like me) to a highest-priority issue that's hanging right at the top of the list. There are other similar reports, but they seem to be either related to PowerPC Macs, closed as invalid or due to needinfo inactivity.

Guess any further discussion belongs there, unless ghc developers will bounce it back. Thanks a lot for your help, Joey, and for sharing a great thing that git-annex is.

Comment by fraggod [pip.verisignlabs.com.pip.verisignlabs.com] Thu Apr 7 09:44:36 2011

S3 doesn't support encryption at all, yet.

It certainly makes sense to use a different portion of the encrypted secret key for HMAC than is used as the gpg symmetric encryption key.

The two keys used in HMAC would be the secret key and the key/value key for the content being stored.

There is a difficult problem with encrypting filenames in S3 buckets, and that is determining when some data in the bucket is unused for dropunused. I've considered two choices:

  1. gpg encrypt the filenames. This would allow dropunused to recover the original filenames, and is probably more robust encryption. But it would double the number of times gpg is run when moving content in/out, and to check for unused content, gpg would have to be run once for every item in the bucket, which just feels way excessive, even though it would not be prompting for a passphrase. Still, haven't ruled this out.

  2. HMAC or other hash. To determine what data was unused, the same hash and secret key would have to be used to hash all filenames currently used, and then that set of hashes could be intersected with the set in the bucket. But then git-annex could only say "here are some opaque hashes of content that appears unused by anything in your current git repository, but there's no way, short of downloading it and examining it, to tell what it is". (This could be improved by keeping a local mapping between filenames and S3 keys, but maintaining and committing that would bring pain of its own.)
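
A minimal Python sketch of option 2, not git-annex's actual code. It assumes HMAC-SHA1 keyed with a secret, and the names hashed_name and unused_hashes are made up for illustration. The keys still in use are hashed the same way as on upload, and whatever is in the bucket but not in that set shows up as an opaque, apparently unused hash.

```python
import hashlib
import hmac


def hashed_name(secret: bytes, key: str) -> str:
    """Return a deterministic, opaque bucket name for an annex key."""
    return hmac.new(secret, key.encode("utf-8"), hashlib.sha1).hexdigest()


def unused_hashes(secret: bytes, keys_in_use, bucket_names):
    """Return the bucket names that match no key currently in use.

    Only the hashes are known here; the original key names cannot be
    recovered from them without a separate mapping.
    """
    used = {hashed_name(secret, key) for key in keys_in_use}
    return set(bucket_names) - used


if __name__ == "__main__":
    secret = b"example secret"  # hypothetical; stands in for part of the real secret key
    keys = ["WORM-s3058814-m1276699694--Botan-1.8.9.tgz",
            "WORM-s566429-m1279309002--DBI-1.612.tar.gz"]
    bucket = [hashed_name(secret, key) for key in keys]
    # Pretend only the first key is still referenced by the repository.
    print(unused_hashes(secret, keys[:1], bucket))
```

The result illustrates the drawback described above: the output is a set of hex digests, with nothing to say which file each one used to be.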

Comment by joey Wed Mar 30 10:32:34 2011

I also ran into problems on a case-insensitive HFS+ file system, it seems. I tried following the instructions in comment 12:

1. Remove everything in .git-annex besides uuid.log and trust.log
2. git annex fsck --fast
3. Commit

However, I still see upper and lower case directories in .git-annex. Did I misunderstand that they should all be lower case now?

Comment by gernot Sun Apr 3 11:41:00 2011

You're missing the sha1sum command; everything else is a follow-on error from that. Added a hint about this to install, and in the next version configure will check for sha1sum.

Comment by joey Tue Feb 8 19:20:08 2011

Thanks for the reply @joey.

While it would certainly be possible for a bare repo to exist on my iRiver, the problem is that the music player uses the filesystem to organize files into directories like "Artist/Album/Track.ogg". So replacing that with "..../xx/yy/Track.ogg" would make it fairly difficult to browse my music collection and select the album/track I want to listen to :)

So unless I have the files physically organized like the symlinks, then it's probably not going to work very well for that particular workflow. Smudge filters are interesting though. In the meantime, I'll look into rsyncing from another box which has the right filesystem layout onto my iRiver directly.

Comment by fmarier Tue Apr 5 06:00:21 2011

I've posted about this on the git mailing list. It's possible that these bugs, which can be shown to affect things other than just git-annex, will be fixed in git.

I will wait a while to see. But am considering making git-annex use all-lowercase hash dirs for the log files. Maybe it could first look for .git-annex/aaaa/bbbb/foo.log, but also look for, read, and merge in any info from .git-annex/Aa/Bb/foo.log. And always write to the new style filenames. This would avoid confusing git with changes to mixed-case files, and avoid another massive transition.
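
A rough Python sketch of that lookup order, purely illustrative. git-annex itself is written in Haskell, and the function names and the choice to de-duplicate whole lines are assumptions. It reads the new-style lowercase path first, folds in any lines found at the old mixed-case path, and only ever writes to the new-style path.

```python
from pathlib import Path


def read_merged_log(new_path: Path, old_path: Path):
    """Return log lines from new_path, then any extra lines from old_path.

    Missing files are skipped. On a case-insensitive file system both
    paths may name the same file; de-duplication keeps that harmless.
    """
    lines = []
    seen = set()
    for path in (new_path, old_path):
        if path.is_file():
            for line in path.read_text().splitlines():
                if line and line not in seen:
                    seen.add(line)
                    lines.append(line)
    return lines


def write_log(new_path: Path, lines) -> None:
    """Always write to the new-style (all-lowercase) location."""
    new_path.parent.mkdir(parents=True, exist_ok=True)
    new_path.write_text("".join(line + "\n" for line in lines))
```

A caller would pass something like .git-annex/aaaa/bbbb/foo.log as new_path and .git-annex/Aa/Bb/foo.log as old_path, then write the merged result back with write_log so the old file never has to change again.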

Comment by joey Thu Mar 31 15:28:02 2011
They rely on git-ls-files to get a list of files that are checked into git, in order to tell what to unannex.
Comment by joey Sat Apr 2 21:40:50 2011
By the way, the original bug reporter mentioned deleting .git/annex/journal. This is not recommended, and doing it during an upgrade can result in git-annex losing location tracking information. You should probably run git annex fsck or reset to the old git tree (and git config annex.version 2) and upgrade again.
Comment by joey Tue Jul 5 15:06:48 2011
No, I don't need a copy of your repo now.
Comment by joey Fri Apr 1 12:11:52 2011
OTOH, if encryption makes a bup backend more likely, disregard the idea above ;)
Comment by Richard Wed Mar 30 15:02:20 2011
@ethan the reason that wouldn't work is because git would then see a file that was checked in and had its one line symlinkish content replaced with a huge binary blob. And git commit would try to commit that etc. The potential for foot-shooting is too high.
Comment by joey Fri Jun 10 12:41:43 2011
Although, if you really do want to shoot yourself in the foot, or know you have the old content, you can use git-annex setkey.
Comment by joey Sat May 14 12:29:35 2011

What you're describing should be impossible; the error message shown can only occur if the object is present in the annex where git-annex-shell recvkey is run. So something strange is going on.

Try reproducing it by running on the remote system, git-annex-shell recvkey /remote/repo.git $key .. if you can reproduce it, I guess the next thing to do will be to strace the command and see why it's thinking the object is there.

Comment by joey Sat May 14 20:09:34 2011
Thanks.
Comment by Richard Mon May 16 16:01:28 2011

I did not. Thanks :)

This still means that you can't re-inject a new version of a file unless you have the old one if you are using a SHA* backend, but that might be a corner case anyway.

Comment by Richard Sun Apr 3 05:00:17 2011
PS: Just to make this clear, I am using a custom alias for all my copying needs and thus didn't even see that I used --fast. :p
Comment by Richard Sun May 15 17:38:47 2011

I wouldn't say it's completely impossible for a WORM100 to work. It would just have the contract that the pair of mtime+100chars has to be unique for each unique piece of data.

But, I have yet to be convinced there's any point, since SHA1 exists.

Comment by joey Sat Apr 9 16:11:59 2011
Interesting, I had not heard of variable symlinks before. AFAIK linux does not have them.
Comment by joey Tue Mar 15 23:03:19 2011
One possible workaround is to just create a loopback file system with a case-sensitive filesystem. I think I might do that for anything that I really care about for now.
Comment by Jimmy Mon Mar 28 03:23:41 2011

The dtrace puzzlingly does not have the same errors shown above, but a set of mostly new errors. I don't know what to make of that.

git-annex: git-annex/.t/repo/.git/hooks/pre-commit: fileAccess: permission denied (Operation not permitted)

This seems to be caused by it setting the execute bit on the file. I don't know why that would fail; it's just written the file and renamed it into place so clearly should be able to write to it.

was able to modify annexed file's sha1foo content

This also suggests something breaking with permissions.

Comment by joey Wed Feb 9 17:59:47 2011

Hmm.. is utimensat available at all?

I've committed an update that may convince at least some compilers to expose this newer POSIX stuff. I don't know if it will help, please let me know.

Comment by joey Wed Mar 16 12:07:26 2011
Sometimes, I might want to fill up the disk as much as possible. Thus, a warning is preferable to erroring out too early, imo -- Richard
Comment by Richard Wed Mar 16 11:40:56 2011

You convince me for unannex, but isn't the goal of uninit to revert all annex operations? In the current state, a clean revert is not possible (because of the broken symlinks after uninit). Is using hard links instead of copying out of the question?

For my needs, is the command "git annex unlock ." (from the root of the repo) a correct workaround?

Comment by Rafaël Mon Jul 4 12:57:25 2011
Well, focus on a specific file that exhibits the problem. What does git annex whereis say about it? Is the content actually present in annex/objects/ on the bare repository? Does that contradict whereis?
Comment by joey Sat May 14 15:23:45 2011

Nice work on the bisection. It's obviously a compiler bug. Having two test cases that differ only in a commit as trivial and innocuous as 828a84ba3341d4b7a84292d8b9002a8095dd2382 might help a GHC developer track it down.

We should probably forward this as a GHC bug. I hope you can find a different version or build of GHC to build git-annex with.

Comment by joey Sun Apr 3 12:06:34 2011

Ah, that gave me a good clue, my system just got pretty confused with a mixture of quickcheck and testpack installs. Would it be possible to put up a list of versions of the software you are using on your development environment? (at least the minimum tested version)

I guess it shouldn't matter to most users who are going to rely on packagers to sort these dependency issues, but it's nice to know.

Anyway, the tests build now, and they seem to fail on my (rather messy) install of haskell platform + ghc 6.12 on osx 10.6.6.

< output that passed some tests >
Testing 1:blackbox:0:git-annex init
Testing 1:blackbox:1:git-annex add:0
Testing 1:blackbox:1:git-annex add:1
Cases: 30  Tried: 9  Errors: 0  Failures: 0test: sha1sum: executeFile: does not exist (No such file or directory)
  git-annex: : hGetLine: end of file
### Failure in: 1:blackbox:1:git-annex add:1
add with SHA1 failed
Testing 1:blackbox:2:git-annex setkey/fromkey
Cases: 30  Tried: 10  Errors: 0  Failures: 1(checksum...) test: sha1sum: executeFile: does not exist (No such file or directory)
### Error in:   1:blackbox:2:git-annex setkey/fromkey
: hGetLine: end of file
Testing 1:blackbox:3:git-annex unannex:0:no content
Cases: 30  Tried: 11  Errors: 1  Failures: 1chmod: -R: No such file or directory
chmod: -R: No such file or directory
Testing 1:blackbox:3:git-annex unannex:1:with content
### Failure in: 1:blackbox:3:git-annex unannex:1:with content
foo is not a symlink
Testing 1:blackbox:4:git-annex drop:0:no remotes
Cases: 30  Tried: 13  Errors: 1  Failures: 2chmod: -R: No such file or directory
### Error in:   1:blackbox:4:git-annex drop:0:no remotes
.t/tmprepo/.git/annex/objects/WORM:1297194705:20:foo/WORM:1297194705:20:foo: removeLink: permission denied (Permission denied)
Testing 1:blackbox:4:git-annex drop:1:with remote
Cases: 30  Tried: 14  Errors: 2  Failures: 2chmod: -R: No such file or directory
### Error in:   1:blackbox:4:git-annex drop:1:with remote
.t/tmprepo/.git/annex/objects/WORM:1297194705:20:foo/WORM:1297194705:20:foo: removeLink: permission denied (Permission denied)
Testing 1:blackbox:4:git-annex drop:2:untrusted remote
Cases: 30  Tried: 15  Errors: 3  Failures: 2chmod: -R: No such file or directory
### Error in:   1:blackbox:4:git-annex drop:2:untrusted remote
.t/tmprepo/.git/annex/objects/WORM:1297194705:20:foo/WORM:1297194705:20:foo: removeLink: permission denied (Permission denied)
Testing 1:blackbox:5:git-annex get
Cases: 30  Tried: 16  Errors: 4  Failures: 2chmod: -R: No such file or directory
### Error in:   1:blackbox:5:git-annex get
.t/tmprepo/.git/annex/objects/WORM:1297194705:20:foo/WORM:1297194705:20:foo: removeLink: permission denied (Permission denied)
Testing 1:blackbox:6:git-annex move
Cases: 30  Tried: 17  Errors: 5  Failures: 2chmod: -R: No such file or directory
### Error in:   1:blackbox:6:git-annex move
.t/tmprepo/.git/annex/objects/WORM:1297194705:20:foo/WORM:1297194705:20:foo: removeLink: permission denied (Permission denied)
Testing 1:blackbox:7:git-annex copy
Cases: 30  Tried: 18  Errors: 6  Failures: 2chmod: -R: No such file or directory
### Error in:   1:blackbox:7:git-annex copy
.t/tmprepo/.git/annex/objects/WORM:1297194705:20:foo/WORM:1297194705:20:foo: removeLink: permission denied (Permission denied)
Testing 1:blackbox:8:git-annex unlock/lock
Cases: 30  Tried: 19  Errors: 7  Failures: 2chmod: -R: No such file or directory
### Error in:   1:blackbox:8:git-annex unlock/lock
.t/tmprepo/.git/annex/objects/WORM:1297194705:20:foo/WORM:1297194705:20:foo: removeLink: permission denied (Permission denied)
Testing 1:blackbox:9:git-annex edit/commit:0
Cases: 30  Tried: 20  Errors: 8  Failures: 2chmod: -R: No such file or directory
### Error in:   1:blackbox:9:git-annex edit/commit:0
.t/tmprepo/.git/annex/objects/WORM:1297194705:20:foo/WORM:1297194705:20:foo: removeLink: permission denied (Permission denied)
Testing 1:blackbox:9:git-annex edit/commit:1
Cases: 30  Tried: 21  Errors: 9  Failures: 2chmod: -R: No such file or directory
### Error in:   1:blackbox:9:git-annex edit/commit:1
.t/tmprepo/.git/annex/objects/WORM:1297194705:20:foo/WORM:1297194705:20:foo: removeLink: permission denied (Permission denied)
Testing 1:blackbox:10:git-annex fix
Cases: 30  Tried: 22  Errors: 10  Failures: 2chmod: -R: No such file or directory
### Error in:   1:blackbox:10:git-annex fix
.t/tmprepo/.git/annex/objects/WORM:1297194705:20:foo/WORM:1297194705:20:foo: removeLink: permission denied (Permission denied)
Testing 1:blackbox:11:git-annex trust/untrust/semitrust
Cases: 30  Tried: 23  Errors: 11  Failures: 2chmod: -R: No such file or directory
### Error in:   1:blackbox:11:git-annex trust/untrust/semitrust
.t/tmprepo/.git/annex/objects/WORM:1297194705:20:foo/WORM:1297194705:20:foo: removeLink: permission denied (Permission denied)
Testing 1:blackbox:12:git-annex fsck:0
Cases: 30  Tried: 24  Errors: 12  Failures: 2chmod: -R: No such file or directory
### Error in:   1:blackbox:12:git-annex fsck:0
.t/tmprepo/.git/annex/objects/WORM:1297194705:20:foo/WORM:1297194705:20:foo: removeLink: permission denied (Permission denied)
Testing 1:blackbox:12:git-annex fsck:1
Cases: 30  Tried: 25  Errors: 13  Failures: 2chmod: -R: No such file or directory
### Error in:   1:blackbox:12:git-annex fsck:1
.t/tmprepo/.git/annex/objects/WORM:1297194705:20:foo/WORM:1297194705:20:foo: removeLink: permission denied (Permission denied)
Testing 1:blackbox:12:git-annex fsck:2
Cases: 30  Tried: 26  Errors: 14  Failures: 2chmod: -R: No such file or directory
### Error in:   1:blackbox:12:git-annex fsck:2
.t/tmprepo/.git/annex/objects/WORM:1297194705:20:foo/WORM:1297194705:20:foo: removeLink: permission denied (Permission denied)
Testing 1:blackbox:13:git-annex migrate:0
Cases: 30  Tried: 27  Errors: 15  Failures: 2chmod: -R: No such file or directory
### Error in:   1:blackbox:13:git-annex migrate:0
.t/tmprepo/.git/annex/objects/WORM:1297194705:20:foo/WORM:1297194705:20:foo: removeLink: permission denied (Permission denied)
Testing 1:blackbox:13:git-annex migrate:1
Cases: 30  Tried: 28  Errors: 16  Failures: 2chmod: -R: No such file or directory
### Error in:   1:blackbox:13:git-annex migrate:1
.t/tmprepo/.git/annex/objects/WORM:1297194705:20:foo/WORM:1297194705:20:foo: removeLink: permission denied (Permission denied)
Testing 1:blackbox:14:git-annex unused/dropunused
Cases: 30  Tried: 29  Errors: 17  Failures: 2chmod: -R: No such file or directory
### Error in:   1:blackbox:14:git-annex unused/dropunused
.t/tmprepo/.git/annex/objects/WORM:1297194705:20:foo/WORM:1297194705:20:foo: removeLink: permission denied (Permission denied)
Cases: 30  Tried: 30  Errors: 18  Failures: 2
chmod: -R: No such file or directory
test: .t/repo/.git/annex/objects/WORM:1297194705:20:foo/WORM:1297194705:20:foo: removeLink: permission denied (Permission denied)
make: *** [test] Error 1

I assumed that since the tests built, running them shouldn't be a problem. It looks like some argument isn't being passed about for the location of the .t directory that gets created. I will check the dependencies on my system again.

Comment by Jimmy Tue Feb 8 15:56:55 2011

if you go for the two-commits version, small intermediate branches (or git-commit-tree) could be used to create a tree like this:

*   commit 106eef2
|\  Merge: 436e46f 9395665
| | 
| |     the main commit
| |   
| * commit 9395665
|/  
|       intermediate move
|  
* commit 436e46f
| 
|     ...

while the first commit (436e46f) has a "/subdir/foo → ../.git-annex/where_foo_is", the intermediate (9395665) has "/subdir/deeper/foo → ../.git-annex/where_foo_is", and the final commit (106eef2) has "/subdir/deeper/foo → ../../.git-annex/where_foo_is".

--follow uses the intermediate commit to find the history, but the intermediate commit would neither show up in git log --first-parent nor affect git diff HEAD^.. & co. (there could still be confusion over git show, though).

Comment by chrysn Wed Mar 9 19:47:48 2011

I'm not sure how this happened. As far as I can see, and based on my testing, git annex upgrade does stage the location log files. OTOH, I vaguely remember needing to stage some of them when I was doing my own upgrades, but that was a while ago, and I don't remember the details.

Your upgrade seems to have gone ok from the file lists you sent, so you can just: git add .git-annex; git commit

Comment by joey Sat Apr 2 22:26:20 2011
I was running git 1.7.4.1 at the time when I came across it; I have just upgraded to 1.7.4.2. I've also just moved to using a loopback fs for the stuff I care about. Do you still want a repo that exhibits the problem (excluding the .git/annex data)? I'm also not sure if 1.7.4.2 has corrected the problem yet, as I haven't done much with my repos since. I suspect just making all the .git-annex hashed directories lower case might be better in the long run.
Comment by Jimmy Thu Mar 31 17:32:10 2011

It all boils down to the fact that the path to a relative symlink's target is determined relative to the symlink itself.
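
As a small illustration of that rule, here is a Python sketch. The helper name relative_link_target is made up, and it only computes the string a relative symlink would need to contain; it does not create anything on disk.

```python
import os


def relative_link_target(link_path: str, target_path: str) -> str:
    """Return target_path expressed relative to the directory holding link_path.

    Both arguments are interpreted relative to the same starting point,
    for example the top of the working tree.
    """
    link_dir = os.path.dirname(link_path) or "."
    return os.path.relpath(target_path, start=link_dir)


if __name__ == "__main__":
    target = ".git-annex/where_foo_is"
    for link in ("subdir/foo", "subdir/deeper/foo"):
        print(link, "->", relative_link_target(link, target))
```

On a POSIX system the link in subdir/ needs ../.git-annex/where_foo_is, while the one in subdir/deeper/ needs ../../.git-annex/where_foo_is, which is why moving a symlink to a different depth without rewriting it leaves it dangling.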

Now, if we define the symlink's target relative to the git repo's root (eg. using the $GIT_DIR environment variable, which can be a relative or absolute path itself), this unfortunately results in an absolute symlink, which would -for obvious reasons- only be usable locally:

user@host:~$ mkdir -p tmp/{.git/annex,somefolder}
user@host:~$ export GIT_DIR=~/tmp
user@host:~$ touch $GIT_DIR/.git/annex/realfile
user@host:~$ ln -s $GIT_DIR/.git/annex/realfile $GIT_DIR/somefolder/file
user@host:~$ ls -al $GIT_DIR/somefolder/
total 12
drwxr-x--- 2 user group 4096 2011-03-10 16:54 .
drwxr-x--- 4 user group 4096 2011-03-10 16:53 ..
lrwxrwxrwx 1 user group   33 2011-03-10 16:54 file -> /home/user/tmp/.git/annex/realfile
user@host:~$

So, what we need is the ability to record the actual variable name (instead of its value) in our symlinks.

It is possible, using variable/variant symlinks, yet I'm unsure as to whether or not this is available on Linux systems, and even if it is, it would introduce compatibility issues in multi-OS environments.

Thoughts on this?

Comment by praet Thu Mar 10 12:50:28 2011

Ok, well it looks like it isn't doing anything useful at all.

jtang@x00:~/develop/git-annex $ make StatFS.hs
hsc2hs StatFS.hsc
perl -i -pe 's/^{-# INCLUDE.*//' StatFS.hs
jtang@x00:~/develop/git-annex $ ghci StatFS.hs
GHCi, version 6.12.3: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package ffi-1.0 ... linking ... done.
[1 of 1] Compiling StatFS           ( StatFS.hs, interpreted )
Ok, modules loaded: StatFS.
*StatFS> s <- getFileSystemStats "."
Loading package bytestring-0.9.1.7 ... linking ... done.
*StatFS> s
Just (FileSystemStats {fsStatBlockSize = 0, fsStatBlockCount = 1048576, fsStatByteCount = 0, fsStatBytesFree = 0, fsStatBytesAvailable = 0, fsStatBytesUsed = 0})
*StatFS> s <- getFileSystemStats "/"
*StatFS> s
Just (FileSystemStats {fsStatBlockSize = 0, fsStatBlockCount = 1048576, fsStatByteCount = 0, fsStatBytesFree = 0, fsStatBytesAvailable = 0, fsStatBytesUsed = 0})
*StatFS> 
Comment by Jimmy Wed Mar 23 12:02:34 2011
Also, you can delete .git-annex/?? if you want to, then running git annex fsck --fast in each of your clones would regenerate the data using only the lower-case hash directories.
Comment by joey Sat Apr 2 13:58:24 2011
Ok, thanks for the fix. It seems the fix isn't too reliable with my repos, I get different numbers of "** No known copies of..." in the various cloned repos that I have. After all the "messing" that I have done to my repos I think git-annex has gotten very confused. I will just leave things as they are and let git-annex slowly migrate over to the new format or re-clone from a linux source and see how things go. I will report back on this issue in a bit, after I have used it more.
Comment by Jimmy Sun Apr 3 03:43:37 2011
I got dtruss to give me a trace, the output is quite big to post here (~560kb gzip'd), do you mind if I emailed it or posted it somewhere else for you?
Comment by Jimmy Wed Feb 9 15:35:47 2011

Yes, encrypting the symmetric key with users' regular gpg keys is the plan.

I don't think that encryption of content in a git annex remote makes much sense; the filenames obviously cannot be encrypted there. It's more likely that the same encryption would get used for a bup remote, or with the directory remote I threw in today.

Comment by joey Wed Mar 30 14:15:18 2011
Well if it happens again why don't you use ps or strace to see what it's doing.
Comment by joey Mon Jul 4 18:58:46 2011

As my comment from work is stuck in moderation:

I ran this twice:

git pull && git annex add . && git annex copy . --to <remote> --fast --quiet && git commit -a -m "$HOST $(date +%F--%H-%M-%S-%Z)" && git push

but nothing changed

Comment by Richard Sat May 14 05:06:54 2011

'git add .git-annex' didn't do anything. That's when I noticed that this repository is on a case-insensitive HFS+ file system.

So, if I get this right it's not a new bug, but similar to this situation: git-annex directory hashing problems on osx

Assuming that it was the file system's fault, I went ahead and upgraded yet another clone. That one (on an ext3 file system) had neither staged changes nor left-over untracked files. Everything seems to just have fallen right into place. Is that possible or still weird?

Comment by gernot Sun Apr 3 11:35:52 2011

Hmm. Old versions may have forgotten to git add a .git-annex location log file when recovering content with fsck. That could be another reason things are out of sync.

But I'm not clear on which repo is trying to copy files to which.

(NB: If the files were recovered on a bare git repo, fsck cannot update the location log there, which could also explain this.)

Comment by joey Sat May 14 12:13:58 2011

I've seen this kind of piping stall that is unblocked by strace before. It can vary with versions of GHC, so it would be good to know what version built git-annex (and on what OS version). I filed a bug report upstream before at http://bugs.debian.org/624389.

I really need a full strace -f from the top, or at least a complete strace -o log of git-annex from one hang through to another hang. The strace you pastebinned does not seem complete. If I can work out which specific git command is being written to when it hangs I can lift the writing out into a separate thread or process to fix it.

@pavel, you mentioned three defunct git processes, and then showed ps output for 3 git processes. Were there 6 git processes in total? And then when you ran it again you said there were no defunct gits -- were the other 3 git processes running once again?

As best I can make out from the (apparently) running git processes, it seems like the journal files for the upgrade had all been written, and the hang occurred when staging them all into the index in preparation for a commit. I have committed a change that lifts the code that does that write out into a new process, which, if I am guessing right on the limited info I have, will avoid the hang.

However, since I can't reproduce it, even when I put 200 thousand files in the journal and have git-annex process them, I can't be sure.

Comment by joey Tue Jul 5 13:31:22 2011

ok, pulling the latest master and building on OSX now does this...

ghc -O2 -Wall -ignore-package monads-fd --make git-annex
[ 1 of 63] Compiling Touch            ( Touch.hs, Touch.o )

Touch.hsc:24:0:
    The type signature for `touchBoth' lacks an accompanying binding

Touch.hsc:27:26: Not in scope: `touchBoth'
make: *** [git-annex] Error 1

changing the #if 0 to 1 gives this...

ghc -O2 -Wall -ignore-package monads-fd --make git-annex
[ 1 of 63] Compiling Touch            ( Touch.hs, Touch.o )

Touch.hsc:95:43:
    Couldn't match expected type `CLong' against inferred type `CTime'
    In the second argument of `(\ hsc_ptr
                                    -> pokeByteOff hsc_ptr 0)', namely
        `(sec :: CLong)'
    In a stmt of a 'do' expression:
        (\ hsc_ptr -> pokeByteOff hsc_ptr 0) ptr (sec :: CLong)
    In the expression:
        do { (\ hsc_ptr -> pokeByteOff hsc_ptr 0) ptr (sec :: CLong);
             (\ hsc_ptr -> pokeByteOff hsc_ptr 4) ptr (0 :: CLong) }
make: *** [git-annex] Error 1

it seems that commit 6634b6a6b84a924f6f6059b5bea61f449d056eee has broken support for OSX.

Comment by Jimmy Sun Mar 20 16:48:41 2011
And, maybe, a way to start a fsck from remote? At least when the other side is a ssh or git annex shell, this would work.
Comment by Richard Mon Jun 13 12:58:52 2011
There's a simple test -- just configure annex.diskreserve to be say, 10 megabytes less than the total free space on your disk. Then try to git annex get a 11 mb file, and a 9 mb file. :)
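
The reserve for that test could be worked out from df, roughly like this. It is only a sketch: reserve_for_test is a made-up helper, the "megabytes" unit spelling is an assumption about what annex.diskreserve accepts, and the git config line is printed rather than run.

```shell
#!/bin/sh
# reserve_for_test DIR USABLE_MB
# Print a reserve size (in megabytes) that leaves only USABLE_MB
# megabytes usable on the filesystem holding DIR.
reserve_for_test() {
    # df -P gives POSIX-format output; field 4 of line 2 is available KiB.
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    echo $(( avail_kb / 1024 - $2 ))
}

reserve_mb=$(reserve_for_test . 10)
# Printed, not executed, so running the sketch changes nothing.
echo "git config annex.diskreserve \"$reserve_mb megabytes\""
```

With that reserve in place, getting the 11 mb file should be refused and the 9 mb one should go through. The number comes out negative if less than 10 mb is free to begin with.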
Comment by joey Wed Mar 23 11:05:12 2011

Just did some minor digging around and checking, and this seems to satisfy the compilers etc. I have yet to confirm that it really is working as expected. Also, it might be better to check for a darwin operating system instead of apple, I think, though I don't know of anyone really using a pure darwin OS. But for now it works (I think).

From fbfe27c2e19906ac02e3673b91bffa920f6dae5d Mon Sep 17 00:00:00 2001
From: Jimmy Tang 
Date: Wed, 23 Mar 2011 08:15:39 +0000
Subject: [PATCH] Define (__APPLE__) in StatFS

At least on OSX 10.6.6 it appears to have the same defintions as
FreeBSD. The build process doesn't complain and the code is enabled,
this needs to be tested and checked more.
---
 StatFS.hsc |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/StatFS.hsc b/StatFS.hsc
index 8b453dc..45fd7e4 100644
--- a/StatFS.hsc
+++ b/StatFS.hsc
@@ -53,7 +53,7 @@ import Foreign.C.String
 import Data.ByteString (useAsCString)
 import Data.ByteString.Char8 (pack)

-#if defined (__FreeBSD__)
+#if defined (__FreeBSD__) || defined(__APPLE__)
 # include 
 # include 
 #else
@@ -84,7 +84,7 @@ data CStatfs
 #ifdef UNKNOWN
 #warning free space checking code not available for this OS
 #else
-#if defined(__FreeBSD__)
+#if defined(__FreeBSD__) || defined(__APPLE__)
 foreign import ccall unsafe "sys/mount.h statfs"
 #else
 foreign import ccall unsafe "sys/vfs.h statfs64"
-- 
1.7.4.1
Comment by Jimmy Wed Mar 23 04:21:30 2011
That is displayed by rsync. It's not unheard of for rsync to resume a transfer and display extremely high speeds.
Comment by joey Sat Apr 2 21:37:29 2011
This is brain-storming only so the idea might be crap, but a branch could keep encrypted filenames while master keeps the real deal. This might fit into the whole scheme just nicely or break future stuff in a dozen places, I am not really sure yet. But at least I can't forget the idea, now.
Comment by Richard Wed Mar 30 14:59:19 2011

Completed git-bisect twice, getting roughly the same results:

828a84ba3341d4b7a84292d8b9002a8095dd2382 is the first bad commit
commit 828a84ba3341d4b7a84292d8b9002a8095dd2382
Author: Joey Hess <joey@kitenet.net>
Date:   Sat Mar 19 14:33:24 2011 -0400

    Add version command to show git-annex version as well as repository version information.

:040000 040000 ed849b7b6e9b177d6887ecebd6a0f146357824f3 1c98699dfd3fc3a3e2ce6b55150c4ef917de96e9 M      Command
:100644 100644 b9c22bdfb403b0bdb1999411ccfd34e934f45f5c adf07e5b3e6260b296c982a01a73116b8a9a023c M      GitAnnex.hs
:100644 100644 76dd156f83f3d757e1c20c80d689d24d0c533e16 d201cc73edb31f833b6d00edcbe4cf3f48eaecb0 M      Upgrade.hs
:100644 100644 5f414e93b84589473af5b093381694090c278e50 d4a58d77a29a6a02daf13cec0df08b5aab74f65e M      Version.hs
:100644 100644 f5c2956488a7afafd20374873d79579fb09b1677 f8cd577e992d38c7ec1438ce5c141eb0eb410243 M      configure.hs
:040000 040000 f9b7295e997c0a5b1dda352f151417564458bd6e a30008475c1889f4fd8d60d4d9c982563380a692 M      debian
:040000 040000 9d87a5d8b9b9fe7b722df303252ffd5760d66f75 08834f61a10d36651b3cdcc38389f45991acdf5e M      doc

contents of final refs/bisect:

bad (828a84ba3341d4b7a84292d8b9002a8095dd2382)
good-33cb114be5135ce02671d8ce80440d40e97ca824
good-942480c47f69e13cf053b8f50c98c2ce4eaa256e
good-ca48255495e1b8ef4bda5f7f019c482d2a59b431

"roughly" because second bisect gave two commits as a result, failing to build one of them (missing .o file on link, guess it's because of -j4 and bad deps in that version's build system):

There are only 'skip'ped commits left to test.
The first bad commit could be any of:
828a84ba3341d4b7a84292d8b9002a8095dd2382
5022a69e45a073046a2b14b6a4e798910c920ee9
We cannot bisect more!

Also noticed that the "git-annex-shell ..." command succeeds if run as the root user, while failing as an unprivileged one. There are no permission/access errors in "strace -f git-annex-shell ...", so I guess it could be some bug in GHC indeed.

JIC, logged a whole second bisect operation. Resulting log: http://fraggod.net/static/share/git-annex-bisect.log

Bisect script I've used (git-annex-shell dies with error code 134 - SIGABRT on GHC error):

res=
while true; do
  if [[ -n "$res" ]]; then
    cd /var/tmp/paludis/build/dev-scm-git-annex-scm.bak/work/git-annex-scm
    echo "---=== BISECT ($res) ===---"; git bisect "$res" 2>&1; echo '---=== /BISECT ===---'
    cd
    rm -Rf /var/tmp/paludis/build/dev-scm-git-annex-scm
    cp -a --reflink=auto /var/tmp/paludis/build/dev-scm-git-annex-scm{.bak,}
    chown -R paludisbuild: /var/tmp/paludis/build/dev-scm-git-annex-scm
  fi
  res=
  cave resolve -zx1 git-annex --skip-until-phase configure || res=skip
  if [[ -z "$res" ]]; then
    cd /remote/path
    sudo -u user git-annex-shell 'sendkey' '/remote/path' 'SHA1-s6654080--abd8edec20648ade69351d68ae1c64c8074a6f0b' '--' rsync --server --sender -vpe.Lsf --inplace . ''
    if [[ $? -eq 134 ]]; then res=bad; else res=good; fi
    cd
  fi
done 2>&1 | tee ~/git-annex-bisect.log
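
The same good/bad/skip decision could also be handed to "git bisect run", which reads exit code 0 as good, 125 as skip, and any other code from 1 to 127 as bad. A minimal sketch; bisect_verdict is a hypothetical helper, not part of the script above:

```shell
#!/bin/sh
# bisect_verdict BUILD_STATUS TEST_STATUS
# Map the two exit statuses used in the loop above onto the exit codes
# that "git bisect run" understands: 0 = good, 125 = skip, 1 = bad.
bisect_verdict() {
    if [ "$1" -ne 0 ]; then
        return 125      # build failed: this commit cannot be tested
    elif [ "$2" -eq 134 ]; then
        return 1        # SIGABRT from git-annex-shell: bad commit
    else
        return 0        # anything else counts as good
    fi
}
```

A wrapper script would run the build and the sendkey command, pass both statuses to bisect_verdict, and exit with its result, leaving the looping to git.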
Comment by fraggod [pip.verisignlabs.com.pip.verisignlabs.com] Sun Apr 3 02:22:15 2011
Git does not need to be upgraded. Git-annex needs to be upgraded to git rev 616e6f8a840ef4d99632d12a2e7ea15c3cfb1805 or newer, on all machines.
Comment by joey Sun Apr 3 15:53:44 2011

I think the correct steps should be, make a backup first :) then ...

  1. git pull # update your clone, and commit everything so you don't lose anything
  2. git annex fsck --fast # check the repo first, just in case
  3. rm -rf .git-annex/?? # remove the old metadata
  4. git annex fsck --fast # get git annex to regenerate it all
  5. push your changes out to your other repos, you will need to make sure git-annex is updated everywhere if there are remotes in your setup.
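
Steps 1 to 4 could be wrapped up roughly like this. It is a sketch only: run and migrate_location_logs are made-up names, and by default nothing is executed, the commands are just printed.

```shell
#!/bin/sh
# By default only print the commands; set DRY_RUN=0 to really run them
# (after making a backup).
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

migrate_location_logs() {
    run git pull &&                       # 1. update the clone
    run git annex fsck --fast &&          # 2. check the repo first
    run sh -c 'rm -rf .git-annex/??' &&   # 3. remove the old metadata
    run git annex fsck --fast             # 4. regenerate it
    # 5. pushing to the other repos is left as a manual step.
}

migrate_location_logs
```

The commands are chained with && so that a failing fsck in step 2 stops the run before anything is removed.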

I eventually migrated all of my own annex'd repos and I no longer have the old hashed directories but the new ones in the form

.git/annex/aaa/bbb/foo.log

I did lose some tracking information but not data (as far as I can see for now), but that was quickly fixed by pushing and pulling to my bare repo which tracks most of my data.

I also found that it worked a bit more reliably for me on the copies of repos that were located on case sensitive filesystems, but I guess that was expected.

Comment by Jimmy Sun Apr 3 12:02:33 2011

git 1.7.4 does not make things better. With it, if I add first "X/foo" and then "x/bar", it commits "X/bar".

That will certainly cause problems when interoperating with a repo clone on a case-sensitive filesystem, since git-annex there will not see the location log that git committed to the wrong case directory.

It's possible there is some interoperability problem when pulling from linux like you did, onto HFS+, too. I am not quite sure. Ah, I did find one.. if I clone the repo with "X/foo" in it to a case-sensitive filesystem, and add a "x/foo" there, and pull that commit back to HFS+, git says:

 * branch            master     -> FETCH_HEAD
Updating 8754149..e3d4640
Fast-forward
 x/foo |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 x/foo
joey@gnu:/mnt/r4>ls
X/
joey@gnu:/mnt/r4>git st
# On branch master
# Changes not staged for commit:
#   (use "git add ..." to update what will be committed)
#   (use "git checkout -- ..." to discard changes in working directory

#   modified:   X/foo

Aha -- that lets me reproduce your problem with the same file being staged twice with different capitalizations, too:

joey@gnu:/mnt/r4>echo haaai >| x/foo
joey@gnu:/mnt/r4>git st
# On branch master
# Changes not staged for commit:
#   (use "git add ..." to update what will be committed)
#   (use "git checkout -- ..." to discard changes in working directory)
#
#   modified:   X/bar
#   modified:   X/foo
#   modified:   x/foo
#
joey@gnu:/mnt/r4>git commit -a
fatal: Will not add file alias 'X/Bar' ('x/Bar' already exists in index)

And modified files that git refuses to commit, which entirely explains why git-annex has issues with git when staging/committing logs.

joey@gnu:/mnt/r4>git add X/foo
joey@gnu:/mnt/r4>git commit X/foo
# On branch master
# Changes not staged for commit:
#   (use "git add ..." to update what will be committed)
#   (use "git checkout -- ..." to discard changes in working directory)
#
#   modified:   X/bar
#   modified:   X/foo
#
no changes added to commit (use "git add" and/or "git commit -a")

I think git is, frankly, buggy. It seems I will need to work around this by no longer using mixed case hashing for location logs.

Comment by joey Thu Mar 31 15:08:01 2011
Just tried building both of the code paths, and they seem to build and somewhat function on OSX. I have yet to confirm the functionality is working correctly, but so far it's looking good. (I somewhat care less about the utimes/mtimes of my files since I care more about the content :) )
Comment by Jimmy Mon Mar 21 04:52:18 2011

I think I have figured out why

### Failure in: 1:blackbox:3:git-annex unannex:1:with content
foo is not a symlink

It goes back to this piece of code (in test.hs)

copyrepo :: FilePath -> FilePath -> IO FilePath
copyrepo old new = do
        cleanup new
        ensuretmpdir
        Utility.boolSystem "cp" ["-pr", old, new] @? "cp -pr failed"

It seems that on OSX it does not preserve the symbolic link information; basically, cp is not GNU cp on OSX. Doing a "cp -a SOURCE DEST" seems to do the right thing on OSX. I tried it out on my archlinux workstation by replacing -pr with just -a, and all the tests passed on archlinux.

I'm not sure what the implications of changing the cp command in the test would be.
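
Whether a given set of cp flags keeps symlinks as symlinks can be probed on each platform with something like this. It is a rough sketch, and cp_keeps_symlinks is a made-up helper, not something from the test suite:

```shell
#!/bin/sh
# cp_keeps_symlinks FLAGS...
# Succeeds if "cp FLAGS" copies a symlink inside a directory as a
# symlink, rather than following it and copying the file it points to.
cp_keeps_symlinks() {
    probe=$(mktemp -d) || return 2
    mkdir "$probe/old"
    echo content > "$probe/old/target"
    ln -s target "$probe/old/foo"
    cp "$@" "$probe/old" "$probe/new" 2>/dev/null
    [ -L "$probe/new/foo" ]
    result=$?
    rm -rf "$probe"
    return $result
}

for flags in "-pr" "-a"; do
    if cp_keeps_symlinks "$flags"; then
        echo "cp $flags: symlinks preserved"
    else
        echo "cp $flags: symlinks NOT preserved"
    fi
done
```

If cp rejects the flags outright, the copy never appears and the probe reports the symlinks as not preserved.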

Comment by Jimmy Sun Feb 13 11:12:10 2011

Haven't given these any serious thought (which will become apparent in a moment) but hoping they will give birth to some less retarded ideas:


Bait'n'switch

  • pre-commit: Replace all staged symlinks (when pointing to annexed files) with plaintext files containing the key of their respective annexed content, re-stage, and add their paths (relative to repo root) to .gitignore.
  • post-commit: Replace the plaintext files with (git annex fix'ed) symlinks.

In doing so, the blobs to be committed can remain unaltered, irrespective of their related files' depth in the directory hierarchy.

To prevent git from reporting ALL annexed files as unstaged changes after running post-commit hook, their paths would need to be added to .gitignore.

This wouldn't cause any issues when adding files, very little when modifying files (would need some alterations to "git annex unlock"), BUT would make git totally oblivious to removals...


Manifest-based (re)population

  • Keep a manifest of all annexed files (key + relative path)
  • DON'T track the symlinks (.gitignore)
  • Populate/update the directory structure using a post-commit hook.

... thus circumventing the issue entirely, yet diffstats (et al.) would be rather uninformative.


Wide open to suggestions, criticism, mocking laughter and finger-pointing :)

Comment by praet Sun Mar 20 16:11:27 2011

I doubt that git-annex can be used with QuickCheck 1.2.0. The QuickCheck I've tested it with is 2.1.0.3 actually.

I suspect you have an old version of the TestPack haskell library on your system, that is linked against QuickCheck 1.2.0. Git-annex has been tested with TestPack 2.0.0, which uses QuickCheck 2.x.

In any case, you don't have to run 'make test' to build git-annex, and my comments above should make the main program compile, I expect.

Comment by joey Tue Feb 8 15:00:14 2011

After mulling this over, I think actually encrypting the filenames is preferable.

Did you consider encrypting the symmetric key with an asymmetric one? That's what TrueCrypt etc are using to allow different people access to a shared volume. This has the added benefit that you could, potentially, add new keys for data that new people should have access to while making access to old data impossible. Or keys per subdirectory, or, or, or.

As an aside, could the same mechanism be extended to transparently encrypt data for a remote annex repo? A friend of mine is interested to host his data with me, but he wants to encrypt his data for obvious reasons.

Comment by Richard Wed Mar 30 13:01:40 2011
I've fixed the test suite to not accumulate all those zombie processes. Now only 2 or 3 processes should run max. Am curious to see if that clears up all the problems.
Comment by joey Sun Feb 13 00:52:26 2011
Have you seen recover data from lost+found? The method described there will also work in this scenario.
Comment by joey Sat Apr 2 21:46:16 2011

I'm using git-annex to keep my music in sync between all of my different machines. What I'd love to be able to do is to also keep it in sync with my iRiver player. Unfortunately, the firmware, Rockbox, doesn't support ext3, so I'm stuck with a FAT filesystem.

I can see how the design of git-annex makes it rather difficult to get rid of the symlinks, so how about taking a different approach: something like a "git annex export DEST" which would take a destination (not a git remote) and rsync the content over to there as regular files.

Maybe "git annex sync DEST" or "git annex rsync DEST" would be better names if we want to convey the idea that the destination will be made to look like the source repo, including performing the necessary deletions.
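
Until something like that exists, a plain rsync that follows the symlinks might come close. A minimal sketch, assuming the annexed content is present locally; export_tree is a made-up name and not a git-annex command:

```shell
#!/bin/sh
# export_tree SRC DEST
# Copy SRC to DEST as regular files: -L follows symlinks and copies what
# they point to, --delete removes files from DEST that are gone from SRC,
# and --exclude=.git keeps the repository internals out of the copy.
export_tree() {
    rsync -rtL --delete --exclude=.git "$1"/ "$2"/
}
```

Usage would be along the lines of "export_tree ~/music /media/player/music". Symlinks whose content is not in the local annex are dangling, so rsync has nothing to copy for them and will most likely report them as errors.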

Comment by fmarier Mon Apr 4 03:40:41 2011

I followed this to re-inject files which git annex fsck listed as missing.

For every one of those files, I get

git-annex-shell: key is already present in annex
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(601) [sender=3.0.8]

when trying to copy the files to the remote.

-- Richard

Comment by Richard Wed May 11 20:07:29 2011
Right. You probably don't want git-annex to fill up your entire drive anyway, so if it tries to reserve 10 mb or 1% or whatever (probably configurable) for overhead, that should be good enough.
Comment by joey Tue Mar 15 23:04:50 2011
Yes you seem to have come across the same bug that I had initially reported :P
Comment by Jimmy Sun Apr 3 12:05:39 2011
The latest change looks good, it seems to be returning sensible numbers for me. Just tried it out on a few different mount points and it appears to be working.
Comment by Jimmy Wed Mar 23 13:03:51 2011
Alright, I've added #ifdefs and the symlink timestamp mirroring feature will be unavailable on OSX until I get a version that works there.
Comment by joey Wed Mar 16 13:46:40 2011

It may be that OSX has some low resource limits for user processes (266 per user, I think). Doing a

sudo sysctl -w kern.maxproc=2048
sudo sysctl -w kern.maxprocperuid=1024
echo "limit maxfiles 1024 unlimited" | sudo tee -a /etc/launchd.conf
echo "limit maxproc 1024 2048" | sudo tee -a /etc/launchd.conf

seems to change the behaviour of the tests a bit...

Testing 1:blackbox:3:git-annex unannex:1:with content                         
### Failure in: 1:blackbox:3:git-annex unannex:1:with content
foo is not a symlink
Testing 1:blackbox:4:git-annex drop:0:no remotes                              
### Failure in: 1:blackbox:4:git-annex drop:0:no remotes
drop wrongly succeeded with no known copy of file
Testing 1:blackbox:4:git-annex drop:1:with remote                             
Testing 1:blackbox:4:git-annex drop:2:untrusted remote                        
Testing 1:blackbox:5:git-annex get                                            
Testing 1:blackbox:6:git-annex move                                           
Testing 1:blackbox:7:git-annex copy                                           
Testing 1:blackbox:8:git-annex unlock/lock                                    
Testing 1:blackbox:9:git-annex edit/commit:0                                  
Cases: 30  Tried: 20  Errors: 0  Failures: 2add foo ok
ok
Testing 1:blackbox:9:git-annex edit/commit:1                                  
Testing 1:blackbox:10:git-annex fix                                           
Testing 1:blackbox:11:git-annex trust/untrust/semitrust                       
Testing 1:blackbox:12:git-annex fsck:0                                        
Cases: 30  Tried: 24  Errors: 0  Failures: 2  Only 1 of 2 trustworthy copies of foo exist.
  Back it up with git-annex copy.
  Only 1 of 2 trustworthy copies of sha1foo exist.
  Back it up with git-annex copy.
  Bad file size; moved to /Users/jtang/develop/git-annex/.t/tmprepo/.git/annex/bad/WORM:1297565141:20:foo
  Bad file content; moved to /Users/jtang/develop/git-annex/.t/tmprepo/.git/annex/bad/SHA1:ee80d2cec57a3810db83b80e1b320df3a3721ffa
Testing 1:blackbox:12:git-annex fsck:1                                        
### Failure in: 1:blackbox:12:git-annex fsck:1
fsck failed to fail with content only available in untrusted (current) repository
Testing 1:blackbox:12:git-annex fsck:2                                        
Cases: 30  Tried: 26  Errors: 0  Failures: 3  Only 1 of 2 trustworthy copies of foo exist.
  Back it up with git-annex copy.
  The following untrusted locations may also have copies: 
    58e831c2-371b-11e0-bc1f-47d738dc52ee  -- test repo
  Only 1 of 2 trustworthy copies of sha1foo exist.
  Back it up with git-annex copy.
  The following untrusted locations may also have copies: 
    58e831c2-371b-11e0-bc1f-47d738dc52ee  -- test repo
Testing 1:blackbox:13:git-annex migrate:0                                     
Cases: 30  Tried: 27  Errors: 0  Failures: 3  git-annex: user error (Error in fork: forkProcess: resource exhausted (Resource temporarily unavailable))
### Failure in: 1:blackbox:13:git-annex migrate:0
migrate annexedfile failed
Testing 1:blackbox:13:git-annex migrate:1                                     
### Error in:   1:blackbox:13:git-annex migrate:1
forkProcess: resource exhausted (Resource temporarily unavailable)
Testing 1:blackbox:14:git-annex unused/dropunused                             
### Error in:   1:blackbox:14:git-annex unused/dropunused
forkProcess: resource exhausted (Resource temporarily unavailable)
Cases: 30  Tried: 30  Errors: 2  Failures: 4
test: failed

The number of failures varies as I change the values of the maxprocs. I think I have narrowed it down to OSX just being stupid with limits, thus causing the tests to fail.

Comment by Jimmy Sat Feb 12 22:45:51 2011

When I reproduce this, the file is not gone, it's been moved under .git/annex/objects. There is no way an add can delete a file, since all it does is rename it. It would be good for it to error unwind and move the file back though.

joey@gnu:~/tmp/a>touch 663879656b2e676966687474703a2f2f6d656469612e74756d626c722e636f6d2f74756d626c725f6c656673756557324c703171663879656b2e676966687474703a2f2f6d656469612e74756d626c722e636f6d2f74756d626c725f6c656673756557324c703171663879656b2e676966.gif
joey@gnu:~/tmp/a>git annex add *.gif
add 663879656b2e676966687474703a2f2f6d656469612e74756d626c722e636f6d2f74756d626c725f6c656673756557324c703171663879656b2e676966687474703a2f2f6d656469612e74756d626c722e636f6d2f74756d626c725f6c656673756557324c703171663879656b2e676966.gif failed
git-annex: /home/joey/tmp/a/.git/annex/tmp/8e2_6a4_WORM-s0-m1310069979--663879656b2e676966687474703a2f2f6d656469612e74756d626c722e636f6d2f74756d626c725f6c656673756557324c703171663879656b2e676966687474703a2f2f6d656469612e74756d626c722e636f6d2f74756d626c725f6c656673756557324c703171663879656b2e676966.gif.log: openBinaryFile: invalid argument (File name too long)
joey@gnu:~/tmp/a>touch 663879656b2e676966687474703a2f2f6d656469612e74756d626c722e636f6d2f74756d626c725f6c656673756557324c703171663879656b2e676966687474703a2f2f6d656469612e74756d626c722e636f6d2f74756d626c725f6c656673756557324c703171663879656b2e676966.gif
joey@gnu:~/tmp/a>git annex add *.gif
add 663879656b2e676966687474703a2f2f6d656469612e74756d626c722e636f6d2f74756d626c725f6c656673756557324c703171663879656b2e676966687474703a2f2f6d656469612e74756d626c722e636f6d2f74756d626c725f6c656673756557324c703171663879656b2e676966.gif failed
git-annex: /home/joey/tmp/a/.git/annex/tmp/8e2_6a4_WORM-s0-m1310069979--663879656b2e676966687474703a2f2f6d656469612e74756d626c722e636f6d2f74756d626c725f6c656673756557324c703171663879656b2e676966687474703a2f2f6d656469612e74756d626c722e636f6d2f74756d626c725f6c656673756557324c703171663879656b2e676966.gif.log: openBinaryFile: invalid argument (File name too long)
joey@gnu:~/tmp/a>find .git/annex/objects -type f
.git/annex/objects/Mk/92/WORM-s0-m1310069979--663879656b2e676966687474703a2f2f6d656469612e74756d626c722e636f6d2f74756d626c725f6c656673756557324c703171663879656b2e676966687474703a2f2f6d656469612e74756d626c722e636f6d2f74756d626c725f6c656673756557324c703171663879656b2e676966.gif/WORM-s0-m1310069979--663879656b2e676966687474703a2f2f6d656469612e74756d626c722e636f6d2f74756d626c725f6c656673756557324c703171663879656b2e676966687474703a2f2f6d656469612e74756d626c722e636f6d2f74756d626c725f6c656673756557324c703171663879656b2e676966.gif
Comment by joey Thu Jul 7 16:27:33 2011
What if your files have the same prefix and it happens to be 100 chars long? This can not be solved within WORM, but as Joey pointed out, SHA* exists.
Comment by Richard Fri Apr 8 18:02:41 2011

Alright, I have created a case-insensitive HFS+ filesystem here on my linux laptop.

I have not been able to trick git into staging the same file with 2 different capitalizations yet.

It might be helpful if you can send me a copy of a git repository where 'git add -i' shows the same file staged with two capitalizations. Leaving out .git/annex of course. (joey@kitenet.net; a tarball would probably work)

It seems that git add only started properly working on case-insensitive filesystems quite recently. The commit in question is 5e738ae820ec53c45895b029baa3a1f63e654b1b, "Support case folding for git add when core.ignorecase=true", which was first released in git 1.7.4, January 30, 2011. If you don't yet have that version, that could explain the problem entirely. In about half an hour (dialup!) I will have downloaded an older git and will see if I can reproduce the problem with it.
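
A quick way to check which side of 1.7.4 an installed git falls on, as a rough sketch. version_ge is a made-up helper, and it relies on "sort -V", which is a GNU coreutils extension and may not exist on every system:

```shell
#!/bin/sh
# version_ge A B: succeed if version string A is at least B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# "git --version" prints e.g. "git version 1.7.4"; field 3 is the number.
have=$(git --version 2>/dev/null | awk '{ print $3 }')
if [ -n "$have" ] && version_ge "$have" 1.7.4; then
    echo "git $have includes the case folding change for git add"
else
    echo "git ${have:-(not found)} predates 1.7.4 or is missing"
fi
```

The current setting can be read with "git config core.ignorecase" inside the repository.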

Comment by joey Thu Mar 31 14:02:42 2011
Just pulled the changes, it still fails to build. utimensat doesn't seem to exist on OSX 10.6.6.
Comment by Jimmy Wed Mar 16 12:49:18 2011

I'm running ghc 6.12.3 with the corresponding haskell-platform package from the HP site, which I installed in preference to the macports version of haskell-platform (it's quite old). It seems that when you install quickcheck, the version that gets installed is 2.4.0.1 and not 1.2.0, which git-annex depends on for its tests.

jtang@x00:~ $ cabal install quickcheck --reinstall               
Resolving dependencies...
Configuring QuickCheck-2.4.0.1...
Preprocessing library QuickCheck-2.4.0.1...

..
and so on..
..

it fails with this

[54 of 54] Compiling Main             ( test.hs, test.o )

test.hs:56:3:
    No instance for (QuickCheck-1.2.0.1:Test.QuickCheck.Arbitrary Char)
      arising from a use of `qctest' at test.hs:56:3-64
    Possible fix:
      add an instance declaration for
      (QuickCheck-1.2.0.1:Test.QuickCheck.Arbitrary Char)
    In the expression:
        qctest "prop_idempotent_deencode" Git.prop_idempotent_deencode
    In the first argument of `TestList', namely
        `[qctest "prop_idempotent_deencode" Git.prop_idempotent_deencode,
          qctest "prop_idempotent_fileKey" Locations.prop_idempotent_fileKey,
          qctest
            "prop_idempotent_key_read_show"
            BackendTypes.prop_idempotent_key_read_show,
          qctest
            "prop_idempotent_shellEscape" Utility.prop_idempotent_shellEscape,
          ....]'
    In the second argument of `($)', namely
        `TestList
           [qctest "prop_idempotent_deencode" Git.prop_idempotent_deencode,
            qctest "prop_idempotent_fileKey" Locations.prop_idempotent_fileKey,
            qctest
              "prop_idempotent_key_read_show"
              BackendTypes.prop_idempotent_key_read_show,
            qctest
              "prop_idempotent_shellEscape" Utility.prop_idempotent_shellEscape,
            ....]'

I'd imagine if I could downgrade, it would compile and pass the tests (I hope)

Comment by Jimmy Mon Feb 7 08:43:43 2011
One option would be to use the new sharebox, a FUSE filesystem for git-annex, which would hide the immutable file details from Calibre and proxy any changes it made through to git-annex as a series of git annex unlock; modify; git annex lock.
Comment by joey Thu Mar 31 15:32:25 2011
Pity. Mark as done/upstream (or similar) for house-keeping?
Comment by Richard Sun Apr 3 04:56:48 2011
I've managed to reproduce this and confirmed my fix works.
Comment by joey Tue Jul 5 14:37:21 2011

If you install the monads-fd package (with cabal install for instance), then you can no longer build git-annex:

./configure
  checking cp -a... yes
  checking cp -p... yes
  checking cp --reflink=auto... yes
  checking uuid generator... uuid
  checking xargs -0... yes
  checking rsync... yes
ghc -O2 -Wall --make git-annex

Annex.hs:22:7:
    Ambiguous module name `Control.Monad.State':
      it was found in multiple packages: monads-fd-0.2.0.0 mtl-2.0.1.0
make: *** [git-annex] Error 1
Comment by npouillard Mon Feb 7 10:12:43 2011

I'm leaving this bug open because this feature, however minor, is not available on OSX and BSD.

I have added a partial implementation using lutimes(3), which should be available on the BSDs. However, it's ifdefed out due to a casting problem: The TimeSpec uses a CTime, while lutimes uses a CLong. These data types may be internally the same on some or all platforms, so if you want this feature you can try changing the "ifdef 0" in Touch.hsc to 1 and try it, see if "git annex add" mirrors file modification time in created symlinks, and let me know.

Comment by joey Sun Mar 20 14:12:59 2011
mtime+100chars can still get collisions, and a lot more easily than even SHA1. This introduces more problems than it solves, imo.
Comment by Richard Sat Apr 9 19:45:28 2011

@seqq git-annex always uses the same case when creating and accessing the files pointed to by the symlinks. So it will not matter if it's used on a case-insensitive, or case-insensitive but preserving, system like OSX.

You need to fix up the cases of the files in .git/annex/objects to what it expects. I'm not sure what would be the best way to do that. The method described in recover data from lost+found might work well.

Comment by joey Fri Jun 10 12:46:03 2011

Keep in mind that lots of small files may have significant overhead, so a warning that it's not possible to make sure there's enough space would make sense for certain corner cases. Actually finding out the exact overhead is beyond git-annex's scope and, given transparent compression etc., beyond its ability, but a warning, optionally with a "do you want to continue" prompt, can't hurt.
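
To give a feel for the overhead, here is a back-of-the-envelope sketch that pads every file to a whole number of blocks before comparing against the free space. It assumes simple block-based allocation and ignores inodes, tail packing and compression, so it is an estimate at best; `padded_size` and `check_space` are made-up names, not anything git-annex provides.

```shell
# Round a file size up to a whole number of blocks (both in bytes).
padded_size() {
    size=$1
    block=$2
    echo $(( (size + block - 1) / block * block ))
}

# Hypothetical check: warn if COUNT files of SIZE bytes each, padded to
# BLOCK-byte blocks, would not fit into FREE bytes.
check_space() {
    count=$1
    size=$2
    block=$3
    free=$4
    needed=$(( count * $(padded_size "$size" "$block") ))
    if [ "$needed" -gt "$free" ]; then
        echo "warning: need about $needed bytes, only $free free" >&2
        return 1
    fi
}

# The free space in bytes could be taken from POSIX df, for example:
#   free=$(( $(df -Pk . | awk 'NR==2 {print $4}') * 1024 ))
```

With 4096-byte blocks, 1000 files of 100 bytes each hold 100000 bytes of data but are counted as 4096000 bytes here, which is the kind of gap a plain size check would miss.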

-- RichiH

Comment by Richard Tue Mar 15 10:11:27 2011

Yes, makes sense. I am so used to using --fast, I forgot a non-fast mode existed. I still think it would be a good idea to fall back to non-fast mode if --fast runs into an error from the remote, but as that is well beyond my abilities, how about this patch?

From 4855510c7a84eb5d28fdada429580a8a42b7112a Mon Sep 17 00:00:00 2001
From: Richard Hartmann <richih.mailinglist@gmail.com>
Date: Sun, 15 May 2011 22:20:42 +0200
Subject: [PATCH] Make error in RecvKey.hs suggest possible solution

---
 Command/RecvKey.hs |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Command/RecvKey.hs b/Command/RecvKey.hs
index 126608f..b917a1c 100644
--- a/Command/RecvKey.hs
+++ b/Command/RecvKey.hs
@@ -27,7 +27,7 @@ start :: CommandStartKey
 start key = do
    present <- inAnnex key
    when present $
-       error "key is already present in annex"
+       error "key is already present in annex. If you are running copy, try without '--fast'"

    ok <- getViaTmp key (liftIO . rsyncServerReceive)
    if ok
-- 
1.7.4.4
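
Until git-annex does something like that itself, the fallback can be approximated from the outside. The wrapper below is only a sketch: `copy_with_fallback` is a made-up name, and it assumes that "git annex copy" exits non-zero when a transfer fails.

```shell
# Hypothetical wrapper: try the quick path first, and retry without --fast
# if git-annex reports a failure.
# Usage: copy_with_fallback REMOTE [FILE...]
copy_with_fallback() {
    remote=$1
    shift
    git annex copy --fast --to "$remote" "$@" ||
        git annex copy --to "$remote" "$@"
}
```

The second run repeats the whole copy in non-fast mode, not just the files that failed, so it gives up the speed advantage of --fast whenever anything goes wrong.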
Comment by Richard Sun May 15 16:25:25 2011

I've also seen this apparent hang during upgrade to v3. A few more details:

The annex in question has just under 18k files (and hence that many log files), which can slow down directory operations when they're all in the same place (like, for example, .git/annex/journal).

git-annex uses virtually no CPU time and disk IO when it's hanging like this; the first time it happened, 'ps' showed three defunct git processes, with two "git-annex" processes and three "git" procs:

  • git --git-dir=/mnt/annex/.git --work-tree=/mnt/annex cat-file --batch
  • git --git-dir=/mnt/annex/.git --work-tree=/mnt/annex hash-object -w --stdin-paths
  • git --git-dir=/mnt/annex/.git --work-tree=/mnt/annex update-index -z --index-info

I Ctrl+C'd that and tried again, but it hung again -- this time without the defunct gits.

An strace of the process and its children at the time of hang can be found at http://pastebin.com/4kNh4zEJ . It showed somewhat weird behaviour: When I attached with strace, it would scroll through a whole bunch of syscalls making up the open-fstat-read-close-write loop on .git/annex/journal files, but then would block on a write (sorry, don't have that in my scrollback any more so can't give more details) until I Ctrl+C'd strace; when attaching again, it would again scroll through the syscalls for a second or so and then hang with no output.

Ultimately I detached/reattached with strace about two dozen times and that caused it (?) to finish the upgrade; not really sure how to explain it, but it seems like too much of a timing coincidence.

Comment by pavel Tue Jul 5 11:54:19 2011
Finally got a chance to try to reproduce this. I followed your recipe exactly in a clean squeeze chroot. monadIO was not installed, but git-annex built ok, using monad-control.
Comment by joey Wed Aug 17 00:56:30 2011

I use Debian Squeeze, with the Debian package cabal-install 0.8.0-1 installed.

$ git clone git://git-annex.branchable.com/
$ cd git-annex.branchable.com
$ cabal update
$ cabal install cabal-install

This installed: Cabal-1.10.2.0, zlib-0.5.3.1, cabal-install 0.10.2. No version of monad-control or monadIO installed.

$ ~/.cabal/bin/cabal install
Registering QuickCheck-2.4.1.1...
Registering Crypto-4.2.3...
Registering base-unicode-symbols-0.2.2.1...
Registering deepseq-1.1.0.2...
Registering hxt-charproperties-9.1.0...
Registering hxt-regex-xmlschema-9.0.0...
Registering hxt-unicode-9.0.1...
Registering hxt-9.1.2...
Registering stm-2.2.0.1...
Registering hS3-0.5.6...
Registering transformers-0.2.2.0...
Registering monad-control-0.2.0.1...
[1 of 1] Compiling Main             ( Setup.hs, dist/setup/Main.o )
Linking ./dist/setup/setup ...
ghc -O2 -Wall -ignore-package monads-fd -fspec-constr-count=5 --make configure
[1 of 2] Compiling TestConfig       ( TestConfig.hs, TestConfig.o )
[2 of 2] Compiling Main             ( configure.hs, configure.o )
Linking configure ...
./configure
  checking version... 3.20110720
  checking cp -a... yes
  checking cp -p... yes
  checking cp --reflink=auto... yes
  checking uuid generator... uuid
  checking xargs -0... yes
  checking rsync... yes
  checking curl... yes
  checking bup... yes
  checking gpg... yes
  checking sha1... sha1sum
  checking sha256... sha256sum
  checking sha512... sha512sum
  checking sha224... sha224sum
  checking sha384... sha384sum

...

Command/Add.hs:54:3:
    No instance for (Control.Monad.IO.Control.MonadControlIO
                       (Control.Monad.State.Lazy.StateT Annex.AnnexState IO))
      arising from a use of `handle' at Command/Add.hs:54:3-24
    Possible fix:
      add an instance declaration for
      (Control.Monad.IO.Control.MonadControlIO
         (Control.Monad.State.Lazy.StateT Annex.AnnexState IO))
    In the first argument of `($)', namely `handle (undo file key)'
    In a stmt of a 'do' expression:
          handle (undo file key) $ moveAnnex key file
    In the expression:
        do { handle (undo file key) $ moveAnnex key file;
             next $ cleanup file key }
cabal: Error: some packages failed to install:
git-annex-3.20110719 failed during the building phase. The exception was:
ExitFailure 1

After I added a dependency for monadIO to the git-annex.cabal file, it installed correctly.
-- Thomas

Comment by Thomas Mon Aug 8 05:04:20 2011