(Hi, this is paulproteus@debian, AKA Asheesh).

I've been enjoying using git-annex to archive my data. It's great that, by using git-annex and the SHA1 backend, I get a space-saving kind of deduplication through the symbolic links.

I'm looking for the ability to filter files before they get added to the annex, so that I don't add new files whose content is already in the annex. That would help me in terms of personal file organization.

It seems there is no such feature, so this is a wishlist bug filed in the hope that such a thing might exist someday.

What I would really like to do is:

* $ git annex add --no-add-if-already-present .
* $ git commit -m "Slurping in some photos I found on my old laptop hard drive"

And then I'd do something like:

* $ git clean -f

to remove the files that didn't get annexed in this run. That way, only one filename would ever point to a particular SHA1.

I want this because I have copies of various files of mine (photos, in particular) scattered across various hard disks. If this feature existed, I could comfortably toss them all into one git annex that grew, bit by bit, to store all of these files exactly once.

(I would be even happier with "git annex add --unlink-duplicates .")

(Another way to do this would be to "git annex add" them all, and then use a "git annex remove-duplicates" that could prompt me about which files are duplicates of each other; I could then pipe that command's output into xargs git rm.)

(As I write this, I realize it's possible to get the same result by parsing the destination of each symlink.)
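
The symlink-parsing idea in that closing remark could be sketched roughly as follows. This is only an illustration, not an existing git-annex feature: the function names `find_duplicate_links` and `redundant_paths` are hypothetical, and the sketch assumes that every annexed file is a symlink whose target lies under `annex/objects/` and whose final path component is derived from the content checksum, so that two links with the same final component hold the same content.

```python
import os
from collections import defaultdict


def find_duplicate_links(root):
    """Group annex-style symlinks under `root` by the last component of
    their target and return only the groups with more than one path.

    Assumption: the target of an annexed file contains "annex/objects/"
    and ends in a name derived from the content checksum.
    """
    by_key = defaultdict(list)
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into git's own metadata directory.
        if ".git" in dirnames:
            dirnames.remove(".git")
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.readlink(path).replace(os.sep, "/")
            if "annex/objects/" not in target:
                continue  # some other symlink, not annexed content
            by_key[os.path.basename(target)].append(path)
    return {key: sorted(paths) for key, paths in by_key.items() if len(paths) > 1}


def redundant_paths(groups):
    """Return every path except the first (in sorted order) of each group."""
    extras = []
    for key in sorted(groups):
        extras.extend(groups[key][1:])
    return extras
```

Printing the result of `redundant_paths(find_duplicate_links("."))` one path per line would give a list of the extra names, which could be reviewed by hand and then fed to something like xargs git rm. Which name counts as "the first" is an arbitrary choice here (sorted order); a real tool would probably want to ask.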