Consider two use cases:

1. Using a v6 repo with locked files on a crippled filesystem not supporting symlinks. For the files to be usable, they need to be unlocked. But, the user may not want to unlock the files everywhere, just on this one crippled system.
2. [[todo/hide_missing_files]]

Both of these could be met by making `git-annex sync` maintain an adjusted version of the original branch, eg `adjusted/master`.

There would be a filter function. For #1 above it would simply convert all annex symlinks to annex file pointers. For #2 above it would omit files whose content is not currently in the annex. Sometimes, both #1 and #2 would be wanted.

[Alternatively, it could stay on the master branch, and only adjust the work tree and index. See WORKTREE notes below for how this choice would play out.]

[[!toc]]

## merge

When merging changes from a remote, apply the filter to the head of the remote branch, resulting in a commit with its changes. Merge in that commit.

Note that it's possible to control the metadata of the commit such that 2 users who have the same adjusted branch checked out, both generate the same commit sha.

This would be done by `git annex merge` and `git annex sync`.

Since the adjusted/master branch is not present on the remote, if the user does a `git pull`, it won't merge in changes from origin/master. Which is good because the filter needs to be applied first.

[WORKTREE: `git pull` would update the work tree, and may lead to conflicts between the adjusted work tree and pulled changes. A post-merge hook would be needed to re-adjust the work tree, and there would be a window where eg, not present files would appear in the work tree.]

However, if the user does `git merge origin/master`, they'll get into a state where the filter has not been applied. The post-merge hook could be used to clean up after that. Or, let the user foot-shoot this way; they can always reset back once they notice the mistake.

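
The note earlier in this section about controlling commit metadata can be illustrated with a rough sketch. It relies only on the fact that git derives a commit's sha from the bytes of the commit object (tree, parents, author, committer, message). The `Ident` and `Commit` types and both functions below are hypothetical simplifications for illustration, not git-annex's real types. The idea is that the adjusted commit reuses the author, committer, dates and message of the remote commit it was filtered from, so nothing local (such as the current time or the local user) leaks into the object.

```haskell
-- Hypothetical, simplified model of a git commit (illustration only).
data Ident = Ident
    { identName :: String
    , identEmail :: String
    , identDate :: String -- seconds since epoch plus timezone, eg "1456789012 +0000"
    } deriving (Eq, Show)

data Commit = Commit
    { commitTree :: String      -- sha of the tree
    , commitParents :: [String] -- shas of the parent commits
    , commitAuthor :: Ident
    , commitCommitter :: Ident
    , commitMessage :: String
    } deriving (Eq, Show)

-- Derive the adjusted commit from the remote commit: only the tree and
-- the parents change; author, committer, dates and message are copied.
adjustedCommit :: String -> [String] -> Commit -> Commit
adjustedCommit filteredTree adjustedParents orig = orig
    { commitTree = filteredTree
    , commitParents = adjustedParents
    }

-- Serialize in the layout git uses for commit objects:
-- header lines, a blank line, then the message.
commitObject :: Commit -> String
commitObject c = unlines headers ++ "\n" ++ commitMessage c
  where
    headers =
        [ "tree " ++ commitTree c ]
        ++ map ("parent " ++) (commitParents c)
        ++ [ "author " ++ fmtIdent (commitAuthor c)
           , "committer " ++ fmtIdent (commitCommitter c)
           ]
    fmtIdent i = identName i ++ " <" ++ identEmail i ++ "> " ++ identDate i
```

Two checkouts that start from the same remote commit and arrive at the same filtered tree and parent would serialize identical bytes here, and so would get the same sha. With filter #2 the filtered tree presumably depends on which content is locally present, so the shas would only agree between checkouts holding the same content.
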
## annex object add/remove

When objects are added/removed from the annex, the associated file has to be looked up, and the filter applied to it. So, dropping a file with the missing file filter would cause it to be removed from the adjusted branch, and receiving a file's content would cause it to appear in the adjusted branch.

These changes would need to be committed to the adjusted branch, otherwise `git diff` would show them.

[WORKTREE: Simply adjust the work tree (and index) per the filter.]

## commit

When committing changes, a commit is made as usual to the adjusted branch. So, the user can `git commit` (or `git annex sync`). This does not touch the original branch yet.

Then we need to get from that commit to one with the filters reversed, which should be the same as if the adjusted branch had not been used. This commit gets added onto the original branch.

So, the branches would look like this:

    master           adjusted/master
    A ---filter----> A
    |                |
    |                A'
    |                |
    |                B'
    B <--rev filter- |
    |                B
    | ---filter----> |
    |                B''

Note particularly that B does not have A' in its history; the adjusted branch is not evident from outside. So, we need a way to detect commits like A'.

Also note that B gets merged back to the adjusted branch, re-applying the filter. This will make other checkouts that are in the same adjusted branch end up with the same B'' commit when they pull B.

It might be useful to have a post-commit hook that generates the reverse-filtered commit and updates the original branch. And/or `git-annex sync` could do it.

[WORKTREE: A pre-commit hook would be needed to update the staged changes, reversing the filter before the commit is made. All the other complications above are avoided.]

## reverse filtering

Reversing filter #1 would mean only converting pointer files to symlinks when the file was originally a symlink. This is problematic when a file is renamed.
Would it be ok, if foo is renamed to bar and bar is committed, for it to be committed as an unlocked file, even if foo was originally locked?

Reversing filter #2 would mean not deleting removed files whose content was not present. When the commit includes deletion of files that were removed due to their content not being present, those deletions are not propagated. When the user deletes an unlocked file, the content is still present in annex, so reversing the filter should propagate the file deletion.

## push

The new master branch can then be pushed out to remotes. The adjusted/master branch is not pushed to remotes.

`git-annex sync` should automatically push master when adjusted/master is checked out.

When push.default is "simple" (the new default), running `git push` when in adjusted/master won't push anything. It would with "matching". Pity. (I continue to feel git picked the wrong default here.) Users may find that surprising. Users of `git-annex sync` won't need to worry about it though.

[WORKTREE: push works as usual]

## acting on filtered-out files

If a file is filtered out due to not existing, there should be a way for `git annex get` to get it. Since the filtered out file is not in the index, that would not normally work. What to do?

Maybe instead of making a branch where the file is deleted, it would be better to delete it from the work tree, but keep the branch as-is. Then `git annex get` would see the file, as it's in the index. But, not maintaining an adjusted branch complicates other things. See WORKTREE notes throughout this page. Overall, the WORKTREE approach seems too problematic.

Ah, but we know that when filter #2 is in place, any file that `git annex get` could act on is not in the index. So, it could look at the master branch instead. (Same for `git annex move --from` and `git annex copy --from`)

OTOH, if filter #1 is in place and not #2, a file might be renamed in the index, and `git annex get $newname` should work.
So, it should look at the index in that case.

## problems

Using `git checkout` when in an adjusted branch is problematic, because a non-adjusted branch would then be checked out. But, we can just say, if you want to get into an adjusted branch, you have to run some command. Or, could make a post-checkout hook.

Tags are a bit of a problem. If the user tags an adjusted branch, the tag includes the local adjustments. [WORKTREE: not a problem]

If the user refers to commit shas (in, eg commit messages), those won't be visible to anyone else. [WORKTREE: not a problem]

When a pull modifies a file, its content won't be available, and so it would be hidden temporarily by filter #2. So the file would seem to vanish, and come back later, which could be confusing. Could be fixed as discussed in [[todo/deferred_update_mode]]. Arguably, it's just as confusing for the file to remain visible but have its content temporarily replaced with an annex pointer.

## integration with view branches

Entering a view from an adjusted branch should probably carry the filtering over into the creation/updating of the view branch.

Could go a step further, and implement view branches as another branch adjusting filter, albeit an extreme one. This might improve view branches. For example, it's not currently possible to update a view branch with changes fetched from a remote, and this could get us there.

[WORKTREE: Wouldn't be able to integrate, unless view branches are changed into adjusted view worktrees.]

## filter interface

Distilling all of the above, the filter interface needs to be something like this, at its most simple:

    data Filter = UnlockFilter | HideMissingFilter | UnlockHideMissingFilter

    getFilter :: Annex Filter

    setFilter :: Filter -> Annex ()

    data FilterAction
        = UnchangedFile FilePath
        | UnlockFile FilePath
        | HideFile FilePath

    data FileInfo = FileInfo
        { originalBranchFile :: FileStatus
        , isContentPresent :: Bool
        }

    data FileStatus = IsAnnexSymlink | IsAnnexPointer
        deriving (Eq)

    filterAction :: Filter -> FilePath -> FileInfo -> FilterAction
    filterAction UnlockFilter f fi
        | originalBranchFile fi == IsAnnexSymlink = UnlockFile f
    filterAction HideMissingFilter f fi
        | not (isContentPresent fi) = HideFile f
    filterAction UnlockHideMissingFilter f fi
        | not (isContentPresent fi) = HideFile f
        | otherwise = filterAction UnlockFilter f fi
    filterAction _ f _ = UnchangedFile f

    filteredCommit :: Filter -> Git.Commit -> Git.Commit

    -- Generate a version of the commit made on the filter branch
    -- with the filtering of modified files reversed.
    unfilteredCommit :: Filter -> Git.Commit -> Git.Commit

    isFilteredCommit :: Git.Commit -> Bool
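
To make the behaviour of `filterAction` concrete, here is a minimal self-contained sketch that applies it to a flat listing of files. The types and `filterAction` are copied from the interface above, with `Show` and `Eq` instances added. `adjustListing` is a hypothetical helper invented for this example, and a list of pairs is only a stand-in for a real git tree.

```haskell
import Data.Maybe (mapMaybe)

data Filter = UnlockFilter | HideMissingFilter | UnlockHideMissingFilter

data FilterAction
    = UnchangedFile FilePath
    | UnlockFile FilePath
    | HideFile FilePath
    deriving (Eq, Show)

data FileStatus = IsAnnexSymlink | IsAnnexPointer
    deriving (Eq, Show)

data FileInfo = FileInfo
    { originalBranchFile :: FileStatus
    , isContentPresent :: Bool
    }

-- Same definition as in the interface above. When a guard fails,
-- matching falls through to the final catch-all equation.
filterAction :: Filter -> FilePath -> FileInfo -> FilterAction
filterAction UnlockFilter f fi
    | originalBranchFile fi == IsAnnexSymlink = UnlockFile f
filterAction HideMissingFilter f fi
    | not (isContentPresent fi) = HideFile f
filterAction UnlockHideMissingFilter f fi
    | not (isContentPresent fi) = HideFile f
    | otherwise = filterAction UnlockFilter f fi
filterAction _ f _ = UnchangedFile f

-- Hypothetical helper: turn a listing of the original branch into the
-- listing the adjusted branch would have. Hidden files are dropped and
-- unlocked files become annex pointers.
adjustListing :: Filter -> [(FilePath, FileInfo)] -> [(FilePath, FileStatus)]
adjustListing filt = mapMaybe go
  where
    go (f, fi) = case filterAction filt f fi of
        UnchangedFile f' -> Just (f', originalBranchFile fi)
        UnlockFile f' -> Just (f', IsAnnexPointer)
        HideFile _ -> Nothing
```

For a listing with a present symlink `a`, a symlink `b` whose content is missing, and a present pointer file `c`, `adjustListing HideMissingFilter` keeps `a` and `c` as they are and drops `b`, while `adjustListing UnlockHideMissingFilter` drops `b` and turns `a` into a pointer.
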