[pacman-dev] Git's views versus parallel branches (was Re: [PATCH 1/2] Enabled ...)
Sebastian Nowicki
sebnow at gmail.com
Tue Feb 24 01:26:57 EST 2009
On 24/02/2009, at 1:45 PM, Bryan Ischo wrote:
> An analogy: using git is kind of like keeping every directory in
> your home directory in a separate tar file, except for one untarred
> working directory. Whenever you want to cd into a new directory,
> you have to tar up the directory you were just in, and then untar
> the directory you want to cd into.
Although I understand what you're getting at, the analogy is an
exaggeration. The workflow you described requires archiving the
directory, storing the archive somewhere, then removing the directory,
and un-archiving a different archive, then changing to that directory.
This is in contrast with a simple "git checkout branch" (possibly with
a prior commit, or stash, operation). I.e., five operations as opposed
to one or two.
> An example of what I can do with parallel branches: if my branch
> 'new' and 'old' were stored in two separate subdirectories, then I
> could grep through all of the files in both branches with one
> command and see collated results; or could diff only files in which
> a given identifier appeared only in a file on one branch but not the
> corresponding file on another.
Given git's flexibility and UNIX philosophy, I'm sure it would be
possible to create tools which did this. Most of my development thus
far has been either adding features, or modifying small things, so I
haven't had a need for this. In general the changes were not
conflicting between branches, and I did not need to compare my
branches with an upstream master branch. Perhaps this is a workflow
issue (doing too many things at once)?
> Another problem with git: I have to constantly rebuild stuff when I
> stash and unstash because my build directory doesn't stash.
I don't see why anyone would want to frequently change directories and
compile. When I switch to a branch I tend to work on it for a while
before switching to another one.
> I'd rather use a paradigm that thousands of tools already depend on,
> than the special case paradigm that is git.
I don't see how this would improve your workflow in any way
whatsoever, but it is possible to simply keep multiple trees, with a
specific branch checked out. For instance instead of having one git
repository "foobar", you could have a project directory "foobar", and
have two repositories "eggs" and "spam", cloned from the same "master"
repository, but with the "eggs" and "spam" branches checked out,
respectively. You could still do all the other operations with git, if
you add the other repositories as remotes, but this adds unnecessary
maintenance. I see Dan beat me to pointing this out already.
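To make the layout concrete, here is a minimal sketch of the
one-clone-per-branch setup. It assumes a throwaway "upstream"
repository standing in for the shared one; every name in it (upstream,
foobar, eggs, spam, menu.txt) is made up for illustration.

```shell
# All names below (upstream, foobar, eggs, spam, menu.txt) are hypothetical.
work=$(mktemp -d)
cd "$work"

# Stand-in for the shared "master" repository, with two branches.
git init -q upstream
cd upstream
git config user.name "Example"
git config user.email "example@example.invalid"
echo "base" > menu.txt
git add menu.txt
git commit -q -m "base"
git branch eggs                 # eggs stays at the base commit
git checkout -q -b spam         # spam gets one extra commit
echo "spam" >> menu.txt
git commit -q -a -m "add spam"
cd ..

# One project directory, one clone per branch.
mkdir foobar
git clone -q -b eggs upstream foobar/eggs
git clone -q -b spam upstream foobar/spam

# Ordinary tools now see both branches side by side.
grep -c spam foobar/eggs/menu.txt foobar/spam/menu.txt
```

Each clone carries its own copy of the history, which is the extra
maintenance I mean: changes have to be pushed or fetched between the
clones before the other tree sees them.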
> If anything, git seems messier to me because some files get changed
> in-place as you switch branches in git, and some files are ignored
> and left as they are (those that aren't actually tracked by git),
> and distinguishing between the two types requires git commands and
> lots of mental notes.
As I mentioned in my previous post, the files simply shouldn't be
ignored; they should be added to a temporary commit before switching,
otherwise you get confused. This is the downside of switching branches
in-place. I don't really have a preference between a directory-based
branch structure or an in-place structure. However, git's branches let
you have a single tree (working directory), which seems simpler and
cleaner.
> you can do a grep over multiple branches at once to find identifiers
git grep can do this: it accepts several branches (or any other
tree-ish) after the pattern, e.g. "git grep identifier new old".
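A minimal sketch of searching two branches with a single git grep
invocation. It builds a throwaway repository first; the branch names
"old" and "new", the file api.h and the identifier frobnicate are all
assumptions made up for the example.

```shell
# Throwaway repository; "old", "new", api.h and frobnicate are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/old   # name the first branch "old"
git config user.name "Example"
git config user.email "example@example.invalid"

echo 'int frobnicate(void);' > api.h
git add api.h
git commit -q -m "old api"

git checkout -q -b new
echo 'int frobnicate2(void);' >> api.h
git commit -q -a -m "new api"

# Search both branches at once. Each hit is prefixed with
# "<branch>:<path>:<line number>:" so the results are easy to collate.
result=$(git grep -n frobnicate old new)
echo "$result"
```

Because the branch name is part of every output line, the result can
be piped through sort, cut or awk like any other grep output.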
> , you can count line numbers for entire branches if you like to see
> how much bigger your code base is in one branch than another
git diff my_branch..other_branch --stat
> you have to run many git commands
Not really. Git is highly scriptable, if you do something often, you
can script the git commands to do it (if they don't already exist),
and just run the script. This is also what you'd call the UNIX
philosophy.
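As an illustration of that kind of scripting, here is a rough sketch
of a helper that counts the lines in every file on a branch, so two
branches can be compared by size. The function name branch_lines and
the demo repository are my own inventions, and the helper assumes
plain file names (git quotes unusual ones in ls-tree output).

```shell
# branch_lines BRANCH: print the total number of lines in all files on BRANCH.
# Hypothetical helper; paths with unusual characters are not handled.
branch_lines() {
    git ls-tree -r --name-only "$1" | while IFS= read -r path; do
        git show "$1:$path"
    done | wc -l
}

# Throwaway repository to try it on; all names here are made up.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/old   # name the first branch "old"
git config user.name "Example"
git config user.email "example@example.invalid"

echo "one" > a.txt
git add a.txt
git commit -q -m "one file"

git checkout -q -b new
printf 'two\nthree\n' > b.txt
git add b.txt
git commit -q -m "second file"

echo "old: $(branch_lines old) lines"
echo "new: $(branch_lines new) lines"
```

The helper reads file contents straight from the object database, so
it works without checking out either branch.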
> git seems to require keeping a mental model of branches that you
> can't even "see" because they aren't in your filesystem anywhere.
Indeed, git doesn't track files, it tracks file content. Branches are
just labels for a sequence of changes. I'm sure this gives git many
advantages over other SCMs, but I'm not familiar enough with the
underlying implementation to give any insight.
> Except that not only am I getting confused about the state of my
> branches as I commit partially complete changes to them for the
> purpose of saving state as I switch between branches, the tools that
> I use are getting confused as well.
This seems to be a consistent theme in your workflow - "I switch
branches frequently". This is probably why you're having so much
trouble with git, and I suggest you ask yourself _why_ you switch
branches so frequently. It seems to me like you're treating branches
more like commits - each branch is a single logical change to the
tree. Even though branches are cheap, I still only have a few around,
and rarely switch between them. They are there to separate related
commits, and to provide isolation from other branches. It may be that
you're a "hacker" and simply work on unrelated code arbitrarily. I do
that a lot too, but I somehow don't have the same problems you're
facing.
> Do some google searches. File renaming in version control systems
> is a big deal, and for good reason.
Like this? ;)
http://article.gmane.org/gmane.comp.version-control.git/217
> Yes; it's called refactoring. On a well managed project it doesn't
> happen often, but it does happen. Reorganization of subtrees of
> code is something that source control systems should support well.
> It's the merging after such reorganization that tools that don't
> track renames have problems with. For examples of git failing in
> this, take a look at the simple scripts I sent out to the list
> earlier today.
I don't see how renames are relevant in this situation. In fact,
rename information would probably cause more problems. When you
refactor, you're moving content, not files. Typically you also change
that content significantly. I don't see why git would have any problem
with this - this is actually where git's content tracking shines.
> Git doesn't allow you to rename and change contents of a file at the
> same time:
>
> [snip]
>
> If I 'git commit' it will take the rename of file to file2, but not
> the modification of file2. If I try to add file2 before committing,
> git status now shows:
>
> [snip]
>
> If you check this in, git will not be able to merge changes to file2
> into a branch taken before this change.
I just tried a more complex example that I was sure would result in
a merge conflict. I created "file" and committed it. Then I created
the "move" branch, moved "file" to "file2", and edited it. At this
point, `git status` showed something different from your output:
$ git mv file file2
$ vi file2
$ git add file2
$ git status
# On branch move
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# renamed: file -> file2
#
$ git commit -m "changed file and renamed as file2"
[move]: created 1fe9963: "changed file and renamed as file2"
1 files changed, 6 insertions(+), 8 deletions(-)
rename file => file2 (58%)
I committed the change, and created a branch "change" from master. I
edited the same parts of the file and committed. Then I merged branch
"move" into this branch. Naturally I got a conflict:
$ git merge move
Renaming file => file2
Auto-merging file2
CONFLICT (rename/modify): Merge conflict in file2
Automatic merge failed; fix conflicts and then commit the result.
bash-3.2$ git status
file2: needs merge
# On branch changes
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# deleted: file
#
# Changed but not updated:
# (use "git add <file>..." to update what will be committed)
# (use "git checkout -- <file>..." to discard changes in working directory)
#
# unmerged: file2
#
Even when the file was moved, and a significant amount of it was
changed (42%), git managed to see that it was moved, and still merged
"file2" from the "move" branch with "file" from the "change" branch.
The conflict would have occurred regardless of the move operation.
Perhaps in a less contrived example the result would be different, but
it would be an edge case where the renamed file's content really
doesn't resemble that of the original file.
> My impression of git is that it feels very much like what its
> history suggests that it is: a tool for managing patches that grew
> into a source control system. For better or worse, git feels
> 'messy' to me, like it wasn't thought out ahead of time but kind of
> organically grew
I'm sure it was planned out quite well. Linus knew what he hated about
other SCMs, he had some good ideas about how to improve those areas,
and he did. Git does everything very well so far, and it's faster than
any SCM I know about.
> Git has dozens of commands, each with dozens of subtle and tricky
> options; that seems like needless complexity to me.
Git was made by a developer for developers. Of course the interface
won't be nice and shiny. The difference between Git's interface and
other SCMs' interfaces is that Git has its guts exposed. Fortunately
there is nice porcelain now.
I suggest that you discuss these problems with the people at #git.
They seem friendly and they know a _lot_ about git. I'm sure they
could either explain how to use git to accommodate your workflow, or
perhaps expose "flaws" in your workflow. At the very least you will
know if git is for you or not.
Sorry for going so off-topic here. This isn't even about pacman
anymore :P.