Re: [pacman-dev] Git's views versus parallel branches (was Re: [PATCH 1/2] Enabled ...)
On 24/02/2009, at 1:45 PM, Bryan Ischo wrote:
An analogy: using git is kind of like keeping every directory in your home directory in a separate tar file, except for one untarred working directory. Whenever you want to cd into a new directory, you have to tar up the directory you were just in, and then untar the directory you want to cd into.
Although I understand what you're getting at, the analogy is an exaggeration. The workflow you described requires archiving the directory, storing the archive somewhere, then removing the directory, and un-archiving a different archive, then changing to that directory. This is in contrast with a simple "git checkout branch" (possibly with a prior commit, or stash, operation). I.e., five operations as opposed to one or two.
An example of what I can do with parallel branches: if my branch 'new' and 'old' were stored in two separate subdirectories, then I could grep through all of the files in both branches with one command and see collated results; or could diff only files in which a given identifier appeared only in a file on one branch but not the corresponding file on another.
Given git's flexibility and UNIX philosophy, I'm sure it would be possible to create tools which did this. Most of my development thus far has been either adding features or modifying small things, so I haven't had a need for this. In general the changes were not conflicting between branches, and I did not need to compare my branches with an upstream master branch. Perhaps this is a workflow issue (doing too many things at once)?
Another problem with git: I have to constantly rebuild stuff when I stash and unstash because my build directory doesn't stash.
I don't see why anyone would want to frequently change directories and compile. When I switch to a branch I tend to work on it for a while before switching to another one.
I'd rather use a paradigm that thousands of tools already depend on, than the special case paradigm that is git.
I don't see how this would improve your workflow in any way whatsoever, but it is possible to simply keep multiple trees, with a specific branch checked out. For instance instead of having one git repository "foobar", you could have a project directory "foobar", and have two repositories "eggs" and "spam", cloned from the same "master" repository, but with the "eggs" and "spam" branches checked out, respectively. You could still do all the other operations with git, if you add the other repositories as remotes, but this adds unnecessary maintenance. I see Dan beat me to pointing this out already.
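A minimal sketch of that layout, assuming a throwaway repository stands in for the shared "master" repository. The names foobar, eggs and spam are just the hypothetical ones from the paragraph above, and the temporary directory and README file are made up for illustration:

```shell
#!/bin/sh
# Sketch only: one clone per branch, side by side in a project directory.
set -e
work=$(mktemp -d)
cd "$work"

# A stand-in for the shared repository, with two branches.
git init -q upstream
cd upstream
git config user.email "you@example.com"
git config user.name "Example"
echo "hello" > README
git add README
git commit -q -m "initial commit"
git branch eggs
git branch spam
cd ..

# One clone per branch, each with that branch checked out.
mkdir foobar
git clone -q -b eggs upstream foobar/eggs
git clone -q -b spam upstream foobar/spam

# Ordinary filesystem tools now see both branches at once.
eggs_branch=$(git -C foobar/eggs rev-parse --abbrev-ref HEAD)
spam_branch=$(git -C foobar/spam rev-parse --abbrev-ref HEAD)
grep "hello" foobar/eggs/README foobar/spam/README
```

Each clone has the stand-in repository as its "origin" remote, so changes can still be fetched and merged between the two trees, at the cost of the extra maintenance mentioned above.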
If anything, git seems messier to me because some files get changed in-place as you switch branches in git, and some files are ignored and left as they are (those that aren't actually tracked by git), and distinguishing between the two types requires git commands and lots of mental notes.
As I mentioned in my previous post, the files simply shouldn't be ignored, they should be added to a temporary commit before switching, otherwise you get confused. This is the downside of switching branches in-place. I don't really have a preference between a directory-based branch structure or an in-place structure. However, git's branches let you have a single tree (working directory), which seems simpler and cleaner.
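One way the temporary-commit approach could look, as a sketch rather than a recommended procedure. The repository, the branch names (topic, other) and the file names are all hypothetical:

```shell
#!/bin/sh
# Sketch only: park everything, including untracked files, in a temporary
# commit before switching branches, then undo that commit on return.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"
echo "base" > tracked.txt
git add tracked.txt
git commit -q -m "base"
git branch other
git checkout -q -b topic

# Half-finished work: one modified tracked file, one untracked file.
echo "half done" >> tracked.txt
echo "scratch" > notes.txt

# Stage everything (untracked files too) and record a temporary commit.
git add -A
git commit -q -m "WIP: temporary commit before switching"

# The other branch now shows only its own files.
git checkout -q other
files_on_other=$(ls)

# Come back and turn the temporary commit into working-tree changes again.
git checkout -q topic
git reset -q HEAD^
status=$(git status --porcelain)
echo "$status"
```

After the reset, the working tree on "topic" is back to where it was before the switch: tracked.txt is modified and notes.txt is untracked again.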
you can do a grep over multiple branches at once to find identifiers
git grep (not sure if it can actually grep over multiple branches)
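For what it's worth, git grep does accept one or more tree names after the pattern, so several branches can be searched in one command. A small sketch, using the branch names 'new' and 'old' from the quoted example and a made-up file a.c:

```shell
#!/bin/sh
# Sketch only: search two branches with a single git grep invocation.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

# Branch 'old' keeps the original identifier.
printf 'int old_name;\n' > a.c
git add a.c
git commit -q -m "base"
git branch old

# Branch 'new' changes it.
git checkout -q -b new
printf 'int new_name;\n' > a.c
git commit -q -a -m "rename identifier"

# Each hit is prefixed with the branch it was found in.
hits=$(git grep -n "_name" old new)
echo "$hits"
```

Nothing needs to be checked out for this to work; the search runs against the committed trees of both branches.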
, you can count line numbers for entire branches if you like to see how much bigger your code base is in one branch than another
git diff my_branch..other_branch --stat
you have to run many git commands
Not really. Git is highly scriptable, if you do something often, you can script the git commands to do it (if they don't already exist), and just run the script. This is also what you'd call the UNIX philosophy.
git seems to require keeping a mental model of branches that you can't even "see" because they aren't in your filesystem anywhere.
Indeed, git doesn't track files, it tracks file content. Branches are just labels for a sequence of changes. I'm sure this gives git many advantages over other SCMs, but I'm not familiar enough with the underlying implementation to give any insight.
Except that not only am I getting confused about the state of my branches as I commit partially complete changes to them for the purpose of saving state as I switch between branches, the tools that I use are getting confused as well.
This seems to be a consistent theme in your workflow - "I switch branches frequently". This is probably why you're having so much trouble with git, and I suggest you ask yourself _why_ you switch branches so frequently. It seems to me like you're treating branches more like commits - each branch is a single logical change to the tree. Even though branches are cheap, I still only have a few around, and rarely switch between them. They are there to separate related commits, and to provide isolation from other branches. It may be that you're a "hacker" and simply work on unrelated code arbitrarily. I do that a lot too, but I somehow don't have the same problems you're facing.
Do some google searches. File renaming in version control systems is a big deal, and for good reason.
Like this? ;) http://article.gmane.org/gmane.comp.version-control.git/217
Yes; it's called refactoring. On a well managed project it doesn't happen often, but it does happen. Reorganization of subtrees of code is something that source control systems should support well. It's the merging after such reorganization that tools that don't track renames have problems with. For examples of git failing in this, take a look at the simple scripts I sent out to the list earlier today.
I don't see how renames are relevant in this situation. In fact, rename information would probably cause more problems. When you refactor, you're moving content, not files. Typically you also change that content significantly. I don't see why git would have any problem with this - this is actually where git's content tracking shines.
Git doesn't allow you to rename and change contents of a file at the same time:
[snip]
If I 'git commit' it will take the rename of file to file2, but not the modification of file2. If I try to add file2 before committing, git status now shows:
[snip]
If you check this in, git will not be able to merge changes to file2 into a branch taken before this change.
I just tried a more complex example that I was sure would result in a merge conflict. I created "file" and committed it. Then I created the "move" branch, moved "file" to "file2", and edited it. At this point, `git status` showed something different to your output:

    $ git mv file file2
    $ vi file2
    $ git add file2
    $ git status
    # On branch move
    # Changes to be committed:
    #   (use "git reset HEAD <file>..." to unstage)
    #
    #       renamed:    file -> file2
    #
    $ git commit -m "changed file and renamed as file2"
    [move]: created 1fe9963: "changed file and renamed as file2"
     1 files changed, 6 insertions(+), 8 deletions(-)
     rename file => file2 (58%)

I committed the change, and created a branch "change" from master. I edited the same parts of the file and committed. Then I merged branch "move" into this branch. Naturally I get a conflict:

    $ git merge move
    Renaming file => file2
    Auto-merging file2
    CONFLICT (rename/modify): Merge conflict in file2
    Automatic merge failed; fix conflicts and then commit the result.
    bash-3.2$ git status
    file2: needs merge
    # On branch changes
    # Changes to be committed:
    #   (use "git reset HEAD <file>..." to unstage)
    #
    #       deleted:    file
    #
    # Changed but not updated:
    #   (use "git add <file>..." to update what will be committed)
    #   (use "git checkout -- <file>..." to discard changes in working directory)
    #
    #       unmerged:   file2
    #

Even though the file was moved and a significant amount of it was changed (42%), git managed to see that it was moved, and still merged "file2" from the "move" branch with "file" from the "change" branch. The conflict would have occurred regardless of the move operation. Perhaps in a less contrived example the result would be different, but it would be an edge case where the renamed file's content really doesn't resemble that of the original file.
My impression of git is that it feels very much like what its history suggests that it is: a tool for managing patches that grew into a source control system. For better or worse, git feels 'messy' to me, like it wasn't thought out ahead of time but kind of organically grew
I'm sure it was planned out quite well. Linus knew what he hated about other SCMs, he had some good ideas about how to improve those areas, and he did. Git does everything very well so far, and it's faster than any SCM I know about.
Git has dozens of commands, each with dozens of subtle and tricky options; that seems like needless complexity to me.
Git was made by a developer for developers. Of course the interface won't be nice and shiny. The difference between Git's interface and other SCMs' interfaces is that Git has its guts exposed. Fortunately there is nice porcelain now. I suggest that you discuss these problems with the people at #git. They seem friendly and they know a _lot_ about git. I'm sure they could either explain how to use git to accommodate your workflow, or perhaps expose "flaws" in your workflow. At the very least you will know if git is for you or not. Sorry for going so off-topic here. This isn't even about pacman anymore :P.
Sebastian Nowicki wrote:
[lots of good stuff]

Thanks for your excellent points. I won't take this discussion too much further, because I think that it's getting to the 'beating a dead horse' phase, but I do appreciate the time that you and others have taken to let me know how I might use git better.

I do agree with you that something is inherently different in the way that I am trying to use git than the way that it should be optimally used. I think that part of the problem is that I am used to commits being final, not something that you revisit and modify and coalesce (--squash) and rebase and what have you, so perhaps some of the methodologies I have become used to using need to be altered in an environment in which the natural way to do many things involves managing changes in these ways.

If I need more advice on git, I'll go to the official git project for help ...

Thanks,
Bryan