Thank you for your excellent comments, Sebastian. I hope I can do them justice with my responses, inlined below ...

Sebastian Nowicki wrote:
On 24/02/2009, at 6:10 AM, Bryan Ischo wrote:
It's not the ability to modify a single file in two different places at once. It's the ability to keep changes logically separated by directory, in a persistent manner that doesn't require git commands to put changes away and bring them back, that I care about. I find it infinitely easier to keep track of what I am doing by persistently retaining directory contents than by having a single working view and everything else being stashed away to be retrieved later.
I apologise if I'm missing something, but what's the difference between "cd ../branch" and "git checkout branch"? The state of your "working directory" is changed. You still have only one view of the directory. You can obviously view files in the other directory, but that's also possible with git, albeit somewhat harder if using a file manager. There are many GUI front-ends that allow you to quickly look at other branches and commits. What's the difference between "ls project-root" and "git branch"? Both list the "branches", both allow you to switch to that branch (cd, git checkout). I really don't see the difference, besides a clean project directory, in git's case.
The difference is subtle. Obviously you can work in both ways, and apparently, many people on this list like using git commands instead of 'normal' filesystem navigation for visiting files in their branches. I can think of a few ways to expand upon my thoughts about the 'git way' versus the 'Perforce way':

An analogy: using git is kind of like keeping every directory in your home directory in a separate tar file, except for one untarred working directory. Whenever you want to cd into a new directory, you have to tar up the directory you were just in, and then untar the directory you want to cd into. Only the current working directory (and everything under it) can be untarred; everything else has to be tarred up if you're not in it. Would you like to maintain your home directory this way? It sounds like a major pain to me. Having to issue tar and untar commands constantly while you are working with files in your home directory may not sound all that bad, but in practice it would be much less convenient than if all of your files were untarred all the time and you could just look through them without having to manage the tar files. I suppose if someone has only ever used a system where they had to constantly tar up and untar directories, they wouldn't think anything of it (and would think that a command like 'cd-stash' which tars up your cwd and untars some other tarred directory and cd's into it would be really cool), but if you've had the 'freedom' of just working with your files without such encumbrances, you'll really hate having to do it.

An example of what I can do with parallel branches: if my branches 'new' and 'old' were stored in two separate subdirectories, then I could grep through all of the files in both branches with one command and see collated results; or I could diff only those files in which a given identifier appears on one branch but not in the corresponding file on the other.
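A minimal sketch of that grep-and-diff workflow, assuming two branch checkouts sitting side by side as plain directories. The directory names 'old' and 'new', the file names, and the identifier 'frob2' are all invented for illustration; nothing here needs a version control command.

```shell
#!/bin/sh
# Hypothetical layout: two branch checkouts side by side under one root.
root=$(mktemp -d)
mkdir -p "$root/old/src" "$root/new/src"
printf 'int frob(void);\n' > "$root/old/src/api.h"
printf 'int frob(void);\nint frob2(void);\n' > "$root/new/src/api.h"
printf 'call frob\n' > "$root/old/src/main.c"
printf 'call frob2\n' > "$root/new/src/main.c"

cd "$root" || exit 1

# One grep over both branches; every hit is prefixed with its branch directory.
hits=$(grep -rn 'frob2' old new)
printf '%s\n' "$hits"

# Per branch, list the files (relative to the branch root) that mention the identifier.
(cd old && grep -rl 'frob2' . | sort) > old.hits
(cd new && grep -rl 'frob2' . | sort) > new.hits

# comm -13 keeps lines found only in the second list:
# files that mention the identifier on 'new' but not on 'old'.
only_new=$(comm -13 old.hits new.hits)

# Diff just those files against their counterparts on the other branch.
for f in $only_new; do
    diff -u "old/$f" "new/$f" || true   # diff exits 1 when the files differ
done
```

The only state involved is the directory tree itself, so the same approach works with any tool that takes paths.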
Another example is the compiled results of each tree, as I mentioned before, which I can see and compare in place if my branches are in different subtrees, but which requires extra copying around of files and other management if I am using git. Another problem with git: I have to constantly rebuild stuff when I stash and unstash because my build directory doesn't stash.

It's interesting that you note that viewing files in other branches with git is harder if you are using a file manager. That's exactly the point I am trying to make: file managers and other tools (scripting languages, diffing tools, text searching and processing tools, etc) all work based on the standard Unix paradigm of "everything is a file". Git works on a paradigm of "everything in your current branch is a file, everything else is accessible only as the output of a complex git command". I'd rather use a paradigm that thousands of tools already depend on, than the special case paradigm that is git.

A few more points: I don't think that "many GUI front-ends" being available to help me manage my branches is better than a system that doesn't need any GUI front-ends to make the process palatable. And, I'm not sure why what git does is any 'cleaner' than keeping branches in separate directories. If anything, git seems messier to me because some files get changed in-place as you switch branches in git, and some files are ignored and left as they are (those that aren't actually tracked by git), and distinguishing between the two types requires git commands and lots of mental notes.
Parallel branch directories have an advantage over git's branch views whenever you need to compare the contents of branches.
False. As mentioned earlier there are GUI tools which make this simple. If you don't like GUIs, you can use the command line equivalents (most tools execute git commands anyway). I don't know what these are since I've never had the need to compare two branches beyond `git diff`.
Well, I think that the fact that you have to qualify your statement by saying that it's easy if you use special GUIs, and otherwise doable with command line equivalents, exactly makes my point. There are many more ways to compare branches than just 'git diff'. You can compare the result of building both branches, you can do a grep over multiple branches at once to find identifiers, you can count line numbers for entire branches if you like to see how much bigger your code base is in one branch than another ... these things can all be done with git too, but you have to run many git commands to get the views of the branches that you need when you need them, whereas if they all live in separate subdirectories, there are no commands to run at all to get the files ... they're just there.
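For concreteness, a few of the git commands that give a view of a branch that is not checked out. This is only a sketch on a throwaway repository: the branch names 'old' and 'new', the file main.c, and the scratch directory are assumptions made up for the example.

```shell
#!/bin/sh
# Throwaway repository with two hypothetical branches, 'old' and 'new'.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

printf 'call frob\n' > main.c
git add main.c
git commit -q -m "old version"
git branch -m old              # name the first branch 'old' whatever the default was

git checkout -q -b new
printf 'call frob2\n' > main.c
git commit -q -a -m "new version"

# Search both branches at once: git grep accepts revisions after the pattern.
hits=$(git grep -n 'frob' old new --)
printf '%s\n' "$hits"

# Read a single file from a branch that is not checked out.
old_main=$(git show old:main.c)

# Materialise a whole branch as ordinary files in a scratch directory,
# so that path-based tools can be pointed at it.
scratch=$(mktemp -d)
git archive old | tar -xf - -C "$scratch"
```

Each view has to be requested by command; the exported copy in the scratch directory is a snapshot and does not follow later commits on the branch.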
Maybe it's because I'm an emacs [...] [and] keeping track of [...] what sequence of [...] commands I need [...] is just more mental effort than I want to undertake.
*cheap shot alert* You use emacs, yet remembering commands is too much of an effort? I know I twisted your words a lot, and I'm not hating on emacs, but you have to admit that that is somewhat hypocritical.
It wasn't meant to be hypocritical. I was trying to allude to the fact that when using vi, you have to remember much more state about what you are doing (what mode am I in? insert mode? delete mode? what file am I working on? what line am I on? etc etc) than with emacs; I think this is one of the fundamental differences between vi and emacs. I could be wrong though, I haven't used vi extensively, just enough to make minor edits to files on the way to getting emacs installed :) But assuming that this is true, then it was just the fact that vi users are used to keeping more state about what they are doing in their head that makes git seem natural. Perhaps I should have said 'ed' instead of 'vi' ... Note that I'm not talking about remembering what commands do what (certainly emacs has tons of commands to remember), I'm talking about keeping track of working state as you are using the tool. git seems to require keeping a mental model of branches that you can't even "see" because they aren't in your filesystem anywhere.
I find it so much easier to just leave a branch subdirectory and when I return to it later, it is guaranteed to be exactly as it was when I left, without any effort on my part. If I am working on 4 or 5 bugs in parallel (which I have certainly done at work, where working on just 1 or 2 bugs at once would be inefficient because of the downtime associated with building each tree) I can't even imagine using git stashes to sanely keep track of everything.
This is exactly what branches are for. The exact same thing can be said for git branches. It's guaranteed to be exactly as it was when you switched to another branch. `git stash` should only be used when something is not ready to be committed, but you _urgently_ need to do something else, like a bug fix on the maintenance branch.
Except that not only am I getting confused about the state of my branches as I commit partially complete changes to them for the purpose of saving state as I switch between branches, the tools that I use are getting confused as well. I may have editors and other tools open for files whose contents suddenly change when I git checkout to a different branch. For many tools, this is not a big deal, but I think it illustrates the subtle problems that such an approach introduces. And since I encapsulate part of the state of "what I'm doing" in the state of the tools that I am using, confused state in those tools can often confuse me as well. With git, I can't switch to another branch unless I either a) commit the changes (which I may not be ready to commit yet), or b) stash the changes. Committing or stashing take extra work on my part. Why should I have to do this work?
- Lack of rename tracking. Yeah, I know, git claims that it can do it after the fact when examining change histories but I've tried various scenarios and it just doesn't work very well, and even when it does, requires stupidly complex options to git commands to enable git to discover renames in the history correctly
I can't think of a situation where the file name is relevant. Even when renaming...
Do some google searches. File renaming in version control systems is a big deal, and for good reason.
The problem comes when someone, in a branch, renames a file, and then tries to merge their changes into another branch in which the file was not renamed.
This would only be a problem if the file was not only renamed, but also _changed_, and significantly at that. In this case git would only see that, say, 60% of the file content was moved. I'm not sure how merging would work, since I have never worked on a branch when a file was moved (and changed) in another.
Yes; it's called refactoring. On a well managed project it doesn't happen often, but it does happen. Reorganization of subtrees of code is something that source control systems should support well. It's the merging after such reorganization that tools that don't track renames have problems with. For examples of git failing in this, take a look at the simple scripts I sent out to the list earlier today.
Unless file renames are tracked, the merge becomes very difficult.
Not at all. If git sees that the file content was _moved_ (not changed), it should be able to figure that out easily. Again, I haven't actually done this, but I don't see why it wouldn't work. I would suggest asking about this on #git (or the git ML). If it is indeed a problem then file a bug. I'm sure Linus would be happy to comment on it ;).
It's not just moving. It's moving and changing that git has a problem with.
Refactoring a subsystem on a 'workbranch' is something that is done sometimes on large projects, and with git, I would expect that to be basically impossible to do sanely. Even if git's 'detect renames while examining history' technique did work, it still makes renames cumbersome, because you can't rename a file and change its contents at the same time or else git has almost no chance of detecting the rename via history. And if you can't change a file and rename it at the same time, then you can't, for example, properly rename a Java class, because the class name and file name have to be the same.
Why not? If you change the file contents and rename it, then obviously you'd also change the class name. Why else would you rename it?
Git doesn't allow you to rename and change contents of a file at the same time:

    $ git mv file file2
    $ echo "changed file" > file2
    $ git status
    # On branch master
    # Changes to be committed:
    #   (use "git reset HEAD <file>..." to unstage)
    #
    #       renamed:    file -> file2
    #
    # Changed but not updated:
    #   (use "git add <file>..." to update what will be committed)
    #   (use "git checkout -- <file>..." to discard changes in working directory)
    #
    #       modified:   file2
    #

If I 'git commit' it will take the rename of file to file2, but not the modification of file2. If I try to add file2 before committing, git status now shows:

    # On branch master
    # Changes to be committed:
    #   (use "git reset HEAD <file>..." to unstage)
    #
    #       deleted:    file
    #       new file:   file2

If you check this in, git will not be able to merge changes to file2 into a branch taken before this change.
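A reproducible version of this experiment, as a sketch only. Git does not record renames; with -M, git diff infers them from content similarity, so whether the staged change is reported as a rename depends on how much of the file survives the edit. The 20-line file and the names file2 and file3 are invented for illustration.

```shell
#!/bin/sh
# Throwaway repository; file names and contents are invented for illustration.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

# A 20-line file gives the similarity heuristic something to measure.
i=1
while [ "$i" -le 20 ]; do
    printf 'line %s\n' "$i"
    i=$((i + 1))
done > file
git add file
git commit -q -m "add file"

# Case 1: rename and replace nearly all of the content before committing.
git mv file file2
printf 'changed file\n' > file2
git add file2
big=$(git diff --cached -M --name-status)
printf 'large change:\n%s\n' "$big"
git reset -q --hard HEAD       # discard the staged rename and start over

# Case 2: rename and change a single line.
git mv file file3
sed 's/^line 7$/line seven/' file3 > file3.tmp && mv file3.tmp file3
git add file3
small=$(git diff --cached -M --name-status)
printf 'small change:\n%s\n' "$small"
```

In the first case the staged change shows up as a delete plus an add, matching the status output above; in the second it is reported with an R (rename) status. Merge behaviour across such a change is a separate question that this sketch does not exercise.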
It just shouldn't be that hard.
Why not? I can't imagine other SCMs doing this any better. If a file contents changes drastically, it doesn't matter if the name of the file is tracked. The name of the file is irrelevant. A merge conflict would arise even if the file was never renamed.
Perforce does it better. It is certainly possible to:

* Rename bunches of files on a branch as part of a code refactoring effort and change parts of those files to match (such as class names, etc)
* Make bug fixes to the original files in a different branch
* Merge those changes together on either branch in a way that makes sense and doesn't produce conflicts (assuming that the individual changes were not conflicting, which is often the case when one branch is doing minor bugfixes and the other is doing more structural work)
I don't mean to contradict everything you say, it's just that I haven't had the same experience with git as you. Using git has been amazing. It does everything I want, it's sophisticated, it merges code well, and it has some very powerful features (like rebase).
I'm glad you like git so much; a lot of people do. I'm not saying I don't like git, I'm just saying that there are a few things that I think suck about git. That's how this discussion got started. But a lot of people defend git with great vigor if anything critical is said of it, and I don't understand the fervor. My impression of git is that it feels very much like what its history suggests that it is: a tool for managing patches that grew into a source control system. For better or worse, git feels 'messy' to me, like it wasn't thought out ahead of time but kind of organically grew as people realized that certain basic features could be twisted this way or that way to add the equivalent of standard source control functionality. Git has dozens of commands, each with dozens of subtle and tricky options; that seems like needless complexity to me. That's just my impression, take it for what it's worth, which is not much.
By the way, Mercurial seems faster than Bazaar (though I haven't used either much), and both are written in Python. Mercurial might not be pure Python though, I am unsure.
I really like what I read on the bazaar web pages; it feels more coherently designed than git, and much simpler to use. But it worries me that it's had significant performance problems on larger projects.

Thanks,
Bryan