deborah: The management regrets that it was unable to find a Gnomic Utterance that was suitably irrelevant. (gnomic)
deborah ([personal profile] deborah) wrote in [site community profile] dw_dev 2012-04-29 02:40 pm

Git, Mercurial, github, bitbucket

I want to spin off a new post from the log of last night's IRC developer meeting. The topic of GitHub came up in the meeting, and some concerns with that idea have been raised in the comments of the previous post. [personal profile] vlion's concerns largely address the difference between mercurial and git, whereas [personal profile] kareila's concerns also address that difference but touch incidentally on the hypothetical benefit of working in the more public environment of GitHub.

I was talking to [personal profile] allen and he pointed out that there are really two different issues in play here, because we can go to a shared, public, relatively popular, FLOSS-friendly environment without ever leaving mercurial, namely, Bitbucket.

I'd actually say there are three questions:
  1. Are there benefits to git over mercurial, and if so, are those benefits enough to outweigh the cost of switching to a new source control system?
  2. Would we like to move our source control management to a public, shared, FLOSS-friendly environment? If so, why? Do we think it would be more friendly to our current developers, do we think it would make it easier to bring in new developers, some combination of the two, or something else?
  3. If we want to move to a shared environment, do we feel that there is a strong reason that it should be GitHub? What are those reasons, if so? If we think git is worse than mercurial, but we do think there's a benefit to moving to GitHub, which reason should prevail?


Actually, we should probably add a fourth question, which is "would any of our needs be better served by using mercurial more in the fashion for which it was intended?"

Keep in mind when I write these questions that I use github for other projects and like it, and I have never used mercurial intensely enough to have strong feelings about it either way. Personally I fell in love with Perforce at an early date and find all other version control systems to be its pale yet free imitations. But I do think that if we make a switch like this, these are the questions we need to answer.
mark: A photo of Mark kneeling on top of the Taal Volcano in the Philippines. It was a long hike. (Default)

[staff profile] mark 2012-05-06 07:57 pm (UTC)(link)
That link you gave illustrates one of the most annoying things about hg for me: merge, merge, merge! It's one of the real powers of git that it can do rebasing, which makes for a much, much cleaner history.

So -- let's say someone is doing lots of development in their branch foobar and later they want to bring in changes from master... well, in hg and such, you can merge master in to foobar and that's great, but then you have this commit showing that you merged changes in, and it's kind of messy.

Let's imagine commits like this:

master: A, B, C.

At that point you create your branch and start doing some work:

foobar: A, B, C, D, E.

But meanwhile, people are still committing to master:

master: A, B, C, F.

Now you want to merge that change, F, into your foobar branch. In hg and similar systems, you can do it by creating commit G so it looks like this:

foobar: A, B, C, D, E, G.

But in reality, G doesn't contain any of your code. You end up with this branch showing that you made some changes, then you merged in a bunch of stuff. Diffs can get annoying now.

With git, you can instead do a rebase. In essence, this takes the changes that you have made and puts them on top of the other changes. This ends up with your branch looking like this:

foobar: A, B, C, F, D, E.

Now you have a perfect repository with a perfect history with no messy merge commits. It's beautiful, easy to read, and exactly what you want. At the end of the day, nobody cares that you have worked on this for a month and merged things back and forth -- that's messy. What we care about is that you took the repository and you made your commit on top of it.
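
For anyone who wants to see the two histories side by side, here is a minimal sketch in a throwaway repository. The file names, the identity settings, and the extra branch name foobar-merged are made up for illustration, and git's own merge command stands in for the hg merge described above, since it produces the same kind of commit G.

```shell
#!/bin/sh
# Sketch only: builds the A..F example in a scratch repo, then compares
# merging master into the branch with rebasing the branch onto master.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # make sure the first branch is "master"
git config user.name "Example Dev"          # hypothetical identity
git config user.email "dev@example.invalid"

# Helper (hypothetical): one commit per letter, each touching its own file.
make_commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

make_commit A; make_commit B; make_commit C   # master: A, B, C
git checkout -q -b foobar
make_commit D; make_commit E                  # foobar: A, B, C, D, E
git checkout -q master
make_commit F                                 # master: A, B, C, F

# Option 1: merge master into a copy of foobar. G is the merge commit.
git checkout -q -b foobar-merged foobar
git merge -q -m G master
merged_tip=$(git log -1 --format=%s)
merge_commits=$(git rev-list --count --merges HEAD)

# Option 2: rebase foobar itself, replaying D and E on top of F.
git checkout -q foobar
git rebase -q master
rebased=$(git log --reverse --format=%s | tr -d '\n')
rebased_merges=$(git rev-list --count --merges HEAD)

echo "merged branch tip: $merged_tip ($merge_commits merge commit)"
echo "rebased branch, oldest first: $rebased ($rebased_merges merge commits)"
```

Note that the rebased D and E are new commits with new ids. They carry the same changes but now sit on top of F.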

The interactive rebase even lets you do lots of ugly checkpoint commits that let you say "blah blah, typo" and not worry about it. Then before you submit your pull request, you do a rebase and squash all of your checkpoints into the one final commit with a great commit message.

You then submit this in the pull request, and instead of the 45 different commits you made over the last month, you submit one commit that has the changes for your feature. It's easier to review, easier to go back and read when you look at the repository, and generally just a win for clarity.
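
A rough sketch of the squash step, for the curious. Interactive rebase opens an editor, so this scripted version uses "git reset --soft" followed by a fresh commit as a non-interactive stand-in that ends in the same state: one commit holding all the checkpoint changes. The file name, commit messages, and identity are invented for the example.

```shell
#!/bin/sh
# Sketch only: collapse several checkpoint commits on a branch into one.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # make sure the first branch is "master"
git config user.name "Example Dev"          # hypothetical identity
git config user.email "dev@example.invalid"

echo base > feature.txt
git add feature.txt
git commit -q -m "base"

# Messy checkpoint commits on the feature branch.
git checkout -q -b foobar
for msg in "wip" "blah blah, typo" "more wip"; do
  echo "$msg" >> feature.txt
  git commit -q -a -m "$msg"
done
before=$(git rev-list --count master..HEAD)

# Stand-in for squashing in "git rebase -i master": move the branch pointer
# back to master but keep every change staged, then commit once.
git reset -q --soft master
git commit -q -m "Add the foobar feature"
after=$(git rev-list --count master..HEAD)
final_message=$(git log -1 --format=%s)

echo "$before checkpoint commits became $after: $final_message"
```

Done by hand, the equivalent is "git rebase -i master", changing "pick" to "squash" on every line after the first, and writing the final message when the editor comes up.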
pauamma: Cartooney crab wearing hot pink and acid green facemask holding drink with straw (Default)

[personal profile] pauamma 2012-05-06 10:57 pm (UTC)(link)
That's what MQ is for.
mark: A photo of Mark kneeling on top of the Taal Volcano in the Philippines. It was a long hike. (Default)

[staff profile] mark 2012-05-06 11:06 pm (UTC)(link)
You can emulate this behavior with MQ, sure. I don't want to, though. I dislike MQ. hg is okay, but I prefer git for many reasons.

(reposted as my staff account, because I was in the wrong window)
pauamma: Cartooney crab wearing hot pink and acid green facemask holding drink with straw (Default)

[personal profile] pauamma 2012-05-07 12:02 am (UTC)(link)
OK, MQ is a standard part of hg AFAIK, so let's turn your argument around: I'm going to declare some parts of git off-limits because I don't like them, then compare whatever's left to the whole of hg (including MQ). Does that sound like a fair, objective comparison? If your answer is "no", as it reasonably would be, I submit your comparison isn't fair or objective either.
kareila: (Default)

[personal profile] kareila 2012-05-07 02:48 am (UTC)(link)
MQ is sort of a different beast, actually, because it sits on top of the repository and manages applied patches. From the repository's perspective, it's completely separate from the commit history. IIRC, it started life as sort of an ugly hack and was later folded into the standard distribution when it became popular (well, to the extent that Mercurial is popular). Kind of like how QuickTime started out as a little video platform and eventually got baked into the core Mac OS.

I think what Mark is trying to say here is that he wants us to move toward a workflow where, instead of having to generate a patch and upload it to Bugzilla (which MQ is very helpful with), we can commit changes on our own branches and request to have them pulled into the main trunk, no patches needed. There are advantages to be gained from that approach but also tradeoffs to be made.
mark: A photo of Mark kneeling on top of the Taal Volcano in the Philippines. It was a long hike. (Default)

[staff profile] mark 2012-05-08 04:53 pm (UTC)(link)
Thank you! Yes, this is what I was aiming for, and said very badly. :) Sorry, Pau!
fu: Close-up of Fu, bringing a scoop of water to her mouth (Default)

[personal profile] fu 2012-05-08 07:59 am (UTC)(link)
Trying to figure this out. When you squash, is the commit history still in your branch, just not in the master, or does it all go away?
mark: A photo of Mark kneeling on top of the Taal Volcano in the Philippines. It was a long hike. (Default)

[staff profile] mark 2012-05-08 08:29 am (UTC)(link)
It all goes away.

Well, it goes away in that your commit tree shows only the new, updated, squashed, whatever items. But they still exist -- git never really deletes things unless you do some magic, and then, it takes 30 days before it goes away.

If you squash a few commits, what it actually does is merge those changes into a new commit. The previous commits still exist, they just become unlinked and not part of the current "tree". (So they are still there if you need them...)

If you want to see this in action, do this experiment:

* edit some file, change it, then commit it. message doesn't matter.
* now say you don't like that commit. "git rebase -i HEAD~2" and delete the line for the commit you just made. save, exit.
* now "git log" -- the commit you made is gone!
* now "git reflog" and see it still exists!
* try "git checkout IDHERE" with the id from the commit line.
* you're now in detached head state, try "git log" again -> there's the commit you just dropped!
* okay, let's restore this commit we blew away. "git checkout master" and "git log" -> you're back on master!
* now "git merge IDHERE" from that commit you had. now "git log" and you're back to how you were, before you got rid of it!

So -- git never actually truly deletes things. (Except for that 30-day-and-garbage-collect exception.) Even if you squash them, it just creates a new commit with the contents from the ones you squashed. The old commits are still there and you can go back to them if you want, with some effort.
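
The experiment above can also be scripted, roughly like this. Since "git rebase -i" needs an editor, the sketch drops the commit with "git reset --hard HEAD~1" instead, which is a stand-in and not the same command, but it leaves the commit unlinked in the same way. File names, messages, and identity are assumptions for the example.

```shell
#!/bin/sh
# Sketch only: drop a commit, find it again through the reflog, restore it.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # make sure the first branch is "master"
git config user.name "Example Dev"          # hypothetical identity
git config user.email "dev@example.invalid"

echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two >> file.txt
git commit -q -a -m "commit we will drop"
dropped=$(git rev-parse HEAD)               # plays the role of IDHERE

# Stand-in for deleting the line in "git rebase -i": move master back one commit.
git reset -q --hard HEAD~1
after_drop=$(git rev-list --count HEAD)     # the commit is gone from "git log"

# The reflog still knows about it.
if git reflog --format=%H | grep -q "$dropped"; then
  in_reflog=yes
else
  in_reflog=no
fi

# Restore it: master is an ancestor of the dropped commit, so this fast-forwards.
git merge -q "$dropped"
restored=$(git rev-list --count HEAD)

echo "commits after drop: $after_drop, in reflog: $in_reflog, after restore: $restored"
```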

...

Now, on to pushing. The problem is that doing a rebase squash like this means you are "rewriting history" -- which doesn't work once you have pushed it up to a shared repo and someone has downloaded it. Changing history that has already been pushed requires a --force, which messes everybody up.

So, rebase/squash/etc is only for your local end, and only for committers who want to clean up something before merging it in to the main tree.

Once something has been pushed up to the main dw-free repository, it should be considered immutable. Squashing or rebasing after the fact is a big no-no because it rewrites history and that breaks DVCS in nasty ways.
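
To make the --force point concrete, here is a small sketch using a local bare repository as a pretend shared repo. The paths and names are invented, and "git commit --amend" stands in for any history rewrite (rebase and squash behave the same way as far as pushing is concerned).

```shell
#!/bin/sh
# Sketch only: a plain push is refused once already-pushed history is rewritten.
set -eu
work=$(mktemp -d)
git init -q --bare "$work/shared.git"       # hypothetical shared repository
git init -q "$work/dev"
cd "$work/dev"
git symbolic-ref HEAD refs/heads/master     # make sure the first branch is "master"
git config user.name "Example Dev"          # hypothetical identity
git config user.email "dev@example.invalid"

echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two >> file.txt
git commit -q -a -m "second"
git push -q "$work/shared.git" master       # history is now public

# Rewrite a commit that has already been pushed.
git commit -q --amend -m "second, reworded"

# A normal push is no longer a fast-forward, so the shared repo refuses it.
if git push -q "$work/shared.git" master 2>/dev/null; then
  plain_push=accepted
else
  plain_push=rejected
fi

# --force goes through, but it replaces what everyone else already pulled.
if git push -q --force "$work/shared.git" master 2>/dev/null; then
  forced_push=accepted
else
  forced_push=rejected
fi

echo "plain push: $plain_push, forced push: $forced_push"
```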