Proper way of working with the Git repo?

Moderators: North, BetaSteward, noxx, jeffwadsworth, JayDi, TheElk801, LevelX, CCGHQ Admins

Postby klayhamn » 29 Jul 2015, 08:11

Hey guys,

I'm a bit new to git, and still trying to learn how to work with it.
It seems like whenever I try to merge the latest changes from the main fork into my local machine,
it asks me for a commit message,
and this merge later appears in the commit list (for example, when I do a pull request in GitHub)

I'm trying to figure out what's the correct way of working with this:
I set my "remote origin" to my personal fork on github, and my "remote upstream" to magefree on github

Now, if I understand correctly, I need to "git push" to my personal fork and always "pull" (or "fetch") from the upstream?
But if I worked on my own fork in another computer (#2), and then I want to update my working station (#1) with the latest changes to the (personal) fork I need to do "git pull origin"?

Is there any way to avoid all the merge messages in my commits, or are they harmless?
I noticed that other people committing don't have as many merge messages as me...

Do other people just merge less often?

Re: Proper way of working with the Git repo?

Postby Simown » 29 Jul 2015, 10:46

You have two remotes:

origin - which is your repository (a fork of mage)
upstream - which is the mage repository

When you push and pull you need to specify which repository it is going to or from:

git pull origin master - will pull from your fork
git pull upstream master - will pull from the original mage repository

If you did a "git clone <your-fork>" then it's likely that git is set up in such a way that:

"git pull" is a shortcut to saying "git pull origin master" on the master branch and;
"git push" is a shortcut to saying "git push origin master" on the master branch

The configuration for this option can be found in the repository's config file (.git/config):

[branch "master"]
remote = origin
merge = refs/heads/master

This says that when you are on branch "master" it will pull and push from origin (likely your fork), and default to origin as options for "push" and "pull".

To change this option if you want (it's usually unnecessary):
> git config branch.master.remote <remote name (e.g. origin)>
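A minimal sketch of reading and changing that setting, assuming a throwaway repository under a temporary directory. The directory names "fork" and "work" are made up for the demo, and "fork" is only a local stand-in for a GitHub fork.

```shell
set -eu
demo=$(mktemp -d)
# Identity used only for the demo commit.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "fork" is a local stand-in for your fork on GitHub (hypothetical name).
git init -q "$demo/fork"
git -C "$demo/fork" symbolic-ref HEAD refs/heads/master
echo A > "$demo/fork/A.txt"
git -C "$demo/fork" add A.txt
git -C "$demo/fork" commit -q -m A

# Cloning creates the "origin" remote and the [branch "master"] section.
git clone -q "$demo/fork" "$demo/work"
before_remote=$(git -C "$demo/work" config branch.master.remote)
before_merge=$(git -C "$demo/work" config branch.master.merge)
echo "remote=$before_remote merge=$before_merge"

# Make a plain "git pull" on master use another remote instead.
# Here "upstream" is just a second name for the same stand-in repository.
git -C "$demo/work" remote add upstream "$demo/fork"
git -C "$demo/work" config branch.master.remote upstream
after_remote=$(git -C "$demo/work" config branch.master.remote)
echo "remote=$after_remote"
```

Running "git remote -v" inside the working copy lists which URL each remote name points at.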

Working from another computer should be the same workflow, as long as "origin" is pointing at the same remote repository.

If I understand you correctly, pulling from the upstream or your fork while you have local changes may cause an automatic merge to get your local copy up to date.

You can avoid this by issuing a: "git pull --rebase upstream master"

Be careful with this, however. This effectively reapplies your commits on top of the stuff that is pulled in from the repository. Ensure you don't change the order of commits you have already pushed to any remote repository. It might be worth trying it out and/or reading up on it first.

This link might explain it better: http://gitready.com/advanced/2009/02/11 ... ebase.html
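To see the difference without touching a real repository, here is a small sketch that compares a plain pull with "pull --rebase" on two identical throwaway clones. The directory names ("mage", "merged", "rebased") and the commit() helper are assumptions made up for the demo, and "mage" is only a local stand-in for the upstream repository.

```shell
set -eu
demo=$(mktemp -d)
# Identity used only for the demo commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

commit() {  # commit <repo> <name>: add the file <name>.txt and commit it
  echo "$2" > "$1/$2.txt"
  git -C "$1" add "$2.txt"
  git -C "$1" commit -q -m "$2"
}

# "mage" is a local stand-in for the upstream repository.
git init -q "$demo/mage"
git -C "$demo/mage" symbolic-ref HEAD refs/heads/master
commit "$demo/mage" A

# Two identical working copies, each with one local commit X.
for w in merged rebased; do
  git clone -q -o upstream "$demo/mage" "$demo/$w"
  commit "$demo/$w" X
done

# Meanwhile upstream gains commit E.
commit "$demo/mage" E

# Plain pull (merge) in one copy, pull --rebase in the other.
git -C "$demo/merged"  pull -q --no-rebase --no-edit upstream master
git -C "$demo/rebased" pull -q --rebase upstream master

merges_after_merge=$(git -C "$demo/merged"  rev-list --merges --count HEAD)
merges_after_rebase=$(git -C "$demo/rebased" rev-list --merges --count HEAD)
echo "merge commits: pull=$merges_after_merge, pull --rebase=$merges_after_rebase"
```

The merged copy ends up with one extra "Merge branch ..." commit, while the rebased copy stays linear.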
~Simown

Re: Proper way of working with the Git repo?

Postby klayhamn » 29 Jul 2015, 12:01

Thanks for your answer

If I understand you correctly - you're saying I can also avoid the merge by making sure to only fetch from the upstream when I have no pending local changes to commit or push?

Because as I understood it --- "fetch" always just brings the changes from the repo but doesn't apply them

To "apply" the changes on my local branch it seems like I always have to merge regardless...

I'm not entirely sure I understand what "rebase" does
you wrote: "Be careful with this, however. This effectively reapplies your commits on top of the stuff that is pulled in from the repository. Ensure you don't change the order of commits you have already pushed to any remote repository. Might be worth trying it out and/or reading up on it first."

How can I control the order of the commits?

Can you give an example of how this order can change?

I'll go over the link you sent as well

Thanks

Re: Proper way of working with the Git repo?

Postby Simown » 29 Jul 2015, 13:55

klayhamn wrote:If I understand you correctly - you're saying I can also avoid the merge by making sure to only fetch from the upstream when I have no pending local changes to commit or push?
Not necessarily. If you committed changes to your repository and in the meantime someone else committed changes to mage (upstream), your branch and the upstream have diverged. Pulling then can't simply fast-forward your branch, so git creates a merge commit that joins the upstream's commit history with your changes.

klayhamn wrote:Because as I understood it --- "fetch" always just brings the changes from the repo but doesn't apply them
You are correct, and I often forget this myself. "pull" is a "fetch" followed by a "merge" without asking you for confirmation. After a fetch you have two options besides a plain merge: "git merge --no-commit", which performs the merge but stops just before committing so you can inspect the result (the commit you then make is still a merge commit), or "git merge --squash", which copies all the incoming changes into your working tree and index so you can commit them as a single ordinary commit instead of a merge commit.
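A small sketch of how fetch and the two merge options behave, run on throwaway clones. The directory names ("mage", "nocommit", "squash") and the helper functions are assumptions made up for the demo. It counts the parents of the resulting commit: two parents means a merge commit, one parent means an ordinary commit.

```shell
set -eu
demo=$(mktemp -d)
# Identity used only for the demo commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

commit() {  # commit <repo> <name>: add the file <name>.txt and commit it
  echo "$2" > "$1/$2.txt"
  git -C "$1" add "$2.txt"
  git -C "$1" commit -q -m "$2"
}
parents() {  # parents <repo>: number of parents of the commit at HEAD
  git -C "$1" cat-file -p HEAD | grep -c '^parent '
}

# "mage" is a local stand-in for the upstream repository.
git init -q "$demo/mage"
git -C "$demo/mage" symbolic-ref HEAD refs/heads/master
commit "$demo/mage" A
for w in nocommit squash; do
  git clone -q -o upstream "$demo/mage" "$demo/$w"
  commit "$demo/$w" X
done
commit "$demo/mage" E

# fetch only updates upstream/master; the local master does not move.
before=$(git -C "$demo/nocommit" rev-parse master)
git -C "$demo/nocommit" fetch -q upstream
after_fetch=$(git -C "$demo/nocommit" rev-parse master)

# --no-commit stops before committing; the commit made afterwards
# still records both parents, so it is a merge commit.
git -C "$demo/nocommit" merge -q --no-commit upstream/master
git -C "$demo/nocommit" commit -q -m "Merge upstream/master"
nocommit_parents=$(parents "$demo/nocommit")

# --squash stages the combined changes; committing them gives an
# ordinary single-parent commit.
git -C "$demo/squash" fetch -q upstream
git -C "$demo/squash" merge -q --squash upstream/master
git -C "$demo/squash" commit -q -m "Squashed upstream changes"
squash_parents=$(parents "$demo/squash")

echo "parents: --no-commit=$nocommit_parents, --squash=$squash_parents"
```

Keep in mind that a squashed commit has no link back to the upstream commits it copied, so git will not know later that those changes were already merged.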

klayhamn wrote:How can I control the order of the commits?
It's hard to explain without a working copy at work but I'll try my best:

Currently your working tree and the remote look like this:

------------------------------------------------------------


http://pastebin.com/xvfTggff - Moved to pastebin to preserve formatting.


------------------------------------------------------------

We keep a linear commit history, but look what's happened. Commits E, F and G have been placed before your local commits and the local changes applied effectively on top. This is all good if they are just local commits, but what if you had already pushed your pre-rebase working tree to origin?

Commits E, F and G follow commit D in your updated working tree, but if you pushed before rebasing, commits X, Y and Z follow commit D in the remote repository. The git history in your local copy now conflicts with the history on the remote.

If you want to push to a repository to save your work remotely, or for any other reason, you should never change the order of the commits on the remote. If a rebase would change what the history looks like on the remote, you're going to have to merge instead (though not always with a merge commit, as previously mentioned).

Another link for your enjoyment with some nice diagrams: https://www.atlassian.com/git/tutorials ... f-rebasing

Hope that helps without confusing you too much.
~Simown

Re: Proper way of working with the Git repo?

Postby klayhamn » 29 Jul 2015, 15:43

Wow thanks for trying to explain this :)

I think I'm beginning to understand this,
but the main thing that's confusing me is that there are basically 6 different copies of the code:

1. the remote main repo - master branch
2. remote forks of the repo - master branch
3. non-master remote branches of the main repo
4. non-master remote branches of the forks
5. my local copy of the code - master branch
6. my local copy of the code - non-master branches

So whenever people discuss what you should or shouldn't do regarding the rebase, and mention "the branch" or the "the repo" or "the local copy" etc. -

it's extremely hard to figure out which of the 6 they're talking about exactly

I know that in our specific case there is (at least in most relevant cases) only a single branch (the master), but the examples and discussions of rebase always talk about multiple branches, which make it more confusing to understand what exactly i should take from it regarding our specific case

For example, here
they conclude by saying : "Don’t rebase branches you have shared with another developer."

But in our case there is a main branch that everyone works on, and rarely do people create new branches

So, does that mean his advice is that I shouldn't rebase at all?

or - is he talking about not rebasing on the REMOTE repo?
or - is he talking about not rebasing on the main repo, but to rebase freely on MY FORK?

and so on and so forth...

Perhaps I'm overcomplicating things,
but I'm just trying to get a big picture of what is the "most dangerous" scenario in terms of rebasing - in our specific case


I'm trying to understand the example you gave, but can't understand the exact order of events

1) create & commit X Y Z
2) push X Y Z to my personal fork
3) another developer creates, commits, pushes and merges E F and G to the main repo
4) fetch & rebase from upstream
5) my fork is merged (via a pull request) with the main repo

what is exactly the problematic order of events that creates the difference in histories?

if the order is 1, 2, 3, 4, 5

then my local copy is A B C D X Y Z E F G
and the repo is A B C D E F G X Y Z

but this would happen even if I did a merge instead of a rebase, no?
once I committed X Y Z they are no longer "pending commits" and would not be reverted and "rewritten" on top of E F G... or am I misunderstanding how the rebase works? Does it even undo things that were already pushed to my fork but not yet incorporated into the main repo?

did you refer to the order 1,2,3,5,4

where my local copy would be A B C D E F G X Y Z
and the repo would be A B C D X Y Z E F G
?

I don't see a scenario where the two histories would be identical, unless I'm misunderstanding something fundamental about either merge or rebase

Re: Proper way of working with the Git repo?

Postby LoneFox » 29 Jul 2015, 17:03

I'm not a git expert, but this is how I understand it:
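Git identifies commits by their SHA-1 hash. The hash covers the commit's content, its timestamps and the hash of its parent commit. If any of these change, the hash also changes, and git treats the result as a completely new commit with no connection to the old one. A merge leaves your existing commits untouched, but a rebase gives each replayed commit a new parent and a new committer timestamp (the author timestamp is kept).
After a merge you get A B C D X Y Z E F G, plus a merge commit that joins the two lines of history.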
After a rebase you get A B C D E F G X' Y' Z', where X' Y' Z' have the same content but different timestamps than X Y Z.
Rebasing is safe if and only if the commits X Y Z have not been pushed or pulled into any other repository than the one where the rebase is done.
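A quick way to watch the hashes change is to record the hash of Z before and after a rebase in a throwaway repository. This is only a sketch: the directory names ("mage", "work") and the commit() helper are made up for the demo.

```shell
set -eu
demo=$(mktemp -d)
# Identity used only for the demo commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

commit() {  # commit <repo> <name>: add the file <name>.txt and commit it
  echo "$2" > "$1/$2.txt"
  git -C "$1" add "$2.txt"
  git -C "$1" commit -q -m "$2"
}

# "mage" is a local stand-in for the upstream repository.
git init -q "$demo/mage"
git -C "$demo/mage" symbolic-ref HEAD refs/heads/master
for c in A B C D; do commit "$demo/mage" "$c"; done

git clone -q -o upstream "$demo/mage" "$demo/work"
for c in X Y Z; do commit "$demo/work" "$c"; done   # local work
for c in E F G; do commit "$demo/mage" "$c"; done   # someone else's work

old_z=$(git -C "$demo/work" rev-parse HEAD)
git -C "$demo/work" pull -q --rebase upstream master
new_z=$(git -C "$demo/work" rev-parse HEAD)

# Commit subjects from oldest to newest, joined on one line.
order=$(git -C "$demo/work" log --reverse --format=%s | paste -sd ' ' -)
echo "order after rebase: $order"
echo "Z before: $old_z"
echo "Z after:  $new_z"
```

The subject line is still "Z", but the hash is different because the commit now has a different parent.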

Re: Proper way of working with the Git repo?

Postby klayhamn » 29 Jul 2015, 18:07

LoneFox wrote:After a merge you get A B C D X Y Z E F G.
After a rebase you get A B C D E F G X' Y' Z', where X' Y' Z' have the same content but different timestamps than X Y Z.
So this basically means that the merge orders the commits (in the git tree) in a serial manner (i.e. the order in which they happen to be applied to the [local] instance), regardless of their timestamp?
because, I could have written and committed X Y Z locally either before or after E F G were merged into the main repo
the only thing that's certain (based on the example) is that all of these commits would appear after "D".

Or -- is the order of commits in my local history (or my personal fork) based on their timestamps (and if so, what are the timestamps based on? the time I fetched them? the time I merged them? the time they were committed in the other developer's computer? the time they were pushed to the main repo from which I pulled them?)

LoneFox wrote:Rebasing is safe if and only if the commits X Y Z have not been pushed or pulled into any other repository than the one where the rebase is done.
So it should always be safe to rebase vs. my personal fork?
What if I committed and pushed X Y and Z to my personal fork but haven't issued a pull-request to the main repo yet?
What would be the effect of "git pull --rebase upstream" in this case? would it be safe?
what would happen to the commits that have already been pushed (but not yet into the main-repo)?

Re: Proper way of working with the Git repo?

Postby Simown » 29 Jul 2015, 18:37

LoneFox is right, and it's a point I forgot to mention. The commits X', Y' and Z' are the same in contents and commit message, but the SHA-1 hash will have changed, and to git that makes them different commits. The hash is computed from the timestamps, the commit contents AND the position in the git history (the parent commit). If you move commits around, the hashes will change.

The golden rule is to rebase if and only if you have not pushed the changes to *any* remote repository, be that your fork or the master branch of the main mage repository. Basically, you can switch around your local changes as much as you want, for example telling git that commits X, Y and Z all came after commits E, F and G. That doesn't matter to anyone but you. As soon as you push to a remote repository, the history should be treated as fixed.

It is only safe to rebase when you don't change the order of commits in a remote repository. I rebase where I can, and merge when it's necessary.
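One way to check whether commits have already been pushed before you rebase them is sketched below on a throwaway repository. The directory names ("seed", "fork.git", "work") and the commit() helper are made up for the demo; "fork.git" is a local stand-in for a fork on GitHub.

```shell
set -eu
demo=$(mktemp -d)
# Identity used only for the demo commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

commit() {  # commit <repo> <name>: add the file <name>.txt and commit it
  echo "$2" > "$1/$2.txt"
  git -C "$1" add "$2.txt"
  git -C "$1" commit -q -m "$2"
}

git init -q "$demo/seed"
git -C "$demo/seed" symbolic-ref HEAD refs/heads/master
commit "$demo/seed" A
git clone -q --bare "$demo/seed" "$demo/fork.git"   # stand-in for the fork
git clone -q "$demo/fork.git" "$demo/work"          # "origin" = fork

commit "$demo/work" X
git -C "$demo/work" push -q origin master           # X is now on the fork
commit "$demo/work" Y                               # Y exists only locally

# Commits on master that origin/master does not have yet.
unpushed=$(git -C "$demo/work" log --format=%s origin/master..master)

# Remote-tracking branches that already contain a given commit.
x=$(git -C "$demo/work" rev-parse HEAD~1)
x_on=$(git -C "$demo/work" branch -r --contains "$x")
y_on=$(git -C "$demo/work" branch -r --contains HEAD)

echo "not pushed yet: $unpushed"
echo "X is on: $x_on"
echo "Y is on: ${y_on:-no remote branch}"
```

Note that origin/master is only as fresh as your last fetch or push, so fetch first if someone else may also push to that remote.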
~Simown

Re: Proper way of working with the Git repo?

Postby klayhamn » 29 Jul 2015, 21:53

Simown wrote:The golden rule is to rebase if and only if you have not pushed the changes to *any* remote repository, be that your fork or the master branch of the main mage repository.
but that's what I still don't understand - what do you mean by "pushed the changes" -
after all, I've pushed SOME changes before -- at SOME point in history (let's assume for the sake of example that the "A" commit was created, committed and pushed by me)

So which changes are we talking about exactly when you say "the changes"?

As I see it - X Y and Z can either be already-pushed, or not-yet-pushed
if they are pushed, what distinguishes them from the "A" commit?

Why can I rebase when I haven't pushed X Y and Z but HAVE pushed A (at some point in the past)?
but on the other hand I can't rebase when I HAVE pushed X Y and Z?
what makes them different than A - in this example?

Sorry if I'm being difficult, I'm just still a bit confused by this :)


Is it because when E F and G exist on top of the main repo, this necessarily means the person who merged them in had to merge them with "A" (among other things), because "A" was pushed there first?

And then, if I push X Y and Z to my personal fork,
and then rebase and receive E F and G to my local copy,
then the next time I commit ---
my personal fork would become A B C D X Y Z E F G
even though the main repo is A B C D E F G ?

did I get this right?

Re: Proper way of working with the Git repo?

Postby Simown » 29 Jul 2015, 22:58

klayhamn wrote:but that's what I still don't understand - what do you mean by "pushed the changes" -
after all, I've pushed SOME changes before -- at SOME point in history (let's assume for the sake of example that the "A" commit was created, committed and pushed by me)
Sorry if this wasn't clear. Let's assume for simplicity there are two repositories

A) Your local repository, where the branch is usually called "master"
B) The remote repository on GitHub, whose branch shows up locally as "origin/master"

For the rest of this post A) is "local" and B) is "remote". The remote is always a single place on github, each user can have a local copy. All users push and pull from the same place and the same branch. Local tracks changes to the remote, you push changes from local to remote and other people pull/fetch your changes from the remote to their local copy.

Going back to the previous example http://pastebin.com/xvfTggff. In your local repository you have commits A, B, C, D and on the server there is A,B,C,D - both local and remote are at the same point. You develop some features and commit X, Y and Z only to your local repository. These are the changes I mean. So the remote still only has A, B, C and D but your local repository has A,B,C,D,X,Y,Z.

Now comes the more complicated part. In the meantime someone has made changes on their local copy and pushed to the remote. The remote now contains commits A,B,C,D,E,F,G. Your local git history doesn't contain E, F and G. Now, if you rebase, the changes in your local copy will be rewritten on top of the new commits you pull in from the remote. The changes that will move in the rebase are X, Y and Z -- these are the only commits locally that differ from the remote.

klayhamn wrote:As I see it - X Y and Z can either be already-pushed, or not-yet-pushed
if they are pushed, what distinguishes them from the "A" commit?
X, Y and Z should be not-yet-pushed commits. They differ from commit "A" in that they don't exist as part of the remote repository. Rebasing should only move commits locally that are not on the remote.

klayhamn wrote:Why can I rebase when I haven't pushed X Y and Z but HAVE pushed A (at some point in the past)?
but on the other hand I can't rebase when I HAVE pushed X Y and Z?
what makes them different than A - in this example?
You can rebase when "A" is there solely because the commit sits in the same place in both the local and remote repositories. A rebase of two identical histories will not do anything; no history is rewritten because there are no changes. Commits A-D will stay in the same place in the git history if you issue a rebase.

klayhamn wrote:And then, if I push X Y and Z to my personal fork,
and then rebase and receive E F and G to my local copy,
then the next time I commit ---
my personal fork would become A B C D X Y Z E F G
even though the main repo is A B C D E F G ?
You have to be very careful here. You *don't* want to push to your fork if you are going to rebase.

I'll try and give an example. "mage" is now the remote for upstream/master and "fork" is the remote for origin/master

mage looks like this:

A - B - C - D - E - F - G

your local copy looks like this

A - B - C - D - X - Y - Z

now, if you push to your fork, fork looks like:

A - B - C - D - X - Y - Z

Now that you have pushed, the fork must stay in that order.

If you then decide to do a "git pull --rebase upstream master" (or fetch/rebase)

Your local copy will look like this, which seems fine:

A - B - C - D - E - F - G - X' - Y' - Z'

But wait - your fork says the git commit history looks like:

A - B - C - D - X - Y - Z

Commit X, Y and Z is after commit D, not E, F and G. The git history is stored in the hashes of each commit and you have effectively rewritten part of the history.

If you issue a "git status" you'll get something to the effect of "Your branch and 'origin/master' have diverged, and have 6 and 3 different commits each", and a plain "git push" to the fork will be rejected, because you have effectively erased X, Y and Z and rewritten them at the head of your local copy.
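The problem case can be reproduced safely with throwaway repositories. In this sketch the directory names ("mage", "fork.git", "work") and the commit() helper are made up for the demo; "mage" and "fork.git" are local stand-ins for the upstream repository and the fork.

```shell
set -eu
demo=$(mktemp -d)
# Identity used only for the demo commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

commit() {  # commit <repo> <name>: add the file <name>.txt and commit it
  echo "$2" > "$1/$2.txt"
  git -C "$1" add "$2.txt"
  git -C "$1" commit -q -m "$2"
}

git init -q "$demo/mage"                            # stand-in for upstream
git -C "$demo/mage" symbolic-ref HEAD refs/heads/master
for c in A B C D; do commit "$demo/mage" "$c"; done
git clone -q --bare "$demo/mage" "$demo/fork.git"   # stand-in for the fork
git clone -q "$demo/fork.git" "$demo/work"          # "origin" = fork
git -C "$demo/work" remote add upstream "$demo/mage"

for c in X Y Z; do commit "$demo/work" "$c"; done
git -C "$demo/work" push -q origin master           # pushed BEFORE rebasing
for c in E F G; do commit "$demo/mage" "$c"; done   # upstream moves on

git -C "$demo/work" pull -q --rebase upstream master

# How far the local branch and the fork have drifted apart.
ahead=$(git -C "$demo/work" rev-list --count origin/master..master)
behind=$(git -C "$demo/work" rev-list --count master..origin/master)
echo "ahead=$ahead behind=$behind"
git -C "$demo/work" status -sb | head -n 1

# A plain push is refused because it is not a fast-forward.
if git -C "$demo/work" push -q origin master 2>/dev/null; then
  pushed=yes
else
  pushed=no
fi
echo "plain push accepted: $pushed"
```

The six commits ahead are E, F, G, X', Y' and Z'; the three behind are the original X, Y and Z that still live on the fork.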

The correct way to rebase like this in this situation would be:

1) Commit X,Y,Z locally. Your local copy looks like:

A - B - C - D - (X) - (Y) - (Z) <--- (not pushed)

And the remote looks like:

A - B - C - D

Someone else adds changes, your local copy still looks the same but the remote looks like:

A - B - C - D - E - F - G

2) git pull --rebase origin master

Now your local copy looks like:

A - B - C - D - E - F - G - (X') - (Y') - (Z') <--- (not pushed)

3) push - now both local and remote will be in the same state and have the same history:

A - B - C - D - E - F - G - X' - Y' - Z'
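The three steps can be tried end to end on throwaway repositories. This is a sketch under assumptions: the directory names ("seed", "remote.git", "work", "other") and the commit() helper are made up for the demo, and "other" plays the second developer.

```shell
set -eu
demo=$(mktemp -d)
# Identity used only for the demo commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

commit() {  # commit <repo> <name>: add the file <name>.txt and commit it
  echo "$2" > "$1/$2.txt"
  git -C "$1" add "$2.txt"
  git -C "$1" commit -q -m "$2"
}

git init -q "$demo/seed"
git -C "$demo/seed" symbolic-ref HEAD refs/heads/master
for c in A B C D; do commit "$demo/seed" "$c"; done
git clone -q --bare "$demo/seed" "$demo/remote.git"  # the shared remote
git clone -q "$demo/remote.git" "$demo/work"         # your local copy
git clone -q "$demo/remote.git" "$demo/other"        # someone else's copy

# 1) Commit X, Y, Z locally without pushing.
for c in X Y Z; do commit "$demo/work" "$c"; done

# Someone else pushes E, F, G in the meantime.
for c in E F G; do commit "$demo/other" "$c"; done
git -C "$demo/other" push -q origin master

# 2) Replay the unpushed commits on top of the remote's history.
git -C "$demo/work" pull -q --rebase origin master

# 3) Push; this is a plain fast-forward for the remote.
git -C "$demo/work" push -q origin master

local_head=$(git -C "$demo/work" rev-parse master)
remote_head=$(git -C "$demo/remote.git" rev-parse master)
order=$(git -C "$demo/work" log --reverse --format=%s | paste -sd ' ' -)
echo "history: $order"
echo "local=$local_head remote=$remote_head"
```

Because X, Y and Z were never pushed in their original form, nobody else ever sees the old hashes.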

Hopefully that goes some way to explaining it, and I didn't manage to confuse myself halfway through either :)
~Simown

Re: Proper way of working with the Git repo?

Postby klayhamn » 29 Jul 2015, 23:19

Wow thanks so much for taking the time to write this detailed explanation!

I understand 95% of what you're saying but there's still 5% I'm unclear about regarding the edge-cases (i.e. I understand what "should be done", but I'm not entirely sure about what would happen in each case where I diverge a bit from this suggested formula)

But I will stop bothering you now :)
my head is already spinning from all this anyway

Re: Proper way of working with the Git repo?

Postby Simown » 30 Jul 2015, 08:31

Hopefully one day it will just click. Can't remember how long it took me to fully understand it.

I think a good way is to create some examples and see what rebase is doing. If you make a mistake you can just discard it, as long as the remote you are using stays in a sane state.

Anyway, best of luck :)
~Simown

