Git 101

gr8gonzo, Consultant
Git can be a complicated version control system for beginners, but it definitely is one of the best ones out there. Since this article assumes that you're starting at square one, it will skip over things that Git -can- do and will focus on the typical setup for small groups of developers / small projects.

Before getting into any commands, it's important to understand how stuff moves around in Git, because most tutorials start diving into things like "upstream" and "origin" without any good explanations of what data is moving where and why.

One easy way to understand Git is to imagine a ZIP file that is stored on a server somewhere. Whenever you need to update the contents of that ZIP file, you copy the file over to your computer, pull out the files, make your changes, put the changes back into the ZIP file, and then you copy that ZIP file back to the server so that others can get your changes. This is very similar to the workflow for Git.

Step 1: Install Git and Create a New, Blank Repository

You usually start by installing Git on a server somewhere and creating a new, blank repository on that server.

Analogy: This is like creating an empty ZIP file on the server. Everyone can see the ZIP file and copy it, but there's nothing in it yet, so it's kind of useless.

Note for Admins: Setting up a Git server is a more advanced topic that requires more detail, but if you are an admin trying to decide where to install Git, my opinion is that Git works better on Linux servers, using SSH to handle authentication, since an SSH server is already part of the standard setup on most Linux distributions nowadays. You can make it work on Windows servers, but it's a pain to set up.
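
As a rough sketch of what Step 1 can look like at the command line, the commands below create an empty "bare" repository (one that only stores history and has no working files of its own). The scratch directory and the name server.git are assumptions made for illustration; on a real server the path would be wherever your admin keeps repositories.

```shell
set -e

# Assumption: a scratch directory stands in for the server's disk.
work=$(mktemp -d)

# Create the new, blank repository. --bare means "storage only, no working files".
git init --quiet --bare "$work/server.git"

# Confirm what was created; prints "true" for a bare repository.
git --git-dir="$work/server.git" rev-parse --is-bare-repository
```

Nothing has been committed yet, so this is the "empty ZIP file" from the analogy.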

Step 2: "Clone" the Repository

Once your new, empty repository is created, you have your storage structure set up and all you need to do is add files to it. The first step toward doing that is to "clone" the repository with Git, which gives you your own local copy to work in.

Analogy: This is like copying that blank ZIP file from the server to your own computer. Now you AND the server both have that ZIP file, but you know in your head that the server is the real "source" for this ZIP file and that's where everyone knows to find it.
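
A minimal sketch of the clone step, assuming the "server" is just a bare repository in a scratch directory (with a real server you would pass an SSH or HTTPS address instead of a path). The directory name my_copy is made up for this example.

```shell
set -e
work=$(mktemp -d)
git init --quiet --bare "$work/server.git"   # stand-in for the server's copy

# Clone it. Git may warn that the repository is empty, which is expected here.
git clone --quiet "$work/server.git" "$work/my_copy"

# The clone remembers where it came from under the name "origin".
git -C "$work/my_copy" remote -v
```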

Step 3: Add Files / Make Changes

Once you have cloned the repository, you can add new files (and later make updates to existing files), and then do a commit when you're finished with your changes.

Analogy: A commit is like saving an updated ZIP file on your local computer. Now YOU have the updated ZIP file, but the server still has the old one. You can keep making changes if you want, but at some point, you need to copy your ZIP file back to the server so that others can get your changes.
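
Here is one way the add-and-commit step might look. The file name, commit message, and identity are placeholders, and the scratch directory again stands in for a real server.

```shell
set -e
work=$(mktemp -d)
git init --quiet --bare "$work/server.git"
git clone --quiet "$work/server.git" "$work/my_copy"
cd "$work/my_copy"

# Git needs to know who is committing (placeholder identity, local to this clone).
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Add a new file, tell Git to include it, and commit it.
echo "first draft" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes.txt"

# The commit exists in your local copy...
git log --oneline
# ...but the server copy still has nothing in it (this prints nothing).
git --git-dir="$work/server.git" for-each-ref
```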

Step 4: Pushing the Repository

It's a good practice to push changes after each commit (at least for now). When you decide to push, you are usually pushing back to the "origin" which is simply the place from which you originally "cloned" the repository.

Analogy: When you use Git to "push" your changes, you are copying your updated ZIP file back to the server. Since you copied the ZIP file from the server, the server is the "origin".
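
The push step could be sketched like this. The two symbolic-ref lines only pin the branch name to "master" so the example behaves the same on machines whose default branch name differs; the paths and names are assumptions for illustration.

```shell
set -e
work=$(mktemp -d)
git init --quiet --bare "$work/server.git"
git --git-dir="$work/server.git" symbolic-ref HEAD refs/heads/master

git clone --quiet "$work/server.git" "$work/my_copy"
cd "$work/my_copy"
git symbolic-ref HEAD refs/heads/master   # use "master" regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "first draft" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes.txt"

# "origin" is just a nickname for the place we cloned from.
git remote -v

# Copy the new commit back to the server.
git push --quiet origin master

# The server now has the same commit.
git --git-dir="$work/server.git" log --oneline master
```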

Step 5: Pulling and More Updates

After you've "cloned" the repository, you don't need to clone it again. The cloning process is just a step to copy the repository onto your local computer. After you have cloned the repository, you can get any future updates from the server by telling Git to do a "pull", which will just update your local copy of the repository with whatever is on the server.

Analogy: The following day, you might want to make another update. Since others might have updated the ZIP file on the server in that time, you need to copy it back to your computer first so that you're working with the latest data. Since you already have the ZIP file, the git "pull" is simply like copying from the server again. Git is a little more intelligent, so it can quickly pull just the changes rather than trying to download everything again.
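
To see a pull in action, the sketch below plays both roles on one machine: a hypothetical teammate publishes changes, and your existing clone picks them up without being cloned again. All directory names and file contents are invented for the example.

```shell
set -e
work=$(mktemp -d)
git init --quiet --bare "$work/server.git"
git --git-dir="$work/server.git" symbolic-ref HEAD refs/heads/master

# A teammate publishes the first version.
git clone --quiet "$work/server.git" "$work/teammate"
cd "$work/teammate"
git symbolic-ref HEAD refs/heads/master
git config user.name "Teammate"
git config user.email "teammate@example.com"
echo "v1" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes.txt"
git push --quiet origin master

# You clone once...
git clone --quiet "$work/server.git" "$work/my_copy"

# ...and later the teammate pushes another change.
echo "v2" > notes.txt
git commit --quiet -am "Update notes.txt"
git push --quiet origin master

# The next day: no new clone needed, just pull.
cd "$work/my_copy"
git pull --quiet origin master
cat notes.txt   # now shows the teammate's update
```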

Step 6: Merging in Latest Changes

Git is also intelligent enough to check for newer changes before it accepts your attempt to push new updates back to the server. If you're out of date, it will tell you and ask you to perform another git pull. This git pull will pull down the latest changes and then intelligently merge them into what you already have. Once that's done, you can continue to push your changes.

Analogy: This is basically like stopping before you copy your latest ZIP file up to the server, and then going to the server and seeing if there is a newer version of that ZIP there. If there is, then you stop, download that new version and try to figure out what has changed so you can include those changes in your own ZIP file. Once you've done that, you copy your ZIP file back to the server so that it has everyone's changes. The difference is that Git makes this process simple, whereas doing the analogy version by hand would be slow and inefficient.
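
The "out of date" situation can be reproduced with two clones. In this sketch Tim and Joe are just two directories on the same machine, and each edits a different file so the merge has nothing to argue about. The --no-rebase and --no-edit options are choices made for this example: they tell Git to merge (as described above) and to accept the default merge message without opening an editor.

```shell
set -e
work=$(mktemp -d)
git init --quiet --bare "$work/server.git"
git --git-dir="$work/server.git" symbolic-ref HEAD refs/heads/master

# Tim publishes the first commit; Joe clones it.
git clone --quiet "$work/server.git" "$work/tim"
cd "$work/tim"
git symbolic-ref HEAD refs/heads/master
git config user.name "Tim"
git config user.email "tim@example.com"
echo "a" > a.txt
git add a.txt
git commit --quiet -m "Add a.txt"
git push --quiet origin master

git clone --quiet "$work/server.git" "$work/joe"

# Tim pushes again, so Joe's copy is now out of date.
echo "b" > b.txt
git add b.txt
git commit --quiet -m "Add b.txt"
git push --quiet origin master

cd "$work/joe"
git config user.name "Joe"
git config user.email "joe@example.com"
echo "c" > c.txt
git add c.txt
git commit --quiet -m "Add c.txt"

# Joe's push is refused because the server has a commit he has not seen.
if git push --quiet origin master 2>/dev/null; then
    echo "push accepted"
else
    echo "push rejected: pull first"
fi

# Pull (merging in Tim's change), then push again.
git pull --quiet --no-rebase --no-edit origin master
git push --quiet origin master
ls   # a.txt, b.txt and c.txt are all present
```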

Other Concepts: Branches

In the analogy, there's a single ZIP file that you're working with. But what happens when you have multiple, related ZIP files? For example, let's say the ZIP file contained spreadsheets containing all the credit card transactions for the year 2012. Now it's year 2013 - do you keep adding on to the same ZIP file? Normally, you would probably just create a new ZIP file in that same folder on the server and call it something like CC2013.zip instead of CC2012.zip. Then, whenever you copy ZIP files back and forth between the server, you copy all of them back and forth.

Branching is simply a way to organize various efforts that are happening on the same repository. For example, you might be developing a product and you want a developer to work on a new feature, but you don't want his work in progress to interfere with the product code. You want the product code to keep running the way it is until the new feature is ready.

In this scenario, the developer could create a branch, which behaves like a full, isolated copy of all the product code that is seen only by people who have specifically chosen to view that branch. (Behind the scenes, Git stores a branch as a lightweight pointer rather than a real copy, so creating one is cheap.) Everyone else would see the regular product code.

The developer could write new code, commit it to the branch, and test it on his/her own until the code is ready to be released into the product.
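
A small sketch of that scenario, using a purely local repository to keep it short. The branch name tims_new_feature and the file names are invented for the example.

```shell
set -e
work=$(mktemp -d)
git init --quiet "$work/product"
cd "$work/product"
git symbolic-ref HEAD refs/heads/master   # use "master" regardless of local defaults
git config user.name "Tim"
git config user.email "tim@example.com"

# The regular product code.
echo "stable code" > product.txt
git add product.txt
git commit --quiet -m "Initial product code"

# Create the feature branch and switch to it.
git checkout --quiet -b tims_new_feature
echo "work in progress" > feature.txt
git add feature.txt
git commit --quiet -m "Start new feature"

# Back on the regular branch, the work in progress is not visible.
git checkout --quiet master
ls           # only product.txt
git branch   # lists both branches; * marks the one you are viewing
```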

Other Concepts: Merging

In Git, your work normally lives on a branch. That's just the way it stores things, so when you're working with a new repository and you haven't created any new branches, you're very likely working with the default branch, traditionally called "master" (newer setups often name it "main" instead). In our "developer / new feature" scenario above, the "master" branch is where the regular product code is, while the developer's branch ("tims_new_feature") is the regular product code plus Tim's new code.

Once Tim is finished with his feature, the new code obviously has to get back into that "master" branch in order for everyone else to be able to see it. This is where merging comes in.

Merging is simply taking the changes from one branch and copying them into another branch. It is a "pulling" type of direction, so in Git, Tim would switch over to the "master" branch first, and then tell Git to merge from the "tims_new_feature" branch. This typically creates a new merge commit on the "master" branch on Tim's machine (if nothing else has changed on "master" since Tim branched off, Git may simply move "master" forward to Tim's latest commit instead). Tim would then push the result, and everyone would now have access to the newest code (once they do a git pull on their own machines).
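
Tim's merge might look something like the sketch below. The --no-ff option is a choice made for this example: it asks Git to always record a separate merge commit, which makes the result easier to see in the log. File names and messages are placeholders.

```shell
set -e
work=$(mktemp -d)
git init --quiet "$work/product"
cd "$work/product"
git symbolic-ref HEAD refs/heads/master   # use "master" regardless of local defaults
git config user.name "Tim"
git config user.email "tim@example.com"

echo "stable code" > product.txt
git add product.txt
git commit --quiet -m "Initial product code"

# Tim builds his feature on its own branch.
git checkout --quiet -b tims_new_feature
echo "new feature" > feature.txt
git add feature.txt
git commit --quiet -m "Add new feature"

# Switch to the branch that should RECEIVE the changes, then merge into it.
git checkout --quiet master
git merge --quiet --no-ff -m "Merge tims_new_feature into master" tims_new_feature

# master now contains the feature plus a merge commit.
git log --oneline --graph
```

With a server involved, a git push origin master at this point would publish the merge for everyone else.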

Other Concepts: Commit IDs

With any version control system, you can gain access to how the code looked at any given point in time, as long as you know which commit you're looking for. So if you want to get the product code in whatever state it was after Joe Smith committed his changes for "Bug Fix A", you can absolutely do that. You simply need to know the commit ID.

In some other version control systems, such as Subversion, commit IDs are numbers that just increase by 1 for each commit, so you have commit #1, then commit #2, etc. In Git, the commits are identified by a series of letters and numbers (e.g. "abcd1234ef677890"), called a "hash." It's a unique identifier and helps Git to organize things on the backend a little better. Git also offers a log command that will let you see the commits by their messages, so you're not trying to memorize or write down a bunch of commit IDs.
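
As an illustration, the sketch below makes three commits, uses the log to find the ID of the "Bug Fix A" commit by its message, and then shows a file as it looked at that point. The file name and contents are invented; the hashes you see will differ from run to run.

```shell
set -e
work=$(mktemp -d)
git init --quiet "$work/product"
cd "$work/product"
git config user.name "Joe Smith"
git config user.email "joe@example.com"

echo "buggy" > app.txt
git add app.txt
git commit --quiet -m "Initial version"

echo "fixed" > app.txt
git commit --quiet -am "Bug Fix A"

echo "later work" > app.txt
git commit --quiet -am "Later changes"

# List commits with shortened IDs and their messages.
git log --oneline

# Look up the full ID of the commit whose message mentions "Bug Fix A".
fix_id=$(git log --format=%H --grep="Bug Fix A")
echo "$fix_id"

# Show the file exactly as it was at that commit.
git show "$fix_id:app.txt"   # prints "fixed"
```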

Final Thoughts

I personally find it intimidating to go into the actual commands for all of these items right away, and there are a lot of things this article does not cover because there are so many ways in which Git can be configured to work with specific workflows.  That said, I would suggest that beginners start with an excellent (and free) GUI program called SourceTree. It takes care of many of the commands for you until you get the hang of the workflow (at which point, you might feel more comfortable using the commands themselves).

Copyright © 2013 - Jonathan Hilgeman. All Rights Reserved. 
Comments (4)

Commented:
B-) In my search for a solution to my problem with GitLab (https://www.experts-exchange.com/questions/28424640/Problem-with-Gitlab.html), I had some hope of finding some clues here. It seems you might like to give a more detailed view of item 1 (setting up the Git repository server), e.g. using turnkey solutions such as GitLab or Gitorious.
Terry Woods, Web Developer, specialising in WordPress

Commented:
Thanks for writing this; it was just what I was looking for!
Commented:
+1 Good intro for creating a mental picture of what Git is about.

"You can make it work on Windows servers, but it's a pain to set up." To clarify that: installing Git on Windows is extremely easy. What may be more difficult is making it work over a protocol other than the "file" protocol. On the other hand, I have found myself always using the "file" protocol, or cloning from repositories hosted elsewhere (which one does not need to set up). I believe very few beginners will need to set up a Git server. I believe that even a lot of intermediate or advanced programmers will never set up a Git server either.

Using Git the simplest way (that is, using the "file" protocol) is so easy that only unexplainable fear would be an excuse not to use Git, even for the simplest project where one normally would not bother.

Even if it were for a single purpose only, "copying" the project to removable media (a flash drive or portable USB disk) and back works great with Git, with the kept history being a bonus.
Jim Horn, SQL Server Data Dude

Commented:
Very well written from a 101 perspective.  Voted Yes.