SVN (by Professor Cappello for CS 50, Winter 2007)

What is version control?

Version control is the management of changes to information. A version control system enables collaborative editing and sharing of that information.

Why is it useful?

For a team

It allows a set of people to concurrently use and change the information in a set of related files.

For a single person

It allows that person to start a set of complex changes to their files without worrying about whether the files can be restored to their prior state.

What kinds of files is version control good for?

  • .java
  • .class
  • html
    • Javadoc
  • libraries, such as JUnit
  • ...

Must I use version control for my project?

Yes. It enables multiple team members and programming pairs to work concurrently on the project. Use of a version control tool, such as Subversion, results in fewer errors and a freer style of project development.

Versioning Models

(Based on the material in the SVN book)

The Problem of File-Sharing

All version control systems address the same problem: how does the system enable users to share information, but prevent them from destructively interfering with each other? It's all too easy for users to accidentally overwrite each other's changes in the repository.

Consider the scenario shown in Figure 2.2. Suppose we have two co-workers, Harry and Sally. They each decide to edit the same repository file at the same time. If Harry saves his changes to the repository first, then it's possible that (a few moments later) Sally could accidentally overwrite them with her own new version of the file. While Harry's version of the file won't be lost forever (because the system remembers every change), any changes Harry made won't be present in Sally's newer version of the file, because she never saw Harry's changes to begin with. Harry's work is still effectively lost, or at least missing from the latest version of the file, and probably by accident. This is definitely a situation we want to avoid!

Figure 2.2. The problem to avoid
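
The overwrite can be reproduced with a toy model. This is only a sketch: the dictionary stands in for a shared file store with no version control at all, and the file contents are made up for illustration.

```python
# Toy model of the "problem to avoid": a shared file with no version control.
repository = {"A": "line 1\nline 2\n"}

# Harry and Sally both read the same starting contents.
harry_copy = repository["A"]
sally_copy = repository["A"]

# Each edits a private copy.
harry_copy += "Harry's line\n"
sally_copy += "Sally's line\n"

# Harry saves first; Sally then saves without ever seeing Harry's change.
repository["A"] = harry_copy
repository["A"] = sally_copy

# The latest version contains Sally's line but not Harry's.
print(repository["A"])
```

Nothing in this model warns Sally that the file changed after she read it, which is the gap a version control system has to close.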


The Lock-Modify-Unlock Solution

Many version control systems use a lock-modify-unlock model to address this problem. In such a system, the repository allows only one person to change a file at a time. Harry must “lock” the file before he can change it. If Sally tries to lock the file, the repository will deny the request. All she can do is read the file, and wait for Harry to finish his changes and release his lock. After Harry unlocks the file, Sally can lock and edit it.

Figure 2.3. The lock-modify-unlock solution
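
A minimal sketch of the lock-modify-unlock rule follows. The class and method names are hypothetical; they do not correspond to any real version control tool's API.

```python
class LockingRepository:
    """Toy repository that allows one writer per file at a time."""

    def __init__(self):
        self.files = {}   # path -> contents
        self.locks = {}   # path -> name of the user holding the lock

    def lock(self, path, user):
        """Grant the lock only if nobody else holds it."""
        holder = self.locks.get(path)
        if holder is not None and holder != user:
            return False
        self.locks[path] = user
        return True

    def write(self, path, user, contents):
        """Only the lock holder may change the file."""
        if self.locks.get(path) != user:
            raise PermissionError(f"{user} does not hold the lock on {path}")
        self.files[path] = contents

    def unlock(self, path, user):
        """Release the lock if this user holds it."""
        if self.locks.get(path) == user:
            del self.locks[path]


repo = LockingRepository()
harry_got_lock = repo.lock("A", "harry")        # granted
sally_got_lock = repo.lock("A", "sally")        # denied: Harry holds the lock
repo.write("A", "harry", "Harry's version\n")
repo.unlock("A", "harry")
sally_got_lock_later = repo.lock("A", "sally")  # granted after Harry unlocks
```

If Harry never calls unlock, Sally stays blocked, which is the administrative problem that locking can cause.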


The lock-modify-unlock model is restrictive, and often becomes a development bottleneck:

  • Locking may cause administrative problems. Harry might lock a file and then forget about it, leaving Sally waiting to edit it. If Harry cannot be contacted, Sally has to get an administrator to release his lock. Locking also forces coordination that may not be necessary. The result is unnecessary delay and wasted time.

  • Locking may cause unnecessary serialization. What if Harry is editing the beginning of a text file, and Sally simply wants to edit the end of the same file? These changes don't overlap at all. They could easily edit the file simultaneously without harm, assuming the changes were properly merged together.

  • Locking may create a false sense of security. Pretend that Harry locks and edits file A, while Sally simultaneously locks and edits file B. But suppose that A and B depend on one another, and the changes made to each are semantically incompatible. Suddenly A and B don't work together anymore. The locking system was powerless to prevent the problem—yet it somehow provided a false sense of security. It's easy for Harry and Sally to imagine that by locking files, each is beginning a safe, insulated task, and thus not bother discussing their incompatible changes early on.

The Copy-Modify-Merge Solution

Subversion, CVS, and other version control systems use a copy-modify-merge model. In this model, each user's client contacts the project repository and creates a personal working copy—a local reflection of the repository's files and directories. Developers concurrently modify their private copies. The private copies are merged into a new version. The version control system assists with the merging, but ultimately a human being is responsible for making it happen correctly.

Say that Harry and Sally each create working copies of the same project, copied from the repository. They concurrently make changes to the same file A within their copies. Sally saves her changes to the repository first. When Harry attempts to save his changes later, the repository informs him that his file A is out-of-date: file A in the repository has changed since he last copied it. Harry asks his client to merge any new changes from the repository into his working copy of file A. Chances are that Sally's changes don't overlap with his own; so once he has both sets of changes integrated, he saves his working copy back to the repository.

Figure 2.4. The copy-modify-merge solution


Figure 2.5. The copy-modify-merge solution (continued)
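
The out-of-date check at the heart of copy-modify-merge can be sketched as follows. This is a toy model for one file, not how Subversion is implemented; the Repository class and its methods are hypothetical, and the merge is done by hand.

```python
class Repository:
    """Toy repository that keeps every committed revision of one file."""

    def __init__(self, contents):
        self.revisions = [contents]   # revisions[n] is revision n

    @property
    def head(self):
        return len(self.revisions) - 1

    def checkout(self):
        """Return (base revision number, private copy of the contents)."""
        return self.head, self.revisions[self.head]

    def commit(self, base_revision, contents):
        """Reject the commit if the working copy is based on an old revision."""
        if base_revision != self.head:
            raise RuntimeError("out of date: update and merge first")
        self.revisions.append(contents)
        return self.head


repo = Repository("bread\n")
harry_base, harry_text = repo.checkout()
sally_base, sally_text = repo.checkout()

# Sally commits first and creates revision 1.
sally_rev = repo.commit(sally_base, sally_text + "lettuce\n")

# Harry's commit is refused because his copy is still based on revision 0.
try:
    repo.commit(harry_base, "mayonnaise\n" + harry_text)
    harry_rejected = False
except RuntimeError:
    harry_rejected = True

# Harry updates to revision 1, reapplies his change (the merge), and commits.
harry_base, latest = repo.checkout()
harry_rev = repo.commit(harry_base, "mayonnaise\n" + latest)
```

Unlike the shared-file case, no change is overwritten silently: the repository refuses the stale commit, and every earlier revision remains available.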


What if Sally's changes overlap Harry's changes? This is called a conflict. When Harry asks his client to merge the latest repository changes into his working copy, his copy of file A is flagged as being in conflict: He can see both sets of conflicting changes, and manually choose between them. The software does not automatically resolve conflicts. This responsibility is delegated to humans. Once Harry has resolved the overlapping changes—perhaps after a discussion with Sally—he can save the merged file back to the repository.
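
One way to picture "overlap" is to compare each person's edits against the common starting version and ask whether they touch the same lines. The sketch below uses Python's difflib for that. It is an illustration under simplifying assumptions, not Subversion's merge algorithm: it treats touching edits as overlapping, and it reports a conflict even when both people made the identical change.

```python
import difflib


def changed_ranges(base, new):
    """Return (start, end) line ranges of base that differ in new."""
    matcher = difflib.SequenceMatcher(None, base, new, autojunk=False)
    return [(i1, i2)
            for tag, i1, i2, _j1, _j2 in matcher.get_opcodes()
            if tag != "equal"]


def has_conflict(base, mine, theirs):
    """True if my edits and their edits touch the same region of base."""
    for a1, a2 in changed_ranges(base, mine):
        for b1, b2 in changed_ranges(base, theirs):
            if a1 <= b2 and b1 <= a2:
                return True
    return False


base = ["top bread", "lettuce", "tomato", "cheese", "bottom bread"]
harry = ["top bread", "mayonnaise", "lettuce", "tomato", "cheese", "bottom bread"]
sally_far = ["top bread", "lettuce", "tomato", "cheese", "mustard", "bottom bread"]
sally_near = ["top bread", "sauerkraut", "lettuce", "tomato", "cheese", "bottom bread"]

no_overlap = has_conflict(base, harry, sally_far)    # edits are far apart
overlap = has_conflict(base, harry, sally_near)      # both insert after line 1
```

Only the second case needs a person to decide what the merged file should contain.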

The copy-modify-merge model may sound chaotic, but in practice, it runs extremely smoothly. Developers work concurrently. When they work on the same files, it turns out that most of their concurrent changes don't overlap; conflicts are infrequent. The amount of time it takes to resolve conflicts generally is far less than the time lost by a locking system.

No system can force users to communicate perfectly, and no system can detect semantic conflicts. So there's no point in being lulled into a false promise that a locking system will somehow prevent conflicts; in practice, locking seems to inhibit productivity more than anything else.

What am I expected to know about version control for this course?

  • What version control is
  • Why it is useful
  • The lock-modify-unlock model of version control
  • The copy-modify-merge model of version control
  • Subversion commands
    • checkout
    • update
    • add
    • delete
    • move
    • status
    • revert
    • resolved
    • commit

Subversion's basic commands

Creating your work space:

  • svn checkout <url of your project repository>

The typical work cycle looks like this:

  • Update your working copy

    • svn update // repository changes subsequent to this copy (or last update) are merged into this copy.

  • Make changes

    • svn add <file | directory>

    • svn delete <file | directory>

    • svn copy <source> <destination>

    • svn move <source> <destination>

  • Examine your changes

    • svn status // shows modified, added, removed, or renamed files since last commit

    • svn diff // to see a unified diff output of your changes

    • svn revert <file | directory> // discard local changes and undo scheduled operations

  • Merge others' changes into your working copy

    • svn update

    • svn resolved <file> // tell Subversion that you have resolved the conflict in this file

  • Commit your changes

    • svn commit  // merge my working copy into the repository, making a new version.

Update Your Working Copy

$ svn update
U foo.c
U bar.c
Updated to revision 2.

In this case, someone else checked in modifications to both foo.c and bar.c since the last time you updated, and Subversion has updated your working copy to include those changes.

When the server sends changes to your working copy, a letter code is displayed next to each item to let you know what actions Subversion performed to bring your working copy up-to-date:

U foo

File foo was Updated (received changes from the server).

A foo

File or directory foo was Added to your working copy.

D foo

File or directory foo was Deleted from your working copy.

R foo

File or directory foo was Replaced in your working copy: foo was deleted, and a new item with the same name was added. Subversion considers them to be distinct objects.

G foo

File foo changed in the repository, and you changed the local copy. Either the changes did not intersect, or the changes were exactly the same as your local modifications. Subversion successfully merGed the repository's changes into the file.

C foo

File foo received Conflicting changes from the server: The changes overlap your changes and need to be resolved by you; we explain how below.
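
The letter codes are easy to process mechanically. The sketch below is a hypothetical helper, not part of Subversion; it assumes the simplified one-letter, one-space output format shown in these notes (real svn output may use more columns) and picks out the items that need attention.

```python
UPDATE_CODES = {
    "U": "updated",
    "A": "added",
    "D": "deleted",
    "R": "replaced",
    "G": "merged",
    "C": "conflict",
}


def parse_update_output(text):
    """Return (list of (meaning, path), revision number or None)."""
    items = []
    revision = None
    for line in text.splitlines():
        if line.startswith("Updated to revision "):
            revision = int(line.split()[-1].rstrip("."))
        elif line[:1] in UPDATE_CODES and line[1:2] == " ":
            items.append((UPDATE_CODES[line[0]], line[1:].strip()))
    return items, revision


sample = "U INSTALL\nG README\nC bar.c\nUpdated to revision 46.\n"
items, revision = parse_update_output(sample)
conflicts = [path for meaning, path in items if meaning == "conflict"]
print(conflicts)
```

Here only bar.c ends up in the list of conflicts; the U and G items need no action from you.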

Examine Your Changes

svn status

$ svn status
M bar.c # the content in bar.c has local modifications
? foo.o # svn doesn't manage foo.o
! some_dir # svn manages this, but it's either missing or incomplete
D stuff/fish.c # this file is scheduled for deletion
A stuff/loot/bloo.h # this file is scheduled for addition
C stuff/loot/lump.c # this file has conflicts from an update
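
Before committing, it is worth scanning the status listing for items that usually signal a mistake. The following sketch is a hypothetical checklist helper. It assumes the annotated listing format used in these notes (a code, a path, and an optional " # " comment), and the choice of which codes count as problems is an assumption, not a Subversion rule.

```python
ATTENTION = {
    "?": "not under version control (forgot svn add?)",
    "!": "missing or incomplete",
    "C": "in conflict",
}


def needs_attention(status_text):
    """Return (path, reason) pairs for status lines that look like problems."""
    problems = []
    for line in status_text.splitlines():
        line = line.split(" # ", 1)[0].rstrip()   # drop the explanatory comment
        code, path = line[:1], line[1:].strip()
        if code in ATTENTION and path:
            problems.append((path, ATTENTION[code]))
    return problems


listing = """\
M bar.c # the content in bar.c has local modifications
? foo.o # svn doesn't manage foo.o
! some_dir # svn manages this, but it's either missing or incomplete
D stuff/fish.c # this file is scheduled for deletion
A stuff/loot/bloo.h # this file is scheduled for addition
C stuff/loot/lump.c # this file has conflicts from an update
"""

problems = needs_attention(listing)
for path, reason in problems:
    print(f"{path}: {reason}")
```

An unversioned file such as foo.o is often harmless build output, so treat the result as a prompt to look, not as an error report.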

svn diff

$ svn diff
Index: bar.c
===================================================================
--- bar.c (revision 3)
+++ bar.c (working copy)
@@ -1,7 +1,12 @@
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
+
+#include <stdio.h>

 int main(void) {
-  printf("Sixty-four slices of American Cheese...\n");
+  printf("Sixty-five slices of American Cheese...\n");
   return 0;
 }

Index: README
===================================================================
--- README (revision 3)
+++ README (working copy)
@@ -193,3 +193,4 @@
+Note to self: pick up laundry.

Index: stuff/fish.c
===================================================================
--- stuff/fish.c (revision 1)
+++ stuff/fish.c (working copy)
-Welcome to the file known as 'fish'.
-Information on fish will be here soon.

Index: stuff/things/bloo.h
===================================================================
--- stuff/things/bloo.h (revision 8)
+++ stuff/things/bloo.h (working copy)
+Here is a new file to describe
+things about bloo.

svn revert

$ svn revert README
Reverted 'README'

Subversion reverts the file to its pre-modified state.

svn revert can undo any scheduled operation.

$ svn status foo
? foo

$ svn add foo
A foo

$ svn revert foo
Reverted 'foo'

$ svn status foo
? foo

Resolve Conflicts (Merging Others' Changes)

$ svn update
U INSTALL
G README
C bar.c
Updated to revision 46.

The C stands for conflict. You have to manually choose between your changes to bar.c and those in the repository.

Three things typically occur to assist you in noticing and resolving that conflict:

  • Subversion prints a C during the update.

  • If Subversion considers the file to be mergeable (i.e., capable of line-based modification), it places conflict markers—special strings of text which delimit the “sides” of the conflict—into the file to visibly demonstrate the overlapping areas.

  • For every conflicted file, Subversion places up to 3 unversioned files in your working copy:

    filename.mine

    Your file as it existed in your working copy before you updated, without conflict markers.

    filename.r<OLDREV>

    The BASE revision of your working copy: the file as it was before you made your latest edits.

    filename.r<NEWREV>

    This file corresponds to the HEAD revision of the repository.

    OLDREV is the revision number of the file in your .svn directory and NEWREV is the revision number of the repository HEAD.

For example, Sally changes the file sandwich.txt in her working copy. Meanwhile, Harry changes the same file in his working copy and commits it. When Sally updates her working copy before committing, she gets a conflict:

$ svn update
C sandwich.txt
Updated to revision 2.
$ ls -1
sandwich.txt
sandwich.txt.mine
sandwich.txt.r1
sandwich.txt.r2

Subversion will not allow you to commit the file sandwich.txt until the 3 temporary files are removed.

$ svn commit --message "Add a few more things"
svn: Commit failed (details follow):
svn: Aborting commit: '/home/sally/svn-work/sandwich.txt' remains in conflict

You need to either:

  • Merge the conflicted text “by hand” (by examining and editing the conflict markers within the file).

  • Copy one of the temporary files on top of your working file.

  • Run svn revert <filename> to throw away all of your local changes.

Once you've resolved the conflict, you let Subversion know by running svn resolved. This removes the 3 temporary files. Subversion no longer considers the file to be in conflict.

$ svn resolved sandwich.txt
Resolved conflicted state of 'sandwich.txt'

Merging Conflicts by Hand

You and Sally both edit the file sandwich.txt at the same time. Sally commits her changes. When you update your working copy, you get a conflict. Edit sandwich.txt to resolve the conflicts.

$ cat sandwich.txt
Top piece of bread
Mayonnaise
Lettuce
Tomato
Provolone
<<<<<<< .mine
Salami
Mortadella
Prosciutto
=======
Sauerkraut
Grilled Chicken
>>>>>>> .r2
Creole Mustard
Bottom piece of bread

The lines of less-than signs, equal signs, and greater-than signs are conflict markers: they are not part of the actual data in conflict. Make sure they are removed from the file before your next commit. The text between the first two sets of markers is composed of the changes you made in the conflicting area:

<<<<<<< .mine
Salami
Mortadella
Prosciutto
=======

The text between the 2nd and 3rd sets of conflict markers is the text from Sally's commit:

=======
Sauerkraut
Grilled Chicken
>>>>>>> .r2

You talk to Sally, and do the right thing (discard Sally's edits):

Top piece of bread
Mayonnaise
Lettuce
Tomato
Provolone
Salami
Mortadella
Prosciutto
Creole Mustard
Bottom piece of bread
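
The hand edit above amounts to keeping the .mine side of the conflict region and dropping the markers. The sketch below does the same thing programmatically. It is a hypothetical helper covering only the all-or-nothing case; a real resolution often combines text from both sides, which still needs a human.

```python
def resolve(lines, keep):
    """Return lines with each conflict region replaced by one side.

    keep is "mine" (the text before =======) or "theirs" (the text after it).
    """
    if keep not in ("mine", "theirs"):
        raise ValueError("keep must be 'mine' or 'theirs'")
    result = []
    side = None   # None means we are outside any conflict region
    for line in lines:
        if side is None and line.startswith("<<<<<<<"):
            side = "mine"
        elif side == "mine" and line.startswith("======="):
            side = "theirs"
        elif side == "theirs" and line.startswith(">>>>>>>"):
            side = None
        elif side is None or side == keep:
            result.append(line)
    return result


conflicted = [
    "Top piece of bread", "Mayonnaise", "Lettuce", "Tomato", "Provolone",
    "<<<<<<< .mine", "Salami", "Mortadella", "Prosciutto",
    "=======", "Sauerkraut", "Grilled Chicken", ">>>>>>> .r2",
    "Creole Mustard", "Bottom piece of bread",
]

my_sandwich = resolve(conflicted, "mine")
sallys_sandwich = resolve(conflicted, "theirs")
print("\n".join(my_sandwich))
```

Calling resolve with "mine" reproduces the sandwich shown above; "theirs" would keep Sally's Sauerkraut and Grilled Chicken instead.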

Run svn resolved, then commit:

$ svn resolved sandwich.txt
$ svn commit -m "Go ahead and use my sandwich, discarding Sally's edits."

Copying a File Onto Your Working File

If you get a conflict and decide that you want to throw out your changes, you can simply copy one of the temporary files created by Subversion over the file in your working copy. Here, sandwich.txt.r2 holds the repository's latest version:

$ svn update
C sandwich.txt
Updated to revision 2.
$ ls sandwich.*
sandwich.txt sandwich.txt.mine sandwich.txt.r2 sandwich.txt.r1
$ cp sandwich.txt.r2 sandwich.txt
$ svn resolved sandwich.txt

Using svn revert

If you get a conflict and want to throw out your changes, revert:

$ svn revert sandwich.txt
Reverted 'sandwich.txt'
$ ls sandwich.*
sandwich.txt

When you revert a conflicted file, you don't have to run svn resolved.

svn resolved requires an argument. Only run svn resolved when you're certain that you've fixed the conflict in your file—once the temporary files are removed, Subversion lets you commit the file even if it still contains conflict markers.
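
Since Subversion will not stop you from committing leftover markers, a quick check before running svn resolved can help. This is a minimal sketch of such a check, assuming markers look like the ones in these notes (seven repeated characters at the start of a line, followed by a space or the end of the line). The function name is hypothetical.

```python
import re

# Seven '<', '=' or '>' characters at the start of a line,
# followed by a space or the end of the line.
MARKER = re.compile(r"^(<{7}|={7}|>{7})( |$)")


def remaining_markers(text):
    """Return the 1-based line numbers that still hold conflict markers."""
    return [number
            for number, line in enumerate(text.splitlines(), start=1)
            if MARKER.match(line)]


conflicted = (
    "Provolone\n"
    "<<<<<<< .mine\n"
    "Salami\n"
    "=======\n"
    "Sauerkraut\n"
    ">>>>>>> .r2\n"
    "Creole Mustard\n"
)
fixed = "Provolone\nSalami\nCreole Mustard\n"

still_marked = remaining_markers(conflicted)
clean = remaining_markers(fixed)
```

An empty list means no markers were found. It does not mean the merged text is correct, so still read the file before committing.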

Commit Your Changes

The commit operation requires a log message (describing your change). Your log message is attached to the new revision. If your log message is brief, you may supply it on the command line using the --message (or -m) option:

$ svn commit --message "Corrected number of cheese slices."
Sending sandwich.txt
Transmitting file data .
Committed revision 3.

If you are composing your log message in a separate file as you work, tell Subversion to get the message from the file:

$ svn commit --file logmsg 
Sending sandwich.txt
Transmitting file data .
Committed revision 4.

If you specify neither the --message nor the --file option, Subversion launches a text editor for composing the log message (the one configured for Subversion, for example through the SVN_EDITOR or EDITOR environment variable).

If you get:

$ svn commit --message "Add another rule"
Sending rules.txt
svn: Commit failed (details follow):
svn: Out of date: 'rules.txt' in transaction 'g'

Run svn update, deal with any merges or conflicts that result, and re-commit.
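
That recovery sequence can be sketched as a small function. Everything here is an assumption for illustration: `run` is a hypothetical helper that takes a command as a list of arguments and returns (exit code, output), the function recognizes the failure by the "Out of date" text shown above, and it spots conflicts by a leading "C " as in the update listings in these notes. The demonstration uses scripted responses instead of a real repository.

```python
import subprocess


def run_svn(argv):
    """Hypothetical runner: execute a command and return (exit code, output)."""
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.returncode, result.stdout + result.stderr


def commit_with_update(run, message):
    """Try to commit; if the working copy is out of date, update once and retry."""
    commit = ["svn", "commit", "--message", message]
    code, output = run(commit)
    if code == 0:
        return "committed"
    if "Out of date" not in output:
        return "failed"
    code, output = run(["svn", "update"])
    if code != 0:
        return "failed"
    if any(line.startswith("C ") for line in output.splitlines()):
        return "conflict"   # a person has to resolve this before committing
    code, _ = run(commit)
    return "committed" if code == 0 else "failed"


def scripted(responses):
    """Build a fake runner that replays canned (exit code, output) pairs."""
    calls = []

    def run(argv):
        calls.append(argv)
        return responses.pop(0)

    return run, calls


run, calls = scripted([
    (1, "svn: Out of date: 'rules.txt' in transaction 'g'\n"),
    (0, "G rules.txt\nUpdated to revision 5.\n"),
    (0, "Committed revision 6.\n"),
])
outcome = commit_with_update(run, "Add another rule")
```

With a real working copy you would pass run_svn instead of the scripted runner. In practice you would also rebuild and retest after the merge before committing again, which this sketch skips.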

To learn more

Please look at the Subversion part of the course Resources page.