A rough dissection of a recent sync between live and stage

A wanted our staging version of our internal site to get synced with production. Technically, we all did, but it was A's request that really kicked off the process. In the past, doing this meant that someone had to manually copy all of the files from the production side to the development side, after first making a copy so that developers could find and reapply their pending changes. With svn, things were a little more straightforward, but still required a certain amount of work.

B and I went into the production side and ran

$ svn diff

This spit out screens full of changes, and then errored out. Someone had deleted a directory without telling svn. The solution was pretty easy:

$ cd dir
$ svn revert missing_dir
$ svn rm missing_dir
$ svn commit missing_dir

Appropriate check-in comments were left, and away we went to the next missing directory. Finally, svn diff ran all the way through, but it was a huge number of files to commit all at once -- and surely they didn't all relate to the same individual changes, right?
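Rather than tripping over missing directories one svn diff error at a time, the whole list can be pulled out of svn status, which flags anything that is under version control but missing from disk with a "!". Here is a minimal sketch; list_missing is a made-up helper name (not an svn command), and it assumes paths without spaces:

```shell
# list_missing: read `svn status` output on stdin and print the paths
# that svn flags with "!" (versioned, but missing from disk).
# Hypothetical helper; assumes paths contain no spaces.
list_missing() {
    awk '/^!/ { print $NF }'
}

# Usage, from the top of a working copy:
#   svn status | list_missing
```

Each path it prints is a candidate for the revert / rm / commit dance shown above.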

An easy way to see the list of changed files is

$ svn status

You can pipe that through grep to see *only* the modified files:

$ svn status | egrep '^M'

You can replace the M with other statuses. I'll leave that as an exercise for the reader. In any case, ideally, we would have taken a list of modified files and figured out which ones belonged to which sets of changes, but some bulk checkins were done under a blanket "getting up to date" style checkin. This isn't optimal, but it works.
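One rough way to start sorting modified files into sets of changes is to count them per top-level directory. This is only a sketch: modified_by_dir is a hypothetical helper, it assumes paths without spaces, and grouping by directory is my guess at a useful first cut, not something we actually did.

```shell
# modified_by_dir: read `svn status` output on stdin and count modified
# files ("M" in the first column) per top-level directory.
# Hypothetical helper; assumes paths contain no spaces.
modified_by_dir() {
    awk '/^M/ {
        path = $NF
        n = split(path, parts, "/")
        # Files sitting directly in the working copy root are filed under "."
        top = (n > 1) ? parts[1] : "."
        count[top]++
    }
    END { for (d in count) print count[d], d }' | LC_ALL=C sort -k2
}

# Usage, from the top of a working copy:
#   svn status | modified_by_dir
```

From there, "svn commit some_dir" checks in one directory's worth of changes with its own comment.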

$ svn commit

Pow! Now the subversion repository has copies of everything that was in the production side.

$ cd ../stage

Let's pretend that that command put us in the correct directory.

Now, what kind of status is stage in? If only there were a way to tell what had changed. Oh!

$ svn diff

Pages and pages later, we realize that a lot has been changed. We want to update, but we don't want to lose our changes.

If we didn't care, we would just do:

$ svn revert -R .

and subversion would throw away all of our local changes, putting every file back to the revision we last checked out or updated to. (A bare "svn revert" with no target does nothing; it needs a path, and -R to recurse.) Follow that with an svn update and, since we just checked in the production side, stage would match it fairly exactly (minus generated files). But, we don't want to lose our changes, so instead we just do:

$ svn update

Assuming that nothing tragic happens and there are no other problems like filesystem permissions or lock contention, subversion should go through every file and directory and apply changes to bring the stage side up to date without overwriting anything that had already changed on stage. The merging of changes happens not on a file-by-file basis, but on a line-by-line basis.

Let's explain that again, because it's important. I will use a completely fictional example. A and B are both checked out of the repository at the same time. This is not exactly how our production and stage servers were put into svn, but the operation is very similar, so bear with me.

$ svn co path_to_svn/some_project A
$ svn co path_to_svn/some_project B
$ cat A/monkey.txt
foo
bar
baz
$ emacs A/monkey.txt
$ cat A/monkey.txt
foo
monkey
baz
$ echo pirate >>B/monkey.txt
$ cat B/monkey.txt
foo
bar
baz
pirate
$ svn commit A/monkey.txt
(svn messages omitted)
$ svn update B/monkey.txt
(svn messages omitted)
$ cat B/monkey.txt
foo
monkey
baz
pirate

So, you can see that after the various svn magicks, B/monkey.txt has the changes from A and B.
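If you want to poke at the line-by-line merge without a repository, the same three-way idea can be tried with plain files. This is a sketch that uses GNU diff3 as a stand-in for svn's own merge code (an assumption for illustration; svn does this internally), with made-up file names:

```shell
# Recreate the monkey.txt example with plain files, using GNU diff3
# as a stand-in for svn's built-in merge.
workdir=$(mktemp -d)
printf 'foo\nbar\nbaz\n'         > "$workdir/base.txt"  # what both checkouts started with
printf 'foo\nmonkey\nbaz\n'      > "$workdir/a.txt"     # A's edit, already committed
printf 'foo\nbar\nbaz\npirate\n' > "$workdir/b.txt"     # B's local, uncommitted edit

# diff3 -m MINE OLDER YOURS prints the merged file. It exits non-zero
# when the two sets of changes overlap, which is what svn calls a conflict.
merged=$(diff3 -m "$workdir/b.txt" "$workdir/base.txt" "$workdir/a.txt")
echo "$merged"
rm -rf "$workdir"
```

The two edits touch different lines with an unchanged line between them, so both survive in the merged result, just as they did in B/monkey.txt.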

Let's do one more thing, just for show:

$ svn revert B/monkey.txt
(svn messages omitted)
$ cat B/monkey.txt
foo
monkey
baz

This is important, because sometimes you want to throw away changes that you have made. With our stage site, it was assumed that we started with all files up to date with the live side, but it is possible that some things were *older* than on production, rather than *newer*. As such, those files might have to be reverted to be overwritten with the right stuff.

But let's get back from the theoretical to the reality.

$ svn update
(svn messages omitted)

Now, some of those omitted messages involved conflicts. There *were* conflicts. Ain't no thing: we can fix them! But how do we find them?

$ svn status | egrep '^C'
C some_dir/some_conflicting_file.php

I went through and fixed a bunch of conflicts one at a time. I find it useful to search files for <<<<, since that will get to the start of a conflict block. Conflicts also show up when you do an

$ svn diff

so, you can look for them that way, too. I used my best judgment to get files up to date, but there are many files that differ from the live site.
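Searching file by file gets old quickly, so here is a sketch of a helper that finds every file still carrying a conflict marker. find_conflicted is a hypothetical name, and the --exclude-dir option assumes GNU or BSD grep:

```shell
# find_conflicted: print every file under a directory (default: the
# current one) that still contains a conflict marker line, skipping
# svn's own .svn metadata directories.
# Hypothetical helper; --exclude-dir needs GNU or BSD grep.
find_conflicted() {
    grep -rl --exclude-dir=.svn '^<<<<<<< ' "${1:-.}"
}

# Usage, from the top of a working copy:
#   find_conflicted
#   find_conflicted some_dir
```

This is also a handy last check before committing, since a file can be marked as resolved while the markers are still sitting in it.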

What is the final product?

Stage now has all the changes made to the production side since I first checked everything in. In most places, stage matches production exactly, but there are many unaccounted-for changes to stage that need to be audited and either checked in or thrown away. Do an

$ svn diff

today! The web sites will thank you for it!

Some notes about Subversion

Here's something I wrote up at work to start getting people up to speed on subversion. It's very basic, but you have to start somewhere.

With many people editing a site, it can be easy to lose track of who is editing what and what version of each file is in any given place. Let's imagine that we have a live site and a testing site which are both checked in to svn (subversion). We'll also imagine three developers, A, B, and C.

The live and test sites start out completely in sync with the repository.
A makes a change to live/index.php
B makes a change to test/monkey.php

Now, A should have made the change in test/index.php, tested it, and then pushed the changes to the live site, but perhaps it was a real emergency, so we can overlook it. More importantly, C has come along and wants to know if test/ is up to date. C can do an "svn status" to check.

test $ svn status
M monkey.php

The "M" stands for "modified." Here's a cheat sheet: http://knaddison.com/sites/knaddison.com/files/svn_codes.png

So, C does an "svn update"

test $ svn update
M monkey.php

However, C hasn't gotten the changes from the live site. This is A's fault. If A edits a file on the live site (surely only in an emergency), then A should check those changes in.

live $ svn commit index.php

An editor should open giving A the chance to explain their changes. A, being a thorough developer, types out a clear explanation.
"Fixed the spelling of our product name in the header."

Oh, I can see why we'd want that changed right away.

Now, A or C can go back to test and do an svn update to get the changes from the live server. No copying or backups are needed.

test $ svn update index.php

It's worth noting here that only index.php was updated, because it was named on the command line. You can also run svn update with no file name and get the updates for the current directory and all of its subdirectories and their files.

What about monkey.php? Did that just get overwritten? No! Subversion will do its best to resolve simple conflicts. monkey.php was not changed on the live server, so there are no changes to merge back into the test server. Let's imagine that more development is going on, and C changes monkey.php on the live site. In this instance, it was not an emergency, so C's manager should yell at C for not following procedures -- but it isn't the end of the world. Lots of places work this way, and while they are not making the best use of svn, they are still better off than they would be in the same situation without svn. C can at least recover.

live $ svn commit monkey.php

In the editor, C gives the following reason for their edit: "Added blink tag to body text." Wow, C, really? C should be in extra trouble for that. Now let's go to the test site:

test $ svn update monkey.php
OMGWTFBBQLOLCANHASCONFLICTS!

That's not the actual error message. The actual error message is more subtle:

test $ svn update monkey.php
C monkey.php

C is for conflict, it's good enough for me. There are also some new files in test/ now:

monkey.php.mine
monkey.php.r30
monkey.php.r31
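Those three files are the file as it stood on test/ before the update (.mine), the revision test/ was based on (.r30), and the revision that just came in (.r31). Plain diff will tell you who changed what. The file contents below are hypothetical stand-ins so that the sketch runs on its own:

```shell
# Hypothetical stand-ins for the three files svn leaves behind.
workdir=$(mktemp -d)
printf '<p>We are awesome!</p>\n'             > "$workdir/monkey.php.r30"   # revision test/ was based on
printf '<marquee>We are awesome!</marquee>\n' > "$workdir/monkey.php.mine"  # local edit on test/
printf '<blink>We are awesome!</blink>\n'     > "$workdir/monkey.php.r31"   # C's commit from live/

# diff exits 1 when files differ, which is expected here, so the
# "|| true" keeps the script going.
local_change=$(diff "$workdir/monkey.php.r30" "$workdir/monkey.php.mine" || true)
incoming_change=$(diff "$workdir/monkey.php.r30" "$workdir/monkey.php.r31" || true)

printf 'changed on test/:\n%s\n\ncame in from the repository:\n%s\n' \
    "$local_change" "$incoming_change"
rm -rf "$workdir"
```

In a real working copy you would skip the printf lines and just run the two diffs next to monkey.php.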

So, you've got all you need to figure out what happened. You can also just edit monkey.php and see what the conflict looks like. It will look something like this, with the version that was already on test/ on top and the version that came in from the repository underneath:

<<<<<<< .mine
<marquee>We are awesome!</marquee>
=======
<blink>We are awesome!</blink>
>>>>>>> .r31

Obviously, the blink tag is superior to the marquee tag, so C removes the other line and all of svn's conflict junk, leaving just the finished product. It's a good thing that this was done on test/ and not live/, since the extra angle brackets break html and/or php. With it fixed, C can do this:

test $ svn resolve --accept working monkey.php

and then

test $ svn commit monkey.php

Assuming that there were other changes, they should now be checked in.

== Some questions and answers ==

Whew, that's a lot of stuff to digest!
Can't A and B just make C do all the work, getting everything checked in and out in all the right places? Yes and no. They are the only ones who know what they have changed.

Should I make backups before making changes? No, you probably shouldn't, and if you do, they should *not* be in svn-controlled directories.

Can I put my log files right there in the svn controlled directory? You can, but you shouldn't. For one thing, we're svn controlling the web root, and you shouldn't put your log files in the web root. Just don't do it. Beyond that, it is possible to ignore files based on filename patterns, but those patterns require maintenance.
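For the curious, ignore patterns live in the svn:ignore property on a directory. Here is a sketch with a made-up pattern list; the guard makes it a no-op when run outside a working copy:

```shell
# Hypothetical pattern list, kept in a temp file so it does not show up
# as yet another unversioned file in the working copy.
patterns=$(mktemp)
cat > "$patterns" <<'EOF'
*.log
*.tmp
*.bak
EOF

# Set the property on the current directory only (svn:ignore is not
# recursive), then commit the property change like any other edit.
if command -v svn >/dev/null 2>&1 && svn info . >/dev/null 2>&1; then
    svn propset svn:ignore -F "$patterns" .
fi
```

Because the property applies to one directory at a time, every directory that collects clutter needs its own copy of the list, which is the maintenance I was complaining about.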

Is svn magic? No. It won't do things for you.

Can we make our live site off limits to users so that they can only edit the test site and then push their changes to the live site? Yes! We sure can, but first we have to make the live site amenable to that style of management. That means that our team has to first manage its own changes in a professional, responsible way. We are all responsible for committing our own changes to svn and generally cleaning up the cruft that has been collecting on our customer-facing web sites for years.

That means finding a new home for ad hoc and regular log files, ad hoc backups, and temp files. Every change on live and test needs to be checked in atomically, and we need to make use of svn's tools for manipulating files. "svn rm" and "svn mv" allow svn to make the same changes to files on the live and test servers and track those changes for us.

WHERE IS THE SVN MAGICK YOU PROMISED?!?!?!?!!?!?!?!!????!?!!!!!!QUESTION MARKS!!!!!! I told you, it's not magic, but in not too long we can make some nice stuff happen automagically for us. :)
