Glad you could stop by the Linux Basement site. Linux Basement is an informational Podcast about Linux, open source software and lots of other wonderful technology. If you want to find out more about open source technologies, subscribe and have a listen!

#linuxbasement is up at irc.freenode.net

The Mother of All Commands - By Luke Scharf

Special thanks to Luke Scharf for this Tutorial/lesson in Linux.

This bash command is a handy way to do a network backup of
a machine during a rebuild, or for troubleshooting.  It's a great
exercise for those who want to understand bash, Unix I/O
redirection, and what they can do.

The beauty of this command is that tar, ssh, and bash were not intended
to do a network backup.  However, they were intended to work together --
so you can combine them to do useful work in ways that the authors may
not necessarily have intended.

*The Command:*

   laptop# ( cd / ; tar -zcvf - /home ) | ssh user@server.domain.tld 'cat > laptop_backup_`date -I`.tar.gz'

*What Does It Do?*
It's a quick way to do a network-backup that can be run from just about
any Unix command-prompt.  It preserves the usual filesystem metadata --
the path, permissions, modification times, and ownership of each
individual file and directory in the directory tree.  In other words,
this is the kind of backup that is actually useful for restoring files
in real life.
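
To see the metadata that tar records, you can build a small archive and list it with -t and -v.  The sketch below is only an illustration: it works in a throwaway directory from mktemp rather than in /home, and the names demo and notes.txt are made up.

```shell
# Build a tiny directory tree in a scratch location (names are illustrative).
work=$(mktemp -d)
mkdir -p "$work/demo"
echo "hello" > "$work/demo/notes.txt"
chmod 640 "$work/demo/notes.txt"

# Archive it the same way the backup does, but into a local file.
( cd "$work" ; tar -zcf demo.tar.gz demo )

# -t lists the archive; with -v, each line shows the permissions, owner,
# size, modification time, and path stored for that entry.
listing=$(tar -tzvf "$work/demo.tar.gz")
echo "$listing"

# Clean up the scratch directory.
rm -rf "$work"
```

The permission string in the listing comes from the archive itself, not from the filesystem the .tar.gz happens to be sitting on.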

*Broad Properties:*
One of the important properties is that it doesn't use any disk space on
the local machine ("Laptop") -- so if you have an 80GB HDD with 75GB of
data stored, you can still use tar to do this backup.  Unlike scp and
rsync, tar records the file permissions and ownership information
inside the archive itself -- so even if you only have user-level
privileges on the remote machine, nothing is lost and you can still
restore the files without much hassle.
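
Restoring reverses the pipeline: something writes the archive to stdout and tar reads it from stdin.  The following is a minimal local sketch of that idea, not a command from the original tutorial.  A plain cat stands in for the ssh side, and every path and file name in it is made up for illustration.

```shell
# Scratch area standing in for both machines (all names are illustrative).
work=$(mktemp -d)
mkdir -p "$work/src/home/gooduser" "$work/restore"
echo "important" > "$work/src/home/gooduser/file.txt"
chmod 600 "$work/src/home/gooduser/file.txt"

# Make a backup archive by streaming tar's stdout into a file.
( cd "$work/src" ; tar -zcf - home ) > "$work/backup.tar.gz"

# Restore: stream the archive into tar's stdin.  Over the network, the
# left side would be something like:
#   ssh user@server.domain.tld 'cat backup.tar.gz'
# -p asks tar to keep the recorded permissions; restoring the recorded
# *owners* generally requires running the extraction as root.
cat "$work/backup.tar.gz" | ( cd "$work/restore" ; tar -zxpf - )

# Check what came back: the file content and its permission string.
restored=$(cat "$work/restore/home/gooduser/file.txt")
mode=$(ls -l "$work/restore/home/gooduser/file.txt" | cut -c1-10)
echo "$restored $mode"

# Clean up the scratch directory.
rm -rf "$work"
```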

*Dissecting The Command:*
There are a lot of elements to this command...  I encourage you to
experiment with components of the command before trying the command in
its entirety.  Here is the breakdown:

   * "( cd / ; tar -zcvf - /home )"     # The parentheses group these
     two commands together (the semicolon separates the commands) and
     make this section into a mini script that runs in a subshell.
     So, tar will run from / and the output from cd (which will not
     create any output in this case) and the output from tar will be
     combined into one stream.
         o "cd /"     # just what you think
         o ";"      # Separates the two commands
         o tar -zcvf - /home
               + "-z"      # Compress the tar file with gzip -- no need
                 to do this separately.
               + "-c"      # Create a tar file (as opposed to extract)
               + "-v"      # Verbose - write a list of the files backed
                 up to stderr
               + "-f -"     # Write the resulting .tar.gz to a file.
                 In this case, we provide a special filename (the hyphen),
                 which instructs tar to send the .tar.gz to stdout.
         o "/home"     # This section of the tar command is the list of
           files/directories to back up.  In this case, I have only one
           entry in the list which is /home/.  Tar will recursively
           back up all files and subdirectories in /home/.  Another
           possible value here would be "/home/gooduser /home/happyuser
           /home/wonderful", which would back up those three
           directories and skip "/home/hasntloggedinlastyear".
   * "|"   # This is the middle of the command.  It's a pipe -- the
     standard output of "( cd / ; tar -zcvf - /home )" will be attached
     to the standard input of the next command.  This joins the two
     halves of the command.
   * ssh user@server.domain.tld 'cat > laptop_backup_`date
     -I`.tar.gz'     # This is just an ssh command.  It's not a
     compound command like the left side.  It does have several
     elements, though.
         o "user@server.domain.tld"     # This is the username and
           hostname of the remote machine.  This is everyday use of ssh.
         o "'cat > laptop_backup_`date -I`.tar.gz'"     # This is the
           command that is run on the remote machine.  This is also a
           regular non-special feature of ssh -- though I expect most
           people do something like "ssh
           luke@smurfserver.smurfvilliage.com ls".  But this is more
           complicated than it looks.
               + "cat"     # Why is the "cat" there?  It seems like you
                 should be able to just write out the file on the other
                 end by using the ">", but this is not the case -- ">"
                 is a shell directive, not a program, so by itself it
                 reads nothing.  sshd (on Server) needs some program
                 whose stdin it can feed with the data arriving from
                 tar on the Laptop.  Starting an instance of "cat"
                 provides that stdin, and cat copies everything it
                 reads to its stdout.
               + ">"     # Now that we have the "cat" in-place and sshd
                 has a place to send the data, we can redirect it to a
                 file.
               + laptop_backup_`date -I`.tar.gz     # This is the
                 filename of the tar that will be created on the remote
                 machine.  Since I didn't provide a relative or
                 absolute path, it'll dump the file in
                 user@server.domain.tld's home directory.  You can put
                 any valid writable path here, if you like.  But, wait,
                 there's more!  What's the deal with the backticks
                 around `date -I`?  The "date -I" command looks up
                 today's date (in the YYYY-MM-DD format) and echoes it
                 to stdout.  The backticks take that stdout and turn it
                 into a command-line parameter which, in this case, is
                 part of the filename!  So, if you run the same backup
                 tomorrow, you won't overwrite today's good backup.
                 But, wait, there's even more!  Since 'cat >
                 laptop_backup_`date -I`.tar.gz' term is enclosed in
                 single quotes, the local shell on Laptop won't process
                 the backticks -- they're passed on to the remote
                 system!  So, if the clocks are way off, the date will
                 be set according to the clock on the good stable
                 remote-server!  If you used double quotes ( "cat >
                 laptop_backup_`date -I`.tar.gz" ), the local shell on
                 Laptop would process the backticks instead!  Neat!
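
The quoting difference is easy to try without a second machine.  In the sketch below, "sh -c" stands in for the shell that ssh starts on Server, and a made-up variable WHERE (set to "laptop" in the outer shell and "server" for the inner one) takes the place of the two clocks.  Both are assumptions for the sake of the experiment.

```shell
# WHERE is a made-up marker for "which shell expanded the backticks".
WHERE=laptop
export WHERE

# Single quotes: the outer shell passes the backticks through untouched,
# so the inner shell (standing in for the remote one) expands them.
single=$(WHERE=server sh -c 'echo laptop_backup_`echo $WHERE`.tar.gz')

# Double quotes: the outer shell expands the backticks itself, before
# the inner shell is even started.
double=$(WHERE=server sh -c "echo laptop_backup_`echo $WHERE`.tar.gz")

echo "single quotes: $single"   # single quotes: laptop_backup_server.tar.gz
echo "double quotes: $double"   # double quotes: laptop_backup_laptop.tar.gz
```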

Done!  If you followed all of that then you do, indeed, understand the
subtleties of bash, ssh, and Unix I/O redirection!
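
If you'd like to rehearse the whole pipeline before pointing it at a real server, here is a local dry run.  It is a sketch under a few assumptions: "sh -c" plays the part of "ssh user@server.domain.tld", two scratch directories play Laptop and Server, -v is dropped to keep it quiet, and "date +%Y-%m-%d" is used because it prints the same YYYY-MM-DD format as "date -I" on systems whose date lacks -I.

```shell
# Scratch directories standing in for Laptop (src) and Server (server).
work=$(mktemp -d)
mkdir -p "$work/src/home/gooduser" "$work/server"
echo "data" > "$work/src/home/gooduser/file.txt"

# Left half: a subshell that changes directory and writes a gzipped tar
# stream to stdout.  Right half: a second shell whose "cat" reads that
# stream from stdin and whose ">" writes it to a dated file.
( cd "$work/src" ; tar -zcf - home ) |
    ( cd "$work/server" ; sh -c 'cat > laptop_backup_`date +%Y-%m-%d`.tar.gz' )

# Look at what landed on the "server" side.
archive=$(ls "$work/server")
contents=$(tar -tzf "$work/server/$archive")
echo "$archive"
echo "$contents"

# Clean up the scratch directory.
rm -rf "$work"
```

Because each half runs inside parentheses, neither cd changes the working directory of the shell you typed the command into.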

*How Does This Fit Into The Big Picture?*
There are a number of fancy programming-language style tricks that bash
can do.  This command, however,  relies heavily on Unix fundamentals,
and also happens to be very useful for solving real problems --
especially when combined with a network-aware Live CD like Knoppix or
the Ubuntu install CD.

*P.S. Some Related Trivia:*
A useful and very-much-related bash+tar trick for copying files around
the local machine (while preserving the usual metadata) is the following:

   (cd /home ; tar -cf - . ) | (cd /newhome ; tar -xvf - )

This does roughly the same thing as "cp -rpv /home/* /newhome/" (minus
any dotfiles directly under /home, which the * skips) or "rsync -avP
/home/ /newhome/"...  But the beauty of Unix is that there are many ways
to do these things -- and that they can be applied and adapted to
whatever you want to do in different and interesting ways.
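
The local copy trick can be tried safely in scratch directories.  This is a sketch only: the directory layout and file names are invented, and it uses -xpf instead of -xvf so that it runs quietly and asks tar explicitly to keep the recorded permissions.

```shell
# Scratch stand-ins for /home and /newhome (names are illustrative).
work=$(mktemp -d)
mkdir -p "$work/home/gooduser" "$work/newhome"
echo "visible" > "$work/home/gooduser/file.txt"
echo "hidden"  > "$work/home/.dotfile"
chmod 600 "$work/home/.dotfile"

# One tar packs the tree to stdout; the other unpacks it from stdin
# in the destination directory.
( cd "$work/home" ; tar -cf - . ) | ( cd "$work/newhome" ; tar -xpf - )

# Verify: regular file, top-level dotfile, and its permission string.
copied=$(cat "$work/newhome/gooduser/file.txt")
hidden=$(cat "$work/newhome/.dotfile")
mode=$(ls -l "$work/newhome/.dotfile" | cut -c1-10)
echo "$copied $hidden $mode"

# Clean up the scratch directory.
rm -rf "$work"
```

Since tar is handed "." rather than a shell glob, the top-level dotfile comes along with everything else.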

Shell scripting is a much-mentioned topic this week. Hacker Public Radio had the uclug ep on it and Chess' latest ep was also about shell scripting. Nice to get so much info on one subject all at once. Thanks Luke for this great tutorial.

And how cool would it be to run Folding@Home on that supercomputer. If I give you my F@H id would you mind boosting my score ;-)
------------------------------------------------------------------------------------------------------------------
At Microsoft, failure is not an option; it comes pre-installed with Windows

Protein Folding

We run folding at work on System X:
http://amber.scripps.edu/
:-)
 
Something like Folding@Home that tries to run daemons in the background (or as a screensaver) wouldn't be a natural fit for System X, since System X is really a queuing/batch system (Torque/Maui+Gold).  The compute-nodes aren't allowed to communicate directly with the Internet, and groups of nodes are allocated to users based on the results of a periodic healthcheck and what we think their state should be.  Running a background daemon like Folding@Home on the nodes would cause them to fail the healthcheck, since they're marked as "Idle" in the queuing system but still have a busy CPU.  The term we use in the healthcheck script is "rogue processes".

Typos!

When I wrote the initial tutorial above, I didn't indent properly.  The "/home" part of the tar command should be indented one more level, to show that it really is part of the tar command (rather than an element of the mini-script).