synctool by Walter de Jong (c) 2003-2011

synctool COMES WITH NO WARRANTY. synctool IS FREE SOFTWARE.
synctool is distributed under terms described in the GNU General Public
License.

See the README for general information.
See the INSTALL file for information on how to deploy synctool.
This DOCUMENTATION file tries to explain using synctool.
Online documentation in HTML is available at:
http://www.heiho.net/synctool/

In 'synctool terminology', a node is a host, a computer in a group of
computers. A group of computers is called a cluster.

Once you've installed and set up synctool, you will have a 'masterdir'
repository on the master node of your cluster. This masterdir is configured
in the synctool.conf file. If you don't know what I'm talking about here,
read the INSTALL file carefully.
The masterdir is usually set to /var/lib/synctool

synctool is usually run on the master node. It will use rsync to mirror the
masterdir to all the nodes in the cluster. This has the advantage that you
don't need a shared filesystem to run synctool, and it is fast too.
Note that it is perfectly possible to run synctool.py "stand alone" on a
node, in which case it will check its local copy of the repository. It will
not synchronize with the master repository (it's all server push, NOT client
pull).

The main power of synctool is the fact that you can define logical groups,
and you can add these to a filename as a filename extension. This will
result in the file being copied only if the node belongs to that group.
The groups a node is in are defined in the synctool.conf file. In the
configuration file, the nodename is associated with one or more groups.
The nodename itself can also be used as a group to indicate that a file
belongs to that node.

In the synctool masterdir there are 5 subdirectories, each having its own
function:

  * overlay/
  * delete/
  * scripts/
  * tasks/
  * sbin/

The overlay/ tree contains files that have to be copied.
When synctool detects a difference between a file on the system and a file
in the overlay tree, the file will be copied from the overlay tree onto the
system.

The delete/ tree contains files that always have to be deleted from the
system. Only the filename matters, so it is OK if the files in this tree
are only 0 bytes in size.

The executables in the scripts/ directory are executables that synctool can
run when needed. By means of the on_update directive in the synctool.conf
file, a designated script may be executed when a certain file is changed.
For example: when /etc/inetd.conf is updated, the script 'hupdaemon.sh inetd'
must be run.

The executables in the tasks/ directory are run when synctool is invoked
with the '-t' or '--tasks' argument. This makes it possible to run scripts
on hosts, which is very convenient for doing change management.

The sbin/ directory contains the synctool programs. Isn't this an odd place
to put 'binaries'? Yes, but these are the 'master copies' of the synctool
software. These are also synced to every node so that synctool can run there.

Example run:

    root@masternode:/# synctool -qf
    node3: /etc/xinetd.d/identd updated (file size mismatch)

The file is being updated because there's a mismatch in the file size.
Should the file size be the same, synctool will calculate an MD5 checksum
to see whether the file was changed or not.

By default synctool does a DRY RUN. It will not do anything but show what
would happen if this were not a dry run. Use -f or --fix to apply any
changes.

Now, I want the xinetd to be automatically reloaded after I change the identd
file. There are two ways to do this in synctool.

1. Old, classic way, the way it was done in synctool version <= 3.0
   Put in synctool.conf:

       on_update /etc/xinetd.d/identd /etc/init.d/xinetd reload

2.
   Modern way, synctool version >= 4.0, much easier:
   Put in file $masterdir/overlay/etc/xinetd.d/identd.post :

       /etc/init.d/xinetd reload

The .post script will be run when the file changes:

    root@masternode:/# synctool -qf
    node3: /etc/xinetd.d/identd updated (file size mismatch)
    node3: running command $masterdir/overlay/etc/xinetd.d/identd.post

It is possible to put a group extension on the .post script, so that you can
have one group of nodes perform different actions than others.

The example for /etc/xinetd.d is interesting because you can also put an
"on_update" trigger or .post script on the directory itself. Whenever a file
in the directory gets modified, the trigger will be called.
So, we can simplify the situation for /etc/xinetd.d to:

    on_update /etc/xinetd.d /etc/init.d/xinetd reload

or, in $masterdir/overlay/etc/xinetd.d.post:

    /etc/init.d/xinetd reload

The next example shows that the nodename can be used as a group.
In the example, the 'fstab' file is identical throughout the cluster, with
the exception of node5 and node7.

    root@masternode:/# ls -F /var/lib/synctool/overlay/etc
    ...
    fstab._all      motd.production._batch   sudoers._all
    fstab._node5    nsswitch.conf._all       sysconfig/
    fstab._node7    nsswitch.conf.old._all   sysctl.conf._all
    ...

The group 'all' implicitly applies to all nodes. Likewise, there is an
implicit group 'none' that applies to no nodes. Group 'none' can be
convenient when you want to have a copy of a file around, but do not wish to
push it to any nodes (yet).

The '-v' option gives verbose output. This is another way of displaying the
logic that synctool performs:

    # synctool -v
    ...
    node3: checking $masterdir/overlay/etc/tcpd_banner.production._all
    node3: overridden by $masterdir/overlay/etc/tcpd_banner.production._batch
    node3: checking $masterdir/overlay/etc/issue.net.production._all
    node3: checking $masterdir/overlay/etc/syslog.conf._all
    node3: checking $masterdir/overlay/etc/issue.production._all
    node3: checking $masterdir/overlay/etc/modules.conf._all
    node3: checking $masterdir/overlay/etc/hosts.allow.production._interactive
    node3: skipping $masterdir/overlay/etc/hosts.allow.production._interactive, it is not one of my groups
    ...

The '-q' option of synctool gives less output. This is my favorite option.

    root@masternode:/# synctool -q
    node3: /etc/xinetd.d/identd updated (file size mismatch)
    node3: running command /bin/rm /etc/xinetd.d/*.saved ; /etc/init.d/xinetd reload

If '-q' still gives too much output, because you have many nodes in your
cluster, it is possible to specify '-a' to condense (aggregate) output.
The condensed output groups together output that is the same for many nodes.
synctool -qa is one of my favorite commands.

synctool does this by piping the output through the synctool-aggr command.
You may also use this to condense output from dsh, for example

    # dsh date | synctool-aggr

or just

    # dsh -a date
    # dsh-ping -a

The '-f' or '--fix' option applies all changes. Always be sure to run
synctool at least once as a dry run (without -f)!
Mind that synctool does not lock the repository and does not guard against
concurrent use by multiple sysadmins at once. In practice, this has hardly
ever led to any problems.

To update only a single file rather than all files, use the option
'--single' or '-1'.

If you want to check what file synctool is using for a given destination
file, use the '-r' or '--ref' option:

    root@masternode:/# synctool -q -n node1 -r /etc/resolv.conf
    node1: /etc/resolv.conf._somegroup

To inspect differences between the master copy and the client copy of a
file, use '--diff' or '-d'.
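
The lookup that '-r' reports can be pictured with a small Python sketch.
This is not synctool's actual code. The function names split_group and
pick_source are made up for illustration, and the sketch ASSUMES that the
groups of a node are given in order of importance (nodename first) and that
the implicit group 'all' is the least important one. The text above only
shows that a more specific file overrides the '._all' copy.

```python
def split_group(filename):
    """Split 'fstab._node5' into ('fstab', 'node5').

    Returns (filename, None) when there is no '._group' extension.
    """
    base, sep, group = filename.rpartition("._")
    if not sep or not base or not group:
        return filename, None
    return base, group


def pick_source(dest_name, repo_files, node_groups):
    """Return the repository file that applies to this node, or None.

    ASSUMPTION: node_groups is ordered from most to least important.
    The implicit group 'all' is appended as the least important group.
    """
    groups = list(node_groups)
    if "all" not in groups:
        groups.append("all")

    best = None
    best_rank = len(groups)
    for name in repo_files:
        base, group = split_group(name)
        if base != dest_name or group not in groups:
            # different destination, no group extension,
            # or "it is not one of my groups"
            continue
        rank = groups.index(group)
        if rank < best_rank:
            # a more important group overrides the earlier candidate
            best, best_rank = name, rank
    return best


if __name__ == "__main__":
    repo = ["fstab._all", "fstab._node5", "fstab._node7", "sudoers._all"]
    print("node5:", pick_source("fstab", repo, ["node5", "batch"]))
    print("node3:", pick_source("fstab", repo, ["node3", "batch"]))
```

With the fstab files from the earlier listing, node5 ends up with
fstab._node5 and any other node (except node7) falls back to fstab._all.
A file with group 'none' is never picked, because no node is in that group.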
synctool can be run on a subset of nodes, a group, or even on individual
nodes using the options '--group' or '-g', '--node' or '-n', '--exclude' or
'-x', and '--exclude-group' or '-X'. This also works for dsh and dcp.
For example:

    # synctool -g batch,sched -X rack8

Another example:

    # dsh -n node1,node2,node3 date

or copy a file to these three nodes:

    # dcp -n node1,node2,node3 -d /tmp patchfile-1.0.tar.gz

You may also wish to pull a file from a node into the repository.
You can do this from the masternode like this:

    # synctool -n node1 --upload /path/to/file

It may be desirable to give the file a different group extension than the
default proposed by synctool:

    # synctool -n node1 --upload /path/to/file --suffix=somegroup

After rebooting a cluster, use dsh-ping to see if the nodes respond to ping
yet. You may also do this on a group of nodes:

    # dsh-ping -g rack4

The '-t' or '--tasks' option runs the executables that are in tasks/
(but only if you also supply '-f'!). These executables can also have
group names as filename extension. They can be shell scripts or any other
kind of executables. This option is particularly useful for making system
changes that cannot be done easily by replacing a configuration file, like
for example installing new software packages. Always include a check to
see whether the system change has already been made, or else the task will
keep installing the same software on every run, even when it is already
there.

Doing system changes through the tasks mechanism is highly recommended, for
2 reasons:

  * easy to see what changes are being done; all tasks are in tasks/
  * whenever a node is down, it can do the updates later to get back in sync

By default, '--tasks' is not being run. You have to explicitly specify this
argument to run tasks.

The '--unix' option produces unix-style output. This shows in standard shell
syntax just what synctool is about to do.
(Note: synctool does not apply changes by executing shell commands; all
operations are programmed in Python. This option is only a way of displaying
what synctool does, and may be useful when debugging).

    root@masternode:/# synctool --unix
    node3: # updating file /etc/xinetd.d/identd
    node3: mv /etc/xinetd.d/identd /etc/xinetd.d/identd.saved
    node3: umask 077
    node3: cp /var/lib/synctool/overlay/etc/xinetd.d/identd._all /etc/xinetd.d/identd
    node3: chown root.root /etc/xinetd.d/identd
    node3: chmod 0644 /etc/xinetd.d/identd
    node3:
    node3: # run command /bin/rm /etc/xinetd.d/*.saved ; /etc/init.d/xinetd reload
    node3: /bin/rm /etc/xinetd.d/*.saved ; /etc/init.d/xinetd reload

The '--skip-rsync' option skips the rsync run (that copies the repository
from the master server to the client node). You should only specify this
option when your repository resides on a shared filesystem. Sharing the
repository between your master server and client nodes has certain security
implications, so be mindful of what you are doing in such a setup.
If you have a fast shared filesystem between all client nodes, but it is
not shared with the master server, you may want to write a wrapper script
around synctool that first runs rsync to a single node to update the shared
repository, and then runs synctool with the --skip-rsync option.

When using "--fix" to apply changes, synctool can log the performed actions
in a log file. Use the 'logfile' directive in synctool.conf to specify that
you want logging:

    logfile /var/log/synctool.log

synctool will write this logfile on each node separately, and a concatenated
log on the master node.

By using directives in the synctool.conf file, synctool can be told to
ignore certain files, nodes, or groups. These will be excluded, skipped.
For example:

    ignore_dotfiles no
    ignore_dotdirs yes
    ignore .svn
    ignore .gitignore
    ignore_node node1 node2
    ignore_group oldgroup
    ignore_group test

About symbolic links ...
On the Linux operating system, symbolic links always have mode 0777 (also
shown as lrwxrwxrwx). This is awkward, because this mode seems to imply
that owners, group members, and others have write access to the symbolic
link, which is not the case. This results in synctool complaining about
the mode of the symbolic link:

    /path/subdir/symlink should have mode 0755 (symlink), but has 0777

As a workaround, synctool forces the mode of the symbolic link to what you
set it to in synctool.conf. The hardcoded default value is 0755.

synctool.conf:

    # Linux
    symlink_mode 0777

or

    # sensible mode for most other Unix systems
    symlink_mode 0755

As synctool requires files in the repository to have an extension, so do
symbolic links. Symbolic links in the repository will be 'dead' symlinks
but they will point to the correct destination on the target node.
Consider the following example, where "file" does not exist as is in the
repository:

    $masterdir/overlay/etc/motd._red -> file
    $masterdir/overlay/etc/file._red

For any file synctool updates, it keeps a backup copy around on the target
node with the extension '.saved'. If you don't like this, you can tell
synctool to not make any backup copies with

    backup_copies no

It is however highly recommended that you run with 'backup_copies yes'.
You can manually specify that you want to remove all backup copies using
synctool -e or synctool --erase-saved --fix.

The settings in synctool.conf can be overridden locally by including a
second config file that is present on the node:

synctool.conf:

    include /etc/synctool_local.conf

The local config file may contain the same directives as the master config
file, but they apply only to the node on which the file resides. The local
config file may be managed from the master repository, enabling you to have
subtle differences in the synctool configuration for certain nodes.
For example, this can be used to change the symlink_mode in heterogeneous
clusters.
Beware that the local config file is read upon startup and may be synced
afterwards, if the file was changed. Mind that including node-specific
configs increases the complexity of your overall synctool configuration, so
in general it is recommended that you stick to using the master
synctool.conf only, and do not include any local configs at all.

The 'include' keyword can also be used to clean up your config a little,
for example:

    include /var/lib/synctool/nodes.conf
    include /var/lib/synctool/on_update.conf

In this example all nodes will have the same config, because the masterdir
/var/lib/synctool/ is synced to all nodes.

synctool can check whether a new version of synctool itself is available by
using the '--check-update' option on the master node. You can check
periodically for updates by using --check-update in a crontab entry.
To download the latest version, run synctool '--download' on the master
node.

EOB