Diary and notebook of whatever tech problems are irritating me at the moment.


Find tricks

These are some find examples with moderately complicated regular expressions that I've used for administration tasks. Note that the regular expressions used by find, grep, and other programs come in several variants, including the old "basic" form and the newer "extended" form. GNU find's -regex test defaults to the Emacs variant, which, like the basic form, writes grouping and alternation as \( \) and \|; the -regextype option selects a different variant. A -regex pattern must also match the whole path, not just part of it. Some of find's tests, like -name, use "shell patterns" instead (see the sh man page). In the regex man page, "(!)" marks some of the syntax and behaviour that may not be compatible with other regular expression implementations.
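To make the difference between shell patterns and the regex variants concrete, here is a small sketch that runs against a scratch directory. It assumes GNU find (for -regextype); the file names are invented for the demonstration.

```shell
# Scratch tree for the demonstration (all names here are made up).
demo=$(mktemp -d)
mkdir -p "$demo/logs"
touch "$demo/logs/app.log" "$demo/logs/app.log.1" "$demo/notes.txt"

# -name takes a shell pattern and is matched against the base name only.
by_name=$(find "$demo" -name '*.log')

# -regex must match the whole path, so it needs a leading ".*".
by_regex=$(find "$demo" -regex '.*\.log')

# Alternation in the default Emacs-style syntax: \( \| \)
emacs_alt=$(find "$demo" -regex '.*\.\(log\|txt\)' | sort)

# The same search in POSIX extended syntax: ( | )
# -regextype has to come before the -regex test it should affect.
ere_alt=$(find "$demo" -regextype posix-extended -regex '.*\.(log|txt)' | sort)

printf '%s\n' "$by_name" "$by_regex" "$emacs_alt"
rm -rf "$demo"
```

Both of the first two searches should list only app.log, and the two alternation searches should agree with each other.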

The first example cleans out the Unreal Tournament 2004 cache from the home folders of all users. The cache needs cleaning because every time the client connects to a server using a map, vehicle, or other add-on that it doesn't already have locally, it downloads the add-on to the cache. The cache of an avid on-line gamer can easily grow to many gigabytes and fill the home directory on smaller drives. On some distributions, it is impossible to log in if there is no home space available.

find /home -regex '.*/\.ut2004\(/Cache\|/.*/Cache\)' -exec rm -rf {} \;


. = metacharacter implying any single character

* = zero or more of the previous character (in this case any quantity of any character because of the "." metacharacter)

/ = detect the slash to set a reference to a directory when combined with the next part.

\.ut2004 = the backslash escapes "." so that it is treated as a period and not a metacharacter. Combined with the previous ".*/" it limits the results to hidden /.ut2004 and not /..ut2004, /xut2004, or any other directory.

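\(...\|...\) = This sets up a pair of branches with alternation, as indicated by the vertical bar. The parentheses define the range of expressions that make up the alternation. In find's default Emacs syntax the backslashes are what turn these characters into operators: \( \) group and \| alternates, while an unescaped parenthesis or vertical bar matches the literal character. The shell is not involved here, because the single quotes already keep it from treating the bar as a pipe.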

\(/Cache\|/.*/Cache\) = The combined alternation limits results to .ut2004/Cache and any other items named Cache deeper inside the .ut2004 directory. The parentheses are important - without them find will return /.ut2004/Cache and anything anywhere with a subdirectory named Cache (like in .mozilla).

The entire regular expression is enclosed in single quotes so that the shell passes it to find unchanged, as a single argument. You could also limit the results to directories by adding the "-type d" test.

-exec rm -rf {} \; = This tells find to execute rm, with the parameters -rf (to delete directory trees), once for every item it returns; the item's path dynamically replaces the {}. The command line is terminated by a semicolon, escaped so that the shell passes it on to find.
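Since rm -rf is unforgiving, it is worth checking what the expression matches before deleting anything. The following is a sketch of such a dry run against a throwaway tree rather than /home; the user names "alice" and "bob" and the directory layout are invented, and -prune is my addition to stop find from descending into a directory it has already matched.

```shell
# Build a throwaway tree that imitates two home directories.
# The user names "alice" and "bob" are invented for this sketch.
root=$(mktemp -d)
mkdir -p "$root/alice/.ut2004/Cache" \
         "$root/alice/.ut2004/System/Cache" \
         "$root/alice/.mozilla/profile/Cache" \
         "$root/bob/xut2004/Cache"

# Dry run: -print instead of -exec rm, -type d to skip plain files,
# and -prune so find does not descend into a directory it has matched.
matches=$(find "$root" -type d \
    -regex '.*/\.ut2004\(/Cache\|/.*/Cache\)' -prune -print | sort)
printf '%s\n' "$matches"

# Once the list looks right, -print can be swapped for the deletion:
#   find "$root" -type d -regex '...' -prune -exec rm -rf {} \;
rm -rf "$root"
```

Only the two Cache directories under alice/.ut2004 should be listed; the .mozilla and xut2004 ones are left alone.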

Here are some other find examples I've found useful.

Find any user's Mozilla/Firefox cache:
find /home -regex '.*/\.mozilla/.*/Cache'

Find any user's hidden trash directory:
find /home -regex '.*/home/[^/]*/\.Trash'

When installing updates to applications in Wine you will occasionally encounter duplicate file and directory name problems, sort of a reverse name collision. It can occur because Linux names are case sensitive but Windows names are not. This means it is possible to have two files, "readme.txt" and "README.TXT", in the same Linux directory but not in a Windows one. If an application update comes as an executable or self-extracting archive, Wine will resolve capitalization differences and ensure that a replacement file with an upper-case name correctly overwrites a target file with a lower-case name. But if the update comes as a zip or other archive and contains names with different case, you can't just extract the files and copy them on top of the installed application's directories, as duplicates will result. To work around this you either have to install a Windows archive utility like 7-Zip and use it to extract the files, taking advantage of Wine's name resolution, or manually change the names to match. Since Windows applications generally don't care about file or directory name case, another option is to rename everything to lower case. You can do this by combining the find command with the rename command (the Perl version, which accepts an expression like the one below):
find <directory or file name> -depth -execdir rename 'y/A-Z/a-z/' {} \;
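Not every distribution ships the Perl rename; the util-linux rename takes different arguments and will not accept the y/A-Z/a-z/ expression. As a fallback, here is a sketch that does the same job with only find, tr, and mv. It runs against a scratch tree with invented names, and it skips (rather than merges) any item whose lower-case name already exists.

```shell
# Throwaway tree with mixed-case names (all invented for the sketch).
top=$(mktemp -d)
mkdir -p "$top/Data/Maps"
touch "$top/README.TXT" "$top/Data/Maps/Level1.DAT"

# -mindepth 1 leaves the starting directory itself alone; -depth handles
# children before their parent so paths stay valid during renaming.
find "$top" -mindepth 1 -depth -exec sh -c '
    for path do
        dir=$(dirname "$path")
        old=$(basename "$path")
        new=$(printf "%s" "$old" | tr "[:upper:]" "[:lower:]")
        if [ "$old" != "$new" ]; then
            if [ -e "$dir/$new" ]; then
                # Do not overwrite or merge; report and move on.
                echo "skipped (name collision): $path" >&2
            else
                mv "$path" "$dir/$new"
            fi
        fi
    done
' sh {} +

result=$(cd "$top" && find . -mindepth 1 | sort)
printf '%s\n' "$result"
rm -rf "$top"
```

Anything reported as skipped still has to be sorted out by hand, which is usually what you want when two versions of the same file are involved.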

By default, find has some optimizations that speed up searches in large directory trees. One of these is to assume that a directory contains two fewer subdirectories than its hard link count, because of the "." and ".." entries, and to stop looking for subdirectories once that many have been found. This can cause find to miss directories on file systems that do not keep hard links for these entries, like CD-ROM and vfat. To prevent this from occurring, use the -noleaf option.
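A minimal sketch of how -noleaf would be used, assuming GNU find (other find implementations may not have the option). The function name list_dirs and the mount point in the comment are hypothetical.

```shell
# List every directory under the given path with the leaf
# optimization turned off. list_dirs is just a name for this sketch.
list_dirs() {
    find "$1" -noleaf -type d -print
}

# Example call; /mnt/cdrom is a hypothetical mount point:
#   list_dirs /mnt/cdrom
```

On an ordinary Linux file system the output should be the same with or without -noleaf; the option only matters where the link count cannot be trusted.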
