Combine AWK and SED to create symbolic links

The Unix tools AWK and SED are powerful tools for manipulating files and processing text. Their syntax is not always intuitive, but they often let you perform a task in a single line.

The following example illustrates the use of SED and AWK. It is surely not the best one, but it gives an overview of awk's gensub function and of sed's replacement commands.

Suppose first that you need to recursively find all files whose name matches a pattern.

find . -name "__init__.py"

The output lists each file with its path, relative to the starting directory. For each of these files, you want to create a symbolic link (ln -s command under Linux) whose name keeps track of the directory, by replacing each / sign with a _ sign.

find . -name "__init__.py" | awk '{print "ln -s " $1 " " gensub("/", "_", "g", $1)}'

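The “gensub” function, which is specific to GNU awk (gawk), replaces one string with another. The “g” argument means replace all occurrences. Note that this pipeline only prints the ln commands; it does not run them.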
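
gensub returns the modified string and leaves its input untouched, which is why $1 can be used twice in the command above. As a minimal sketch, assuming a sample path that is purely illustrative, the same replacement can be written with the POSIX gsub function, which edits a variable in place and should therefore also work in awk implementations other than gawk:

```shell
# Hypothetical sample path, used only for illustration.
path="./pkg/sub/__init__.py"

# gsub edits the variable `name` in place, so copy $1 first
# to keep the original path intact for the link target.
link_cmd=$(printf '%s\n' "$path" | awk '{ name = $1; gsub("/", "_", name); print "ln -s " $1 " " name }')

echo "$link_cmd"
# prints: ln -s ./pkg/sub/__init__.py ._pkg_sub___init__.py
```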

Yet, because each path starts with “./”, every link name starts with “._”, which can be removed with “sed”. Note that “.” is a special character in regular expressions (it matches any character), so you need to write “\.” instead of simply “.”.
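
The effect of escaping the dot can be seen on a small example. This is only a sketch; the name used below is hypothetical and chosen so that the two patterns behave differently:

```shell
# Hypothetical link name, for illustration only.
name="ab_c._d"

# Unescaped: "." matches any character, so the first match is "b_".
unescaped=$(printf '%s\n' "$name" | sed 's/._//')

# Escaped: only a literal dot followed by an underscore matches.
escaped=$(printf '%s\n' "$name" | sed 's/\._//')

echo "$unescaped"   # prints: ac._d
echo "$escaped"     # prints: ab_cd
```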

The final command is therefore:

find . -name "__init__.py" | awk '{print "ln -s " $1 " " gensub("/", "_", "g", $1)}' | sed -e 's/\._//g'
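
To try this without touching real files, the pipeline can be run in a throwaway directory and its output piped to sh so that the printed ln commands are actually executed. The sketch below makes a few assumptions: the directory and package names are hypothetical, paths contain no whitespace, the portable gsub is used in place of gensub, and find supports -maxdepth (GNU and BSD find do).

```shell
# Build a throwaway tree (hypothetical names) to try the pipeline safely.
workdir=$(mktemp -d)
mkdir -p "$workdir/pkg/sub"
touch "$workdir/pkg/__init__.py" "$workdir/pkg/sub/__init__.py"
cd "$workdir" || exit 1

# Same idea as the final command, written with the portable gsub
# instead of gensub; the generated commands are executed by sh.
find . -name "__init__.py" \
  | awk '{ name = $1; gsub("/", "_", name); print "ln -s " $1 " " name }' \
  | sed -e 's/\._//g' \
  | sh

# List the symbolic links created in the current directory.
find . -maxdepth 1 -type l
```

Inspecting the generated commands first, by leaving off the final sh, is a cheap way to catch a bad substitution before any link is created.
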

One Response to Combine AWK and SED to create symbolic links

  1. BEWARE (as of 2016-04-17)
    Observed: global replacement with sed, mutilating desired output in certain cases.
    Expected: local replacement of first instance of search string only, iff string begins with search string.
    FIX
    Problem code:

    sed 's/\._//g'

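    Problem fix (matching the space before the link name, because the line sed sees starts with "ln -s ", so a pattern anchored with ^ would never match):

    sed 's/ \._/ /'

    Entire Command With Fix:

    find . -name "__init__.py" | awk '{print "ln -s " $1 " " gensub("/", "_", "g", $1)}' | sed -e 's/ \._/ /'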

    Explanation: The sed portion as written replaces every occurrence of a literal dot followed by an underscore ('._') with the empty string (''), in other words it deletes them all. If you have a file named 'a._b.', for example, the occurrence in the middle of the file name disappears ('a._b.' -> 'ab.'). In addition, since awk has replaced all slashes with underscores, a name with a trailing literal dot loses its separator and is concatenated with the next component of the path: (-find-> './User/you/docs/.files/.a._b./deep/directories' -awk-> '._User_you_docs_.files_.a._b._deep_directories' -sed-> 'User_you_docs_.files_.abdeep_directories').
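
The difference between a global and a targeted substitution can be checked on the example path above. The sketch below is illustrative only: it uses the portable gsub in place of gensub, assumes paths without whitespace, and matches the space before the link name because sed receives the whole line, which starts with "ln -s".

```shell
# The path from the example above, as find would print it.
path='./User/you/docs/.files/.a._b./deep/directories'

# Generate the ln command line (gsub is a portable stand-in for gensub).
line=$(printf '%s\n' "$path" | awk '{ name = $1; gsub("/", "_", name); print "ln -s " $1 " " name }')

# Global deletion removes every "._", in the link name and in the target path.
global=$(printf '%s\n' "$line" | sed 's/\._//g')

# Deleting only the "._" that follows a space leaves everything else intact.
targeted=$(printf '%s\n' "$line" | sed 's/ \._/ /')

echo "$global"
echo "$targeted"
```

Note that the global form also alters the target path, so the resulting link would point at a file that does not exist.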
