Shell Tools and Scripting

Lecture source: https://missing.csail.mit.edu/2020/shell-tools/

Bash special variables:

  • $0 - Name of the script

  • $1 to $9 - Arguments to the script. $1 is the first argument and so on.

  • $@ - All the arguments

  • $# - Number of arguments

  • $? - Return code of the previous command

  • $$ - Process identification number (PID) for the current script

  • !! - Entire last command, including arguments. A common pattern is to execute a command only for it to fail due to missing permissions; you can quickly re-execute the command with sudo by doing sudo !!

  • $_ - Last argument from the last command. If you are in an interactive shell, you can also quickly get this value by typing Esc followed by . or Alt+.

    $ mkdir test
    $ cd $_
    

    Creates a directory named test and changes into it.

Full list here: https://tldp.org/LDP/abs/html/special-chars.html
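
A minimal sketch of several of these variables in action. The helper name describe_args is hypothetical and exists only for illustration; inside a function, $# and $@ refer to the function's arguments in the same way they refer to the script's arguments at the top level.

```shell
#!/usr/bin/env bash

# describe_args is a hypothetical helper used only for illustration.
# Inside a function, $# and $@ refer to the function's own arguments,
# just as they refer to the script's arguments at the top level.
describe_args() {
    echo "count: $#"
    local arg
    for arg in "$@"; do
        echo "arg: $arg"
    done
}

describe_args "first" "second arg"

# $? holds the exit status of the previous command (1 for `false`).
false || echo "exit status: $?"

# $$ is the PID of the current shell.
echo "pid: $$"
```

Quoting "$@" keeps "second arg" as a single argument; an unquoted $@ would split it into two words.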

Command substitution: $( CMD ).

Process substitution: <( CMD ) executes CMD, makes its output available through a temporary file (often a /dev/fd entry or a named pipe), and substitutes the <() with that file’s name.

$ file <(ls)
/dev/fd/11: fifo (named pipe)

# shows the differences between dir1 and dir2
$ diff <(ls dir1) <(ls dir2)
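
The two substitutions can be combined in a short sketch. The helper name shared_words is hypothetical; it assumes bash (process substitution is not available in plain sh) and the standard comm, tr, and sort utilities.

```shell
#!/usr/bin/env bash

# Command substitution: the output of printf becomes the variable's value.
greeting=$(printf 'hello %s' "world")
echo "$greeting"

# shared_words is a hypothetical helper: it prints the words that appear
# in both space-separated lists. comm expects two sorted files, and
# process substitution supplies them without creating temporary files by hand.
shared_words() {
    comm -12 <(tr ' ' '\n' <<< "$1" | sort) <(tr ' ' '\n' <<< "$2" | sort)
}

shared_words "c a b" "d b c"
```

Here comm -12 suppresses the lines unique to either input, so only the words common to both lists are printed.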

Shell globbing:

  • Wildcards:

    • ? matches one character.
    • * matches any number of characters, including none.
  • Curly braces: {} expand automatically.

    $ touch foo{,1,2}
    $ ls
    foo foo1 foo2
      
    $ touch bar{1..5}
    $ ls | grep bar
    bar1
    bar2
    bar3
    bar4
    bar5
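
The difference between ? and * can be sketched as follows, assuming bash (for arrays and brace expansion). The file names and the variables workdir, single, and any are made up for illustration.

```shell
#!/usr/bin/env bash

# Work in a throwaway directory so the example leaves no files behind.
workdir=$(mktemp -d)

# Brace expansion happens first: this creates report1.txt, report2.txt,
# report10.txt and notes.md.
touch "$workdir"/report{1,2,10}.txt "$workdir"/notes.md

# ? matches exactly one character, so report10.txt is excluded.
single=("$workdir"/report?.txt)
# * matches any number of characters, so all three reports match.
any=("$workdir"/report*.txt)

echo "report?.txt matched ${#single[@]} files"
echo "report*.txt matched ${#any[@]} files"

rm -rf "$workdir"
```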
    

The first line of a script is the shebang. It is good practice to write shebang lines using the env command, which looks the interpreter up in PATH instead of relying on a fixed location, like this:

#!/usr/bin/env bash
#!/usr/bin/env python

locate searches a prebuilt index instead of walking the filesystem, which makes it faster than find, although results can be stale until the index is updated.

$ locate missing-semester
# update the locate index database
$ updatedb
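
For comparison, a sketch of the find side of that trade-off. The helper name find_by_name and the demo file names are hypothetical; only standard find options are used.

```shell
#!/usr/bin/env bash

# find_by_name is a hypothetical helper: list regular files under a
# directory whose names match a glob pattern. Unlike locate, find walks
# the directory tree at query time, so it sees files created a moment ago.
find_by_name() {
    local dir=$1 pattern=$2
    find "$dir" -type f -name "$pattern"
}

demo=$(mktemp -d)
mkdir -p "$demo/sub"
touch "$demo/a.sh" "$demo/sub/b.sh" "$demo/c.txt"
find_by_name "$demo" '*.sh' | sort
rm -rf "$demo"
```

The pattern is quoted so that the shell passes it to find unexpanded; otherwise the shell would glob it against the current directory first.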

Finding code: grep, egrep, rg(ripgrep), ag etc.
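
As a rough sketch of searching code with plain grep: the helper name search_code is hypothetical, and -r (recurse) and -n (show line numbers) are the only grep options assumed.

```shell
#!/usr/bin/env bash

# search_code is a hypothetical helper: recursively search a directory
# for a pattern and print matching lines with file name and line number.
search_code() {
    local pattern=$1 dir=$2
    grep -rn -- "$pattern" "$dir"
}

demo=$(mktemp -d)
printf 'alpha\nfoobar\n' > "$demo/a.txt"
printf 'nothing here\n' > "$demo/b.txt"
search_code "foo" "$demo"
rm -rf "$demo"
```

Each match is printed as path:line:text, which most editors and tools such as fzf can consume directly.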

Finding command: history, Ctrl+R, fzf etc.

Directory navigation: fasd(z), autojump(j), tree, broot, nnn, ranger etc.

Exercises

  1. Read man ls and write an ls command that lists files in the following manner

    • Includes all files, including hidden files -a
    • Sizes are listed in human readable format (e.g. 454M instead of 454279954) -lh
    • Files are ordered by recency -t
    • Output is colorized -G (on macOS/BSD ls; GNU ls uses --color=auto instead)

    A sample output would look like this

     -rw-r--r--   1 user group 1.1M Jan 14 09:53 baz
     drwxr-xr-x   5 user group  160 Jan 14 09:53 .
     -rw-r--r--   1 user group  514 Jan 14 06:42 bar
     -rw-r--r--   1 user group 106M Jan 13 12:12 foo
     drwx------+ 47 user group 1.5K Jan 12 18:08 ..
    
    $ ls -alhGt
    
  2. Write bash functions marco and polo that do the following. Whenever you execute marco the current working directory should be saved in some manner, then when you execute polo, no matter what directory you are in, polo should cd you back to the directory where you executed marco. For ease of debugging you can write the code in a file marco.sh and (re)load the definitions to your shell by executing source marco.sh.

    $ vim marco.sh
    
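    #!/usr/bin/env bash
       
    # Directory saved by the most recent call to marco.
    export MARCO_DIR=""
       
    # Save the current working directory.
    function marco() {
        MARCO_DIR="$(pwd)"
    }
       
    # Return to the directory saved by marco.
    function polo() {
        cd "$MARCO_DIR" || return
    }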
    
    $ pwd
    /Users/triplez/Code/Playground/missing-semester/1-shell-tools
    $ source marco.sh
    $ marco
    $ cd ../../..
    $ pwd
    /Users/triplez/Code
    $ polo
    $ pwd
    /Users/triplez/Code/Playground/missing-semester/1-shell-tools
    
  3. Say you have a command that fails rarely. In order to debug it you need to capture its output but it can be time consuming to get a failure run. Write a bash script that runs the following script until it fails and captures its standard output and error streams to files and prints everything at the end. Bonus points if you can also report how many runs it took for the script to fail.

     #!/usr/bin/env bash
       
     n=$(( RANDOM % 100 ))
       
     if [[ n -eq 42 ]]; then
        echo "Something went wrong"
        >&2 echo "The error was using magic numbers"
        exit 1
     fi
       
     echo "Everything went according to plan"
    

    I saved the script above as fail-command.sh; my solution script is find-fail-command.sh.

    $ vim find-fail-command.sh
    
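    #!/usr/bin/env bash
       
    # Number of runs so far, including the failing one.
    RUN_CNT=0
       
    # Repeat until fail-command.sh exits with a non-zero status.
    # Each run overwrites the output file, so only the failing run is kept.
    while true; do
        RUN_CNT=$((RUN_CNT + 1))
        bash fail-command.sh > fail-command-output 2>&1 || break
    done
       
    echo "Got a failure after $RUN_CNT runs"
    

    Both the standard output and standard error of the failing run of fail-command.sh are captured in the fail-command-output file.

    $ bash find-fail-command.sh
    Got a failure after 64 runs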
       
    $ cat fail-command-output
    Something went wrong
    The error was using magic numbers
    
  4. As we covered in the lecture find’s -exec can be very powerful for performing operations over the files we are searching for. However, what if we want to do something with all the files, like creating a zip file? As you have seen so far commands will take input from both arguments and STDIN. When piping commands, we are connecting STDOUT to STDIN, but some commands like tar take inputs from arguments. To bridge this disconnect there’s the xargs command which will execute a command using STDIN as arguments. For example ls | xargs rm will delete the files in the current directory.

    Your task is to write a command that recursively finds all HTML files in the folder and makes a zip with them. Note that your command should work even if the files have spaces (hint: check -d flag for xargs).

    If you’re on macOS, note that the default BSD find is different from the one included in GNU findutils. You can use -print0 on find and the -0 flag on xargs. As a macOS user, you should be aware that command-line utilities shipped with macOS may differ from the GNU counterparts; you can install the GNU versions if you like by using brew.

    My recursive-zip-html.sh:

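    #!/usr/bin/env bash
       
    # Remove any previous archive; rm -f is silent if it does not exist.
    rm -f html.zip
       
    # Let find recurse and emit NUL-delimited names so spaces are preserved.
    gfind . -type f -name '*.html' -print0 | gxargs -0 zip html.zip
       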
    

    I am on macOS, so I use gfind and gxargs for GNU find and xargs. On Linux, replace them as follows: s/gfind/find/g, s/gxargs/xargs/g.

  5. (Advanced) Write a command or script to recursively find the most recently modified file in a directory. More generally, can you list all files by recency? My list-files-by-recency.sh:

    #!/usr/bin/env bash
       
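    # Start from . so hidden files are included; NUL delimiters keep unusual file names intact.
    # For very large trees xargs may split the list across several ls calls, each sorted separately.
    gfind . -type f -print0 | gxargs -0 ls -lht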
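
For the first half of the exercise (only the most recently modified file), one possible sketch sorts on the modification timestamp directly. The helper name newest_file is hypothetical, and the sketch assumes GNU find (for -printf; use gfind on macOS) and file names without embedded newlines.

```shell
#!/usr/bin/env bash

# newest_file is a hypothetical helper: print the most recently modified
# regular file under a directory. It assumes GNU find (for -printf) and
# file names without embedded newlines.
newest_file() {
    local dir=${1:-.}
    # %T@ is the modification time in seconds since the epoch, %p the path.
    find "$dir" -type f -printf '%T@ %p\n' | sort -rn | head -n 1 | cut -d ' ' -f 2-
}

demo=$(mktemp -d)
touch -t 202001010000 "$demo/old.txt"
touch "$demo/new.txt"
newest_file "$demo"
rm -rf "$demo"
```

Because the sort happens in a single pipeline stage, the result does not depend on how many files there are; dropping head -n 1 lists all files by recency.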