Genomic Data Analysis: Basics of the shell

The Unix shell has been around longer than most of its users have been alive. It has survived so long because it’s a power tool that allows people to do complex things with just a few keystrokes. More importantly, it helps them combine existing programs in new ways and automate repetitive tasks so they aren’t typing the same things over and over again. Use of the shell is fundamental to using a wide range of other powerful tools and computing resources (including “high-performance computing” supercomputers). These lessons will start you on a path towards using these resources effectively.

Day 1 is in class; Day 2 is homework

If you are new to the Linux shell, this workshop is essential. The schedule follows the practical schedule of LS Lab 5. We expect to finish only part of the workshop in class, and you should complete the rest on your own time. That remaining part is listed in the workshop schedule as “Day 2”.

Prerequisites

This lesson guides you through the basics of file systems and the shell. If you have stored files on a computer at all and recognize the word “file” and either “directory” or “folder” (two common words for the same thing), you’re ready for this lesson.

If you’re already comfortable manipulating files and directories, searching for files with grep and find, and writing simple loops and scripts, you probably won’t learn much from this lesson.
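As a rough self-check, the sketch below uses the three skills mentioned above: a loop, grep, and find. The scratch directory and the `sample_*.txt` file names are hypothetical and exist only for this illustration; they are not part of the lesson data. If every line reads as familiar, you can probably skim the lesson.

```shell
# Work in a throwaway directory so nothing in your real files is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Create two small, made-up text files (hypothetical names and contents).
printf 'ATGC\nGGCC\n' > sample_a.txt
printf 'TTAA\nATGA\n' > sample_b.txt

# Loop: report the number of lines in each file.
for f in sample_*.txt
do
    echo "$f: $(wc -l < "$f") lines"
done

# grep: count the lines, across both files, that start with ATG.
hits=$(cat sample_*.txt | grep -c '^ATG')
echo "lines starting with ATG: $hits"

# find: count the files under the current directory whose names end in .txt.
n_txt=$(find . -name '*.txt' | wc -l)
echo "text files found: $n_txt"
```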




Schedule

Time Topic Learning Objectives
Before start Setup
Day 1 14:00 Introducing the bash shell
  • Explain how the shell relates to the keyboard, the screen, the operating system, and users’ programs.
  • Explain when and why command-line interfaces should be used instead of graphical interfaces.
  • Understand the context of the problem on which this lesson is based.
14:20 Navigating files and directories
  • Explain the similarities and differences between a file and a directory.
  • Translate an absolute path into a relative path and vice versa.
  • Construct absolute and relative paths that identify specific files and directories.
  • Demonstrate the use of tab completion, and explain its advantages.
14:55 Working With Files and Directories
  • Create a directory hierarchy that matches a given diagram.
  • Create files in that hierarchy using an editor or by copying and renaming existing files.
  • Delete, copy and move specified files and/or directories.
15:45 Finish
Day 2 14:00 Pipes and Filters
  • Redirect a command’s output to a file.
  • Process a file instead of keyboard input using redirection.
  • Construct command pipelines with two or more stages.
  • Explain what usually happens if a program or pipeline isn’t given any input to process.
  • Explain Unix’s ‘small pieces, loosely joined’ philosophy.
14:35 Loops
  • Write a loop that applies one or more commands separately to each file in a set of files.
  • Trace the values taken on by a loop variable during execution of the loop.
  • Explain the difference between a variable’s name and its value.
  • Explain why spaces and some punctuation characters shouldn’t be used in file names.
  • Demonstrate how to see what commands have recently been executed.
  • Re-run recently executed commands without retyping them.
15:25 Shell Scripts
  • Write a shell script that runs a command or series of commands for a fixed set of files.
  • Run a shell script from the command line.
  • Write a shell script that operates on a set of files defined by the user on the command line.
  • Create pipelines that include shell scripts you, and others, have written.
16:10 Finding Things
  • Use grep to select lines from text files that match simple patterns.
  • Use find to find files whose names match simple patterns.
  • Use the output of one command as the command-line argument(s) to another command.
  • Explain what is meant by ‘text’ and ‘binary’ files, and why many common tools don’t handle the latter well.
16:55 Finish
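
As a small preview of the Day 2 material on redirection and pipelines, here is a minimal sketch. The file `reads.txt` and its contents are made up for illustration and are not part of the lesson data.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# A made-up input file with one chromosome label per line.
printf 'chr2\nchr1\nchr2\nchr3\n' > reads.txt

# Output redirection: save the sorted lines in a file instead of printing them.
sort reads.txt > sorted.txt

# Input redirection: wc reads the file's contents rather than keyboard input.
n_lines=$(wc -l < sorted.txt)

# Pipeline with three stages: sort, drop adjacent duplicates, count what is left.
n_unique=$(sort reads.txt | uniq | wc -l)

echo "lines: $n_lines"
echo "distinct labels: $n_unique"
```

Each command does one small job, and the pipe joins them; this is the "small pieces, loosely joined" idea listed in the objectives.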

We will let you all go before 20:00, but you can finish this lesson on your own schedule.