Bash Tool Exercises

1 What does >> mean?

We have seen the use of >, but there is a similar operator >> which works slightly differently. We’ll learn about the differences between these two operators by printing some strings. We can use the echo command to print strings as shown below:

$ echo The echo command prints text
The echo command prints text

Now test the commands below to reveal the difference between the two operators:

$ echo hello > testfile01.txt

and:

$ echo hello >> testfile02.txt

Hint: Try executing each command twice in a row and then examining the output files.
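
One way to follow the hint is to run each command twice in a scratch directory and then inspect both files. This is only a sketch: it assumes mktemp -d is available, and the variable name scratch is made up for illustration.

```shell
# Work in a throwaway directory so no existing files are overwritten.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Run the first command twice in a row.
echo hello > testfile01.txt
echo hello > testfile01.txt

# Run the second command twice in a row.
echo hello >> testfile02.txt
echo hello >> testfile02.txt

# Examine the contents and the line counts of the two files.
cat testfile01.txt
cat testfile02.txt
wc -l testfile01.txt testfile02.txt
```

Comparing the two line counts should make the difference between the operators visible.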

2 Appending data

Given the following commands, what will the file extracted.txt contain?

$ head -n 3 dracula.txt > extracted.txt
$ tail -n 2 dracula.txt >> extracted.txt

  1. The first three lines of dracula.txt

  2. The last two lines of dracula.txt

  3. The first three lines and the last two lines of dracula.txt

  4. The second and third lines of dracula.txt
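
If dracula.txt is not at hand, the two commands can be checked against a small stand-in file. The sketch below assumes a hypothetical ten-line file in which every line states its own number, so it is easy to see which lines end up in extracted.txt.

```shell
# Work in a throwaway directory (assumes mktemp -d is available).
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Hypothetical stand-in for dracula.txt: ten numbered lines.
for n in 1 2 3 4 5 6 7 8 9 10
do
  echo "line $n"
done > dracula.txt

# The two commands from the exercise, unchanged.
head -n 3 dracula.txt > extracted.txt
tail -n 2 dracula.txt >> extracted.txt

# Inspect the result and compare it with the four options.
cat extracted.txt
```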

3 Piping commands

In our current directory, we want to find the three files that have the fewest lines.

Which command listed below would work?

  1. wc -l * > sort -n > head -n 3

  2. wc -l * | sort -n | head -n 1-3

  3. wc -l * | head -n 3 | sort -n

  4. wc -l * | sort -n | head -n 3
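
The candidates are easier to judge in a directory where the line counts are known in advance. The sketch below is one possible setup: the file names a.txt to e.txt and their contents are hypothetical, and it only runs the first stages of a pipeline so that each candidate can then be tried by hand.

```shell
# Work in a throwaway directory (assumes mktemp -d is available).
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Hypothetical files with 5, 1, 4, 2 and 3 lines respectively.
printf '1\n2\n3\n4\n5\n' > a.txt
printf '1\n'             > b.txt
printf '1\n2\n3\n4\n'    > c.txt
printf '1\n2\n'          > d.txt
printf '1\n2\n3\n'       > e.txt

# Look at the output of each stage before adding the next one.
wc -l *
wc -l * | sort -n
```

Trying the candidates in a scratch directory also keeps any files they happen to create away from real data.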

4 Why does uniq only remove adjacent duplicates?

The command uniq removes adjacent duplicated lines from its input. Consider a hypothetical file genres.txt containing the following data:

science fiction
fantasy
science fiction
fantasy
science fiction
science fiction

Running the command uniq genres.txt produces:

science fiction
fantasy
science fiction
fantasy
science fiction

Why do you think uniq only removes adjacent duplicated lines? (Hint: think about very large datasets.)

What other command could you combine with it in a pipe to remove all duplicated lines?
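
To experiment with this, genres.txt can be recreated from the listing above. The sketch below assumes a scratch directory made with mktemp -d; it writes the six lines, runs uniq, and counts the lines before and after, which gives a baseline for testing whatever command you decide to put in the pipe.

```shell
# Work in a throwaway directory (assumes mktemp -d is available).
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Recreate genres.txt exactly as listed in the exercise.
printf '%s\n' \
  'science fiction' \
  'fantasy' \
  'science fiction' \
  'fantasy' \
  'science fiction' \
  'science fiction' > genres.txt

# Reproduce the output shown in the exercise.
uniq genres.txt

# Count the lines before and after uniq.
wc -l < genres.txt
uniq genres.txt | wc -l
```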

5 Pipe reading comprehension

A file called titles.txt contains a list of book titles and publication years:

Dracula,1897
Frankenstein,1818
Jane Eyre,1847
Moby Dick,1851
Sense and Sensibility,1811
The Adventures of Sherlock Holmes,1892
The Invisible Man,1897
The Time Machine,1895
Wuthering Heights,1847

What text passes through each of the pipes and the final redirect in the pipeline below?

$ cat titles.txt | head -n 5 | tail -n 3 | sort -r > final.txt

Hint: build the pipeline up one command at a time to test your understanding.
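
Building the pipeline up one command at a time could look like the sketch below. It recreates titles.txt in a scratch directory (an assumption, so that the example is self-contained) and prints a made-up separator line before each stage.

```shell
# Work in a throwaway directory (assumes mktemp -d is available).
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Recreate titles.txt exactly as listed in the exercise.
cat > titles.txt <<'EOF'
Dracula,1897
Frankenstein,1818
Jane Eyre,1847
Moby Dick,1851
Sense and Sensibility,1811
The Adventures of Sherlock Holmes,1892
The Invisible Man,1897
The Time Machine,1895
Wuthering Heights,1847
EOF

# Add one command per step and look at what comes out of each pipe.
echo "--- after cat ---"
cat titles.txt
echo "--- after head ---"
cat titles.txt | head -n 5
echo "--- after tail ---"
cat titles.txt | head -n 5 | tail -n 3
echo "--- after sort ---"
cat titles.txt | head -n 5 | tail -n 3 | sort -r

# The full pipeline from the exercise, followed by a look at the result.
cat titles.txt | head -n 5 | tail -n 3 | sort -r > final.txt
cat final.txt
```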

6 Pipe construction

For the file titles.txt from the previous exercise, consider the following command:

$ cut -d , -f 2 titles.txt

What does the cut command (and its options) accomplish?
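
A useful way to explore an unfamiliar command is to change one option at a time on a small input. The sketch below assumes a hypothetical three-line excerpt of titles.txt, written to a scratch directory, and compares -f 2 with -f 1.

```shell
# Work in a throwaway directory (assumes mktemp -d is available).
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Hypothetical excerpt: the first three lines of titles.txt.
printf '%s\n' 'Dracula,1897' 'Frankenstein,1818' 'Jane Eyre,1847' > titles.txt

# The command from the exercise.
cut -d , -f 2 titles.txt

# The same command with only the -f value changed, for comparison.
cut -d , -f 1 titles.txt
```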

7 Which pipe?

Consider the same titles.txt from the previous exercises.

The uniq command has a -c option which gives a count of the number of times a line occurs in its input.
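
As a small illustration of -c, the sketch below feeds uniq three lines of made-up input (the fruit names are hypothetical and unrelated to titles.txt). Each output line begins with a count, followed by the line that was counted.

```shell
# Three lines of made-up input; the two identical lines are adjacent.
printf 'apple\napple\npear\n' | uniq -c
```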

If titles.txt were in your working directory, what command would you use to produce a table that shows the total count of each publication year in the file?

  1. sort titles.txt | uniq -c

  2. sort -t, -k2,2 titles.txt | uniq -c

  3. cut -d, -f 2 titles.txt | uniq -c

  4. cut -d, -f 2 titles.txt | sort | uniq -c

  5. cut -d, -f 2 titles.txt | sort | uniq -c | wc -l

8 Doing a dry run

A loop is a way to do many things at once—or to make many mistakes at once if it does the wrong thing. One way to check what a loop would do is to echo the commands it would run instead of actually running them.

Suppose we want to preview the commands the following loop will execute without actually running those commands (analyze is a hypothetical command):

$ for file in *.txt
> do
>   analyze $file > analyzed-$file
> done

What is the difference between the two loops below, and which one would we want to run?

$ for file in *.txt
> do
>   echo analyze $file > analyzed-$file
> done

or:

$ for file in *.txt
> do
>   echo "analyze $file > analyzed-$file"
> done
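
Both loops are safe to try, because neither of them actually runs analyze. The sketch below runs each loop in its own scratch directory and lists the directory afterwards; the directories first and second and the files a.txt and b.txt are hypothetical.

```shell
# Two throwaway directories (assumes mktemp -d is available),
# each holding the same two empty, hypothetical input files.
first=$(mktemp -d)
second=$(mktemp -d)
touch "$first/a.txt" "$first/b.txt" "$second/a.txt" "$second/b.txt"

# First loop from the exercise, then a listing of what exists afterwards.
cd "$first" || exit 1
for file in *.txt
do
  echo analyze $file > analyzed-$file
done
ls

# Second loop from the exercise, then the same listing.
cd "$second" || exit 1
for file in *.txt
do
  echo "analyze $file > analyzed-$file"
done
ls
```

Comparing what each loop printed with what each ls shows should reveal where the redirect took effect.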

9 Variables in loops

Given the files in data/, what is the output of the following code?

$ for datafile in *.txt
> do
>    ls *.txt
> done

Now, what is the output of the following code?

$ for datafile in *.txt
> do
>    ls $datafile
> done

Why do these two loops give different outputs?
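
The two loops can be compared side by side on a small set of files. The sketch below assumes three hypothetical, empty files standing in for the contents of data/, and prints a label before each loop.

```shell
# Work in a throwaway directory (assumes mktemp -d is available).
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Hypothetical stand-ins for the files in data/.
touch one.txt two.txt three.txt

echo "First loop:"
for datafile in *.txt
do
  ls *.txt
done

echo "Second loop:"
for datafile in *.txt
do
  ls $datafile
done
```

Counting how many lines each loop prints is a good starting point for explaining the difference.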

10 Limiting sets of files

What would be the output of running the following loop in your data/ directory?

$ for filename in d*
> do
>    ls $filename
> done

How would the output differ from using this command instead?

$ for filename in *d*
> do
>    ls $filename
> done
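
A quick way to preview which names a pattern matches, before using it in a loop, is to echo the pattern. The sketch below assumes three hypothetical files in a scratch directory; the real data/ directory may contain different names.

```shell
# Work in a throwaway directory (assumes mktemp -d is available).
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Hypothetical file names chosen to differ in where the letter d appears.
touch dracula.txt frankenstein.txt wonderland.txt

# The shell expands each pattern before echo runs,
# so echo prints the list of matching names.
echo d*
echo *d*
```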

11 Saving to a file in a loop

Consider running the following loop in the data/ directory:

$ for book in *.txt
> do
>     echo $book
>     head -n 16 $book > headers.txt
> done

Why would the following loop be preferable?

$ for book in *.txt
> do
>     head -n 16 $book >> headers.txt
> done
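
To see the difference in practice, both loops can be run on a pair of short files. The sketch below assumes two hypothetical two-line "books" in a scratch directory. It moves the result of the first loop to overwritten.out (a made-up name) before the second loop starts, because an existing headers.txt would itself match *.txt and be processed as if it were a book.

```shell
# Work in a throwaway directory (assumes mktemp -d is available).
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Two hypothetical books, each far shorter than 16 lines.
printf 'A line 1\nA line 2\n' > a.txt
printf 'B line 1\nB line 2\n' > b.txt

# First loop from the exercise: redirect with >.
for book in *.txt
do
  echo $book
  head -n 16 $book > headers.txt
done

# Set the result aside under a name that *.txt does not match.
mv headers.txt overwritten.out

# Second loop from the exercise: redirect with >>.
for book in *.txt
do
  head -n 16 $book >> headers.txt
done

# Compare how many lines each approach kept.
wc -l overwritten.out headers.txt
```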

12 Why does history record commands before running them?

Suppose you run the command:

$ history | tail -n 5 > recent.sh

The last command in the file is the history command itself, i.e., the shell has added history to the command log before actually running it. In fact, the shell always adds commands to the log before running them. Why do you think it does this?