Exercises - Error Handling
This chapter suggested several edits to collate.py.
Suppose our script now reads as follows:
"""
Combine multiple word count CSV-files
into a single cumulative count.
"""
import csv
import argparse
from collections import Counter
import logging
import utilities as util
ERRORS = {
'not_csv_suffix' : '{fname}: File must end in .csv',
}
def update_counts(reader, word_counts):
"""Update word counts with data from another reader/file."""
for word, count in csv.reader(reader):
word_counts[word] += int(count)
def main(args):
"""Run the command line program."""
word_counts = Counter()
logging.info('Processing files...')
for fname in args.infiles:
logging.debug(f'Reading in {fname}...')
if fname[-4:] != '.csv':
msg = ERRORS['not_csv_suffix'].format(fname=fname)
raise OSError(msg)
with open(fname, 'r') as reader:
logging.debug('Computing word counts...')
update_counts(reader, word_counts)
util.collection_to_csv(word_counts, num=args.num)
if __name__ == '__main__':
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument('infiles', type=str, nargs='*',
help='Input file names')
parser.add_argument('-n', '--num',
type=int, default=None,
help='Output n most frequent words')
args = parser.parse_args()
main(args)
The following exercises will ask you to make further edits to collate.py.
1) Set the logging level
Define a new command-line flag for collate.py called --verbose (or -v) that changes the logging level from WARNING (the default) to DEBUG (the noisiest level).
Hint: the following command changes the logging level to DEBUG:
logging.basicConfig(level=logging.DEBUG)
Once finished, running collate.py with and without the -v flag should produce the following output:
$ python bin/collate.py results/dracula.csv results/moby_dick.csv -n 5
the,22559
and,12306
of,10446
to,9192
a,7629
$ python bin/collate.py results/dracula.csv results/moby_dick.csv -n 5 -v
INFO:root:Processing files...
DEBUG:root:Reading in results/dracula.csv...
DEBUG:root:Computing word counts...
DEBUG:root:Reading in results/moby_dick.csv...
DEBUG:root:Computing word counts...
the,22559
and,12306
of,10446
to,9192
a,7629
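One possible shape for such a flag, offered as a sketch rather than the intended solution: argparse's store_true action turns -v into a boolean, which can then select the level passed to logging.basicConfig. The helper names build_parser and choose_level are hypothetical and are not part of collate.py.

```python
import argparse
import logging


def build_parser():
    """Build a parser with a boolean verbosity flag (hypothetical helper)."""
    parser = argparse.ArgumentParser(description='Verbosity flag sketch')
    parser.add_argument('-v', '--verbose',
                        action='store_true', default=False,
                        help='Change logging threshold from WARNING to DEBUG')
    return parser


def choose_level(verbose):
    """Map the flag to a logging level (hypothetical helper)."""
    return logging.DEBUG if verbose else logging.WARNING


# Parse explicit argument lists so the sketch runs without a command line.
quiet_args = build_parser().parse_args([])
noisy_args = build_parser().parse_args(['-v'])

quiet_level = choose_level(quiet_args.verbose)
noisy_level = choose_level(noisy_args.verbose)

# In a script, the chosen level would be passed to basicConfig once,
# before the first logging call.
logging.basicConfig(level=noisy_level)
```

Because the flag defaults to False, running without -v keeps the WARNING threshold, so the INFO and DEBUG messages stay hidden.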
2) Send the logging output to file
In Exercise 1), logging information is printed to the screen when the verbose flag is activated. This is problematic if we want to redirect the output from collate.py to a CSV file, because the logging information can end up in the CSV file alongside the words and their counts (by default it is written to standard error, so this happens whenever both output streams are captured).
Edit collate.py so that the logging information is sent to a log file called collate.log instead. (Hint: logging.basicConfig has an argument called filename.)
Create a new command-line option -l or --logfile so that the user can specify a different name for the log file if they don’t like the default name of collate.log.
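A minimal sketch of how the filename argument and a --logfile option might fit together, under some assumptions: the log is written to a temporary directory so the example leaves the working directory alone, and force=True (available from Python 3.8) is included only so the sketch still works if logging was already configured earlier in the same session. A script that calls basicConfig once would not need it.

```python
import argparse
import logging
import os
import tempfile

parser = argparse.ArgumentParser(description='Log file option sketch')
parser.add_argument('-l', '--logfile', type=str, default='collate.log',
                    help='Name of the log file')

# Use a throwaway location for the demonstration log file.
log_path = os.path.join(tempfile.mkdtemp(), 'demo.log')
args = parser.parse_args(['--logfile', log_path])

# With filename set, messages go to the file rather than to the screen.
logging.basicConfig(level=logging.DEBUG, filename=args.logfile, force=True)
logging.debug('Reading in example.csv...')

# Read the log back to confirm where the message went.
with open(args.logfile, 'r') as reader:
    log_text = reader.read()
```

When no -l option is given, args.logfile falls back to the default of collate.log.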
3) Handling exceptions
Modify the script collate.py so that it catches any exceptions that are raised when it tries to open files and records them in the log file. When you are finished, the program should collate all the files it can, rather than halting as soon as it encounters a problem.
Modify your first solution to handle nonexistent files and permission problems separately.
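When handling these cases separately, it helps to know that FileNotFoundError and PermissionError are both subclasses of OSError, and that Python runs the first except clause that matches. The sketch below, which uses a hypothetical helper called try_to_read and returns messages instead of logging them, therefore lists the specific exceptions before the general one.

```python
import os
import tempfile


def try_to_read(fname):
    """Return a short status message for one file (hypothetical helper)."""
    try:
        with open(fname, 'r') as reader:
            reader.read()
    except FileNotFoundError:
        return f'{fname} not processed: File does not exist'
    except PermissionError:
        return f'{fname} not processed: No read permission'
    except OSError as error:
        # Reached only for OSErrors not matched by the clauses above.
        return f'{fname} not processed: {error}'
    return f'{fname} processed'


# A path inside a fresh temporary directory cannot exist yet.
missing = os.path.join(tempfile.mkdtemp(), 'missing.csv')
status = try_to_read(missing)
```

If the OSError clause came first, it would match every case and the two more specific clauses would never run.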
4) Testing error handling
In our suggested solution to the previous exercise, we modified collate.py to handle different types of errors associated with reading input files. Suppose the main function in collate.py now reads:
def main(args):
    """Run the command line program."""
    log_lev = logging.DEBUG if args.verbose else logging.WARNING
    logging.basicConfig(level=log_lev, filename=args.logfile)
    word_counts = Counter()
    logging.info('Processing files...')
    for fname in args.infiles:
        try:
            logging.debug(f'Reading in {fname}...')
            if fname[-4:] != '.csv':
                msg = ERRORS['not_csv_suffix'].format(
                    fname=fname)
                raise OSError(msg)
            with open(fname, 'r') as reader:
                logging.debug('Computing word counts...')
                update_counts(reader, word_counts)
        except FileNotFoundError:
            msg = f'{fname} not processed: File does not exist'
            logging.warning(msg)
        except PermissionError:
            msg = f'{fname} not processed: No read permission'
            logging.warning(msg)
        except Exception as error:
            msg = f'{fname} not processed: {error}'
            logging.warning(msg)
    util.collection_to_csv(word_counts, num=args.num)
It is difficult to write a simple unit test for the lines of code dedicated to reading input files, because main is a long function that requires command-line arguments as input. Edit collate.py so that the lines of code responsible for processing an input file appear in their own function that reads as follows (i.e., once you are done, main should call process_file in place of the existing code):
def process_file(fname, word_counts):
    """Read file and update word counts"""
    logging.debug(f'Reading in {fname}...')
    if fname[-4:] != '.csv':
        msg = ERRORS['not_csv_suffix'].format(
            fname=fname)
        raise OSError(msg)
    with open(fname, 'r') as reader:
        logging.debug('Computing word counts...')
        update_counts(reader, word_counts)
Add a unit test to test_zipfs.py that uses pytest.raises to check that the new collate.process_file function raises an OSError if the input file does not end in .csv. Run pytest to check that the new test passes.
Add a unit test to test_zipfs.py that uses pytest.raises to check that the new collate.process_file function raises a FileNotFoundError if the input file does not exist. Run pytest to check that the new test passes.
Use the coverage library (see the section on test coverage) to check that the relevant commands in process_file (specifically raise OSError and open(fname, 'r')) were indeed tested.
5) Error catalogs
In the section on writing useful error messages we started to define an error catalog called ERRORS.
Recalling PEP 8 and the coding style guidelines, explain why we have used capital letters for the name of the catalog.
Python has three ways to format strings: the % operator, the str.format method, and f-strings (where the “f” stands for “format”). Look up the documentation for each and explain why we have to use str.format rather than f-strings for formatting error messages in our catalog/lookup table.
There’s a good chance we will eventually want to use the error messages we’ve defined in other scripts besides collate.py. To avoid duplication, move ERRORS to the utilities module that was first created in an earlier section.
6) Tracebacks
Run the following code:
try:
    1/0
except Exception as e:
    help(e.__traceback__)
What kind of object is e.__traceback__?
What useful information can you get from it?
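To explore the object further without reading through the help text, a short sketch like the one below can pull out a few of its attributes. The function name inspect_division_error is hypothetical. The attributes tb_frame, tb_lineno and tb_next, and the standard-library function traceback.extract_tb, are part of Python itself.

```python
import traceback
import types


def inspect_division_error():
    """Collect a few facts from the traceback of a ZeroDivisionError."""
    try:
        1 / 0
    except ZeroDivisionError as error:
        tb = error.__traceback__
        return {
            # The traceback is an instance of types.TracebackType.
            'is_traceback': isinstance(tb, types.TracebackType),
            # Name of the function whose frame this entry refers to.
            'function': tb.tb_frame.f_code.co_name,
            # Line number in that frame where the exception occurred.
            'line': tb.tb_lineno,
            # tb_next links to deeper frames; None means there are none.
            'is_last': tb.tb_next is None,
            # extract_tb turns the chain into a list of frame summaries.
            'frames': len(traceback.extract_tb(tb)),
        }


facts = inspect_division_error()
```

Here the exception is raised and caught in the same function, so the chain contains a single entry. An error raised several calls deep would produce one entry per call.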