I needed to make sense of a lengthy log file. The server had been running over the weekend and the log file showed an increasing packet error count. (Don’t worry what a packet error is, it doesn’t really matter as far as this post goes). This post explains how the standard Unix tools helped me dissect the log file, and indeed any other log file.
Heads and Tails
First, I needed the initial and final error counts.
$ head logfile | grep "packet errors" Info: Fri 17:34:05, packet errors: 0 $ tail logfile | grep "packet errors" Info: Mon 08:32:11, packet errors: 11323
So what exactly had gone wrong, and when? Paging through the filtered output would take some time since the number of reported packet error counts ran to five figures.
$ grep "packet errors" logfile | less
$ grep "packet errors" logfile | wc -l 125220
Part of the problem was that whatever logged the message simply printed out the error count every couple of seconds, whether or not this error count had changed. Going back to Friday, reconfiguring the logger and re-running the experiment wasn’t an option, but luckily I had too much information rather than too little: all I had to do was process it correctly.
What looked more useful was filtering just the lines at which the
error count had changed. Filtering out repeated lines using
wouldn’t work, since every line’s timestamp made it unique. Happily,
uniq has an option to ignore a specified number of fields.
$ grep "packet errors" logfile | uniq -f 3
Piping this command to
wc -l indicated that the number of
interesting lines had been thinned down to around three thousand —
the kind of report which could be paged through comfortably enough.
$ grep "packet errors" logfile | uniq -f 3 | wc -l 3432
It’s also possible to home in on other points of
interest. For example, we could investigate the maximum period of
stability by finding when the error count doesn’t change for
longest. This can be done using the
-c option to
precedes each output line with a count of the the number of times that
line was repeated in the input; then sorting using this count field as
a numeric key; then picking the last output value from the
$ grep "packet errors" logfile | \ uniq -f 3 -c | sort -k1 -n | tail -1 277 Info: Sat 23:00:15, packet errors: 4645
This tells us that on Saturday evening the error count remained stable for longest — in fact, an error count of 4645 repeated 277 times in the logfile, a duration of just 9 minutes (given that the packet error count repeated every couple of seconds).
This is just a flavour of what the Unix shell tools can do when piped
together. We’ve combined some simple tools (
grep) into complex and powerful commands
using just a few keystrokes. Imagine trying to design a GUI which
would allow you to analyse a log file as shown above. Now try
designing a GUI which can analyse any log file ever.
I wouldn’t pretend the Unix shell tools are user-friendly. In fact some
are distinctly arcane (
find comes to mind). When I started out, I
found the Unix manual difficult to use, and although
improved things it’s still hard to find out about a command without
knowing its name. And if there’s a decent, hyper-linked online
reference, I’ve yet to find it.
Extending the Toolset
Shell tools aren’t always capable of even the kind of text processing described in this article. For example, if the logfile had more structure to it — or even was written out as XML — then we’d probably be find ourselves struggling. Fortunately a scripting language is very happy to help out in such cases. Perl remains a good choice, if it’s one-liners you like. Python and Ruby are better for longer-lived scripts.
Equally, there’s a point at which you need to convert text into something more visual — a graph or a histogram, perhaps. Text processing can be used to generate a format suitable for importing into a plotting package.
A Unix expert once told me that the difference between a Unix expert and a non-expert wasn’t aptitude or years of accumulated knowledge — rather it came down to a willingness to read carefully through the manual. I’m no expert, but regard this as sound advice.