Tracking Live Web Logfiles with Tail over SSH
10 April, 2007 - Stuart Brown
Using Apache with Linux
There are two distinct types of people when it comes to Linux: those who are pretty much clueless about the whole OSS thing and generally stick with Windows or MacOS, and those who are seemingly born naturals at command lines, regular expressions and other such arcane operations. I'm fortunate to be squarely in the middle of the two extremes (i.e. I use Linux but I'm far from an expert) - and if you're in a similar situation, you may find the following useful.
Tail is a Unix/Linux command that shows the last 10 lines (or another number you specify) of a given file. That's it. Most of the power of the command line in Linux comes from the sheer simplicity of the commands available. Couple 'tail' with a dynamic log file, however, and you have a powerful diagnostic tool.
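If you'd like to see that behaviour before going anywhere near a real log, here's a minimal sketch. The 20-line sample file is made up purely for illustration; the only thing taken from tail itself is the default of 10 lines and the -n option for choosing a different number.

```shell
# Build a throwaway 20-line file so the example does not depend on any real log.
sample=$(mktemp)
seq 1 20 > "$sample"

# With no options, tail prints the last 10 lines (11 through 20 here).
tail "$sample"

# -n sets the number of lines; this keeps only the final 3.
last_three=$(tail -n 3 "$sample")
echo "$last_three"
```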
If you have a Linux server of your own, you can get an instant and extremely low-level insight into the workings of your webserver by using tail to view your logs. You'll need a couple of things - first, you'll need SSH/Telnet access to your server. Not all hosting packages offer such a luxury - shared hosting in particular is unlikely to have this feature.
Secondly, you'll need an SSH client with which to connect to your server. In the example that follows I'll be using PuTTY, which is a free and comprehensively featured client. Once you've got your SSH client (the download is just an .EXE for Windows, no installation required), you can connect to your server with your login details (which you should either know already, or which your hosting service should have provided).
Open PuTTY (or your SSH client of choice - MacOS comes with OpenSSH which should prove sufficient for the task), and specify your host name. In my case, I've typed 'myserver' - which, conveniently enough, is mapped to my server. Click 'Open' to start the connection process.
You'll get a black command line screen once you're connected, and you'll be prompted for your username. In my case, I'm logging in as 'root'.
Once you've typed in your username, you'll be prompted for your password. Type it as given and press enter - the characters will not appear on screen, obfuscated or otherwise.
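Next, we locate our web server log files. If you're running a default installation of Apache, use cd to get to the main directory where the logs are stored; this varies by distribution, and is commonly /var/log/httpd/ on Red Hat-style systems or /var/log/apache2/ on Debian-style ones.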
Depending on your configuration, the logfiles may be elsewhere - if you're using Plesk for virtual host management, for instance, then type:
cd /var/www/vhosts/[your domain]/statistics/logs/
Otherwise, you may have to do some hunting to track down your log files. You could try the following:
find / -name access_log
The find command will then scour the filesystem for files called 'access_log'. Navigate to the most likely looking logfile for your web server / virtual host (cd /path/to/directory/).
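Searching from / can throw up a lot of "Permission denied" noise if you're not logged in as root, so here's a hedged variation on the same idea. The directory tree below is entirely made up (including the example.com vhost) and lives in a temporary folder, just so the sketch can run anywhere; the -type f and 2>/dev/null parts are the bits worth borrowing.

```shell
# Hypothetical directory tree standing in for a real filesystem.
root=$(mktemp -d)
mkdir -p "$root/var/log/httpd" "$root/var/www/vhosts/example.com/statistics/logs"
touch "$root/var/log/httpd/access_log" \
      "$root/var/www/vhosts/example.com/statistics/logs/access_log"

# -type f skips directories; 2>/dev/null hides "Permission denied" messages,
# which matters when searching from / as a non-root user.
found=$(find "$root" -type f -name access_log 2>/dev/null | sort)
echo "$found"
```

On a real server you'd replace "$root" with / (or a narrower starting point such as /var, which is usually quicker).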
Once you're in the relevant directory, you can use the 'tail' command to start tracking your logs:
tail -f access_log
The '-f' switch will make tail follow the log file - so as new lines are added, they'll be output on the screen. Press Ctrl+C when you want to stop following. Essentially, what you've got is a very basic, but very comprehensive and up-to-date version of your web statistics. The lines you see moving up the screen contain the details of every single request made to your webserver, whether human or bot, and all the usual information found in web stats - referring URL, user agent, originating IP etc.
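On a busy server the lines can scroll past faster than you can read them, so one option is to pipe the followed log through grep and keep only the requests you care about. The following is a rough sketch rather than a recipe: the log entries are invented (192.0.2.x is a documentation-only address range), they assume Apache's common/combined log layout where the status code follows the quoted request, and the timeout command (GNU coreutils) is only there so the demo stops by itself.

```shell
# Throwaway log with two made-up entries.
log=$(mktemp)
out=$(mktemp)
echo '192.0.2.1 - - [10/Apr/2007:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "ExampleAgent"' >> "$log"
echo '192.0.2.2 - - [10/Apr/2007:10:00:01 +0000] "GET /missing HTTP/1.1" 404 209 "-" "ExampleAgent"' >> "$log"

# Simulate a new request arriving one second after we start watching.
( sleep 1; echo '192.0.2.3 - - [10/Apr/2007:10:00:02 +0000] "GET /gone HTTP/1.1" 404 209 "-" "ExampleAgent"' >> "$log" ) &

# Follow the log, keeping only lines whose status code is 404.
# --line-buffered makes grep emit each match immediately instead of in batches.
# timeout stops tail after 3 seconds for this demo; on a live server you
# would drop it and press Ctrl+C when finished.
timeout 3 tail -f "$log" | grep --line-buffered '" 404 ' > "$out"
wait
cat "$out"
```

Against a real log, the equivalent would be tail -f access_log | grep --line-buffered '" 404 ' with the output left on screen.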
While the potential for analytics is limited, this low-level technique can be useful in certain circumstances - you can track hacking attempts and bandwidth theft as they happen, and get a much better feel for exactly how busy your web server is.
On a final note, apologies to those who are already au fait with the Linux command line - this is pretty basic stuff. But for those with their own web server, SSH access and a fear of getting their hands dirty with the command line, I hope this will be of some help.
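As a small taste of what "limited analytics" can still look like, here's a sketch that counts which IP addresses appear most often in the most recent lines of a log. The sample entries are invented, and it assumes the client IP is the first field on each line, as it is in Apache's common and combined log formats; if your server uses a custom format, the field number will differ.

```shell
# Made-up sample log; 192.0.2.x is a documentation-only address range.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.1 - - [10/Apr/2007:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "ExampleAgent"
192.0.2.2 - - [10/Apr/2007:10:00:01 +0000] "GET /logo.png HTTP/1.1" 200 2048 "http://example.org/" "ExampleAgent"
192.0.2.2 - - [10/Apr/2007:10:00:02 +0000] "GET /photo.jpg HTTP/1.1" 200 4096 "http://example.org/" "ExampleAgent"
192.0.2.2 - - [10/Apr/2007:10:00:03 +0000] "GET /banner.gif HTTP/1.1" 200 1024 "http://example.org/" "ExampleAgent"
EOF

# Take the most recent 1000 lines, keep the first field (the client IP),
# count how often each IP appears, and list the five busiest first.
top=$(tail -n 1000 "$log" | awk '{print $1}' | sort | uniq -c | sort -rn | head -n 5)
echo "$top"
```

An address that dominates the list while requesting nothing but images, with someone else's site as the referrer, is the sort of pattern that hints at bandwidth theft.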