filter_apache_log.pl
A simple script that allows specific fields from an Apache log (or, with extension of the code, any log) to be viewed. Only the combined log format is implemented at this point.
The script filters out some lines (e.g. blank lines), passes through some lines (e.g. the filename lines from a multi-file tail) and aborts on any unknown line (so that you know to handle, skip or pass through those lines).
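The three behaviours described above (pass through, skip, abort) can be sketched as a single decision function. This is a minimal illustration, not the script's own structure: the helper name `classify_line` is hypothetical, the field check is shortened to the first four fields, and the sample log line is made up.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of filter_apache_log.pl): decide what to
# do with one input line. Returns 'pass', 'skip' or 'fields', and dies on
# anything it does not recognise.
sub classify_line {
    my ($line) = @_;

    # "==> file <==" headers printed by tail when following several files
    return 'pass' if $line =~ /^==>.*<==$/;

    # blank lines carry no information
    return 'skip' if $line =~ /^\s*$/;

    # shortened check: ip, ident, user and [date] as in the combined format
    return 'fields' if $line =~ /^\s*[0-9.]+\s\S+\s\S+\s\[.*?\]\s/;

    die "Unmatched log line: ${line}";
}

# Made-up sample input, one line of each kind
my @samples = (
    "==> access.log <==\n",
    "\n",
    qq{127.0.0.1 - - [10/Jun/2013:12:00:00 +1000] "GET / HTTP/1.1" 200 512 "-" "monit/4.8.1"\n},
);
print classify_line($_), "\n" for @samples;
```

Dying on unknown input is deliberate: a line that silently disappears is harder to notice than a script that stops and shows the offending line.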
Also here: https://gist.github.com/bjdean/5726807#file-filter_apache_log-pl
Example usage
Tail out all apache access logs and look at IPs and User-Agents:
/var/log/apache2$ tail -f *access*log | filter_apache_log.pl -ip --usera | head -30
==> 60iv.aicsa.org.au-access.log <==
66.249.73.16 "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.73.16 "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
80.57.78.214 "Mozilla/5.0 (compatible; MJ12bot/v1.4.3; http://www.majestic12.co.uk/bot.php?+)"
80.57.78.214 "Mozilla/5.0 (compatible; MJ12bot/v1.4.3; http://www.majestic12.co.uk/bot.php?+)"
91.64.153.168 "Mozilla/5.0 (compatible; MJ12bot/v1.4.3; http://www.majestic12.co.uk/bot.php?+)"
91.64.153.168 "Mozilla/5.0 (compatible; MJ12bot/v1.4.3; http://www.majestic12.co.uk/bot.php?+)"
91.64.153.168 "Mozilla/5.0 (compatible; MJ12bot/v1.4.3; http://www.majestic12.co.uk/bot.php?+)"
91.64.153.168 "Mozilla/5.0 (compatible; MJ12bot/v1.4.3; http://www.majestic12.co.uk/bot.php?+)"
100.43.83.153 "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
100.43.83.153 "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
==> access.log <==
127.0.0.1 "monit/4.8.1"
127.0.0.1 "monit/4.8.1"
127.0.0.1 "libwww-perl/5.808"
127.0.0.1 "libwww-perl/5.808"
127.0.0.1 "libwww-perl/5.808"
127.0.0.1 "monit/4.8.1"
127.0.0.1 "libwww-perl/5.808"
127.0.0.1 "libwww-perl/5.808"
127.0.0.1 "libwww-perl/5.808"
127.0.0.1 "monit/4.8.1"
==> aicsa.org.au-access.log <==
216.172.141.107 "Mozilla/5.0 (Windows NT 5.1; rv:13.0) Gecko/20100101 Firefox/13.0.1"
183.221.250.141 "Mozilla/5.0 (Linux; U; Android 2.2; fr-fr; Desire_A8181 Build/FRF91) App3leWebKit/53.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1"
184.154.124.146 "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; FunWebProducts; .NET CLR 1.1.4322; PeoplePal 6.2)"
5.39.95.193 "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.112 Safari/535.1"
5.39.95.193 "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.112 Safari/535.1"
5.39.95.193 "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.112 Safari/535.1"
216.152.251.37 "Mozilla/5.0 (Windows NT 6.1; rv:5.0) Gecko/20100101 Firefox/5.02"
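The command above uses `-ip` (single dash) and `--usera` (a shortened `--useragent`). Both work because Getopt::Long, in its default configuration, accepts long options with one or two dashes and allows names to be abbreviated as long as they stay unique. A small sketch of that behaviour follows; the wrapper `parse_flags` is a hypothetical name used only for this illustration, and only three of the script's options are declared.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical wrapper: parse an argument list against three of the
# option names the script declares and report which flags were set.
sub parse_flags {
    my @args = @_;
    my ( $show_ip, $show_user, $show_useragent );
    GetOptionsFromArray(
        \@args,
        "ip"        => \$show_ip,
        "user"      => \$show_user,
        "useragent" => \$show_useragent,
    ) or die "bad options\n";
    return { ip => $show_ip, user => $show_user, useragent => $show_useragent };
}

# Same flags as in the example command line
my $flags = parse_flags( '-ip', '--usera' );
print join( ", ", grep { $flags->{$_} } sort keys %$flags ), "\n";
```

Note that `--user` is an exact match for the user option, so the shortest unambiguous spelling of the user-agent flag is `--usera`.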
Source
Download: filter_apache_log.pl Raw:
#!/usr/bin/perl

# filter_apache_log.pl - quick filter of apache logs to show specific fields
# Copyright (C) 2008 Bradley Dean <bjdean@bjdean.id.au>
#
# This program is free software: you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free Software
# Foundation, either version 3 of the License, or (at your option) any later
# version.
#
# This program is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
# PARTICULAR PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along with
# this program. If not, see <http://www.gnu.org/licenses/>.

# A simple script - only combined log format implemented at this point
# allows specific fields from an apache log (or any log with extension
# of the code) to be viewed.
#
# The script filters out some lines (eg blank lines), passes through some
# lines (eg the filename lines from a multi-file tail) and will abort
# with any unknown line (so that you know to handle/skip/pass-through those
# lines)
#
# Example usage: Look at IPs and User-Agents
#
# $ tail -f /var/log/apache/*access*log | filter_apache_log.pl -ip -useragent

use strict;
use warnings;

use Getopt::Long;
use IO::Handle;

# No buffering output
autoflush STDOUT 1;

# command line arguments
my $show_all;
my ( $show_ip, $show_ident, $show_user, $show_date, $show_path,
     $show_response, $show_size, $show_referrer, $show_useragent );
my $result = GetOptions(
    "ip"        => \$show_ip,
    "ident"     => \$show_ident,
    "user"      => \$show_user,
    "date"      => \$show_date,
    "path"      => \$show_path,
    "response"  => \$show_response,
    "size"      => \$show_size,
    "referrer"  => \$show_referrer,
    "useragent" => \$show_useragent,
    "all"       => \$show_all,
) or die;

# Read and filter
LINE: while ( my $line = <STDIN> ) {
    # Special lines - pass through
    if (
        # tail -f file names
        $line =~ /^==>.*<==$/
    ) {
        print $line;
        next LINE;
    }

    # Special lines - skip
    if (
        # empty lines
        $line =~ /^\s*$/
    ) {
        next LINE;
    }

    # Apache line formats
    my ($ip, $ident, $user, $date, $path, $response, $size, $referrer, $useragent);
    if ( my @match = $line =~ /
            ^\s*
            ([0-9\.]+)\s    # ip
            (\S+)\s         # ident
            (\S+)\s         # user
            (\[.*?\])\s+    # date
            (".*?")\s+      # path
            (\S+)\s+        # response
            (\S+)\s+        # size
            (".*?")\s+      # referrer
            (".*?")         # user-agent
        /x
    ) {
        ($ip, $ident, $user, $date, $path, $response, $size, $referrer, $useragent) = @match;

        # Build the output in its own variable rather than shadowing $line
        my $out = "";
        $out .= fmt_val($ip)        if ( $show_all || $show_ip );
        $out .= fmt_val($ident)     if ( $show_all || $show_ident );
        $out .= fmt_val($user)      if ( $show_all || $show_user );
        $out .= fmt_val($date)      if ( $show_all || $show_date );
        $out .= fmt_val($path)      if ( $show_all || $show_path );
        $out .= fmt_val($response)  if ( $show_all || $show_response );
        $out .= fmt_val($size)      if ( $show_all || $show_size );
        $out .= fmt_val($referrer)  if ( $referrer ne '"-"' ) && ( $show_all || $show_referrer );
        $out .= fmt_val($useragent) if ( $show_all || $show_useragent );
        print "${out}\n" if ( $out =~ /\S/ );
        next LINE;
    }
    else {
        die "Unmatched log line: ${line}";
    }
}

sub fmt_val {
    my ($val) = @_;
    # Test for defined and non-empty so that a literal "0" is still printed
    if ( defined $val && length $val ) {
        return "${val} ";
    }
    else {
        return " ";
    }
}
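The description mentions that other log formats could be handled by extending the code. As one possible direction, here is a sketch of a pattern for Apache's Common Log Format, which is the combined format without the trailing referrer and user-agent fields. The names `$common_re` and `parse_common` are hypothetical and not part of the script, and the sample line is made up.

```perl
use strict;
use warnings;

# Hypothetical pattern for the Common Log Format: same leading fields as
# the combined pattern in the script, but the line ends after the size.
my $common_re = qr/
    ^\s*
    ([0-9\.]+)\s    # ip
    (\S+)\s         # ident
    (\S+)\s         # user
    (\[.*?\])\s+    # date
    (".*?")\s+      # path
    (\S+)\s+        # response
    (\S+)\s*\z      # size, then end of line
/x;

# Hypothetical helper: return a hash ref of named fields, or nothing
# when the line does not look like a Common Log Format entry.
sub parse_common {
    my ($line) = @_;
    my @match = $line =~ $common_re or return;
    my %field;
    @field{qw(ip ident user date path response size)} = @match;
    return \%field;
}

# Made-up sample line
my $entry = parse_common(
    qq{127.0.0.1 - - [10/Jun/2013:12:00:00 +1000] "GET / HTTP/1.1" 200 512\n}
);
print "$entry->{ip} $entry->{response}\n" if $entry;
```

If something like this were added to the main loop, the combined pattern should probably be tried first: the lazy quoted-field match here can stretch across a combined line whose user-agent happens to contain exactly two words, so the shorter pattern is not a reliable way to tell the two formats apart.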