trdsql Log aggregation
Posted on:
Log aggregation
Apache and nginx Log are also becoming established in the way of outputting in LTSV format.
An example of analyzing such Log with trdsql.
The output side customizes the apache LogFormat setting to the following custom format.
LogFormat "host:%h\tident:%l\tuser:%u\ttime:%t\treq:%r\tstatus:%>s\tsize:%b\treferer:\%{Referer}i\tua:%{User-Agent}i" combined_ltsv
The items host, ident, user, time, req, status, size, referer, ua are output.
The actual Log looks like this.
host:176.99.192.42 ident:- user:- time:[21/Oct/2019:21:33:53 +0900] req:GET /category/software HTTP/1.1 status:200 size:138 referer:- ua:Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
host:192.54.157.102 ident:- user:- time:[21/Oct/2019:21:33:53 +0900] req:GET /item/electronics/4478 HTTP/1.1 status:200 size:60 referer:/category/sports ua:Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:9.0.1) Gecko/20100101 Firefox/9.0.1
host:88.60.137.115 ident:- user:- time:[21/Oct/2019:21:33:53 +0900] req:POST /search/?c=Games+Electronics HTTP/1.1 status:200 size:98 referer:/item/networking/929 ua:Mozilla/5.0 (iPhone; CPU iPhone OS 5_0_1 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A405 Safari/7534.48.3
...
analysis
First, try -a of trdsql.
The table name is log.ltsv.
The file type is LTSV.
Data types:
+-------------+------+
| column name | type |
+-------------+------+
| \`host\` | text |
| ident | text |
| \`user\` | text |
| \`time\` | text |
| req | text |
| \`status\` | text |
| \`size\` | text |
| referer | text |
| ua | text |
+-------------+------+
Data samples:
+---------------+-------+----------+------------------------------+--------------------------------+------------+----------+---------+--------------------------------+
| \`host\` | ident | \`user\` | \`time\` | req | \`status\` | \`size\` | referer | ua |
+---------------+-------+----------+------------------------------+--------------------------------+------------+----------+---------+--------------------------------+
| 176.99.192.42 | - | - | [21/Oct/2019:21:33:53 +0900] | GET /category/software | 200 | 138 | - | Mozilla/5.0 (compatible; MSIE |
| | | | | HTTP/1.1 | | | | 9.0; Windows NT 6.1; WOW64; |
| | | | | | | | | Trident/5.0) |
+---------------+-------+----------+------------------------------+--------------------------------+------------+----------+---------+--------------------------------+
Examples:
trdsql "SELECT \`host\`, ident, \`user\`, \`time\`, req, \`status\`, \`size\`, referer, ua FROM log.ltsv"
trdsql "SELECT \`host\`, ident, \`user\`, \`time\`, req, \`status\`, \`size\`, referer, ua FROM log.ltsv WHERE \`host\` = '176.99.192.42'"
trdsql "SELECT \`host\`, count(\`host\`) FROM log.ltsv GROUP BY \`host\`"
trdsql "SELECT \`host\`, ident, \`user\`, \`time\`, req, \`status\`, \`size\`, referer, ua FROM log.ltsv ORDER BY \`host\` LIMIT 10"
Execute the Examples as a hint and execute it using the SQL introduced so far.
Output the top 5
Output the top 5 hosts with the most requests.
trdsql -oat "SELECT \`host\`, count(\`host\`) as count FROM log.ltsv GROUP BY \`host\` ORDER BY count DESC LIMIT 5"
+----------------+-------+
| host | count |
+----------------+-------+
| 36.69.176.222 | 5 |
| 92.132.226.51 | 4 |
| 76.222.144.225 | 4 |
| 28.63.137.225 | 4 |
| 28.57.188.28 | 4 |
+----------------+-------+
Output the top 5 hosts with the most requests
trdsql -oat "SELECT req, count(req) as count FROM log.ltsv GROUP BY req ORDER BY count DESC LIMIT 5"
+--------------------------------+-------+
| req | count |
+--------------------------------+-------+
| GET /category/software | 74 |
| HTTP/1.1 | |
| GET /category/electronics | 73 |
| HTTP/1.1 | |
| GET /category/games HTTP/1.1 | 66 |
| GET /category/books HTTP/1.1 | 44 |
| GET /category/office HTTP/1.1 | 30 |
+--------------------------------+-------+
Output with search condition
Output requests and counts other than status 200
trdsql -oat "SELECT req, status,count(req) as count FROM log.ltsv WHERE status != '200' GROUP BY req, status ORDER BY count DESC"
+-------------------------------+--------+-------+
| req | status | count |
+-------------------------------+--------+-------+
| GET /item/books/3230 HTTP/1.1 | 404 | 1 |
| GET /item/games/4672 HTTP/1.1 | 404 | 1 |
+-------------------------------+--------+-------+