2020-10-23 20:08:35 -04:00
---
2020-12-01 13:09:42 -05:00
stage: Enablement
group: Distribution
2020-11-26 01:09:20 -05:00
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
2020-10-23 20:08:35 -04:00
---
2021-05-27 17:10:59 -04:00
# Parsing GitLab logs with `jq` **(FREE SELF)**
2020-03-30 11:07:51 -04:00
We recommend using log aggregation and search tools like Kibana and Splunk whenever possible,
but if they are not available you can still quickly parse
2020-10-23 05:08:41 -04:00
[GitLab logs ](../logs.md ) in JSON format
2020-10-14 20:08:42 -04:00
(the default in GitLab 12.0 and later) using [`jq` ](https://stedolan.github.io/jq/ ).
2020-03-30 11:07:51 -04:00
2022-04-15 08:09:42 -04:00
NOTE:
2022-05-11 17:08:09 -04:00
Specifically for summarizing error events and basic usage statistics,
2022-04-15 08:09:42 -04:00
the GitLab Support Team provides the specialised
[`fast-stats` tool ](https://gitlab.com/gitlab-com/support/toolbox/fast-stats/#when-to-use-it ).
2020-03-30 11:07:51 -04:00
## What is JQ?
2020-10-15 20:08:56 -04:00
As noted in its [manual ](https://stedolan.github.io/jq/manual/ ), `jq` is a command-line JSON processor. The following examples
2020-03-30 11:07:51 -04:00
include use cases targeted for parsing GitLab log files.
## Parsing Logs
2022-04-15 08:09:42 -04:00
The examples listed below address their respective log files by
their relative Omnibus paths and default filenames.
Find the respective full paths in the [GitLab logs sections ](../logs.md#production_jsonlog ).
2020-03-30 11:07:51 -04:00
### General Commands
#### Pipe colorized `jq` output into `less`
2020-05-19 14:08:11 -04:00
```shell
2020-03-30 11:07:51 -04:00
jq . < FILE > -C | less -R
```
#### Search for a term and pretty-print all matching lines
2020-05-19 14:08:11 -04:00
```shell
2020-03-30 11:07:51 -04:00
grep < TERM > < FILE > | jq .
```
#### Skip invalid lines of JSON
2020-05-19 14:08:11 -04:00
```shell
2020-03-30 11:07:51 -04:00
jq -cR 'fromjson?' file.json | jq < COMMAND >
```
By default `jq` will error out when it encounters a line that is not valid JSON.
This skips over all invalid lines and parses the rest.
2021-03-09 10:08:59 -05:00
#### Print a JSON log's time range
```shell
cat log.json | (head -1; tail -1) | jq '.time'
```
Use `zcat` if the file has been rotated and compressed:
```shell
zcat @400000006026b71d1a7af804 .s | (head -1; tail -1) | jq '.time'
zcat some_json.log.25.gz | (head -1; tail -1) | jq '.time'
```
2021-09-07 17:10:59 -04:00
#### Get activity for correlation ID across multiple JSON logs in chronological order
2021-08-24 05:10:45 -04:00
```shell
grep -hR < correlationID > | jq -c -R 'fromjson?' | jq -C -s 'sort_by(.time)' | less -R
```
2022-04-15 08:09:42 -04:00
### Parsing `gitlab-rails/production_json.log` and `gitlab-rails/api_json.log`
2020-03-30 11:07:51 -04:00
#### Find all requests with a 5XX status code
2020-05-19 14:08:11 -04:00
```shell
2021-08-01 11:08:49 -04:00
jq 'select(.status >= 500)' < FILE >
2020-03-30 11:07:51 -04:00
```
#### Top 10 slowest requests
2020-05-19 14:08:11 -04:00
```shell
2022-04-29 11:08:33 -04:00
jq -s 'sort_by(-.duration_s) | limit(10; .[])' < FILE >
2020-03-30 11:07:51 -04:00
```
#### Find and pretty print all requests related to a project
2020-05-19 14:08:11 -04:00
```shell
2020-03-30 11:07:51 -04:00
grep < PROJECT_NAME > < FILE > | jq .
```
#### Find all requests with a total duration > 5 seconds
2020-05-19 14:08:11 -04:00
```shell
2022-04-29 11:08:33 -04:00
jq 'select(.duration_s > 5000)' < FILE >
2020-03-30 11:07:51 -04:00
```
#### Find all project requests with more than 5 rugged calls
2020-05-19 14:08:11 -04:00
```shell
2020-03-30 11:07:51 -04:00
grep < PROJECT_NAME > < FILE > | jq 'select(.rugged_calls > 5)'
```
#### Find all requests with a Gitaly duration > 10 seconds
2020-05-19 14:08:11 -04:00
```shell
2022-04-29 11:08:33 -04:00
jq 'select(.gitaly_duration_s > 10000)' < FILE >
2020-03-30 11:07:51 -04:00
```
#### Find all requests with a queue duration > 10 seconds
2020-05-19 14:08:11 -04:00
```shell
2022-04-29 11:08:33 -04:00
jq 'select(.queue_duration_s > 10000)' < FILE >
2020-03-30 11:07:51 -04:00
```
#### Top 10 requests by # of Gitaly calls
2020-05-19 14:08:11 -04:00
```shell
2020-03-30 11:07:51 -04:00
jq -s 'map(select(.gitaly_calls != null)) | sort_by(-.gitaly_calls) | limit(10; .[])' < FILE >
```
2022-04-15 08:09:42 -04:00
### Parsing `gitlab-rails/production_json.log`
2020-03-30 11:07:51 -04:00
#### Print the top three controller methods by request volume and their three longest durations
2020-05-19 14:08:11 -04:00
```shell
2022-04-29 11:08:33 -04:00
jq -s -r 'group_by(.controller+.action) | sort_by(-length) | limit(3; .[]) | sort_by(-.duration_s) | "CT: \(length)\tMETHOD: \(.[0].controller)#\(.[0].action)\tDURS: \(.[0].duration_s), \(.[1].duration_s), \(.[2].duration_s)"' production_json.log
2020-03-30 11:07:51 -04:00
```
**Example output**
```plaintext
CT: 2721 METHOD: SessionsController#new DURS: 844.06, 713.81, 704.66
CT: 2435 METHOD: MetricsController#index DURS: 299.29, 284.01, 158.57
CT: 1328 METHOD: Projects::NotesController#index DURS: 403.99, 386.29, 384.39
```
2022-04-15 08:09:42 -04:00
### Parsing `gitlab-rails/api_json.log`
2020-03-30 11:07:51 -04:00
#### Print top three routes with request count and their three longest durations
2020-05-19 14:08:11 -04:00
```shell
2022-04-29 11:08:33 -04:00
jq -s -r 'group_by(.route) | sort_by(-length) | limit(3; .[]) | sort_by(-.duration_s) | "CT: \(length)\tROUTE: \(.[0].route)\tDURS: \(.[0].duration_s), \(.[1].duration_s), \(.[2].duration_s)"' api_json.log
2020-03-30 11:07:51 -04:00
```
**Example output**
```plaintext
CT: 2472 ROUTE: /api/:version/internal/allowed DURS: 56402.65, 38411.43, 19500.41
CT: 297 ROUTE: /api/:version/projects/:id/repository/tags DURS: 731.39, 685.57, 480.86
CT: 190 ROUTE: /api/:version/projects/:id/repository/commits DURS: 1079.02, 979.68, 958.21
```
2022-04-12 14:10:14 -04:00
### Parsing `gitlab-rails/geo.log`
#### Find most common Geo sync errors
If [the `geo:status` Rake task ](../geo/replication/troubleshooting.md#sync-status-rake-task )
repeatedly reports that some items never reach 100%,
the following command helps to focus on the most common errors.
```shell
jq --raw-output 'select(.severity == "ERROR") | [.project_path, .message] | @tsv ' geo.log | sort | uniq -c | sort | tail
```
2020-03-30 11:07:51 -04:00
### Parsing `gitaly/current`
2022-04-15 08:09:42 -04:00
The following examples are useful to [troubleshoot Gitaly ](../gitaly/troubleshooting.md ).
2020-03-30 11:07:51 -04:00
#### Find all Gitaly requests sent from web UI
2020-05-19 14:08:11 -04:00
```shell
2020-03-30 11:07:51 -04:00
jq 'select(."grpc.meta.client_name" == "gitlab-web")' current
```
#### Find all failed Gitaly requests
2020-05-19 14:08:11 -04:00
```shell
2020-03-30 11:07:51 -04:00
jq 'select(."grpc.code" != null and ."grpc.code" != "OK")' current
```
#### Find all requests that took longer than 30 seconds
2020-05-19 14:08:11 -04:00
```shell
2020-03-30 11:07:51 -04:00
jq 'select(."grpc.time_ms" > 30000)' current
```
2020-12-02 04:09:50 -05:00
#### Print top ten projects by request volume and their three longest durations
```shell
jq --raw-output --slurp '
map(
select(
."grpc.request.glProjectPath" != null
and ."grpc.request.glProjectPath" != ""
and ."grpc.time_ms" != null
)
)
| group_by(."grpc.request.glProjectPath")
| sort_by(-length)
| limit(10; .[])
| sort_by(-."grpc.time_ms")
| [
length,
.[0]."grpc.time_ms",
.[1]."grpc.time_ms",
.[2]."grpc.time_ms",
.[0]."grpc.request.glProjectPath"
]
2022-04-15 08:09:42 -04:00
| @sh ' current \
2020-12-02 04:09:50 -05:00
| awk 'BEGIN { printf "%7s %10s %10s %10s\t%s\n", "CT", "MAX DURS", "", "", "PROJECT" }
{ printf "%7u %7u ms, %7u ms, %7u ms\t%s\n", $1, $2, $3, $4, $5 }'
2020-03-30 11:07:51 -04:00
```
**Example output**
```plaintext
2020-12-02 04:09:50 -05:00
CT MAX DURS PROJECT
206 4898 ms, 1101 ms, 1032 ms 'groupD/project4'
109 1420 ms, 962 ms, 875 ms 'groupEF/project56'
663 106 ms, 96 ms, 94 ms 'groupABC/project123'
...
2020-03-30 11:07:51 -04:00
```
2021-02-12 13:08:59 -05:00
#### Find all projects affected by a fatal Git problem
```shell
2022-04-15 08:09:42 -04:00
grep "fatal: " current | \
2021-02-12 13:08:59 -05:00
jq '."grpc.request.glProjectPath"' | \
sort | uniq
```
2021-04-27 14:10:02 -04:00
2022-04-15 08:09:42 -04:00
### Parsing `gitlab-shell/gitlab-shell.log`
2021-04-27 14:10:02 -04:00
For investigating Git calls via SSH, from [GitLab 12.10 ](https://gitlab.com/gitlab-org/gitlab-shell/-/merge_requests/367 ).
Find the top 20 calls by project and user:
```shell
jq --raw-output --slurp '
map(
select(
.username != null and
.gl_project_path !=null
)
)
| group_by(.username+.gl_project_path)
| sort_by(-length)
| limit(20; .[])
| "count: \(length)\tuser: \(.[0].username)\tproject: \(.[0].gl_project_path)" ' \
2022-04-15 08:09:42 -04:00
gitlab-shell.log
2021-04-27 14:10:02 -04:00
```
Find the top 20 calls by project, user, and command:
```shell
jq --raw-output --slurp '
map(
select(
.command != null and
.username != null and
.gl_project_path !=null
)
)
| group_by(.username+.gl_project_path+.command)
| sort_by(-length)
| limit(20; .[])
| "count: \(length)\tcommand: \(.[0].command)\tuser: \(.[0].username)\tproject: \(.[0].gl_project_path)" ' \
2022-04-15 08:09:42 -04:00
gitlab-shell.log
2021-04-27 14:10:02 -04:00
```