Merge branch 'docs/refactor-monitoring' into 'master'
Move monitoring/ to new location

## What does this MR do?

Move monitoring/ to new location.

Part of https://gitlab.com/gitlab-org/gitlab-ce/issues/3349

## Moving docs to a new location?

See the guidelines: http://docs.gitlab.com/ce/development/doc_styleguide.html#changing-document-location

- [ ] Make sure the old link is not removed and has its contents replaced with a link to the new location.
- [ ] Make sure internal links pointing to the document in question are not broken.
- [ ] Search and replace any links referring to old docs in GitLab Rails app, specifically under the `app/views/` directory.
- [ ] If working on CE, submit an MR to EE with the changes as well.

See merge request !6518
|
@ -221,7 +221,11 @@
|
||||||
%fieldset
|
%fieldset
|
||||||
%legend Metrics
|
%legend Metrics
|
||||||
%p
|
%p
|
||||||
These settings require a restart to take effect.
|
Set up InfluxDB to measure a wide variety of statistics like the time spent
|
||||||
|
in running SQL queries. These settings require a
|
||||||
|
= link_to 'restart', help_page_path('administration/restart_gitlab')
|
||||||
|
to take effect.
|
||||||
|
= link_to icon('question-circle'), help_page_path('administration/monitoring/performance/introduction')
|
||||||
.form-group
|
.form-group
|
||||||
.col-sm-offset-2.col-sm-10
|
.col-sm-offset-2.col-sm-10
|
||||||
.checkbox
|
.checkbox
|
||||||
|
|
|
@ -47,8 +47,8 @@
|
||||||
- [Migrate GitLab CI to CE/EE](migrate_ci_to_ce/README.md) Follow this guide to migrate your existing GitLab CI data to GitLab CE/EE.
|
- [Migrate GitLab CI to CE/EE](migrate_ci_to_ce/README.md) Follow this guide to migrate your existing GitLab CI data to GitLab CE/EE.
|
||||||
- [Git LFS configuration](workflow/lfs/lfs_administration.md)
|
- [Git LFS configuration](workflow/lfs/lfs_administration.md)
|
||||||
- [Housekeeping](administration/housekeeping.md) Keep your Git repository tidy and fast.
|
- [Housekeeping](administration/housekeeping.md) Keep your Git repository tidy and fast.
|
||||||
- [GitLab Performance Monitoring](monitoring/performance/introduction.md) Configure GitLab and InfluxDB for measuring performance metrics.
|
- [GitLab Performance Monitoring](administration/monitoring/performance/introduction.md) Configure GitLab and InfluxDB for measuring performance metrics.
|
||||||
- [Monitoring uptime](monitoring/health_check.md) Check the server status using the health check endpoint.
|
- [Monitoring uptime](administration/monitoring/health_check.md) Check the server status using the health check endpoint.
|
||||||
- [Debugging Tips](administration/troubleshooting/debug.md) Tips to debug problems when things go wrong
|
- [Debugging Tips](administration/troubleshooting/debug.md) Tips to debug problems when things go wrong
|
||||||
- [Sidekiq Troubleshooting](administration/troubleshooting/sidekiq.md) Debug when Sidekiq appears hung and is not processing jobs.
|
- [Sidekiq Troubleshooting](administration/troubleshooting/sidekiq.md) Debug when Sidekiq appears hung and is not processing jobs.
|
||||||
- [High Availability](administration/high_availability/README.md) Configure multiple servers for scaling or high availability.
|
- [High Availability](administration/high_availability/README.md) Configure multiple servers for scaling or high availability.
|
||||||
|
|
66
doc/administration/monitoring/health_check.md
Normal file
|
@ -0,0 +1,66 @@
|
||||||
|
# Health Check
|
||||||
|
|
||||||
|
> [Introduced][ce-3888] in GitLab 8.8.
|
||||||
|
|
||||||
|
GitLab provides a health check for uptime monitoring at the `health_check` web
|
||||||
|
endpoint. The health check reports on the overall system status based on the status of
|
||||||
|
the database connection, the state of the database migrations, and the ability to write
|
||||||
|
and access the cache. This endpoint can be provided to uptime monitoring services like
|
||||||
|
[Pingdom][pingdom], [Nagios][nagios-health], and [NewRelic][newrelic-health].
|
||||||
|
|
||||||
|
## Access Token
|
||||||
|
|
||||||
|
An access token must be provided when accessing the health check endpoint. The currently
|
||||||
|
accepted token can be found on the `admin/health_check` page of your GitLab instance.
|
||||||
|
|
||||||
|
![access token](img/health_check_token.png)
|
||||||
|
|
||||||
|
The access token can be passed as a URL parameter:
|
||||||
|
|
||||||
|
```
|
||||||
|
https://gitlab.example.com/health_check.json?token=ACCESS_TOKEN
|
||||||
|
```
|
||||||
|
|
||||||
|
or as an HTTP header:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
curl --header "TOKEN: ACCESS_TOKEN" https://gitlab.example.com/health_check.json
|
||||||
|
```
|
||||||
|
|
||||||
|
## Using the Endpoint
|
||||||
|
|
||||||
|
Once you have the access token, health information can be retrieved as plain text, JSON,
|
||||||
|
or XML using the `health_check` endpoint:
|
||||||
|
|
||||||
|
- `https://gitlab.example.com/health_check?token=ACCESS_TOKEN`
|
||||||
|
- `https://gitlab.example.com/health_check.json?token=ACCESS_TOKEN`
|
||||||
|
- `https://gitlab.example.com/health_check.xml?token=ACCESS_TOKEN`
|
||||||
|
|
||||||
|
You can also ask for the status of specific services:
|
||||||
|
|
||||||
|
- `https://gitlab.example.com/health_check/cache.json?token=ACCESS_TOKEN`
|
||||||
|
- `https://gitlab.example.com/health_check/database.json?token=ACCESS_TOKEN`
|
||||||
|
- `https://gitlab.example.com/health_check/migrations.json?token=ACCESS_TOKEN`
|
||||||
|
|
||||||
|
For example, the JSON output of the following health check:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
curl --header "TOKEN: ACCESS_TOKEN" https://gitlab.example.com/health_check.json
|
||||||
|
```
|
||||||
|
|
||||||
|
would look like:
|
||||||
|
|
||||||
|
```
|
||||||
|
{"healthy":true,"message":"success"}
|
||||||
|
```
|
||||||
|
|
||||||
|
## Status
|
||||||
|
|
||||||
|
On failure, the endpoint will return a `500` HTTP status code. On success, the endpoint
|
||||||
|
will return a successful HTTP status code and a `success` message. Ideally, your
|
||||||
|
uptime monitoring should look for the success message.
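
The URL patterns and the success check described above can be sketched in Python. This is a minimal illustration, not part of GitLab: the helper names `health_check_url` and `is_healthy` are hypothetical, and the response shape is assumed to match the JSON example shown earlier.

```python
import json
from urllib.parse import urlencode


def health_check_url(base_url, token, service=None, fmt="json"):
    """Build a URL such as .../health_check/database.json?token=..."""
    path = "/health_check"
    if service:
        # Optional per-service check, e.g. "cache", "database", "migrations".
        path += "/" + service
    if fmt:
        # Empty fmt requests the plain text variant.
        path += "." + fmt
    return base_url.rstrip("/") + path + "?" + urlencode({"token": token})


def is_healthy(status_code, body):
    """True only for a 2xx status whose JSON body reports success."""
    if not 200 <= status_code < 300:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    if not isinstance(payload, dict):
        return False
    return payload.get("healthy") is True and payload.get("message") == "success"
```

Checking both the status code and the message follows the advice above that monitoring should look for the success message rather than rely on the status code alone.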
|
||||||
|
|
||||||
|
[ce-3888]: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/3888
|
||||||
|
[pingdom]: https://www.pingdom.com
|
||||||
|
[nagios-health]: https://nagios-plugins.org/doc/man/check_http.html
|
||||||
|
[newrelic-health]: https://docs.newrelic.com/docs/alerts/alert-policies/downtime-alerts/availability-monitoring
|
|
@ -0,0 +1,40 @@
|
||||||
|
# GitLab Configuration
|
||||||
|
|
||||||
|
GitLab Performance Monitoring is disabled by default. To enable it and change any of its
|
||||||
|
settings, navigate to the Admin area in **Settings > Metrics**
|
||||||
|
(`/admin/application_settings`).
|
||||||
|
|
||||||
|
The minimum required settings you need to set are the InfluxDB host and port.
|
||||||
|
Make sure _Enable InfluxDB Metrics_ is checked and hit **Save** to save the
|
||||||
|
changes.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
![GitLab Performance Monitoring Admin Settings](img/metrics_gitlab_configuration_settings.png)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
Finally, a restart of all GitLab processes is required for the changes to take
|
||||||
|
effect:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# For Omnibus installations
|
||||||
|
sudo gitlab-ctl restart
|
||||||
|
|
||||||
|
# For installations from source
|
||||||
|
sudo service gitlab restart
|
||||||
|
```
|
||||||
|
|
||||||
|
## Pending Migrations
|
||||||
|
|
||||||
|
When any migrations are pending, the metrics are disabled until the migrations
|
||||||
|
have been performed.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
Read more on:
|
||||||
|
|
||||||
|
- [Introduction to GitLab Performance Monitoring](introduction.md)
|
||||||
|
- [InfluxDB Configuration](influxdb_configuration.md)
|
||||||
|
- [InfluxDB Schema](influxdb_schema.md)
|
||||||
|
- [Grafana Install/Configuration](grafana_configuration.md)
|
|
@ -0,0 +1,111 @@
|
||||||
|
# Grafana Configuration
|
||||||
|
|
||||||
|
[Grafana](http://grafana.org/) is a tool that allows you to visualize time
|
||||||
|
series metrics through graphs and dashboards. It supports several backend
|
||||||
|
data stores, including InfluxDB. GitLab writes performance data to InfluxDB
|
||||||
|
and Grafana will allow you to query InfluxDB to display useful graphs.
|
||||||
|
|
||||||
|
For the easiest installation and configuration, install Grafana on the same
|
||||||
|
server as InfluxDB. For larger installations, you may want to split out these
|
||||||
|
services.
|
||||||
|
|
||||||
|
## Installation
|
||||||
|
|
||||||
|
Grafana supplies package repositories (Yum/Apt) for easy installation.
|
||||||
|
See [Grafana installation documentation](http://docs.grafana.org/installation/)
|
||||||
|
for detailed steps.
|
||||||
|
|
||||||
|
> **Note**: Before starting Grafana for the first time, set the admin user
|
||||||
|
and password in `/etc/grafana/grafana.ini`. Otherwise, the default password
|
||||||
|
will be `admin`.
|
||||||
|
|
||||||
|
## Configuration
|
||||||
|
|
||||||
|
Log in as the admin user. Expand the menu by clicking the Grafana logo in the
|
||||||
|
top left corner. Choose 'Data Sources' from the menu. Then, click 'Add new'
|
||||||
|
in the top bar.
|
||||||
|
|
||||||
|
![Grafana empty data source page](img/grafana_data_source_empty.png)
|
||||||
|
|
||||||
|
Fill in the configuration details for the InfluxDB data source. Save and
|
||||||
|
Test Connection to ensure the configuration is correct.
|
||||||
|
|
||||||
|
- **Name**: InfluxDB
|
||||||
|
- **Default**: Checked
|
||||||
|
- **Type**: InfluxDB 0.9.x (Even if you're using InfluxDB 0.10.x)
|
||||||
|
- **Url**: https://localhost:8086 (Or the remote URL if you've installed InfluxDB
|
||||||
|
on a separate server)
|
||||||
|
- **Access**: proxy
|
||||||
|
- **Database**: gitlab
|
||||||
|
- **User**: admin (Or the username configured when setting up InfluxDB)
|
||||||
|
- **Password**: The password configured when you set up InfluxDB
|
||||||
|
|
||||||
|
![Grafana data source configurations](img/grafana_data_source_configuration.png)
|
||||||
|
|
||||||
|
## Apply retention policies and create continuous queries
|
||||||
|
|
||||||
|
If you intend to import the GitLab provided Grafana dashboards, you will need to
|
||||||
|
set up the right retention policies and continuous queries. The easiest way of
|
||||||
|
doing this is by using the [influxdb-management](https://gitlab.com/gitlab-org/influxdb-management)
|
||||||
|
repository.
|
||||||
|
|
||||||
|
To use this repository you must first clone it:
|
||||||
|
|
||||||
|
```
|
||||||
|
git clone https://gitlab.com/gitlab-org/influxdb-management.git
|
||||||
|
cd influxdb-management
|
||||||
|
```
|
||||||
|
|
||||||
|
Next you must install the required dependencies:
|
||||||
|
|
||||||
|
```
|
||||||
|
gem install bundler
|
||||||
|
bundle install
|
||||||
|
```
|
||||||
|
|
||||||
|
Now you must configure the repository by first copying `.env.example` to `.env`
|
||||||
|
and then editing the `.env` file to contain the correct InfluxDB settings. Once
|
||||||
|
configured you can simply run `bundle exec rake` and the InfluxDB database will
|
||||||
|
be configured for you.
|
||||||
|
|
||||||
|
For more information see the [influxdb-management README](https://gitlab.com/gitlab-org/influxdb-management/blob/master/README.md).
|
||||||
|
|
||||||
|
## Import Dashboards
|
||||||
|
|
||||||
|
You can now import a set of default dashboards that will give you a good
|
||||||
|
start on displaying useful information. GitLab has published a set of default
|
||||||
|
[Grafana dashboards][grafana-dashboards] to get you started. Clone the
|
||||||
|
repository or download a zip/tarball, then follow these steps to import each
|
||||||
|
JSON file.
|
||||||
|
|
||||||
|
Open the dashboard dropdown menu and click 'Import'
|
||||||
|
|
||||||
|
![Grafana dashboard dropdown](img/grafana_dashboard_dropdown.png)
|
||||||
|
|
||||||
|
Click 'Choose file' and browse to the location where you downloaded or cloned
|
||||||
|
the dashboard repository. Pick one of the JSON files to import.
|
||||||
|
|
||||||
|
![Grafana dashboard import](img/grafana_dashboard_import.png)
|
||||||
|
|
||||||
|
Once the dashboard is imported, be sure to click the save icon in the top bar. If
|
||||||
|
you do not save the dashboard after importing, it will be removed when you
|
||||||
|
navigate away.
|
||||||
|
|
||||||
|
![Grafana save icon](img/grafana_save_icon.png)
|
||||||
|
|
||||||
|
Repeat this process for each dashboard you wish to import.
|
||||||
|
|
||||||
|
Alternatively you can automatically import all the dashboards into your Grafana
|
||||||
|
instance. See the README of the [Grafana dashboards][grafana-dashboards]
|
||||||
|
repository for more information on this process.
|
||||||
|
|
||||||
|
[grafana-dashboards]: https://gitlab.com/gitlab-org/grafana-dashboards
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
Read more on:
|
||||||
|
|
||||||
|
- [Introduction to GitLab Performance Monitoring](introduction.md)
|
||||||
|
- [GitLab Configuration](gitlab_configuration.md)
|
||||||
|
- [InfluxDB Installation/Configuration](influxdb_configuration.md)
|
||||||
|
- [InfluxDB Schema](influxdb_schema.md)
|
|
@ -0,0 +1,193 @@
|
||||||
|
# InfluxDB Configuration
|
||||||
|
|
||||||
|
The default settings provided by [InfluxDB] are not sufficient for a high traffic
|
||||||
|
GitLab environment. The settings discussed in this document are based on the
|
||||||
|
settings GitLab uses for GitLab.com; depending on your own needs, you may need to
|
||||||
|
further adjust them.
|
||||||
|
|
||||||
|
If you are intending to run InfluxDB on the same server as GitLab, make sure
|
||||||
|
you have plenty of RAM since InfluxDB can use quite a bit depending on traffic.
|
||||||
|
|
||||||
|
Unless you are going with a budget setup, it's advised to run it separately.
|
||||||
|
|
||||||
|
## Requirements
|
||||||
|
|
||||||
|
- InfluxDB 0.9.5 or newer
|
||||||
|
- A fairly modern version of Linux
|
||||||
|
- At least 4GB of RAM
|
||||||
|
- At least 10GB of storage for InfluxDB data
|
||||||
|
|
||||||
|
Note that the RAM and storage requirements can differ greatly depending on the
|
||||||
|
amount of data received/stored. To limit the amount of stored data, users can
|
||||||
|
look into [InfluxDB Retention Policies][influxdb-retention].
|
||||||
|
|
||||||
|
## Installation
|
||||||
|
|
||||||
|
Installing InfluxDB is out of the scope of this document. Please refer to the
|
||||||
|
[InfluxDB documentation].
|
||||||
|
|
||||||
|
## InfluxDB Server Settings
|
||||||
|
|
||||||
|
Since InfluxDB has many settings that users may wish to customize themselves
|
||||||
|
(e.g. what port to run InfluxDB on), we'll only cover the essentials.
|
||||||
|
|
||||||
|
The configuration file in question is usually located at
|
||||||
|
`/etc/influxdb/influxdb.conf`. Whenever you make a change in this file,
|
||||||
|
InfluxDB needs to be restarted.
|
||||||
|
|
||||||
|
### Storage Engine
|
||||||
|
|
||||||
|
InfluxDB comes with different storage engines and as of InfluxDB 0.9.5 a new
|
||||||
|
storage engine is available, called [TSM Tree]. All users **must** use the new
|
||||||
|
`tsm1` storage engine as this [will be the default engine][tsm1-commit] in
|
||||||
|
upcoming InfluxDB releases.
|
||||||
|
|
||||||
|
Make sure you have the following in your configuration file:
|
||||||
|
|
||||||
|
```
|
||||||
|
[data]
|
||||||
|
dir = "/var/lib/influxdb/data"
|
||||||
|
engine = "tsm1"
|
||||||
|
```
|
||||||
|
|
||||||
|
### Admin Panel
|
||||||
|
|
||||||
|
Production environments should have the InfluxDB admin panel **disabled**. This
|
||||||
|
feature can be disabled by adding the following to your InfluxDB configuration
|
||||||
|
file:
|
||||||
|
|
||||||
|
```
|
||||||
|
[admin]
|
||||||
|
enabled = false
|
||||||
|
```
|
||||||
|
|
||||||
|
### HTTP
|
||||||
|
|
||||||
|
HTTP is required when using the [InfluxDB CLI] or other tools such as Grafana,
|
||||||
|
thus it should be enabled. When enabling it, make sure to _also_ enable
|
||||||
|
authentication:
|
||||||
|
|
||||||
|
```
|
||||||
|
[http]
|
||||||
|
enabled = true
|
||||||
|
auth-enabled = true
|
||||||
|
```
|
||||||
|
|
||||||
|
_**Note:** Before you enable authentication, you might want to [create an
|
||||||
|
admin user](#create-a-new-admin-user)._
|
||||||
|
|
||||||
|
### UDP
|
||||||
|
|
||||||
|
GitLab writes data to InfluxDB via UDP and thus this must be enabled. Enabling
|
||||||
|
UDP can be done using the following settings:
|
||||||
|
|
||||||
|
```
|
||||||
|
[[udp]]
|
||||||
|
enabled = true
|
||||||
|
bind-address = ":8089"
|
||||||
|
database = "gitlab"
|
||||||
|
batch-size = 1000
|
||||||
|
batch-pending = 5
|
||||||
|
batch-timeout = "1s"
|
||||||
|
read-buffer = 209715200
|
||||||
|
```
|
||||||
|
|
||||||
|
This does the following:
|
||||||
|
|
||||||
|
1. Enable UDP and bind it to port 8089 for all addresses.
|
||||||
|
2. Store any data received in the "gitlab" database.
|
||||||
|
3. Define a batch of points to be 1000 points in size and allow a maximum of
|
||||||
|
5 batches _or_ flush them automatically after 1 second.
|
||||||
|
4. Define a UDP read buffer size of 200 MB.
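
The `read-buffer` value is given in bytes. As a quick arithmetic check, assuming "MB" here means 1024 * 1024 bytes, the following sketch converts a size in megabytes to the byte count used in the configuration (the helper name `mib_to_bytes` is only for illustration):

```python
MIB = 1024 * 1024  # bytes in one mebibyte


def mib_to_bytes(mib):
    """Convert a size in MiB to the byte count used in the config file."""
    return mib * MIB


print(mib_to_bytes(200))  # 209715200, the value of read-buffer above
print(mib_to_bytes(100))  # 104857600
```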
|
||||||
|
|
||||||
|
One of the most important settings here is the UDP read buffer size: if this
|
||||||
|
value is set too low, packets will be dropped. You must also make sure the OS
|
||||||
|
buffer size is set to the same value, as the default value is almost never enough.
|
||||||
|
|
||||||
|
To set the OS buffer size to 200 MB, on Linux you can run the following command:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sysctl -w net.core.rmem_max=209715200
|
||||||
|
```
|
||||||
|
|
||||||
|
To make this permanent, add the following to `/etc/sysctl.conf` and restart the
|
||||||
|
server:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
net.core.rmem_max=209715200
|
||||||
|
```
|
||||||
|
|
||||||
|
It is **very important** to make sure the buffer sizes are large enough to
|
||||||
|
handle all data sent to InfluxDB as otherwise you _will_ lose data. The above
|
||||||
|
buffer sizes are based on the traffic for GitLab.com. Depending on the amount of
|
||||||
|
traffic, users may be able to use a smaller buffer size, but we highly recommend
|
||||||
|
using _at least_ 100 MB.
|
||||||
|
|
||||||
|
When enabling UDP, users should take care to not expose the port to the public,
|
||||||
|
as doing so will allow anybody to write data into your InfluxDB database (as
|
||||||
|
[InfluxDB's UDP protocol][udp] doesn't support authentication). We recommend either
|
||||||
|
whitelisting the allowed IP addresses/ranges, or setting up a VLAN and only
|
||||||
|
allowing traffic from members of said VLAN.
|
||||||
|
|
||||||
|
## Create a new admin user
|
||||||
|
|
||||||
|
If you want to [enable authentication](#http), you might want to [create an
|
||||||
|
admin user][influx-admin]:
|
||||||
|
|
||||||
|
```
|
||||||
|
influx -execute "CREATE USER jeff WITH PASSWORD '1234' WITH ALL PRIVILEGES"
|
||||||
|
```
|
||||||
|
|
||||||
|
## Create the `gitlab` database
|
||||||
|
|
||||||
|
Once you get InfluxDB up and running, you need to create a database for GitLab.
|
||||||
|
Make sure you have changed the [storage engine](#storage-engine) to `tsm1`
|
||||||
|
before creating a database.
|
||||||
|
|
||||||
|
_**Note:** If you [created an admin user](#create-a-new-admin-user) and enabled
|
||||||
|
[HTTP authentication](#http), remember to append the username (`-username <username>`)
|
||||||
|
and password (`-password <password>`) you set earlier to the commands below._
|
||||||
|
|
||||||
|
Run the following command to create a database named `gitlab`:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
influx -execute 'CREATE DATABASE gitlab'
|
||||||
|
```
|
||||||
|
|
||||||
|
The name **must** be `gitlab`; do not use any other name.
|
||||||
|
|
||||||
|
Next, make sure that the database was successfully created:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
influx -execute 'SHOW DATABASES'
|
||||||
|
```
|
||||||
|
|
||||||
|
The output should be similar to:
|
||||||
|
|
||||||
|
```
|
||||||
|
name: databases
|
||||||
|
---------------
|
||||||
|
name
|
||||||
|
_internal
|
||||||
|
gitlab
|
||||||
|
```
|
||||||
|
|
||||||
|
That's it! Now your GitLab instance should send data to InfluxDB.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
Read more on:
|
||||||
|
|
||||||
|
- [Introduction to GitLab Performance Monitoring](introduction.md)
|
||||||
|
- [GitLab Configuration](gitlab_configuration.md)
|
||||||
|
- [InfluxDB Schema](influxdb_schema.md)
|
||||||
|
- [Grafana Install/Configuration](grafana_configuration.md)
|
||||||
|
|
||||||
|
[influxdb-retention]: https://docs.influxdata.com/influxdb/v0.9/query_language/database_management/#retention-policy-management
|
||||||
|
[influxdb documentation]: https://docs.influxdata.com/influxdb/v0.9/
|
||||||
|
[influxdb cli]: https://docs.influxdata.com/influxdb/v0.9/tools/shell/
|
||||||
|
[udp]: https://docs.influxdata.com/influxdb/v0.9/write_protocols/udp/
|
||||||
|
[influxdb]: https://influxdata.com/time-series-platform/influxdb/
|
||||||
|
[tsm tree]: https://influxdata.com/blog/new-storage-engine-time-structured-merge-tree/
|
||||||
|
[tsm1-commit]: https://github.com/influxdata/influxdb/commit/15d723dc77651bac83e09e2b1c94be480966cb0d
|
||||||
|
[influx-admin]: https://docs.influxdata.com/influxdb/v0.9/administration/authentication_and_authorization/#create-a-new-admin-user
|
97
doc/administration/monitoring/performance/influxdb_schema.md
Normal file
|
@ -0,0 +1,97 @@
|
||||||
|
# InfluxDB Schema
|
||||||
|
|
||||||
|
The following measurements are currently stored in InfluxDB:
|
||||||
|
|
||||||
|
- `PROCESS_file_descriptors`
|
||||||
|
- `PROCESS_gc_statistics`
|
||||||
|
- `PROCESS_memory_usage`
|
||||||
|
- `PROCESS_method_calls`
|
||||||
|
- `PROCESS_object_counts`
|
||||||
|
- `PROCESS_transactions`
|
||||||
|
- `PROCESS_views`
|
||||||
|
- `events`
|
||||||
|
|
||||||
|
Here, `PROCESS` is replaced with either `rails` or `sidekiq` depending on the
|
||||||
|
process type. In all series, any form of duration is stored in milliseconds.
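
To make the naming scheme concrete, here is a small Python sketch that expands the `PROCESS` placeholder for a given process type. The function name `measurement_names` is hypothetical; the suffixes are taken from the list above.

```python
MEASUREMENT_SUFFIXES = [
    "file_descriptors",
    "gc_statistics",
    "memory_usage",
    "method_calls",
    "object_counts",
    "transactions",
    "views",
]


def measurement_names(process):
    """Return every measurement name for 'rails' or 'sidekiq'."""
    if process not in ("rails", "sidekiq"):
        raise ValueError("process must be 'rails' or 'sidekiq'")
    # `events` is shared and carries no process prefix.
    return [f"{process}_{suffix}" for suffix in MEASUREMENT_SUFFIXES] + ["events"]
```

For example, `measurement_names("sidekiq")` includes `sidekiq_transactions` and `events`.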
|
||||||
|
|
||||||
|
## PROCESS_file_descriptors
|
||||||
|
|
||||||
|
This measurement contains the number of open file descriptors over time. The
|
||||||
|
value field `value` contains the number of descriptors.
|
||||||
|
|
||||||
|
## PROCESS_gc_statistics
|
||||||
|
|
||||||
|
This measurement contains Ruby garbage collection statistics such as the number
|
||||||
|
of minor/major GC runs (relative to the last sampling interval), the time spent
|
||||||
|
in garbage collection cycles, and all fields/values returned by `GC.stat`.
|
||||||
|
|
||||||
|
## PROCESS_memory_usage
|
||||||
|
|
||||||
|
This measurement contains the process' memory usage (in bytes) over time. The
|
||||||
|
value field `value` contains the number of bytes.
|
||||||
|
|
||||||
|
## PROCESS_method_calls
|
||||||
|
|
||||||
|
This measurement contains the methods called during a transaction along with
|
||||||
|
their duration, and the name of the transaction action that invoked the method (if
|
||||||
|
available). The method call duration is stored in the value field `duration`,
|
||||||
|
while the method name is stored in the tag `method`. The tag `action` contains
|
||||||
|
the full name of the transaction action. Both the `method` and `action` tags
|
||||||
|
are in the following format:
|
||||||
|
|
||||||
|
```
|
||||||
|
ClassName#method_name
|
||||||
|
```
|
||||||
|
|
||||||
|
For example, a method called by the `show` method in the `UsersController` class
|
||||||
|
would have `action` set to `UsersController#show`.
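
When post-processing query results, a tag in this format can be split back into its parts. A minimal Python sketch, where `split_action` is a hypothetical helper name:

```python
def split_action(name):
    """Split 'ClassName#method_name' into (class_name, method_name)."""
    class_name, separator, method_name = name.partition("#")
    if not separator or not class_name or not method_name:
        raise ValueError(f"not in ClassName#method_name format: {name!r}")
    return class_name, method_name


print(split_action("UsersController#show"))  # ('UsersController', 'show')
```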
|
||||||
|
|
||||||
|
## PROCESS_object_counts
|
||||||
|
|
||||||
|
This measurement is used to store retained Ruby objects (per class) and the
|
||||||
|
number of retained objects. The number of objects is stored in the `count` value
|
||||||
|
field while the class name is stored in the `type` tag.
|
||||||
|
|
||||||
|
## PROCESS_transactions
|
||||||
|
|
||||||
|
This measurement is used to store basic transaction details such as the time it
|
||||||
|
took to complete a transaction, how much time was spent in SQL queries, etc. The
|
||||||
|
following value fields are available:
|
||||||
|
|
||||||
|
| Value | Description |
|
||||||
|
| ----- | ----------- |
|
||||||
|
| `duration` | The total duration of the transaction |
|
||||||
|
| `allocated_memory` | The number of bytes allocated while the transaction was running. This value is only reliable when using single-threaded application servers |
|
||||||
|
| `method_duration` | The total time spent in method calls |
|
||||||
|
| `sql_duration` | The total time spent in SQL queries |
|
||||||
|
| `view_duration` | The total time spent in views |
|
||||||
|
|
||||||
|
## PROCESS_views
|
||||||
|
|
||||||
|
This measurement is used to store view rendering timings for a transaction. The
|
||||||
|
following value fields are available:
|
||||||
|
|
||||||
|
| Value | Description |
|
||||||
|
| ----- | ----------- |
|
||||||
|
| `duration` | The rendering time of the view |
|
||||||
|
| `view` | The path of the view, relative to the application's root directory |
|
||||||
|
|
||||||
|
The `action` tag contains the action name of the transaction that rendered the
|
||||||
|
view.
|
||||||
|
|
||||||
|
## events
|
||||||
|
|
||||||
|
This measurement is used to store generic events such as the number of Git
|
||||||
|
pushes, emails sent, etc. Each point in this measurement has a single value
|
||||||
|
field called `count`. The value of this field is simply set to `1`. Each point
|
||||||
|
also has at least one tag: `event`. This tag's value is set to the event name.
|
||||||
|
Depending on the event type additional tags may be available as well.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
Read more on:
|
||||||
|
|
||||||
|
- [Introduction to GitLab Performance Monitoring](introduction.md)
|
||||||
|
- [GitLab Configuration](gitlab_configuration.md)
|
||||||
|
- [InfluxDB Configuration](influxdb_configuration.md)
|
||||||
|
- [Grafana Install/Configuration](grafana_configuration.md)
|
65
doc/administration/monitoring/performance/introduction.md
Normal file
|
@ -0,0 +1,65 @@
|
||||||
|
# GitLab Performance Monitoring
|
||||||
|
|
||||||
|
GitLab comes with its own application performance measuring system as of GitLab
|
||||||
|
8.4, simply called "GitLab Performance Monitoring". GitLab Performance Monitoring is available in both the
|
||||||
|
Community and Enterprise editions.
|
||||||
|
|
||||||
|
Apart from this introduction, you are advised to read through the following
|
||||||
|
documents in order to understand and properly configure GitLab Performance Monitoring:
|
||||||
|
|
||||||
|
- [GitLab Configuration](gitlab_configuration.md)
|
||||||
|
- [InfluxDB Install/Configuration](influxdb_configuration.md)
|
||||||
|
- [InfluxDB Schema](influxdb_schema.md)
|
||||||
|
- [Grafana Install/Configuration](grafana_configuration.md)
|
||||||
|
|
||||||
|
## Introduction to GitLab Performance Monitoring
|
||||||
|
|
||||||
|
GitLab Performance Monitoring makes it possible to measure a wide variety of statistics
|
||||||
|
including (but not limited to):
|
||||||
|
|
||||||
|
- The time it took to complete a transaction (a web request or Sidekiq job).
|
||||||
|
- The time spent in running SQL queries and rendering HAML views.
|
||||||
|
- The time spent executing (instrumented) Ruby methods.
|
||||||
|
- Ruby object allocations, and retained objects in particular.
|
||||||
|
- System statistics such as the process' memory usage and open file descriptors.
|
||||||
|
- Ruby garbage collection statistics.
|
||||||
|
|
||||||
|
Metrics data is written to [InfluxDB][influxdb] over [UDP][influxdb-udp]. Stored
|
||||||
|
data can be visualized using [Grafana][grafana] or any other application that
|
||||||
|
supports reading data from InfluxDB. Alternatively data can be queried using the
|
||||||
|
InfluxDB CLI.
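
To show what "written over UDP" means in practice, here is a minimal Python sketch that sends one line of data as a single UDP datagram. It is an illustration under assumptions, not GitLab's own (Ruby) code: `send_line` is a hypothetical name, and the default port `8089` is taken from the example UDP configuration in the InfluxDB configuration document.

```python
import socket


def send_line(line, host="127.0.0.1", port=8089):
    """Send one line as a UDP datagram and return the number of bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # UDP is fire-and-forget: no connection, no acknowledgement.
        return sock.sendto(line.encode("utf-8"), (host, port))
```

Since UDP gives no delivery guarantee, a sender never learns whether the datagram arrived, which is why the receiving buffer sizes matter so much on the InfluxDB side.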
|
||||||
|
|
||||||
|
## Metric Types
|
||||||
|
|
||||||
|
Two types of metrics are collected:
|
||||||
|
|
||||||
|
1. Transaction specific metrics.
|
||||||
|
1. Sampled metrics, collected at a certain interval in a separate thread.
|
||||||
|
|
||||||
|
### Transaction Metrics
|
||||||
|
|
||||||
|
Transaction metrics are metrics that can be associated with a single
|
||||||
|
transaction. This includes statistics such as the transaction duration, timings
|
||||||
|
of any executed SQL queries, time spent rendering HAML views, etc. These metrics
|
||||||
|
are collected for every Rack request and Sidekiq job processed.
|
||||||
|
|
||||||
|
### Sampled Metrics
|
||||||
|
|
||||||
|
Sampled metrics are metrics that can't be associated with a single transaction.
|
||||||
|
Examples include garbage collection statistics and retained Ruby objects. These
|
||||||
|
metrics are collected at a regular interval. This interval is made up out of two
|
||||||
|
parts:
|
||||||
|
|
||||||
|
1. A user defined interval.
|
||||||
|
1. A randomly generated offset added on top of the interval; the same offset
|
||||||
|
can't be used twice in a row.
|
||||||
|
|
||||||
|
The actual interval can be anywhere between half of the defined interval and one
|
||||||
|
and a half times the interval. For example, for a user defined interval of 15 seconds
|
||||||
|
the actual interval can be anywhere between 7.5 and 22.5 seconds. The interval is
|
||||||
|
re-generated for every sampling run instead of being generated once and re-used
|
||||||
|
for the duration of the process' lifetime.
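
The interval logic described above can be sketched as follows. This is a Python illustration of the described behaviour, not GitLab's actual (Ruby) implementation; the class name `IntervalSampler` and the use of a uniform random offset are assumptions.

```python
import random


class IntervalSampler:
    """Yield a fresh sampling interval for every run."""

    def __init__(self, interval, rng=None):
        self.interval = interval            # user defined interval, in seconds
        self.rng = rng or random.Random()
        self.last_offset = None

    def next_interval(self):
        half = self.interval / 2
        while True:
            # Offset within half the interval, in either direction.
            offset = self.rng.uniform(-half, half)
            # The same offset must not be used twice in a row.
            if offset != self.last_offset:
                break
        self.last_offset = offset
        return self.interval + offset
```

With `IntervalSampler(15)`, every call to `next_interval()` returns a value between 7.5 and 22.5 seconds. Re-generating the offset on each run keeps separate processes from sampling in lockstep.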
|
||||||
|
|
||||||
|
[influxdb]: https://influxdata.com/time-series-platform/influxdb/
|
||||||
|
[influxdb-udp]: https://docs.influxdata.com/influxdb/v0.9/write_protocols/udp/
|
||||||
|
[grafana]: http://grafana.org/
|
|
@ -1,66 +1 @@
|
||||||
# Health Check
|
This document was moved to [administration/monitoring/health_check](../administration/monitoring/health_check.md).
|
||||||
|
|
||||||
> [Introduced][ce-3888] in GitLab 8.8.
|
|
||||||
|
|
||||||
GitLab provides a health check endpoint for uptime monitoring on the `health_check` web
|
|
||||||
endpoint. The health check reports on the overall system status based on the status of
|
|
||||||
the database connection, the state of the database migrations, and the ability to write
|
|
||||||
and access the cache. This endpoint can be provided to uptime monitoring services like
|
|
||||||
[Pingdom][pingdom], [Nagios][nagios-health], and [NewRelic][newrelic-health].
|
|
||||||
|
|
||||||
## Access Token
|
|
||||||
|
|
||||||
An access token needs to be provided while accessing the health check endpoint. The current
|
|
||||||
accepted token can be found on the `admin/health_check` page of your GitLab instance.
|
|
||||||
|
|
||||||
![access token](img/health_check_token.png)
|
|
||||||
|
|
||||||
The access token can be passed as a URL parameter:
|
|
||||||
|
|
||||||
```
|
|
||||||
https://gitlab.example.com/health_check.json?token=ACCESS_TOKEN
|
|
||||||
```
|
|
||||||
|
|
||||||
or as an HTTP header:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
curl --header "TOKEN: ACCESS_TOKEN" https://gitlab.example.com/health_check.json
|
|
||||||
```
|
|
||||||
|
|
||||||
## Using the Endpoint
|
|
||||||
|
|
||||||
Once you have the access token, health information can be retrieved as plain text, JSON,
|
|
||||||
or XML using the `health_check` endpoint:
|
|
||||||
|
|
||||||
- `https://gitlab.example.com/health_check?token=ACCESS_TOKEN`
|
|
||||||
- `https://gitlab.example.com/health_check.json?token=ACCESS_TOKEN`
|
|
||||||
- `https://gitlab.example.com/health_check.xml?token=ACCESS_TOKEN`
|
|
||||||
|
|
||||||
You can also ask for the status of specific services:
|
|
||||||
|
|
||||||
- `https://gitlab.example.com/health_check/cache.json?token=ACCESS_TOKEN`
|
|
||||||
- `https://gitlab.example.com/health_check/database.json?token=ACCESS_TOKEN`
|
|
||||||
- `https://gitlab.example.com/health_check/migrations.json?token=ACCESS_TOKEN`
|
|
||||||
|
|
||||||
For example, the JSON output of the following health check:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
curl --header "TOKEN: ACCESS_TOKEN" https://gitlab.example.com/health_check.json
|
|
||||||
```
|
|
||||||
|
|
||||||
would look like:
|
|
||||||
|
|
||||||
```json
|
|
||||||
{"healthy":true,"message":"success"}
|
|
||||||
```
|
|
||||||
|
|
||||||
## Status
|
|
||||||
|
|
||||||
On failure, the endpoint will return a `500` HTTP status code. On success, the endpoint
|
|
||||||
will return a valid successful HTTP status code, and a `success` message. Ideally your
|
|
||||||
uptime monitoring should look for the success message.
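
The check described above can be scripted. The following is a minimal sketch, not part of GitLab: the function names `health_check_url` and `is_healthy` are hypothetical, and the response body is matched with a plain `grep` against the JSON shape shown earlier rather than with a real JSON parser.

```bash
# Build the health check URL for a given format ("json", "xml", or "" for
# plain text). Hypothetical helper, not part of GitLab.
health_check_url() {
  local base="$1" format="$2" token="$3"
  local suffix=""
  if [ -n "$format" ]; then
    suffix=".$format"
  fi
  printf '%s/health_check%s?token=%s\n' "$base" "$suffix" "$token"
}

# Succeed only when a JSON response body carries the `success` message.
is_healthy() {
  printf '%s' "$1" | grep -q '"message":"success"'
}
```

For example, `is_healthy "$(curl --silent "$(health_check_url https://gitlab.example.com json ACCESS_TOKEN)")"` exits with status zero only when the body contains the success message, which is what an uptime monitor would ideally look for.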
|
|
||||||
|
|
||||||
[ce-3888]: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/3888
|
|
||||||
[pingdom]: https://www.pingdom.com
|
|
||||||
[nagios-health]: https://nagios-plugins.org/doc/man/check_http.html
|
|
||||||
[newrelic-health]: https://docs.newrelic.com/docs/alerts/alert-policies/downtime-alerts/availability-monitoring
|
|
||||||
|
|
|
@ -1,40 +1 @@
|
||||||
# GitLab Configuration
|
This document was moved to [administration/monitoring/performance/gitlab_configuration](../administration/monitoring/performance/gitlab_configuration.md).
|
||||||
|
|
||||||
GitLab Performance Monitoring is disabled by default. To enable it and change any of its
|
|
||||||
settings, navigate to the Admin area in **Settings > Metrics**
|
|
||||||
(`/admin/application_settings`).
|
|
||||||
|
|
||||||
At a minimum, you need to set the InfluxDB host and port.
|
|
||||||
Make sure _Enable InfluxDB Metrics_ is checked and hit **Save** to save the
|
|
||||||
changes.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
![GitLab Performance Monitoring Admin Settings](img/metrics_gitlab_configuration_settings.png)
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
Finally, a restart of all GitLab processes is required for the changes to take
|
|
||||||
effect:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
# For Omnibus installations
|
|
||||||
sudo gitlab-ctl restart
|
|
||||||
|
|
||||||
# For installations from source
|
|
||||||
sudo service gitlab restart
|
|
||||||
```
|
|
||||||
|
|
||||||
## Pending Migrations
|
|
||||||
|
|
||||||
When any migrations are pending, the metrics are disabled until the migrations
|
|
||||||
have been performed.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
Read more on:
|
|
||||||
|
|
||||||
- [Introduction to GitLab Performance Monitoring](introduction.md)
|
|
||||||
- [InfluxDB Configuration](influxdb_configuration.md)
|
|
||||||
- [InfluxDB Schema](influxdb_schema.md)
|
|
||||||
- [Grafana Install/Configuration](grafana_configuration.md)
|
|
||||||
|
|
|
@ -1,111 +1 @@
|
||||||
# Grafana Configuration
|
This document was moved to [administration/monitoring/performance/grafana_configuration](../administration/monitoring/performance/grafana_configuration.md).
|
||||||
|
|
||||||
[Grafana](http://grafana.org/) is a tool that allows you to visualize time
|
|
||||||
series metrics through graphs and dashboards. It supports several backend
|
|
||||||
data stores, including InfluxDB. GitLab writes performance data to InfluxDB
|
|
||||||
and Grafana will allow you to query InfluxDB to display useful graphs.
|
|
||||||
|
|
||||||
For the easiest installation and configuration, install Grafana on the same
|
|
||||||
server as InfluxDB. For larger installations, you may want to split out these
|
|
||||||
services.
|
|
||||||
|
|
||||||
## Installation
|
|
||||||
|
|
||||||
Grafana supplies package repositories (Yum/Apt) for easy installation.
|
|
||||||
See [Grafana installation documentation](http://docs.grafana.org/installation/)
|
|
||||||
for detailed steps.
|
|
||||||
|
|
||||||
> **Note**: Before starting Grafana for the first time, set the admin user
|
|
||||||
and password in `/etc/grafana/grafana.ini`. Otherwise, the default password
|
|
||||||
will be `admin`.
|
|
||||||
|
|
||||||
## Configuration
|
|
||||||
|
|
||||||
Log in as the admin user. Expand the menu by clicking the Grafana logo in the
|
|
||||||
top left corner. Choose 'Data Sources' from the menu. Then, click 'Add new'
|
|
||||||
in the top bar.
|
|
||||||
|
|
||||||
![Grafana empty data source page](img/grafana_data_source_empty.png)
|
|
||||||
|
|
||||||
Fill in the configuration details for the InfluxDB data source. Save and
|
|
||||||
Test Connection to ensure the configuration is correct.
|
|
||||||
|
|
||||||
- **Name**: InfluxDB
|
|
||||||
- **Default**: Checked
|
|
||||||
- **Type**: InfluxDB 0.9.x (Even if you're using InfluxDB 0.10.x)
|
|
||||||
- **Url**: https://localhost:8086 (Or the remote URL if you've installed InfluxDB
|
|
||||||
on a separate server)
|
|
||||||
- **Access**: proxy
|
|
||||||
- **Database**: gitlab
|
|
||||||
- **User**: admin (Or the username configured when setting up InfluxDB)
|
|
||||||
- **Password**: The password configured when you set up InfluxDB
|
|
||||||
|
|
||||||
![Grafana data source configurations](img/grafana_data_source_configuration.png)
|
|
||||||
|
|
||||||
## Apply retention policies and create continuous queries
|
|
||||||
|
|
||||||
If you intend to import the GitLab provided Grafana dashboards, you will need to
|
|
||||||
set up the right retention policies and continuous queries. The easiest way of
|
|
||||||
doing this is by using the [influxdb-management](https://gitlab.com/gitlab-org/influxdb-management)
|
|
||||||
repository.
|
|
||||||
|
|
||||||
To use this repository you must first clone it:
|
|
||||||
|
|
||||||
```
|
|
||||||
git clone https://gitlab.com/gitlab-org/influxdb-management.git
|
|
||||||
cd influxdb-management
|
|
||||||
```
|
|
||||||
|
|
||||||
Next you must install the required dependencies:
|
|
||||||
|
|
||||||
```
|
|
||||||
gem install bundler
|
|
||||||
bundle install
|
|
||||||
```
|
|
||||||
|
|
||||||
Now you must configure the repository by first copying `.env.example` to `.env`
|
|
||||||
and then editing the `.env` file to contain the correct InfluxDB settings. Once
|
|
||||||
configured you can simply run `bundle exec rake` and the InfluxDB database will
|
|
||||||
be configured for you.
|
|
||||||
|
|
||||||
For more information see the [influxdb-management README](https://gitlab.com/gitlab-org/influxdb-management/blob/master/README.md).
|
|
||||||
|
|
||||||
## Import Dashboards
|
|
||||||
|
|
||||||
You can now import a set of default dashboards that will give you a good
|
|
||||||
start on displaying useful information. GitLab has published a set of default
|
|
||||||
[Grafana dashboards][grafana-dashboards] to get you started. Clone the
|
|
||||||
repository or download a zip/tarball, then follow these steps to import each
|
|
||||||
JSON file.
|
|
||||||
|
|
||||||
Open the dashboard dropdown menu and click 'Import'.
|
|
||||||
|
|
||||||
![Grafana dashboard dropdown](img/grafana_dashboard_dropdown.png)
|
|
||||||
|
|
||||||
Click 'Choose file' and browse to the location where you downloaded or cloned
|
|
||||||
the dashboard repository. Pick one of the JSON files to import.
|
|
||||||
|
|
||||||
![Grafana dashboard import](img/grafana_dashboard_import.png)
|
|
||||||
|
|
||||||
Once the dashboard is imported, be sure to click the save icon in the top bar. If
|
|
||||||
you do not save the dashboard after importing it, it will be removed when you
|
|
||||||
navigate away.
|
|
||||||
|
|
||||||
![Grafana save icon](img/grafana_save_icon.png)
|
|
||||||
|
|
||||||
Repeat this process for each dashboard you wish to import.
|
|
||||||
|
|
||||||
Alternatively you can automatically import all the dashboards into your Grafana
|
|
||||||
instance. See the README of the [Grafana dashboards][grafana-dashboards]
|
|
||||||
repository for more information on this process.
|
|
||||||
|
|
||||||
[grafana-dashboards]: https://gitlab.com/gitlab-org/grafana-dashboards
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
Read more on:
|
|
||||||
|
|
||||||
- [Introduction to GitLab Performance Monitoring](introduction.md)
|
|
||||||
- [GitLab Configuration](gitlab_configuration.md)
|
|
||||||
- [InfluxDB Installation/Configuration](influxdb_configuration.md)
|
|
||||||
- [InfluxDB Schema](influxdb_schema.md)
|
|
||||||
|
|
|
@ -1,193 +1 @@
|
||||||
# InfluxDB Configuration
|
This document was moved to [administration/monitoring/performance/influxdb_configuration](../administration/monitoring/performance/influxdb_configuration.md).
|
||||||
|
|
||||||
The default settings provided by [InfluxDB] are not sufficient for a high traffic
|
|
||||||
GitLab environment. The settings discussed in this document are based on the
|
|
||||||
settings GitLab uses for GitLab.com. Depending on your own needs, you may need to
|
|
||||||
further adjust them.
|
|
||||||
|
|
||||||
If you intend to run InfluxDB on the same server as GitLab, make sure
|
|
||||||
you have plenty of RAM since InfluxDB can use quite a bit depending on traffic.
|
|
||||||
|
|
||||||
Unless you are going with a budget setup, it's advised to run it separately.
|
|
||||||
|
|
||||||
## Requirements
|
|
||||||
|
|
||||||
- InfluxDB 0.9.5 or newer
|
|
||||||
- A fairly modern version of Linux
|
|
||||||
- At least 4GB of RAM
|
|
||||||
- At least 10GB of storage for InfluxDB data
|
|
||||||
|
|
||||||
Note that the RAM and storage requirements can differ greatly depending on the
|
|
||||||
amount of data received/stored. To limit the amount of stored data, users can
|
|
||||||
look into [InfluxDB Retention Policies][influxdb-retention].
|
|
||||||
|
|
||||||
## Installation
|
|
||||||
|
|
||||||
Installing InfluxDB is out of the scope of this document. Please refer to the
|
|
||||||
[InfluxDB documentation].
|
|
||||||
|
|
||||||
## InfluxDB Server Settings
|
|
||||||
|
|
||||||
Since InfluxDB has many settings that users may wish to customize themselves
|
|
||||||
(e.g. what port to run InfluxDB on), we'll only cover the essentials.
|
|
||||||
|
|
||||||
The configuration file in question is usually located at
|
|
||||||
`/etc/influxdb/influxdb.conf`. Whenever you make a change in this file,
|
|
||||||
InfluxDB needs to be restarted.
|
|
||||||
|
|
||||||
### Storage Engine
|
|
||||||
|
|
||||||
InfluxDB comes with different storage engines and as of InfluxDB 0.9.5 a new
|
|
||||||
storage engine is available, called [TSM Tree]. All users **must** use the new
|
|
||||||
`tsm1` storage engine as this [will be the default engine][tsm1-commit] in
|
|
||||||
upcoming InfluxDB releases.
|
|
||||||
|
|
||||||
Make sure you have the following in your configuration file:
|
|
||||||
|
|
||||||
```
|
|
||||||
[data]
|
|
||||||
dir = "/var/lib/influxdb/data"
|
|
||||||
engine = "tsm1"
|
|
||||||
```
|
|
||||||
|
|
||||||
### Admin Panel
|
|
||||||
|
|
||||||
Production environments should have the InfluxDB admin panel **disabled**. This
|
|
||||||
feature can be disabled by adding the following to your InfluxDB configuration
|
|
||||||
file:
|
|
||||||
|
|
||||||
```
|
|
||||||
[admin]
|
|
||||||
enabled = false
|
|
||||||
```
|
|
||||||
|
|
||||||
### HTTP
|
|
||||||
|
|
||||||
HTTP is required when using the [InfluxDB CLI] or other tools such as Grafana,
|
|
||||||
thus it should be enabled. When enabling it, make sure to _also_ enable
|
|
||||||
authentication:
|
|
||||||
|
|
||||||
```
|
|
||||||
[http]
|
|
||||||
enabled = true
|
|
||||||
auth-enabled = true
|
|
||||||
```
|
|
||||||
|
|
||||||
_**Note:** Before you enable authentication, you might want to [create an
|
|
||||||
admin user](#create-a-new-admin-user)._
|
|
||||||
|
|
||||||
### UDP
|
|
||||||
|
|
||||||
GitLab writes data to InfluxDB via UDP and thus this must be enabled. Enabling
|
|
||||||
UDP can be done using the following settings:
|
|
||||||
|
|
||||||
```
|
|
||||||
[[udp]]
|
|
||||||
enabled = true
|
|
||||||
bind-address = ":8089"
|
|
||||||
database = "gitlab"
|
|
||||||
batch-size = 1000
|
|
||||||
batch-pending = 5
|
|
||||||
batch-timeout = "1s"
|
|
||||||
read-buffer = 209715200
|
|
||||||
```
|
|
||||||
|
|
||||||
This does the following:
|
|
||||||
|
|
||||||
1. Enable UDP and bind it to port 8089 for all addresses.
|
|
||||||
2. Store any data received in the "gitlab" database.
|
|
||||||
3. Define a batch of points to be 1000 points in size and allow a maximum of
|
|
||||||
5 batches _or_ flush them automatically after 1 second.
|
|
||||||
4. Define a UDP read buffer size of 200 MB.
|
|
||||||
|
|
||||||
One of the most important settings here is the UDP read buffer size: if this
|
|
||||||
value is set too low, packets will be dropped. You must also make sure the OS
|
|
||||||
buffer size is set to the same value; the default value is almost never enough.
|
|
||||||
|
|
||||||
To set the OS buffer size to 200 MB, on Linux you can run the following command:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
sysctl -w net.core.rmem_max=209715200
|
|
||||||
```
|
|
||||||
|
|
||||||
To make this permanent, add the following to `/etc/sysctl.conf` and restart the
|
|
||||||
server:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
net.core.rmem_max=209715200
|
|
||||||
```
|
|
||||||
|
|
||||||
It is **very important** to make sure the buffer sizes are large enough to
|
|
||||||
handle all data sent to InfluxDB as otherwise you _will_ lose data. The above
|
|
||||||
buffer sizes are based on the traffic for GitLab.com. Depending on the amount of
|
|
||||||
traffic, users may be able to use a smaller buffer size, but we highly recommend
|
|
||||||
using _at least_ 100 MB.
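
The byte values used above follow from simple arithmetic. The sketch below shows the conversion and a comparison between the OS limit and the InfluxDB read buffer; the helper names `mb_to_bytes` and `buffer_fits` are hypothetical and only serve as an illustration.

```bash
# Convert a size in MB (1 MB = 1024 * 1024 bytes) to bytes.
mb_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

# Succeed when the OS limit (in bytes) is at least as large as the
# InfluxDB read buffer (in bytes).
buffer_fits() {
  local os_limit="$1" read_buffer="$2"
  [ "$os_limit" -ge "$read_buffer" ]
}

mb_to_bytes 200   # prints 209715200, the value used for read-buffer and rmem_max
mb_to_bytes 100   # prints 104857600, the recommended minimum
```

On Linux, the current OS limit can be read with `sysctl -n net.core.rmem_max`, so `buffer_fits "$(sysctl -n net.core.rmem_max)" "$(mb_to_bytes 200)"` would report whether the OS buffer is large enough for the settings shown here.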
|
|
||||||
|
|
||||||
When enabling UDP, users should take care to not expose the port to the public,
|
|
||||||
as doing so will allow anybody to write data into your InfluxDB database (as
|
|
||||||
[InfluxDB's UDP protocol][udp] doesn't support authentication). We recommend either
|
|
||||||
whitelisting the allowed IP addresses/ranges, or setting up a VLAN and only
|
|
||||||
allowing traffic from members of said VLAN.
|
|
||||||
|
|
||||||
## Create a new admin user
|
|
||||||
|
|
||||||
If you want to [enable authentication](#http), you might want to [create an
|
|
||||||
admin user][influx-admin]:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
influx -execute "CREATE USER jeff WITH PASSWORD '1234' WITH ALL PRIVILEGES"
|
|
||||||
```
|
|
||||||
|
|
||||||
## Create the `gitlab` database
|
|
||||||
|
|
||||||
Once you get InfluxDB up and running, you need to create a database for GitLab.
|
|
||||||
Make sure you have changed the [storage engine](#storage-engine) to `tsm1`
|
|
||||||
before creating a database.
|
|
||||||
|
|
||||||
_**Note:** If you [created an admin user](#create-a-new-admin-user) and enabled
|
|
||||||
[HTTP authentication](#http), remember to append the username (`-username <username>`)
|
|
||||||
and password (`-password <password>`) you set earlier to the commands below._
|
|
||||||
|
|
||||||
Run the following command to create a database named `gitlab`:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
influx -execute 'CREATE DATABASE gitlab'
|
|
||||||
```
|
|
||||||
|
|
||||||
The name **must** be `gitlab`; do not use any other name.
|
|
||||||
|
|
||||||
Next, make sure that the database was successfully created:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
influx -execute 'SHOW DATABASES'
|
|
||||||
```
|
|
||||||
|
|
||||||
The output should be similar to:
|
|
||||||
|
|
||||||
```
|
|
||||||
name: databases
|
|
||||||
---------------
|
|
||||||
name
|
|
||||||
_internal
|
|
||||||
gitlab
|
|
||||||
```
|
|
||||||
|
|
||||||
That's it! Now your GitLab instance should send data to InfluxDB.
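
The verification step above can also be automated. This is a minimal sketch under the assumption that the `SHOW DATABASES` output lists one database name per line, as in the sample output; the function name `has_gitlab_database` is hypothetical.

```bash
# Read `SHOW DATABASES` output on stdin and succeed only if one line is
# exactly "gitlab" (leading and trailing whitespace is ignored).
has_gitlab_database() {
  sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | grep -qx 'gitlab'
}
```

A possible use is `influx -execute 'SHOW DATABASES' | has_gitlab_database && echo "gitlab database found"`, with the username and password options added if authentication is enabled.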
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
Read more on:
|
|
||||||
|
|
||||||
- [Introduction to GitLab Performance Monitoring](introduction.md)
|
|
||||||
- [GitLab Configuration](gitlab_configuration.md)
|
|
||||||
- [InfluxDB Schema](influxdb_schema.md)
|
|
||||||
- [Grafana Install/Configuration](grafana_configuration.md)
|
|
||||||
|
|
||||||
[influxdb-retention]: https://docs.influxdata.com/influxdb/v0.9/query_language/database_management/#retention-policy-management
|
|
||||||
[influxdb documentation]: https://docs.influxdata.com/influxdb/v0.9/
|
|
||||||
[influxdb cli]: https://docs.influxdata.com/influxdb/v0.9/tools/shell/
|
|
||||||
[udp]: https://docs.influxdata.com/influxdb/v0.9/write_protocols/udp/
|
|
||||||
[influxdb]: https://influxdata.com/time-series-platform/influxdb/
|
|
||||||
[tsm tree]: https://influxdata.com/blog/new-storage-engine-time-structured-merge-tree/
|
|
||||||
[tsm1-commit]: https://github.com/influxdata/influxdb/commit/15d723dc77651bac83e09e2b1c94be480966cb0d
|
|
||||||
[influx-admin]: https://docs.influxdata.com/influxdb/v0.9/administration/authentication_and_authorization/#create-a-new-admin-user
|
|
||||||
|
|
|
@ -1,97 +1 @@
|
||||||
# InfluxDB Schema
|
This document was moved to [administration/monitoring/performance/influxdb_schema](../administration/monitoring/performance/influxdb_schema.md).
|
||||||
|
|
||||||
The following measurements are currently stored in InfluxDB:
|
|
||||||
|
|
||||||
- `PROCESS_file_descriptors`
|
|
||||||
- `PROCESS_gc_statistics`
|
|
||||||
- `PROCESS_memory_usage`
|
|
||||||
- `PROCESS_method_calls`
|
|
||||||
- `PROCESS_object_counts`
|
|
||||||
- `PROCESS_transactions`
|
|
||||||
- `PROCESS_views`
|
|
||||||
- `events`
|
|
||||||
|
|
||||||
Here, `PROCESS` is replaced with either `rails` or `sidekiq` depending on the
|
|
||||||
process type. In all series, any form of duration is stored in milliseconds.
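
To illustrate the naming scheme, the sketch below expands a `PROCESS_` template into a concrete measurement name and builds an InfluxQL query for it. The helper names `measurement_name` and `mean_query` are hypothetical, and the one hour time window is an arbitrary choice for the example.

```bash
# Expand a measurement template such as PROCESS_transactions for a process
# type, which must be either "rails" or "sidekiq".
measurement_name() {
  local process="$1" template="$2"
  case "$process" in
    rails|sidekiq) printf '%s\n' "$template" | sed "s/^PROCESS_/${process}_/" ;;
    *) echo "unknown process type: $process" >&2; return 1 ;;
  esac
}

# Build an InfluxQL query for the mean of a value field over the last hour.
mean_query() {
  local measurement="$1" field="$2"
  printf 'SELECT mean("%s") FROM "%s" WHERE time > now() - 1h\n' "$field" "$measurement"
}
```

For instance, `influx -database gitlab -execute "$(mean_query "$(measurement_name rails PROCESS_transactions)" duration)"` would ask for the mean transaction duration of the Rails process. Since durations are stored in milliseconds, the result is in milliseconds as well.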
|
|
||||||
|
|
||||||
## PROCESS_file_descriptors
|
|
||||||
|
|
||||||
This measurement contains the number of open file descriptors over time. The
|
|
||||||
value field `value` contains the number of descriptors.
|
|
||||||
|
|
||||||
## PROCESS_gc_statistics
|
|
||||||
|
|
||||||
This measurement contains Ruby garbage collection statistics such as the number
|
|
||||||
of minor/major GC runs (relative to the last sampling interval), the time spent
|
|
||||||
in garbage collection cycles, and all fields/values returned by `GC.stat`.
|
|
||||||
|
|
||||||
## PROCESS_memory_usage
|
|
||||||
|
|
||||||
This measurement contains the process' memory usage (in bytes) over time. The
|
|
||||||
value field `value` contains the number of bytes.
|
|
||||||
|
|
||||||
## PROCESS_method_calls
|
|
||||||
|
|
||||||
This measurement contains the methods called during a transaction along with
|
|
||||||
their duration, and the name of the transaction action that invoked the method (if
|
|
||||||
available). The method call duration is stored in the value field `duration`,
|
|
||||||
while the method name is stored in the tag `method`. The tag `action` contains
|
|
||||||
the full name of the transaction action. Both the `method` and `action` tags
|
|
||||||
are in the following format:
|
|
||||||
|
|
||||||
```
|
|
||||||
ClassName#method_name
|
|
||||||
```
|
|
||||||
|
|
||||||
For example, a method called by the `show` method in the `UsersController` class
|
|
||||||
would have `action` set to `UsersController#show`.
|
|
||||||
|
|
||||||
## PROCESS_object_counts
|
|
||||||
|
|
||||||
This measurement is used to store retained Ruby objects (per class) and the
|
|
||||||
number of retained objects. The number of objects is stored in the `count` value
|
|
||||||
field while the class name is stored in the `type` tag.
|
|
||||||
|
|
||||||
## PROCESS_transactions
|
|
||||||
|
|
||||||
This measurement is used to store basic transaction details such as the time it
|
|
||||||
took to complete a transaction, how much time was spent in SQL queries, etc. The
|
|
||||||
following value fields are available:
|
|
||||||
|
|
||||||
| Value | Description |
|
|
||||||
| ----- | ----------- |
|
|
||||||
| `duration` | The total duration of the transaction |
|
|
||||||
| `allocated_memory` | The number of bytes allocated while the transaction was running. This value is only reliable when using single-threaded application servers |
|
|
||||||
| `method_duration` | The total time spent in method calls |
|
|
||||||
| `sql_duration` | The total time spent in SQL queries |
|
|
||||||
| `view_duration` | The total time spent in views |
|
|
||||||
|
|
||||||
## PROCESS_views
|
|
||||||
|
|
||||||
This measurement is used to store view rendering timings for a transaction. The
|
|
||||||
following value fields are available:
|
|
||||||
|
|
||||||
| Value | Description |
|
|
||||||
| ----- | ----------- |
|
|
||||||
| `duration` | The rendering time of the view |
|
|
||||||
| `view` | The path of the view, relative to the application's root directory |
|
|
||||||
|
|
||||||
The `action` tag contains the action name of the transaction that rendered the
|
|
||||||
view.
|
|
||||||
|
|
||||||
## events
|
|
||||||
|
|
||||||
This measurement is used to store generic events such as the number of Git
|
|
||||||
pushes, emails sent, etc. Each point in this measurement has a single value
|
|
||||||
field called `count`. The value of this field is simply set to `1`. Each point
|
|
||||||
also has at least one tag: `event`. This tag's value is set to the event name.
|
|
||||||
Depending on the event type additional tags may be available as well.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
Read more on:
|
|
||||||
|
|
||||||
- [Introduction to GitLab Performance Monitoring](introduction.md)
|
|
||||||
- [GitLab Configuration](gitlab_configuration.md)
|
|
||||||
- [InfluxDB Configuration](influxdb_configuration.md)
|
|
||||||
- [Grafana Install/Configuration](grafana_configuration.md)
|
|
||||||
|
|
|
@ -1,65 +1 @@
|
||||||
# GitLab Performance Monitoring
|
This document was moved to [administration/monitoring/performance/introduction](../administration/monitoring/performance/introduction.md).
|
||||||
|
|
||||||
GitLab comes with its own application performance measuring system as of GitLab
|
|
||||||
8.4, simply called "GitLab Performance Monitoring". GitLab Performance Monitoring is available in both the
|
|
||||||
Community and Enterprise editions.
|
|
||||||
|
|
||||||
Apart from this introduction, you are advised to read through the following
|
|
||||||
documents in order to understand and properly configure GitLab Performance Monitoring:
|
|
||||||
|
|
||||||
- [GitLab Configuration](gitlab_configuration.md)
|
|
||||||
- [InfluxDB Install/Configuration](influxdb_configuration.md)
|
|
||||||
- [InfluxDB Schema](influxdb_schema.md)
|
|
||||||
- [Grafana Install/Configuration](grafana_configuration.md)
|
|
||||||
|
|
||||||
## Introduction to GitLab Performance Monitoring
|
|
||||||
|
|
||||||
GitLab Performance Monitoring makes it possible to measure a wide variety of statistics
|
|
||||||
including (but not limited to):
|
|
||||||
|
|
||||||
- The time it took to complete a transaction (a web request or Sidekiq job).
|
|
||||||
- The time spent in running SQL queries and rendering HAML views.
|
|
||||||
- The time spent executing (instrumented) Ruby methods.
|
|
||||||
- Ruby object allocations, and retained objects in particular.
|
|
||||||
- System statistics such as the process' memory usage and open file descriptors.
|
|
||||||
- Ruby garbage collection statistics.
|
|
||||||
|
|
||||||
Metrics data is written to [InfluxDB][influxdb] over [UDP][influxdb-udp]. Stored
|
|
||||||
data can be visualized using [Grafana][grafana] or any other application that
|
|
||||||
supports reading data from InfluxDB. Alternatively data can be queried using the
|
|
||||||
InfluxDB CLI.
|
|
||||||
|
|
||||||
## Metric Types
|
|
||||||
|
|
||||||
Two types of metrics are collected:
|
|
||||||
|
|
||||||
1. Transaction specific metrics.
|
|
||||||
1. Sampled metrics, collected at a certain interval in a separate thread.
|
|
||||||
|
|
||||||
### Transaction Metrics
|
|
||||||
|
|
||||||
Transaction metrics are metrics that can be associated with a single
|
|
||||||
transaction. This includes statistics such as the transaction duration, timings
|
|
||||||
of any executed SQL queries, time spent rendering HAML views, etc. These metrics
|
|
||||||
are collected for every Rack request and Sidekiq job processed.
|
|
||||||
|
|
||||||
### Sampled Metrics
|
|
||||||
|
|
||||||
Sampled metrics are metrics that can't be associated with a single transaction.
|
|
||||||
Examples include garbage collection statistics and retained Ruby objects. These
|
|
||||||
metrics are collected at a regular interval. This interval is made up of two
|
|
||||||
parts:
|
|
||||||
|
|
||||||
1. A user defined interval.
|
|
||||||
1. A randomly generated offset added on top of the interval. The same offset
|
|
||||||
can't be used twice in a row.
|
|
||||||
|
|
||||||
The actual interval can be anywhere between half of the defined interval and a
|
|
||||||
half above the interval. For example, for a user defined interval of 15 seconds
|
|
||||||
the actual interval can be anywhere between 7.5 and 22.5 seconds. The interval is
|
|
||||||
re-generated for every sampling run instead of being generated once and re-used
|
|
||||||
for the duration of the process' lifetime.
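
The range described above can be worked out as follows. This is only an illustrative sketch, not GitLab's implementation: the function names are hypothetical, the draw is assumed to be uniform, and the rule that the same offset can't be used twice in a row is not modelled.

```bash
# Print the lower and upper bound (in seconds) of the actual sampling
# interval for a user defined interval given in seconds.
interval_bounds() {
  awk -v i="$1" 'BEGIN { printf "%g %g\n", i / 2, i * 1.5 }'
}

# Draw one interval between the bounds. Illustrative only; the seed falls
# back to 0 in shells that do not provide $RANDOM.
random_interval() {
  awk -v i="$1" -v seed="${RANDOM:-0}" \
    'BEGIN { srand(seed); printf "%.3f\n", i / 2 + rand() * i }'
}

interval_bounds 15   # prints "7.5 22.5"
```

Calling `random_interval 15` once per sampling run mirrors the idea of re-generating the interval each time rather than fixing it for the lifetime of the process.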
|
|
||||||
|
|
||||||
[influxdb]: https://influxdata.com/time-series-platform/influxdb/
|
|
||||||
[influxdb-udp]: https://docs.influxdata.com/influxdb/v0.9/write_protocols/udp/
|
|
||||||
[grafana]: http://grafana.org/
|
|
||||||
|
|