GitLab Performance Monitoring is now able to track custom events not
directly related to application performance. These events include the
number of tags pushed, repositories created, builds registered, etc.
These events give a better overview of how a GitLab instance is used
and how that may affect performance. For example, a
large number of Git pushes may have a negative impact on the underlying
storage engine.
Events are stored in the "events" measurement and are not prefixed with
"rails_" or "sidekiq_". This makes it easier to query events with the
same name triggered from different parts of the application. All events
being stored in the same measurement also makes it easier to downsample
data.
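
As a rough illustration of the storage layout described above, the
sketch below formats a single event as an InfluxDB line-protocol point
in the shared "events" measurement. The "event" tag name, the "count"
field and the helper names are assumptions made for this example; they
are not taken from this commit.

```python
from typing import Dict, Optional


def escape_tag(value: str) -> str:
    """Escape the characters that are special in line-protocol tags."""
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def event_line(event: str, tags: Optional[Dict[str, str]] = None) -> str:
    """Build one point for the shared "events" measurement.

    The measurement name carries no "rails_" or "sidekiq_" prefix, so the
    same event name can be queried regardless of where it was triggered.
    The "event" tag and "count" field are hypothetical names.
    """
    all_tags = {"event": event}
    all_tags.update(tags or {})
    # Sort the tags so the output is stable.
    tag_part = ",".join(
        f"{escape_tag(key)}={escape_tag(value)}"
        for key, value in sorted(all_tags.items())
    )
    return f"events,{tag_part} count=1i"


if __name__ == "__main__":
    print(event_line("push_tag"))
    print(event_line("push_commit", {"branch": "master"}))
```

Extra details such as the branch being pushed to would be attached as
additional tags, which keeps every event in one measurement.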
Currently the following events are tracked:
* Creating repositories
* Removing repositories
* Changing the default branch of a repository
* Pushing a new tag
* Removing an existing tag
* Pushing a commit (along with the branch being pushed to)
* Pushing a new branch
* Removing an existing branch
* Importing a repository (along with the URL we're importing)
* Forking a repository (along with the source/target path)
* CI builds registered (and when no build could be found)
* CI builds being updated
* Rails and Sidekiq exceptions
Fixes gitlab-org/gitlab-ce#13720
This setup is quite a bit different from before. In the previous setup
raw data was kept around for 30 days and downsampled data for 7 days.
This became problematic for GitLab.com as the number of points and
series resulted in InfluxDB running out of memory when starting up
(besides taking up 30 GB of storage).
To work around this the new setup keeps raw data around for _only_ an
hour while keeping downsampled data around for 7 days. In turn all
Grafana dashboards _only_ query the downsampled data instead of also
querying raw data.
Based on rough calculations this setup needs around 2 GB of storage for 1
week of data, excluding whatever is needed for storing the raw data
(this highly depends on the amount of traffic).
If users want to use this new setup they have to remove any existing
dashboards provided by GitLab.com and re-import the ones from the
Grafana dashboards repository
(https://gitlab.com/gitlab-org/grafana-dashboards/). Should users wish
to change their default retention policy the easiest way of doing so is
to simply drop the database and re-run the InfluxDB commands added by
this commit. Users who want to keep their default retention policy as-is
can simply create the "downsampled" policy and run the other commands.
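
For reference, the retention policies described above could be written
as InfluxQL statements along the following lines. This is only a sketch:
the database name "gitlab", the policy name "default" and the helper
function are assumptions for illustration. The authoritative commands
are the ones added by this commit.

```python
def retention_policy(name: str, database: str, duration: str,
                     default: bool = False) -> str:
    """Return an InfluxQL CREATE RETENTION POLICY statement."""
    statement = (
        f'CREATE RETENTION POLICY "{name}" ON "{database}" '
        f"DURATION {duration} REPLICATION 1"
    )
    return (statement + " DEFAULT") if default else statement


if __name__ == "__main__":
    database = "gitlab"  # hypothetical database name

    # Raw data is only kept for an hour.
    print(retention_policy("default", database, "1h", default=True))

    # Downsampled data is kept for 7 days. Users keeping their existing
    # default policy would only need this statement.
    print(retention_policy("downsampled", database, "7d"))
```

The printed statements can be pasted into the influx shell once the
names have been adjusted to match the local setup.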
The grafana-dashboards repository now contains _all_ GitLab.com
dashboards and thus requires some extra continuous queries to be set up.
The repository now also provides a way to automatically import/export
dashboards.
[ci skip]