From 24a7d86da2d4e2fe296bec264a70248733ea6f1f Mon Sep 17 00:00:00 2001 From: Stan Hu Date: Tue, 13 Nov 2018 14:43:16 -0800 Subject: [PATCH 1/3] Add documentation on how to use structured logging --- doc/development/README.md | 1 + doc/development/logging.md | 144 +++++++++++++++++++++++++++++++++++++ 2 files changed, 145 insertions(+) create mode 100644 doc/development/logging.md diff --git a/doc/development/README.md b/doc/development/README.md index 0080c34c056..d2dd62ecac5 100644 --- a/doc/development/README.md +++ b/doc/development/README.md @@ -30,6 +30,7 @@ description: 'Learn how to contribute to GitLab.' ## Backend guides - [GitLab utilities](utilities.md) +- [Logging](logging.md) - [API styleguide](api_styleguide.md) Use this styleguide if you are contributing to the API. - [GraphQL API styleguide](api_graphql_styleguide.md) Use this diff --git a/doc/development/logging.md b/doc/development/logging.md new file mode 100644 index 00000000000..e13345a33f8 --- /dev/null +++ b/doc/development/logging.md @@ -0,0 +1,144 @@ +# GitLab Developers Guide to Logging + +[GitLab Logs](../administration/logs.md) play a critical role for both +administrators and GitLab team members to diagnose problems in the field. + +## Don't use `Rails.logger` + +Currently `Rails.logger` calls all get saved into `production.log`, which contains +a mix of Rails' logs and other calls developers have inserted in the code base. +For example: + +``` +Started GET "/gitlabhq/yaml_db/tree/master" for 168.111.56.1 at 2015-02-12 19:34:53 +0200 +Processing by Projects::TreeController#show as HTML + Parameters: {"project_id"=>"gitlabhq/yaml_db", "id"=>"master"} + + ... + + Namespaces"."created_at" DESC, "namespaces"."id" DESC LIMIT 1 [["id", 26]] + CACHE (0.0ms) SELECT "members".* FROM "members" WHERE "members"."source_type" = 'Project' AND "members"."type" IN ('ProjectMember') AND "members"."source_id" = $1 AND "members"."source_type" = $2 AND "members"."user_id" = 1 ORDER BY "members"."created_at" DESC, "members"."id" DESC LIMIT 1 [["source_id", 18], ["source_type", "Project"]] + CACHE (0.0ms) SELECT "members".* FROM "members" WHERE "members"."source_type" = 'Project' AND "members". + (1.4ms) SELECT COUNT(*) FROM "merge_requests" WHERE "merge_requests"."target_project_id" = $1 AND ("merge_requests"."state" IN ('opened','reopened')) [["target_project_id", 18]] + Rendered layouts/nav/_project.html.haml (28.0ms) + Rendered layouts/_collapse_button.html.haml (0.2ms) + Rendered layouts/_flash.html.haml (0.1ms) + Rendered layouts/_page.html.haml (32.9ms) +Completed 200 OK in 166ms (Views: 117.4ms | ActiveRecord: 27.2ms) +``` + +These logs suffer from a number of problems: + +1. They often lack timestamps or other contextual information (e.g. project ID, user) +2. They may span multiple lines, which make them hard to find via Elasticsearch. +3. They lack a common structure, which make them hard to parse by log +forwarders, such as Logstash or Fluentd. This also makes them hard to +search. + +Note that currently on GitLab.com, any messages in `production.log` will +NOT get indexed by Elasticsearch due to the sheer volume and noise. They +do end up in Google Stackdriver, but it is still harder to search for +logs there. See the [GitLab.com logging +documentation](https://gitlab.com/gitlab-com/runbooks/blob/master/howto/logging.md) +for more details. + +## Use structured (JSON) logging + +Structured logging solves these problems. Consider the example from an API request: + +```json +{"time":"2018-10-29T12:49:42.123Z","severity":"INFO","duration":709.08,"db":14.59,"view":694.49,"status":200,"method":"GET","path":"/api/v4/projects","params":[{"key":"action","value":"git-upload-pack"},{"key":"changes","value":"_any"},{"key":"key_id","value":"secret"},{"key":"secret_token","value":"[FILTERED]"}],"host":"localhost","ip":"::1","ua":"Ruby","route":"/api/:version/projects","user_id":1,"username":"root","queue_duration":100.31,"gitaly_calls":30} +``` + +In a single line, we've included all the information that a user needs +to understand what happened: the timestamp, HTTP method and path, user +ID, etc. + +### How to use JSON logging + +Suppose you want to log the events that happen in a project +importer. You want to log issues created, merge requests, etc. as the +importer progresses. Here's what to do: + +1. Look at [the list of GitLab Logs](../administration/logs.md) to see +if your log message might belong with one of the existing log files. +1. If there isn't a good place, consider creating a new filename, but +check with a maintainer if it makes sense to do so. A log file should +make it easy for people to search pertinent logs in one place. For +example,`geo.log` contains all logs pertaining to GitLab Geo. +To create a new file: + 1. Choose a filename (e.g. `importer_json.log`). + 1. Create a new subclass of `Gitlab::JsonLogger`: + + ```ruby + module Gitlab + module Import + class Logger < ::Gitlab::JsonLogger + def self.file_name_noext + 'importer_json' + end + end + end + end + ``` + + 1. In your class where you want to log, you might initialize the logger as an instance variable: + + ```ruby + attr_accessor :logger + + def initialize + @logger = Gitlab::Import::Logger.build + end + ``` + + Note that it's useful to memoize this because creating a new logger + each time you log will open a file, adding unnecessary overhead. + +1. Now insert log messages into your code. When adding logs, + make sure to include all the context as key-value pairs: + + ```ruby + # BAD + logger.info("Unable to create project #{project.id}") + ``` + + ```ruby + # GOOD + logger.info("Unable to create project", project_id: project.id) + ``` + +1. Be sure to create a common base structure of your log messages. For example, + all messages might have `current_user_id` and `project_id` to make it easier + to search for activities by user for a given time. + +1. Do NOT mix and match types. Elasticsearch won't be able to index your + logs properly if you [mix integer and string + types](https://www.elastic.co/guide/en/elasticsearch/guide/current/mapping.html#_avoiding_type_gotchas): + + ```ruby + # BAD + logger.info("Import error", error: 1) + logger.info("Import error", error: "I/O failure") + ``` + + ```ruby + # GOOD + logger.info("Import error", error_code: 1, error: "I/O failure") + ``` + +## Additional steps with new log files + +1. Consider log retention settings. By default, Omnibus will rotate any +logs in `/var/log/gitlab/gitlab-rails/*.log` every hour and [keep at +most 30 compressed files](https://docs.gitlab.com/omnibus/settings/logs.html#logrotate). +On GitLab.com, that setting is only 6 compressed files. These settings should suffice +for most users, but you may need to tweak them in [omnibus-gitlab](https://gitlab.com/gitlab-org/omnibus-gitlab). + +1. If you add a new file, submit an issue to the [production +tracker](https://gitlab.com/gitlab-com/gl-infra/production/issues) or +a merge request to the [gitlab_fluentd](https://gitlab.com/gitlab-cookbooks/gitlab_fluentd) +project. See [this example](https://gitlab.com/gitlab-cookbooks/gitlab_fluentd/merge_requests/51/diffs). + +1. Be sure to update the [GitLab CE/EE documentation](../administration/logs.md) and the [GitLab.com +runbooks](https://gitlab.com/gitlab-com/runbooks/blob/master/howto/logging.md). From 63c9472c2888cb925fe454bef6940bd8c4fd07f6 Mon Sep 17 00:00:00 2001 From: Stan Hu Date: Wed, 14 Nov 2018 13:05:19 -0800 Subject: [PATCH 2/3] Exclude qa and docs from Rails 4 builds Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/54059 --- .gitlab-ci.yml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml index 94922aec598..8c622b06ce6 100644 --- a/.gitlab-ci.yml +++ b/.gitlab-ci.yml @@ -79,6 +79,8 @@ stages: allow_failure: false except: variables: + - $CI_COMMIT_REF_NAME =~ /(^docs[\/-].*|.*-docs$)/ + - $CI_COMMIT_REF_NAME =~ /(^qa[\/-].*|.*-qa$)/ - $CI_COMMIT_REF_NAME =~ /norails4/ - $RAILS5_DISABLED variables: From 5e4f3f45beea8449e1398db1502e0e090e3706cf Mon Sep 17 00:00:00 2001 From: Stan Hu Date: Wed, 14 Nov 2018 13:34:41 -0800 Subject: [PATCH 3/3] Add missing space to logging doc --- doc/development/logging.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/development/logging.md b/doc/development/logging.md index e13345a33f8..abd08c420da 100644 --- a/doc/development/logging.md +++ b/doc/development/logging.md @@ -65,7 +65,7 @@ if your log message might belong with one of the existing log files. 1. If there isn't a good place, consider creating a new filename, but check with a maintainer if it makes sense to do so. A log file should make it easy for people to search pertinent logs in one place. For -example,`geo.log` contains all logs pertaining to GitLab Geo. +example, `geo.log` contains all logs pertaining to GitLab Geo. To create a new file: 1. Choose a filename (e.g. `importer_json.log`). 1. Create a new subclass of `Gitlab::JsonLogger`: