This change comes out of a discussion between Ben Kochie and me, around this MR: https://gitlab.com/gitlab-org/gitlab-pages/merge_requests/164

gitlab-elasticsearch-indexer already uses `go mod` without a `vendor/` directory. It has caused some intermittent build failures in the gitlab-ce/ee CI pipelines, but has otherwise been fine.

I think that treating our Go dependencies in the same way we do our Ruby or Node.js ones is reasonable, and it has some minor benefits:

* Contributors find it easier to submit MRs
* MRs are easier to review
* Makefiles are easier to write
Go standards and style guidelines
This document describes various guidelines and best practices for GitLab projects using the Go language.
Overview
GitLab is built on top of Ruby on Rails, but we're also using Go for projects where it makes sense. Go is a very powerful language, with many advantages, and is best suited for projects with a lot of IO (disk/network access), HTTP requests, parallel processing, etc. Since we have both Ruby on Rails and Go at GitLab, we should evaluate carefully which of the two is best for the job.
This page aims to define and organize our Go guidelines, based on our various experiences. Several projects were started with different standards and they can still have specifics. They will be described in their respective `README.md` or `PROCESS.md` files.
Code Review
We follow the common principles of Go Code Review Comments.
Reviewers and maintainers should pay attention to:
- `defer` functions: ensure their presence when needed, and after the `err` check.
- Inject dependencies as parameters.
- Void structs when marshaling to JSON (generates `null` instead of `[]`).
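The last point can be illustrated with a minimal sketch. The `Response` type and the `encode` helper are hypothetical names used only for this example; the behavior shown (a nil slice marshals to `null`, an allocated empty slice marshals to `[]`) is that of the standard `encoding/json` package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Response is a hypothetical API payload used only for this illustration.
type Response struct {
	Items []string `json:"items"`
}

// encode marshals a Response and returns the JSON document as a string.
func encode(r Response) string {
	b, err := json.Marshal(r)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	var unset Response                   // Items is a nil slice
	empty := Response{Items: []string{}} // Items is allocated but empty

	fmt.Println(encode(unset)) // {"items":null}
	fmt.Println(encode(empty)) // {"items":[]}
}
```

Clients that expect an array may break on `null`, so a reviewer can ask for the slice to be initialized before marshaling.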
Security
Security is our top priority at GitLab. During code reviews, we must take care of possible security breaches in our code:
- XSS when using text/template
- CSRF Protection using Gorilla
- Use a Go version without known vulnerabilities
- Don't leak secret tokens
- SQL injections
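As a rough illustration of the first item, the sketch below renders the same untrusted input with `text/template` and with `html/template`. The template string and the `renderText`/`renderHTML` helpers are hypothetical; the point is only that the standard `html/template` package escapes values for the HTML context, while `text/template` writes them unchanged.

```go
package main

import (
	"fmt"
	htmltemplate "html/template"
	"strings"
	texttemplate "text/template"
)

// page is a hypothetical template shared by both renderers.
const page = "<p>{{.}}</p>"

// renderText uses text/template, which performs no escaping.
func renderText(input string) string {
	var sb strings.Builder
	t := texttemplate.Must(texttemplate.New("page").Parse(page))
	if err := t.Execute(&sb, input); err != nil {
		return "error: " + err.Error()
	}
	return sb.String()
}

// renderHTML uses html/template, which escapes the value for HTML output.
func renderHTML(input string) string {
	var sb strings.Builder
	t := htmltemplate.Must(htmltemplate.New("page").Parse(page))
	if err := t.Execute(&sb, input); err != nil {
		return "error: " + err.Error()
	}
	return sb.String()
}

func main() {
	payload := "<script>alert(1)</script>"

	fmt.Println(renderText(payload)) // <p><script>alert(1)</script></p>
	fmt.Println(renderHTML(payload)) // <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
}
```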
Remember to run SAST (ULTIMATE) on your project (or at least the gosec analyzer), and to follow our Security requirements.
Web servers can take advantage of middleware like Secure.
Finding a reviewer
Many of our projects are too small to have full-time maintainers. That's why we have a shared pool of Go reviewers at GitLab. To find a reviewer, use the Engineering Projects page in the handbook. "GitLab Community Edition (CE)" and "GitLab Enterprise Edition (EE)" both have a "Go" section with a list of reviewers.
To add yourself to this list, add the following to your profile in the team.yml file and ask your manager to review and merge.
projects:
  gitlab-ee: reviewer go
  gitlab-ce: reviewer go
Code style and format
- Avoid global variables, even in packages. Global variables introduce side effects if the package is included multiple times.
- Use `go fmt` before committing (Gofmt is a tool that automatically formats Go source code).
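One way to avoid package-level state is to pass dependencies in as parameters, as also recommended in the code review section. The following is a minimal sketch under that assumption; `Greeter`, `NewGreeter`, and `greetToString` are hypothetical names, not part of any GitLab project.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// Greeter holds its output destination as a field instead of relying on a
// package-level variable.
type Greeter struct {
	out io.Writer
}

// NewGreeter injects the writer the Greeter will use.
func NewGreeter(out io.Writer) *Greeter {
	return &Greeter{out: out}
}

// Greet writes a greeting to the injected writer.
func (g *Greeter) Greet(name string) error {
	_, err := fmt.Fprintf(g.out, "Hello, %s!\n", name)
	return err
}

// greetToString shows how a caller (for example a test) can inject an
// in-memory writer without touching any global state.
func greetToString(name string) string {
	var sb strings.Builder
	if err := NewGreeter(&sb).Greet(name); err != nil {
		return "error: " + err.Error()
	}
	return sb.String()
}

func main() {
	if err := NewGreeter(os.Stdout).Greet("GitLab"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```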
Automatic linting
All Go projects should include these GitLab CI/CD jobs:
go lint:
  image: golang:1.11
  script:
    - go get -u golang.org/x/lint/golint
    - golint -set_exit_status $(go list ./... | grep -v "vendor/")
Once recursive includes become available, you will be able to share job templates like this analyzer.
Dependencies
Dependencies should be kept to the minimum. The introduction of a new dependency should be argued in the merge request, as per our Approval Guidelines. Both License Management (ULTIMATE) and Dependency Scanning (ULTIMATE) should be activated on all projects to verify the security status and license compatibility of new dependencies.
Modules
Since Go 1.11, a standard dependency system is available under the name Go Modules. It provides a way to define and lock dependencies for reproducible builds. It should be used whenever possible.
When Go Modules are in use, there should not be a `vendor/` directory. Instead, Go will automatically download dependencies when they are needed to build the project. This is in line with how dependencies are handled with Bundler in Ruby projects, and makes merge requests easier to review.
In some cases, such as building a Go project for it to act as a dependency of a CI run for another project, removing the `vendor/` directory means the code must be downloaded repeatedly, which can lead to intermittent problems due to rate limiting or network failures. In these circumstances, you should cache the downloaded code between runs with a `.gitlab-ci.yml` snippet like this:
.go-cache:
  variables:
    GOPATH: $CI_PROJECT_DIR/.go
  before_script:
    - mkdir -p .go
  cache:
    paths:
      - .go/pkg/mod/

test:
  extends: .go-cache
  # ...
There was a bug in module checksums in Go versions before 1.11.4, so make sure to use at least this version to avoid `checksum mismatch` errors.
ORM
We don't use object-relational mapping libraries (ORMs) at GitLab (except ActiveRecord in Ruby on Rails). Projects can be structured with services to avoid them. PQ should be enough to interact with PostgreSQL databases.
Migrations
In the rare event of managing a hosted database, it's necessary to use a migration system like the one ActiveRecord provides. A simple library like Journey, designed to be used in `postgres` containers, can be deployed as long-running pods. New versions will deploy a new pod, migrating the data automatically.
Testing
Testing frameworks
We should not use any specific library or framework for testing, as the standard library already provides everything needed to get started. If there is a need for more sophisticated testing tools, the following external dependencies might be worth considering in case we decide to use a specific library or framework:
Subtests
Use subtests whenever possible to improve code readability and test output.
Better output in tests
When comparing expected and actual values in tests, use testify/require.Equal, testify/require.EqualError, testify/require.EqualValues, and others to improve readability when comparing structs, errors, large portions of text, or JSON documents:
import (
	"testing"

	"github.com/stretchr/testify/require"
)

type TestData struct {
	// ...
}

func FuncUnderTest() TestData {
	// ...
}

func Test(t *testing.T) {
	t.Run("FuncUnderTest", func(t *testing.T) {
		want := TestData{}
		got := FuncUnderTest()

		require.Equal(t, want, got) // note that expected value comes first, then comes the actual one ("diff" semantics)
	})
}
Table-Driven Tests
Using Table-Driven Tests is generally good practice when you have multiple entries of inputs/outputs for the same function. Below are some guidelines one can follow when writing table-driven tests. These guidelines are mostly extracted from Go standard library source code. Keep in mind it's OK not to follow these guidelines when it makes sense.
Defining test cases
Each table entry is a complete test case with inputs and expected results, and sometimes with additional information such as a test name to make the test output easily readable.
- Define a slice of anonymous struct inside of the test.
- Define a slice of anonymous struct outside of the test.
- Named structs for code reuse.
- Using `map[string]struct{}`.
Contents of the test case
- Ideally, each test case should have a field with a unique identifier to use for naming subtests. In the Go standard library, this is commonly the `name string` field.
- Use `want`/`expect`/`actual` when you are specifying something in the test case that will be used for assertion.
Variable names
- Each table-driven test map/slice of struct can be named `tests`.
- When looping through `tests` the anonymous struct can be referred to as `tt` or `tc`.
- The description of the test can be referred to as `name`/`testName`/`tn`.
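The guidelines above can be sketched as follows, using only the standard `testing` package. `Abs` is a hypothetical function under test, and the example picks one of the listed layouts (a slice of anonymous structs defined inside the test); the other layouts are equally acceptable. Saved in a `_test.go` file, it runs with `go test`.

```go
package main

import "testing"

// Abs is a hypothetical function under test.
func Abs(n int) int {
	if n < 0 {
		return -n
	}
	return n
}

func TestAbs(t *testing.T) {
	// Each entry is a complete test case: a unique name, the input,
	// and the expected result.
	tests := []struct {
		name  string
		input int
		want  int
	}{
		{name: "positive", input: 3, want: 3},
		{name: "negative", input: -3, want: 3},
		{name: "zero", input: 0, want: 0},
	}

	for _, tt := range tests {
		// The name field is used to label the subtest in the output.
		t.Run(tt.name, func(t *testing.T) {
			if got := Abs(tt.input); got != tt.want {
				t.Errorf("Abs(%d) = %d, want %d", tt.input, got, tt.want)
			}
		})
	}
}
```

A single case can then be selected with `go test -run 'TestAbs/negative'`.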
Benchmarks
Programs handling a lot of IO or complex operations should always include benchmarks, to ensure performance consistency over time.
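A minimal sketch of a benchmark, using the standard `testing` package. `joinWords` is a hypothetical function standing in for real application code. The benchmark function has the usual signature, so it can live in a `_test.go` file and run with `go test -bench=.`; the `main` function here only exists so the example also runs on its own through `testing.Benchmark`.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// joinWords is a hypothetical function whose performance we want to track.
func joinWords(words []string) string {
	var sb strings.Builder
	for i, w := range words {
		if i > 0 {
			sb.WriteByte(' ')
		}
		sb.WriteString(w)
	}
	return sb.String()
}

// BenchmarkJoinWords runs joinWords b.N times and reports allocations.
func BenchmarkJoinWords(b *testing.B) {
	words := []string{"go", "standards", "and", "style", "guidelines"}
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		joinWords(words)
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of `go test`.
	result := testing.Benchmark(BenchmarkJoinWords)
	fmt.Println(result.String(), result.MemString())
}
```

Comparing the reported time and allocations per operation between revisions is one way to notice performance regressions over time.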
CLIs
Every Go program is launched from the command line. cli is a convenient package to create command line apps. It should be used whether the project is a daemon or a simple cli tool. Flags can be mapped to environment variables directly, which documents and centralizes at the same time all the possible command line interactions with the program. Don't use `os.Getenv`, it hides variables deep in the code.
Daemons
Logging
The usage of a logging library is strongly recommended for daemons. Even though there is a `log` package in the standard library, we generally use Logrus. Its plugin ("hooks") system makes it a powerful logging library, with the ability to add notifiers and formatters at the logger level directly.
Structured (JSON) logging
Every binary should ideally have structured (JSON) logging in place, as it helps with searching and filtering the logs. At GitLab we use structured logging in JSON format, as all our infrastructure assumes that. When using Logrus you can turn on structured logging simply by using the built-in JSON formatter. This follows the same logging type we use in our Ruby applications.
How to use Logrus
There are a few guidelines one should follow when using the Logrus package:
- When printing an error use WithError. For example, `logrus.WithError(err).Error("Failed to do something")`.
- Since we use structured logging we can log fields in the context of that code path, such as the URI of the request using `WithField` or `WithFields`. For example, `logrus.WithField("file", "/app/go").Info("Opening dir")`. If you have to log multiple keys, always use `WithFields` instead of calling `WithField` more than once.
Tracing and Correlation
LabKit is a place to keep common libraries for Go services. Currently it's vendored into two projects: Workhorse and Gitaly, and it exports two main (but related) pieces of functionality:
- `gitlab.com/gitlab-org/labkit/correlation`: for propagating and extracting correlation ids between services.
- `gitlab.com/gitlab-org/labkit/tracing`: for instrumenting Go libraries for distributed tracing.
This gives us a thin abstraction over underlying implementations that is consistent across Workhorse, Gitaly, and, in future, other Go servers. For example, in the case of `gitlab.com/gitlab-org/labkit/tracing` we can switch from using Opentracing directly to using Zipkin or Gokit's own tracing wrapper without changes to the application code, while still keeping the same consistent configuration mechanism (i.e. the `GITLAB_TRACING` environment variable).
Context
Since daemons are long-running applications, they should have mechanisms to manage cancellations and avoid unnecessary resource consumption (which could lead to DDoS vulnerabilities). Go Context should be used in functions that can block, and passed as the first parameter.
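A minimal sketch of this pattern, using the standard `context` package. `fetch` and `fetchWithTimeout` are hypothetical names, and the timer stands in for any blocking call such as a network request.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// fetch is a hypothetical blocking operation. It takes the context as its
// first parameter and returns early when the context is done.
func fetch(ctx context.Context, delay time.Duration) (string, error) {
	select {
	case <-time.After(delay): // simulated work
		return "done", nil
	case <-ctx.Done(): // cancelled or deadline exceeded
		return "", ctx.Err()
	}
}

// fetchWithTimeout bounds fetch with a deadline so that a slow call cannot
// hold resources indefinitely.
func fetchWithTimeout(timeout, delay time.Duration) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel() // always release the context's resources
	return fetch(ctx, delay)
}

func main() {
	res, err := fetchWithTimeout(200*time.Millisecond, time.Millisecond)
	fmt.Println(res, err) // done <nil>

	_, err = fetchWithTimeout(time.Millisecond, time.Second)
	fmt.Println(errors.Is(err, context.DeadlineExceeded)) // true
}
```

In a real daemon the parent context would usually come from the incoming request or from a signal handler rather than from `context.Background()`.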
Dockerfiles
Every project should have a `Dockerfile` at the root of its repository, to build and run the project. Since Go programs are static binaries, they should not require any external dependency, and shells in the final image are useless. We encourage Multistage builds:
- They let the user build the project with the right Go version and dependencies.
- They generate a small, self-contained image, derived from `Scratch`.
Generated docker images should have the program at their `Entrypoint` to create portable commands. That way, anyone can run the image, and without parameters it will display its help message (if `cli` has been used).