Sometimes we invoke basic Unix commands via the shell when there is also a Ruby API for doing it. Use the Ruby API if it exists. <http://www.ruby-doc.org/stdlib-2.0.0/libdoc/fileutils/rdoc/FileUtils.html#module-FileUtils-label-Module+Functions>
When we pass shell commands as a single string to Ruby, Ruby will let `/bin/sh` evaluate the entire string. Essentially, we are asking the shell to evaluate a one-line script. This creates a risk for shell injection attacks. It is better to split the shell command into tokens ourselves. Sometimes we use the scripting capabilities of the shell to change the working directory or set environment variables. All of this can also be achieved securely straight from Ruby
Make the difference between options and arguments clear to the argument parsers of system commands with `--`. This is supported by many but not all Unix commands.
In the example above, the argument parser of `cat` assumes that `-l` is an option. The solution in the example above is to make it clear to `cat` that `-l` is really an argument, not an option. Many Unix command line tools follow the convention of separating options from arguments with `--`.
Capturing the output of shell commands with backticks reads nicely, but you are forced to pass the command as one string to the shell. We explained above that this is unsafe. In the main GitLab codebase, the solution is to use `Gitlab::Popen.popen` instead.
## Properly anchor regular expressions to the start and end of strings
When using regular expressions to validate user input that is passed as an argument to a shell command, make sure to use the `\A` and `\z` anchors that designate the start and end of the string, rather than `^` and `$`, or no anchors at all.
If you don't, an attacker could use this to execute commands with potentially harmful effect.
For example, when a project's `import_url` is validated like below, the user could trick GitLab into cloning from a Git repository on the local filesystem.
# URI.regexp(%w(ssh git http https)) roughly evaluates to /(ssh|git|http|https):(something_that_looks_like_a_url)/
```
Suppose the user submits the following as their import URL:
```
file://git:/tmp/lol
```
Since there are no anchors in the used regular expression, the `git:/tmp/lol` in the value would match, and the validation would pass.
When importing, GitLab would execute the following command, passing the `import_url` as an argument:
```sh
git clone file://git:/tmp/lol
```
Git would simply ignore the `git:` part, interpret the path as `file:///tmp/lol` and import the repository into the new project, in turn potentially giving the attacker access to any repository in the system, whether private or not.