diff --git a/.travis.yml b/.travis.yml
index 1d48a5ac7b..3878fab01c 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -5,7 +5,6 @@ git:
 node_js:
   - "0.12"
 before_install:
-  - travis_retry pip install -r test-infra/requirements.txt --user
   - rvm install 2.0.0 && rvm use 2.0.0
   - "export TRAVIS_COMMIT_MSG=\"$(git log --format=%B --no-merges -n 1)\""
   - echo "$TRAVIS_COMMIT_MSG" | grep '\[skip validator\]'; export TWBS_DO_VALIDATOR=$?; true
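The removed `pip install` line existed only to pull in `boto` for `s3_cache.py` (see `test-infra/requirements.txt` below), so it leaves along with the rest of the cache infrastructure. For context, the surviving `grep`/`$?` pair records whether the commit message opted out of the validator; a minimal sketch of that idiom, using a hypothetical commit message:

```bash
# The idiom from the hunk above, in isolation: capture grep's exit status
# without letting a non-match fail the build step.
msg="Update docs [skip validator]"        # hypothetical commit message
echo "$msg" | grep '\[skip validator\]'   # exits 0 on a match, 1 otherwise
export TWBS_DO_VALIDATOR=$?               # 0 means a skip was requested
true                                      # leaves the step's exit status at 0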
diff --git a/test-infra/README.md b/test-infra/README.md
deleted file mode 100644
index 5235f861cb..0000000000
--- a/test-infra/README.md
+++ /dev/null
@@ -1,115 +0,0 @@
-## What does `s3_cache.py` do?
-
-### In general
-`s3_cache.py` maintains a cache, stored in an Amazon S3 (Simple Storage Service) bucket, of a given directory whose contents are considered non-critical and are completely & solely determined by (and should be able to be regenerated from) a single given file.
-
-The SHA-256 hash of the single file is used as the key for the cache. The directory is stored as a gzipped tarball.
-
-All the tarballs are stored in S3's Reduced Redundancy Storage (RRS) storage class, since this is cheaper and the data is non-critical.
-
-`s3_cache.py` itself never deletes cache entries; deletion should be done either manually or via automatic S3 lifecycle rules on the bucket.
-
-Like git, `s3_cache.py` assumes that [SHA-256 will effectively never have a collision](https://stackoverflow.com/questions/4014090/is-it-safe-to-ignore-the-possibility-of-sha-collisions-in-practice).
-
-
-### For Bootstrap specifically
-`s3_cache.py` is used to cache the npm packages that our Grunt tasks depend on.
-
-For npm, the `node_modules` directory is cached based on our `npm-shrinkwrap.json` file.
-
-
-## Why is `s3_cache.py` necessary?
-`s3_cache.py` is used to speed up Bootstrap's Travis builds. Installing npm packages used to take up a significant fraction of our total build times. Also, at the time `s3_cache.py` was written, npm was occasionally unreliable.
-
-Travis does offer built-in caching on its paid plans, but this do-it-ourselves S3 solution is significantly cheaper, since we need only caching and none of Travis' other paid features.
-
-
-## Configuration
-`s3_cache.py` is configured via `S3Cachefile.json`, which has the following format:
-```json
-{
-    "cache-name-here": {
-        "key": "path/to/file/to/SHA-256/hash/and/use/that/as/the/cache.key",
-        "cache": "path/to/directory/to/be/cached",
-        "generate": "shell-command --to run --to regenerate --the-cache $from scratch"
-    },
-    ...
-}
-```
-
-`s3_cache.py` SHA-256 hashes the contents of the `key` file and tries to fetch a tarball from S3 using the hash as the filename.
-If it can't fetch the tarball (either because it doesn't exist or due to a network error), it runs the `generate` command; if it can, it extracts the tarball to the `cache` directory.
-If it had to `generate` the cache, it later creates a tarball of the `cache` directory and tries to upload it to S3, again using the SHA-256 hash of the `key` file as the tarball's filename.
-
-
-## AWS Setup
-
-### Overview
-1. Create an Amazon Web Services (AWS) account.
-2. Create an Identity & Access Management (IAM) user, and note their credentials.
-3. Create an S3 bucket.
-4. Set permissions on the bucket to grant the user read+write access.
-5. Set the user credentials as secure Travis environment variables.
-
-### In detail
-1. Create an AWS account.
-2. Log in to the [AWS Management Console](https://console.aws.amazon.com).
-3. Go to the IAM Management Console.
-4. Create a new user (named e.g. `travis-ci`) and generate an access key for them. Note both the Access Key ID and the Secret Access Key.
-5. Note the user's ARN (Amazon Resource Name), shown on the user's "Summary" tab. This will be of the form: `arn:aws:iam::XXXXXXXXXXXXXX:user/the-username-goes-here`
-6. Note the user's access key, shown on the user's "Security Credentials" tab.
-7. Go to the S3 Management Console.
-8. Create a new bucket. For a non-publicly-accessible bucket (like Bootstrap uses), it's recommended that the bucket name be random to increase security. On most *nix machines, you can easily generate a random UUID to use as the bucket name using Python:
-
-   ```bash
-   python -c "import uuid; print(uuid.uuid4())"
-   ```
-
-9. Determine and note your bucket's ARN. The ARN for an S3 bucket is of the form: `arn:aws:s3:::the-bucket-name-goes-here`
-10. In the bucket's Properties pane, in the "Permissions" section, click the "Edit bucket policy" button.
-11. Input and submit an IAM Policy that grants the user at least read+write rights to the bucket. AWS has a policy generator and some examples to help with crafting the policy. Here's the policy that Bootstrap uses, with the sensitive bits censored:
-
-    ```json
-    {
-        "Version": "2012-10-17",
-        "Id": "PolicyTravisReadWriteNoAdmin",
-        "Statement": [
-            {
-                "Sid": "StmtXXXXXXXXXXXXXX",
-                "Effect": "Allow",
-                "Principal": {
-                    "AWS": "arn:aws:iam::XXXXXXXXXXXXXX:user/travis-ci"
-                },
-                "Action": [
-                    "s3:AbortMultipartUpload",
-                    "s3:GetObjectVersion",
-                    "s3:ListBucket",
-                    "s3:DeleteObject",
-                    "s3:DeleteObjectVersion",
-                    "s3:GetObject",
-                    "s3:PutObject"
-                ],
-                "Resource": [
-                    "arn:aws:s3:::XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX",
-                    "arn:aws:s3:::XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX/*"
-                ]
-            }
-        ]
-    }
-    ```
-
-12. If you want deletion from the cache to be done automatically based on age (as Bootstrap does): in the bucket's Properties pane, in the "Lifecycle" section, add a rule to expire/delete files based on creation date.
-13. Install the [`travis` RubyGem](https://github.com/travis-ci/travis): `gem install travis`
-14. Encrypt the environment variables:
-
-    ```bash
-    travis encrypt --repo twbs/bootstrap "AWS_ACCESS_KEY_ID=XXX"
-    travis encrypt --repo twbs/bootstrap "AWS_SECRET_ACCESS_KEY=XXX"
-    travis encrypt --repo twbs/bootstrap "TWBS_S3_BUCKET=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
-    ```
-
-15. Add the resulting secure environment variables to `.travis.yml`.
-
-
-## Usage
-Read `s3_cache.py`'s source code and Bootstrap's `.travis.yml` for how to invoke and make use of `s3_cache.py`.
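The deleted README defers to the source for invocation details. Per the usage string in `s3_cache.py` and the cache name defined in `S3Cachefile.json` (both below), invocations would have looked roughly like this; these commands are illustrative, not taken from this diff:

```bash
# Restore the node_modules cache before the build; upload it afterwards
# if it had to be regenerated (mode and cache name per the usage string).
./test-infra/s3_cache.py download npm-modules
./test-infra/s3_cache.py upload npm-modules
```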
diff --git a/test-infra/S3Cachefile.json b/test-infra/S3Cachefile.json
deleted file mode 100644
index 5cda49a447..0000000000
--- a/test-infra/S3Cachefile.json
+++ /dev/null
@@ -1,7 +0,0 @@
-{
-    "npm-modules": {
-        "key": "./npm-shrinkwrap.json",
-        "cache": "../node_modules",
-        "generate": "./uncached-npm-install.sh"
-    }
-}
diff --git a/test-infra/requirements.txt b/test-infra/requirements.txt
deleted file mode 100644
index fe44343da2..0000000000
--- a/test-infra/requirements.txt
+++ /dev/null
@@ -1 +0,0 @@
-boto==2.25.0
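To make the `S3Cachefile.json` entry above concrete, here is the fetch-or-generate flow that `s3_cache.py` (next in the diff) automates, sketched by hand. `fetch-from-s3` is a placeholder, not a real command; the `tar` invocations mirror the script's `_create_tarball`/`_extract_tarball` helpers:

```bash
# Sketch of the flow for the npm-modules entry; run from test-infra/.
key=$(sha256sum npm-shrinkwrap.json | awk '{print $1}')  # the cache key
if fetch-from-s3 "$key" node_modules.tar.gz; then        # placeholder fetch
    tar -xzf node_modules.tar.gz -C ..                   # restore ../node_modules
else
    ./uncached-npm-install.sh                            # the `generate` command
    tar -czf node_modules.tar.gz -C .. node_modules      # re-tar for later upload
fi
```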
diff --git a/test-infra/s3_cache.py b/test-infra/s3_cache.py
deleted file mode 100755
index eaa37992db..0000000000
--- a/test-infra/s3_cache.py
+++ /dev/null
@@ -1,184 +0,0 @@
-#!/usr/bin/env python2.7
-# pylint: disable=C0301
-from __future__ import absolute_import, unicode_literals, print_function, division
-
-from sys import argv
-from os import environ, stat, chdir, remove as _delete_file
-from os.path import dirname, basename, abspath, realpath, expandvars
-from hashlib import sha256
-from subprocess import check_call as run
-from json import load, dump as save
-from contextlib import contextmanager
-from datetime import datetime
-
-from boto.s3.connection import S3Connection
-from boto.s3.key import Key
-from boto.exception import S3ResponseError
-
-
-CONFIG_FILE = './S3Cachefile.json'
-UPLOAD_TODO_FILE = './S3CacheTodo.json'
-BYTES_PER_MB = 1024 * 1024
-
-
-@contextmanager
-def timer():
-    start = datetime.utcnow()
-    yield
-    end = datetime.utcnow()
-    elapsed = end - start
-    print("\tDone. Took", int(elapsed.total_seconds()), "second(s).")
-
-
-@contextmanager
-def todo_file(writeback=True):
-    try:
-        with open(UPLOAD_TODO_FILE, 'rt') as json_file:
-            todo = load(json_file)
-    except (IOError, OSError, ValueError):
-        todo = {}
-
-    yield todo
-
-    if writeback:
-        try:
-            with open(UPLOAD_TODO_FILE, 'wt') as json_file:
-                save(todo, json_file)
-        except (OSError, IOError) as save_err:
-            print("Error saving {}:".format(UPLOAD_TODO_FILE), save_err)
-
-
-def _sha256_of_file(filename):
-    hasher = sha256()
-    with open(filename, 'rb') as input_file:
-        hasher.update(input_file.read())
-    file_hash = hasher.hexdigest()
-    print('sha256({}) = {}'.format(filename, file_hash))
-    return file_hash
-
-
-def _delete_file_quietly(filename):
-    try:
-        _delete_file(filename)
-    except (OSError, IOError):
-        pass
-
-
-def mark_needs_uploading(cache_name):
-    with todo_file() as todo:
-        todo[cache_name] = True
-
-
-def mark_uploaded(cache_name):
-    with todo_file() as todo:
-        todo.pop(cache_name, None)
-
-
-def need_to_upload(cache_name):
-    with todo_file(writeback=False) as todo:
-        return todo.get(cache_name, False)
-
-
-def _tarball_size(directory):
-    mib = stat(_tarball_filename_for(directory)).st_size // BYTES_PER_MB
-    return "{} MiB".format(mib)
-
-
-def _tarball_filename_for(directory):
-    return abspath('./{}.tar.gz'.format(basename(directory)))
-
-
-def _create_tarball(directory):
-    print("Creating tarball of {}...".format(directory))
-    with timer():
-        run(['tar', '-czf', _tarball_filename_for(directory), '-C', dirname(directory), basename(directory)])
-
-
-def _extract_tarball(directory):
-    print("Extracting tarball of {}...".format(directory))
-    with timer():
-        run(['tar', '-xzf', _tarball_filename_for(directory), '-C', dirname(directory)])
-
-
-def download(directory):
-    mark_uploaded(cache_name)  # reset
-    try:
-        print("Downloading {} tarball from S3...".format(cache_name))
-        with timer():
-            key.get_contents_to_filename(_tarball_filename_for(directory))
-    except S3ResponseError:
-        mark_needs_uploading(cache_name)
-        raise SystemExit("Cached {} download failed!".format(cache_name))
-    print("Downloaded {}.".format(_tarball_size(directory)))
-    _extract_tarball(directory)
-    print("{} successfully installed from cache.".format(cache_name))
-
-
-def upload(directory):
-    _create_tarball(directory)
-    print("Uploading {} tarball to S3... ({})".format(cache_name, _tarball_size(directory)))
-    with timer():
-        key.set_contents_from_filename(_tarball_filename_for(directory))
-    print("{} cache successfully updated.".format(cache_name))
-    mark_uploaded(cache_name)
-
-
-if __name__ == '__main__':
-    # Uses environment variables:
-    #   AWS_ACCESS_KEY_ID -- AWS Access Key ID
-    #   AWS_SECRET_ACCESS_KEY -- AWS Secret Access Key
-    argv.pop(0)
-    if len(argv) != 2:
-        raise SystemExit("USAGE: s3_cache.py <download | upload> <cache name>")
-    mode, cache_name = argv
-    script_dir = dirname(realpath(__file__))
-    chdir(script_dir)
-    try:
-        with open(CONFIG_FILE, 'rt') as config_file:
-            config = load(config_file)
-    except (IOError, OSError, ValueError) as config_err:
-        print(config_err)
-        raise SystemExit("Error when trying to load config from JSON file!")
-
-    try:
-        cache_info = config[cache_name]
-        key_file = expandvars(cache_info["key"])
-        fallback_cmd = cache_info["generate"]
-        directory = expandvars(cache_info["cache"])
-    except (TypeError, KeyError) as load_err:
-        print(load_err)
-        raise SystemExit("Config for cache named {!r} is missing or malformed!".format(cache_name))
-
-    try:
-        try:
-            BUCKET_NAME = environ['TWBS_S3_BUCKET']
-        except KeyError:
-            raise SystemExit("TWBS_S3_BUCKET environment variable not set!")
-
-        conn = S3Connection()
-        bucket = conn.lookup(BUCKET_NAME)
-        if bucket is None:
-            raise SystemExit("Could not access bucket!")
-
-        key_file_hash = _sha256_of_file(key_file)
-
-        key = Key(bucket, key_file_hash)
-        key.storage_class = 'REDUCED_REDUNDANCY'
-
-        if mode == 'download':
-            download(directory)
-        elif mode == 'upload':
-            if need_to_upload(cache_name):
-                upload(directory)
-            else:
-                print("No need to upload anything.")
-        else:
-            raise SystemExit("Unrecognized mode {!r}".format(mode))
-    except BaseException as exc:
-        if mode != 'download':
-            raise
-        print("Error!:", exc)
-        print("Unable to download from cache.")
-        print("Running fallback command to generate cache directory {!r}: {}".format(directory, fallback_cmd))
-        with timer():
-            run(fallback_cmd, shell=True)
diff --git a/test-infra/uncached-npm-install.sh b/test-infra/uncached-npm-install.sh
deleted file mode 100755
index a2d41445d8..0000000000
--- a/test-infra/uncached-npm-install.sh
+++ /dev/null
@@ -1,15 +0,0 @@
-#!/bin/bash
-set -e
-cd .. # /bootstrap/
-cp test-infra/npm-shrinkwrap.json npm-shrinkwrap.json
-# npm is flaky, so try multiple times
-MAXTRIES=3
-TRIES=1
-while ! npm install; do
-    if [ $TRIES -ge $MAXTRIES ]; then
-        exit 1
-    fi
-    TRIES=$(($TRIES + 1))
-    echo "Retrying npm install (Try $TRIES of $MAXTRIES)..."
-done
-rm npm-shrinkwrap.json
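One subtlety in the deleted `s3_cache.py`: its outer handler catches `BaseException` rather than `Exception`, so even the `SystemExit`s it raises internally trigger the fallback path in `download` mode. A hypothetical session, with the output paraphrased from the script's `print` calls:

```bash
# With no bucket configured, a download run still succeeds by regenerating
# the cache via the configured `generate` command.
unset TWBS_S3_BUCKET
./test-infra/s3_cache.py download npm-modules
# Error!: TWBS_S3_BUCKET environment variable not set!
# Unable to download from cache.
# Running fallback command to generate cache directory: ./uncached-npm-install.sh
```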