# frozen_string_literal: true

require "active_storage/log_subscriber"
require "active_storage/downloader"
require "action_dispatch"
require "action_dispatch/http/content_disposition"
2017-07-09 11:04:28 -04:00
module ActiveStorage
# Abstract class serving as an interface for concrete services.
#
# The available services are:
#
# * +Disk+, to manage attachments saved directly on the hard drive.
# * +GCS+, to manage attachments through Google Cloud Storage.
# * +S3+, to manage attachments through Amazon S3.
# * +AzureStorage+, to manage attachments through Microsoft Azure Storage.
# * +Mirror+, to be able to use several services to manage attachments.
#
  # Inside a Rails application, you can set up your services through the
  # generated <tt>config/storage.yml</tt> file and reference one
  # of the aforementioned constants under the +service+ key. For example:
#
  #   local:
  #     service: Disk
  #     root: <%= Rails.root.join("storage") %>
#
  # You can check out each service's constructor to learn which keys are required.
#
# Then, in your application's configuration, you can specify the service to
# use like this:
#
  #   config.active_storage.service = :local
#
# If you are using Active Storage outside of a Ruby on Rails application, you
# can configure the service to use like this:
#
  #   ActiveStorage::Blob.service = ActiveStorage::Service.configure(
  #     :Disk,
  #     root: Pathname("/foo/bar/storage")
  #   )
class Service
    extend ActiveSupport::Autoload
    autoload :Configurator

    attr_accessor :name
class << self
# Configure an Active Storage service by name from a set of configurations,
# typically loaded from a YAML file. The Active Storage engine uses this
# to set the global Active Storage service when the app boots.
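      #
      # A minimal sketch of a call, assuming a +local+ entry in the
      # configurations Hash (the name and path here are illustrative):
      #
      #   configs = { local: { "service" => "Disk", "root" => "/tmp/storage" } }
      #   ActiveStorage::Service.configure(:local, configs)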
def configure(service_name, configurations)
Configurator.build(service_name, configurations)
      end

      # Override in subclasses that stitch together multiple services and hence
# need to build additional services using the configurator.
#
# Passes the configurator and all of the service's config as keyword args.
#
# See MirrorService for an example.
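      #
      # A sketch of such an override, loosely modeled on MirrorService
      # (names are illustrative):
      #
      #   def self.build(primary:, mirrors:, name:, configurator:, **)
      #     new(primary: configurator.build(primary),
      #         mirrors: mirrors.collect { |mirror| configurator.build(mirror) }).tap do |service|
      #       service.name = name
      #     end
      #   end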
def build(configurator:, name:, service: nil, **service_config) #:nodoc:
new(**service_config).tap do |service_instance|
service_instance.name = name
end
end
    end

    # Upload the +io+ to the +key+ specified. If a +checksum+ is provided, the service will
# ensure a match when the upload has completed or raise an ActiveStorage::IntegrityError.
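    #
    # A minimal sketch of a call site (the key and IO are illustrative;
    # Active Storage computes checksums as base64-encoded MD5 digests):
    #
    #   io = StringIO.new("Hello world!")
    #   service.upload("some-key", io, checksum: Digest::MD5.base64digest(io.string))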
def upload(key, io, checksum: nil, **options)
raise NotImplementedError
end
# Update metadata for the file identified by +key+ in the service.
# Override in subclasses only if the service needs to store specific
# metadata that has to be updated upon identification.
def update_metadata(key, **metadata)
    end

    # Return the content of the file at the +key+.
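    # (For example, <tt>service.download(blob.key)</tt> returns the full file
    # body as a binary-encoded String; +blob+ here is illustrative.)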
def download(key)
raise NotImplementedError
    end

    # Return the partial content in the byte +range+ of the file at the +key+.
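    # For example (illustrative), <tt>service.download_chunk("some-key", 0..99)</tt>
    # fetches the first 100 bytes of the file stored at "some-key".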
def download_chunk(key, range)
raise NotImplementedError
end
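    # Download the file at the +key+ to a tempfile on disk and yield the
    # tempfile to the given block (delegates to ActiveStorage::Downloader).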
def open(*args, **options, &block)
ActiveStorage::Downloader.new(self).open(*args, **options, &block)
    end

    # Delete the file at the +key+.
def delete(key)
raise NotImplementedError
    end

    # Delete files at keys starting with the +prefix+.
def delete_prefixed(prefix)
raise NotImplementedError
    end

    # Return +true+ if a file exists at the +key+.
def exist?(key)
raise NotImplementedError
    end

    # Returns the URL for the file at the +key+: a permanent URL for public files, and a
    # short-lived URL for private files. You must provide the +disposition+ (+:inline+ or +:attachment+),
    # +filename+, and +content_type+ that you wish the file to be served with on request. For
    # private files, you must also provide the number of seconds the URL will be valid for, via +expires_in+.
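    #
    # A hypothetical call site (the key and values are illustrative):
    #
    #   service.url(
    #     blob.key,
    #     expires_in: 5.minutes,
    #     disposition: :inline,
    #     filename: ActiveStorage::Filename.new("avatar.png"),
    #     content_type: "image/png"
    #   )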
def url(key, **options)
instrument :url, key: key do |payload|
generated_url =
if public?
public_url(key, **options)
else
private_url(key, **options)
end
payload[:url] = generated_url
generated_url
end
    end

    # Returns a signed, temporary URL that a direct upload file can be PUT to on the +key+.
    # The URL will be valid for the number of seconds specified in +expires_in+.
    # You must also provide the +content_type+, +content_length+, and +checksum+ of the file
    # that will be uploaded. All of these attributes will be validated by the service upon upload.
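    #
    # A sketch of a call site (values are illustrative):
    #
    #   service.url_for_direct_upload(
    #     blob.key,
    #     expires_in: 5.minutes,
    #     content_type: "image/png",
    #     content_length: 1024,
    #     checksum: blob.checksum
    #   )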
def url_for_direct_upload(key, expires_in:, content_type:, content_length:, checksum:)
      raise NotImplementedError
    end

    # Returns a Hash of headers for +url_for_direct_upload+ requests.
def headers_for_direct_upload(key, filename:, content_type:, content_length:, checksum:)
      {}
    end
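    # Returns +true+ if the service is configured to serve files publicly
    # (concrete services are expected to set <tt>@public</tt> from their
    # configuration).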
def public?
@public
    end

    private
def private_url(key, expires_in:, filename:, disposition:, content_type:, **)
raise NotImplementedError
      end

      def public_url(key, **)
raise NotImplementedError
end
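      # Instruments the +operation+ as an ActiveSupport::Notifications event
      # named <tt>"service_#{operation}.active_storage"</tt>. A hypothetical
      # subscriber (illustrative, not part of this class):
      #
      #   ActiveSupport::Notifications.subscribe("service_url.active_storage") do |event|
      #     Rails.logger.debug "generated URL for key #{event.payload[:key]}"
      #   end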
def instrument(operation, payload = {}, &block)
ActiveSupport::Notifications.instrument(
"service_#{operation}.active_storage",
payload.merge(service: service_name), &block)
      end

      def service_name
# ActiveStorage::Service::DiskService => Disk
self.class.name.split("::").third.remove("Service")
end
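      # Builds a Content-Disposition header value for the given +type+
      # ("inline" or "attachment", defaulting to "inline") and +filename+
      # (an ActiveStorage::Filename, whose sanitized form is used). For
      # example (illustrative), a "report.pdf" attachment might format as
      # <tt>attachment; filename="report.pdf"</tt>.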
def content_disposition_with(type: "inline", filename:)
disposition = (type.to_s.presence_in(%w( attachment inline )) || "inline")
ActionDispatch::Http::ContentDisposition.format(disposition: disposition, filename: filename.sanitized)
end
end
end