gitlab-org--gitlab-foss/doc/administration/raketasks/uploads/sanitize.md
Jan Provaznik a466d97e62 Rake task for removing exif from uploads
Adds a rake task which can be used for removing EXIF
data from existing uploads.
2019-04-02 07:48:35 +00:00


Uploads Sanitize tasks

Requirements

You need exiftool installed on your system. If you installed GitLab:

  • Using the Omnibus package, you're all set.

  • From source, make sure exiftool is installed:

    # Debian/Ubuntu
    sudo apt-get install libimage-exiftool-perl
    
    # RHEL/CentOS
    sudo yum install perl-Image-ExifTool
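
To confirm that exiftool is available before running the task, a check along these lines may help. This is only a sketch: `have_cmd` is a hypothetical helper name, and the check only tells you whether exiftool is on the `PATH` of the current user, which may differ from the `PATH` seen by the `git` user.

```shell
# Hypothetical helper: succeeds if the named command is found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd exiftool; then
  # `exiftool -ver` prints the installed version number.
  echo "exiftool found, version: $(exiftool -ver)"
else
  echo "exiftool not found; install it before running the sanitize task" >&2
fi
```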
    

Remove EXIF data from existing uploads

Since GitLab 11.9, EXIF data is automatically stripped from JPG or TIFF image uploads. Because EXIF data may contain sensitive information (for example, GPS location), you can also remove it from existing images that were uploaded earlier, using the following command:

sudo RAILS_ENV=production -u git -H bundle exec rake gitlab:uploads:sanitize:remove_exif

By default, this command runs in dry-run mode and does not remove any EXIF data. Use it to check whether (and how many) images need to be sanitized.

The rake task accepts the following parameters:

| Parameter | Type | Description |
|---|---|---|
| start_id | integer | Only uploads with an ID equal to or greater than this value are processed. |
| stop_id | integer | Only uploads with an ID equal to or smaller than this value are processed. |
| dry_run | boolean | Do not remove EXIF data; only check whether EXIF data is present. Default: true. |
| sleep_time | float | Number of seconds to pause after processing each image. Default: 0.3. |

If you have a large number of uploads, you can speed up sanitization by setting sleep_time to a lower value or by running multiple rake tasks in parallel, each with a separate range of upload IDs (set with start_id and stop_id).
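
One way to prepare such parallel runs is to split the ID range into contiguous chunks and generate one command per chunk. The sketch below makes several assumptions: `split_id_range` is a hypothetical helper, the range 1 to 10000 and the choice of 4 chunks are placeholders for your own values, and the per-chunk log file names are illustrative. It only prints the commands so you can review them before running anything.

```shell
# Hypothetical helper: print "start stop" pairs that split the inclusive
# range [first, last] into roughly equal chunks, one pair per line.
split_id_range() {
  first=$1
  last=$2
  chunks=$3
  total=$(( last - first + 1 ))
  size=$(( (total + chunks - 1) / chunks ))  # ceiling division
  start=$first
  while [ "$start" -le "$last" ]; do
    stop=$(( start + size - 1 ))
    if [ "$stop" -gt "$last" ]; then
      stop=$last
    fi
    echo "$start $stop"
    start=$(( stop + 1 ))
  done
}

# Print (do not execute) one background command per chunk.
# 1, 10000 and 4 are placeholder values.
split_id_range 1 10000 4 | while read -r from to; do
  printf 'sudo RAILS_ENV=production -u git -H bundle exec rake gitlab:uploads:sanitize:remove_exif[%s,%s,false,0.1] > exif-%s-%s.log 2>&1 &\n' \
    "$from" "$to" "$from" "$to"
done
```

Writing each chunk to its own log file keeps the output of the parallel tasks from interleaving.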

To disable dry-run mode and remove EXIF data from all uploads, use:

sudo RAILS_ENV=production -u git -H bundle exec rake gitlab:uploads:sanitize:remove_exif[,,false,] 2>&1 | tee exif.log

To disable dry-run mode, process only uploads with IDs between 100 and 5000, and pause for 0.1 seconds after each image, use:

sudo RAILS_ENV=production -u git -H bundle exec rake gitlab:uploads:sanitize:remove_exif[100,5000,false,0.1] 2>&1 | tee exif.log
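
The bracketed arguments in these commands are positional and follow the order of the parameter list: start_id, stop_id, dry_run, sleep_time. A position left empty, as in `[,,false,]`, is passed as an empty value. A small wrapper can make the order harder to get wrong. This is a sketch only: `remove_exif_task` is a hypothetical helper that builds the task string and does not run anything.

```shell
# Hypothetical helper: build the rake task string from positional values
# in the documented order: start_id, stop_id, dry_run, sleep_time.
# Empty values are kept as empty positions.
remove_exif_task() {
  printf 'gitlab:uploads:sanitize:remove_exif[%s,%s,%s,%s]\n' \
    "${1-}" "${2-}" "${3-}" "${4-}"
}

# All uploads, dry-run mode disabled.
remove_exif_task "" "" false ""

# Uploads 100 to 5000, dry-run mode disabled, 0.1 second pause.
remove_exif_task 100 5000 false 0.1
```

The printed string can then be passed to `bundle exec rake` in place of the hand-written task name.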

Because the command output is likely to be long, it is also written to the exif.log file.

If sanitization fails for an upload, the output of the rake task should contain an error message. Typical causes are that the file is missing from storage or that it is not a valid image. Please report any issues at gitlab.com: prefix the issue title with 'EXIF' and include the error output and, if possible, the image.