gitlab-org--gitlab-foss/app/models/event.rb

410 lines
9.4 KiB
Ruby
Raw Normal View History

# frozen_string_literal: true
class Event < ApplicationRecord
include Sortable
include IgnorableColumn
include FromUnion
default_scope { reorder(nil) }
2013-02-13 06:48:16 -05:00
CREATED = 1
UPDATED = 2
CLOSED = 3
REOPENED = 4
PUSHED = 5
COMMENTED = 6
MERGED = 7
JOINED = 8 # User joined project
LEFT = 9 # User left project
DESTROYED = 10
EXPIRED = 11 # User left project due to expiry
ACTIONS = HashWithIndifferentAccess.new(
created: CREATED,
updated: UPDATED,
closed: CLOSED,
reopened: REOPENED,
pushed: PUSHED,
commented: COMMENTED,
merged: MERGED,
joined: JOINED,
left: LEFT,
destroyed: DESTROYED,
expired: EXPIRED
).freeze
TARGET_TYPES = HashWithIndifferentAccess.new(
issue: Issue,
milestone: Milestone,
merge_request: MergeRequest,
note: Note,
project: Project,
snippet: Snippet,
user: User
).freeze
RESET_PROJECT_ACTIVITY_INTERVAL = 1.hour
REPOSITORY_UPDATED_AT_INTERVAL = 5.minutes
delegate :name, :email, :public_email, :username, to: :author, prefix: true, allow_nil: true
2012-09-27 02:20:36 -04:00
delegate :title, to: :issue, prefix: true, allow_nil: true
delegate :title, to: :merge_request, prefix: true, allow_nil: true
delegate :title, to: :note, prefix: true, allow_nil: true
2012-09-27 02:20:36 -04:00
2012-09-27 16:23:11 -04:00
belongs_to :author, class_name: "User"
2012-02-28 08:09:23 -05:00
belongs_to :project
belongs_to :target, -> {
# If the association for "target" defines an "author" association we want to
# eager-load this so Banzai & friends don't end up performing N+1 queries to
# get the authors of notes, issues, etc. (likewise for "noteable").
incs = %i(author noteable).select do |a|
reflections['events'].active_record.reflect_on_association(a)
end
incs.reduce(self) { |obj, a| obj.includes(a) }
}, polymorphic: true # rubocop:disable Cop/PolymorphicAssociations
Rework how recent push events are retrieved Whenever you push to a branch GitLab will show a button to create a merge request (should one not exist already). The underlying code to display this data was quite inefficient. For example, it involved multiple slow queries just to figure out what the most recent push event was. This commit changes the way this data is retrieved so it's much faster. This is achieved by caching the ID of the last push event on every push, which is then retrieved when loading certain pages. Database queries are only executed if necessary and the cached data is removed automatically once a merge request has been created, or 2 hours after being stored. A trade-off of this approach is that we _only_ track the last event. Previously if you were to push to branch A and B then create a merge request for branch B we'd still show the widget for branch A. As of this commit this is no longer the case, instead we will only show the widget for the branch you pushed to most recently. Once a merge request exists the widget is no longer displayed. Alternative solutions are either too complex and/or too slow, hence the decision was made to settle for this trade-off. Performance Impact ------------------ In the best case scenario (= a user didn't push anything for more than 2 hours) we perform a single Redis GET per page. Should there be cached data we will run a single (and lightweight) SQL query to get the event data from the database. If a merge request already exists we will run an additional DEL to remove the cache key. The difference in response timings can vary a bit per project. On GitLab.com the 99th percentile of time spent in User#recent_push hovers between 100 milliseconds and 1 second, while the mean hovers around 50 milliseconds. With the changes in this MR the expected time spent in User#recent_push is expected to be reduced down to just a few milliseconds. Fixes https://gitlab.com/gitlab-org/gitlab-ce/issues/35990
2017-09-01 06:50:14 -04:00
has_one :push_event_payload
2014-06-17 14:23:43 -04:00
# Callbacks
after_create :reset_project_activity
after_create :set_last_repository_updated_at, if: :push_action?
after_create :track_user_interacted_projects
2014-06-17 14:23:43 -04:00
2012-10-08 20:10:04 -04:00
# Scopes
scope :recent, -> { reorder(id: :desc) }
2013-02-13 06:48:16 -05:00
scope :code_push, -> { where(action: PUSHED) }
scope :in_projects, -> (projects) do
sub_query = projects
.except(:order)
.select(1)
.where('projects.id = events.project_id')
where('EXISTS (?)', sub_query).recent
end
Migrate events into a new format This commit migrates events data in such a way that push events are stored much more efficiently. This is done by creating a shadow table called "events_for_migration", and a table called "push_event_payloads" which is used for storing push data of push events. The background migration in this commit will copy events from the "events" table into the "events_for_migration" table, push events in will also have a row created in "push_event_payloads". This approach allows us to reclaim space in the next release by simply swapping the "events" and "events_for_migration" tables, then dropping the old events (now "events_for_migration") table. The new table structure is also optimised for storage space, and does not include the unused "title" column nor the "data" column (since this data is moved to "push_event_payloads"). == Newly Created Events Newly created events are inserted into both "events" and "events_for_migration", both using the exact same primary key value. The table "push_event_payloads" in turn has a foreign key to the _shadow_ table. This removes the need for recreating and validating the foreign key after swapping the tables. Since the shadow table also has a foreign key to "projects.id" we also don't have to worry about orphaned rows. This approach however does require some additional storage as we're duplicating a portion of the events data for at least 1 release. The exact amount is hard to estimate, but for GitLab.com this is expected to be between 10 and 20 GB at most. The background migration in this commit deliberately does _not_ update the "events" table as doing so would put a lot of pressure on PostgreSQL's auto vacuuming system. == Supporting Both Old And New Events Application code has also been adjusted to support push events using both the old and new data formats. This is done by creating a PushEvent class which extends the regular Event class. Using Rails' Single Table Inheritance system we can ensure the right class is used for the right data, which in this case is based on the value of `events.action`. To support displaying old and new data at the same time the PushEvent class re-defines a few methods of the Event class, falling back to their original implementations for push events in the old format. Once all existing events have been migrated the various push event related methods can be removed from the Event model, and the calls to `super` can be removed from the methods in the PushEvent model. The UI and event atom feed have also been slightly changed to better handle this new setup, fortunately only a few changes were necessary to make this work. == API Changes The API only displays push data of events in the new format. Supporting both formats in the API is a bit more difficult compared to the UI. Since the old push data was not really well documented (apart from one example that used an incorrect "action" nmae) I decided that supporting both was not worth the effort, especially since events will be migrated in a few days _and_ new events are created in the correct format.
2017-07-10 11:43:57 -04:00
scope :with_associations, -> do
# We're using preload for "push_event_payload" as otherwise the association
# is not always available (depending on the query being built).
includes(:author, :project, project: [:project_feature, :import_data, :namespace])
.preload(:target, :push_event_payload)
Migrate events into a new format This commit migrates events data in such a way that push events are stored much more efficiently. This is done by creating a shadow table called "events_for_migration", and a table called "push_event_payloads" which is used for storing push data of push events. The background migration in this commit will copy events from the "events" table into the "events_for_migration" table, push events in will also have a row created in "push_event_payloads". This approach allows us to reclaim space in the next release by simply swapping the "events" and "events_for_migration" tables, then dropping the old events (now "events_for_migration") table. The new table structure is also optimised for storage space, and does not include the unused "title" column nor the "data" column (since this data is moved to "push_event_payloads"). == Newly Created Events Newly created events are inserted into both "events" and "events_for_migration", both using the exact same primary key value. The table "push_event_payloads" in turn has a foreign key to the _shadow_ table. This removes the need for recreating and validating the foreign key after swapping the tables. Since the shadow table also has a foreign key to "projects.id" we also don't have to worry about orphaned rows. This approach however does require some additional storage as we're duplicating a portion of the events data for at least 1 release. The exact amount is hard to estimate, but for GitLab.com this is expected to be between 10 and 20 GB at most. The background migration in this commit deliberately does _not_ update the "events" table as doing so would put a lot of pressure on PostgreSQL's auto vacuuming system. == Supporting Both Old And New Events Application code has also been adjusted to support push events using both the old and new data formats. This is done by creating a PushEvent class which extends the regular Event class. Using Rails' Single Table Inheritance system we can ensure the right class is used for the right data, which in this case is based on the value of `events.action`. To support displaying old and new data at the same time the PushEvent class re-defines a few methods of the Event class, falling back to their original implementations for push events in the old format. Once all existing events have been migrated the various push event related methods can be removed from the Event model, and the calls to `super` can be removed from the methods in the PushEvent model. The UI and event atom feed have also been slightly changed to better handle this new setup, fortunately only a few changes were necessary to make this work. == API Changes The API only displays push data of events in the new format. Supporting both formats in the API is a bit more difficult compared to the UI. Since the old push data was not really well documented (apart from one example that used an incorrect "action" nmae) I decided that supporting both was not worth the effort, especially since events will be migrated in a few days _and_ new events are created in the correct format.
2017-07-10 11:43:57 -04:00
end
scope :for_milestone_id, ->(milestone_id) { where(target_type: "Milestone", target_id: milestone_id) }
# Authors are required as they're used to display who pushed data.
#
# We're just validating the presence of the ID here as foreign key constraints
# should ensure the ID points to a valid user.
validates :author_id, presence: true
Migrate events into a new format This commit migrates events data in such a way that push events are stored much more efficiently. This is done by creating a shadow table called "events_for_migration", and a table called "push_event_payloads" which is used for storing push data of push events. The background migration in this commit will copy events from the "events" table into the "events_for_migration" table, push events in will also have a row created in "push_event_payloads". This approach allows us to reclaim space in the next release by simply swapping the "events" and "events_for_migration" tables, then dropping the old events (now "events_for_migration") table. The new table structure is also optimised for storage space, and does not include the unused "title" column nor the "data" column (since this data is moved to "push_event_payloads"). == Newly Created Events Newly created events are inserted into both "events" and "events_for_migration", both using the exact same primary key value. The table "push_event_payloads" in turn has a foreign key to the _shadow_ table. This removes the need for recreating and validating the foreign key after swapping the tables. Since the shadow table also has a foreign key to "projects.id" we also don't have to worry about orphaned rows. This approach however does require some additional storage as we're duplicating a portion of the events data for at least 1 release. The exact amount is hard to estimate, but for GitLab.com this is expected to be between 10 and 20 GB at most. The background migration in this commit deliberately does _not_ update the "events" table as doing so would put a lot of pressure on PostgreSQL's auto vacuuming system. == Supporting Both Old And New Events Application code has also been adjusted to support push events using both the old and new data formats. This is done by creating a PushEvent class which extends the regular Event class. Using Rails' Single Table Inheritance system we can ensure the right class is used for the right data, which in this case is based on the value of `events.action`. To support displaying old and new data at the same time the PushEvent class re-defines a few methods of the Event class, falling back to their original implementations for push events in the old format. Once all existing events have been migrated the various push event related methods can be removed from the Event model, and the calls to `super` can be removed from the methods in the PushEvent model. The UI and event atom feed have also been slightly changed to better handle this new setup, fortunately only a few changes were necessary to make this work. == API Changes The API only displays push data of events in the new format. Supporting both formats in the API is a bit more difficult compared to the UI. Since the old push data was not really well documented (apart from one example that used an incorrect "action" nmae) I decided that supporting both was not worth the effort, especially since events will be migrated in a few days _and_ new events are created in the correct format.
2017-07-10 11:43:57 -04:00
self.inheritance_column = 'action'
2012-10-08 20:10:04 -04:00
class << self
def model_name
ActiveModel::Name.new(self, nil, 'event')
end
Migrate events into a new format This commit migrates events data in such a way that push events are stored much more efficiently. This is done by creating a shadow table called "events_for_migration", and a table called "push_event_payloads" which is used for storing push data of push events. The background migration in this commit will copy events from the "events" table into the "events_for_migration" table, push events in will also have a row created in "push_event_payloads". This approach allows us to reclaim space in the next release by simply swapping the "events" and "events_for_migration" tables, then dropping the old events (now "events_for_migration") table. The new table structure is also optimised for storage space, and does not include the unused "title" column nor the "data" column (since this data is moved to "push_event_payloads"). == Newly Created Events Newly created events are inserted into both "events" and "events_for_migration", both using the exact same primary key value. The table "push_event_payloads" in turn has a foreign key to the _shadow_ table. This removes the need for recreating and validating the foreign key after swapping the tables. Since the shadow table also has a foreign key to "projects.id" we also don't have to worry about orphaned rows. This approach however does require some additional storage as we're duplicating a portion of the events data for at least 1 release. The exact amount is hard to estimate, but for GitLab.com this is expected to be between 10 and 20 GB at most. The background migration in this commit deliberately does _not_ update the "events" table as doing so would put a lot of pressure on PostgreSQL's auto vacuuming system. == Supporting Both Old And New Events Application code has also been adjusted to support push events using both the old and new data formats. This is done by creating a PushEvent class which extends the regular Event class. Using Rails' Single Table Inheritance system we can ensure the right class is used for the right data, which in this case is based on the value of `events.action`. To support displaying old and new data at the same time the PushEvent class re-defines a few methods of the Event class, falling back to their original implementations for push events in the old format. Once all existing events have been migrated the various push event related methods can be removed from the Event model, and the calls to `super` can be removed from the methods in the PushEvent model. The UI and event atom feed have also been slightly changed to better handle this new setup, fortunately only a few changes were necessary to make this work. == API Changes The API only displays push data of events in the new format. Supporting both formats in the API is a bit more difficult compared to the UI. Since the old push data was not really well documented (apart from one example that used an incorrect "action" nmae) I decided that supporting both was not worth the effort, especially since events will be migrated in a few days _and_ new events are created in the correct format.
2017-07-10 11:43:57 -04:00
def find_sti_class(action)
if action.to_i == PUSHED
PushEvent
else
Event
end
end
# Update Gitlab::ContributionsCalendar#activity_dates if this changes
def contributions
where("action = ? OR (target_type IN (?) AND action IN (?)) OR (target_type = ? AND action = ?)",
Event::PUSHED,
2017-02-22 12:46:57 -05:00
%w(MergeRequest Issue), [Event::CREATED, Event::CLOSED, Event::MERGED],
"Note", Event::COMMENTED)
end
Faster way of obtaining latest event update time Instead of using MAX(events.updated_at) we can simply sort the events in descending order by the "id" column and grab the first row. In other words, instead of this: SELECT max(events.updated_at) AS max_id FROM events LEFT OUTER JOIN projects ON projects.id = events.project_id LEFT OUTER JOIN namespaces ON namespaces.id = projects.namespace_id WHERE events.author_id IS NOT NULL AND events.project_id IN (13083); we can use this: SELECT events.updated_at AS max_id FROM events LEFT OUTER JOIN projects ON projects.id = events.project_id LEFT OUTER JOIN namespaces ON namespaces.id = projects.namespace_id WHERE events.author_id IS NOT NULL AND events.project_id IN (13083) ORDER BY events.id DESC LIMIT 1; This has the benefit that on PostgreSQL a backwards index scan can be used, which due to the "LIMIT 1" will at most process only a single row. This in turn greatly speeds up the process of grabbing the latest update time. This can be confirmed by looking at the query plans. The first query produces the following plan: Aggregate (cost=43779.84..43779.85 rows=1 width=12) (actual time=2142.462..2142.462 rows=1 loops=1) -> Index Scan using index_events_on_project_id on events (cost=0.43..43704.69 rows=30060 width=12) (actual time=0.033..2138.086 rows=32769 loops=1) Index Cond: (project_id = 13083) Filter: (author_id IS NOT NULL) Planning time: 1.248 ms Execution time: 2142.548 ms The second query in turn produces the following plan: Limit (cost=0.43..41.65 rows=1 width=16) (actual time=1.394..1.394 rows=1 loops=1) -> Index Scan Backward using events_pkey on events (cost=0.43..1238907.96 rows=30060 width=16) (actual time=1.394..1.394 rows=1 loops=1) Filter: ((author_id IS NOT NULL) AND (project_id = 13083)) Rows Removed by Filter: 2104 Planning time: 0.166 ms Execution time: 1.408 ms According to the above plans the 2nd query is around 1500 times faster. However, re-running the first query produces timings of around 80 ms, making the 2nd query "only" around 55 times faster.
2015-11-11 11:14:47 -05:00
def limit_recent(limit = 20, offset = nil)
recent.limit(limit).offset(offset)
end
def actions
ACTIONS.keys
end
def target_types
TARGET_TYPES.keys
end
end
2018-09-21 11:16:13 -04:00
# rubocop:disable Metrics/CyclomaticComplexity
# rubocop:disable Metrics/PerceivedComplexity
def visible_to_user?(user = nil)
if push_action? || commit_note?
Ability.allowed?(user, :download_code, project)
elsif membership_changed?
Ability.allowed?(user, :read_project, project)
elsif created_project_action?
Ability.allowed?(user, :read_project, project)
elsif issue? || issue_note?
2016-08-08 14:55:13 -04:00
Ability.allowed?(user, :read_issue, note? ? note_target : target)
2016-10-04 08:52:08 -04:00
elsif merge_request? || merge_request_note?
Ability.allowed?(user, :read_merge_request, note? ? note_target : target)
2018-09-21 11:16:13 -04:00
elsif personal_snippet_note?
Ability.allowed?(user, :read_personal_snippet, note_target)
elsif project_snippet_note?
Ability.allowed?(user, :read_project_snippet, note_target)
elsif milestone?
Ability.allowed?(user, :read_milestone, project)
else
false # No other event types are visible
end
2012-03-01 13:47:57 -05:00
end
2018-09-21 11:16:13 -04:00
# rubocop:enable Metrics/PerceivedComplexity
# rubocop:enable Metrics/CyclomaticComplexity
2012-03-01 13:47:57 -05:00
def project_name
if project
project.full_name
else
2012-09-30 09:04:43 -04:00
"(deleted project)"
end
end
def target_title
target.try(:title)
end
def created_action?
action == CREATED
end
def push_action?
false
end
def merged_action?
action == MERGED
end
def closed_action?
action == CLOSED
end
def reopened_action?
action == REOPENED
end
def joined_action?
action == JOINED
end
def left_action?
action == LEFT
end
def expired_action?
action == EXPIRED
end
def destroyed_action?
action == DESTROYED
end
def commented_action?
action == COMMENTED
end
def membership_changed?
joined_action? || left_action? || expired_action?
end
def created_project_action?
created_action? && !target && target_type.nil?
end
def created_target?
created_action? && target
end
def milestone?
target_type == "Milestone"
end
def note?
target.is_a?(Note)
end
def issue?
target_type == "Issue"
end
def merge_request?
target_type == "MergeRequest"
2012-03-05 17:29:40 -05:00
end
def milestone
target if milestone?
2012-09-09 16:18:28 -04:00
end
def issue
target if issue?
end
def merge_request
target if merge_request?
end
def note
target if note?
end
def action_name
if push_action?
push_action_name
elsif closed_action?
"closed"
elsif merged_action?
"accepted"
elsif joined_action?
2012-09-09 16:18:28 -04:00
'joined'
elsif left_action?
2012-09-09 17:27:47 -04:00
'left'
elsif expired_action?
'removed due to membership expiration from'
elsif destroyed_action?
'destroyed'
elsif commented_action?
"commented on"
elsif created_project_action?
created_project_action_name
else
"opened"
end
end
2013-01-02 16:35:11 -05:00
def target_iid
target.respond_to?(:iid) ? target.iid : target_id
end
def commit_note?
note? && target && target.for_commit?
2013-01-02 16:35:11 -05:00
end
def issue_note?
2016-05-08 14:46:43 -04:00
note? && target && target.for_issue?
end
2016-10-04 08:52:08 -04:00
def merge_request_note?
note? && target && target.for_merge_request?
end
def project_snippet_note?
note? && target && target.for_snippet?
2013-03-25 07:58:09 -04:00
end
2018-09-21 11:16:13 -04:00
def personal_snippet_note?
note? && target && target.for_personal_snippet?
end
2013-01-02 16:35:11 -05:00
def note_target
target.noteable
end
def note_target_id
if commit_note?
2013-01-02 16:35:11 -05:00
target.commit_id
else
target.noteable_id.to_s
end
end
def note_target_reference
return unless note_target
# Commit#to_reference returns the full SHA, but we want the short one here
if commit_note?
note_target.short_id
else
note_target.to_reference
end
end
2013-01-02 16:35:11 -05:00
def note_target_type
if target.noteable_type.present?
target.noteable_type.titleize
else
"Wall"
end.downcase
end
2013-08-20 14:31:37 -04:00
def body?
if push_action?
push_with_commits?
2013-08-20 14:31:37 -04:00
elsif note?
true
else
target.respond_to? :title
end
end
2014-06-17 14:23:43 -04:00
def reset_project_activity
return unless project
# Don't bother updating if we know the project was updated recently.
return if recent_update?
# At this point it's possible for multiple threads/processes to try to
# update the project. Only one query should actually perform the update,
# hence we add the extra WHERE clause for last_activity_at.
2017-06-21 09:48:12 -04:00
Project.unscoped.where(id: project_id)
.where('last_activity_at <= ?', RESET_PROJECT_ACTIVITY_INTERVAL.ago)
.update_all(last_activity_at: created_at)
end
def authored_by?(user)
user ? author_id == user.id : false
end
def to_partial_path
# We are intentionally using `Event` rather than `self.class` so that
# subclasses also use the `Event` implementation.
Event._to_partial_path
end
private
def push_action_name
if new_ref?
"pushed new"
elsif rm_ref?
"deleted"
else
"pushed to"
end
end
def created_project_action_name
if project.external_import?
"imported"
else
"created"
end
end
def recent_update?
project.last_activity_at > RESET_PROJECT_ACTIVITY_INTERVAL.ago
end
def set_last_repository_updated_at
2017-06-21 09:48:12 -04:00
Project.unscoped.where(id: project_id)
.where("last_repository_updated_at < ? OR last_repository_updated_at IS NULL", REPOSITORY_UPDATED_AT_INTERVAL.ago)
2017-06-21 09:48:12 -04:00
.update_all(last_repository_updated_at: created_at)
end
def track_user_interacted_projects
# Note the call to .available? is due to earlier migrations
# that would otherwise conflict with the call to .track
# (because the table does not exist yet).
2018-03-03 14:00:05 -05:00
UserInteractedProject.track(self) if UserInteractedProject.available?
end
2012-02-28 08:09:23 -05:00
end