gitlab-org--gitlab-foss/app/models/event.rb

372 lines
7.8 KiB
Ruby
Raw Normal View History

2012-02-28 08:09:23 -05:00
class Event < ActiveRecord::Base
include Sortable
include IgnorableColumn
default_scope { reorder(nil).where.not(author_id: nil) }
2013-02-13 06:48:16 -05:00
CREATED = 1
UPDATED = 2
CLOSED = 3
REOPENED = 4
PUSHED = 5
COMMENTED = 6
MERGED = 7
JOINED = 8 # User joined project
LEFT = 9 # User left project
DESTROYED = 10
EXPIRED = 11 # User left project due to expiry
ACTIONS = HashWithIndifferentAccess.new(
created: CREATED,
updated: UPDATED,
closed: CLOSED,
reopened: REOPENED,
pushed: PUSHED,
commented: COMMENTED,
merged: MERGED,
joined: JOINED,
left: LEFT,
destroyed: DESTROYED,
expired: EXPIRED
).freeze
TARGET_TYPES = HashWithIndifferentAccess.new(
issue: Issue,
milestone: Milestone,
merge_request: MergeRequest,
note: Note,
project: Project,
snippet: Snippet,
user: User
).freeze
RESET_PROJECT_ACTIVITY_INTERVAL = 1.hour
delegate :name, :email, :public_email, :username, to: :author, prefix: true, allow_nil: true
2012-09-27 02:20:36 -04:00
delegate :title, to: :issue, prefix: true, allow_nil: true
delegate :title, to: :merge_request, prefix: true, allow_nil: true
delegate :title, to: :note, prefix: true, allow_nil: true
2012-09-27 02:20:36 -04:00
2012-09-27 16:23:11 -04:00
belongs_to :author, class_name: "User"
2012-02-28 08:09:23 -05:00
belongs_to :project
belongs_to :target, polymorphic: true # rubocop:disable Cop/PolymorphicAssociations
Migrate events into a new format This commit migrates events data in such a way that push events are stored much more efficiently. This is done by creating a shadow table called "events_for_migration", and a table called "push_event_payloads" which is used for storing push data of push events. The background migration in this commit will copy events from the "events" table into the "events_for_migration" table, push events in will also have a row created in "push_event_payloads". This approach allows us to reclaim space in the next release by simply swapping the "events" and "events_for_migration" tables, then dropping the old events (now "events_for_migration") table. The new table structure is also optimised for storage space, and does not include the unused "title" column nor the "data" column (since this data is moved to "push_event_payloads"). == Newly Created Events Newly created events are inserted into both "events" and "events_for_migration", both using the exact same primary key value. The table "push_event_payloads" in turn has a foreign key to the _shadow_ table. This removes the need for recreating and validating the foreign key after swapping the tables. Since the shadow table also has a foreign key to "projects.id" we also don't have to worry about orphaned rows. This approach however does require some additional storage as we're duplicating a portion of the events data for at least 1 release. The exact amount is hard to estimate, but for GitLab.com this is expected to be between 10 and 20 GB at most. The background migration in this commit deliberately does _not_ update the "events" table as doing so would put a lot of pressure on PostgreSQL's auto vacuuming system. == Supporting Both Old And New Events Application code has also been adjusted to support push events using both the old and new data formats. This is done by creating a PushEvent class which extends the regular Event class. Using Rails' Single Table Inheritance system we can ensure the right class is used for the right data, which in this case is based on the value of `events.action`. To support displaying old and new data at the same time the PushEvent class re-defines a few methods of the Event class, falling back to their original implementations for push events in the old format. Once all existing events have been migrated the various push event related methods can be removed from the Event model, and the calls to `super` can be removed from the methods in the PushEvent model. The UI and event atom feed have also been slightly changed to better handle this new setup, fortunately only a few changes were necessary to make this work. == API Changes The API only displays push data of events in the new format. Supporting both formats in the API is a bit more difficult compared to the UI. Since the old push data was not really well documented (apart from one example that used an incorrect "action" nmae) I decided that supporting both was not worth the effort, especially since events will be migrated in a few days _and_ new events are created in the correct format.
2017-07-10 11:43:57 -04:00
has_one :push_event_payload, foreign_key: :event_id
2014-06-17 14:23:43 -04:00
# Callbacks
after_create :reset_project_activity
after_create :set_last_repository_updated_at, if: :push?
2014-06-17 14:23:43 -04:00
2012-10-08 20:10:04 -04:00
# Scopes
scope :recent, -> { reorder(id: :desc) }
2013-02-13 06:48:16 -05:00
scope :code_push, -> { where(action: PUSHED) }
scope :in_projects, -> (projects) do
sub_query = projects
.except(:order)
.select(1)
.where('projects.id = events.project_id')
where('EXISTS (?)', sub_query).recent
end
Migrate events into a new format This commit migrates events data in such a way that push events are stored much more efficiently. This is done by creating a shadow table called "events_for_migration", and a table called "push_event_payloads" which is used for storing push data of push events. The background migration in this commit will copy events from the "events" table into the "events_for_migration" table, push events in will also have a row created in "push_event_payloads". This approach allows us to reclaim space in the next release by simply swapping the "events" and "events_for_migration" tables, then dropping the old events (now "events_for_migration") table. The new table structure is also optimised for storage space, and does not include the unused "title" column nor the "data" column (since this data is moved to "push_event_payloads"). == Newly Created Events Newly created events are inserted into both "events" and "events_for_migration", both using the exact same primary key value. The table "push_event_payloads" in turn has a foreign key to the _shadow_ table. This removes the need for recreating and validating the foreign key after swapping the tables. Since the shadow table also has a foreign key to "projects.id" we also don't have to worry about orphaned rows. This approach however does require some additional storage as we're duplicating a portion of the events data for at least 1 release. The exact amount is hard to estimate, but for GitLab.com this is expected to be between 10 and 20 GB at most. The background migration in this commit deliberately does _not_ update the "events" table as doing so would put a lot of pressure on PostgreSQL's auto vacuuming system. == Supporting Both Old And New Events Application code has also been adjusted to support push events using both the old and new data formats. This is done by creating a PushEvent class which extends the regular Event class. Using Rails' Single Table Inheritance system we can ensure the right class is used for the right data, which in this case is based on the value of `events.action`. To support displaying old and new data at the same time the PushEvent class re-defines a few methods of the Event class, falling back to their original implementations for push events in the old format. Once all existing events have been migrated the various push event related methods can be removed from the Event model, and the calls to `super` can be removed from the methods in the PushEvent model. The UI and event atom feed have also been slightly changed to better handle this new setup, fortunately only a few changes were necessary to make this work. == API Changes The API only displays push data of events in the new format. Supporting both formats in the API is a bit more difficult compared to the UI. Since the old push data was not really well documented (apart from one example that used an incorrect "action" nmae) I decided that supporting both was not worth the effort, especially since events will be migrated in a few days _and_ new events are created in the correct format.
2017-07-10 11:43:57 -04:00
scope :with_associations, -> do
# We're using preload for "push_event_payload" as otherwise the association
# is not always available (depending on the query being built).
includes(:author, :project, project: :namespace)
.preload(:target, :push_event_payload)
end
scope :for_milestone_id, ->(milestone_id) { where(target_type: "Milestone", target_id: milestone_id) }
Migrate events into a new format This commit migrates events data in such a way that push events are stored much more efficiently. This is done by creating a shadow table called "events_for_migration", and a table called "push_event_payloads" which is used for storing push data of push events. The background migration in this commit will copy events from the "events" table into the "events_for_migration" table, push events in will also have a row created in "push_event_payloads". This approach allows us to reclaim space in the next release by simply swapping the "events" and "events_for_migration" tables, then dropping the old events (now "events_for_migration") table. The new table structure is also optimised for storage space, and does not include the unused "title" column nor the "data" column (since this data is moved to "push_event_payloads"). == Newly Created Events Newly created events are inserted into both "events" and "events_for_migration", both using the exact same primary key value. The table "push_event_payloads" in turn has a foreign key to the _shadow_ table. This removes the need for recreating and validating the foreign key after swapping the tables. Since the shadow table also has a foreign key to "projects.id" we also don't have to worry about orphaned rows. This approach however does require some additional storage as we're duplicating a portion of the events data for at least 1 release. The exact amount is hard to estimate, but for GitLab.com this is expected to be between 10 and 20 GB at most. The background migration in this commit deliberately does _not_ update the "events" table as doing so would put a lot of pressure on PostgreSQL's auto vacuuming system. == Supporting Both Old And New Events Application code has also been adjusted to support push events using both the old and new data formats. This is done by creating a PushEvent class which extends the regular Event class. Using Rails' Single Table Inheritance system we can ensure the right class is used for the right data, which in this case is based on the value of `events.action`. To support displaying old and new data at the same time the PushEvent class re-defines a few methods of the Event class, falling back to their original implementations for push events in the old format. Once all existing events have been migrated the various push event related methods can be removed from the Event model, and the calls to `super` can be removed from the methods in the PushEvent model. The UI and event atom feed have also been slightly changed to better handle this new setup, fortunately only a few changes were necessary to make this work. == API Changes The API only displays push data of events in the new format. Supporting both formats in the API is a bit more difficult compared to the UI. Since the old push data was not really well documented (apart from one example that used an incorrect "action" nmae) I decided that supporting both was not worth the effort, especially since events will be migrated in a few days _and_ new events are created in the correct format.
2017-07-10 11:43:57 -04:00
self.inheritance_column = 'action'
# "data" will be removed in 10.0 but it may be possible that JOINs happen that
# include this column, hence we're ignoring it as well.
ignore_column :data
2012-10-08 20:10:04 -04:00
class << self
def model_name
ActiveModel::Name.new(self, nil, 'event')
end
Migrate events into a new format This commit migrates events data in such a way that push events are stored much more efficiently. This is done by creating a shadow table called "events_for_migration", and a table called "push_event_payloads" which is used for storing push data of push events. The background migration in this commit will copy events from the "events" table into the "events_for_migration" table, push events in will also have a row created in "push_event_payloads". This approach allows us to reclaim space in the next release by simply swapping the "events" and "events_for_migration" tables, then dropping the old events (now "events_for_migration") table. The new table structure is also optimised for storage space, and does not include the unused "title" column nor the "data" column (since this data is moved to "push_event_payloads"). == Newly Created Events Newly created events are inserted into both "events" and "events_for_migration", both using the exact same primary key value. The table "push_event_payloads" in turn has a foreign key to the _shadow_ table. This removes the need for recreating and validating the foreign key after swapping the tables. Since the shadow table also has a foreign key to "projects.id" we also don't have to worry about orphaned rows. This approach however does require some additional storage as we're duplicating a portion of the events data for at least 1 release. The exact amount is hard to estimate, but for GitLab.com this is expected to be between 10 and 20 GB at most. The background migration in this commit deliberately does _not_ update the "events" table as doing so would put a lot of pressure on PostgreSQL's auto vacuuming system. == Supporting Both Old And New Events Application code has also been adjusted to support push events using both the old and new data formats. This is done by creating a PushEvent class which extends the regular Event class. Using Rails' Single Table Inheritance system we can ensure the right class is used for the right data, which in this case is based on the value of `events.action`. To support displaying old and new data at the same time the PushEvent class re-defines a few methods of the Event class, falling back to their original implementations for push events in the old format. Once all existing events have been migrated the various push event related methods can be removed from the Event model, and the calls to `super` can be removed from the methods in the PushEvent model. The UI and event atom feed have also been slightly changed to better handle this new setup, fortunately only a few changes were necessary to make this work. == API Changes The API only displays push data of events in the new format. Supporting both formats in the API is a bit more difficult compared to the UI. Since the old push data was not really well documented (apart from one example that used an incorrect "action" nmae) I decided that supporting both was not worth the effort, especially since events will be migrated in a few days _and_ new events are created in the correct format.
2017-07-10 11:43:57 -04:00
def find_sti_class(action)
if action.to_i == PUSHED
PushEvent
else
Event
end
end
def subclass_from_attributes(attrs)
# Without this Rails will keep calling this method on the returned class,
# resulting in an infinite loop.
return unless self == Event
action = attrs.with_indifferent_access[inheritance_column].to_i
PushEvent if action == PUSHED
end
# Update Gitlab::ContributionsCalendar#activity_dates if this changes
def contributions
where("action = ? OR (target_type IN (?) AND action IN (?)) OR (target_type = ? AND action = ?)",
Event::PUSHED,
2017-02-22 12:46:57 -05:00
%w(MergeRequest Issue), [Event::CREATED, Event::CLOSED, Event::MERGED],
"Note", Event::COMMENTED)
end
Faster way of obtaining latest event update time Instead of using MAX(events.updated_at) we can simply sort the events in descending order by the "id" column and grab the first row. In other words, instead of this: SELECT max(events.updated_at) AS max_id FROM events LEFT OUTER JOIN projects ON projects.id = events.project_id LEFT OUTER JOIN namespaces ON namespaces.id = projects.namespace_id WHERE events.author_id IS NOT NULL AND events.project_id IN (13083); we can use this: SELECT events.updated_at AS max_id FROM events LEFT OUTER JOIN projects ON projects.id = events.project_id LEFT OUTER JOIN namespaces ON namespaces.id = projects.namespace_id WHERE events.author_id IS NOT NULL AND events.project_id IN (13083) ORDER BY events.id DESC LIMIT 1; This has the benefit that on PostgreSQL a backwards index scan can be used, which due to the "LIMIT 1" will at most process only a single row. This in turn greatly speeds up the process of grabbing the latest update time. This can be confirmed by looking at the query plans. The first query produces the following plan: Aggregate (cost=43779.84..43779.85 rows=1 width=12) (actual time=2142.462..2142.462 rows=1 loops=1) -> Index Scan using index_events_on_project_id on events (cost=0.43..43704.69 rows=30060 width=12) (actual time=0.033..2138.086 rows=32769 loops=1) Index Cond: (project_id = 13083) Filter: (author_id IS NOT NULL) Planning time: 1.248 ms Execution time: 2142.548 ms The second query in turn produces the following plan: Limit (cost=0.43..41.65 rows=1 width=16) (actual time=1.394..1.394 rows=1 loops=1) -> Index Scan Backward using events_pkey on events (cost=0.43..1238907.96 rows=30060 width=16) (actual time=1.394..1.394 rows=1 loops=1) Filter: ((author_id IS NOT NULL) AND (project_id = 13083)) Rows Removed by Filter: 2104 Planning time: 0.166 ms Execution time: 1.408 ms According to the above plans the 2nd query is around 1500 times faster. However, re-running the first query produces timings of around 80 ms, making the 2nd query "only" around 55 times faster.
2015-11-11 11:14:47 -05:00
def limit_recent(limit = 20, offset = nil)
recent.limit(limit).offset(offset)
end
def actions
ACTIONS.keys
end
def target_types
TARGET_TYPES.keys
end
end
def visible_to_user?(user = nil)
if push? || commit_note?
Ability.allowed?(user, :download_code, project)
elsif membership_changed?
true
elsif created_project?
true
elsif issue? || issue_note?
2016-08-08 14:55:13 -04:00
Ability.allowed?(user, :read_issue, note? ? note_target : target)
2016-10-04 08:52:08 -04:00
elsif merge_request? || merge_request_note?
Ability.allowed?(user, :read_merge_request, note? ? note_target : target)
else
2016-10-04 08:52:08 -04:00
milestone?
end
2012-03-01 13:47:57 -05:00
end
def project_name
if project
2013-06-22 09:00:39 -04:00
project.name_with_namespace
else
2012-09-30 09:04:43 -04:00
"(deleted project)"
end
end
def target_title
target.try(:title)
end
def created?
action == CREATED
end
def push?
false
end
def merged?
action == MERGED
end
def closed?
action == CLOSED
end
def reopened?
action == REOPENED
end
def joined?
action == JOINED
end
def left?
action == LEFT
end
def expired?
action == EXPIRED
end
def destroyed?
action == DESTROYED
end
def commented?
action == COMMENTED
end
def membership_changed?
joined? || left? || expired?
end
def created_project?
created? && !target && target_type.nil?
end
def created_target?
created? && target
end
def milestone?
target_type == "Milestone"
end
def note?
target.is_a?(Note)
end
def issue?
target_type == "Issue"
end
def merge_request?
target_type == "MergeRequest"
2012-03-05 17:29:40 -05:00
end
def milestone
target if milestone?
2012-09-09 16:18:28 -04:00
end
def issue
target if issue?
end
def merge_request
target if merge_request?
end
def note
target if note?
end
def action_name
if push?
if new_ref?
"pushed new"
elsif rm_ref?
"deleted"
else
"pushed to"
end
elsif closed?
"closed"
elsif merged?
"accepted"
2012-09-09 16:18:28 -04:00
elsif joined?
'joined'
2012-09-09 17:27:47 -04:00
elsif left?
'left'
elsif expired?
'removed due to membership expiration from'
elsif destroyed?
'destroyed'
elsif commented?
"commented on"
elsif created_project?
if project.external_import?
"imported"
else
"created"
end
else
"opened"
end
end
2013-01-02 16:35:11 -05:00
def target_iid
target.respond_to?(:iid) ? target.iid : target_id
end
def commit_note?
note? && target && target.for_commit?
2013-01-02 16:35:11 -05:00
end
def issue_note?
2016-05-08 14:46:43 -04:00
note? && target && target.for_issue?
end
2016-10-04 08:52:08 -04:00
def merge_request_note?
note? && target && target.for_merge_request?
end
def project_snippet_note?
note? && target && target.for_snippet?
2013-03-25 07:58:09 -04:00
end
2013-01-02 16:35:11 -05:00
def note_target
target.noteable
end
def note_target_id
if commit_note?
2013-01-02 16:35:11 -05:00
target.commit_id
else
target.noteable_id.to_s
end
end
def note_target_reference
return unless note_target
# Commit#to_reference returns the full SHA, but we want the short one here
if commit_note?
note_target.short_id
else
note_target.to_reference
end
end
2013-01-02 16:35:11 -05:00
def note_target_type
if target.noteable_type.present?
target.noteable_type.titleize
else
"Wall"
end.downcase
end
2013-08-20 14:31:37 -04:00
def body?
if push?
push_with_commits?
2013-08-20 14:31:37 -04:00
elsif note?
true
else
target.respond_to? :title
end
end
2014-06-17 14:23:43 -04:00
def reset_project_activity
return unless project
# Don't bother updating if we know the project was updated recently.
return if recent_update?
# At this point it's possible for multiple threads/processes to try to
# update the project. Only one query should actually perform the update,
# hence we add the extra WHERE clause for last_activity_at.
2017-06-21 09:48:12 -04:00
Project.unscoped.where(id: project_id)
.where('last_activity_at <= ?', RESET_PROJECT_ACTIVITY_INTERVAL.ago)
.update_all(last_activity_at: created_at)
end
def authored_by?(user)
user ? author_id == user.id : false
end
def to_partial_path
# We are intentionally using `Event` rather than `self.class` so that
# subclasses also use the `Event` implementation.
Event._to_partial_path
end
private
def recent_update?
project.last_activity_at > RESET_PROJECT_ACTIVITY_INTERVAL.ago
end
def set_last_repository_updated_at
2017-06-21 09:48:12 -04:00
Project.unscoped.where(id: project_id)
.update_all(last_repository_updated_at: created_at)
end
2012-02-28 08:09:23 -05:00
end