gitlab-org--gitlab-foss/app/models/cycle_analytics.rb

79 lines
2.5 KiB
Ruby
Raw Normal View History

class CycleAnalytics
attr_reader :from
def initialize(project, from:)
@from = from
Improve performance of the cycle analytics page. 1. These changes bring down page load time for 100 issues from more than a minute to about 1.5 seconds. 2. This entire commit is composed of these types of performance enhancements: - Cache relevant data in `IssueMetrics` wherever possible. - Cache relevant data in `MergeRequestMetrics` wherever possible. - Preload metrics 3. Given these improvements, we now only need to make 4 SQL calls: - Load all issues - Load all merge requests - Load all metrics for the issues - Load all metrics for the merge requests 4. A list of all the data points that are now being pre-calculated: a. The first time an issue is mentioned in a commit - In `GitPushService`, find all issues mentioned by the given commit using `ReferenceExtractor`. Set the `first_mentioned_in_commit_at` flag for each of them. - There seems to be a (pre-existing) bug here - files (and therefore commits) created using the Web CI don't have cross-references created, and issues are not closed even when the commit title is "Fixes #xx". b. The first time a merge request is deployed to production When a `Deployment` is created, find all merge requests that were merged in before the deployment, and set the `first_deployed_to_production_at` flag for each of them. c. The start / end time for a merge request pipeline Hook into the `Pipeline` state machine. When the `status` moves to `running`, find the merge requests whose tip commit matches the pipeline, and record the `latest_build_started_at` time for each of them. When the `status` moves to `success`, record the `latest_build_finished_at` time. d. The merge requests that close an issue - This was a big cause of the performance problems we were having with Cycle Analytics. We need to use `ReferenceExtractor` to make this calculation, which is slow when we have to run it on a large number of merge requests. - When a merge request is created, updated, or refreshed, find the issues it closes, and create an instance of `MergeRequestsClosingIssues`, which acts as a join model between merge requests and issues. - If a `MergeRequestsClosingIssues` instance links a merge request and an issue, that issue closes that merge request. 5. The `Queries` module was changed into a class, so we can cache the results of `issues` and `merge_requests_closing_issues` across various cycle analytics stages. 6. The code added in this commit is untested. Tests will be added in the next commit.
2016-09-15 08:59:36 +00:00
@queries = Queries.new(project)
end
def as_json(options = {})
{
issue: issue, plan: plan, code: code, test: test,
review: review, staging: staging, production: production
}
end
def issue
Improve performance of the cycle analytics page. 1. These changes bring down page load time for 100 issues from more than a minute to about 1.5 seconds. 2. This entire commit is composed of these types of performance enhancements: - Cache relevant data in `IssueMetrics` wherever possible. - Cache relevant data in `MergeRequestMetrics` wherever possible. - Preload metrics 3. Given these improvements, we now only need to make 4 SQL calls: - Load all issues - Load all merge requests - Load all metrics for the issues - Load all metrics for the merge requests 4. A list of all the data points that are now being pre-calculated: a. The first time an issue is mentioned in a commit - In `GitPushService`, find all issues mentioned by the given commit using `ReferenceExtractor`. Set the `first_mentioned_in_commit_at` flag for each of them. - There seems to be a (pre-existing) bug here - files (and therefore commits) created using the Web CI don't have cross-references created, and issues are not closed even when the commit title is "Fixes #xx". b. The first time a merge request is deployed to production When a `Deployment` is created, find all merge requests that were merged in before the deployment, and set the `first_deployed_to_production_at` flag for each of them. c. The start / end time for a merge request pipeline Hook into the `Pipeline` state machine. When the `status` moves to `running`, find the merge requests whose tip commit matches the pipeline, and record the `latest_build_started_at` time for each of them. When the `status` moves to `success`, record the `latest_build_finished_at` time. d. The merge requests that close an issue - This was a big cause of the performance problems we were having with Cycle Analytics. We need to use `ReferenceExtractor` to make this calculation, which is slow when we have to run it on a large number of merge requests. - When a merge request is created, updated, or refreshed, find the issues it closes, and create an instance of `MergeRequestsClosingIssues`, which acts as a join model between merge requests and issues. - If a `MergeRequestsClosingIssues` instance links a merge request and an issue, that issue closes that merge request. 5. The `Queries` module was changed into a class, so we can cache the results of `issues` and `merge_requests_closing_issues` across various cycle analytics stages. 6. The code added in this commit is untested. Tests will be added in the next commit.
2016-09-15 08:59:36 +00:00
calculate_metric(@queries.issues(created_after: @from),
-> (data_point) { data_point[:issue].created_at },
Improve performance of the cycle analytics page. 1. These changes bring down page load time for 100 issues from more than a minute to about 1.5 seconds. 2. This entire commit is composed of these types of performance enhancements: - Cache relevant data in `IssueMetrics` wherever possible. - Cache relevant data in `MergeRequestMetrics` wherever possible. - Preload metrics 3. Given these improvements, we now only need to make 4 SQL calls: - Load all issues - Load all merge requests - Load all metrics for the issues - Load all metrics for the merge requests 4. A list of all the data points that are now being pre-calculated: a. The first time an issue is mentioned in a commit - In `GitPushService`, find all issues mentioned by the given commit using `ReferenceExtractor`. Set the `first_mentioned_in_commit_at` flag for each of them. - There seems to be a (pre-existing) bug here - files (and therefore commits) created using the Web CI don't have cross-references created, and issues are not closed even when the commit title is "Fixes #xx". b. The first time a merge request is deployed to production When a `Deployment` is created, find all merge requests that were merged in before the deployment, and set the `first_deployed_to_production_at` flag for each of them. c. The start / end time for a merge request pipeline Hook into the `Pipeline` state machine. When the `status` moves to `running`, find the merge requests whose tip commit matches the pipeline, and record the `latest_build_started_at` time for each of them. When the `status` moves to `success`, record the `latest_build_finished_at` time. d. The merge requests that close an issue - This was a big cause of the performance problems we were having with Cycle Analytics. We need to use `ReferenceExtractor` to make this calculation, which is slow when we have to run it on a large number of merge requests. - When a merge request is created, updated, or refreshed, find the issues it closes, and create an instance of `MergeRequestsClosingIssues`, which acts as a join model between merge requests and issues. - If a `MergeRequestsClosingIssues` instance links a merge request and an issue, that issue closes that merge request. 5. The `Queries` module was changed into a class, so we can cache the results of `issues` and `merge_requests_closing_issues` across various cycle analytics stages. 6. The code added in this commit is untested. Tests will be added in the next commit.
2016-09-15 08:59:36 +00:00
[@queries.issue_first_associated_with_milestone_at, @queries.issue_first_added_to_list_label_at])
end
def plan
Improve performance of the cycle analytics page. 1. These changes bring down page load time for 100 issues from more than a minute to about 1.5 seconds. 2. This entire commit is composed of these types of performance enhancements: - Cache relevant data in `IssueMetrics` wherever possible. - Cache relevant data in `MergeRequestMetrics` wherever possible. - Preload metrics 3. Given these improvements, we now only need to make 4 SQL calls: - Load all issues - Load all merge requests - Load all metrics for the issues - Load all metrics for the merge requests 4. A list of all the data points that are now being pre-calculated: a. The first time an issue is mentioned in a commit - In `GitPushService`, find all issues mentioned by the given commit using `ReferenceExtractor`. Set the `first_mentioned_in_commit_at` flag for each of them. - There seems to be a (pre-existing) bug here - files (and therefore commits) created using the Web CI don't have cross-references created, and issues are not closed even when the commit title is "Fixes #xx". b. The first time a merge request is deployed to production When a `Deployment` is created, find all merge requests that were merged in before the deployment, and set the `first_deployed_to_production_at` flag for each of them. c. The start / end time for a merge request pipeline Hook into the `Pipeline` state machine. When the `status` moves to `running`, find the merge requests whose tip commit matches the pipeline, and record the `latest_build_started_at` time for each of them. When the `status` moves to `success`, record the `latest_build_finished_at` time. d. The merge requests that close an issue - This was a big cause of the performance problems we were having with Cycle Analytics. We need to use `ReferenceExtractor` to make this calculation, which is slow when we have to run it on a large number of merge requests. - When a merge request is created, updated, or refreshed, find the issues it closes, and create an instance of `MergeRequestsClosingIssues`, which acts as a join model between merge requests and issues. - If a `MergeRequestsClosingIssues` instance links a merge request and an issue, that issue closes that merge request. 5. The `Queries` module was changed into a class, so we can cache the results of `issues` and `merge_requests_closing_issues` across various cycle analytics stages. 6. The code added in this commit is untested. Tests will be added in the next commit.
2016-09-15 08:59:36 +00:00
calculate_metric(@queries.issues(created_after: @from),
[@queries.issue_first_associated_with_milestone_at, @queries.issue_first_added_to_list_label_at],
@queries.issue_first_mentioned_in_commit_at)
end
def code
Improve performance of the cycle analytics page. 1. These changes bring down page load time for 100 issues from more than a minute to about 1.5 seconds. 2. This entire commit is composed of these types of performance enhancements: - Cache relevant data in `IssueMetrics` wherever possible. - Cache relevant data in `MergeRequestMetrics` wherever possible. - Preload metrics 3. Given these improvements, we now only need to make 4 SQL calls: - Load all issues - Load all merge requests - Load all metrics for the issues - Load all metrics for the merge requests 4. A list of all the data points that are now being pre-calculated: a. The first time an issue is mentioned in a commit - In `GitPushService`, find all issues mentioned by the given commit using `ReferenceExtractor`. Set the `first_mentioned_in_commit_at` flag for each of them. - There seems to be a (pre-existing) bug here - files (and therefore commits) created using the Web CI don't have cross-references created, and issues are not closed even when the commit title is "Fixes #xx". b. The first time a merge request is deployed to production When a `Deployment` is created, find all merge requests that were merged in before the deployment, and set the `first_deployed_to_production_at` flag for each of them. c. The start / end time for a merge request pipeline Hook into the `Pipeline` state machine. When the `status` moves to `running`, find the merge requests whose tip commit matches the pipeline, and record the `latest_build_started_at` time for each of them. When the `status` moves to `success`, record the `latest_build_finished_at` time. d. The merge requests that close an issue - This was a big cause of the performance problems we were having with Cycle Analytics. We need to use `ReferenceExtractor` to make this calculation, which is slow when we have to run it on a large number of merge requests. - When a merge request is created, updated, or refreshed, find the issues it closes, and create an instance of `MergeRequestsClosingIssues`, which acts as a join model between merge requests and issues. - If a `MergeRequestsClosingIssues` instance links a merge request and an issue, that issue closes that merge request. 5. The `Queries` module was changed into a class, so we can cache the results of `issues` and `merge_requests_closing_issues` across various cycle analytics stages. 6. The code added in this commit is untested. Tests will be added in the next commit.
2016-09-15 08:59:36 +00:00
calculate_metric(@queries.merge_requests_closing_issues(created_after: @from),
@queries.issue_first_mentioned_in_commit_at,
-> (data_point) { data_point[:merge_request].created_at })
end
def test
Improve performance of the cycle analytics page. 1. These changes bring down page load time for 100 issues from more than a minute to about 1.5 seconds. 2. This entire commit is composed of these types of performance enhancements: - Cache relevant data in `IssueMetrics` wherever possible. - Cache relevant data in `MergeRequestMetrics` wherever possible. - Preload metrics 3. Given these improvements, we now only need to make 4 SQL calls: - Load all issues - Load all merge requests - Load all metrics for the issues - Load all metrics for the merge requests 4. A list of all the data points that are now being pre-calculated: a. The first time an issue is mentioned in a commit - In `GitPushService`, find all issues mentioned by the given commit using `ReferenceExtractor`. Set the `first_mentioned_in_commit_at` flag for each of them. - There seems to be a (pre-existing) bug here - files (and therefore commits) created using the Web CI don't have cross-references created, and issues are not closed even when the commit title is "Fixes #xx". b. The first time a merge request is deployed to production When a `Deployment` is created, find all merge requests that were merged in before the deployment, and set the `first_deployed_to_production_at` flag for each of them. c. The start / end time for a merge request pipeline Hook into the `Pipeline` state machine. When the `status` moves to `running`, find the merge requests whose tip commit matches the pipeline, and record the `latest_build_started_at` time for each of them. When the `status` moves to `success`, record the `latest_build_finished_at` time. d. The merge requests that close an issue - This was a big cause of the performance problems we were having with Cycle Analytics. We need to use `ReferenceExtractor` to make this calculation, which is slow when we have to run it on a large number of merge requests. - When a merge request is created, updated, or refreshed, find the issues it closes, and create an instance of `MergeRequestsClosingIssues`, which acts as a join model between merge requests and issues. - If a `MergeRequestsClosingIssues` instance links a merge request and an issue, that issue closes that merge request. 5. The `Queries` module was changed into a class, so we can cache the results of `issues` and `merge_requests_closing_issues` across various cycle analytics stages. 6. The code added in this commit is untested. Tests will be added in the next commit.
2016-09-15 08:59:36 +00:00
calculate_metric(@queries.merge_requests_closing_issues(created_after: @from),
@queries.merge_request_build_started_at,
@queries.merge_request_build_finished_at)
end
def review
Improve performance of the cycle analytics page. 1. These changes bring down page load time for 100 issues from more than a minute to about 1.5 seconds. 2. This entire commit is composed of these types of performance enhancements: - Cache relevant data in `IssueMetrics` wherever possible. - Cache relevant data in `MergeRequestMetrics` wherever possible. - Preload metrics 3. Given these improvements, we now only need to make 4 SQL calls: - Load all issues - Load all merge requests - Load all metrics for the issues - Load all metrics for the merge requests 4. A list of all the data points that are now being pre-calculated: a. The first time an issue is mentioned in a commit - In `GitPushService`, find all issues mentioned by the given commit using `ReferenceExtractor`. Set the `first_mentioned_in_commit_at` flag for each of them. - There seems to be a (pre-existing) bug here - files (and therefore commits) created using the Web CI don't have cross-references created, and issues are not closed even when the commit title is "Fixes #xx". b. The first time a merge request is deployed to production When a `Deployment` is created, find all merge requests that were merged in before the deployment, and set the `first_deployed_to_production_at` flag for each of them. c. The start / end time for a merge request pipeline Hook into the `Pipeline` state machine. When the `status` moves to `running`, find the merge requests whose tip commit matches the pipeline, and record the `latest_build_started_at` time for each of them. When the `status` moves to `success`, record the `latest_build_finished_at` time. d. The merge requests that close an issue - This was a big cause of the performance problems we were having with Cycle Analytics. We need to use `ReferenceExtractor` to make this calculation, which is slow when we have to run it on a large number of merge requests. - When a merge request is created, updated, or refreshed, find the issues it closes, and create an instance of `MergeRequestsClosingIssues`, which acts as a join model between merge requests and issues. - If a `MergeRequestsClosingIssues` instance links a merge request and an issue, that issue closes that merge request. 5. The `Queries` module was changed into a class, so we can cache the results of `issues` and `merge_requests_closing_issues` across various cycle analytics stages. 6. The code added in this commit is untested. Tests will be added in the next commit.
2016-09-15 08:59:36 +00:00
calculate_metric(@queries.merge_requests_closing_issues(created_after: @from),
-> (data_point) { data_point[:merge_request].created_at },
Improve performance of the cycle analytics page. 1. These changes bring down page load time for 100 issues from more than a minute to about 1.5 seconds. 2. This entire commit is composed of these types of performance enhancements: - Cache relevant data in `IssueMetrics` wherever possible. - Cache relevant data in `MergeRequestMetrics` wherever possible. - Preload metrics 3. Given these improvements, we now only need to make 4 SQL calls: - Load all issues - Load all merge requests - Load all metrics for the issues - Load all metrics for the merge requests 4. A list of all the data points that are now being pre-calculated: a. The first time an issue is mentioned in a commit - In `GitPushService`, find all issues mentioned by the given commit using `ReferenceExtractor`. Set the `first_mentioned_in_commit_at` flag for each of them. - There seems to be a (pre-existing) bug here - files (and therefore commits) created using the Web CI don't have cross-references created, and issues are not closed even when the commit title is "Fixes #xx". b. The first time a merge request is deployed to production When a `Deployment` is created, find all merge requests that were merged in before the deployment, and set the `first_deployed_to_production_at` flag for each of them. c. The start / end time for a merge request pipeline Hook into the `Pipeline` state machine. When the `status` moves to `running`, find the merge requests whose tip commit matches the pipeline, and record the `latest_build_started_at` time for each of them. When the `status` moves to `success`, record the `latest_build_finished_at` time. d. The merge requests that close an issue - This was a big cause of the performance problems we were having with Cycle Analytics. We need to use `ReferenceExtractor` to make this calculation, which is slow when we have to run it on a large number of merge requests. - When a merge request is created, updated, or refreshed, find the issues it closes, and create an instance of `MergeRequestsClosingIssues`, which acts as a join model between merge requests and issues. - If a `MergeRequestsClosingIssues` instance links a merge request and an issue, that issue closes that merge request. 5. The `Queries` module was changed into a class, so we can cache the results of `issues` and `merge_requests_closing_issues` across various cycle analytics stages. 6. The code added in this commit is untested. Tests will be added in the next commit.
2016-09-15 08:59:36 +00:00
@queries.merge_request_merged_at)
end
def staging
Improve performance of the cycle analytics page. 1. These changes bring down page load time for 100 issues from more than a minute to about 1.5 seconds. 2. This entire commit is composed of these types of performance enhancements: - Cache relevant data in `IssueMetrics` wherever possible. - Cache relevant data in `MergeRequestMetrics` wherever possible. - Preload metrics 3. Given these improvements, we now only need to make 4 SQL calls: - Load all issues - Load all merge requests - Load all metrics for the issues - Load all metrics for the merge requests 4. A list of all the data points that are now being pre-calculated: a. The first time an issue is mentioned in a commit - In `GitPushService`, find all issues mentioned by the given commit using `ReferenceExtractor`. Set the `first_mentioned_in_commit_at` flag for each of them. - There seems to be a (pre-existing) bug here - files (and therefore commits) created using the Web CI don't have cross-references created, and issues are not closed even when the commit title is "Fixes #xx". b. The first time a merge request is deployed to production When a `Deployment` is created, find all merge requests that were merged in before the deployment, and set the `first_deployed_to_production_at` flag for each of them. c. The start / end time for a merge request pipeline Hook into the `Pipeline` state machine. When the `status` moves to `running`, find the merge requests whose tip commit matches the pipeline, and record the `latest_build_started_at` time for each of them. When the `status` moves to `success`, record the `latest_build_finished_at` time. d. The merge requests that close an issue - This was a big cause of the performance problems we were having with Cycle Analytics. We need to use `ReferenceExtractor` to make this calculation, which is slow when we have to run it on a large number of merge requests. - When a merge request is created, updated, or refreshed, find the issues it closes, and create an instance of `MergeRequestsClosingIssues`, which acts as a join model between merge requests and issues. - If a `MergeRequestsClosingIssues` instance links a merge request and an issue, that issue closes that merge request. 5. The `Queries` module was changed into a class, so we can cache the results of `issues` and `merge_requests_closing_issues` across various cycle analytics stages. 6. The code added in this commit is untested. Tests will be added in the next commit.
2016-09-15 08:59:36 +00:00
calculate_metric(@queries.merge_requests_closing_issues(created_after: @from),
@queries.merge_request_merged_at,
@queries.merge_request_deployed_to_production_at)
end
def production
Improve performance of the cycle analytics page. 1. These changes bring down page load time for 100 issues from more than a minute to about 1.5 seconds. 2. This entire commit is composed of these types of performance enhancements: - Cache relevant data in `IssueMetrics` wherever possible. - Cache relevant data in `MergeRequestMetrics` wherever possible. - Preload metrics 3. Given these improvements, we now only need to make 4 SQL calls: - Load all issues - Load all merge requests - Load all metrics for the issues - Load all metrics for the merge requests 4. A list of all the data points that are now being pre-calculated: a. The first time an issue is mentioned in a commit - In `GitPushService`, find all issues mentioned by the given commit using `ReferenceExtractor`. Set the `first_mentioned_in_commit_at` flag for each of them. - There seems to be a (pre-existing) bug here - files (and therefore commits) created using the Web CI don't have cross-references created, and issues are not closed even when the commit title is "Fixes #xx". b. The first time a merge request is deployed to production When a `Deployment` is created, find all merge requests that were merged in before the deployment, and set the `first_deployed_to_production_at` flag for each of them. c. The start / end time for a merge request pipeline Hook into the `Pipeline` state machine. When the `status` moves to `running`, find the merge requests whose tip commit matches the pipeline, and record the `latest_build_started_at` time for each of them. When the `status` moves to `success`, record the `latest_build_finished_at` time. d. The merge requests that close an issue - This was a big cause of the performance problems we were having with Cycle Analytics. We need to use `ReferenceExtractor` to make this calculation, which is slow when we have to run it on a large number of merge requests. - When a merge request is created, updated, or refreshed, find the issues it closes, and create an instance of `MergeRequestsClosingIssues`, which acts as a join model between merge requests and issues. - If a `MergeRequestsClosingIssues` instance links a merge request and an issue, that issue closes that merge request. 5. The `Queries` module was changed into a class, so we can cache the results of `issues` and `merge_requests_closing_issues` across various cycle analytics stages. 6. The code added in this commit is untested. Tests will be added in the next commit.
2016-09-15 08:59:36 +00:00
calculate_metric(@queries.merge_requests_closing_issues(created_after: @from),
-> (data_point) { data_point[:issue].created_at },
Improve performance of the cycle analytics page. 1. These changes bring down page load time for 100 issues from more than a minute to about 1.5 seconds. 2. This entire commit is composed of these types of performance enhancements: - Cache relevant data in `IssueMetrics` wherever possible. - Cache relevant data in `MergeRequestMetrics` wherever possible. - Preload metrics 3. Given these improvements, we now only need to make 4 SQL calls: - Load all issues - Load all merge requests - Load all metrics for the issues - Load all metrics for the merge requests 4. A list of all the data points that are now being pre-calculated: a. The first time an issue is mentioned in a commit - In `GitPushService`, find all issues mentioned by the given commit using `ReferenceExtractor`. Set the `first_mentioned_in_commit_at` flag for each of them. - There seems to be a (pre-existing) bug here - files (and therefore commits) created using the Web CI don't have cross-references created, and issues are not closed even when the commit title is "Fixes #xx". b. The first time a merge request is deployed to production When a `Deployment` is created, find all merge requests that were merged in before the deployment, and set the `first_deployed_to_production_at` flag for each of them. c. The start / end time for a merge request pipeline Hook into the `Pipeline` state machine. When the `status` moves to `running`, find the merge requests whose tip commit matches the pipeline, and record the `latest_build_started_at` time for each of them. When the `status` moves to `success`, record the `latest_build_finished_at` time. d. The merge requests that close an issue - This was a big cause of the performance problems we were having with Cycle Analytics. We need to use `ReferenceExtractor` to make this calculation, which is slow when we have to run it on a large number of merge requests. - When a merge request is created, updated, or refreshed, find the issues it closes, and create an instance of `MergeRequestsClosingIssues`, which acts as a join model between merge requests and issues. - If a `MergeRequestsClosingIssues` instance links a merge request and an issue, that issue closes that merge request. 5. The `Queries` module was changed into a class, so we can cache the results of `issues` and `merge_requests_closing_issues` across various cycle analytics stages. 6. The code added in this commit is untested. Tests will be added in the next commit.
2016-09-15 08:59:36 +00:00
@queries.merge_request_deployed_to_production_at)
end
private
def calculate_metric(data, start_time_fns, end_time_fns)
times = data.map do |data_point|
start_time = Array.wrap(start_time_fns).map { |fn| fn[data_point] }.compact.first
end_time = Array.wrap(end_time_fns).map { |fn| fn[data_point] }.compact.first
if start_time.present? && end_time.present? && end_time >= start_time
end_time - start_time
end
end
median(times.compact.sort)
end
def median(coll)
return if coll.empty?
size = coll.length
(coll[size / 2] + coll[(size - 1) / 2]) / 2.0
end
end