This optimises how GroupProjectsFinder builds it collection, producing
simpler and faster queries in the process. It also cleans up the code a
bit to make it easier to understand.
This changes ProjectsFinder#init_collection so it no longer relies on a
UNION. For example, to get starred projects of a user we used to run:
SELECT projects.*
FROM projects
WHERE projects.pending_delete = 'f'
AND (
projects.id IN (
SELECT projects.id
FROM projects
INNER JOIN users_star_projects
ON users_star_projects.project_id = projects.id
INNER JOIN project_authorizations
ON projects.id = project_authorizations.project_id
WHERE projects.pending_delete = 'f'
AND project_authorizations.user_id = 1
AND users_star_projects.user_id = 1
UNION
SELECT projects.id
FROM projects
INNER JOIN users_star_projects
ON users_star_projects.project_id = projects.id
WHERE projects.visibility_level IN (20, 10)
AND users_star_projects.user_id = 1
)
)
ORDER BY projects.id DESC;
With these changes the above query is turned into the following instead:
SELECT projects.*
FROM projects
INNER JOIN users_star_projects
ON users_star_projects.project_id = projects.id
WHERE projects.pending_delete = 'f'
AND (
EXISTS (
SELECT 1
FROM project_authorizations
WHERE project_authorizations.user_id = 1
AND (project_id = projects.id)
)
OR projects.visibility_level IN (20,10)
)
AND users_star_projects.user_id = 1
ORDER BY projects.id DESC;
This query in turn produces a better execution plan and takes less time,
though the difference is only a few milliseconds (this however depends
on the amount of data involved and additional conditions that may be
added).
Instead of applying WHERE on a UNION, apply the WHERE on each of the seperate
SELECT statements, and do UNION on that.
Local tests with about 2_000_000 projects:
- 1_500_000 private projects
- 40_000 internal projects
- 400_000 public projects
For the API endpoint `/api/v4/projects?visibility=private` the slowest query was:
```sql
SELECT "projects".*
FROM "projects"
WHERE ...
```
The original query took 1073.8ms.
The query refactored to UNION of SELECT/WHERE took 2.3ms.
The original query was:
```sql
SELECT "projects".*
FROM "projects"
WHERE "projects"."pending_delete" = $1
AND (projects.id IN
(SELECT "projects"."id"
FROM "projects"
INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
WHERE "projects"."pending_delete" = 'f'
AND "project_authorizations"."user_id" = 23
UNION SELECT "projects"."id"
FROM "projects"
WHERE "projects"."visibility_level" IN (20,
10)))
AND "projects"."visibility_level" = $2
AND "projects"."archived" = $3
ORDER BY "projects"."created_at" DESC
LIMIT 20
OFFSET 0 [["pending_delete", "f"],
["visibility_level", 0],
["archived", "f"]]
```
The refactored query:
```sql
SELECT "projects".*
FROM "projects"
WHERE "projects"."pending_delete" = $1
AND (projects.id IN
(SELECT "projects"."id"
FROM "projects"
INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
WHERE "projects"."pending_delete" = 'f'
AND "project_authorizations"."user_id" = 23
AND "projects"."visibility_level" = 0
AND "projects"."archived" = 'f'
UNION SELECT "projects"."id"
FROM "projects"
WHERE "projects"."visibility_level" IN (20,
10)
AND "projects"."visibility_level" = 0
AND "projects"."archived" = 'f'))
ORDER BY "projects"."created_at" DESC
LIMIT 20
OFFSET 0 [["pending_delete", "f"]]
```
Extended ProjectFinder in order to handle the following options:
- current_user - which user use
- project_ids_relation: int[] - project ids to use
- params:
- trending: boolean
- non_public: boolean
- starred: boolean
- sort: string
- visibility_level: int
- tags: string[]
- personal: boolean
- search: string
- non_archived: boolean
GroupProjectsFinder now inherits from ProjectsFinder.
Changed the code in order to use the new available options.
Bring from EE: Share Project with Group
- [x] Models and migrations
- [x] Logic, UI
- [x] Tests
- [x] Documentation
- [x] Share with group lock
- [x] Api feature
- [x] Api docs
- [x] Api tests
Signed-off-by: Dmitriy Zaporozhets <dmitriy.zaporozhets@gmail.com>
For #12831
cc @DouweM @rspeicher @vsizov
See merge request !3186
GitLab EE adds an extra relation that selects a "project_id" column
instead of an "id" column, making it very hard for this method to be
re-used in EE. Since using User#authorized_groups in
ProjectsFinder#all_groups apparently has no performance impact we can
just use it and keep everything compatible with EE.
These changes were added in GitLab EE commit
d39de0ea91b26b8840195e5674b92c353cc16661. The tests were a bit bugged
(they used a non existing group, thus not testing a crucial part) which
I only noticed when porting CE changes to EE.
This class now uses a UNION (when needed) instead of plucking tens of
thousands of project IDs into memory. The tests have also been
re-written to ensure all different use cases are tested properly
(assuming I didn't forget any cases).
The finder has also been broken up into 3 different finder classes:
* ContributedProjectsFinder: class for getting the projects a user
contributed to.
* PersonalProjectsFinder: class for getting the personal projects of a
user.
* ProjectsFinder: class for getting generic projects visible to a given
user.
Previously a lot of the logic of these finders was handled directly in
the users controller.