Instead of applying WHERE on a UNION, apply the WHERE on each of the seperate
SELECT statements, and do UNION on that.
Local tests with about 2_000_000 projects:
- 1_500_000 private projects
- 40_000 internal projects
- 400_000 public projects
For the API endpoint `/api/v4/projects?visibility=private` the slowest query was:
```sql
SELECT "projects".*
FROM "projects"
WHERE ...
```
The original query took 1073.8ms.
The query refactored to UNION of SELECT/WHERE took 2.3ms.
The original query was:
```sql
SELECT "projects".*
FROM "projects"
WHERE "projects"."pending_delete" = $1
AND (projects.id IN
(SELECT "projects"."id"
FROM "projects"
INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
WHERE "projects"."pending_delete" = 'f'
AND "project_authorizations"."user_id" = 23
UNION SELECT "projects"."id"
FROM "projects"
WHERE "projects"."visibility_level" IN (20,
10)))
AND "projects"."visibility_level" = $2
AND "projects"."archived" = $3
ORDER BY "projects"."created_at" DESC
LIMIT 20
OFFSET 0 [["pending_delete", "f"],
["visibility_level", 0],
["archived", "f"]]
```
The refactored query:
```sql
SELECT "projects".*
FROM "projects"
WHERE "projects"."pending_delete" = $1
AND (projects.id IN
(SELECT "projects"."id"
FROM "projects"
INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
WHERE "projects"."pending_delete" = 'f'
AND "project_authorizations"."user_id" = 23
AND "projects"."visibility_level" = 0
AND "projects"."archived" = 'f'
UNION SELECT "projects"."id"
FROM "projects"
WHERE "projects"."visibility_level" IN (20,
10)
AND "projects"."visibility_level" = 0
AND "projects"."archived" = 'f'))
ORDER BY "projects"."created_at" DESC
LIMIT 20
OFFSET 0 [["pending_delete", "f"]]
```
Extended ProjectFinder in order to handle the following options:
- current_user - which user use
- project_ids_relation: int[] - project ids to use
- params:
- trending: boolean
- non_public: boolean
- starred: boolean
- sort: string
- visibility_level: int
- tags: string[]
- personal: boolean
- search: string
- non_archived: boolean
GroupProjectsFinder now inherits from ProjectsFinder.
Changed the code in order to use the new available options.
Bring from EE: Share Project with Group
- [x] Models and migrations
- [x] Logic, UI
- [x] Tests
- [x] Documentation
- [x] Share with group lock
- [x] Api feature
- [x] Api docs
- [x] Api tests
Signed-off-by: Dmitriy Zaporozhets <dmitriy.zaporozhets@gmail.com>
For #12831
cc @DouweM @rspeicher @vsizov
See merge request !3186
GitLab EE adds an extra relation that selects a "project_id" column
instead of an "id" column, making it very hard for this method to be
re-used in EE. Since using User#authorized_groups in
ProjectsFinder#all_groups apparently has no performance impact we can
just use it and keep everything compatible with EE.
These changes were added in GitLab EE commit
d39de0ea91b26b8840195e5674b92c353cc16661. The tests were a bit bugged
(they used a non existing group, thus not testing a crucial part) which
I only noticed when porting CE changes to EE.
This class now uses a UNION (when needed) instead of plucking tens of
thousands of project IDs into memory. The tests have also been
re-written to ensure all different use cases are tested properly
(assuming I didn't forget any cases).
The finder has also been broken up into 3 different finder classes:
* ContributedProjectsFinder: class for getting the projects a user
contributed to.
* PersonalProjectsFinder: class for getting the personal projects of a
user.
* ProjectsFinder: class for getting generic projects visible to a given
user.
Previously a lot of the logic of these finders was handled directly in
the users controller.