Commit Graph

1 Commits

Author SHA1 Message Date
Toon Claes 01c6323d23 UNION of SELECT/WHERE is faster than WHERE on UNION
Instead of applying WHERE on a UNION, apply the WHERE on each of the seperate
SELECT statements, and do UNION on that.

Local tests with about 2_000_000 projects:
 - 1_500_000 private projects
 -    40_000 internal projects
 -   400_000 public projects

For the API endpoint `/api/v4/projects?visibility=private` the slowest query was:

```sql
SELECT "projects".*
FROM "projects"
WHERE ...
```

The original query took 1073.8ms.
The query refactored to UNION of SELECT/WHERE took 2.3ms.

The original query was:

```sql
SELECT "projects".*
FROM "projects"
WHERE "projects"."pending_delete" = $1
  AND (projects.id IN
         (SELECT "projects"."id"
          FROM "projects"
          INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
          WHERE "projects"."pending_delete" = 'f'
            AND "project_authorizations"."user_id" = 23
          UNION SELECT "projects"."id"
          FROM "projects"
          WHERE "projects"."visibility_level" IN (20,
                                                  10)))
  AND "projects"."visibility_level" = $2
  AND "projects"."archived" = $3
ORDER BY "projects"."created_at" DESC
LIMIT 20
OFFSET 0 [["pending_delete", "f"],
       ["visibility_level", 0],
       ["archived", "f"]]
```

The refactored query:
```sql
SELECT "projects".*
FROM "projects"
WHERE "projects"."pending_delete" = $1
  AND (projects.id IN
         (SELECT "projects"."id"
          FROM "projects"
          INNER JOIN "project_authorizations" ON "projects"."id" = "project_authorizations"."project_id"
          WHERE "projects"."pending_delete" = 'f'
            AND "project_authorizations"."user_id" = 23
            AND "projects"."visibility_level" = 0
            AND "projects"."archived" = 'f'
          UNION SELECT "projects"."id"
          FROM "projects"
          WHERE "projects"."visibility_level" IN (20,
                                                  10)
            AND "projects"."visibility_level" = 0
            AND "projects"."archived" = 'f'))
ORDER BY "projects"."created_at" DESC
LIMIT 20
OFFSET 0 [["pending_delete", "f"]]
```
2017-05-30 22:45:59 +02:00