This refactors the Markdown pipeline so it supports the rendering of
multiple documents that may belong to different projects. An example of
where this happens is when displaying the event feed of a group. In this
case we retrieve events for all projects in the group. Previously we
would group events per project and render these chunks separately, but
this would result in many SQL queries being executed. By extending the
Markdown pipeline to support this out of the box we can drastically
reduce the number of SQL queries.
To achieve this we introduce a new object to the pipeline:
Banzai::RenderContext. This object simply wraps two other objects: an
optional Project instance, and an optional User instance. On its own
this wouldn't be very helpful, but a RenderContext can also be used to
associate HTML documents with specific Project instances. This work is
done in Banzai::ObjectRenderer and allows us to reuse as many queries
(and results) as possible.
To load issue 1, we see that in #38033 that about 835 ms of the
SQL queries were due to loading ProjectFeature. We should be
able to cut this down by eagerly loading this information.
Closes#38033
Example: for issues that are closed, the links will now show '[closed]'
following the issue number. This is done as post-process after the markdown has
been loaded from the cache as the status of the issue may change between
the cache being populated and the content being displayed.
In order to avoid N+1 queries problem when rendering notes ObjectRenderer
populates the cache of referenced issuables for all notes at once,
before the post processing phase.
As a part of this change, the Banzai BaseParser#grouped_objects_for_nodes
method has been refactored to return a Hash utilising the node itself as the
key, since this was a common pattern of usage for this method.
The method Ability.issues_readable_by_user takes a list of users and an
optional user and returns an Array of issues readable by said user. This
method in turn is used by
Banzai::ReferenceParser::IssueParser#nodes_visible_to_user so this
method no longer needs to get all the available abilities just to check
if a user has the "read_issue" ability.
To test this I benchmarked an issue with 222 comments on my development
environment. Using these changes the time spent in nodes_visible_to_user
was reduced from around 120 ms to around 40 ms.
By eager loading these associations we can greatly cut down the number
of SQL queries executed when processing documents with lots of
references, especially in cases where there are references belonging to
the same project.
Since these associations are so specific to the reference parsing
process and the permissions checking process that follows it I opted to
include them directly in IssueParser instead of using something like a
scope. Once we have a need for it we can move this code to a scope or
method.
This splits the Markdown rendering and reference extraction phases into
two distinct code bases. The reference extraction phase no longer relies
on the html-pipeline Gem (and any related code) and allows for
extracting of references from multiple HTML nodes in a single pass. This
means that if you want to extract user references from 200 comments you
no longer need to run 200 times N number of queries, instead only a
handful of queries may be needed.