2018-08-11 03:00:39 -04:00
|
|
|
# frozen_string_literal: true
|
|
|
|
|
Add repository languages for projects
Our friends at GitHub show the programming languages for a long time,
and inspired by that this commit means to create about the same
functionality.
Language detection is done through Linguist, as before, where the
difference is that we cache the result in the database. Also, Gitaly can
incrementaly scan a repository. This is done through a shell out, which
creates overhead of about 3s each run. For now this won't be improved.
Scans are triggered by pushed to the default branch, usually `master`.
However, one exception to this rule the charts page. If we're requesting
this expensive data anyway, we just cache it in the database.
Edge cases where there is no repository, or its empty are caught in the
Repository model. This makes use of Redis caching, which is probably
already loaded.
The added model is called RepositoryLanguage, which will make it harder
if/when GitLab supports multiple repositories per project. However, for
now I think this shouldn't be a concern. Also, Language could be
confused with the i18n languages and felt like the current name was
suiteable too.
Design of the Project#Show page is done with help from @dimitrieh. This
change is not visible to the end user unless detections are done.
2018-06-06 07:10:59 -04:00
|
|
|
module Projects
|
|
|
|
class DetectRepositoryLanguagesService < BaseService
|
2019-03-20 13:23:23 -04:00
|
|
|
attr_reader :programming_languages
|
Add repository languages for projects
Our friends at GitHub show the programming languages for a long time,
and inspired by that this commit means to create about the same
functionality.
Language detection is done through Linguist, as before, where the
difference is that we cache the result in the database. Also, Gitaly can
incrementaly scan a repository. This is done through a shell out, which
creates overhead of about 3s each run. For now this won't be improved.
Scans are triggered by pushed to the default branch, usually `master`.
However, one exception to this rule the charts page. If we're requesting
this expensive data anyway, we just cache it in the database.
Edge cases where there is no repository, or its empty are caught in the
Repository model. This makes use of Redis caching, which is probably
already loaded.
The added model is called RepositoryLanguage, which will make it harder
if/when GitLab supports multiple repositories per project. However, for
now I think this shouldn't be a concern. Also, Language could be
confused with the i18n languages and felt like the current name was
suiteable too.
Design of the Project#Show page is done with help from @dimitrieh. This
change is not visible to the end user unless detections are done.
2018-06-06 07:10:59 -04:00
|
|
|
|
2018-08-27 11:31:01 -04:00
|
|
|
# rubocop: disable CodeReuse/ActiveRecord
|
Add repository languages for projects
Our friends at GitHub show the programming languages for a long time,
and inspired by that this commit means to create about the same
functionality.
Language detection is done through Linguist, as before, where the
difference is that we cache the result in the database. Also, Gitaly can
incrementaly scan a repository. This is done through a shell out, which
creates overhead of about 3s each run. For now this won't be improved.
Scans are triggered by pushed to the default branch, usually `master`.
However, one exception to this rule the charts page. If we're requesting
this expensive data anyway, we just cache it in the database.
Edge cases where there is no repository, or its empty are caught in the
Repository model. This makes use of Redis caching, which is probably
already loaded.
The added model is called RepositoryLanguage, which will make it harder
if/when GitLab supports multiple repositories per project. However, for
now I think this shouldn't be a concern. Also, Language could be
confused with the i18n languages and felt like the current name was
suiteable too.
Design of the Project#Show page is done with help from @dimitrieh. This
change is not visible to the end user unless detections are done.
2018-06-06 07:10:59 -04:00
|
|
|
def execute
|
|
|
|
repository_languages = project.repository_languages
|
|
|
|
detection = Gitlab::LanguageDetection.new(repository, repository_languages)
|
|
|
|
|
|
|
|
matching_programming_languages = ensure_programming_languages(detection)
|
|
|
|
|
|
|
|
RepositoryLanguage.transaction do
|
|
|
|
project.repository_languages.where(programming_language_id: detection.deletions).delete_all
|
|
|
|
|
|
|
|
detection.updates.each do |update|
|
|
|
|
RepositoryLanguage
|
|
|
|
.where(project_id: project.id)
|
|
|
|
.where(programming_language_id: update[:programming_language_id])
|
2018-08-02 08:09:49 -04:00
|
|
|
.update_all(share: update[:share])
|
Add repository languages for projects
Our friends at GitHub show the programming languages for a long time,
and inspired by that this commit means to create about the same
functionality.
Language detection is done through Linguist, as before, where the
difference is that we cache the result in the database. Also, Gitaly can
incrementaly scan a repository. This is done through a shell out, which
creates overhead of about 3s each run. For now this won't be improved.
Scans are triggered by pushed to the default branch, usually `master`.
However, one exception to this rule the charts page. If we're requesting
this expensive data anyway, we just cache it in the database.
Edge cases where there is no repository, or its empty are caught in the
Repository model. This makes use of Redis caching, which is probably
already loaded.
The added model is called RepositoryLanguage, which will make it harder
if/when GitLab supports multiple repositories per project. However, for
now I think this shouldn't be a concern. Also, Language could be
confused with the i18n languages and felt like the current name was
suiteable too.
Design of the Project#Show page is done with help from @dimitrieh. This
change is not visible to the end user unless detections are done.
2018-06-06 07:10:59 -04:00
|
|
|
end
|
|
|
|
|
|
|
|
Gitlab::Database.bulk_insert(
|
|
|
|
RepositoryLanguage.table_name,
|
|
|
|
detection.insertions(matching_programming_languages)
|
|
|
|
)
|
2019-03-20 13:23:23 -04:00
|
|
|
|
|
|
|
set_detected_repository_languages
|
Add repository languages for projects
Our friends at GitHub show the programming languages for a long time,
and inspired by that this commit means to create about the same
functionality.
Language detection is done through Linguist, as before, where the
difference is that we cache the result in the database. Also, Gitaly can
incrementaly scan a repository. This is done through a shell out, which
creates overhead of about 3s each run. For now this won't be improved.
Scans are triggered by pushed to the default branch, usually `master`.
However, one exception to this rule the charts page. If we're requesting
this expensive data anyway, we just cache it in the database.
Edge cases where there is no repository, or its empty are caught in the
Repository model. This makes use of Redis caching, which is probably
already loaded.
The added model is called RepositoryLanguage, which will make it harder
if/when GitLab supports multiple repositories per project. However, for
now I think this shouldn't be a concern. Also, Language could be
confused with the i18n languages and felt like the current name was
suiteable too.
Design of the Project#Show page is done with help from @dimitrieh. This
change is not visible to the end user unless detections are done.
2018-06-06 07:10:59 -04:00
|
|
|
end
|
|
|
|
|
|
|
|
project.repository_languages.reload
|
|
|
|
end
|
2018-08-27 11:31:01 -04:00
|
|
|
# rubocop: enable CodeReuse/ActiveRecord
|
Add repository languages for projects
Our friends at GitHub show the programming languages for a long time,
and inspired by that this commit means to create about the same
functionality.
Language detection is done through Linguist, as before, where the
difference is that we cache the result in the database. Also, Gitaly can
incrementaly scan a repository. This is done through a shell out, which
creates overhead of about 3s each run. For now this won't be improved.
Scans are triggered by pushed to the default branch, usually `master`.
However, one exception to this rule the charts page. If we're requesting
this expensive data anyway, we just cache it in the database.
Edge cases where there is no repository, or its empty are caught in the
Repository model. This makes use of Redis caching, which is probably
already loaded.
The added model is called RepositoryLanguage, which will make it harder
if/when GitLab supports multiple repositories per project. However, for
now I think this shouldn't be a concern. Also, Language could be
confused with the i18n languages and felt like the current name was
suiteable too.
Design of the Project#Show page is done with help from @dimitrieh. This
change is not visible to the end user unless detections are done.
2018-06-06 07:10:59 -04:00
|
|
|
|
|
|
|
private
|
|
|
|
|
2018-08-27 11:31:01 -04:00
|
|
|
# rubocop: disable CodeReuse/ActiveRecord
|
Add repository languages for projects
Our friends at GitHub show the programming languages for a long time,
and inspired by that this commit means to create about the same
functionality.
Language detection is done through Linguist, as before, where the
difference is that we cache the result in the database. Also, Gitaly can
incrementaly scan a repository. This is done through a shell out, which
creates overhead of about 3s each run. For now this won't be improved.
Scans are triggered by pushed to the default branch, usually `master`.
However, one exception to this rule the charts page. If we're requesting
this expensive data anyway, we just cache it in the database.
Edge cases where there is no repository, or its empty are caught in the
Repository model. This makes use of Redis caching, which is probably
already loaded.
The added model is called RepositoryLanguage, which will make it harder
if/when GitLab supports multiple repositories per project. However, for
now I think this shouldn't be a concern. Also, Language could be
confused with the i18n languages and felt like the current name was
suiteable too.
Design of the Project#Show page is done with help from @dimitrieh. This
change is not visible to the end user unless detections are done.
2018-06-06 07:10:59 -04:00
|
|
|
def ensure_programming_languages(detection)
|
|
|
|
existing_languages = ProgrammingLanguage.where(name: detection.languages)
|
|
|
|
return existing_languages if detection.languages.size == existing_languages.size
|
|
|
|
|
|
|
|
missing_languages = detection.languages - existing_languages.map(&:name)
|
|
|
|
created_languages = missing_languages.map do |name|
|
|
|
|
create_language(name, detection.language_color(name))
|
|
|
|
end
|
|
|
|
|
|
|
|
existing_languages + created_languages
|
|
|
|
end
|
2018-08-27 11:31:01 -04:00
|
|
|
# rubocop: enable CodeReuse/ActiveRecord
|
Add repository languages for projects
Our friends at GitHub show the programming languages for a long time,
and inspired by that this commit means to create about the same
functionality.
Language detection is done through Linguist, as before, where the
difference is that we cache the result in the database. Also, Gitaly can
incrementaly scan a repository. This is done through a shell out, which
creates overhead of about 3s each run. For now this won't be improved.
Scans are triggered by pushed to the default branch, usually `master`.
However, one exception to this rule the charts page. If we're requesting
this expensive data anyway, we just cache it in the database.
Edge cases where there is no repository, or its empty are caught in the
Repository model. This makes use of Redis caching, which is probably
already loaded.
The added model is called RepositoryLanguage, which will make it harder
if/when GitLab supports multiple repositories per project. However, for
now I think this shouldn't be a concern. Also, Language could be
confused with the i18n languages and felt like the current name was
suiteable too.
Design of the Project#Show page is done with help from @dimitrieh. This
change is not visible to the end user unless detections are done.
2018-06-06 07:10:59 -04:00
|
|
|
|
2018-08-27 11:31:01 -04:00
|
|
|
# rubocop: disable CodeReuse/ActiveRecord
|
Add repository languages for projects
Our friends at GitHub show the programming languages for a long time,
and inspired by that this commit means to create about the same
functionality.
Language detection is done through Linguist, as before, where the
difference is that we cache the result in the database. Also, Gitaly can
incrementaly scan a repository. This is done through a shell out, which
creates overhead of about 3s each run. For now this won't be improved.
Scans are triggered by pushed to the default branch, usually `master`.
However, one exception to this rule the charts page. If we're requesting
this expensive data anyway, we just cache it in the database.
Edge cases where there is no repository, or its empty are caught in the
Repository model. This makes use of Redis caching, which is probably
already loaded.
The added model is called RepositoryLanguage, which will make it harder
if/when GitLab supports multiple repositories per project. However, for
now I think this shouldn't be a concern. Also, Language could be
confused with the i18n languages and felt like the current name was
suiteable too.
Design of the Project#Show page is done with help from @dimitrieh. This
change is not visible to the end user unless detections are done.
2018-06-06 07:10:59 -04:00
|
|
|
def create_language(name, color)
|
|
|
|
ProgrammingLanguage.transaction do
|
|
|
|
ProgrammingLanguage.where(name: name).first_or_create(color: color)
|
|
|
|
end
|
|
|
|
rescue ActiveRecord::RecordNotUnique
|
|
|
|
retry
|
|
|
|
end
|
2018-08-27 11:31:01 -04:00
|
|
|
# rubocop: enable CodeReuse/ActiveRecord
|
2019-03-20 13:23:23 -04:00
|
|
|
|
|
|
|
def set_detected_repository_languages
|
|
|
|
return if project.detected_repository_languages?
|
|
|
|
|
|
|
|
project.update_column(:detected_repository_languages, true)
|
|
|
|
end
|
Add repository languages for projects
Our friends at GitHub show the programming languages for a long time,
and inspired by that this commit means to create about the same
functionality.
Language detection is done through Linguist, as before, where the
difference is that we cache the result in the database. Also, Gitaly can
incrementaly scan a repository. This is done through a shell out, which
creates overhead of about 3s each run. For now this won't be improved.
Scans are triggered by pushed to the default branch, usually `master`.
However, one exception to this rule the charts page. If we're requesting
this expensive data anyway, we just cache it in the database.
Edge cases where there is no repository, or its empty are caught in the
Repository model. This makes use of Redis caching, which is probably
already loaded.
The added model is called RepositoryLanguage, which will make it harder
if/when GitLab supports multiple repositories per project. However, for
now I think this shouldn't be a concern. Also, Language could be
confused with the i18n languages and felt like the current name was
suiteable too.
Design of the Project#Show page is done with help from @dimitrieh. This
change is not visible to the end user unless detections are done.
2018-06-06 07:10:59 -04:00
|
|
|
end
|
|
|
|
end
|