ruby--ruby/gc.rb

# for gc.c
# The GC module provides an interface to Ruby's mark-and-sweep
# garbage collection mechanism.
#
# Some of the underlying methods are also available via the ObjectSpace
# module.
#
# You may obtain information about the operation of the GC through
# GC::Profiler.
module GC
# call-seq:
# GC.start -> nil
# ObjectSpace.garbage_collect -> nil
# include GC; garbage_collect -> nil
# GC.start(full_mark: true, immediate_sweep: true) -> nil
# ObjectSpace.garbage_collect(full_mark: true, immediate_sweep: true) -> nil
# include GC; garbage_collect(full_mark: true, immediate_sweep: true) -> nil
#
# Initiates garbage collection, even if manually disabled.
#
# This method is defined with keyword arguments that default to true:
#
# def GC.start(full_mark: true, immediate_mark: true, immediate_sweep: true); end
#
# Use full_mark: false to perform a minor GC.
# Use immediate_mark: false to mark incrementally.
# Use immediate_sweep: false to defer sweeping (use lazy sweep).
#
# Note: These keyword arguments are implementation and version dependent. They
# are not guaranteed to be future-compatible, and may be ignored if the
# underlying implementation does not support them.
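#
# A minimal usage sketch (illustrative only; as noted above, an
# implementation may ignore these keyword arguments):
#
#   GC.start                          # full mark, immediate sweep
#   GC.start(full_mark: false)        # request a minor GC
#   GC.start(immediate_sweep: false)  # request lazy sweeping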
def self.start full_mark: true, immediate_mark: true, immediate_sweep: true
Primitive.gc_start_internal full_mark, immediate_mark, immediate_sweep, false
end
def garbage_collect full_mark: true, immediate_mark: true, immediate_sweep: true
Primitive.gc_start_internal full_mark, immediate_mark, immediate_sweep, false
end
# call-seq:
# GC.enable -> true or false
#
# Enables garbage collection, returning +true+ if garbage
# collection was previously disabled.
#
# GC.disable #=> false
# GC.enable #=> true
# GC.enable #=> false
#
def self.enable
Primitive.gc_enable
end
# call-seq:
# GC.disable -> true or false
#
# Disables garbage collection, returning +true+ if garbage
# collection was already disabled.
#
# GC.disable #=> false
# GC.disable #=> true
def self.disable
Primitive.gc_disable
end
# call-seq:
# GC.stress -> integer, true or false
#
# Returns current status of GC stress mode.
def self.stress
Primitive.gc_stress_get
end
# call-seq:
# GC.stress = flag -> flag
#
# Updates the GC stress mode.
#
# When stress mode is enabled, the GC is invoked at every GC opportunity:
# all memory and object allocations.
#
# Enabling stress mode degrades performance; it is intended only for debugging.
#
# flag can be true, false, or an integer formed by bitwise-ORing the following flags.
# 0x01:: no major GC
# 0x02:: no immediate sweep
# 0x04:: full mark after malloc/calloc/realloc
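#
# For example, flags can be combined with bitwise OR (a sketch; stress mode
# is slow, so restore the previous value when done):
#
#   previous = GC.stress
#   GC.stress = 0x01 | 0x02  # no major GC, no immediate sweep
#   # ... code under test ...
#   GC.stress = previous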
def self.stress=(flag)
Primitive.gc_stress_set_m flag
end
# call-seq:
# GC.count -> Integer
#
# Returns the number of times GC has occurred since the process started.
def self.count
Primitive.gc_count
end
# call-seq:
# GC.stat -> Hash
# GC.stat(hash) -> Hash
# GC.stat(:key) -> Numeric
#
# Returns a Hash containing information about the GC.
#
# The contents of the hash are implementation specific and may change in
# the future without notice.
#
# The hash includes internal GC statistics such as:
#
# [count]
# The total number of garbage collections run since application start
# (count includes both minor and major garbage collections)
# [time]
# The total time spent in garbage collections (in milliseconds)
# [heap_allocated_pages]
# The total number of `:heap_eden_pages` + `:heap_tomb_pages`
# [heap_sorted_length]
# The number of pages that can fit into the buffer that holds references to
# all pages
# [heap_allocatable_pages]
# The total number of pages the application could allocate without additional GC
# [heap_available_slots]
# The total number of slots in all `:heap_allocated_pages`
# [heap_live_slots]
# The total number of slots which contain live objects
# [heap_free_slots]
# The total number of slots which do not contain live objects
# [heap_final_slots]
# The total number of slots with pending finalizers to be run
# [heap_marked_slots]
# The total number of objects marked in the last GC
# [heap_eden_pages]
# The total number of pages which contain at least one live slot
# [heap_tomb_pages]
# The total number of pages which do not contain any live slots
# [total_allocated_pages]
# The cumulative number of pages allocated since application start
# [total_freed_pages]
# The cumulative number of pages freed since application start
# [total_allocated_objects]
# The cumulative number of objects allocated since application start
# [total_freed_objects]
# The cumulative number of objects freed since application start
# [malloc_increase_bytes]
# Amount of memory allocated on the heap for objects. Decreased by any GC
# [malloc_increase_bytes_limit]
# When `:malloc_increase_bytes` crosses this limit, GC is triggered
# [minor_gc_count]
# The total number of minor garbage collections run since process start
# [major_gc_count]
# The total number of major garbage collections run since process start
# [compact_count]
# The total number of compactions run since process start
# [read_barrier_faults]
# The total number of times the read barrier was triggered during
# compaction
# [total_moved_objects]
# The total number of objects compaction has moved
# [remembered_wb_unprotected_objects]
# The total number of objects without write barriers
# [remembered_wb_unprotected_objects_limit]
# When `:remembered_wb_unprotected_objects` crosses this limit,
# major GC is triggered
# [old_objects]
# Number of live, old objects which have survived at least 3 garbage collections
# [old_objects_limit]
# When `:old_objects` crosses this limit, major GC is triggered
# [oldmalloc_increase_bytes]
# Amount of memory allocated on the heap for objects. Decreased by major GC
# [oldmalloc_increase_bytes_limit]
# When `:oldmalloc_increase_bytes` crosses this limit, major GC is triggered
#
# If the optional argument, hash, is given,
# it is overwritten and returned.
# This is intended to avoid the probe effect.
#
# This method is only expected to work on CRuby.
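#
# A short usage sketch (the values are omitted because they vary by
# process and Ruby version):
#
#   GC.stat(:count)   # a single statistic
#   stats = {}
#   GC.stat(stats)    # fills and returns the same Hash object
#   stats[:heap_live_slots]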
def self.stat hash_or_key = nil
Primitive.gc_stat hash_or_key
end
# call-seq:
# GC.stat_heap -> Hash
# GC.stat_heap(nil, hash) -> Hash
# GC.stat_heap(heap_name) -> Hash
# GC.stat_heap(heap_name, hash) -> Hash
# GC.stat_heap(heap_name, :key) -> Numeric
#
# Returns information for memory pools in the GC.
#
# If the first optional argument, +heap_name+, is passed in and not +nil+, it
# returns a +Hash+ containing information about the particular memory pool.
# Otherwise, it will return a +Hash+ with memory pool names as keys and
# a +Hash+ containing information about the memory pool as values.
#
# If the second optional argument, +hash_or_key+, is given as +Hash+, it will
# be overwritten and returned. This is intended to avoid the probe effect.
#
# If both optional arguments are passed in and the second optional argument is
# a symbol, it returns the +Numeric+ value of that key for the particular
# memory pool.
#
# On CRuby, +heap_name+ is of the type +Integer+ but may be of type +String+
# on other implementations.
#
# The contents of the hash are implementation specific and may change in
# the future without notice.
#
# This method is only expected to work on CRuby.
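#
# A short usage sketch, assuming CRuby where heap names are Integers
# (the +:slot_size+ key is an assumption and may not exist in every version):
#
#   GC.stat_heap                 # all memory pools, keyed by heap name
#   GC.stat_heap(0)              # a single memory pool
#   GC.stat_heap(0, :slot_size)  # a single value for that pool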
def self.stat_heap heap_name = nil, hash_or_key = nil
Primitive.gc_stat_heap heap_name, hash_or_key
end
# call-seq:
# GC.latest_gc_info -> {:gc_by=>:newobj}
# GC.latest_gc_info(hash) -> hash
# GC.latest_gc_info(:major_by) -> :malloc
#
# Returns information about the most recent garbage collection.
#
# If the optional argument, hash, is given,
# it is overwritten and returned.
# This is intended to avoid the probe effect.
def self.latest_gc_info hash_or_key = nil
Primitive.gc_latest_gc_info hash_or_key
end
if respond_to?(:compact)
# call-seq:
# GC.verify_compaction_references(toward: nil, double_heap: false, expand_heap: false) -> hash
#
# Verify compaction reference consistency.
#
# This method is implementation specific. During compaction, objects that
# were moved are replaced with T_MOVED objects. No object should have a
# reference to a T_MOVED object after compaction.
#
# This function expands the heap to ensure there is room to move all objects,
# compacts the heap to make sure everything moves, updates all references,
# and then performs a full GC. If any object contains a reference to a T_MOVED
# object, that object should be pushed onto the mark stack and will
# cause a SEGV.
def self.verify_compaction_references(toward: nil, double_heap: false, expand_heap: false)
Primitive.gc_verify_compaction_references(double_heap, expand_heap, toward == :empty)
end
end
# call-seq:
# GC.using_rvargc? -> true or false
#
# Returns true if the experimental Variable Width Allocation feature is in
# use, false otherwise.
def self.using_rvargc? # :nodoc:
GC::INTERNAL_CONSTANTS[:SIZE_POOL_COUNT] > 1
end
# call-seq:
# GC.measure_total_time = true/false
#
# Enables or disables measurement of GC time.
# You can get the result with <tt>GC.stat(:time)</tt>.
# Note that GC time measurement can cause some performance overhead.
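#
# A minimal sketch of toggling measurement and reading the result:
#
#   GC.measure_total_time = true
#   GC.start
#   GC.stat(:time)   # total GC time in milliseconds
#   GC.total_time    # total GC time in nanoseconds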
def self.measure_total_time=(flag)
Primitive.cstmt! %{
rb_objspace.flags.measure_gc = RTEST(flag) ? TRUE : FALSE;
return flag;
}
end
# call-seq:
# GC.measure_total_time -> true/false
#
# Returns the measure_total_time flag (default: +true+).
# Note that measurement can affect the application performance.
def self.measure_total_time
Primitive.cexpr! %{
RBOOL(rb_objspace.flags.measure_gc)
}
end
# call-seq:
# GC.total_time -> int
#
# Returns the total measured GC time in nanoseconds.
def self.total_time
Primitive.cexpr! %{
ULL2NUM(rb_objspace.profile.total_time_ns)
}
end
end
module ObjectSpace
def garbage_collect full_mark: true, immediate_mark: true, immediate_sweep: true
Primitive.gc_start_internal full_mark, immediate_mark, immediate_sweep, false
end
module_function :garbage_collect
end