Ruby: Memory Internals and Optimization
Ruby is an interpreted language that abstracts memory usage away, so programmers rarely have to think about it. In C, memory is allocated explicitly with malloc / calloc, which makes memory usage transparent. Ruby offers no such transparency, since everything is managed internally.
Memory Terms
- Stack: Stack memory is used for static memory allocation. Variables on the stack are stored directly in memory, and access is very fast. The size and layout of these allocations are determined at compile time.
- Heap: Heap memory is a pile of memory available to programmers to allocate and de-allocate memory spaces. The memory allocation in the heap happens during the run-time of the program execution. Elements of the heap have no dependencies among each other and can always be accessed randomly at any time.
In Ruby, objects are heap allocated, so essentially all memory usage and allocation happens inside the heap.
Memory Management in Ruby
- ObjectSpace: Inside the system heap, there is another heap called the Ruby heap, or ObjectSpace. The ObjectSpace is where all Ruby objects live; these objects can point to other memory locations inside the machine’s heap.
- Ruby Pages: When Ruby allocates an object, it does not call malloc every single time, since malloc is expensive and costs CPU time. Instead, it allocates a contiguous chunk of memory inside the ObjectSpace to be distributed among Ruby objects. Such a chunk is called a Ruby page and is organized as a linked list of slots. The size of one Ruby page is 16 KB.
- RValues: Each node of the linked list inside a Ruby page is called a slot, and each slot holds one Ruby object stored as an RVALUE. An RVALUE is a C struct that is a union of the various low-level C representations of Ruby objects. One RVALUE is 40 bytes. A 16 KB page is 16,384 bytes, and 16,384 / 40 is about 409; a small part of the page is reserved for bookkeeping, which leaves 408 RVALUEs per page.
If the data of a Ruby object fits within the 40 bytes, the value is saved directly inside the RVALUE. If it exceeds that threshold, extra memory is allocated outside the Ruby heap and the RVALUE stores the location of that memory. For example, a very short string like "hello" has its character data embedded directly in the RVALUE. A long string that exceeds the threshold leaves only a raw pointer in the RVALUE, pointing to where the data actually lives outside the Ruby heap.
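The embedded-versus-pointer distinction described above can be observed from Ruby itself. The following is a minimal sketch, assuming the standard objspace extension is available; string_footprint is a hypothetical helper name, and the exact byte counts depend on the Ruby version, so none are shown.

```ruby
require 'objspace'

# Hypothetical helper: bytes Ruby attributes to a string,
# including any buffer allocated outside the Ruby heap.
def string_footprint(str)
  ObjectSpace.memsize_of(str)
end

short_size = string_footprint("hello")     # small enough to be embedded in the slot
long_size  = string_footprint("x" * 1000)  # data lives outside the Ruby heap

puts "short string: #{short_size} bytes"
puts "long string:  #{long_size} bytes"
```

The long string should report noticeably more than the short one, because its character data is counted in addition to the slot.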
Garbage Collector (GC)
The Ruby garbage collector not only cleans up memory but also allocates memory for the Ruby ObjectSpace from the system heap when there is demand for it.
- Collection: The garbage collector maintains a tree-like structure of objects. It finds the nodes that are no longer required and cuts them from the tree. The main challenge is finding the nodes to eliminate quickly, and several algorithms have been implemented to speed up the collection process. The collector runs in between the execution of the program and pauses it for some time. When all the objects in a Ruby page are freed, the page becomes empty, but its memory is not necessarily released back to the OS.
- Allocation: The ObjectSpace contains Ruby objects stored as 40-byte RVALUEs. To allocate a new object, Ruby looks for the first open slot inside a Ruby page. If no open slot is available, the allocator allocates new Ruby pages. Creating a long string (for example, a 1000-character HTTP response) looks like this:
1. An RVALUE is added to the ObjectSpace list. If the ObjectSpace has no free slot, an extra Ruby page is allocated inside the ObjectSpace using malloc(16384).
2. malloc(1000) is called, and the address of the returned 1000-byte memory location is saved inside the RVALUE of the new slot.
3. The HTTP response string data will reside in that memory location.
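The steps above can be watched indirectly by counting how many objects Ruby allocates while the string is built. This is only a sketch: allocations_during is a hypothetical helper, and it relies on the total_allocated_objects counter reported by GC.stat. The malloc call itself is not visible from Ruby code.

```ruby
# Hypothetical helper: number of objects allocated while the block runs.
def allocations_during
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

response = nil
allocated = allocations_during { response = "x" * 1000 }

puts "objects allocated: #{allocated}"
puts "response bytes:    #{response.bytesize}"
```

Each counted object occupies one slot in a Ruby page; the 1000 bytes of character data sit outside the Ruby heap.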
GC::INTERNAL_CONSTANTS
- RVALUE_SIZE: Size of one RVALUE, 40 bytes.
- HEAP_PAGE_OBJ_LIMIT: Maximum number of objects per Ruby page, 408 for a 16 KB page.
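These constants can be read at runtime and compared with the arithmetic behind them. The sketch below reads them defensively, on the assumption that key names differ between Ruby versions (newer releases appear to expose BASE_SLOT_SIZE instead of RVALUE_SIZE, and HEAP_PAGE_SIZE may be absent on older ones); the fallback values are the figures quoted in this article. slots_upper_bound is a hypothetical helper.

```ruby
# Upper bound on slots per page, ignoring the page header.
def slots_upper_bound(page_size, slot_size)
  page_size / slot_size
end

# Key names are an assumption; fall back to the article's figures if missing.
consts    = defined?(GC::INTERNAL_CONSTANTS) ? GC::INTERNAL_CONSTANTS : {}
slot_size = consts[:RVALUE_SIZE] || consts[:BASE_SLOT_SIZE] || 40
page_size = consts[:HEAP_PAGE_SIZE] || 16 * 1024
obj_limit = consts[:HEAP_PAGE_OBJ_LIMIT]

puts "slot size:          #{slot_size} bytes"
puts "page size:          #{page_size} bytes"
puts "upper bound:        #{slots_upper_bound(page_size, slot_size)} slots"
puts "reported obj limit: #{obj_limit.inspect}"
```

With the article's figures, slots_upper_bound(16 * 1024, 40) gives 409, one more than the 408 that remain once the page header is accounted for.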
GC.stat
- count: Number of times the GC has run.
- heap_allocated_pages: Number of Ruby pages inside the Ruby Heap.
- heap_sorted_length: Actual number of Ruby pages inside the Ruby heap. It is always greater than or equal to heap_allocated_pages, since it also counts free pages that have not been released to the OS.
- heap_allocatable_pages: Number of Ruby pages that are completely free. If Ruby needs a new page, these pages will be populated first.
- heap_available_slots: Total number of slots for Ruby objects inside the Ruby heap, counting both live and free slots.
- heap_live_slots: Total number of slots that are currently being used by Ruby Objects.
- heap_free_slots: Total number of slots that are not being currently used by Ruby Objects.
- heap_eden_pages: Eden pages are pages with at least one live slot. This attribute gives the total number of such pages.
- heap_tomb_pages: Tomb pages are pages that contain no live object. This attribute gives the total number of such pages.
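The attributes listed above can be printed from any running process. This is a small sketch: heap_snapshot and HEAP_STAT_KEYS are hypothetical names, and since some keys are not reported by every Ruby version, missing ones are shown as "n/a" rather than assumed.

```ruby
HEAP_STAT_KEYS = %i[
  count heap_allocated_pages heap_sorted_length heap_allocatable_pages
  heap_available_slots heap_live_slots heap_free_slots
  heap_eden_pages heap_tomb_pages
].freeze

# Hypothetical helper: picks the requested keys out of GC.stat.
def heap_snapshot(keys)
  stats = GC.stat
  keys.map { |key| [key, stats[key]] }.to_h
end

heap_snapshot(HEAP_STAT_KEYS).each do |key, value|
  puts "#{key}: #{value.nil? ? 'n/a' : value}"
end
```

Taking a snapshot before and after a piece of code is a simple way to see how it changes the heap.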
Memory Fragmentation
The GC collects objects that are no longer required and removes them from Ruby pages. This leaves many empty slots scattered across those pages, and a page's memory cannot be given back unless all of its slots are freed. The result is memory fragmentation: memory that is held but unused. Moreover, Ruby has traditionally not moved objects in the Ruby heap from one page to another, since doing so could break C extensions that hold raw pointers to a Ruby object. To reduce fragmentation, the GC puts new objects into eden pages first, and into tomb pages only after all the eden pages have filled up.
Calculation of Memory Fragmentation
Memory usage = live_slots / (eden_pages * slots_per_page) * 100%
Memory fragmentation = 100% - Memory usage
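The two formulas translate directly into code. The sketch below uses hypothetical helper names, first with made-up numbers (2 eden pages of 408 slots, 612 of them live) and then, where the needed values are reported, with the current process. On Ruby versions that use several slot sizes, the live figure is only a rough estimate.

```ruby
# Memory usage = live_slots / (eden_pages * slots_per_page) * 100%
def memory_usage_percent(live_slots, eden_pages, slots_per_page)
  live_slots.to_f / (eden_pages * slots_per_page) * 100
end

# Memory fragmentation = 100% - Memory usage
def fragmentation_percent(live_slots, eden_pages, slots_per_page)
  100.0 - memory_usage_percent(live_slots, eden_pages, slots_per_page)
end

# Illustrative numbers only: 612 live slots out of 2 * 408 = 816.
puts fragmentation_percent(612, 2, 408)  # 25.0

# Same calculation for this process, if the values are available.
stats = GC.stat
limit = defined?(GC::INTERNAL_CONSTANTS) ? GC::INTERNAL_CONSTANTS[:HEAP_PAGE_OBJ_LIMIT] : nil
eden  = stats[:heap_eden_pages]
if limit && eden && eden > 0
  puts format("current fragmentation: %.1f%%",
              fragmentation_percent(stats[:heap_live_slots], eden, limit))
end
```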
Memory Optimisation at Configuration Level
Setting MALLOC_ARENA_MAX for threaded applications that use Sidekiq or Puma can help decrease memory usage and prevent memory spikes. Using jemalloc as a substitute for the default malloc allocator also tends to give better memory behavior. (Ref: https://www.speedshop.co/2017/12/04/malloc-doubles-ruby-memory.html)
Memory fragmentation can also be curbed by tuning GC variables, but this requires in-depth knowledge of how the GC works.
Memory Optimisation at Code Level
- Using built-in object types: Using built-in object types is often the best way to conserve memory, at the cost of a good abstraction.
- Having Ruby’s internals in mind: Having Ruby’s internals in mind when designing classes and systems can reduce memory usage. Here is a list of memory consumption by various built-in objects inside ObjectSpace.
Memory usage differs between data structures, and it grows as elements are added to a data structure. Suppose we need a class that holds 4 key-value pairs: a Struct is the optimal choice, since it consumes only 72 bytes of memory. A plain object uses 80 bytes, while a hash consumes 192 bytes.
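The comparison can be reproduced on any machine with the objspace extension. This is a sketch in which PointStruct, PointObject and footprints are hypothetical names chosen for illustration; the byte counts vary by Ruby version and may not match the figures quoted above, so no expected output is given.

```ruby
require 'objspace'

# Three ways to hold the same 4 key-value pairs.
PointStruct = Struct.new(:a, :b, :c, :d)

class PointObject
  def initialize(a, b, c, d)
    @a, @b, @c, @d = a, b, c, d
  end
end

# Hypothetical helper: memory attributed to each representation.
def footprints
  {
    struct: ObjectSpace.memsize_of(PointStruct.new(1, 2, 3, 4)),
    object: ObjectSpace.memsize_of(PointObject.new(1, 2, 3, 4)),
    hash:   ObjectSpace.memsize_of({ a: 1, b: 2, c: 3, d: 4 })
  }
end

footprints.each { |kind, bytes| puts "#{kind}: #{bytes} bytes" }
```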
- Re-using objects (especially immutable): Sometimes it makes sense to retain objects for reuse rather than recreate them again and again. Ruby has this feature built in for strings. Calling freeze on a string literal tells the interpreter that the string will not be modified, so the same object can be reused.
In the first instance, Ruby stores 10 objects for strings during execution. In the second instance where the freeze is used, Ruby stores only one string object with 10 references to that object. That string object is retained by Ruby to be used further.
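A minimal sketch of the two cases follows, assuming the file does not enable frozen string literals (with that setting the first case would also share one object). distinct_objects is a hypothetical helper that counts how many different objects an array holds.

```ruby
# Hypothetical helper: number of distinct objects in the array.
def distinct_objects(strings)
  strings.map(&:object_id).uniq.size
end

# First case: every evaluation of the literal creates a new string.
plain = Array.new(10) { "hello" }

# Second case: the frozen literal is shared, so all 10 entries
# reference the same string object.
frozen = Array.new(10) { "hello".freeze }

puts distinct_objects(plain)   # 10 when frozen string literals are not enabled
puts distinct_objects(frozen)  # 1
```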
- Implementing In-place memory algorithms: This helps in not allocating additional objects at all. For example, if there is an array whose values need to be mapped, either Array#map or Array#map! can be used. The difference is that the first creates a new array whereas the second one modifies the array in-place.
- Preventing reference of a constant to an object: Objects cannot be garbage collected when they are referenced by a constant. The same goes for global variables, modules, and classes.
- Avoiding constants for regular expressions: There is no need to store a regular expression in a constant, as all regular expression literals are frozen by the Ruby interpreter.
- Avoiding loading all the objects at once: Objects should not all be loaded at once, since pulling them into the ObjectSpace can consume the available OS heap memory and crash the application.
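The in-place point from the list above can be illustrated with Array#map and Array#map!. This is a small sketch with made-up values: equal? checks object identity, so it shows whether a new array was allocated.

```ruby
values = [1, 2, 3]

# Array#map allocates and returns a new array.
doubled = values.map { |v| v * 2 }
puts doubled.equal?(values)  # false

# Array#map! rewrites the existing array and returns it.
result = values.map! { |v| v * 2 }
puts result.equal?(values)   # true
puts values.inspect          # [2, 4, 6]
```

The in-place variant avoids the extra array, but it also changes the data for every other holder of a reference to it, so it suits arrays that are not shared.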