12/08/2018, 16:04

Boot large Ruby/Rails applications faster with bootsnap

Bootsnap is a library that plugs into Ruby, with optional support for ActiveSupport and YAML, to optimize and cache expensive computations.

Boot time drops by roughly 50%, from about 6s to 3s on one machine.
For the Shopify platform, for example, boot is about 75% faster, dropping from around 25s to 6.5s.

The gem works on macOS and Linux.

Add bootsnap to your Gemfile:
```
gem "bootsnap", require: false
```

If you are using Rails, add the following line to config/boot.rb:
```
require "bootsnap/setup"
```

If you are not using Rails, or you want finer control over things, add this to your application setup instead:

require 'bootsnap'
env = ENV['RAILS_ENV'] || "development"
Bootsnap.setup(
  cache_dir:            'tmp/cache',          # Path to your cache
  development_mode:     env == 'development', # Current working environment, e.g. RACK_ENV, RAILS_ENV, etc
  load_path_cache:      true,                 # Optimize the LOAD_PATH with a cache
  autoload_paths_cache: true,                 # Optimize ActiveSupport autoloads with cache
  disable_trace:        true,                 # (Alpha) Set `RubyVM::InstructionSequence.compile_option = { trace_instruction: false }`
  compile_cache_iseq:   true,                 # Compile Ruby code into ISeq cache, breaks coverage reporting.
  compile_cache_yaml:   true                  # Compile YAML into a cache
)

Bootsnap optimizes methods to cache the results of expensive computations. The optimizations fall into two categories:

Path Pre-Scanning:
* Kernel#require and Kernel#load are modified to eliminate $LOAD_PATH scans.
* ActiveSupport::Dependencies.{autoloadable_module?,load_missing_constant,depend_on} are overridden to eliminate scans of ActiveSupport::Dependencies.autoload_paths.

Compilation Caching:
* RubyVM::InstructionSequence.load_iseq is implemented to cache the result of Ruby bytecode compilation.
* YAML.load_file is modified to cache the result of loading a YAML object in MessagePack format (or Marshal, if the message uses types unsupported by MessagePack).

Path Pre-Scanning

(This is a minor evolution of bootscale.) When bootsnap initializes, or when the load path is modified, Bootsnap::LoadPathCache fetches the list of requirable entries from a cache or, if necessary, performs a full scan and caches the result.

Later, when we run (e.g.) require 'foo', Ruby would normally iterate over every entry of $LOAD_PATH ['x', 'y', ...], looking for x/foo.rb, y/foo.rb, and so on. Bootsnap instead looks at the cached requirables of each $LOAD_PATH entry and substitutes the fully expanded path that Ruby would eventually have found. The following diagram shows how the overrides behind the *_path_cache features work.
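The lookup described above can be sketched in a few lines. This is only an illustration of the idea, not bootsnap's implementation: `build_index`, `resolve`, and the throwaway `x`/`y` directories are hypothetical names invented for this example, and only `.rb` files are indexed.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: map every requirable file name (relative to a load
# path entry) to the load path entry that provides it.
def build_index(load_path)
  index = {}
  load_path.each do |dir|
    Dir.glob('**/*.rb', base: dir).each do |rel|
      # Earlier entries win, mirroring $LOAD_PATH precedence.
      index[rel] ||= File.expand_path(dir)
    end
  end
  index
end

# Hypothetical helper: resolve a feature with one hash lookup instead of
# probing every load path entry on disk.
def resolve(index, feature)
  rel = feature.end_with?('.rb') ? feature : "#{feature}.rb"
  dir = index[rel]
  dir && File.join(dir, rel)
end

# Demo tree: x/ is empty, y/ contains foo.rb.
DEMO_ROOT = Dir.mktmpdir
DEMO_PATH = %w[x y].map { |d| File.join(DEMO_ROOT, d) }
DEMO_PATH.each { |d| FileUtils.mkdir_p(d) }
File.write(File.join(DEMO_PATH[1], 'foo.rb'), "FOO = 1\n")

DEMO_INDEX = build_index(DEMO_PATH)
puts resolve(DEMO_INDEX, 'foo')          # absolute path of y/foo.rb
puts resolve(DEMO_INDEX, 'nope').inspect # nil: not provided by any entry
```

The scan happens once, when the index is built; every later lookup is a hash access, which is where the savings come from.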

Bootsnap classifies path entries into two categories, stable and volatile:
* Volatile entries are scanned each time the application boots, and their caches are only valid for 30 seconds.
* Stable entries do not expire: once their contents have been scanned, they are assumed never to change.

The only directories considered stable are those under the Ruby install prefix (RbConfig::CONFIG['prefix'], e.g. /usr/local/ruby or ~/.rubies/x.y.z) and those under Gem.path (e.g. ~/.gem/ruby/x.y.z) or Bundler.bundle_path. Everything else is considered volatile.
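As a rough sketch of that rule, the classification could look like the following. The helper names `stable_roots`, `stable_path?`, and `classify` are made up for this example, and `/srv/my_app/lib` is a hypothetical application directory; bootsnap's own code is organized differently.

```ruby
require 'rbconfig'
require 'rubygems'

# Hypothetical helper: the directories whose contents are assumed not to change.
def stable_roots
  roots = [RbConfig::CONFIG['prefix'], *Gem.path]
  if defined?(Bundler)
    begin
      roots << Bundler.bundle_path.to_s
    rescue StandardError
      # No usable Bundler setup here: skip the Bundler root.
    end
  end
  roots.compact.map { |root| File.expand_path(root) }
end

# A path is stable when it is one of the roots or lives below one of them.
def stable_path?(path, roots = stable_roots)
  full = File.expand_path(path)
  roots.any? { |root| full == root || full.start_with?(root + File::SEPARATOR) }
end

def classify(path)
  stable_path?(path) ? :stable : :volatile
end

puts classify(File.join(RbConfig::CONFIG['prefix'], 'lib')) # under the Ruby prefix
puts classify('/srv/my_app/lib')                            # hypothetical app code
```

Comparing against `root + File::SEPARATOR` rather than `root` alone keeps a sibling such as `/opt/ruby-other` from being mistaken for something under `/opt/ruby`.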

The source code below clarifies how this works:

require_relative '../explicit_require'

module Bootsnap
  module LoadPathCache
    class Cache
      AGE_THRESHOLD = 30 # seconds

      def initialize(store, path_obj, development_mode: false)
        @development_mode = development_mode
        @store = store
        @mutex = defined?(::Mutex) ? ::Mutex.new : ::Thread::Mutex.new # TODO: Remove once Ruby 2.2 support is dropped.
        @path_obj = path_obj
        @has_relative_paths = nil
        reinitialize
      end

      # Does this directory exist as a child of one of the path items?
      # e.g. given "/a/b/c/d" exists, and the path is ["/a/b"], has_dir?("c/d")
      # is true.
      def has_dir?(dir)
        reinitialize if stale?
        @mutex.synchronize { @dirs[dir] }
      end

      # { 'enumerator' => nil, 'enumerator.so' => nil, ... }
      BUILTIN_FEATURES = $LOADED_FEATURES.reduce({}) do |acc, feat|
        # Builtin features are of the form 'enumerator.so'.
        # All others include paths.
        next acc unless feat.size < 20 && !feat.include?('/')

        base = File.basename(feat, '.*') # enumerator.so -> enumerator
        ext  = File.extname(feat) # .so

        acc[feat] = nil # enumerator.so
        acc[base] = nil # enumerator

        if [DOT_SO, *DL_EXTENSIONS].include?(ext)
          DL_EXTENSIONS.each do |dl_ext|
            acc["#{base}#{dl_ext}"] = nil # enumerator.bundle
          end
        end

        acc
      end.freeze

      # Try to resolve this feature to an absolute path without traversing the
      # loadpath.
      def find(feature)
        reinitialize if (@has_relative_paths && dir_changed?) || stale?
        feature = feature.to_s
        return feature if absolute_path?(feature)
        return File.expand_path(feature) if feature.start_with?('./')
        @mutex.synchronize do
          x = search_index(feature)
          return x if x

          # Ruby has some built-in features that require lies about.
          # For example, 'enumerator' is built in. If you require it, ruby
          # returns false as if it were already loaded; however, there is no
          # file to find on disk. We've pre-built a list of these, and we
          # return false if any of them is loaded.
          raise LoadPathCache::ReturnFalse if BUILTIN_FEATURES.key?(feature)

          # The feature wasn't found on our preliminary search through the index.
          # We resolve this differently depending on what the extension was.
          case File.extname(feature)
          # If the extension was one of the ones we explicitly cache (.rb and the
          # native dynamic extension, e.g. .bundle or .so), we know it was a
          # failure and there's nothing more we can do to find the file.
          when '', *CACHED_EXTENSIONS # no extension, .rb, (.bundle or .so)
            nil
          # Ruby allows specifying native extensions as '.so' even when DLEXT
          # is '.bundle'. This is where we handle that case.
          when DOT_SO
            x = search_index(feature[0..-4] + DLEXT)
            return x if x
            if DLEXT2
              search_index(feature[0..-4] + DLEXT2)
            end
          else
            # other, unknown extension. For example, `.rake`. Since we haven't
            # cached these, we legitimately need to run the load path search.
            raise LoadPathCache::FallbackScan
          end
        end
      end

      if RbConfig::CONFIG['host_os'] =~ /mswin|mingw|cygwin/
        def absolute_path?(path)
          path[1] == ':'
        end
      else
        def absolute_path?(path)
          path.start_with?(SLASH)
        end
      end

      def unshift_paths(sender, *paths)
        return unless sender == @path_obj
        @mutex.synchronize { unshift_paths_locked(*paths) }
      end

      def push_paths(sender, *paths)
        return unless sender == @path_obj
        @mutex.synchronize { push_paths_locked(*paths) }
      end

      def each_requirable
        @mutex.synchronize do
          @index.each do |rel, entry|
            yield "#{entry}/#{rel}"
          end
        end
      end

      def reinitialize(path_obj = @path_obj)
        @mutex.synchronize do
          @path_obj = path_obj
          ChangeObserver.register(self, @path_obj)
          @index = {}
          @dirs = Hash.new(false)
          @generated_at = now
          push_paths_locked(*@path_obj)
        end
      end

      private

      def dir_changed?
        @prev_dir ||= Dir.pwd
        if @prev_dir == Dir.pwd
          false
        else
          @prev_dir = Dir.pwd
          true
        end
      end

      def push_paths_locked(*paths)
        @store.transaction do
          paths.map(&:to_s).each do |path|
            p = Path.new(path)
            @has_relative_paths = true if p.relative?
            next if p.non_directory?
            entries, dirs = p.entries_and_dirs(@store)
            # push -> low precedence -> set only if unset
            dirs.each    { |dir| @dirs[dir]  ||= true }
            entries.each { |rel| @index[rel] ||= p.expanded_path }
          end
        end
      end

      def unshift_paths_locked(*paths)
        @store.transaction do
          paths.map(&:to_s).reverse_each do |path|
            p = Path.new(path)
            next if p.non_directory?
            entries, dirs = p.entries_and_dirs(@store)
            # unshift -> high precedence -> unconditional set
            dirs.each    { |dir| @dirs[dir]  = true }
            entries.each { |rel| @index[rel] = p.expanded_path }
          end
        end
      end

      def stale?
        @development_mode && @generated_at + AGE_THRESHOLD < now
      end

      def now
        Process.clock_gettime(Process::CLOCK_MONOTONIC).to_i
      end

      if DLEXT2
        def search_index(f)
          try_index(f + DOT_RB) || try_index(f + DLEXT) || try_index(f + DLEXT2) || try_index(f)
        end
      else
        def search_index(f)
          try_index(f + DOT_RB) || try_index(f + DLEXT) || try_index(f)
        end
      end

      def try_index(f)
        if p = @index[f]
          p + '/' + f
        end
      end
    end
  end
end

Alternatively, see the diagram below to understand how it works:

It is also worth noting how expensive LoadErrors can be. If Ruby runs require 'something' and that file is not on $LOAD_PATH, it takes 2 * $LOAD_PATH.length filesystem accesses to find that out. Bootsnap caches this result too, raising the LoadError without touching the filesystem at all.
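To see where the factor of two comes from, here is a simplified model of a failed lookup. It assumes two candidate extensions per load path entry (`.rb` and `.bundle`, the macOS native extension). `naive_probes`, `NegativeCache`, and the `/a`, `/b`, `/c` load path are hypothetical names for this illustration only, and the negative cache is far cruder than what bootsnap does.

```ruby
# Candidate extensions tried for each load path entry (assumption: macOS).
PROBE_EXTENSIONS = ['.rb', '.bundle'].freeze

# Every path an uncached lookup would have to try for a feature.
def naive_probes(feature, load_path, extensions = PROBE_EXTENSIONS)
  extensions.flat_map do |ext|
    load_path.map { |dir| File.join(dir, feature + ext) }
  end
end

# Hypothetical negative cache: remembers features that were not found, so a
# repeated miss costs no filesystem access.
class NegativeCache
  attr_reader :probe_count

  def initialize(load_path)
    @load_path = load_path
    @missing = {}
    @probe_count = 0
  end

  def missing?(feature)
    return true if @missing[feature]

    found = naive_probes(feature, @load_path).any? do |path|
      @probe_count += 1 # one filesystem access per candidate
      File.file?(path)
    end
    @missing[feature] = true unless found
    !found
  end
end

NOPE_CACHE = NegativeCache.new(%w[/a /b /c])
NOPE_CACHE.missing?('nope')
puts NOPE_CACHE.probe_count # 2 * load_path.length probes for the first miss
NOPE_CACHE.missing?('nope')
puts NOPE_CACHE.probe_count # unchanged: the second miss is answered from memory
```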

Compilation Caching

(A more readable implementation of this concept can be found in yomikomu.) Note: a lot of time is spent loading YAML documents during application boot, and formats such as MessagePack and Marshal are much faster to deserialize than YAML. We use the same compilation-caching strategy for YAML documents, where the equivalent of Ruby's bytecode format is a MessagePack document (or a Marshal stream for documents that use types MessagePack does not support).
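A minimal sketch of the YAML side of this idea follows. It uses Marshal only, because MessagePack is not part of the standard library, and it invalidates the cache with a plain mtime comparison instead of a cache key header. `cached_yaml_load` and the demo file names are hypothetical.

```ruby
require 'yaml'
require 'tmpdir'

# Hypothetical helper: parse the YAML file once, then serve later loads from
# a Marshal dump stored next to it.
def cached_yaml_load(yaml_path, cache_path)
  if File.exist?(cache_path) && File.mtime(cache_path) >= File.mtime(yaml_path)
    return Marshal.load(File.binread(cache_path)) # cache hit: no YAML parsing
  end

  data = YAML.load_file(yaml_path)                # cache miss: parse and store
  File.binwrite(cache_path, Marshal.dump(data))
  data
end

YAML_DEMO_DIR   = Dir.mktmpdir
YAML_DEMO_FILE  = File.join(YAML_DEMO_DIR, 'config.yml')
YAML_DEMO_CACHE = File.join(YAML_DEMO_DIR, 'config.yml.cache')
File.write(YAML_DEMO_FILE, "name: demo\nworkers: 4\n")

FIRST_LOAD  = cached_yaml_load(YAML_DEMO_FILE, YAML_DEMO_CACHE) # parses YAML
SECOND_LOAD = cached_yaml_load(YAML_DEMO_FILE, YAML_DEMO_CACHE) # reads the cache
puts FIRST_LOAD == SECOND_LOAD
```

Only load Marshal data from a cache directory you control, since `Marshal.load` can instantiate arbitrary objects.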

These compilation results are stored in a cache directory, with file names generated by taking a hash (FNV1a-64) of the fully expanded path of the input file.
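For reference, FNV1a-64 is short enough to write out in plain Ruby. The hash itself follows the published FNV-1a constants, but `cache_file_name` and its 16-hex-digit naming are assumptions made for this example; the exact file layout bootsnap uses inside its cache directory is not described here.

```ruby
FNV64_OFFSET_BASIS = 0xcbf29ce484222325
FNV64_PRIME        = 0x100000001b3
UINT64_MASK        = 0xffffffffffffffff

# FNV-1a, 64-bit: XOR each byte into the hash, then multiply by the prime,
# keeping only the low 64 bits.
def fnv1a_64(string)
  string.each_byte.reduce(FNV64_OFFSET_BASIS) do |hash, byte|
    ((hash ^ byte) * FNV64_PRIME) & UINT64_MASK
  end
end

# Hypothetical naming scheme: the hash of the expanded path as 16 hex digits.
def cache_file_name(source_path)
  format('%016x', fnv1a_64(File.expand_path(source_path)))
end

puts cache_file_name('/c/foo.rb')
```

Because the name depends only on the path, the same source file always maps to the same cache entry, and the header stored inside that entry decides whether it is still valid.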

Previously, the sequence of syscalls generated to require a file looked like this:

open    /c/foo.rb -> m
fstat64 m
close   m
open    /c/foo.rb -> o
fstat64 o
fstat64 o
read    o
read    o
...
close   o

With bootsnap, we get:

open      /c/foo.rb -> n
fstat64   n
close     n
open      /c/foo.rb -> n
fstat64   n
open      (cache) -> m
read      m
read      m
close     m
close     n

Bootsnap writes a cache file containing a 64-byte header followed by the cache contents. The header is a cache key made up of several fields:

version: the schema version
os_version: a hash of the current kernel version (macOS, BSD) or glibc version (Linux)
compile_option: changes whenever RubyVM::InstructionSequence.compile_option does
ruby_revision: the version of Ruby
size: the size of the source file
mtime: the last-modification timestamp of the source file
data_size: the number of bytes following the header, which we need in order to read them into a buffer
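To make the layout concrete, here is a minimal sketch of packing such a key into exactly 64 bytes. The field widths (four 32-bit and three 64-bit integers followed by 24 bytes of padding), the `CacheKey` struct, the helper names, and the sample values are all assumptions for illustration, not bootsnap's actual code.

```ruby
# Hypothetical in-memory form of the cache key described above.
CacheKey = Struct.new(:version, :os_version, :compile_option, :ruby_revision,
                      :size, :mtime, :data_size)

# Four uint32 (16 bytes) + three uint64 (24 bytes) + 24 padding bytes = 64.
HEADER_PACK_FORMAT   = 'L4Q3x24'
HEADER_UNPACK_FORMAT = 'L4Q3'

def pack_header(key)
  key.to_a.pack(HEADER_PACK_FORMAT)
end

def unpack_header(bytes)
  CacheKey.new(*bytes.unpack(HEADER_UNPACK_FORMAT))
end

# Sample values are made up.
SAMPLE_KEY    = CacheKey.new(2, 0xdeadbeef, 0, 12_345, 120, 1_534_089_840, 4096)
SAMPLE_HEADER = pack_header(SAMPLE_KEY)

puts SAMPLE_HEADER.bytesize
puts unpack_header(SAMPLE_HEADER) == SAMPLE_KEY
```

A fixed-size header means a reader can fetch the key with a single read and decide whether the rest of the file is worth loading.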

If the key is valid, the result is loaded from the cached value. Otherwise, it is regenerated and overwrites the existing cache entry.
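The hit-or-regenerate rule can be sketched with Ruby's own bytecode API (`RubyVM::InstructionSequence`, MRI only). This is a simplified stand-in: the key here is just the Ruby version plus the source size and mtime, the cache file is a Marshal dump rather than a 64-byte header plus payload, and `source_key` and `load_iseq_cached` are hypothetical names.

```ruby
require 'tmpdir'

# Simplified cache key: Ruby version plus source size and mtime.
def source_key(path)
  stat = File.stat(path)
  [RUBY_VERSION, stat.size, stat.mtime.to_i]
end

# Returns [iseq, :hit] when the stored key still matches the source file,
# otherwise recompiles, overwrites the cache entry, and returns [iseq, :miss].
def load_iseq_cached(path, cache_path)
  key = source_key(path)
  if File.exist?(cache_path)
    cached_key, binary = Marshal.load(File.binread(cache_path))
    if cached_key == key
      return [RubyVM::InstructionSequence.load_from_binary(binary), :hit]
    end
  end

  iseq = RubyVM::InstructionSequence.compile_file(path)
  File.binwrite(cache_path, Marshal.dump([key, iseq.to_binary]))
  [iseq, :miss]
end

ISEQ_DEMO_DIR    = Dir.mktmpdir
ISEQ_DEMO_SOURCE = File.join(ISEQ_DEMO_DIR, 'answer.rb')
ISEQ_DEMO_CACHE  = File.join(ISEQ_DEMO_DIR, 'answer.cache')
File.write(ISEQ_DEMO_SOURCE, "6 * 7\n")

_iseq, first_status = load_iseq_cached(ISEQ_DEMO_SOURCE, ISEQ_DEMO_CACHE)
iseq, second_status = load_iseq_cached(ISEQ_DEMO_SOURCE, ISEQ_DEMO_CACHE)
puts first_status  # first load compiles the source
puts second_status # second load reuses the stored bytecode
puts iseq.eval
```

Editing the source changes its size or mtime, so the stored key no longer matches and the next load recompiles and replaces the entry.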

Imagine we have the following file structure:

/
├── a
├── b
└── c
    └── foo.rb

And this $LOAD_PATH:

["/a", "/b", "/c"]

When we call require 'foo' without bootsnap, Ruby generates this sequence of syscalls:

open    /a/foo.rb -> -1
open    /b/foo.rb -> -1
open    /c/foo.rb -> n
close   n
open    /c/foo.rb -> m
fstat64 m
close   m
open    /c/foo.rb -> o
fstat64 o
fstat64 o
read    o
read    o
...
close   o

With bootsnap, we get:

open      /c/foo.rb -> n
fstat64   n
close     n
open      /c/foo.rb -> n
fstat64   n
open      (cache) -> m
read      m
read      m
close     m
close     n

If we call require 'nope' without bootsnap, we get:

open    /a/nope.rb -> -1
open    /b/nope.rb -> -1
open    /c/nope.rb -> -1
open    /a/nope.bundle -> -1
open    /b/nope.bundle -> -1
open    /c/nope.bundle -> -1

With bootsnap, no syscalls are generated at all.

References: bootsnap

Hoàng Hải Đăng