Using gems with Chef
Installing gems with Chef is relatively painless. Most of the time, you can use the gem_package resource, which behaves very similarly to the native package resource:
gem_package 'httparty'
You can even specify the gem version to install:
gem_package 'httparty' do
  version '0.12.0'
end
You may have also seen the chef_gem resource. What's the difference?
The chef_gem and gem_package resources are both used to install Ruby gems. For any machine on which the chef-client is installed, there are two instances of Ruby. One is the standard, system-wide instance of Ruby and the other is a dedicated instance that is available only to the chef-client. Use the chef_gem resource to install gems into the instance of Ruby that is dedicated to the chef-client. Use the gem_package resource to install all other gems (i.e. install gems system-wide).
In short: if you want Chef to use it, use chef_gem; otherwise, use gem_package. Most of the time, you will want to use gem_package, unless the gem you are installing will be used by Chef core, such as gems that are used in libraries or heavy-weight resources, gems used in report handlers, or extensions to Chef itself (like Chef Sugar).
It is not uncommon to require the use of a third-party gem in a Chef library or heavy-weight resource. For example, you may want to use Nokogiri to parse XML or HTTParty to easily make web requests. But installing and then using these gems is often a hassle and a "chicken-and-egg" problem. In order to use the resource, the gem must be installed on the system. In order for the gem to be installed, a particular recipe must be executed. And things get considerably more complex when the gem installation requires native extensions.
Let's say you have a very simple library that parses some XML:
require 'nokogiri'

module Helper
  def read(url)
    Nokogiri::HTML(open(url))
  end
end
If you try to run the Chef Client, you'll get a nasty error like:
================================================================================
Recipe Compile Error in /var/chef/cache/cookbooks/bacon/libraries/helper.rb
================================================================================
LoadError
---------
cannot load such file -- nokogiri
Cookbook Trace:
---------------
/var/chef/cache/cookbooks/bacon/libraries/helper.rb:1:in `<top (required)>'
Relevant File Content:
----------------------
/var/chef/cache/cookbooks/bacon/libraries/helper.rb:
1>> require 'nokogiri'
2:
3: module Helper
4: def read(url)
5: Nokogiri::HTML(open(url))
6: end
7: end
8:
This is geek speak for "you haven't installed the Nokogiri gem on your system". Since this gem is going to be used inside of Chef (i.e. in a Chef library), we want to use the chef_gem resource. Logically, you would add something like the following to your default recipe:
chef_gem 'nokogiri'
You would think the default recipe would run, install Nokogiri, and then the Chef run would execute successfully. But if you execute this Chef recipe, you'll get the same error as above. That's because Chef loads libraries before it executes recipes. If you have some experience developing Ruby applications, you probably want to move the require 'nokogiri' statement into the method body, since that delays its loading:
module Helper
  def read(url)
    require 'nokogiri'
    Nokogiri::HTML(open(url))
  end
end
This Chef Client run will execute successfully!
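To see why moving the require helps, here is a minimal sketch in plain Ruby (no Chef involved). The library name is made up on purpose so that the require fails whenever it is actually reached; the module and constant names are hypothetical too. Defining the module raises nothing, and the LoadError only shows up once the method is called:

```ruby
# 'surely_not_an_installed_gem' is a made-up name, chosen so the require
# fails as soon as Ruby actually evaluates it.
module LazyHelper
  def self.parse(text)
    # This line runs only when parse is called, not when the module is defined.
    require 'surely_not_an_installed_gem'
    text
  end
end

# Reaching this point means the definition above loaded without error.
# The failure appears only when the method body is executed:
LAZY_LOAD_ERROR =
  begin
    LazyHelper.parse('<xml/>')
    nil
  rescue LoadError => e
    e
  end

puts "Definition succeeded; calling the method raised #{LAZY_LOAD_ERROR.class}"
```

In a Chef run, the gap between "defined" and "called" is what gives chef_gem a chance to install the gem first.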
While this is a good "temporary" solution, it has some serious limitations:
- Having requires in your def statements is a code smell.
- It can easily become repetitive.
- It does not permit class methods or module inclusions/extensions.
Specifically, consider if you wanted to use HTTParty to make HTTP requests. You typically use the HTTParty module as follows:
require 'httparty'
class Requester
  include HTTParty
  base_uri 'api.example.com'
end
And now we are faced with another "chicken-and-egg" problem. In order to include HTTParty and set the base_uri class method, we need to require 'httparty'. But we can't actually require httparty until after our class is entirely loaded (as demonstrated with the nokogiri gem earlier). So our earlier approach will no longer work:
class Requester
  include HTTParty
  base_uri 'api.example.com'

  def initialize
    require 'httparty'
  end
end
It will blow up like this:
================================================================================
Recipe Compile Error in /var/chef/cache/cookbooks/bacon/libraries/helper.rb
================================================================================
NameError
---------
uninitialized constant Requester::HTTParty
Cookbook Trace:
---------------
/var/chef/cache/cookbooks/bacon/libraries/helper.rb:2:in `<class:Requester>'
/var/chef/cache/cookbooks/bacon/libraries/helper.rb:1:in `<top (required)>'
Relevant File Content:
----------------------
/var/chef/cache/cookbooks/bacon/libraries/helper.rb:
1: class Requester
2>> include HTTParty
3: base_uri 'api.example.com'
4:
5: def initialize
6: require 'httparty'
7: end
8: end
9:
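The underlying reason is that Ruby evaluates a class body immediately, while method bodies wait until they are called. The following sketch reproduces the failure in plain Ruby; MissingMixin and EagerRequester are hypothetical names, with MissingMixin standing in for HTTParty on a node where the gem is not installed yet:

```ruby
# MissingMixin is intentionally never defined. It plays the role of HTTParty
# before the gem has been installed and required.
CLASS_BODY_ERROR =
  begin
    class EagerRequester
      include MissingMixin # evaluated right now, while the class is being defined

      def initialize
        require 'missing_mixin' # never reached: the class body already failed
      end
    end
    nil
  rescue NameError => e
    e
  end

puts CLASS_BODY_ERROR.message
```

The require inside initialize never gets a chance to run, because the include on the line above it is evaluated at load time.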
Well this sucks. And there are some really complex solutions to work around these and other issues. I'm sure you've seen at least one recipe like this:
# Install build-essential at runtime, so we can install
# Nokogiri before our library is used.
package 'build-essential' do
  action :nothing
end.run_action(:install)
# Install Nokogiri
chef_gem 'nokogiri'
Solution #1 - Push down the stack
This solution involves a deeper understanding of the Chef internals and the Ruby programming language, but it does offer a fairly elegant solution that covers "most" use cases. It involves delaying the loading of classes and libraries until the last possible minute. It also involves more manual work on the part of users of the cookbook.
- Move require statements so that they immediately follow the chef_gem installations in your recipe:

chef_gem 'httparty'
require 'httparty'

- Convert all your classes to perform any setup in the initialize method:

class Requester
  def initialize
    self.class.send(:include, HTTParty)
    self.class.send(:base_uri, 'api.example.com')
  end
end

These are Rubyisms - we are modifying the parent's eigenclass at runtime. Normally this is a really bad idea, but it provides a fairly elegant solution in this example.

- Update your README/documentation:

# In order to use the foo_resource, you must include the "foo"
# recipe in your run_list.
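Here is a runnable sketch of the initialize trick that does not need HTTParty installed. FakeParty is a hypothetical stand-in that I am assuming behaves like a typical mixin: it adds class-level methods through the included hook. It is not HTTParty's real implementation.

```ruby
# FakeParty is a hypothetical stand-in for HTTParty so this sketch runs
# without any gem installed.
module FakeParty
  # Called by Ruby whenever the module is included into a class.
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # With an argument, store the base URI; without one, return it.
    def base_uri(uri = nil)
      @base_uri = uri if uri
      @base_uri
    end
  end
end

class Requester
  def initialize
    # In a cookbook, the recipe would already have installed and required
    # the gem before the first instance is created.
    self.class.send(:include, FakeParty)
    self.class.send(:base_uri, 'api.example.com')
  end
end

Requester.new
puts Requester.base_uri
```

Note that the setup is repeated on every call to new, which fits the "not performant" item in the list of disadvantages.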
Advantages:
- Easy to convert
- Follows existing patterns
Disadvantages:
- Hack
- Not performant
- Relies on human process (i.e. RTFM)
Solution #2 - Bootstrapping
If you know you are going to need a particular gem on a system, you can create a custom knife bootstrap script to install that gem when Chef is installed on the system. For example:
# ... existing bootstrap
gem update --no-rdoc --no-ri
gem install ohai --no-rdoc --no-ri --verbose
gem install chef --no-rdoc --no-ri --verbose <%= bootstrap_version_string %>
# Add this
gem install nokogiri --no-rdoc --no-ri --verbose
# ... existing bootstrap
Advantages:
- Simple
- One-time operation
Disadvantages:
- Not extensible
- Only works on new systems
Solution #3 - Vendoring
To the best of my knowledge, this is an entirely new approach. I do not know of any cookbooks that currently use this pattern. I have yet to load test or fully evaluate this approach, but I think it offers the most elegant solution of all those here.
This approach involves packaging the gem inside the cookbook. In the Ruby world, this process is referred to as "vendoring a gem". Bundler, for example, vendors Thor. I recommend vendoring the gem inside of files/default/vendor, since it's semantic and is automatically distributed and packaged with the cookbook.
To install a gem, you normally run the command:
gem install GEMNAME
But the install command also accepts an optional list of arguments:
Usage: gem install GEMNAME [GEMNAME ...] [options] -- --build-flags [options]
So, to vendor a gem inside of files/default/vendor, run the following command from inside your cookbook root:
gem install --install-dir files/default/vendor --no-document GEMNAME
- --install-dir tells RubyGems to install the gem inside our cookbook.
- --no-document tells RubyGems to skip the documentation (since we are just packaging this cookbook, documentation is unnecessary).

This will create a few files and folders inside of your cookbook. The gem you installed, as well as any required dependencies, are now packaged in your cookbook! Feel free to inspect files/default/vendor to see everything that is installed. Now we just need to expand our $LOAD_PATH to include this directory at runtime.
At the very top of your library, add the following line:
$:.unshift *Dir[File.expand_path('../../files/default/vendor/gems/**/lib', __FILE__)]
- $: is short for $LOAD_PATH, which is the array of paths Ruby searches when you require a file.
- unshift is an Array method that puts all of its arguments at the front of the array.
- * is the splat operator; it has many uses, but in this instance, it converts the array into a list of parameters to the unshift method.
- Dir[] is equivalent to Dir.glob and behaves very similarly to ls or dir and supports wildcard expansion; it will return an array of file paths on disk.
- File.expand_path converts a relative path to an absolute path; the optional second argument is the location to expand from; in other words, expand the path relative to this current file on disk.
- __FILE__ is a Ruby shortcut for the path to the current file on disk.
So if you were to break this down step-by-step:
- The function is decomposed from the inside-out:

$:.unshift *Dir[File.expand_path("../../files/default/vendor/gems/**/lib", __FILE__)]

- The path is expanded relative to the current file:

$:.unshift *Dir["/var/chef/cache/cookbooks/bacon/files/default/vendor/gems/**/lib"]

- Dir expands the ** on disk:

$:.unshift *[
  "/var/chef/cache/cookbooks/bacon/files/default/vendor/gems/httparty-0.12.0/lib",
  "/var/chef/cache/cookbooks/bacon/files/default/vendor/gems/json-1.8.1/lib",
  "/var/chef/cache/cookbooks/bacon/files/default/vendor/gems/multi_xml-0.5.5/lib"
]

- The splat operator converts the array into method parameters:

$:.unshift(
  "/var/chef/cache/cookbooks/bacon/files/default/vendor/gems/httparty-0.12.0/lib",
  "/var/chef/cache/cookbooks/bacon/files/default/vendor/gems/json-1.8.1/lib",
  "/var/chef/cache/cookbooks/bacon/files/default/vendor/gems/multi_xml-0.5.5/lib"
)

- $: is a shortcut for $LOAD_PATH:

$LOAD_PATH.unshift(...)

- The $LOAD_PATH now includes our gem files at the top:

$LOAD_PATH #=> [
  '/var/chef/cache/cookbooks/bacon/files/default/vendor/gems/httparty-0.12.0/lib',
  '/var/chef/cache/cookbooks/bacon/files/default/vendor/gems/json-1.8.1/lib',
  '/var/chef/cache/cookbooks/bacon/files/default/vendor/gems/multi_xml-0.5.5/lib',
  # Existing $LOAD_PATH
]
We can now require our gem:
require 'httparty'
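If you want to try the load path trick without touching a real cookbook, the sketch below builds a throwaway cookbook layout in a temporary directory and loads a library from it. The gem name fakegem, its version, and the constant names are hypothetical; the directory layout and the unshift line mirror the ones described above.

```ruby
require 'tmpdir'
require 'fileutils'

# Throwaway cookbook layout. 'bacon' is the cookbook name from the examples
# above; 'fakegem' is a hypothetical gem invented for this sketch.
TMP_ROOT = Dir.mktmpdir
at_exit { FileUtils.remove_entry(TMP_ROOT) }

COOKBOOK_ROOT = File.join(TMP_ROOT, 'bacon')
gem_lib = File.join(COOKBOOK_ROOT, 'files/default/vendor/gems/fakegem-1.0.0/lib')
library_file = File.join(COOKBOOK_ROOT, 'libraries/helper.rb')

FileUtils.mkdir_p(gem_lib)
FileUtils.mkdir_p(File.dirname(library_file))
File.write(File.join(gem_lib, 'fakegem.rb'), "module Fakegem; VERSION = '1.0.0'; end\n")

# The cookbook library: extend the load path, then require the vendored gem.
File.write(library_file, <<~'RUBY')
  $:.unshift(*Dir[File.expand_path('../../files/default/vendor/gems/**/lib', __FILE__)])
  require 'fakegem'
RUBY

# Loading the file here stands in for Chef loading the library.
load library_file

puts Fakegem::VERSION
puts $LOAD_PATH.first
```

One thing to keep in mind: the ** pattern matches directories at any depth, so a vendored gem that ships a nested directory named lib would get that directory added to the load path as well.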
Advantages:
- Self-contained
- No third-party dependencies
- Versioned
- No compile-time madness
Disadvantages:
- Size (cookbooks are limited in file size)
- Gem activation errors can still occur
About Seth
Seth Vargo is a Distinguished Software Engineer at Google. Previously he worked at HashiCorp, Chef Software, CustomInk, and some Pittsburgh-based startups. He is the author of Learning Chef and is passionate about reducing inequality in technology. When he is not writing, working on open source, teaching, or speaking at conferences, Seth advises non-profits.