The Cleanroom Pattern in Ruby
I recently gave a talk at Philly.rb, the Ruby meetup in Philadelphia, PA, entitled "The Cleanroom Pattern - More safely evaluating DSLs with Ruby". You can watch the full Cleanroom DSL video online, but I also decided to write up the contents of the talk as a blog post.
Background on DSLs
Most Ruby-based DSLs are created using a simple instance_eval. While slightly less dangerous than eval, instance_eval still opens the system up to dangerous circumstances. Consider the following DSL file, Project, which has a name attribute:
class Project
  NULL = Object.new.freeze
  def name(val = NULL)
    if val.equal?(NULL)
      @name
    else
      @name = sanitize(val)
    end
  end
  private
  def sanitize(string)
    string.gsub(/\s+/, '-').downcase
  end
end
There are a few things to take note of here:
- A new, frozen object is created to represent "NULL". While it is true that Ruby has a native implementation of nil, having a default value of nil would actually prevent the user from setting the value to nil (since that would be assumed to have passed "nothing"). You may have seen this problem when working with some Chef resources.
- There is a DSL method called name, which is essentially overloaded as two methods. When given no parameters, the method simply returns the instance variable @name. When given a value, the value is sanitized and then set on the @name instance variable.
- There is a private sanitize method that replaces each run of space-like characters with a dash and downcases the result (because otherwise the world explodes!).
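To make the first point concrete, here is a small comparison sketch. NilDefaultProject and SentinelProject are hypothetical classes written only for this illustration, and sanitizing is left out so that nil can actually be stored:

```ruby
# Hypothetical class: uses nil as the "no argument given" marker.
class NilDefaultProject
  def name(val = nil)
    if val.nil?
      @name
    else
      @name = val
    end
  end
end

# Hypothetical class: uses a frozen sentinel object instead.
class SentinelProject
  NULL = Object.new.freeze

  def name(val = NULL)
    if val.equal?(NULL)
      @name
    else
      @name = val
    end
  end
end

with_nil = NilDefaultProject.new
with_nil.name("hamlet")
with_nil.name(nil)      # looks like a read, so nothing is cleared
with_nil.name           #=> "hamlet"

with_sentinel = SentinelProject.new
with_sentinel.name("hamlet")
with_sentinel.name(nil) # an explicit nil is treated as a real write
with_sentinel.name      #=> nil
```

The sentinel only works because equal? compares object identity, so no value a user passes can be mistaken for "nothing was passed".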
Inside your system, you would likely load this DSL file as such:
path = '/path/to/dsl.rb'
contents = File.read(path)
project = Project.new
project.instance_eval(contents, File.basename(path), 0)
project
So, given a DSL file like:
name "hamlet"
The loading process would result in a #<Project> object with a name of "hamlet":
project.name #=> "hamlet"
Problem 1 - Private Methods
During an instance_eval (or instance_exec), the entire instance is exposed to the user - it is just as if you were writing code directly in project.rb in a text editor. That means public, protected, and private methods are all accessible by the user:
Project.new.instance_eval do
  sanitize("String Here")
end
#=> "string-here"
This is not "terrible", since Rubyists are quite familiar with the ability to use send (or __send__) to call these methods anyway. However, it is very unclear to the DSL author what methods are public and private.
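As a quick side-by-side, the sketch below calls the same private method from outside the object and from inside an instance_eval block. The Project class is trimmed down to just the sanitize method for this illustration:

```ruby
# Trimmed-down version of the Project class, for illustration only.
class Project
  private

  def sanitize(string)
    string.gsub(/\s+/, '-').downcase
  end
end

project = Project.new

# From the outside, the private method is rejected.
outside = begin
  project.sanitize("String Here")
rescue NoMethodError
  :no_method_error
end
outside #=> :no_method_error

# Inside instance_eval, self is the project, so the call goes through.
inside = project.instance_eval { sanitize("String Here") }
inside #=> "string-here"
```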
Problem 2 - Scope Creep
The sanitize method has a pretty generic name. Since these are Ruby DSLs, it is feasible that a savvy developer may create "helper" methods to ease the development process. Consider the following DSL file:
#
# Define a +sanitize+ method that uppercases the value for ...
#
# @param [#to_s] string
# the string to parameterize
#
# @return [String]
#
def sanitize(string)
  string.to_s.upcase
end
name "Some String"
The resulting output would be:
project.name #=> "SOME STRING"
The DSL author unintentionally changed the behavior of the instance by making a simple helper method. While this is a contrived example, consider larger DSL-based projects like Chef or Omnibus which have hundreds of tiny helper methods - the possibility of collision is much higher.
Thankfully, since this is instance_eval, the change to the sanitize method is scoped to this one instance (meaning changing the method here does not change it for future evaluations). We only edited the eigenclass.
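The scoping can be checked with a short sketch. The file names helper.dsl and plain.dsl are made up for this example and are only used as labels passed to instance_eval:

```ruby
class Project
  NULL = Object.new.freeze

  def name(val = NULL)
    if val.equal?(NULL)
      @name
    else
      @name = sanitize(val)
    end
  end

  private

  def sanitize(string)
    string.gsub(/\s+/, '-').downcase
  end
end

# DSL source that defines its own "helper" with a colliding name.
dsl_with_helper = <<~RUBY
  def sanitize(string)
    string.to_s.upcase
  end

  name "Some String"
RUBY

first = Project.new
first.instance_eval(dsl_with_helper, 'helper.dsl', 1)

# A second instance evaluated afterwards still uses the original method.
second = Project.new
second.instance_eval('name "Some String"', 'plain.dsl', 1)

first.name  #=> "SOME STRING"
second.name #=> "some-string"
```

The def inside the evaluated source lands on the singleton class of first, which is why second is unaffected.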
Problem 3 - Bypassing Validation
Consider a user who really wants to have spaces in their project name. They could easily bypass the entire system by just setting the instance variable manually:
@name = "My Custom Name"
When this file is evaluated in the context of the Project object:
project.name #=> "My Custom Name"
The user has completely circumvented our sanitize method by just accessing the instance variable directly. Worse, this is an intentional design in Ruby:
In order to set the context, the variable self is set to obj while the code is executing, giving the code access to obj's instance variables.
Problem 4 - Persisted Changes
The biggest problem with instance_eval is that it gives you access to self, an instance of the Project class in these examples. self has access to its class, so truly malicious code could permanently change the behavior of future instance_evals. A very clever developer could permanently change the behavior of sanitize for all future instances of this class (until project.rb is reloaded from disk):
Project.new.instance_eval do
  self.class.class_eval do
    def sanitize(val)
      val.upcase
    end
  end
end
Uh oh!
Project.new.sanitize("foo") #=> "FOO"
Project.new.sanitize("foo") #=> "FOO"
Project.new.sanitize("foo") #=> "FOO"
Project.new.sanitize("foo") #=> "FOO"
This code has permanently changed the behavior of the instance's sanitize method (note how I am creating a new instance). If you are writing a Ruby application that accepts a user-given DSL or dealing with a long-running Ruby process, a malicious user could alter the underlying state of the system in memory.
Explaining the Cleanroom Pattern
The cleanroom pattern is an idiomatic way to evaluate Ruby DSLs in an isolated environment while restricting the methods and level of access a user has. I want to be clear: I did not invent the cleanroom pattern! It can be found in Metaprogramming Ruby books, various blog posts, and popular community projects. I actually learned of the cleanroom pattern from my good friend and fellow Berkshelf-core-team member Jamie Winsor, so thanks!
The general pattern for a cleanroom looks like this:

- The class defines which methods should be exposed on its DSL instances.
- During evaluation a new, anonymous instance, which only has those defined methods, is created. This object is created in the top-level Object space to prevent leaking.
- The Ruby file is instance_evaled against this anonymous instance, which has very restricted access to the parent instance.
- The anonymous instance then proxies data back to the original instance using public_send.
Thus there are three areas of protection:
- The class defines which values are public within the DSL. Only those methods exist on the anonymous instance, thus preventing namespace collisions.
- The anonymous instance is created fresh, each time. Even if a malicious attacker is able to craft something to permanently modify the class, it would only persist for that anonymous instance, which is cleaned up during the next GC run.
- The anonymous instance proxies back to the "real" instance using public_send. So, even if an attacker was able to bypass all the existing mitigations, they would only be able to call public methods on the instance.
The code for creating the cleanroom object is a bit complex and meta:
def cleanroom
  Class.new(Object) do # <1>
    define_method(:initialize) do |instance| # <2>
      define_singleton_method(:__instance__) do # <3>
        unless caller[0].include?(__FILE__) # <4>
          raise Cleanroom::InaccessibleError.new(:__instance__, self)
        end
        instance # <5>
      end
    end
    exposed.each do |exposed_method| # <6>
      define_method(exposed_method) do |*args, &block|
        __instance__.public_send(exposed_method, *args, &block)
      end
    end
  end
end
- <1> First we create a new anonymous class inheriting from Object.
- <2> Next we dynamically define an #initialize method on the class which accepts an instance as the parameter. In normal Ruby:
    def initialize(instance)
      # ...
    end
- <3> During initialization, a new singleton method is created on the eigenclass of the instance. This is basically the same as a regular def method, but it only exists inside the context of this instance. Furthermore, we create it during initialization, thus allowing us to bind to the parent, giving us access to the given instance parameter. Basically we are doing this:
    def initialize(instance)
      @instance = instance
    end
    def __instance__
      @instance
    end
  But since this anonymous class is what gets instance_evaled, exposing the real instance as an instance variable would allow an attacker to completely bypass the system (remember, instance variables are within scope during an instance_eval)! Instead, we are creating a dynamic method at runtime that refers to the parameter given to the #initialize method. This allows us to "store" the value in a method, but not expose it in an instance variable.
- <4> Inside the aforementioned method, we add an extra guard that only permits the method to be called from inside self. This is a major hack, but we inspect the caller object and make sure the __instance__ method was called from the file we are currently running (not from a DSL file).
- <5> If an error was not raised, we return the instance that was given to us in the #initialize method.
- <6> For each exposed method (which I have just called exposed in the code snippet), we define a method and public_send to __instance__.
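To see the moving parts run end to end, here is a deliberately simplified sketch. It is not the cleanroom gem's implementation: EXPOSED, evaluate, and build_cleanroom are names invented for this example, and it skips the __instance__ accessor and its caller guard by capturing the real instance directly in each proxy method's closure.

```ruby
class Project
  NULL = Object.new.freeze
  EXPOSED = [:name].freeze # hypothetical list of DSL methods

  def name(val = NULL)
    if val.equal?(NULL)
      @name
    else
      @name = sanitize(val)
    end
  end

  # Evaluate DSL source against a throwaway proxy, never against self.
  def evaluate(source, filename = 'project.dsl')
    build_cleanroom.instance_eval(source, filename, 1)
    self
  end

  private

  def sanitize(string)
    string.gsub(/\s+/, '-').downcase
  end

  # Build a fresh anonymous class and instance for every evaluation.
  def build_cleanroom
    exposed = EXPOSED
    Class.new(Object) do
      define_method(:initialize) do |instance|
        exposed.each do |exposed_method|
          # The real instance lives only in this closure, not in an ivar.
          define_singleton_method(exposed_method) do |*args, &block|
            instance.public_send(exposed_method, *args, &block)
          end
        end
      end
    end.new(self)
  end
end

# Exposed methods are proxied through to the real instance.
Project.new.evaluate('name "Hello World"').name #=> "hello-world"

# Instance variables set in the DSL land on the proxy, not the project.
Project.new.evaluate('@name = "My Custom Name"').name #=> nil

# Private methods simply do not exist on the proxy.
blocked = begin
  Project.new.evaluate('sanitize("String Here")')
  false
rescue NoMethodError
  true
end
blocked #=> true
```

Dropping the accessor keeps the sketch short. The version described above keeps __instance__ and protects it with the caller check instead.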
Using the Cleanroom
Fortunately you do not need to understand all of this to utilize a cleanroom in your projects! I have wrapped all this logic, plus tests and custom RSpec matchers into the cleanroom gem. The gem is already in use in popular projects like Omnibus and Berkshelf, and you can easily use it too!
After you have added the cleanroom gem to your Gemfile and executed the bundle command to install, simply require and include the Cleanroom module in any DSL:
# my_dsl_file.rb
require 'cleanroom'
class MyDSLFile
  include Cleanroom
end
Immediately, without writing any code, you have been given access to the following methods:
- MyDSLFile.evaluate_file - evaluate a file against an instance
- MyDSLFile.evaluate - evaluate raw Ruby (as a String) or a block against an instance
- MyDSLFile#evaluate_file - evaluate a file against this instance
- MyDSLFile#evaluate - evaluate raw Ruby (as a String) or a block against this instance
For example:
dsl = MyDSLFile.new
dsl.evaluate_file('/path/to/file.rb')
dsl #=> #<MyDSLFile:0xabc123>
For each method you want to be exposed as part of the DSL API (which may be separate from the public API), simply call expose:
require 'cleanroom'
class MyDSLFile
  include Cleanroom
  def some_dsl_method
    # ...
  end
  expose :some_dsl_method
end
With just that one additional line of code for the expose method, you get all of the features and magic described before. Go ahead and try it out!
- Example cleanroom method Project#name in Omnibus
- Example cleanroom method Berksfile#extension in Berkshelf
Final Thoughts
The slides and video from my talk are linked above, but I have included them here as well:
On a final note - there is still much exploration to be done in this area. Perhaps the DSL evaluation should set Ruby's $SAFE level or guard against system or %x calls... maybe it should not. The cleanroom pattern and gem have been especially useful in my daily work, and I really hope you benefit from them as well!
About Seth
Seth Vargo is a Distinguished Software Engineer at LinkedIn. Previously he worked at Google, HashiCorp, Chef, CustomInk, and some Pittsburgh-based startups. He is the author of Learning Chef and is passionate about reducing inequality in technology. When he is not writing, working on open source, teaching, or speaking at conferences, Seth mentors other engineers, advises non-profits, and invests in startups.