Authorizers, Extractors, and Policy objects


Recently I was working on a Rails 4 project and, much to my surprise, my favorite authorization framework was not supported! CanCan had long been my "go-to" framework for its simplicity and readability. I started searching the Internet for alternative gems, but many of them were also "not-Rails-4-ready" or had not had activity in months. Then I found a blog post on the subject. As a Ruby developer, I cannot believe I did not think of this on my own! At the end of the post, the team showcases Pundit - a Ruby gem that encapsulates the content of the blog post with some nice documentation and syntactic sugar. I tried Pundit, and I really liked it, but it was not going to work... and here is why: Rails policy objects are misunderstood. But first, some background.

The project was a Rails 4 + PostgreSQL stack with OAuth authentication. Like most application developers, I wrote the authentication component before the authorization component (you have to log in before you can assign permissions). The application used only OmniAuth for authentication, with multi-provider support. If you have ever worked with OAuth, you know that the OmniAuth gem is your savior. That being said, each OAuth provider returns slightly different information in slightly different formats. So you often end up with a huge User model or concern with a bunch of methods that perform almost identical operations with minimal variation.

class User < ActiveRecord::Base
  # These are class methods: each one builds a new User from an auth hash.
  def self.user_from_facebook(auth)
    create! do |user|
      user.username = auth['info']['username']
      user.first_name = auth['info']['first_name']
      user.last_name = auth['info']['last_name']
    end
  end

  def self.user_from_twitter(auth)
    create! do |user|
      user.username = auth['info']['nickname']
      user.first_name, user.last_name = auth['info']['name'].split(' ', 2)
    end
  end

  def self.user_from_linkedin(auth)
    # ...
  end

  def self.user_from_github(auth)
    # ...
  end
end

This pattern (Railscast) works really well for a single provider, but becomes a violation of the DRY principle when implementing multi-provider OAuth... I stumbled across Clean OAuth for Rails by David Lesches, which uses an object-oriented approach and policy objects to extract information from various OmniAuth auth hashes. If you are familiar with object-oriented programming, you effectively create an interface that defines all the common methods, and then each policy object implements that interface, accessing/manipulating the correct information in the auth hash. So it looks like this (and please read David's full post for a detailed explanation - he does a very good job):

# app/policies/github_policy.rb
class GithubPolicy
  attr_reader :auth

  def initialize(auth)
    @auth = auth
  end

  def first_name
    auth['info']['first_name']
  end
end

# app/policies/twitter_policy.rb
class TwitterPolicy
  attr_reader :auth

  def initialize(auth)
    @auth = auth
  end

  def first_name
    auth['info']['name'].split(' ').first
  end
end
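
To make the "shared interface" idea concrete, here is a small self-contained sketch. The auth hashes are trimmed-down, made-up stand-ins (real OmniAuth hashes carry many more keys), and the two classes simply repeat the ones above so the snippet runs on its own.

```ruby
class GithubPolicy
  attr_reader :auth

  def initialize(auth)
    @auth = auth
  end

  def first_name
    auth['info']['first_name']
  end
end

class TwitterPolicy
  attr_reader :auth

  def initialize(auth)
    @auth = auth
  end

  def first_name
    auth['info']['name'].split(' ').first
  end
end

# Hypothetical sample hashes, shaped differently per provider.
GITHUB_AUTH  = { 'provider' => 'github',  'info' => { 'first_name' => 'John' } }
TWITTER_AUTH = { 'provider' => 'twitter', 'info' => { 'name' => 'John Doe' } }

# The caller only talks to #first_name and never looks inside the hash.
policies = [GithubPolicy.new(GITHUB_AUTH), TwitterPolicy.new(TWITTER_AUTH)]
policies.each { |policy| puts policy.first_name }
```

Both lines print the same first name even though the two hashes store it differently, which is the whole point of the pattern.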

And then in your User model, instantiate the policy object, and use the declarative interface for a clean, DRY method body.

# app/models/user.rb

def self.from_oauth(auth)
  policy = "#{auth['provider']}_policy".classify.constantize.new(auth)
  create! do |user|
    user.username = policy.username
    user.first_name = policy.first_name
    user.last_name = policy.last_name
  end
end

Awesomesauce. But this actually causes a conflict of interest with Pundit. I added Pundit to the Gemfile and started making policy objects as instructed in the README. Wait... I already have policy objects for OmniAuth though. So my app/policies began looking like this:

$ ls app/policies

account_policy.rb
github_policy.rb
post_policy.rb
twitter_policy.rb
user_policy.rb

Whoa. Wait, I am confused. So twitter_policy.rb and github_policy.rb are for OmniAuth, but account_policy.rb, post_policy.rb, and user_policy.rb are for Pundit? Just to add to the confusion, the app needed an actual ActiveRecord model for storing Twitter information... So what did twitter_policy.rb actually refer to - the model authorization or the OmniAuth policy?

And thus begins the journey of my refactoring, all of which can be captured in this single statement:

Just because James Golick's original post on policy objects called them "policy objects" does not mean they are "policies".

Let me take a step back - I love the idea of policy objects. But please stop calling them policies; 99% of the time, they aren't actually policies - they are policy objects. So I refactored our OmniAuth policies and Pundit policies to use the idea of policy objects, but called them what they actually are:

  • OmniAuth Policy: Extractor
  • Pundit Policy: Authorizer

So first I made a "true" abstract class modeled after the Rails "base class" pattern:

# lib/extractor/base.rb

module Extractor
  class Base
    class << self
      #
      # Finds an extractor object for the given auth hash and returns a new
      # extractor instance for the auth hash.
      #
      # @param [Hash]
      #   the hash returned from omniauth
      #
      # @return [~Extractor::Base]
      #
      def load(auth)
        provider = auth['provider'].classify

        begin
          "#{provider}Extractor".constantize.new(auth)
        rescue NameError
          raise RuntimeError, "#{provider} is not a valid extractor!"
        end
      end
    end

    # @return [OmniAuth::AuthHash]
    attr_reader :auth

    #
    # Create a new instance of this extractor object.
    #
    # @param [OmniAuth::AuthHash] auth
    #
    def initialize(auth)
      @auth = auth
    end

    #
    # The unique signature (like composite key) used by ActiveRecord to
    # identify this extractor.
    #
    # @example
    #   { provider: 'github', uid: 'abcd1234' }
    #
    # @return [Hash]
    #
    def signature
      { provider: provider, uid: uid }
    end

    #
    # The name for this extractor.
    #
    # @return [String]
    #
    def provider
      auth['provider']
    end

    #
    # @abstract The first name of the OAuth user.
    #
    # @return [String, nil]
    #
    def first_name; end

    # ...snip...
  end
end

Notice a few key changes:

  1. The Extractor::Base.load method now handles loading a given auth provider, and gracefully fails when it cannot: Extractor::Base.load(auth)
  2. The initialize method is part of the base class; the repetitive code is eliminated.
  3. All methods return nil by default.
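
The points above can be tried without Rails using a rough plain-Ruby sketch. It is an approximation, not the code from the application: String#capitalize and Object.const_get stand in for ActiveSupport's classify and constantize, and the sample hash is invented.

```ruby
module Extractor
  class Base
    # Plain-Ruby stand-in for the classify/constantize lookup.
    def self.load(auth)
      provider = auth['provider'].to_s.capitalize
      Object.const_get("#{provider}Extractor").new(auth)
    rescue NameError
      raise RuntimeError, "#{provider} is not a valid extractor!"
    end

    attr_reader :auth

    def initialize(auth)
      @auth = auth
    end

    # Abstract attributes: nil unless a subclass overrides them.
    def first_name; end
    def last_name; end
  end
end

# Only overrides what this (made-up) hash actually provides.
class GithubExtractor < Extractor::Base
  def first_name
    auth['info']['first_name']
  end
end

extractor = Extractor::Base.load({ 'provider' => 'github', 'info' => { 'first_name' => 'John' } })
extractor.first_name # => "John"
extractor.last_name  # => nil

begin
  Extractor::Base.load({ 'provider' => 'myspace' })
rescue RuntimeError => e
  puts e.message # prints "Myspace is not a valid extractor!"
end
```

One thing to keep in mind with this shape: the rescue also catches a NameError raised inside an extractor's own initialize, so a bug there would be reported as an invalid provider.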

And an individual extractor (which lives under app/extractors) looks like this:

# app/extractors/twitter_extractor.rb

class TwitterExtractor < Extractor::Base
  def first_name
    auth['info']['name'].split(' ', 2).first
  end

  # ...snip...
end
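
A quick aside on why the extractor passes a limit of 2 to split: it keeps multi-word remainders together. The split_name helper below is hypothetical and exists only to show the behavior.

```ruby
# Hypothetical helper: split a display name into [first, rest].
# The limit of 2 means everything after the first space stays in one piece.
def split_name(full_name)
  first, last = full_name.to_s.split(' ', 2)
  [first, last]
end

split_name('John Doe')       # => ["John", "Doe"]
split_name('Mary Ann Smith') # => ["Mary", "Ann Smith"]
split_name('Cher')           # => ["Cher", nil]
```

Names are messy, so treat this as a best-effort guess rather than a reliable parser.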

Extracting this policy into an Extractor object has significantly improved the readability and reuse of this pattern. This is still a policy object, but it is now much clearer what this object does. Additionally, this new pattern is incredibly easy to test. Each extractor has its own spec in spec/extractors:

# spec/extractors/github_extractor_spec.rb

describe GithubExtractor do
  let(:auth) { OmniAuth.config.mock_auth[:github] }
  subject { described_class.new(auth) }

  its(:first_name)  { should eq('John') }
  its(:last_name)   { should eq('Doe') }
  its(:email)       { should eq('johndoe@example.com') }
  its(:username)    { should eq('johndoe') }
  its(:image_url)   { should eq('https://image-url.com') }
  its(:uid)         { should eq('12345') }
  its(:oauth_token) { should eq('oauth_token') }
end

Now that we have moved our OmniAuth policy out of the "policy" namespace, let's apply the same pattern to Pundit. And this is where things went downhill :(. Pundit makes a lot of assumptions about where your code lives, and does not provide an easy way to override those assumptions. And that is totally okay! I think adding such functionality would severely complicate the awesomely-simple codebase and make it less beginner-friendly. So I extracted the ideas behind Pundit into a more custom solution called "Authorizers".

# lib/authorizer/base.rb

module Authorizer
  class Base
    # @return [User]
    attr_reader :user

    # @return [Object]
    attr_reader :record

    #
    # Create a new authorizer for the given user and record.
    #
    # @param [User] user
    # @param [Object] record
    #
    def initialize(user, record)
      @user = user || User.new
      @record = record
    end

    #
    # In development and test, raise an exception if a method is called that
    # isn't defined on the authorizer. In other environments, just return
    # false, assuming the action is unauthorized.
    #
    # @raise [RuntimeError]
    #   in development and test, when an undefined method is called
    #
    def method_missing(m, *args, &block)
      if Rails.env.development? || Rails.env.test?
        raise RuntimeError, "#{self.class.name} does not define #{m}!"
      else
        false
      end
    end
  end
end

Here, you will notice two key differences from Pundit:

  1. The initialize method takes two objects, where the first is the user object, and the second is the record.
  2. Calling an authorization method that hasn't been defined on an authorizer raises a RuntimeError in development/test. In production, it is assumed that the action is unauthorized.

Why return false in production? I am omitting some code here that fires an alert about the action in a monitoring system and tells a human about it.
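
To see the two behaviors side by side without booting Rails, here is a plain-Ruby sketch. It differs from the base class above in a few labeled ways: Authorizer.strict is a hypothetical switch standing in for the Rails.env check, only methods ending in "?" are intercepted (everything else falls through to the normal NoMethodError), and respond_to_missing? is added to match. CommentAuthorizer is made up for the demo.

```ruby
module Authorizer
  class << self
    # Hypothetical stand-in for Rails.env.development? || Rails.env.test?
    attr_accessor :strict
  end
  self.strict = true

  class Base
    attr_reader :user, :record

    def initialize(user, record)
      @user = user
      @record = record
    end

    # Undefined predicates raise in strict mode and deny otherwise.
    def method_missing(name, *args, &block)
      return super unless name.to_s.end_with?('?')
      raise RuntimeError, "#{self.class.name} does not define #{name}!" if Authorizer.strict

      false
    end

    def respond_to_missing?(name, include_private = false)
      name.to_s.end_with?('?') || super
    end
  end
end

class CommentAuthorizer < Authorizer::Base
  def index?
    true
  end
end

authorizer = CommentAuthorizer.new(nil, nil)
authorizer.index? # => true

Authorizer.strict = false
authorizer.destroy? # => false, the undefined action is treated as unauthorized

Authorizer.strict = true
begin
  authorizer.destroy?
rescue RuntimeError => e
  puts e.message # prints "CommentAuthorizer does not define destroy?!"
end
```

Limiting the catch-all to predicates is my own variation; it keeps implicit calls such as to_ary from being swallowed by the authorizer.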

And then an individual authorizer looks like:

class PostAuthorizer < Authorizer::Base
  def index?
    true
  end

  def show?
    if user.is?(:admin)
      true
    else
      record.user_id == user.id
    end
  end

  def create?
    record.user_id == user.id
  end
  alias_method :new?, :create?
end

Just like the extractor pattern, every authorizer inherits from the Base authorizer. Following Thunderbolt Labs' Testing Pundit Policies with RSpec, we can test authorizers easily. Each authorizer spec lives under spec/authorizers:

# spec/authorizers/post_authorizer_spec.rb

describe PostAuthorizer do
  let(:record) { build(:post) }
  subject { described_class.new(user, record) }

  context 'as an admin' do
    let(:user) { build(:user, roles: 'admin') }

    it { should permit(:index) }
    it { should permit(:show) }
    it { should permit(:create) }
  end

  context 'as a user' do
    let(:user) { build(:user) }

    it { should permit(:index) }
    it { should permit(:show) }
    it { should_not permit(:create) }

    context 'when the record is owned by the user' do
      let(:record) { build(:post, user: user) }

      it { should permit(:create) }
    end
  end

  context 'as a guest' do
    let(:user) { nil }

    it { should permit(:index) }
    it { should_not permit(:show) }
    it { should_not permit(:create) }
    it { should_not permit(:update) }
    it { should_not permit(:destroy) }
  end
end

This is awesome! Why? Because you have decoupled your authorization tests from the application and Rails. I have worked on many Rails applications where authorization is tested with Capybara by "signing in" as different users, hitting URLs, and asserting page content. Those tests are slow and exercise the whole stack just to check a permission.
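
To show how little an authorizer needs in order to be exercised, here is a sketch with no Rails, database, or browser involved. FakeUser, FakePost, and permits? are hypothetical stand-ins I made up for the demo, and FakeUser#is? is only a guess at what a role check might look like.

```ruby
# Hypothetical stand-ins modelling only what the authorizer reads.
FakeUser = Struct.new(:id, :roles) do
  def is?(role)
    Array(roles).include?(role)
  end
end
FakePost = Struct.new(:user_id)

class PostAuthorizer
  attr_reader :user, :record

  def initialize(user, record)
    @user = user
    @record = record
  end

  def show?
    user.is?(:admin) || record.user_id == user.id
  end

  def create?
    record.user_id == user.id
  end
end

# Hypothetical helper mirroring what a `permit` matcher would check.
def permits?(authorizer, action)
  authorizer.public_send("#{action}?")
end

owner    = FakeUser.new(1, [])
admin    = FakeUser.new(2, [:admin])
stranger = FakeUser.new(3, [])
post     = FakePost.new(1) # owned by user 1

permits?(PostAuthorizer.new(owner, post), :show)      # => true
permits?(PostAuthorizer.new(admin, post), :show)      # => true
permits?(PostAuthorizer.new(stranger, post), :show)   # => false
permits?(PostAuthorizer.new(stranger, post), :create) # => false
```

Giving the stand-ins explicit ids matters: two unsaved records both have a nil id, and nil == nil would make an ownership check pass by accident.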

Wrap Up

Advantages

  1. Very "Rails-like" - Controllers, Models, and Views all follow the class SomethingThing < Thing::Base pattern. This makes it less foreign and less magical than pure policy objects. Just like Controllers live under app/controllers, Authorizers live under app/authorizers. Extractors live under app/extractors. See also: Principle of least surprise.
  2. Self-contained - All of the code is customized and tailored to the application.
  3. Well-tested - Moving these policies into a more Rails-like pattern makes for easy testing and abstraction.

Disadvantages

  1. Very "Rails-like" - I would not recommend using this pattern outside of Rails. The mere use of constantize is probably a bad pattern, but it comes with Rails, so I leveraged it.
  2. Re-inventing the wheel - I do not personally think this is a disadvantage, but other people disagree. I think there is legitimate value in catering a solution to a use case.
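
If the constantize lookup bothers you, or you want the pattern outside of Rails, one possible alternative is an explicit registry. This is only a sketch of the idea, not something the application used: Extractor.register, Extractor.load, and REGISTRY are all hypothetical names.

```ruby
module Extractor
  # Hypothetical explicit registry: provider name => extractor class.
  REGISTRY = {}

  def self.register(provider, klass)
    REGISTRY[provider.to_s] = klass
  end

  # Look the provider up in the registry instead of building a constant name.
  def self.load(auth)
    klass = REGISTRY.fetch(auth['provider'].to_s) do |name|
      raise "#{name} is not a valid extractor!"
    end
    klass.new(auth)
  end
end

class TwitterExtractor
  attr_reader :auth

  def initialize(auth)
    @auth = auth
  end

  def first_name
    auth['info']['name'].split(' ', 2).first
  end
end

Extractor.register(:twitter, TwitterExtractor)

extractor = Extractor.load({ 'provider' => 'twitter', 'info' => { 'name' => 'John Doe' } })
extractor.first_name # => "John"
```

The trade-off is one extra line per provider in exchange for a lookup that cannot resolve to an arbitrary constant.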

About Seth

Seth Vargo is a Distinguished Software Engineer at Google. Previously he worked at HashiCorp, Chef Software, CustomInk, and some Pittsburgh-based startups. He is the author of Learning Chef and is passionate about reducing inequality in technology. When he is not writing, working on open source, teaching, or speaking at conferences, Seth advises non-profits.