
An Introduction to Chef (including Windows!)

Chef is a systems integration framework. As one of the poster technologies of the DevOps philosophy, Chef allows users to easily spin up VM images either locally or on the cloud with certain configurations. Configuring a server with a certain version of Python or MySQL server means that over the course of infrastructure upgrades, individual servers don’t drift, making the entire system more stable and easier to manage.

The Chef framework is based on Ruby, and its cookbooks are (mostly) platform-independent. For example, you can use the same recipe to install MySQL on Ubuntu, RedHat, and Windows. Recipes describe what should happen, not how it should happen.
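
To picture the "what, not how" idea, here is a toy sketch in plain Ruby. It is not how Chef is implemented: the INSTALLERS table and install_command are hypothetical names, and Windows is left out because it has no comparable one-line package command to show here.

```ruby
# Toy illustration only; Chef's real providers are far more involved.
INSTALLERS = {
  "ubuntu" => ->(pkg) { "apt-get install -y #{pkg}" },
  "redhat" => ->(pkg) { "yum install -y #{pkg}" }
}.freeze

# The caller states what it wants; the platform lookup decides how.
def install_command(platform, pkg)
  installer = INSTALLERS.fetch(platform) do
    raise ArgumentError, "no installer known for #{platform}"
  end
  installer.call(pkg)
end

puts install_command("ubuntu", "mysql-server")
puts install_command("redhat", "mysql-server")
```

The caller's request is identical on both platforms; only the lookup behind it changes.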

Structure of Chef:

The easiest way to understand Chef is to start by looking at the structure of a simple cookbook:

+-- attributes
¦   +-- default.rb
¦   +-- epel.rb
¦   +-- remi.rb
+-- Berksfile
+-- Gemfile
+-- metadata.json
+-- metadata.rb
+-- providers
¦   +-- key.rb
¦   +-- repository.rb
+-- README.md
+-- recipes
¦   +-- default.rb
¦   +-- epel.rb
¦   +-- ius.rb
¦   +-- remi.rb
¦   +-- repoforge.rb
¦   +-- yum.rb
+-- resources
¦   +-- key.rb
¦   +-- repository.rb
+-- templates
¦   +-- default
+-- Vagrantfile

In this example, yum is the cookbook name.

  • attributes are specified in the Ruby scripts under the attributes directory. The names of the attribute files are not significant, so you could put every attribute for a cookbook in a single file, although that’s not recommended.
  • metadata.rb is where you need to specify cookbook dependencies manually; metadata.json is generated from it.
  • recipes is where you place your recipes. For example, to include the default.rb recipe of the yum cookbook, use
    include_recipe 'yum::default'
  • providers is where you’d place the implementation of a light-weight resource
  • resources is where you’d specify the interface of a light-weight resource
  • templates is where you would load your templates from.

Enabling chef-solo on Windows

In production, Chef recipes run from a central server. The hosted enterprise version of Chef is free for up to 5 nodes. If you need more nodes, you either have to pay for the additional ones or use the stand-alone client, which is called chef-solo.

With the “knife solo cook” command, Chef copies your recipes to the remote machine and executes them. The command becomes available once you install the knife-solo gem. Since knife-solo uses rsync to copy the files, it is only supported on Linux/Unix out of the box. However, it is possible to make it work on Windows by

  • Installing cygwin on the remote machines
  • Installing ssh server and configuring it with proper security
  • Writing wrappers to send the appropriate cook command.


One of the important aspects of Chef is that it will not run the same recipe twice.

For example, if in your run list you specify:

recipe[install_x], recipe[test_x], recipe[upgrade_x], recipe[test_x]

The last step will not be executed and there is no proper way around this. That works against the concept of re-usability. As a result, I was forced to implement it as:

recipe[install_x], recipe[test_x], recipe[upgrade_and_test_x]

where both recipes call the same tests (in our case, Python nosetests).

Also, if you include a recipe in various cookbooks, it will only be loaded once.
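
The effect on a run list can be pictured with a few lines of plain Ruby. This is only a sketch of the observable behaviour, not Chef's actual implementation, and expand_run_list is a hypothetical name.

```ruby
# Hypothetical model of how a run list behaves: each recipe is
# applied the first time it appears and skipped afterwards.
def expand_run_list(run_list)
  seen = {}
  run_list.each_with_object([]) do |recipe, executed|
    next if seen[recipe] # already loaded once, so skip it
    seen[recipe] = true
    executed << recipe
  end
end

run_list = %w[install_x test_x upgrade_x test_x]
puts expand_run_list(run_list).join(", ")
```

The second test_x never makes it into the result, which is why the combined upgrade_and_test_x recipe was needed.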

There is an important difference between the “include_recipe” command and the “chef.add_recipe” command (the latter is used in a Vagrantfile): “add_recipe” loads the whole cookbook, but “include_recipe” does not load the files and attributes of the cookbook.

Compile-Time versus Run-Time

Chef compiles and validates all the recipes before executing them. One confusing aspect of Chef is that plain Ruby code placed in a recipe is evaluated at compile time, while the recipe’s resources are evaluated at run time. As an example:

  if File.exist?("/home/nosetests")
    # do something
  end

is typically not what you want since you want to know whether the file exists at run time. That’s where you can make use of the ruby block:

ruby_block "some task" do
  block do
    # Evaluated at run time, when the resource is actually executed
    if File.exist?("/home/nosetests")
      Chef::Log.info("nosetests exists!")
    else
      Chef::Log.info("nosetests does not exist!")
    end
  end
end

The big downside is that you cannot nest resources: for example, you cannot place a package resource block inside a ruby_block.
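
The two phases can be imitated in plain Ruby. The ToyRun class below is a hypothetical stand-in, not one of Chef's classes: compile evaluates the recipe body immediately, while blocks handed to deferred are only stored and run later in converge.

```ruby
# Toy model of the compile and converge phases (not Chef's real classes).
class ToyRun
  attr_reader :log

  def initialize
    @resources = [] # the "resource collection" built at compile time
    @log = []
  end

  # Stand-in for a resource such as ruby_block: the block is stored, not run.
  def deferred(name, &block)
    @resources << [name, block]
  end

  # Compile phase: the recipe body is evaluated as ordinary Ruby.
  def compile(&recipe)
    instance_eval(&recipe)
  end

  # Converge phase: the stored blocks finally execute, in order.
  def converge
    @resources.each { |_name, block| block.call }
  end
end

run = ToyRun.new
run.compile do
  log << "plain Ruby (compile time)"
  deferred("some task") { log << "deferred block (run time)" }
  log << "more plain Ruby (compile time)"
end
run.converge
puts run.log
```

Both plain-Ruby lines are logged before the deferred block, even though the block appears between them in the recipe body.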

Light-Weight Resources

There are situations where you would like to define a generic operation that you can then call any number of times. For example, if you need to run iisreset in various recipes, you would create a light-weight resource for it in the IIS cookbook. Here is how you’d want it to be used:

windows_iis "reset iis" do
  action :reset
end

In resources/iis.rb you need to specify the interface (the actions, the inputs and their types).

In the providers/iis.rb you can implement the behaviour. Note that LWR is only for actions. If instead of an action, you would like to get a return value, then you must use Libraries.
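
As a rough sketch of what those two files might contain, assuming the older LWRP DSL and a cookbook named windows (Chef derives the resource name windows_iis from the cookbook name plus the file name iis.rb). The attribute list and the iisreset command are illustrative assumptions, and the fragment only runs inside a Chef cookbook, not as standalone Ruby:

```ruby
# resources/iis.rb (interface; assumed contents)
actions :reset
default_action :reset
attribute :name, :kind_of => String, :name_attribute => true

# providers/iis.rb (implementation; assumed contents)
action :reset do
  execute "iisreset" do
    command "iisreset"
  end
  # Tell Chef that this resource changed something on the node
  new_resource.updated_by_last_action(true)
end
```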


Libraries are a Chef feature that lets you use a Ruby script that is accessible from any recipe.

Let’s say you have a ruby script that retrieves the python version:

  module Helpers
    # Returns the raw output of `python -V`, e.g. "Python 2.7.3\n"
    def self.python_version
      v = `python -V 2>&1`
      return v
    end
  end

and you can just call the method like

# python -V prints a string such as "Python 2.7.3", so compare by prefix
if Helpers.python_version.start_with?("Python 2.7")
  puts "Good!"
end

As you can see, the Ruby scripts under the libraries folder are automatically loaded by Chef.
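
Because python -V prints a full string such as "Python 2.7.3", a library helper usually has to parse the output before comparing it. The sketch below shows one way to do that in plain Ruby. PythonHelpers, parse_version and series? are hypothetical names, and the raw string is passed in as an argument so the example does not depend on Python being installed.

```ruby
# Hypothetical contents for a file such as libraries/python_helpers.rb
module PythonHelpers
  # Extracts "2.7.3" from output such as "Python 2.7.3\n"; nil if absent.
  def self.parse_version(raw)
    match = raw.to_s.match(/Python\s+(\d+(?:\.\d+)*)/)
    match ? match[1] : nil
  end

  # True when the reported version belongs to the given series, e.g. "2.7".
  def self.series?(raw, series)
    version = parse_version(raw)
    return false if version.nil?
    version == series || version.start_with?("#{series}.")
  end
end

puts PythonHelpers.series?("Python 2.7.3\n", "2.7")
```

In a recipe you would pass in the output of the shell command instead of a literal string.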

Best Practices Guidelines

    • Never hard-code paths like /tmp or C:\. Chef has a temporary directory it uses for files, which is configurable at: Chef::Config[:file_cache_path]
    • Make use of Lightweight Resources when you want to implement something like a general-purpose operation.
    • Make use of not_if and only_if guards to make your code more readable:

execute "apt-get-update" do
   command "apt-get update"
   ignore_failure true
   not_if do ::File.exist?('/var/lib/apt/periodic/update-success-stamp') end
end

    • Chef::Log is very useful because Chef offers few other ways to debug a recipe.
    If you use Chef::Log.info("version = #{version}"), set the log level to info when you run Chef:

    chef-solo -c $DIR/solo.rb -j $DIR/dna.json -l info

    or on Windows

    chef-solo.bat -c %DIR%/solo.rb -j %DIR%/dna.json -l info

Note that DIR is the variable pointing to the directory where dna.json and solo.rb are located. You need to specify the full absolute path; otherwise Chef will likely fail with a confusing message.

How to specify the dependencies

You need to specify the dependencies in metadata.rb. Unfortunately, there is no automatic validation of the dependencies, which can cause a great deal of confusion. For example, suppose you download a new cookbook called dotnetframework which has a default attribute

default['dotnetframework']['version'] = '4.5'

and you include it in your own recipe as

include_recipe "dotnetframework::default"

If you forget to declare the dependency in your metadata.rb, Chef will fail, complaining about a nil attribute node['dotnetframework']['version']. This happens because the attribute file is never read.
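
Since Chef does not validate this for you, a small script can catch the omission before a run. The following is a sketch under simple assumptions: missing_dependencies is a hypothetical helper, the cookbook name myapp is made up, and the regular expressions only handle plain depends and include_recipe lines with quoted names.

```ruby
# Hypothetical checker: report cookbooks that recipes include
# but metadata.rb does not declare with `depends`.
def missing_dependencies(cookbook_name, metadata_src, recipe_srcs)
  declared = metadata_src.scan(/^\s*depends\s+['"]([^'"]+)['"]/).flatten
  included = recipe_srcs.flat_map do |src|
    # Capture the cookbook part of "cookbook::recipe"
    src.scan(/include_recipe\s*\(?\s*['"]([^'":]+)/).flatten
  end
  (included.uniq - declared - [cookbook_name]).sort
end

metadata = <<~METADATA
  name 'myapp'
  version '0.1.0'
METADATA

recipe = <<~RECIPE
  include_recipe "dotnetframework::default"
RECIPE

puts missing_dependencies("myapp", metadata, [recipe]).inspect
```

Adding the line depends 'dotnetframework' to metadata.rb makes the reported list empty.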

Final Words:

I think Chef will be very powerful in the future but currently it has some major drawbacks which need to be improved:

  • Many documented resources do not quite work as documented. For example, the directory resource fails on Windows for certain paths, and the git recipe has undocumented attributes.
  • It does not support chef-solo cleanly on Windows.
  • The errors reported by Chef are misleading in many cases.
Kourosh Parsa: