Blocks in Ruby (Things I Wish I Had Understood Sooner) – Fooda

Blocks in Ruby (Things I Wish I Had Understood Sooner)

Ruby blocks are tricky, but they’re powerful and worth understanding. If you’re teaching yourself, you’re bound to run into a roadblock. Just talk yourself through it…

Imagine yourself arriving at an uncomfortable realization: You love programming.

Wait, that wasn’t uncomfortable? Try this one: You love programming, and you squandered every chance to study it in school. In fact, you’ve been out of college for almost two years, and you’re only just now figuring out that “software developer” could be your dream job.

This is the position in which I discovered myself, back in late 2014. Luckily, that was around the same time I discovered two other things: my love of self-teaching, and the Ruby language. Things worked out alright in the end, but as a complete novice, there were certain topics I struggled with more than others. With the benefit of hindsight, what lessons do I wish I had learned sooner? If I could guide myself over these hurdles, how would I do it? Maybe, something like this…

> Hey, future self. I’ve been learning Ruby for a few months now, but there are some concepts I haven’t totally grok’d yet.

Hey, past self. You’re struggling with blocks, right?

> I sure am.

Don’t worry, I got your back.

So you’ve been reading up on “Ruby blocks,” traversing your favorite search engines, link aggregators, and programming forums. You’ve seen many sources repeat the same sets of facts about blocks. Blocks are…

  • …similar to methods
  • …a way of binding functions to environments (this is called a “closure” in Computer Science terms; an important concept best described by more academic authors)
  • …available in “proc” and “lambda” flavors

These are all accurate, but you still feel like you’re missing something. This isn’t a knock on the authors you’ve read – they’re doing a great job, just targeting an audience with a different knowledge base or learning genome.

> Sounds about right.

Great. Let’s start by correcting the big misconception that you, my former self, have acquired… “Blocks are Ruby objects” – *this is false.*

> Wait, they’re not? I thought “everything in Ruby is an object”.

Nope. But this is a bit of a gotcha – Ruby does have a class that *implements* block/closure behavior. That class is `Proc`.

> There’s no class named “Block”?

Correct. The thing you’ve been missing is the distinction between “statements” and “expressions” – a pretty fundamental concept in computer science. But also an easy one to overlook, since you-me never studied this stuff as an undergrad. 

This Quora post offers a great rule of thumb (it’s about Python, but applies just as well to Ruby):

“If you can print it, or assign it to a variable, it’s an expression. If you can’t, it’s a statement.” – Quora.com user ‘Ryan Lam’, May 2017

An expression *returns a value*. A statement *conveys an instruction*. Every expression in Ruby is also a statement, but not all statements are expressions.

(Think back to 4th grade, when we bravely defied the dogmatic lie that every square was also a rectangle. Be thankful that our math teacher was a very patient man.)

(Thanks, Mr. Brockway.)

Let’s see that in action.

# Ruby v2.3.1

# assigning an expression to a variable

> foo = puts "bar"

bar

=> nil

# attempting to assign a statement to a variable

> baz = next "bar"

SyntaxError: void value expression

puts logs its argument to the console and then returns nil, and the REPL happily assigns that return value to a variable.

next executes an instruction, specifically advancing to the next item in a list. The docs go into more detail about the exact mechanism, but suffice to say it doesn’t return anything.

Furthermore, puts is a method, while next is a keyword.

> This sounds kinda familiar. Methods are objects, right, but keywords are not?

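Close enough. (Strictly, a method only becomes an object once you wrap it with something like method(:puts), but the value it returns is always an object you can hold onto. A keyword like next gives you nothing to hold.)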

> Great, but what’s this have to do with blocks?

If you can grasp the difference between objects and keywords, you’ve basically understood the difference between Procs and blocks. Here’s a block, which encloses the behavior of accepting an argument and adding 1 to it: `{ |n| n + 1 }`

> Right, I’ve seen these before.

And you’ve also tried grabbing it within a REPL:

> my_block = { |n| n + 1 }

SyntaxError: syntax error, unexpected '|', expecting '}'

> Right! Why does that not work? I know I’ve seen code similar to that.

You sure have. Code like this:

> my_array = [1,2,3]

=> [1, 2, 3]

> my_array.map { |n| n * 2 }

=> [2, 4, 6]

Here we’re defining a block in-line, and passing it as an argument to the method `map`. We can see the block right there, but if we try to actually touch it, it crumbles to dust and syntax errors… unless we know the magic word.

Watch as I create a new Proc instance:

> my_proc = Proc.new { |n| n + 1 }

=> #<Proc:0x007ffda5026c78@(irb):1>

Finally, we’ve captured the Blockachu in our Proc-è Ball. And we can unleash its thunderous strength:

> my_array.map(&my_proc)

=> [2, 3, 4]

Bam. Any questions?

> So many.

Okay, okay. The & operator there allows us to pass the Proc-object-containing-our-block as a “normal” argument, instead of the usual way of defining the block inline with curly {} braces.
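
Here’s a rough sketch of both styles side by side. The helper apply_twice is something I made up for illustration, not a core method; it just yields to whatever block it receives:

```ruby
# apply_twice is a hypothetical helper, invented for this example.
# It runs the given block on the value, then runs it again on the result.
def apply_twice(value)
  yield(yield(value))
end

add_one = Proc.new { |n| n + 1 }

inline_result = apply_twice(3) { |n| n + 1 } # block written inline
proc_result   = apply_twice(3, &add_one)     # same block, handed over via &

puts inline_result # prints 5
puts proc_result   # prints 5
```

Either way, the method can’t tell the difference: it just sees “a block was given”.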

So what’s a Proc? It’s an object that responds to #call by running its block and returning the result. If you’re coming from JavaScript, procs are sort of like anonymous functions.

> my_proc = proc { |num| num + 1 }
=> #<Proc:0x007ffda6206ab0@(irb):7>
> my_proc.call(3)
=> 4

That’s the other big thing: you can assign procs to variables.

> new_proc = Proc.new { |a, b| a + b }
=> #<Proc:0x007f9d4dad7f68@(irb):1>
> new_proc.call(1,2)
=> 3
> new_proc.(3,4)
=> 7
> other_new_proc = proc { |x, y| x * y }
=> #<Proc:0x007f9d4dadd148@(irb):4>
> other_new_proc.(5,6)
=> 30
> new_lambda = lambda { |foo, bar| [foo, bar] }
=> #<Proc:0x007f9d4e8c9c98@(irb):6 (lambda)>
> new_lambda.('hello', 'world')
=> ["hello", "world"]
> new_lambda = -> (a, *b, **c) { "a: #{a.inspect}, b: #{b.inspect}, c: #{c.inspect}" }
=> #<Proc:0x007f9d4daee9c0@(irb):8 (lambda)>
> new_lambda.(:foo, [:bar, :baz], {quz: :qux})
=> "a: :foo, b: [[:bar, :baz]], c: {:quz=>:qux}"

> This is a lot to absorb at once.

Don’t worry about the lambda stuff too much. Those are still Proc objects, just with a couple extra features. The official docs lay it out pretty well, and it’s nothing that matters to us right now.

  • object.(args) is shorthand for object.call(args). Stick with the more readable #call unless you have a very good reason.
  • Procs have signatures and arities just like methods do (try #arity and #parameters on one). The proc is the object that wraps the function, and allows us to pass it around.
  • The constructions above are indistinguishable performance-wise. There are one or two esoteric differences in behavior between plain procs and lambdas (how strictly they check argument counts, and what return does inside them) that are usually irrelevant.
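
If you’d like to poke at those bullets yourself, here’s a small sketch. The variable names are mine and purely illustrative; the only thing it leans on is the standard Proc API (#call, #arity, #lambda?):

```ruby
# Illustrative names only; nothing here comes from a library.
adder_proc   = proc   { |a, b| (a || 0) + (b || 0) }
adder_lambda = lambda { |a, b| a + b }

# .() is shorthand for .call
same_result = adder_proc.call(1, 2) == adder_proc.(1, 2)

# Both report how many arguments they expect
proc_arity   = adder_proc.arity   # 2
lambda_arity = adder_lambda.arity # 2

# A plain proc tolerates a missing argument (b simply becomes nil)...
lenient = adder_proc.call(1)

# ...while a lambda raises ArgumentError, the way a method would.
strict_error =
  begin
    adder_lambda.call(1)
  rescue ArgumentError => e
    e.class
  end

puts same_result           # prints true
puts lenient               # prints 1
puts strict_error          # prints ArgumentError
puts adder_lambda.lambda?  # prints true
```

That argument-count strictness is one of the “esoteric differences”. It rarely bites, but it’s good to know which flavor you’re holding.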

> But what if…

It’s all in the official docs. Trust me. It’s important that you recognize these tricks now, so that you can defend yourself against them, and perhaps one day… 

> Alright, Professor Dumbledore, get to the point.

Fine. There’s one more syntactical synonym:

> do_proc = proc do |z| [z * 3] end
=> #<Proc:0x007f9d4f07c4e0@(irb):13>
> do_proc.call("hello")
=> ["hellohellohello"]

> I got ya, so do ... end wraps a block just like { } does?

Correct. If you haven’t figured it out by now, my younger self, you’re already using procs, just without assigning them to variables. Check it out:

> nums = [1,2,3]
=> [1, 2, 3]
> nums.map { |num| "A" * num }
=> ["A", "AA", "AAA"]

Looks familiar, right? Everybody loves Enumerable#map, except his weird older brother Brad Garrett. You might or might not be surprised to learn that the above example is equivalent to this:

> nums = [1,2,3]
=> [1, 2, 3]
> screamify = proc { |num| "A" * num }
=> #<Proc:0x007f9d4dadf9c0@(irb):18>
> nums.map(&screamify)
=> ["A", "AA", "AAA"]

> Aaah! What? Wait, what’s that & doing here? That looks familiar…

Good eye. You may have seen tricks like this floating around:

> digits = [1, 2, 3]
=> [1, 2, 3]
> digits.map(&:to_s)
=> ["1", "2", "3"]

The & is a unary prefix on the argument :to_s (“operator” is close enough). The method map is usually called with a block, and usually with a block that’s constructed in-line with the map invocation and then discarded immediately. The & signifies “the thing right after me should be used as the block”, letting the programmer skip writing out the block and its parameters. When you hand it something that isn’t already a Proc, such as the symbol :to_s, Ruby calls #to_proc on it. Symbol#to_proc assumes the symbol names a method, and builds a proc that calls that method on whatever it receives. map then passes each element of the collection to that proc. In other words, we can rewrite map ourselves!
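
Before we do, here’s a quick sketch of what the symbol shorthand amounts to. Symbol#to_proc is real; the by_hand proc is only my approximation of what Ruby builds behind the scenes, not its actual implementation:

```ruby
digits = [1, 2, 3]

# Ask the symbol for its proc directly
to_s_proc = :to_s.to_proc

# A hand-written stand-in for that proc (an assumption, for illustration)
by_hand = proc { |item| item.public_send(:to_s) }

shorthand = digits.map(&:to_s)
explicit  = digits.map(&to_s_proc)
manual    = digits.map(&by_hand)

puts shorthand.inspect                            # prints ["1", "2", "3"]
puts shorthand == explicit && explicit == manual  # prints true
```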

class Array
  def map_junior(&block)
    self.each_with_object([]) do |item, memo|
      memo.push(block.call(item))
    end
  end
end

> digits.map_junior { |d| d.to_f }
=> [1.0, 2.0, 3.0]

> digits.map_junior(&:to_f)
=> [1.0, 2.0, 3.0]

> That’s core? Seems kinda hacky.

Just our version, the real thing is far more robust. But that hack is also equivalent to this:

class Array
  def map_junior
    self.each_with_object([]) do |item, memo|
      memo.push(yield item)
    end
  end
end

That is what the yield keyword does. yield foo means, “grab the block that was passed to this method, and invoke #call on it with the following value(s) as argument(s)”.
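
Here’s the same idea stripped down to two tiny methods, so you can compare them without reopening Array. Both method names are hypothetical, made up for this sketch:

```ruby
# Explicit style: the block is captured as a Proc named `block`
def greet_with_call(name, &block)
  block.call(name)
end

# Implicit style: no parameter, yield reaches for the block directly
def greet_with_yield(name)
  yield name
end

via_call  = greet_with_call("past self")  { |n| "Hey, #{n}." }
via_yield = greet_with_yield("past self") { |n| "Hey, #{n}." }

puts via_call               # prints Hey, past self.
puts via_call == via_yield  # prints true

# Calling yield when no block was passed raises LocalJumpError
no_block_error =
  begin
    greet_with_yield("nobody")
  rescue LocalJumpError => e
    e.class
  end

puts no_block_error # prints LocalJumpError
```

That last error is why you’ll often see yield guarded with block_given? in library code.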

> Okay, let’s say I understand all of this on a technical level. What do I do with that knowledge? I already knew how to use #map.

If you want to contribute to open source Ruby libraries, or write your own tools, you’ll find this knowledge valuable.

Let’s take a look at blocks in action. Domain specific languages (or “DSLs”) rely on blocks to enable various syntactic constructs that wouldn’t be possible otherwise. Let’s take a look at one popular DSL: RSpec, the automated testing framework.

> I know that one!

Yeah, me, I know we do.

If you spin up a new Rails app, circa February 2018, with RSpec as your testing framework (such as outlined in this guide), you’ll find yourself with a file named spec/spec_helper.rb that contains configuration settings for the gem. I’ve copied an excerpt below.

RSpec.configure do |config|
 config.order = :random
end

> I can see that’s a block. But what’s it doing?

Let’s look at it from another angle:

RSpec.configure { |c| c.order = :random }

# or

order_random = Proc.new { |obj| obj.order = :random }
RSpec.configure(&order_random)

> Right. And let me guess, those are basically equivalent except for some esoteric corner-case I don’t need to worry about right now?

Right on the money, little buddy. What assumptions can we make here? For starters:

  • The RSpec module implements the module-level method .configure
  • .configure accepts a block

Dive into the source code, and we see the rest.

# rspec-core/lib/rspec/core.rb
# https://github.com/rspec/rspec-core/blob/a9e64bcd11ae26d8e223eb6a94dd51665e0c3329/lib/rspec/core.rb#L85
# rearranged and shortened for clarity

module RSpec

 # …

 def self.configure
   yield configuration if block_given?
 end

 def self.configuration
   @configuration ||= RSpec::Core::Configuration.new
 end

 # …

end

> Huh, so RSpec.configure is just a wrapper around yield?

Yup. It’s an interface for setting attributes on an instance of RSpec::Core::Configuration – and thanks to Ruby’s implicit block-passing, it reads almost like plain English.

We don’t need to get into what Configuration does right now, but if you’re interested all the source code is available on GitHub. In the meantime, let’s recap.

This…

RSpec.configure do |config|
 config.order = :random
end

…is another way of saying…

RSpec.configuration.order = :random

> That’s great, but I’m still missing the motive. Why go through all this trouble just to make things more difficult to read?

If all we wanted to do was assign attributes on a single object, you would have a point. There’s not much use to wrapping a simple update in block-param methods. The real power comes in abstracting away the underlying functions, and providing your end users with a simple, stable API.

> “Abstracting?”

Oh yeah, you’re gonna fall in love with that word soon enough. We’re taking complex details, and turning them into a simple idea. More or less. The point is, just like you don’t want to duplicate code between your classes, you also don’t want to duplicate knowledge between yourself and your end users.

For example, if we were concerned about our users passing invalid options to our Configuration object, we could do something like this:

module ImaginaryGemExample
 def self.configure
   yield configuration if block_given?
 rescue NoMethodError => e
   # some elegant error handling, hopefully
 end

 def self.configuration
   @configuration ||= Configuration.new
 end
end

Valid code stays valid, and we get our new error monitoring. Win-win!
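
If you want to run that idea end to end, here’s one possible fleshed-out version. Everything in it is hypothetical: the Configuration class, its single order attribute, and my choice to re-raise a friendlier ArgumentError are all assumptions made for the sake of a runnable sketch.

```ruby
# A made-up gem module; none of these names belong to a real library.
module ImaginaryGemExample
  class Configuration
    attr_accessor :order
  end

  def self.configure
    yield configuration if block_given?
  rescue NoMethodError => e
    # One way to handle it: translate the error into something friendlier.
    raise ArgumentError, "unknown configuration option: #{e.name}"
  end

  def self.configuration
    @configuration ||= Configuration.new
  end
end

# A valid option behaves exactly as before
ImaginaryGemExample.configure { |config| config.order = :random }
puts ImaginaryGemExample.configuration.order # prints random

# An invalid option now produces our own message
bad_option_message =
  begin
    ImaginaryGemExample.configure { |config| config.colour = true }
  rescue ArgumentError => e
    e.message
  end

puts bad_option_message # prints unknown configuration option: colour=
```

One caveat with this sketch: the rescue also catches any unrelated NoMethodError raised inside the user’s block, so a real gem would probably want something more targeted.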

Anyway, that pretty much covers it. The main takeaways are…

> Hold up Pops, I got this one.

  1. Blocks are a syntactic construct, which Ruby can manifest as Proc objects.
  2. Procs are like methods, except that they aren’t bound to a specific class or instance.
  3. Ruby offers a few different interfaces for creating and calling Procs, but they’re mostly interchangeable, most of the time.

Well done. Sometimes I even…

> …surprise myself? Hey, as long as I’ve got you here, did Game of Thrones ever pass the books? They’re saying Winds of Winter got pushed back to 2016, and-

Hoo boy. That’s all the time we have today.

— This post was written by Chris Graf, who tells people that he is a full-stack Software Engineer at Fooda, a claim science has not yet debunked. He reveres open learning resources and the communities who cultivate them. He once caught a fish with his bare hands.