Sometimes as a software developer you come across a seemingly innocuous piece of code that, when investigated, leads you down a rabbit hole much deeper than you anticipated. This is the story of such a case.
It begins with some clever Ruby code that we want to refactor, and ends with a prototype solution that changes the language itself. Along the way, we unearth a performance issue in TruffleRuby, an implementation of the Ruby language, and with it, an opportunity to work at the compiler level to even off the performance cliff. I’ll share this story with you.
A Clever Way to Fetch Values from a Nested Hash
The story begins with some Ruby code that struck me as a little bit odd. This was production code seemingly designed to extract a value from a nested hash, though it wasn’t immediately clear to me how it worked. I’ve changed names and values, but this is functionally equivalent to the code I found:
Two things specifically stood out to me. Firstly, when extracting the value from the data hash, we’re calling the same method, fetch, twice and chaining the two calls together. Secondly, each time we call fetch, we provide two arguments, though it isn’t immediately clear what the second argument is for. Could there be an opportunity to refactor this code into something more readable?
Before I start thinking about refactoring, I have to make sure I understand what’s actually going on here. Let’s do a quick refresher on fetch.
The Hash#fetch method is used to retrieve a value from a hash by a given key. It behaves similarly to the more commonly used [ ] syntax, which is itself a method and also fetches values from a hash by a given key. Here’s a simple example of both in action.
Like we saw in the production code that sparked our investigation initially, you can chain calls to fetch together like you would using [ ], to fetch nested values to extract a value from nested key-value pairs.
Now, this works nicely assuming that each chained call to fetch returns a hash itself. But what if it doesn’t? Well, fetch will raise a KeyError.
This is where our optional second argument comes in. Fetch accepts an optional second argument that serves as a default value if a given key can’t be found. If you provide this argument, you get it back instead of the key error being raised.
Helpfully, you can also pass a block to make the default value more dynamic.
Let’s loop back around to the original code and look at it again now that we’ve had a quick refresher on fetch.
The Refactoring Opportunity
Now, it makes a little more sense as to what’s going on in the original code we were looking at. Here it is again to remind you:
The first call to fetch is using the optional default argument in an interesting way. If our data hash doesn’t have a response key, instead of raising a KeyError, it returns an empty hash. In this scenario, by the time we’re calling fetch the second time, we’re actually calling it against an empty hash.
Since an empty hash has no key-value pairs, this means when we evaluate the second call to fetch, we always get the default value returned to us. In this case, it’s an instance of IdentityObject.
While a clever workaround, I feel this could look a lot cleaner. What if we reduced a chained fetch into a single call to fetch, like below?
Well, there’s a precedent for this, actually, in the form of the Hash#dig method. Could we refactor the code using dig? Let’s do a quick refresher on this method before we try.
Dig acts similarly to the [ ] and fetch methods. It’s a method on Ruby hashes that allows for the traversing of a hash to access nested values. Like [ ], it returns nil when it encounters a missing key. Here’s an example of how it works.
Now, if we try to refactor our initial code with dig, we can already make it look a lot cleaner and more readable.
Nice. With the refactor complete, I’m thinking, mission accomplished. But...
One thing continues to bother me. dig just doesn’t feel as versatile as fetch does. With fetch you can choose between raising an error when a key isn’t found, returning nil, or returning a default in a more readable and user-friendly way.
Let me show you what I mean with an example.
Fetch is able to handle multiple control flow scenarios handily. With dig, this is more difficult because you’d have to raise a KeyError explicitly to achieve the same behaviour. In fact, you’d also have to add logic to make a determination about whether the key doesn’t exist or has an explicitly set value of nil, something that fetch handles much better.
So, what if Ruby hashes had a method that combined the flexibility of fetch with the ability to traverse nested hashes like dig is able to do? If we could do that, we could potentially refactor our code to the following:
Of course, if we want to add this functionality, we have a few options. The simplest one is to monkey patch the existing implementation of Ruby#Hash and add our new method to it. This lets me test out the logic with minimal setup required.
There’s also another option. We can try to add this new functionality to the implementation of the Ruby language itself. Since I’ve never made a language level change before, and because it seems more fun to go with option two, I decided to see how hard such a change might actually be.
Adding a New Method to Ruby Hashes
Making a language level change seems like a fun challenge, but it’s a bit daunting. Most of the standard implementation of the Ruby language is written using C. Working in C isn’t something I have experience with, and I know enough to know the learning curve would be steep.
So, is there an option that lets us avoid having to dive into writing or changing C code, but still allows us to make a language level change? Maybe there’s a different implementation of Ruby we could use that doesn’t use C?
TruffleRuby is an alternative implementation of the Ruby programming language built for GraalVM. It uses the Truffle language implementation framework and the GraalVM compiler. One of the main aims of the TruffleRuby language implementation is to run idiomatic Ruby code faster. Currently it isn’t widely used in the Ruby community. Most Ruby apps use MRI or other popular alternatives like JRuby or Rubinius.
However, the big advantage is that parts of the language are themselves written in Ruby, making working with TruffleRuby much more accessible for folks who are proficient in the language already.
After getting set up with TruffleRuby locally (you can do the same using the contributor guide), I jumped into trying to make the change.
Implementing Hash#dig_fetch in TruffleRuby
The easiest way to prototype our new behaviour is to add a brand new method on Ruby hashes in TruffleRuby. Let’s start with the very simple happy case, fetching a single value from a given hash. We’ll call our method dig_fetch, at least for our prototype.
Here’s how it works.
Let’s add a little more functionality. We’ll keep in line with fetch and make this method raise a KeyError if the current key isn’t found. For now, we just format the KeyError the same way that the fetch method has done it.
You may have noticed that there’s still a problem here. With this implementation, we won’t be able to handle the scenario where keys are explicitly set to nil, as they raise a KeyError as well. Thankfully, TruffleRuby has a way to deal with this that’s showcased in its implementation of fetch.
Below is how the body of the fetch method starts in TruffleRuby. You see that it uses a module called Primitive, which exposes the methods hash_get_or_undefined and undefined?. For the purposes of this post we won’t need to go into detail about how this module works, just know that these methods will allow us to distinguish between explicit nil values and keys that are missing from the hash. We can use this same strategy in dig_fetch to get around our problem of keys existing but containing nil values.
Now, when we update our dig_fetch method, it looks like this:
And here is our updated dig_fetch in action.
Finally, let’s add the ability to ‘dig’ into the hash. We take inspiration from the existing implementation of dig and write this as a recursive call to our dig_fetch method.
Here’s the behaviour in action:
From here, it’s fairly easy to add the logic for accepting a default. For now, we just use blocks to provide our default values.
And tada, it works!
So far, making this change has gone smoothly. But in the back of my mind, I’ve been thinking that any language level change would have to be justified with performance data. Instead of just making sure our solution works, we should make sure it works well. Does our new method hold up, performance-wise, to the other methods which extract values from a hash?
Benchmarking—A Performance Cliff Is Found
I figure it makes sense to test the performance of all three methods that we’ve been focusing on, namely, dig, fetch, and dig_fetch. To run our benchmarks, I’m using a popular Ruby library called benchmark-ips. As for the tests themselves, let’s keep them really simple.
For each method, let's look at two things
- How many iterations it can complete in x seconds. Let’s say x = 5.
- How the depth of the provided hash might impact the performance. Let’s test hashes with three, six, and nine nested keys.
This example shows how the tests are set up if we were testing all three methods to a depth of three keys.
Ok, let’s get testing.
Running the Benchmark Tests
We start by running the tests against hashes with a depth of three and it looks pretty good. Our new dig_fetch method performs very similarly to the other methods, knocking out about 458.69M iterations every five seconds.
But uh-oh. When we double the depth to six (as seen below) we already see a big problem emerging. Our method's performance degraded severely. Interestingly, dig degraded in a very similar way. We used this method for inspiration in implementing our recursive solution, and it may have unearthed a problem with both methods.
Let’s try running these tests on a hash with a depth of nine. At this depth, things have gotten even worse for our new method and for dig. We are now only seeing about 12.7M iterations every five seconds, whereas fetch is still able to clock about 164M.
When we plot the results on a graph, you see how much more performant fetch is over dig and dig_fetch.
So, what is going on here?
Is Recursion the Problem?
Let’s look at dig, the implementation of which inspired our dig_fetch method, to see if we can find a reason for this performance degradation. Here’s what it looks like, roughly.
The thing that really jumps out is that both dig and dig_fetch are implemented recursively. In fact, we used the implementation of dig to inspire our implementation of dig_fetch so we could achieve the same hash traversing behaviour.
Could recursion be the cause of our issues?
Well, it could be. An optimizing implementation of Ruby such as TruffleRuby attempts to combine recursive calls into a single body of optimized machine code, but there’s a limit to inlining—we can’t inline forever producing infinite code. By contrast, an iterative solution with a loop starts with the code within a single body of optimized machine code in the first place.
It seems we’ve uncovered an opportunity to fix the production implementation of dig in TruffleRuby. Can we do it by reimplementing dig with an iterative approach?
Shipping an Iterative Approach to #dig
Ok, so we know we want to optimize dig to be iterative and then run the benchmark tests again to test out our theory. I’m still fairly new to TruffleRuby at this point, and because this performance issue is impacting production code, it’s time to inform the TruffleRuby team of the issue. Chris Seaton, founder and maintainer of the language implementation is available to ship a fix for dig’s performance degradation problem. But first, we need to fix the problem.
So, let’s look at dig again.
To simplify things, let’s implement the iterative logic in a new package in TruffleRuby we will call Diggable. To be totally transparent, there’s a good reason for this, though one that we’ve glossed over in this post—dig is also available on Arrays and Structs in Ruby. By pulling out the iterative implementation into a shared package, we can easily update Array#dig, and Struct#dig to share the same behaviour later on. For now though, we focus on the Hash implementation.
Inside Diggable, we make a method called dig and add a loop that iterates as many times as the number of keys that were passed to dig initially.
With this change, dig continues to work as expected and the refactor is complete.
#dig Performance Revisited
Now, let’s have a look at performance again. Things look much better for dig with this new approach.
Our solution had a big impact on the performance of dig. Previously, dig could only complete ~2.5M iterations per second against a hash with nine nested keys, but after our changes it has improved to ~16M. You can see these results plotted below.
Now that that’s out of the way, it’s time to apply the same process to our dig_fetch method and see if we get the same results.
Back to Our Implementation
Now that we’ve seen the performance of dig improve we return to our implementation and make some improvements. Let’s add to the same Diggable package we created when updating dig.
The iterative implementation ends up being really similar to what we saw with dig.
After our changes we confirm that dig_fetch works. Now we can return to our benchmark tests and see whether our iterative approach has paid off again.
Performance is looking a lot better! dig_fetch is now performing similarly to dig.
Below you can see the impact of the changes on performance more easily by comparing the iterative and recursive approaches. Our newly implemented iterative approach is much more performant than the existing recursive one, managing to execute ~15.5M times per second for a hash with nine nested keys when it only hit ~2.5M before.
Refactoring the Initial Code
At this point, we’ve come full circle and can finally swap in our proposed change that set us down this path in the first place.
One more reminder of what our original code looked like.
And after swapping in our new method, things look much more readable. Our experimental refactor is complete!
Of course, even though we managed to refactor the code we found using dig_fetch, we cannot actually change the original production code that inspired this post to use it just yet. That’s because the work captured here doesn’t quite get us to the finish line -- we ignored the interoperability of dig and fetch with two other data structures, Arrays and Structs. On top of that, if we actually wanted to add the method to TruffleRuby, we’d also want to make the same change to the standard implementation, MRI, and we would have to convince the Ruby community to adopt the change.
That said, I’m happy with the results of this little investigation. Even though we didn’t add our dig_fetch method to the language for everyone to use, our investigation did result in real changes to TruffleRuby in the form of drastically improving the performance of the existing dig method. A little curiosity took us a long way.
Thanks for reading!
Julie Antunovic is a Development Manager at Shopify. She leads the App Extensions team and has been with Shopify since 2018.
If building systems from the ground up to solve real-world problems interests you, our Engineering blog has stories about other challenges we have encountered. Visit our Engineering career page to find out about our open positions. Join our remote team and work (almost) anywhere. Learn about how we’re hiring to design the future together - a future that is digital by default.