Here’s a video about threads in Ruby – how do you code them? What are the problems with them? I’m planning a few more short videos about Ruby concurrency. I don’t think enough people have a good grasp of the tradeoffs and problems of running more than one chunk of code at once, in Ruby or in general.
I’m also still learning Camtasia and the rest of the video toolchain. So I hope I’ll look back in a few months and think, “wow, this looks really bad now!”
In the meantime, enjoy! Here’s a transcript:
Ruby supports threads, of course. Modern Ruby, anything 1.9 and higher, uses the normal kind of threads for your operating system, or JVM threads if you’re using JRuby.

[next slide] Here’s the code to make that happen. The code inside the Thread.new block runs in the thread, while your main program skips the block and runs the join. “Join” means to wait for the thread to finish - in this case, it waits up to 30 seconds since that’s what we told it. With no argument, join will wait forever. You’ll probably want a begin/rescue/end inside your thread to print a message if an uncaught exception happens. By default, Ruby just lets the thread die silently. Or you can set an option to have Ruby kill your whole program, but I prefer to print a message. In “normal” Ruby, Matz’s Ruby, there’s a Global Interpreter Lock: two threads can’t run Ruby code at the same time in the same process. So most Ruby threads are waiting for something to complete, like an HTTP request or a database query.

[next slide] What are Ruby threads good for if there’s a global lock? In Matz’s Ruby, you’ll mostly want to use them for occasional events. RabbitMQ messages, timeouts, and cron-style timed jobs are all good. Since you can only have one thread doing real work in Ruby at once, threads are best when your process is mostly idle. JRuby and Rubinius users can chuckle here. Their Rubies handle multiple Ruby threads working at the same time without a hiccup.

So what problems do you sometimes see with Ruby threads? [next slide] With the Global Interpreter Lock, threads can switch back and forth between CPUs, sometimes very often. If you see Ruby threads performing horribly in production with Matz’s Ruby, or using way too much CPU, try locking the process down to a single CPU. Your operating system has commands for that — Google them. Next, if the main thread dies or finishes, your program ends. That’s a feature, but sometimes it’s surprising.
To have your main thread of execution stick around, use join on the other threads to let them finish. And of course an unhandled exception in a child thread kills only that child thread, completely silently. Use a begin/rescue/end to print errors to the console, or set the Ruby option to kill the main program when a child thread dies from an unhandled exception.
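Putting the pieces from the video together, a minimal sketch looks like this (the “work” here is just a stand-in for your real HTTP call or database query):

```ruby
# A worker thread with its own error handling. By default, Ruby lets
# an unhandled exception kill the thread silently -- hence the rescue.
results = []

worker = Thread.new do
  begin
    results << 6 * 7   # stand-in for the real work
  rescue => e
    $stderr.puts "Thread died: #{e.class}: #{e.message}"
  end
end

# Wait up to 30 seconds for the thread to finish.
# With no argument, join waits forever.
worker.join(30)

puts results.first
```

The option mentioned above for killing the whole program instead is `Thread.abort_on_exception = true` — set it before spawning threads if you prefer loud failure to a printed message.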
Python is an experiment in how much freedom programmers need. Too much freedom and nobody can read another’s code; too little and expressiveness is endangered.
Ruby, on the other hand, is an experiment in “give every toddler a chainsaw”-level freedom. And as Gary says in that talk, you get some things like RSpec, ActiveRecord and Cucumber that simply aren’t possible at the lower “reasonable” levels of freedom.
95% of what only Rubyists can do with that freedom is horrible, irredeemable crap. But the other 5% couldn’t have happened in any other way.
If you’ve never built a Ruby gem, go do that first. It’s amazingly easy. And it lets other people use your stuff!
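How easy? A minimal gemspec is only a few lines. This is just a sketch with hypothetical names — `tiny_solver` and its file layout are made up for illustration:

```ruby
# tiny_solver.gemspec -- about the smallest useful gemspec.
# Gem::Specification.new yields the spec and returns it.
spec = Gem::Specification.new do |s|
  s.name    = "tiny_solver"              # hypothetical gem name
  s.version = "0.0.1"
  s.summary = "Solves one small problem well"
  s.authors = ["You"]
  s.files   = ["lib/tiny_solver.rb"]     # one idea, one file
end
```

Run `gem build tiny_solver.gemspec` against a file like this and you have a shippable gem.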
When you do, you’ll have to figure out: how much goes into one gem? What should you split into new gems?
Here’s how you tell.
What’s the Big Idea?
You want your gem to have one basic idea. Then, cut out everything that isn’t part of that idea. If your gem solves differential equations… Great! Don’t expose methods for memory allocation just because you wrote them.
The people who download your gem are busy, and they have short attention spans.
So you get to tell them one thing. What’s that one thing?
Aren’t Gems Inefficient?
People from other languages make fun of five-line gems. Ruby people usually know better.
A Ruby gem can do amazing things in five lines — single ideas that don’t take more than that can be awesome.
If you’re using 20 kilobytes to package up five lines of text… Dude! It’s 20 kilobytes. You waste more than that in TCP headers checking gmail every five minutes.
Gems are cheap. I promise.
Helping people see your amazing idea is priceless. Don’t shovel in hundreds of lines of crap so it seems “more serious.” Just show off the important part.
How Big is Too Big?
Libraries can have the same problem.
A library that knows too much is giving you orders. RSpec includes its own mocking library… And so it’s hostile to people who like RR (Double Ruby) or Mocha. It forces you to do things its way. Is that why you started programming Ruby?
As much as I love ActiveSupport, it does the same thing. It requires very specific versions. It monkeypatches the core libraries. It’s just not friendly to people who want to do their own thing.
In the end, this is like “one idea” up above… If you force your users to do it your way, they’ll hate your gem. Show them something awesome and they’ll love it.
But I Want Numbers!
I can’t give you a maximum number of lines for your gem. But here are some rough ideas:
ActiveRecord is about 30,000 lines… And it’s a huge gem. If you’re writing more Ruby than this in one go, you’d better have a really good reason. And its largest file (associations.rb) is still only around 1500 lines. That’s a lot of Ruby code.
Pony is a fabulously useful email gem, which mostly wraps another, worse email gem. It’s only 500 lines, and half of that is tests. I do not care that it’s “too small.” I care that it works well and the interface is awesome. If it were only 10 lines, I would still use it happily.
The important thing, for both of them, is: the gem has one central idea, and it stays true to it.
You should do the same.
I love the idea of Chef and Puppet. And to me, that idea is “let’s just declare everything we want to be true about a server and it can magically spring into existence.” Better yet, of course, is trying it out locally and then pushing it out, as Chef plus Vagrant promises.
Especially when the server is declared as “Nginx should be installed” rather than “download this file, copy it here, unzip and build it…” Chef and Puppet, each in its own way, offers a possible vision of that future.
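For concreteness, here’s roughly what that declarative style looks like as a Chef recipe. The resources (`package`, `service`) are standard Chef; the recipe itself is a hypothetical sketch, not taken from any particular cookbook:

```ruby
# Declare the state you want, not the steps to get there.
package "nginx" do
  action :install
end

service "nginx" do
  action [:enable, :start]
end
```

Chef figures out the platform-specific steps — apt versus yum, init script versus systemd — so the recipe reads as a statement about the server, not a build script.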
Of course, right now they are not ready for casual use.
But Aren’t They Both A Few Years Old?
I started with Chef several years ago. It was a bit of a mess: significant changes like Lightweight Resources and Providers (LWRPs) that were supposed to change everything, and the rewrites that followed; then Berkshelf, and later its acquisition and more significant rewrites.
Basically, a fair bit of churn and chaos. Just what you’d expect of a new idea that’s turning into something you can use.
Puppet, of course, was doing almost exactly the same thing with the Puppet 2 to Puppet 3 transition, modules and whatnot.
To be clear: each and every one of these is basically a good idea. Some day, when Chef integrates them nicely, I think it will be an amazing solution.
Is This Just Theory?
Right now, the problem manifests in ways like this: you can’t use the latest version of the build-essential cookbook (one of the very basic ones) with the version of Chef that comes with Vagrant. You have to figure out that you need a different Chef version, and then use one of several hacks or plugins to make it work. This isn’t really documented… You just have to figure it out.
Plus, many Chef cookbooks don’t yet support the new LWRPs, and many Puppet cookbooks don’t support the Puppet-3-style modules, and all kinds of things are broken because they’re “old” — often only six to twelve months out of date, but that’s already old.
There are still a lot of rough edges.
Chef and Puppet will both be amazing, given time. However, as a non-DevOps person… You can expect that they’ll both be changing a lot for a while, and that they’ll be unfriendly to casual users for a while yet. Upgrades will remain painful, probably for at least another year or two.
And likely more.
I look forward to the future, when a well-written year-old tutorial doesn’t require three or four updates to work with recent versions of everything. And we’ll get there.
We’re just not there yet.
Isn’t it wonderful when you get to go in and clean things up? The best, for me, is when I’ve learned a lot in a year or two and I get to redo an old crufty program or a badly designed page using what I’ve learned.
What have you gone back and fixed lately?
Of course, Quarto has kind of an intimidating “install me first” list. Here’s what I needed to do:
I got to skip installing Git. If you need to install it, be sure to install Xcode as well, which probably needs to happen from the Mac App Store these days.
brew update, because I hadn’t in a while.
Install Pandoc. The package failed, so I had to use Homebrew to brew install haskell-platform, then cabal update, possibly cabal install cabal-install and finally cabal install pandoc. Haskell-platform can take 15 minutes to compile. If you need to install a new cabal-install (it’ll tell you if you do), that also takes awhile to compile. You can keep going, though – nothing else on this list depends on Pandoc until Quarto itself.
Install pygments. For me, that meant first installing pip (easy_install pip), a Python package manager, then pip install pygments.
Install xmllint (brew install libxml2).
Download and install PrinceXML, the free version. I unpacked it and ran ./install.sh.
Install xmlstarlet (brew install xmlstarlet).
Install fontforge (brew install fontforge).
I didn’t need to install Ruby or RubyGems. If you do — well, try to make sure you get Ruby 1.9 or higher. This step could be difficult if you’re not already a Rubyist. We’re working on it :-(
gem install quarto
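If it helps, here’s the list above squashed into one rough shell sketch. This assumes a Mac with Homebrew, and package names may have drifted since I wrote this — treat it as a cheat sheet, not a script to run blindly:

```shell
# Quarto prerequisites -- rough consolidation of the steps above.
brew update
brew install haskell-platform      # can take ~15 minutes to compile
cabal update
cabal install cabal-install        # only if cabal tells you it's needed
cabal install pandoc
easy_install pip                   # Python package manager, if missing
pip install pygments
brew install libxml2 xmlstarlet fontforge
# PrinceXML: download the free version, unpack it, run ./install.sh
gem install quarto
```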
Looks like I’m not the first to notice that Quarto takes some installing :-)
Also looks like Quarto wouldn’t be too hard to put together a Homebrew recipe for.