Heroku to OpsWorks

The tale of a hosting migration that shaved 40% off response times

Stefan Wrobel

Startup Founder | Engineering Leader with an MBA

tl;dr

We had a positive experience moving Cult Cosmetics, a Rails app, from Heroku to OpsWorks, with response times reduced by ~40%. If you're thinking about making this move, DO IT, but realize that it will probably require more time and effort than you expect.

Why ditch Heroku?

I'm sure everyone has their own reasons, but here are my major decision points:

Cost
Limitations of the dyno
30-second request timeout
60-second app boot timeout
Geographic restrictions
Heroku-specific workarounds in my code (and injected into it)
Lack of autoscaling support

As one of the legion of developers who loathe doing sysadmin work, I had a torrid love affair with Heroku, but my frustrations with scaling StackSocial and Cult Cosmetics, the Rap Genius debacle, and the fact that Heroku's default response to customers with scaling issues is "your code sucks" have put me firmly in the haters' camp. I still appreciate how easy it is to deploy an MVP for free, but I wouldn't advise anyone to try to scale to significant traffic on their platform.

In search of the Holy Grail

Platform-as-a-Service (PaaS) providers have been springing up like weeds since Heroku first blazed the trail in 2007. I wanted a platform that would allow me to do just enough sysadmin to effectively troubleshoot performance and platform issues, but not force me to manage individual boxes or instances.

So, what is OpsWorks and how does it work? I'll leave that explanation to the amazing team at Artsy, who also made the switch recently.

Learning curve

If you don't know Chef, learning how it works and how to write cookbooks will be the major stumbling block for you in the migration. Don't be naive like I was and assume that you can just grab cookbooks off the shelf and expect them to work the way you want. You will have to get your hands dirty at some point, and it's better to do it sooner rather than later so you understand what's going on behind the scenes.

You will also need a way to manage your custom cookbooks. I have seen a variety of ways of handling this, but I like librarian-chef's Cheffile format because it follows the familiar pattern established by Bundler with Gemfiles. Feel free to use what we implemented or comment with your own solution.
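For anyone who hasn't seen one, a Cheffile looks roughly like the sketch below. The cookbook names echo the ones mentioned in this post, but the repository URLs and the ref are placeholders rather than real locations, and the `site` line assumes the public Chef Supermarket API.

```ruby
# Cheffile (read by librarian-chef, not run directly as a script)

# Default source for cookbooks that are declared by name only.
site "https://supermarket.chef.io/api/v1"

# Cookbooks pulled straight from git. The URLs and ref below are hypothetical.
cookbook "opsworks_custom_env",
  :git => "https://example.com/your-org/opsworks_custom_env.git"

cookbook "sidekiq",
  :git => "https://example.com/your-org/opsworks_sidekiq.git",
  :ref => "v1.0.0" # pin to a tag or commit so deploys are repeatable
```

Running `librarian-chef install` then resolves everything listed in the file, much like `bundle install` does for a Gemfile.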

As a bonus, now that Amazon RDS supports Postgres, you won't have to migrate to MySQL or manage your own Postgres instances to make the jump from Heroku Postgres.

Platform differences

You can run nginx (or Apache if you're into that) to serve static assets and handle gzipping. No more relying on the significantly slower Rack middleware for these tasks. EDIT: this is possible on Heroku via a custom buildpack.
You can run any process you want on any instance. You can have your own separate multi-purpose worker instances. For example, we run both whenever (cronjobs) and sidekiq (background jobs) on our worker instance.
There is no native support for setting ENV vars on OpsWorks (bummer). Custom cookbooks do exist to do this, though.
Precompiling assets at deploy time on OpsWorks instances is a pain (but you probably shouldn't be doing that anyway).

Step by step

Remove all of that Heroku-specific cruft from your codebase. Examples:
- Set: config.serve_static_assets = false
- Remove: config.logger = Logger.new(STDOUT)
- Remove: use Rack::Deflater
Set up your stack and layers. We built out the stack for staging first, then cloned it and modified it as necessary for production use when we were ready (sadly you can't copy layers between stacks - yet). It may be tempting to have one stack with separate apps for your various environments, but the fact that custom JSON is set at the stack level would make things pretty messy.
Make sure when you set up your Rails App Server layer you add ImageMagick to the OS Packages section if you're using any gems that rely on it (like paperclip or carrierwave do for resizing). Tip: make sure to click the blue plus sign next to the "Enter name" textbox or you won't have actually added the package to your layer. The UX is a little weird.
Figure out what to do with your static assets. We are using a hybrid solution: AssetSync pushes our precompiled static assets to S3 (so we don't have to commit them into our codebase), and we keep the YAML (Rails 3) or JSON (Rails 4) manifest file in source control. This requires precompiling and syncing assets locally before any deploy that involves changes to assets. Eventually we'll integrate this with our CI process, but we haven't gotten around to that yet.
Figure out what to do with your ENV vars. We are using a fork of the opsworks_custom_env cookbook modified to work with our sidekiq cookbook. This setup relies on figaro to read the application.yml file generated by the cookbook. I was a dotenv guy before, but figaro works just as well. Tip: make sure when writing your custom JSON that you use your app's "short name", which will be different from the name you gave it if you used any special characters.
Figure out what to do with your logs. We use Papertrail to store/search our logs. Our solution is tailored to Papertrail but should work with any rsyslog-based provider.
Replace your Heroku add-ons with alternatives (or keep using them if you're hosting in US-East; things should work just fine). Redis and Memcached are both supported by AWS ElastiCache, and you can even import a Redis dump file when creating your cache cluster.
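To make the ENV var step above more concrete, here is a minimal sketch of the flow from stack custom JSON to `ENV`. It is not the cookbook's actual code: the `custom_env` key layout is an assumption modelled on the opsworks_custom_env cookbook, `cult_cosmetics` is a stand-in for the app's short name, the variable values are made up, and `render_application_yml` and `load_into_env` are hypothetical helpers that only approximate what the cookbook and figaro do.

```ruby
require "json"
require "yaml"

# Hypothetical stack custom JSON. The "custom_env" layout is an assumption;
# "cult_cosmetics" stands in for the app's OpsWorks short name.
custom_json = <<~JSON
  {
    "custom_env": {
      "cult_cosmetics": {
        "REDIS_URL": "redis://cache.example.internal:6379/0",
        "ASSET_HOST": "assets.example.com"
      }
    }
  }
JSON

# Approximates the cookbook's job: turn one app's settings into the YAML
# that ends up in config/application.yml. Raises KeyError when the short
# name does not match a key in the custom JSON.
def render_application_yml(custom_json, app_short_name)
  settings = JSON.parse(custom_json).fetch("custom_env").fetch(app_short_name)
  settings.transform_values(&:to_s).to_yaml
end

# Approximates what figaro does at boot: copy each key into the environment
# unless a value is already set there.
def load_into_env(yaml_text, env = ENV)
  YAML.safe_load(yaml_text).each { |key, value| env[key] ||= value.to_s }
  env
end

yml = render_application_yml(custom_json, "cult_cosmetics")
load_into_env(yml)
puts ENV["ASSET_HOST"]
```

Looking up the settings with `fetch` means a mismatched short name fails loudly with a `KeyError` instead of silently producing an empty file, which is the mistake the tip above warns about.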

Afterthoughts

Overall, I'm happy I made the move. On the positive side:

App response times have dropped by ~40% (based on New Relic reporting)
It's nice to be able to SSH in and troubleshoot performance issues by digging into system load, process CPU and memory usage, etc.
I can use any type of instance I want. We're currently using the new SSD-backed c3.large instances, which run ~$110/month (cheaper if reserved) and can happily run 6 unicorns with plenty of memory to spare.
We can host our app in the AWS Oregon region instead of the dreaded US-East that Heroku forces us into (or Europe ... not helpful). We can also load balance geographically by launching instances in other regions when the time comes.
We can autoscale based on time & load!
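For a rough sense of what the instance sizing above works out to per unicorn, here is some back-of-the-envelope arithmetic. The monthly price and worker count come from this post; the 3.75 GiB of RAM for a c3.large and the 512 MiB reserved for the OS and nginx are assumptions, and `per_worker_budget` is just an illustrative helper.

```ruby
# Figures from the post: one c3.large at roughly $110/month running 6 unicorns.
MONTHLY_COST_USD = 110.0
UNICORN_WORKERS  = 6

# Assumption (not from the post): a c3.large has 3.75 GiB of RAM.
INSTANCE_MEMORY_MIB = 3.75 * 1024

# Hypothetical headroom kept back for the OS, nginx and monitoring agents.
RESERVED_MIB = 512

# Memory each worker can grow to before the instance runs out of headroom.
def per_worker_budget(memory_mib, reserved_mib, workers)
  (memory_mib - reserved_mib) / workers
end

budget = per_worker_budget(INSTANCE_MEMORY_MIB, RESERVED_MIB, UNICORN_WORKERS)
cost   = MONTHLY_COST_USD / UNICORN_WORKERS

puts format("Memory ceiling per unicorn: %.0f MiB", budget)
puts format("Cost per unicorn: $%.2f/month", cost)
```

Swap in your own instance size and worker count to see how much room each process really has before you commit to a layer configuration.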

While I'd love to say I'm completely pleased with the move, I knew that wouldn't be the case when I started. In my ideal world, deploying my apps wouldn't require any platform-specific code, or if it did, that code would be portable between platforms. OpsWorks is a somewhat open-source platform, but since the architecture configuration is closed-source, it's likely that moving to another PaaS would be another involved process. OpenStack holds a lot of promise when it comes to making ops-related code portable between platforms, but it's still a work-in-progress.

Disclosure

I should mention that I was biased in my choice. As a Science-backed startup, Cult Cosmetics gets a non-trivial amount of free AWS credits that couldn't be applied on Heroku, so the fact that we're now paying $0 in monthly hosting costs (and probably will be for the next 6 months or so) weighed substantially when considering the cost factor. With that said, I looked agonizingly at every option I could find, including DotCloud, AppFog, and OpenShift, among others. Bare EC2 was out of the question because of the ops overhead, so OpsWorks won out because of maturity, cost, and the amazing guys at FreeRunningTech, who have been using OpsWorks for quite a while and were kind enough to answer my countless questions along the way.

Resources

Here are the most useful links scattered throughout this post:

Static Asset configuration
Logging configuration
OpsWorks Cheffile (custom cookbook management)
Custom Env cookbook (ENV vars for OpsWorks)
Sidekiq cookbook (for background jobs)
Whenever cookbook (for scheduled tasks)
OpsWorks stock cookbooks
Artsy OpsWorks overview