Stories

Domino is changing the way people do data analysis — stories like these are the reason we make Domino better and better each day.

EC2 is not enough: Donor Bureau's need for a data science platform

Summary

  • Donor Bureau (DB) uses large data sets and machine learning algorithms to help clients maximize return on investment for their fundraising mailings.

  • DB found that it was spending too much valuable analyst time maintaining its EC2-based compute infrastructure (e.g., maintaining configurations).

  • DB “outsourced” these needs to Domino; its analysts now use Domino as a platform to conduct their analytical modeling for clients instead of using EC2 directly.

  • On net, DB estimates that Domino has saved its analysts a full day each week — time that was previously spent maintaining infrastructure can now go to billable client work.

The Challenge

Donor Bureau (DB) helps hundreds of non-profits and political campaigns optimize their direct mail campaigns, maximizing return for each letter sent. DB's data scientists employ a variety of machine learning techniques to develop and train targeting models against a data warehouse of over a billion transactions and tens of millions of donors.

To meet its heavy demand for computational resources, DB used Amazon's EC2 infrastructure, but managing EC2 machines for a production data science workflow was not easy. DB had to grapple with EC2 setup, configuration, and software installation, and then develop its own tools and techniques for transferring data to its machines, transferring results back, spinning machines up and down, maintaining configurations, and keeping snapshots of its work. In other words, DB had to build a data science workflow on top of EC2.

The original infrastructure setup took a month, and in recent months DB found that its analysts were spending more and more time maintaining and improving this infrastructure. This work included dealing with configuration issues on EC2, adjusting for changes that Amazon introduced to its infrastructure, figuring out better workflows around data, and maintaining integration with a version control system.

It was a time-consuming headache, so DB's Head of Technology, Brian Johnson, went in search of a better solution.

“We're no longer maintaining all the infrastructure required for our people to use EC2 — we've outsourced that to Domino. And in total cost of ownership terms we're much better off.”

Brian Johnson, Head of Technology

The Solution

In May, DB moved to Domino as its data science platform, rather than continuing to use EC2 plus its custom infrastructure. Data scientists continue to use their favorite IDEs for Python and R, but now they use Domino to kick off model runs and to store and share results. Domino handles all the behind-the-scenes issues with EC2, from spinning machines up and down to billing.

Johnson describes the transition as “painless.”

Now DB is free from maintaining its infrastructure — Domino handles all the plumbing, so DB can focus on its analysis.

Domino increases DB's productivity in two ways. First, it lets DB run its models on arbitrary cloud hardware with one click, shielding its analysts from the details of managing cloud machines. Second, Domino automatically keeps a versioned history of DB's work, including code, data, and results, so its analysts can always reproduce past work and never lose results.

The Result

Most importantly, Johnson estimates that he's gotten back about a day a week of analyst time. In other words, about 20% of each analyst's week, previously spent on infrastructure maintenance, is now available for productive work.

In total cost of ownership terms, Johnson considers this a no-brainer. His overall bill for infrastructure is about the same, but his team is much more productive.

DB's analysts also get a user-friendly, ever-improving data science platform.

Creating new possibilities for an avian ecologist

Alex Bond is a post-doctoral research fellow based in Canada. For his latest personal project, Alex is examining millions of bird records for spatial and temporal trends. With only his personal computer at his disposal, Alex uses Domino to run R scripts essential to his research.

Setup was simple. "Once you get the feel for it," he says, "Domino makes it easy to upload code and get results."

Once running, Domino adds horsepower to Alex's research. He can run scripts from anywhere, without being tethered to university computing resources or limited to questions answerable with 8 GB of RAM.

"My project would have been impossible without Domino," Alex says. "A run that crashed my local machine took only four hours with Domino."

From Twitter

@sjGoring @DominoDataLab is awesome. Simplifies cloud computing for data analysis. Had jobs automated and running v.quickly. DM me for more

@ianakehurst

@lc_walsh @DominoDataLab saw the ad for this earlier and signed up. Guess where my next Kaggle entry will be processed?

@shane_a_lynn

Seconded: @PlethodoNick @DominoDataLab is fantastic. If you run lengthy or CPU intensive code for your science you should use it.

@Nicole_Michel

@DominoDataLab Love the way your service makes using cloud computing easier so users can focus on just doing what they want to do

@AnnoMarket

@DominoDataLab is awesome! Not only is it a great service to run time(CPU)-consuming code but their customer service is wonderful.

@PlethodoNick

Amazing support from @DominoDataLab to help me get the Google Analytics API up and running on their platform. This is going to be fun!!

@TFayyaz

#DataAnalysis in the #cloud - R, Matlab, Python. Upload files, play MarioKart, come back the results are ready @DominoDataLab @shane_a_lynn

@lc_walsh

Data analysis with Python, R or Matlab then take a look at @DominoDataLab for running your jobs in the cloud. Great first impressions.

@AstroAdamH

Whoa... impressed! If you do any machine learning or data work, you'll want to check out @dominodatalab http://bit.ly/1iAllf4

@thedavidprice

If you tell your girlfriend about @DominoDataLab she's going to do work instead of watching Game of Thrones with you. FYI.

@tofias

Accelerate your analysis

Have a question? Email us or give us a call at (415) 425-0095.