Building a Faster ETL Pipeline with Flume, Kafka, and Hive

At WordPress.com we process a lot of events, including some that are batched and sent asynchronously, sometimes days later. When querying this data, however, we are likely to care more about when the events occurred than when they were sent to our servers. Knowing this, we store our event data in Hive partitioned by when the events occurred rather than when they were ingested.
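As a rough illustration of the idea (not the actual pipeline code), here is a minimal Python sketch that picks a partition from each event's own timestamp instead of its arrival time. The field names `occurred_at` and `received_at`, the `dt=YYYY-MM-DD` partition naming, and both helper functions are assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(event):
    """Build a Hive-style partition name from the event's own timestamp.

    `occurred_at` is a hypothetical field holding Unix seconds (UTC).
    """
    occurred = datetime.fromtimestamp(event["occurred_at"], tz=timezone.utc)
    return occurred.strftime("dt=%Y-%m-%d")


def group_by_event_date(events):
    """Group events by the day they occurred, ignoring when they arrived."""
    partitions = defaultdict(list)
    for event in events:
        partitions[partition_key(event)].append(event)
    return dict(partitions)


# Two events received in the same batch but generated on different days.
events = [
    {"name": "post_view", "occurred_at": 1420070400, "received_at": 1420329600},
    {"name": "post_like", "occurred_at": 1420329000, "received_at": 1420329600},
]

partitions = group_by_event_date(events)
for key in sorted(partitions):
    print(key, [e["name"] for e in partitions[key]])
```

Although both events arrive together, each one lands in the partition for the day it actually happened, so a query over a date range only needs to scan the partitions for that range.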


WordPress Performance with HHVM

With Heroku-WP I hoped to lower the bar in getting WordPress up and running on a more modern tech stack. But what are the performance implications of running WordPress on such a modern set of technologies? Surely it’s faster, but by how much, and are the performance gains worth the trouble?

PHP vs. HHVM

To answer that, I conducted a quick and dirty stress test of a sample WordPress site running under PHP and under HipHop VM (HHVM), and found that the HHVM version loaded almost twice as fast and was able to serve over twice as many requests.
