Uploaded in The Fifth Elephant 2013 - Day 2

Mozilla tames a rabble of HBase, PostgreSQL, RabbitMQ, elasticsearch, and Python to chew through 50 Firefox crash reports each second. Explore the strengths and weaknesses of these tools and their consequent niches in the greater crash-catching system. Greet the complexities that emerge from the combination, and see how Mozilla engineers around them to keep their never-lost-a-crash record pristine.

Receiving and organizing every Firefox crash in the world is a big job. Concurrency, real-time constraints, and a 110 TB volume of data all contribute to the challenge of giving Firefox engineers what they need to find and squash browser bugs.

Follow a Firefox crash from its genesis in a collapsing browser process through the dizzying array of collection, storage, and reporting systems that make up Socorro, Mozilla's open-source crash collector. Enjoy war stories of weird, interlocking failures, and see how Mozilla nevertheless continues to fulfill their mandate: “Never lose a crash.”