Facebook’s approach to nearly continuous deployment of mobile software
Chuck Rossi, along with five co-authors including Kent Beck (yes, XP Kent Beck), recently published the research paper Continuous Deployment of Mobile Software at Facebook (Showcase). If you have not read it, I highly recommend reading all 12 pages.
Challenges of continuous deployment of mobile software
The paper provides insight regarding the challenges with continuous deployment of mobile software, including:
- Software updates are not transparent to end users, as they are with services delivered via the cloud
- Software that must be delivered as a complete binary, unlike microservices approaches that isolate small individual services to increase speed of testing and delivery while reducing risks
- Lack of control over whether mobile users update their devices, something not encountered with cloud solutions, where changes can be pushed out and controlled by the enterprise
- The sheer number of hardware variants, which is significantly more complicated to handle than browser and operating system variants for Web applications
The team then set out to share their research on how close to “continuous” organizations may be able to get with updating and deploying mobile software.
Reinforcement of good practices
I appreciated the reinforcement of good practices mentioned within the article:
- Short-lived local branches with frequent pushes to the Master branch
- Ability to toggle mobile features on and off from the server side with some granularity
- Rotating senior developers onto the Release Engineering Team
- Automating testing as much as possible for reproducibility, including unit tests, static analysis, build tests, integration tests, performance tests, capacity tests and conformance tests
- The value placed on code reviews
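To make the server-side toggle idea from the list above concrete, here is a minimal sketch of my own, not Facebook's implementation. The flag names, the `FLAGS` table and the `is_enabled` helper are all hypothetical; the only assumption is that the server decides per user and per platform whether a feature is on, using a stable hash so the same user always gets the same answer.

```python
import hashlib

# Hypothetical server-side flag table; names and percentages are illustrative only.
FLAGS = {
    "new_composer": {"enabled": True, "rollout_percent": 10, "platforms": {"android"}},
    "dark_mode": {"enabled": False, "rollout_percent": 100, "platforms": {"android", "ios"}},
}


def bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name: str, user_id: str, platform: str, flags=FLAGS) -> bool:
    """Decide on the server whether a feature should be on for this user."""
    flag = flags.get(flag_name)
    if flag is None or not flag["enabled"]:
        return False  # unknown or switched-off features stay off
    if platform not in flag["platforms"]:
        return False  # feature is not offered on this platform
    # Users whose bucket falls below the rollout percentage get the feature.
    return bucket(flag_name, user_id) < flag["rollout_percent"]
```

Because the decision lives on the server, setting `enabled` to `False` or lowering `rollout_percent` takes effect without shipping a new binary, which is the point of the practice.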
The degree of automation, and the precision in detecting which change introduced an issue, are essential to building up the speed needed for near-continuous deployment. Bots categorize reliability issues and automatically review which code changes are closest to instructions in the stack trace. Automation also covers auto-analysis and consolidation of user-reported issues, with tasks added automatically where issues recur consistently.
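As a rough illustration of matching code changes to a stack trace, here is a small sketch. It is my own simplification, not the algorithm from the paper: it assumes Java-style frames such as `at Foo.bar(Foo.java:42)`, compares only file names, and gives more weight to frames nearer the top of the trace. The `Change` type and `rank_changes` function are hypothetical names.

```python
import re
from dataclasses import dataclass
from pathlib import PurePosixPath

# Matches the "(File.java:123)" part of a Java-style stack frame.
FRAME_RE = re.compile(r"\(([\w.$-]+):\d+\)")


@dataclass(frozen=True)
class Change:
    change_id: str
    files: tuple  # repository paths touched by the change


def rank_changes(stack_trace: str, changes: list) -> list:
    """Return ids of changes touching files in the trace, most suspicious first."""
    frames = FRAME_RE.findall(stack_trace)  # file names, top frame first
    scored = []
    for change in changes:
        touched = {PurePosixPath(path).name for path in change.files}
        # A frame at depth d contributes 1/(d+1), so top frames count the most.
        score = sum(1.0 / (depth + 1)
                    for depth, name in enumerate(frames) if name in touched)
        if score > 0:
            scored.append((score, change.change_id))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [change_id for _, change_id in scored]
```

A real system would need far more signal than file names (line ranges, blame data, crash frequency), but even this crude ranking shows why a bot can narrow the search faster than a person reading traces by hand.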
Insight into Facebook’s Infrastructure-as-a-Service approach
Another fascinating aspect of the paper is its description of Facebook’s Infrastructure-as-a-Service approach to testing on mobile devices, which includes racks of devices used to test for regressions in app speed, memory usage and battery efficiency. Nodes connected to the devices use Chef to configure them, and cameras photograph the results.
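For a sense of what a regression check on those measurements might look like, here is a minimal sketch under my own assumptions: every metric is lower-is-better, a build is compared against a single baseline, and anything more than 5% worse is flagged. The metric names, the `find_regressions` function and the tolerance are all hypothetical and not taken from the paper.

```python
def find_regressions(baseline: dict, candidate: dict, tolerance: float = 0.05) -> dict:
    """Return metrics where the candidate is worse than baseline by more than tolerance.

    All metrics are assumed to be lower-is-better (e.g. start-up time, memory, drain).
    """
    regressions = {}
    for metric, base in baseline.items():
        new = candidate.get(metric)
        if new is None or base <= 0:
            continue  # skip metrics that are missing or cannot be compared
        relative_change = (new - base) / base
        if relative_change > tolerance:
            regressions[metric] = round(relative_change, 3)
    return regressions


# Illustrative numbers only, not measurements from the paper.
baseline = {"cold_start_ms": 1000.0, "memory_mb": 200.0, "battery_mah_per_hour": 50.0}
candidate = {"cold_start_ms": 1100.0, "memory_mb": 204.0, "battery_mah_per_hour": 50.0}
flagged = find_regressions(baseline, candidate)
```

Here only `cold_start_ms` is flagged, since it is 10% slower, while the 2% memory increase stays within the tolerance.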
Spoiler alert… You really should read the whole paper.
That said, the following are some of the results that I found fascinating:
- Continuous deployment is not negatively affecting productivity even as the organization size scales
- The number of deployments does not appear to be increasing the number of critical issues
- Shortened lengths of deployment cycles do not appear to be negatively impacting software quality, as indicated by launch blockers or crash rates
- It’s quite possible that when developers are rushing like crazy to push software out on the cut day, they produce lower quality code. We’ve probably all experienced that, but the authors have metrics to back up their hypothesis.
This research paper is one that I’ll hang onto as an example to aspire to, both in terms of data-driven decision making and pushing the limits of what’s possible with mobile development.