Resiliency is the ability to recover from failures and continue to function. It is not about avoiding failures but accepting the fact that failures will happen and responding to them in a way that avoids downtime or data loss. The goal of resiliency is to return the application to a fully functioning state after a failure.
Building resilient systems requires instilling in the team a strong acceptance of failure from the very first day of a project. The team will need to create solutions to make sure that the system either recovers from failures or still functions despite ongoing failures.
Potential failures include lost computing or data nodes, inaccessible clusters, and network losses, among many others. Sometimes there are ripple effects due to the natural latency of the dependencies involved.
In this talk, Tiago explores a design approach that accepts and embraces failures from the outset and leads to much better and more resilient applications. By the end, you will learn how to create more reliable and fault-tolerant systems.
Tiago Luchini’s passion is bridging software development and business via successful digital products. In his 22 years of experience he has personally shipped more than 40 digital products and has ramped up global teams responsible for planning, developing and releasing hundreds of digital products.
At Work & Co he leads desktop, web, mobile and back-end development efforts for several clients including Pepsi, Facebook, Google, Twitter, Philz Coffee, Virgin America, Disney and the NBA. He is responsible for Work & Co’s technology strategy and vision.
He is an alum of Columbia University, London Business School, and Unicamp (Brazil) where he received his MBA and MSc. In his past life he was a technologist at Ixonos (Finland), where he oversaw large-scale technical projects and ramped up mobile device R&D centers in Beijing and Chengdu (China) for clients such as Nokia and Samsung.