Over the past 12 months REA Group has been moving towards a structure where individual teams will manage their own infrastructure.
Startups (or companies that behave like them) should already have a devops culture. At REA Group we’re trying to bring a startup feel to individual teams, so engineers at the team level can decide what new technology they want to try out, test and learn ahead of the rest of the organisation, and ensure the company stays adaptable and ahead of the curve.
Conway’s Law states that an organisation inevitably designs systems that mirror its own communication structure. We are pushing a microservice architecture at REA, so having smaller teams makes a lot of sense.
In essence, we want to give teams autonomy, but that does come with a bunch of added responsibility, and making that work comes with a lot of changes and learning.
For instance, if you have a traditional ops-style team that delivers the applications to production, they are probably also dealing with the support pager for those applications. But in a team-managed infrastructure environment, some of that support will have to be done within the development team. That is, you’re going to have to forget about that 9-5 role you thought you were signing up for and respond to support pages in the middle of the night.
With responsibility come, in my opinion, larger rewards. As a developer, you get far more satisfaction when you control the process end to end. You probably also have more developers than ops people in your organisation, so having more people able to fix things outside of business hours is only going to be a plus. You also lessen the amount of stress on your support staff. They no longer need to know about every service in your entire company. They only need to know about the ones they actually support, which will also be the services they work on day to day.
If your teams are the right size you should find yourself in a situation where every person on the team is capable of deploying an application all the way to production within minutes – and they’re not scared to do so.
Developers feel empowered with the trust they get from this kind of environment, and everyone wins.
When teams choose the tools they want and need in their infrastructure, your company absorbs this knowledge (usually meaning a wider range of tools is learnt). Though there may be a little duplicated work, or upskilling of engineers when they switch teams, this is a small price to pay for the upsides.
Another huge upside is reducing the “blast radius” of potential problems. If your team deploys something that causes havoc, it’s unlikely to affect many things outside of your team. Use a team-centred approach as a buffer for your testing and learning.
Structuring your teams for infrastructure – a work in progress
We’re yet to decide exactly what this team infrastructure should look like at REA Group.
At what level do you split out the larger shared pieces like DNS and NAT infrastructure? Do you want every engineer in your team to need to know CIDR terminology, or be able to debug iptables issues on your NAT boxes? What about CI servers?
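To give a feel for what “knowing CIDR terminology” asks of an engineer, here is a minimal sketch using Python’s standard `ipaddress` module. The address range, the subnet split and the `nat_box` address are all made up for illustration; they are not REA Group’s actual network layout.

```python
import ipaddress

# Hypothetical address range for one team's network (an assumption for
# illustration only, not a real REA Group range).
team_network = ipaddress.ip_network("10.20.0.0/22")

# A /22 prefix leaves 10 host bits, so the block holds 2**10 addresses.
print(team_network.num_addresses)

# Carve the team's block into /24 subnets, e.g. one per environment.
subnets = list(team_network.subnets(new_prefix=24))
for subnet in subnets:
    print(subnet)

# Check whether a (hypothetical) NAT box address falls inside the block.
nat_box = ipaddress.ip_address("10.20.3.7")
print(nat_box in team_network)
```

Reading a prefix length, splitting a block and checking membership is roughly the baseline a team would need before it could own its own network ranges.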
The further down the funnel you push infrastructure that everyone needs, the more you duplicate the cost of running it as well as the skills required to fix it. That could be addressed with a well-crafted PaaS solution. We’ve found that sharing a group of highly skilled devops staff between teams also really helps. You slowly but surely enhance the skills of your engineering pool.
You also need to figure out the relationship between your team structure and the organisational structure as a whole. If you’re already agile, you don’t want the switching of apps between different infrastructures hindering your ability to restructure your teams. So you need to strike a balance between agility and specialism that gets the job done.
We’re still learning. Have you given it a go yet? What have you learned?