This post was originally published internally, as an appeal to REA colleagues.
“Getting Shit Done” is the catchphrase on everybody’s lips, and deservedly so! When we deliver new functionality, our users regroup and flock to us, our customers grudgingly respect us, and our shareholders rejoice. When the novel concepts invented by our product managers take shape as they watch, their eyes light up with pride and enthusiasm. Programmers are never happier than when fire and magic fly from their fingertips; products that change people’s lives materialise from thin air, and insurmountable problems melt like butter. Beer flows freely, parmas are devoured and our managers circulate glowing praise within the company.
We have all felt the opposite too: long months gone by without new features; frustrated and bored developers; product managers forced to nervously adjust their collars and disappoint their superiors with dense technical reasons they can barely hope to convey. New features pop up like mushrooms on our competitors’ sites, and we wonder: why didn’t we do this years ago?
Yet when I hear this phrase “Get Shit Done”, I grimace; my teeth clench and my back involuntarily stiffens. Why? There is truly nothing I want more, and it is clearly important; many of our most talented teammates live by it.
What is not as widely known is that employee time can be traced all the way back to the post-WW2 United States. In 1948, multinational manufacturer 3M instigated “15% time”. In 1974 an employee by the name of Art Fry used this time to develop a means of applying an adhesive to the back of a piece of paper, and the Post-it note was born.
In addition to the Silicon Valley titans, several companies have embraced employee time to foster innovation, all with pretty cool names: BlueSky (Apple), [in]Cubator (LinkedIn), Hackweek (Dropbox), The Garage (Microsoft), ShipIt (Atlassian).
“Microservice” is becoming a buzzword in the industry today. Developers all over are experimenting and learning more and more about it. It’s a cool concept and we’ve started working with it at REA. There were several challenges that we faced with microservices and one of the key ones was tracking and documenting them.
After lots of deliberation, discussions and debates we could summarise our dilemma and needs as:
“Wouldn’t it be cool to have a living, breathing tracking system for all the microservices that exist in our ecosystem which will emit all the key information that we would like to know and have?”
The dreaded CREATE_FAILED message can be an all too common source of frustration when deploying new stacks with CloudFormation. The AWS Console does show you which component in your stack has failed, but if you rely heavily on metadata and userdata components, more often than not you’ll only get a wait condition timeout error, which gives you no indication of what has actually gone wrong under the covers.
The good news is that there are some tips and tricks out there for troubleshooting CloudFormation stack failures. Some revolve around CLI switches, some around knowing a bit more about the CloudFormation internals, and others around knowing where specific scripts live on your typical EC2 instance. This post documents a few of these approaches and aims to help the reader take a (somewhat) structured approach to troubleshooting wait condition timeouts.
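As a starting point for that structured approach, here is a minimal Python sketch of the first step: pulling the failures out of a stack's event list and reading them oldest first, since the earliest failure is usually the one worth chasing. It assumes the events are dicts shaped like the "StackEvents" list that boto3's describe_stack_events call returns; the helper names (failed_events, summarise) and the sample data are made up for illustration and are not taken from a real stack.

```python
from datetime import datetime, timezone

# Resource statuses that indicate something went wrong.
FAILED_STATUSES = {"CREATE_FAILED", "UPDATE_FAILED", "DELETE_FAILED"}


def failed_events(events):
    """Return the failed events from a stack event list, oldest first."""
    failures = [e for e in events if e.get("ResourceStatus") in FAILED_STATUSES]
    return sorted(failures, key=lambda e: e["Timestamp"])


def summarise(event):
    """Format one event as 'LogicalId (Type): reason'."""
    reason = event.get("ResourceStatusReason", "no reason given")
    return f'{event["LogicalResourceId"]} ({event["ResourceType"]}): {reason}'


# Hand-written sample data (hypothetical stack and resource names),
# listed newest first, which is the order the API returns events in.
SAMPLE_EVENTS = [
    {
        "Timestamp": datetime(2015, 1, 20, 10, 31, 5, tzinfo=timezone.utc),
        "LogicalResourceId": "demo-stack",
        "ResourceType": "AWS::CloudFormation::Stack",
        "ResourceStatus": "CREATE_FAILED",
        "ResourceStatusReason": "The following resource(s) failed to create: [WebServerWaitCondition].",
    },
    {
        "Timestamp": datetime(2015, 1, 20, 10, 31, 4, tzinfo=timezone.utc),
        "LogicalResourceId": "WebServerWaitCondition",
        "ResourceType": "AWS::CloudFormation::WaitCondition",
        "ResourceStatus": "CREATE_FAILED",
        "ResourceStatusReason": "WaitCondition timed out. Received 0 conditions when expecting 1",
    },
    {
        "Timestamp": datetime(2015, 1, 20, 10, 1, 0, tzinfo=timezone.utc),
        "LogicalResourceId": "WebServer",
        "ResourceType": "AWS::EC2::Instance",
        "ResourceStatus": "CREATE_COMPLETE",
    },
]

if __name__ == "__main__":
    for event in failed_events(SAMPLE_EVENTS):
        print(summarise(event))
```

With boto3 installed and credentials configured, the sample list could be swapped for the "StackEvents" value returned by describe_stack_events for your own stack. Note that in the sample the instance itself reports CREATE_COMPLETE while the wait condition times out, which is exactly the situation where the event list stops being useful and the logs on the instance have to take over.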
A few months ago we were catching up with the guys from Puppet Labs here in the REA offices in Melbourne and they asked us this question:
PL: “Configuration management, what are you doing about it?”
J: “Well, that’s a long story…”
We spent the rest of the morning sketching on the whiteboard the evolution of configuration management in REA, and the different stages we went through. A couple of weeks later my colleague David Lutz asked me if I wanted to present at the Melbourne Infrastructure Coders meetup that he co-hosts, and I thought that I could share the story with a wider audience. After receiving some positive feedback about the presentation I sent a proposal to linux.conf.au to repeat the talk there at the Sysadmin miniconf. A couple of weeks ago I presented it in Auckland.
If you want to review the journey we’ve been through regarding configuration management at REA, and get a good peek into our DevOps culture, check out the attached video:
Also, if you are interested in the slides, you can find them on SlideShare.
Introducing the latest addition to the Technology Services team – The Walkupinator – a device which simplifies the way we log our tickets from people just dropping by.
The Technology Services team at REA Group is extremely proud of the walk-up service we provide to our staff. However, logging tickets for our walk-ups has become problematic.
After a busy morning on the service desk in the Innovation Hub it’s often hard to recall who we’ve assisted or what the issue was. With over 550 people at REA HQ, things get busy. To solve an issue that has consistently plagued our team, I’ve created a system that uses existing technology to allow users to simply swipe a card to log a ticket. This system, which we’ve named “The Walkupinator”, can save the person manning the service desk up to an hour a day, as well as saving time for our internal colleagues (or, as we like to think of them, our customers).