Aimless Analysis


Do It Wrong The First Time

Bryson McIver - December 3rd, 2020

I think a lot more people have adopted the mindset of iteration: you'll never do something perfectly the first time (or you'll take so long to do it perfectly that it won't matter anymore), so most people aim for 70-80% quality. That keeps you clear of the "last 5-10% takes 50% of the time" rule but generally leaves things in a pretty well completed state. You then either say that is good enough, or you iterate a few more times, using the new information you gained from having completed something, to get to that 100%. The commonly thought of alternative is a harder set of requirements and a final deliverable at the end.

I very recently had to pick up AWS, Terraform, Consul, Vault and some other technologies for my team's transition to the cloud. It all seemed fairly straightforward: toss a VPC up, put your stuff in, expose it internally on the VPC, call it a day. Do it all with Terraform so it is defined as code and that'll do you one better, right? Who doesn't want to just run terraform apply and get their service fully running? This was my first time doing all this, but people more familiar with AWS were telling me it was possible, and our setup was going to be Terraform, so that should work too. I validated that it looked like it would work (what a local dev setup would be, how the service would interact with others, etc.) and then set out to write that 70% complete service that fit in our 70% complete architecture.

Really, it just didn't end up being that easy. I didn't know what I didn't know, and I hit problem after problem in our multi-account AWS setup, with its fully private subnets (which I was supposed to put my service in) and its standardized Terraform layout (which I was supposed to include my work in). After gaining more experience with Terraform, I found that it is a poor way to deploy a whole service (please keep your Terraform to just your infrastructure) unless you have some way to order the different Terraform invocations (like Terragrunt). Being in a fully private subnet meant reaching AWS through private VPC endpoints and operating in what felt like a non-cloud way.

I wish I could have taken a third approach, one that I find very hard to take but that could have been very valuable in this situation. I would have started from nothing and just built something that worked. At around 40% of the way through this service, I had learned so much and had discovered so many unknowns that I would have taken a new approach altogether. Being locked into the company's design at 70%, though, I couldn't write a service that didn't hit the same design complexity as everyone else. The just-build-it approach could have really helped me learn about the implementation complexities that came with the design, and let me lead more of the pushback on certain choices that were going to end up killing velocity. You're going to do it wrong the first time going this way. That is exactly where experienced engineers show their value: they can predict the implementation complexities that come with a design, versus just providing designs that meet goals (another valuable and humbling lesson learned).

Obviously this is not always going to be the right approach (and for many it is not up to them). If there is enough experience across all developers that every design can be vetted for implementation difficulties, and enough familiarity with all the technologies, then flex that experience and build something nearly there the first time. With that previous experience you're going to run into far fewer unknowns, and you'd feel a lot more pain from having to rewrite things that were just built through natural progression. You also might not have the time to take this approach. Even if you struggle to hit 70%, it is going to end up faster than potential architecture rewrites to hit important features later down the line. I think the difference in time here is not as big as one would think, especially if you already have natural team boundaries and prevent cross-contamination across them. With those boundaries you prevent impossible-to-break-apart monoliths (at the cost of maybe doing something twice) and can break the boundaries down naturally later on.

I find this approach tough to take because I naturally want to build a strong architecture from the beginning and don't want to accept building something with known flaws. It is hard to just build something when you keep finding ways you think it could be better and want to make sure you have the full picture first. I had also never had anyone point this approach out as something valid to do outside of hobbies (although there the argument is normally that it lets you actually make use of your limited time).

Not really sure how to end this one. It is going to be a while before I can attempt this build-first, do-it-wrong approach professionally, but I hope my thoughts here can remind me of it and help me make that decision next time.