Every layer of review makes you 10x slower

(apenwarr.ca)

90 points | by greyface- 2 hours ago

11 comments

  • onion2k 49 minutes ago
    But you can’t just not review things!

    Actually you can. If you shift the reviews far to the left, call them code design sessions instead, raise problems in daily standups, and pair program through the gnarly bits, then 90% of what people think a review should find goes away. The expectation that you'll discover bugs and architecture or design problems doesn't exist if you've already agreed with the team what you're going to build. The remaining 10%, things like variable naming, whitespace, and patterns, can be checked with a linter instead of a person. If you can get the team to that level you can stop doing code reviews.

    You also need to build a team that you can trust to write the code you agreed you'd write, but if your reviews are there to check someone has done their job well enough then you have bigger problems.

    • loire280 34 minutes ago
      I've seen engineers I respect abandon this way of working as a team for the productivity promise of conjuring PRs with a coding agent. It blows away years of trust so quickly when you realize they stopped reviewing their own output.
      • onion2k 18 minutes ago
        Putting too much trust in an agent is definitely a problem, but I have to admit I've written about a dozen little apps in the past year without bothering to look at the code and they've all worked really well. They're all just toys and utilities I've needed and I've not put them into a production system, but I would if I had to.

        Agents are getting really good, and if you're used to planning and designing up front you can get a ton of value from them. The main problem with them that I see today is people having that level of trust without giving the agent the context necessary to do a good job. Accepting a zero-shotted service to do something important into your production codebase is still a step too far, but it's an increasingly small step.

    • totetsu 7 minutes ago
      This seems to be the core of the problem with trying to leave things to autonomous agents. The response to Amazon's agents deleting prod was to implement review stages:

      https://blog.barrack.ai/amazon-ai-agents-deleting-production...

    • Swizec 10 minutes ago
      > You also need to build a team that you can trust to write the code you agreed you'd write

      I tell every hire new and old “Hey do your thing, we trust you. Btw we have your phone number. Thanks”

      Works like a charm. People even go out of their way to write tests for things that are hard to verify manually. And they verify manually what’s hard to write tests for.

      The other side of this is building safety nets. Takes ~10min to revert a bad deploy.

    • anal_reactor 9 minutes ago
      I never review PRs; I always rubber-stamp them unless they come from a certified idiot:

      1. I don't care because the company at large fails to value quality engineering.

      2. 90% of PR comments are arguments about variable names.

      3. The other 10% are mistakes that have very limited blast radius.

      It's just that, unless my coworker is a complete moron, whatever they came up with is most likely in an acceptable state, in which case there's no point delaying the project.

      Regarding knowledge sharing, it's complete fiction. Unless you actually make changes to some code, there's zero chance you'll understand how it works.

      • devmor 1 minute ago
        I used to do this! I can’t anymore, not with the advent of AI coding agents.

        My trust in my colleagues is gone, I have no reason to believe they wrote the code they asked me to put my approval on, and so I certainly don’t want to be on a postmortem being asked why I approved the change.

        Perhaps if I worked in a different industry I would feel like you do, but payments is a scary place to cause downtime.

    • jauntywundrkind 19 minutes ago
      I wonder what delayed continuous release would be like. Trust folks to merge semi-responsibly, but have a two week delay before actually shipping to give yourself some time to find and fix issues.

      Perhaps kind of a pain to inject fixes in, since you'd have to rebase the outstanding work. But I kind of like this idea of the org having the responsibility to do whatever review it wants, without making every person have to corral all the cats to get all the check marks. Make it the org's challenge instead.

  • riffraff 2 minutes ago
    [delayed]
  • thot_experiment 58 minutes ago
    Valve is one of the only companies that appears to understand this, as well as that individual productivity is almost always limited by communication bandwidth, and that communication burden grows exponentially as nodes in the tree/mesh grow linearly. [or with some derated exponent, since it doesn't need to be fully connected]
  • lelanthran 1 hour ago
    I wonder where the author worked where PRs are addressed in 5 hours. IME it's measured in units of days, not hours.

    I agree with him anyway: if every dev felt comfortable hitting a stop button to fix a bug then reviewing might not be needed.

    The reality is that any individual dev will get dinged for not meeting a release objective.

    • jannyfer 1 hour ago
      At the bottom of the page it says he is CEO of Tailscale.
  • tptacek 1 hour ago
    Neither before coding agents nor after has any PR taken me 5 hours to review. Is the delay here coordination/communication issues, the "Mythical Man-Month" stuff? I could buy that.
    • Aurornis 1 hour ago
      The article is referring to the total time including delays. It isn’t saying that PR review literally takes 5 hours of work. It’s saying you have to wait about half a day for someone else to review it.
    • abtinf 1 hour ago
      The PR won’t take 5 hours of work, but it could easily sit that long waiting for another engineer willing to context-switch away from their own heads-down work.
      • paulmooreparks 1 hour ago
        Exactly. Even if I hammer the erstwhile reviewer with Teams/Slack messages to get it moved to the top of the queue and finished before the 5 hours are up, then all the other reviews get pushed down. It averages out, and the review market corrects.
    • nixon_why69 1 hour ago
      The article specified wall-clock time. One-day turnaround is pretty typical if it's not urgent enough to demand immediate review; lots of people review incoming PRs as a morning activity.
    • lelanthran 1 hour ago
      Some devs interrupt what they are doing when they see a PR in a Slack notification, most don't.

      Most devs set aside some time at most twice a day for PRs. That's 5 hours at least.

      Some PRs come in at the end of the day and will only get looked at the next day. That's more than 5 hours.

      IME it's rare to see a PR get reviewed in under 5 hours.

  • abtinf 1 hour ago
    I find this to be true for expensive approvals as well.

    If I can approve something without review, it’s instant. If it requires only immediate manager, it takes a day. Second level takes at least ten days. Third level trivially takes at least a quarter (at least two if approaching the end of the fiscal year). And the largest proposals I’ve pushed through at large companies, going up through the CEO, take over a year.

  • jbrozena22 1 hour ago
    I think the problem is the shape of review processes. People higher up in the corporate food chain are needed to give approval on things. These people also have to manage enormous teams with their own complexities. Getting on their schedule is difficult, and giving you a decision isn't their top priority, slowing down time to market for everything.

    So we will need to extract decision-making responsibility from people management and let the decision maker be exclusively focused on reviewing inputs and approving or rejecting, under an SLA.

    My hypothesis is that the future of work in tech will be a series of these input/output queue reviewers. It's going to be really boring I think. Probably like how it's boring being a factory robot monitor.

  • markbao 1 hour ago
    If you save 3 hours building something with agentic engineering and that PR sits in review for the same 30 hours or whatever it would have spent sitting in review if you handwrote it, you’re still saving 3 hours building that thing.

    So in that extra time, you can now stack more PRs that still have a 30 hour review time and have more overall throughput (good lord, we better get used to doing more code review)

    This doesn’t work if you spend 3 minutes prompting and 27 minutes cleaning up code that would have taken 30 minutes to write anyway, as the article details, but that’s a different failure case imo

    • lelanthran 37 minutes ago
      > So in that extra time, you can now stack more PRs that still have a 30 hour review time and have more overall throughput

      Hang on, you think that a queue that drains at a rate of $X/hour can be filled at a rate of 10x$X/hour?

      No, it cannot: it doesn't matter how fast you fill a queue if the queue has a constant drain rate, sooner or later you are going to hit the bounds of the queue or the items taken off the queue are too stale to matter.

      In this case, filling a queue at a rate of 20 items per hour (one every 3 minutes) while it drains at 1 item every 5 hours means that after a single 8-hour day you have roughly 160 PRs queued, and the last one can expect to wait nearly 800 hours for review.

      IOW, the backlog roughly doubles with every further day you keep this up.
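      That queue arithmetic can be sanity-checked with a few lines of Python (rates as assumed in this thread: one PR submitted every 3 minutes over an 8-hour day, one review finished every 5 hours):

```python
# Back-of-envelope queue model: PRs arrive far faster than reviews drain.
ARRIVALS_PER_HOUR = 20      # one PR submitted every 3 minutes
WORKDAY_HOURS = 8
REVIEW_HOURS_PER_PR = 5     # drain rate: one review finished per 5 hours

queued = ARRIVALS_PER_HOUR * WORKDAY_HOURS        # 160 PRs after day one
drained = WORKDAY_HOURS // REVIEW_HOURS_PER_PR    # only 1 review done meanwhile
wait_for_last = (queued - drained) * REVIEW_HOURS_PER_PR
print(wait_for_last)  # hours until the last PR of day one is reviewed: 795
```

      The exact numbers don't matter; the point is that any queue filled faster than it drains grows without bound.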

      • zmmmmm 13 minutes ago
        This is the fundamental issue currently in my situation with AI code generation.

        There are some strategies that help: a lot of the AI directives need to go towards making the code actually easy to review. A lot of it sits around clarity and granularity (code should be committed primarily in reviewable chunks - units of work that make sense for review) rather than whatever you would have done previously when code production was the bottleneck.

        Similarly, AI use needs to be weighted not just more towards tests, but towards tests that concretely and clearly answer questions that come up in review (what happens on this boundary condition? or if that variable is null? etc).

        Finally, changes need to be stratified along lines of risk rather than code modularity or other dimensions. That is, if a change is evidently risk-free (in the sense of "even if this IS broken, it doesn't matter") it should be able to be rapidly approved / merged. Only things where it actually matters if it's wrong should be blocked.

        I have a feeling there are whole areas of software engineering where best practices are just operating on inertia and need to be reformulated now that the underlying cost dynamics have fundamentally shifted.

    • josephg 53 minutes ago
      If your team's bottleneck is code review by senior engineers, adding more low quality PRs to the review backlog will not improve your productivity. It'll just overwhelm and annoy everyone who's gotta read that stuff.

      Generally if your job is acting as an expensive frontend for senior engineers to interact with claude code, well, speaking as a senior engineer I'd rather just use claude code directly.

      • eru 51 minutes ago
        Linting, compiler warnings and automated tests have helped a lot with the grunt work of code review in the past.

        We can use AI these days to add another layer.

    • CuriouslyC 54 minutes ago
      Except that when you have 10 PRs out, it takes longer for people to get to them, so you end up backlogged.
      • zmmmmm 11 minutes ago
        And when the PR you never even read, because the AI wrote it, gets bounced back to you with an obscure question 13 days later... you're not going to be well positioned to respond to that.
  • p0w3n3d 57 minutes ago
    Meanwhile there are people who, as we speak, say that AI will do the review and all we need to do is provide quality gates...
  • sublinear 1 hour ago
    As they say: an hour of planning saves ten hours of doing.

    You don't need so much code or maintenance work if you get better requirements upfront. I'd much rather implement things at the last minute knowing what I'm doing than cave in to the usual incompetent middle manager demands of "starting now to show progress". There's your actual problem.

    • lmm 1 hour ago
      > As they say: an hour of planning saves ten hours of doing.

      In software it's the opposite, in my experience.

      > You don't need so much code or maintenance work if you get better requirements upfront.

      Sure, and if you could wave a magic wand and get rid of all your bugs that would cut down on maintenance work too. But in the real world, with the requirements we get, what do we do?

      • JoshTriplett 31 minutes ago
        > In software it's the opposite, in my experience.

        That's been my experience as well: ten hours of doing will definitely save you an hour of planning.

        If you aren't getting requirements from elsewhere, at least document the set of requirements you think you're working towards, and post them for review. You sometimes get new useful requirements very fast if you post "wrong" ones.

  • simonw 1 hour ago
    This is one of the reasons I'm so interested in sandboxing. A great way to reduce the need for review is to have ways of running code that limit the blast radius if the code is bad. Running code in a sandbox can mean that the worst that can happen is a bad output as opposed to a memory leak, security hole or worse.
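    As a minimal sketch of what blast-radius limiting can look like (a hypothetical helper, Unix-only, and not a real security boundary on its own), untrusted code can be run in a child process with CPU and memory caps:

```python
import resource
import subprocess
import sys

def run_sandboxed(code: str, timeout: float = 10.0) -> str:
    """Run untrusted Python in a child process with CPU/memory rlimits."""
    def limit():
        resource.setrlimit(resource.RLIMIT_CPU, (2, 2))            # 2s of CPU
        resource.setrlimit(resource.RLIMIT_AS, (512 << 20,) * 2)   # 512 MB
    proc = subprocess.run(
        [sys.executable, "-I", "-c", code],   # -I: isolated mode, no site/env
        capture_output=True, text=True, timeout=timeout, preexec_fn=limit,
    )
    return proc.stdout

print(run_sandboxed("print(2 + 2)"))  # prints "4"
```

    A real sandbox would also cut off network and filesystem access, but the shape is the same: the worst a bad program can do in there is return a bad string.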
    • KnuthIsGod 59 minutes ago
      And if the bad output leads to a decision maker making a bad decision that takes down your company or kills your relative?
    • MeetingsBrowser 48 minutes ago
      Isn’t “bad output” already the worst case? Pre-LLMs, correct output was table stakes.

      You expect your calculator to always give correct answers, your bank to always transfer your money correctly, and so on.