The [[ Error: A Quiet Lesson in Shell Scripting

July 15, 2025 • 10 min read

There’s a special kind of dread reserved for when a tool you've used a thousand times suddenly doesn't exist. For many engineers, that moment arrives with a shell script that works perfectly on a laptop but fails in production with a cryptic `[[: not found`. This isn't just a syntax error; it's a sign that the foundation we build on is less solid than we assume.

In the DevOps, sysadmin, and SRE world, the reliability of our automation often comes down to the smallest details. One of the most common pitfalls is the subtle but critical difference between sh and bash, a gap that leads to fragile shell scripts and maddening production failures. This isn't just about syntax; it's about understanding the environments our code truly runs in. The dread I mean is not the loud panic of a server outage, but that quiet, sinking feeling you get when something small and simple fails unexpectedly. For me, it usually starts with two characters: `[[`.

You’ve written a perfectly good shell script. You tested it on your own machine, and it works like a charm. You push it to a server or stick it in a Docker image, and you run it. Then you see the error: `line 12: [[: not found`. It’s a moment that just doesn’t compute. How can something so basic, something that’s always been there in your terminal, just… not exist?

The Comfortable Lie of /bin/sh

This is usually when you learn, the hard way, that the comfortable, feature-packed shell you use every day isn't the one that actually runs the world. Your setup, whether it’s a MacBook with Zsh and tons of plugins or a nice Ubuntu desktop, is a comfortable bubble. The real world is a patchwork of weird, minimal, and sometimes ancient systems. And in that world, the only language you can truly rely on is the one a committee decided on decades ago: the POSIX shell.

The assumption that bash is everywhere is one of the most common and dangerous myths in software today. We use bash for its nicer syntax, its arrays, and its handy testing features. We write scripts that depend on them without a second thought, mostly because on many of our own machines, macOS and the Red Hat family of distributions among them, /bin/sh is really just bash under another name. We’re essentially being tricked by our own dev environments. They train us to write code that won't work everywhere else by hiding the difference between the standard and the bells and whistles.

The problem is, production isn’t your laptop. Production is a Debian or Ubuntu image where /bin/sh is dash, a lightweight shell that sticks to the rules and has never heard of your fancy [[ command. It’s a tiny Alpine Linux container where /bin/sh is BusyBox ash, which has no arrays and supports only some of bash's extras. Production is a stripped-down server image that has your app and almost nothing else. It’s an old network device or a crusty server that hasn’t been patched in years, running a version of bash so old it’s practically a different program.
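One way to stop guessing is to ask the machine what /bin/sh really is. The following is a minimal sketch, not a definitive detection method: the function name `describe_sh` is made up for illustration, and it follows only a single symlink hop. It sticks to POSIX utilities so that it can run on the minimal systems it is meant to inspect.

```shell
#!/bin/sh
# describe_sh is a hypothetical helper: report what a shell path points to.
# Only POSIX tools are used (ls, sed); readlink is avoided on purpose because
# it is not guaranteed to exist on every minimal system.
describe_sh() {
    target=${1:-/bin/sh}
    if [ -L "$target" ]; then
        # For a symlink, "ls -l" ends with "name -> destination".
        dest=$(ls -l "$target" | sed 's/.* -> //')
        printf '%s is a symlink to %s\n' "$target" "$dest"
    elif [ -x "$target" ]; then
        printf '%s is a regular executable\n' "$target"
    else
        printf '%s does not exist or is not executable\n' "$target"
    fi
}

describe_sh /bin/sh
```

The answer differs from system to system, which is the point. A "regular executable" result tells you less than a symlink does, since some systems ship /bin/sh as a standalone binary that behaves like a bigger shell.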

In these places, your script doesn't fail gracefully. It fails in the most confusing way possible. That `[[: not found` error is a clue, but only if you’re already in on the secret. To anyone else, it looks like the machine itself is broken. It’s the kind of error that doesn't just break your script; it makes you doubt the whole system. You start questioning everything. Is the PATH wrong? Is the file system corrupted? It feels like a bug in the OS, but the bug is actually in what you assumed to be true.
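The secret behind the message is that `[[` is a keyword only in shells like bash. A strict shell such as dash treats it as an ordinary command name, looks for it on the PATH, and reports that it was not found. Here is a small sketch of a typical bash-only test and one possible POSIX rewrite. The names `is_web_host`, `name`, and `port` are invented for the example.

```shell
#!/bin/sh
# Bash-only original, shown as a comment because it breaks under dash:
#   if [[ $name == web-* && -n $port ]]; then ...

# POSIX rewrite: "case" does the glob matching, "[" does the simple test.
is_web_host() {
    case $1 in
        web-*) return 0 ;;
        *)     return 1 ;;
    esac
}

name=web-01
port=8080

if is_web_host "$name" && [ -n "$port" ]; then
    echo "configuring $name on port $port"
fi
# prints: configuring web-01 on port 8080
```

The rewrite is a few lines longer, but every construct in it is part of the POSIX shell language, so it behaves the same under bash, dash, and ash.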

POSIX Compliance as a Survival Tactic

This is why writing scripts that target /bin/sh and stick to the POSIX standard isn't just being difficult or old-fashioned. It’s a survival tactic. It’s an act of kindness to your future self, the one who will be debugging this thing on a server they’ve never seen before. When you write a POSIX-compliant script, you’re making a deal with the idea of a Unix-like system, not with one specific tool. You're betting on the standard, not the implementation.

I’ll admit, the standard is a pain to work with. There’s a reason bash and other modern shells invented better ways of doing things. Doing simple math requires the clunky $((...)) syntax. Juggling strings feels like you’re casting ancient spells with awk, sed, and cut. And trying to use anything like a list or array is just an exercise in frustration. The POSIX shell is bare-bones. It gives you just enough to get the job done and not a single thing more.

And yet, there’s a certain beauty in its limitations. It forces you to think differently. When you can’t rely on bash shortcuts, you start building small tools that do one thing well and stringing them together. Your scripts start to look less like big, clunky programs and more like pipelines, which is the original Unix philosophy at its best. A script that uses [[ and other bash features is trying to be a C program. A script that uses [ and expr knows it’s a shell script and is okay with that. It works with its environment instead of fighting it.
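For a sense of what the bare-bones toolkit looks like in practice, here is a short sketch of the three pain points mentioned above: math, strings, and lists. The variable names and the sample path are made up for illustration. Parameter expansion is used for the string work as a lighter alternative to sed or cut where the job is a simple prefix or suffix trim.

```shell
#!/bin/sh
# Arithmetic: $(( )) is part of POSIX, so no external "expr" process is needed.
count=3
count=$((count + 1))

# Strings: parameter expansion trims prefixes and suffixes in the shell itself.
path=/var/log/app/server.log
file=${path##*/}    # drop everything up to the last "/"  -> server.log
stem=${file%.log}   # drop the ".log" suffix              -> server
dir=${path%/*}      # drop the last "/" and what follows  -> /var/log/app

# Lists: the positional parameters are the one array POSIX sh provides.
set -- alpha beta "gamma delta"
total=$#
joined=
for item in "$@"; do
    joined="${joined}[${item}]"
done

printf '%s %s %s %s %s %s\n' "$count" "$file" "$stem" "$dir" "$total" "$joined"
# prints: 4 server.log server /var/log/app 3 [alpha][beta][gamma delta]
```

Note that `set --` replaces the script's own arguments, so in a real script it is usually safer to do this inside a function, where the positional parameters are local to the call.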

This doesn’t mean bash has no place. If you’re writing a script for your own use, on a machine you control, then by all means, use bash. Use zsh. Use whatever makes you happy. If you’re writing automation for a fleet of servers and you can guarantee a modern version of bash is installed, then targeting bash is a perfectly fine decision.

The key is to be honest about it. The line #!/usr/bin/env bash is a declaration. It says, "This script needs bash and its features to work." The lie is putting #!/bin/sh at the top of a script that’s full of bash features. It promises it will work anywhere, but it’s actually incredibly fragile. It’s a ticking time bomb, waiting for the day it’s run in an environment that takes its promise literally. This is how you get those subtle, maddening bugs: the script that works 99% of the time but fails on that one weird server, or the Docker build that works on your machine but fails in the CI/CD pipeline.
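An honest shebang can still be bypassed, because running a file as `sh script.sh` ignores the first line entirely. One possible safeguard, sketched below, is a guard at the top of a bash script that is itself written in POSIX syntax, so it still parses under a strict shell and can explain the problem clearly. The function name `running_under_bash` is hypothetical. The check relies on `BASH_VERSION`, a variable that bash sets and that dash and ash leave unset.

```shell
#!/usr/bin/env bash
# running_under_bash is a hypothetical helper. It takes the value of
# BASH_VERSION as its argument and succeeds only if that value is non-empty.
running_under_bash() {
    [ -n "${1:-}" ]
}

# The guard uses only POSIX constructs, so a strict /bin/sh can still parse it.
if running_under_bash "${BASH_VERSION:-}"; then
    echo "bash ${BASH_VERSION} detected; bash features are safe to use"
else
    echo "error: this script requires bash, not plain sh" >&2
    # A real script would "exit 1" here, before any bash-only syntax runs.
fi
```

The guard only helps if it appears before the first bash-specific construct that would trip a strict shell, so it belongs directly under the shebang.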

A Microcosm of the Dev vs. Production Gap

This sh versus bash issue is really a small example of a much bigger problem: the gap between our development setups and the reality of production. We build software in these wonderful, forgiving sandboxes, full of powerful tools. Then we throw it over the wall into production, which is a harsh, resource-limited, and often hostile place. The shell script is often the glue that holds it all together. It’s the startup file, the health check, the migration tool. When that glue is built on bad assumptions, the whole system becomes brittle.

I’ve come to see the choice of shell as a kind of philosophical statement. Are you building a temporary tool for a specific job? Or are you building a tough piece of infrastructure that needs to survive out in the wild? The first one is fine for bash. The second demands the rigor of POSIX sh. It’s the difference between building with prefabricated panels and building with hand-cut stone. The stone is harder to work with, and the result might not be as fancy, but you can be damn sure it will still be standing years from now.

We talk a lot about technical debt, but a folder full of "helper scripts" written with lazy, non-portable syntax is one of the sneakiest kinds. It doesn’t show up in any reports. It doesn’t have a line item in the budget. It just sits there, waiting. And its cost is measured not in dollars, but in the frantic, caffeine-fueled hours of an engineer trying to figure out why [[ is not found while the whole system is down.

So the next time you write a shell script, just pause for a second at the top line. Ask yourself: Where will this run? Who will have to debug it? What can I truly assume about its environment? Answering those questions honestly might lead you to the more painful, but ultimately safer, path of POSIX compliance. It may feel like a step backward, like you're giving up your powerful tools. But it’s not. It’s an act of engineering discipline. It’s recognizing that the most important feature of any script is not how clever it is, but its simple ability to run, reliably, every single time, no matter where it is.

I'd be interested to hear about your own '[[ not found' moments, the subtle assumptions that turned into hard-learned lessons. What are the quiet time bombs you've had to defuse in your own systems?
