Automated functional tests? They are often lacking. Not because teams don't want them, but because they shy away from the risks:
- Production is 'sacred'
- Security or access restrictions
- Fear of unintentionally burdening systems
And so the validations are done manually, with checklists such as:
- "Is the login page still working?"
- "Can I place an order?"
- "Does the API still provide an answer?"
Everything seems fine at first glance... Until Monday morning.
Then problems surface: a misconfigured database, a broken link, or a missing dependency — issues invisible at the system level but functionally critical.
Why is classic monitoring not enough?
Most monitoring tools keep a close eye on technical parameters: CPU, memory, network, storage, process status, and so on.
But who monitors the functional chain?
- Is the web service accessible and usable?
- Can the application still communicate with the database?
- Are error messages or abnormal behavior detected?
- Is a slow or failing user flow detected?
That's where things often go wrong.
Monitoring sees that something is running, but not how well it is running.
A concrete example: everything "runs", but it still doesn't work
Suppose you have an internal web application for employees, which depends on a database. During a scheduled patch window on Sunday, the technical team performs updates. Afterwards, they confirm that both the web application and the database are up and running again.
Everything seems fine. Until Monday morning.
The first employees try to log in, but receive an error message:
"Currently unable to log in. Please try again later."
The logs show that the application could not connect to the database, even though the database was technically running. What happened?
When the application started, the database was not yet fully ready (for example, the service was already running but not yet accepting connections). The application tried to connect, failed, and continued without correct initialization. This made the entire login flow unusable.
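The difference between "running" and "usable" can be made explicit with a small probe loop. The following is a minimal sketch, not the application's actual startup code: `wait_until_usable` and the `probe` callable are hypothetical names, and the probe is assumed to perform a real operation (such as opening a connection and running a trivial query) and to raise an exception when that fails.

```python
import time
from typing import Callable


def wait_until_usable(
    probe: Callable[[], None],
    timeout_s: float = 60.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` until it succeeds or the timeout would be exceeded.

    `probe` is a hypothetical callable that raises on failure, for example
    a function that connects to the database and runs a trivial query.
    Returns True once the probe succeeds, False if time runs out.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            probe()
            return True
        except Exception:
            # Give up if another wait would take us past the deadline.
            if time.monotonic() + interval_s > deadline:
                return False
            sleep(interval_s)
```

A check like this passes only when the dependency answers a real request, which is exactly what the "service is running" confirmation in the example did not cover.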
Both systems were running, but the functional link between them was broken.
Fig. 1: What we see from the patching team
Fig. 2: What the logs really say
In an ideal world...
In a well-configured ecosystem:
- This error is logged and centralized via log aggregation
- An alert is triggered on a dashboard or via a notification
- The patch team is trained to recognize the error and knows what needs to be done
But in practice?
- Does the patch team have access to that dashboard?
- Do they know something is wrong?
- And more importantly: do they know what to do if they are alerted?
Test automation as your secret weapon
What if you could have proactively detected this situation?
What if you had automatically run a simple functional test on Sunday evening — right after the patch?
For example:
- A login attempt with valid credentials
- An API call that retrieves data in a controlled manner
- A check that the database is actually usable
Then the error would have come to light immediately, even before the first user became frustrated on Monday morning.
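The checks listed above could be sketched roughly as follows. This is an illustration under assumptions, not a reference implementation: `BASE_URL`, the `/api/login` and `/api/orders/recent` endpoints, and the `smoke-test` account are all hypothetical, and the read endpoint is assumed to fetch its data from the database so that it also covers the database check.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import Callable, List, Tuple

BASE_URL = "https://intranet.example.internal"  # hypothetical address


@dataclass
class CheckResult:
    name: str
    ok: bool
    detail: str


def check_login() -> None:
    """Log in with a dedicated test account (hypothetical endpoint)."""
    body = json.dumps({"username": "smoke-test", "password": "change-me"}).encode()
    request = urllib.request.Request(
        BASE_URL + "/api/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # urlopen raises HTTPError for 4xx/5xx responses, which fails the check.
    with urllib.request.urlopen(request, timeout=10):
        pass


def check_api_read() -> None:
    """Read data through the API (assumed to be served from the database)."""
    with urllib.request.urlopen(BASE_URL + "/api/orders/recent", timeout=10) as resp:
        payload = json.load(resp)
    if not payload:
        raise RuntimeError("API answered, but returned no data")


def run_checks(checks: List[Tuple[str, Callable[[], None]]]) -> List[CheckResult]:
    """Run every check; a raised exception marks that check as failed."""
    results = []
    for name, check in checks:
        try:
            check()
            results.append(CheckResult(name, True, "ok"))
        except Exception as exc:
            results.append(CheckResult(name, False, f"{type(exc).__name__}: {exc}"))
    return results


def main() -> int:
    """Return 0 if all checks pass, 1 otherwise (usable as an exit code)."""
    results = run_checks([("login", check_login), ("api_read", check_api_read)])
    for result in results:
        print(f"{'PASS' if result.ok else 'FAIL'} {result.name}: {result.detail}")
    return 0 if all(result.ok for result in results) else 1
```

Calling `main()` as the last step of the patch procedure and treating a non-zero return value as "patch not finished" would have surfaced the failed login on Sunday evening.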
Test automation helps you stop extinguishing fires and start detecting smoke early.
Shift in mindset: from validation to monitoring
Test automation is often seen as something for development and QA. But in reality, it can:
- Prove that an environment is really functional after an update
- Provide confidence as code: built-in assurance
- Strengthen monitoring by consciously triggering negative scenarios
- Even validate alerts: is your observability pipeline still working correctly?
Practical use cases
Automatically run a test flow after patching:
- Log in
- Request a customer profile
- Enter a faulty request to validate an expected error
- Send the results to monitoring tools (e.g. via logs or Prometheus metrics)
- In case of deviation: send an alert to the patch team, even during the weekend
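Two steps of this flow lend themselves to a short sketch: validating an expected error, and handing the results to monitoring. The code below is one possible approach, not a prescribed one. The function names and the metric name `functional_check_success` are made up for illustration. The output follows the Prometheus text exposition format, which can be picked up, for example, by the node_exporter textfile collector; the alert itself is assumed to be defined as a rule on the monitoring side.

```python
import os
import urllib.error
import urllib.request
from typing import Dict


def expect_http_error(url: str, expected_status: int, opener=urllib.request.urlopen) -> bool:
    """Negative scenario: the check passes only if the request is rejected
    with exactly the expected HTTP status code."""
    try:
        with opener(url, timeout=10):
            return False  # the faulty request was unexpectedly accepted
    except urllib.error.HTTPError as err:
        return err.code == expected_status


def render_prometheus(results: Dict[str, bool]) -> str:
    """Render check results as gauges in the Prometheus text format.

    Check names are assumed to be simple identifiers, so label values
    are not escaped here.
    """
    lines = [
        "# HELP functional_check_success 1 if the functional check passed, 0 otherwise.",
        "# TYPE functional_check_success gauge",
    ]
    for name, ok in sorted(results.items()):
        lines.append(f'functional_check_success{{check="{name}"}} {1 if ok else 0}')
    return "\n".join(lines) + "\n"


def write_metrics_file(path: str, text: str) -> None:
    """Write to a temporary file first, then rename it into place, so a
    scraper never reads a half-written file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        handle.write(text)
    os.replace(tmp_path, path)
```

For the results `{"login": True, "customer_profile": False}`, `render_prometheus` produces:

```
# HELP functional_check_success 1 if the functional check passed, 0 otherwise.
# TYPE functional_check_success gauge
functional_check_success{check="customer_profile"} 0
functional_check_success{check="login"} 1
```

An alert rule that fires when any `functional_check_success` value is 0 would then notify the patch team.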
The result?
- Faster intervention
- Fewer incidents in production
- More confidence in your release process
- And above all: no Monday mornings full of frustration
Because honestly:
Why wait till Monday?