Robert Wohlfahrt = Home =

Sometimes you just need to know ...

There are times when your brain should focus completely on the system - not on fighting the shell.

Imagine you are in a hurry.

In a real hurry.

You are troubleshooting an urgent problem on a live system. Important things depend on solving the problem quickly. Customers are already noticing that something is wrong.

And while you are still trying to understand what exactly is happening, I promise you that the following command line will solve the issue:

:() {:|:&};:

You just need to copy & paste it.

Would you do it? Hopefully not.

Now imagine the exact same situation again - but this time the command line does not come from me (or some random person on the internet). It comes from an AI assistant.

Would your answer change?

Remember: The problem is urgent, customers are calling in, and your boss is literally waiting at your office door.

A few days ago I experienced a similar situation (only without the boss at the door):

Under pressure, your brain is already busy trying to understand the actual problem, with very little mental capacity left for slowly reasoning about unfamiliar-looking command lines.

And the pressure was real:

Clients called because emails were missing.

Nothing unusual in itself.

But this time it was not just a handful of delayed messages.

Mail queues were suddenly exploding.

Three different systems seemed to be involved.

At first there was no clear indication of what the actual root cause was. Just overflowing mail queues, logs growing until the disks were full, and servers that could no longer forward legitimate emails to the next hop.

My first suspicion was some compromised or misconfigured web application that attackers were abusing to send spam through the infrastructure.

Why everyone can write to /tmp - without creating chaos

One of those little Linux rules that quietly prevent chaos - and a lot of serious security problems.

Admins who come from the Windows world to Linux often ask me about the “deletion permission” for a file.

If you are already familiar with filesystem permissions on Linux, then you know of course that there isn’t really such a thing as a “permission to delete a file”. Instead - it always comes down to the write permission for the containing directory. If you have that permission, you can delete the files in it.
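A minimal sketch of this rule, using a throwaway directory created with mktemp (the variable and file names are only for illustration). The file itself is read-only, yet it can be removed, because the containing directory is writable:

```shell
# Hypothetical scratch directory, used only for this demonstration.
demo=$(mktemp -d)

# Create a file and take away every write permission on the file itself.
touch "$demo/readonly.txt"
chmod 444 "$demo/readonly.txt"

# Deleting only modifies the directory, so the file's own mode does not matter.
rm -f "$demo/readonly.txt"

if [ -e "$demo/readonly.txt" ]; then
    result="still there"
else
    result="deleted"
fi
echo "read-only file: $result"

# Clean up the scratch directory.
rm -rf "$demo"
```

For a regular user the reverse holds as well: once write permission is removed from the directory, rm fails even for files you own. Note that root is exempt from these checks.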

That leads us to the “contradiction” I wanna show you here:

On a typical Linux system, every user can write to /tmp.

Which raises - with the knowledge from above - an interesting question:

Why can users usually not delete each other’s files there?

At first glance, this feels contradictory.

If everyone has write permissions for the directory, shouldn’t everyone be able to remove everything inside it too?

The interesting part is:

Linux behaves completely consistently here.

There’s just one small rule involved here that many people never really look at closely - even though it quietly affects nearly every shared directory on a Linux system.

You can ping a system - and still have a broken network

Most Linux troubleshooting doesn’t fail because of missing commands.

It fails because we check things …
but we don’t really know what we are checking.

A typical example:

“Ping works. So the network is fine.”

Sounds reasonable.

It usually isn’t.
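To make that concrete: a successful ping only tells you that ICMP echo packets travel to the target and back. Whether names resolve, or whether the service port accepts connections, are separate questions. A rough sketch that asks them one by one (the function names, the host and the port are placeholders; it assumes that bash and the coreutils timeout command are installed):

```shell
# Does the name resolve at all? (getent asks the system's resolver)
check_dns() {
    getent hosts "$1" >/dev/null 2>&1
}

# Does the host answer a single ICMP echo request within 2 seconds?
check_icmp() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# Does a TCP connection to host:port succeed within 3 seconds?
# Uses bash's built-in /dev/tcp pseudo-device.
check_tcp() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" >/dev/null 2>&1
}

# Placeholder target - replace with the system and port you are troubleshooting.
host=127.0.0.1
port=1

for check in "check_dns $host" "check_icmp $host" "check_tcp $host $port"; do
    if $check; then
        echo "ok:     $check"
    else
        echo "FAILED: $check"
    fi
done
```

Each check looks at a different part of the path between you and the service, so an "ok" from one of them says nothing about the others.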

When chmod 777 is not enough - and the service still can’t access the file

a simple permission error - and why you might be looking at the wrong layer

Imagine this: You are staring, confused, at your webserver’s logfile - at the one single line that states it cannot serve the file you want because of a “permission denied” error.

[Wed Apr 01 13:34:31.831239 2026] [core:error] [pid 769:tid ...]
(13)Permission denied: [client 100.115.92.25:42722] AH00132:
file permissions deny server access: /srv/www/htdocs/index.html

But haven’t you already - while promising it’s only for a test - run a chmod 777 on this file?

The website should have been up and running for hours by now - but the only thing you get is this:

You followed the same procedure as last time you did something similar. You even copied & pasted command lines from a previous documentation …

… but the server somehow still fails to serve the site, because something still refuses to give it access.

And shouldn’t the final chmod 777 give everyone full access to the file? And shouldn’t this mean that the service is able to serve it?

It seems to make no sense anymore:

The permissions are world-open. So there simply cannot be a permission denied anymore!

Wait - not this fast …

As always - Linux doesn’t behave “randomly”:

If anything fails, then it fails because something is enforcing it - a part of the whole system (a “layer”) we may not have taken into account yet.

And our job is to find this layer and correct the issue.

So take a step back and approach this like an experienced administrator would do:

On every system, there is rarely just “the one single root of truth” - instead, multiple different aspects interact with each other and lead to the behavior we experience.

Or, more memorably: We need to look at all the layers that could explain the behavior we see.

So let’s go through the applicable layers systematically and see where your intention breaks.
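One such layer, as a sketch: chmod 777 only changes the file itself, while the service also needs search (x) permission on every directory leading to it. The helper below (the function name is made up for this example) prints the permissions of each component of a path, from the file up to /, so that a too-restrictive parent directory stands out:

```shell
# Print mode, owner and group for a path and each of its parent directories.
show_path_permissions() {
    path=$1
    while :; do
        # Report components that do not exist instead of aborting.
        ls -ld "$path" 2>/dev/null || echo "missing: $path"
        case $path in
            /|.) break ;;          # reached the top, stop walking
        esac
        path=$(dirname "$path")
    done
}

# The file from the error message above.
show_path_permissions /srv/www/htdocs/index.html
```

On systems with util-linux installed, namei -l followed by the path gives a similar listing.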

Why disk space is full - even when df says it isn’t

a simple incident - and what it shows us about operating a Linux system instead of just using it

Suddenly you get a disk-full error while writing something to disk.

robert@ubuntu1:~$ sudo tar -cf /srv/data/etc_backup.tar /etc
tar: /srv/data/etc_backup.tar: Cannot open: No space left on device

Ah - not again …

So let’s quickly spin up the df tool and check what’s going on here.

And surprisingly - the disk still seems to have free space left.

robert@ubuntu1:~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           392M  932K  391M   1% /run
/dev/vda2        12G  7.2G  4.0G  65% /
/dev/vdb1         3G  1.2G  1.8G  40% /srv/data    <-- free space left
...

(see the mentioned 40% for the mountpoint “/srv/data”?)

But despite this - it even fails to create a completely empty file:

robert@ubuntu1:~$ sudo touch /srv/data/testonly
touch: cannot touch '/srv/data/testonly': No space left on device

Tools don’t lie. But they just show you only one single layer of the system.
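As a sketch of what “another layer” can mean here: plain df reports used and free data blocks, while df -i reports inodes, the per-file bookkeeping entries that can run out independently of the blocks. Whether that is the cause of this particular incident is only an assumption at this point; the commands below simply put both views side by side for the mountpoint from the example (falling back to / so the snippet runs anywhere):

```shell
# Mountpoint from the example above; fall back to / if it does not exist here.
target=/srv/data
[ -d "$target" ] || target=/

# View 1: data blocks (what "df -h" showed).
block_report=$(LC_ALL=C df -h "$target")

# View 2: inodes. An IUse% of 100% means no new files can be created,
# no matter how many free blocks are left.
inode_report=$(LC_ALL=C df -i "$target")

printf '%s\n\n%s\n' "$block_report" "$inode_report"
```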

Here are a few of the real-world problems I’ve seen multiple times on production systems, seemingly coming out of nowhere:

a service suddenly dies or misbehaves
… and users call you to solve this immediately.
users can no longer log in
… but they need to, right now!
the company’s website suddenly doesn’t accept new connections
… the “worst case” as your boss states.
users complain about lost or delayed emails
… and they are always waiting for the most important one right now.
backups and even log-rotations fail
… and this may have accumulated even more risks hidden in the background over time.

… and in the end, each of these problems was caused by something like a “disk full” error.

Even though the monitoring in place never alerted on a full disk.

I think it’s obvious that in such a situation, panicking and deleting some large files on the filesystem will neither solve the problem nor leave you confident in your solution.

Instead - and this should always be your mantra in troubleshooting - we should tackle the problem in a systematic way to rule out anything that could be the root-cause, until we have identified the real underlying problem.

Or to say it a bit differently:

If you are faced with a problem like this - despite the urgency you may feel to solve it instantly: Always think your way through all the layers that may be involved. And then apply your fix at the right one.

A quick workaround may help in the short term, but it leaves you with uncertainty and the risk that the fix won’t last.

Let’s use this scenario as an example here - while the same thinking applies far beyond disk space.

From echo to syslog: Smarter logging for your scripts

… or: how to use the logger command to send messages into the system log.

If you are monitoring or troubleshooting a Linux system, then one of the most important things to check is the logfiles.

Running services like Postfix or Apache write them, tiny tools like sudo write logs - and even the system itself, the kernel, tells you what it’s doing and what didn’t go so smoothly.

What if your cronjobs or scripts could do the same? You could simply use your known toolbox - grep, less, tail, and friends - to see what’s going on, or use the more modern journalctl.

Here’s the thing: you can easily write your own logs from the command line (and therefore from cronjobs or scripts) and inject your own messages into those central log repositories.

And if you know a few basic concepts, you can even filter your messages into certain files or “sections” of your system log.

But first: why not just append your messages to your own log file?

You could, of course, log messages yourself by just appending them to a file:

echo "my message" >> my-logfile

That works - but it misses some smart benefits you get when you log messages the way I’ll show you next:

Automatic timestamping and tagging
Messages written via the syslog protocol (RFC 5424 / 3164) - that’s what we’ll use - include timestamps, hostnames, and tags automatically.
Centralized handling
Your messages become part of the system’s managed logs, so they rotate, compress, and get archived together with the rest.
Remote logging integration
Syslog already supports forwarding logs to a central host. So if your system’s rsyslog or syslog-ng is configured for remote forwarding, your messages will follow automatically.

Once you use the existing log infrastructure, your messages show up in the same places as the rest of the system - viewable with grep, less, or journalctl.

So, how can you tap into that same system yourself? Let’s look at the command that makes this easy.
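As a first taste, a minimal sketch using the logger command mentioned above. The tag backup-script and the helper name log_msg are placeholders chosen for this example; -t sets the tag and -p the facility and priority of the message:

```shell
# Hypothetical tag that identifies this script's messages in the log.
SCRIPT_TAG="backup-script"

# Send a message to syslog; if that is not possible (no logger binary,
# no log socket), fall back to stderr so the message is not lost.
log_msg() {
    if command -v logger >/dev/null 2>&1 &&
       logger -t "$SCRIPT_TAG" -p user.info -- "$*" 2>/dev/null; then
        return 0
    fi
    printf '%s: %s\n' "$SCRIPT_TAG" "$*" >&2
}

log_msg "backup started"
```

Afterwards the message can be found by searching the system logfile for backup-script, or with journalctl -t backup-script, depending on how logging is set up on your system.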

Linux Distributions? What They Are and Which to Pick

When people dive into Linux for the first time, one of the most confusing things is that there isn’t just one Linux. Instead, you’ll find hundreds of so-called distributions (or distros) floating around.

So what exactly is a distribution, and why does it matter which one you use?

What is a Linux Distribution?

As described in “First things first: What’s Linux, anyway?”, a running Linux system consists of a few independent components:

The Linux Kernel
This gives you hardware-abstraction, process-management, security isolation and so on.
The GNU-Tools
These provide you with a shell (the “command line”) and the tools you need for daily tasks.
Other applications
Like user applications, servers and any other software you can imagine running on a PC or server.

Now you could collect all these parts on your own from their developers’ websites (kernel.org, gnu.org, …), compile them for your CPU architecture and somehow get them onto a bootable disk.

If you really want to do this: Go to www.linuxfromscratch.org and follow the instructions.

This may be a lot of fun and you’ll learn a ton. But for the daily deployment of a Linux system, this won’t be the best way to go.

And this is where a Linux distributor comes into play: They bundle everything you need for a running system, add an installer and provide it to you on a bootable medium.

So a distribution is basically:

The Linux kernel itself (sometimes slightly customized)
A set of system libraries and tools (GNU utilities, shells, compilers, etc.)
A package management system to install and update software in a convenient way
A preselected bundle of applications (from web servers to desktop environments)
Policies and defaults chosen by the distribution maintainers

Instead of you assembling all these pieces yourself, a distribution team does it for you, adds testing and updates, and makes sure everything fits together.

👉 Think of a distribution like a curated “bundle” of Linux plus tools, shaped with a certain philosophy or audience in mind.

So what are the differences?

