cat - This multitool can help you more than you think

When you think about essential Linux command-line tools, cat is probably one of the first that comes into mind. Short for “concatenate”, this seemingly simple command is part of the basic equipment of every Linux user’s toolkit. Most commonly, it’s used to display the contents of a file in the terminal - a quick and easy way to peek inside a document.

But if you think that’s all cat is good for - you’re in for a surprise :-)

cat has a few tricks up its sleeve, that can help you to streamline your workflow: From transforming data to merging files or creating new ones - cat definitively deserves a closer look.

And along the way, I promise, we will stumble upon one or the other interesting tool or concept too …

Let’s start simple and understand the basic way cat is working

If you start cat on the command line without any additional parameter, then you will “lose your prompt”: You’ll have only a cursor on a blank line as a sign that you can enter some text:

Now if you enter some text and finish your input-line by hitting “<enter>”, then cat will immediately repeat the line you just typed in:

After that - again an empty line with a lonely cursor. Now you can enter the next line, which will also be repeated and so on. (you can stop the cat command in this state at any time by simply hitting <ctrl>+c)

what we just observed is exactly the way how cat works:

  • It reads in data line by line from its input datastream, which is by default bound to the terminal - and therefore to your keyboard.
  • The output of cat then goes to its output datastream, that is in this simple example bound to the terminal.

For illustration: This is part of a screenshot taken from the video I linked below: On the left-hand side I’ve tried to draw a keyboard, on the right-hand side a terminal (such an artist I am … :))

If you prefer watching instead of reading, here is the video covering the essentials from this post:

Useful command-line switches for “cat”

Now, that we have used cat in this simple manner, let’s try out some useful command-line switches.

First let’s start with numbering lines …

With an added -n at the command line, cat numbers all the lines it prints out:

[robert@demo ~]$ cat -n
let's type in a line
    1  let's type in a line
and now a second one
    2  and now a second one
finally a third
    3  finally a third
^C
[robert@demo ~]$

again - just hit <ctrl>+c to end cat.

Obviously this isn’t a really useful way to use cat. But how about numbering the lines of a file for better readability or for making references? If you add the name of a file to the command line, then cat uses this file for its input instead of your keyboard:

[robert@demo ~]$ cat -n test.txt
    1  this is a simple file for testing
    2          the content means nothing 
    3          
    4          really nothing
    5

And yes - if you omit the -n parameter from the example above, cat just prints out the content of the file without further modifications. This I the way I bet you have used this tool myriads of times before …

Although cat knows a second parameter for tweaking the numbering (-b only counts non-empty-lines):

if you need a ninja-tool for numbering text-data, have a look at nl which let’s you do the numbering in every possible way I may think of.

In its most basic variant, nl works just like cat -n without numbering completely empty lines:

[robert@demo ~]$ nl test.txt
    1  this is a simple file for testing
    2          the content means nothing 
    3          
    4          really nothing

(Do you see the seemingly empty line numbered with “3”? Read further to see with cat why this line is numbered.)

BTW: If you need to persist the output of a command for later usage or investigation, send the output of the command to a file by just appending  > filename  to the command line:

cat -n test.txt > test_cat_numbered.txt 

or

nl test.txt > test_nl_numbered.txt 

This works with every command line you use:

To save the output of a command into a file, append  > a_filename  to the end of the command just like in ls -l /etc > list.txt.

Back to the seemingly empty numbered line from the example above …

Use cat to detect special characters

If you use cat to print out the content of a file, then you always only see a representation of the data, done by the terminal.

If you see an empty line, then this line doesn’t necessarily have to be really empty: It may, for instance, contain several white spaces or TABs.

To make these typically invisible characters visible, cat can help you with a few parameters:

Use -E to see the end of each line.

With the parameter -E, cat prints at the very end of each line a ‘$’-sign as a visual marker:

[robert@demo ~]$ cat -E test.txt
this is a simple file for testing$
        the content means nothing $
        $
        really nothing$
$

As you can see, the third line isn’t completely empty as it seemed first: The $-sign as the line-end-marker is indented like the lines above and below.

To see, if the line is indented by spaces or by tabs, the next parameter is helpful …

Use -t so see markers for TABs

Tabulators will be visually marked with ^I , if you use the -t parameter:

[robert@demo ~]$ cat -t test.txt
this is a simple file for testing
^Ithe content means nothing 
        
        really nothing

[robert@demo ~]$

As you can see, the only line here containing a TAB is the first one. The other indented lines are indented with spaces.

And of course, you can use parameters in combination for showing TABs and line-endings at once:

[robert@demo ~]$ cat -tE test.txt
this is a simple file for testing$
^Ithe content means nothing $
        $
        really nothing$
$
[robert@demo ~]$

… where cat -tE ... is just the shorthand form for writing cat -t -E ...

See all other non-printing characters with -v

While -E and -t are for marking line-endings and TABs, the -v parameter is for all the other non-printing characters, that could be embedded in the file.

What’s the thing with “non-printing” characters? Well ..

Think about reading a text-file that was generated on a Windows system:

While a line-break on a Linux system (or any other Unix-like system like macOS) consists of only one single character (the “Line Feed” - represented as “\n”), Windows-systems use two characters instead: a “Carrige Return” followed by a “Line Feed” (represented as “\r\n”).

If you process such files with the tools you know, you may see odd and unusual results.

To see the additional “Carrige Return” characters in a file, call cat -v ... as in the following example:

[robert@demo ~]$ cat -v test_win.txt 
this is a simple file for testing^M
        the content means nothing ^M
        ^M
        really nothing^M
^M
[robert@demo ~]$

The ^M markers you see at the end of each line represent the added “Carrige Return” characters. (btw: all non-printing characters shown with -v will be marked as ^M)

side-note: If you want to convert line-endings between Windows- and Linux-style, you can use dos2unix or unix2dos if available on your system. If they are not, have a look at tr or sed respectively to do the modifications.

To convert a text file from Window line-endings to Unix line-endings, you can use one of the following commands to remove the unwanted “Carrige Return” characters (“\r”):
sed 's/\r//' text_win.txt > text_unix.txt or tr -d '\r' < text_win.txt > text_unix.txt

See them all at once with -A

With -A, you can combine the functionality of -E ,-t and -v into one single parameter. That’s how I typically make it easy for myself if I want to know if there are - somwehere in a file or a datastream - special characters that could create problems while further processing the data.

For instance, if grep or sed don’t work as I expect them to, I often ask cat to show me invisible characters I may not be aware of:

cat -A a_filename

Or, if I have the plan to further process the output of a command and I need to know if this command uses TABs or spaces for separation, I simply pipe the output of this command to cat -A.

As an example, try the following on one of your systems:

df -h | cat -A 

This gives you via df -h the fill level of all your mounted filesystems as a table, and cat -A will show you, how this table is built.

Concatenate files, and how to split one first

Beside marking special characters, cat is a helpful tool to combine data - text or binary - into one single file.

Let’s pretend you have a really huuuuge file like this:

Yes - I know - this isn’t this huge. You have seen larger files than this one, but for the purpose of this demo, this works perfectly.

So let’s pretend, we have a file that is too large to handle it the way we want: If I want to sent this file via email, I need to take care of the maximal attachment size my email provider dictates.

Or imagine, we have the image of a large virtual machine or a backup-archive we want to transfer somewhere else.

Then it can be helpful to split the larger file into smaller parts, handle or transfer these parts, and concatenate them at the end.

How to split a large file into parts with split

To split a file into parts, you can use split -b <size> <filname>, where <size> is a number followed by an optional unit like “10M” or “100K”. Without a unit, the number is interpreted as the a number of bytes.

I want to split my file “demolog” file into chunks not larger than 100MB. If I do this with split, the created file-parts will by default be named “xa”, “xb”, “xc” and so on:

As you can see, my 357MB large file was split into three files with 100MB each and one that holds the remaining 57MB.

To get more convenient filenames for the single parts, you can add a filename-prefix as a last parameter in the following way:

Now I have my perfectly named file-parts, which I can process as desired.

But in the end, I want to restore my file from theses single parts. And this is where cat comes into play again:

How to concatenate files with ‘cat’

We can use the feature here, that cat can print out multiple files at once - one after the other:

cat file1 file2 file3 ...

And it doesn’t matter if these files are simple textfiles or binaries - cat just prints them out without adding one single byte itself to the output data.

And if we combine this with redirecting the output of the command into a single file, we can really simple concatenate multiple files to a single one:

cat demolog_aa demolog_ab demolog_ac demolog_ad > demolog_new

We can even simplify this a lot, if we use a filename pattern (demolog_*) instead of typing in every single filename on our own.

This works perfectly here, because the shell will build a list of filenames from the given pattern, ordered alphabetically. And this is exactly what we need.

Concatenate all files matching “demolog_*” to one single new file called “demolog2”:

Prove that the new file holds the same content as the original

As a last step, we need to verify that the concatenated file holds exactly the same data as the original:

Both files have exactly the same size - a good starting point. But is there really every single byte still in its place? Let’s verify this by calculating a checksum for both files:

Since the calculated checksums match exactly, we have proven that splitting and then concatenating the file again was completely lossless.

If you want to check, if two files hold exactly the same content or not, you can calculate a checksum for each file and compare them. Common tools are md5sum for calculating an MD5-checksum or sha1sum for calculating a checksum based on SHA1.

shameless plug by the author: If you want to learn more about tools like the ones we talk about here, check out the “ShellToolBox”. You can find it at shelltoolbox.com or on Amazon.

Create new files with cat

There are many ways for creating new files at the command line. From using the simple touch command like

touch new_filename

to using output-redirection with an empty command like this:

> new_filename

Both versions have one thing in common - they give you an empty new file.

But sometimes you want to load initial content into the newly created file.

For this you can use cat with the following pattern:

[robert@demo ~]$ cat << END > new_file
the content goes here
you could even paste text data
from the clipboard
END

The pattern << END is called the “HERE-operator”, where the phrase “END” is a phrase you chose on your own.

The shell then starts the command line and collects everything you type in until the phrase you placed after the << is seen standalone on a single line.

The collected data is then given to the input datastream of the command.

And as you now know, cat reads in everything from its input datastream and prints it out again.

This output is the redirected via > new_file into a newly created file.

Let’s prove the functionality:

that worked.

Here is what to do next

LBFCoverSmall

If you followed me through this article, you certainly have realized that knowing some internals about how things are working at the Linux command line, can save you a lot of time and frustration.

And sometimes it’s just fun to leverage these powerful mechanics.

If you wanna know more about such “internal mechanisms” of the Linux command line - written especially for Linux beginners

have a look at “The Linux Beginners Framework”

In this framework I guide you through 5 simple steps to feel comfortable at the Linux command line.

This framework comes as a free pdf and you can get it here.

Wanna take an unfair advantage?

ToolboxCoverSmall

If it comes to working on the Linux command line - at the end of the day it is always about knowing the right tool for the right task.

And it is about knowing the tools that are most certainly available on the Linux system you are currently on.

To give you all the tools for your day-to-day work at the Linux command line, I have created “The ShellToolbox”.

This book gives you everything

  • from the very basic commands, through
  • everything you need for working with files and filesystems,
  • managing processes,
  • managing users and permissions, through
  • software management,
  • hardware analyses and
  • simple shell-scripting to the tools you need for
  • doing simple “networking stuff”.

Everything in one single, easy to read book. With explanations and example calls for illustration.

If you are interested, go to shelltoolbox.com and have a look (as long as it is available).