
Linux Text Processing Cheat Sheet


Overview

Linux text processing tools provide powerful capabilities for manipulating, analyzing, and transforming text data. This comprehensive guide covers essential tools such as grep, awk, sed, sort, and many others that form the foundation of text processing and data analysis workflows.
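As a first taste, the small illustrative pipeline below combines cut, sort, and uniq to tally the login shells configured in /etc/passwd:

```bash
# Extract field 7 of /etc/passwd (the login shell), then count and rank
cut -d: -f7 /etc/passwd | sort | uniq -c | sort -nr
```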

⚠️ Warning: Text processing commands can modify files permanently. Always back up important files before performing bulk text operations.
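A minimal safety habit, sketched here with an assumed file name data.txt:

```bash
cp -a data.txt data.txt.bak     # Attribute-preserving backup copy
sed -i 's/old/new/g' data.txt   # Only then edit the file in place
```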

File Viewing and Navigation

Basic File Display

```bash
# Display entire file
cat filename
cat -n filename   # With line numbers
cat -b filename   # Number non-blank lines only
cat -A filename   # Show all characters including non-printing

# Display multiple files
cat file1 file2 file3

# Create file with content
cat > newfile << EOF
Line 1
Line 2
EOF
```

Paginated Viewing

```bash
# Page through file
less filename
more filename

# less navigation:
#   Space/f  - next page
#   b        - previous page
#   /pattern - search forward
#   ?pattern - search backward
#   n        - next search result
#   N        - previous search result
#   q        - quit

# More options
less +F filename          # Follow file like tail -f
less +/pattern filename   # Start at first match
```

Partial File Display

```bash
# First lines of file
head filename
head -n 20 filename    # First 20 lines
head -c 100 filename   # First 100 characters

# Last lines of file
tail filename
tail -n 20 filename    # Last 20 lines
tail -f filename       # Follow file changes
tail -F filename       # Follow with retry

# Specific line ranges
sed -n '10,20p' filename          # Lines 10-20
awk 'NR>=10 && NR<=20' filename   # Lines 10-20
```

Pattern Searching with grep

Basic grep Usage

```bash
# Search for pattern
grep "pattern" filename
grep "pattern" file1 file2 file3

# Case-insensitive search
grep -i "pattern" filename

# Show line numbers
grep -n "pattern" filename

# Show only matching part
grep -o "pattern" filename

# Count matches
grep -c "pattern" filename
```

Advanced grep Options

```bash
# Recursive search
grep -r "pattern" /path/to/directory
grep -R "pattern" /path/to/directory   # Also follow symlinks

# Search in specific file types
grep -r --include="*.txt" "pattern" /path
grep -r --exclude="*.log" "pattern" /path

# Invert match (show non-matching lines)
grep -v "pattern" filename

# Show context around matches
grep -A 3 "pattern" filename   # 3 lines after
grep -B 3 "pattern" filename   # 3 lines before
grep -C 3 "pattern" filename   # 3 lines before and after

# Multiple patterns
grep -E "pattern1|pattern2" filename
grep -e "pattern1" -e "pattern2" filename
```

Regular Expressions with grep

```bash
# Extended regular expressions
grep -E "^start.*end$" filename
grep -E "[0-9]{3}-[0-9]{3}-[0-9]{4}" filename   # Phone numbers
grep -E "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b" filename   # Email addresses

# Perl-compatible regular expressions
grep -P "\d{3}-\d{3}-\d{4}" filename

# Word boundaries
grep -w "word" filename   # Match whole word only
grep "\bword\b" filename  # Same as -w

# Character classes
grep "[0-9]" filename     # Any digit
grep "[a-zA-Z]" filename  # Any letter
grep "[^0-9]" filename    # Not a digit
```

Stream Editing with sed

Basic sed Operations

```bash
# Substitute (replace)
sed 's/old/new/' filename    # First occurrence per line
sed 's/old/new/g' filename   # All occurrences
sed 's/old/new/2' filename   # Second occurrence per line

# In-place editing
sed -i 's/old/new/g' filename
sed -i.bak 's/old/new/g' filename   # Create backup

# Case-insensitive substitution
sed 's/old/new/gi' filename
```

Advanced sed Commands

```bash
# Delete lines
sed '5d' filename              # Delete line 5
sed '5,10d' filename           # Delete lines 5-10
sed '/pattern/d' filename      # Delete lines matching pattern

# Print specific lines
sed -n '5p' filename           # Print line 5 only
sed -n '5,10p' filename        # Print lines 5-10
sed -n '/pattern/p' filename   # Print matching lines

# Insert and append
sed '5i\New line' filename     # Insert before line 5
sed '5a\New line' filename     # Append after line 5

# Multiple commands
sed -e 's/old1/new1/g' -e 's/old2/new2/g' filename
sed 's/old1/new1/g; s/old2/new2/g' filename
```

sed with Regular Expressions

```bash
# Address ranges with patterns
sed '/start/,/end/d' filename   # Delete from start to end pattern
sed '/pattern/,+5d' filename    # Delete matching line and next 5 (GNU sed)

# Backreferences
sed -E 's/([0-9]+)-([0-9]+)/\2-\1/' filename   # Swap numbers around dash

# Multi-line operations
sed 'N;s/\n/ /' filename        # Join pairs of lines
```
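The backreference swap is easy to verify on a one-off string (the sample value is made up):

```bash
echo "123-456" | sed -E 's/([0-9]+)-([0-9]+)/\2-\1/'
# Output: 456-123
```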

Text Processing with awk

Basic awk Usage

```bash
# Print specific fields
awk '{print $1}' filename        # First field
awk '{print $1, $3}' filename    # First and third fields
awk '{print $NF}' filename       # Last field
awk '{print $(NF-1)}' filename   # Second-to-last field

# Field separator
awk -F: '{print $1}' /etc/passwd   # Use colon as separator
awk -F',' '{print $2}' file.csv    # Use comma as separator

# Print with custom formatting
awk '{printf "%-10s %s\n", $1, $2}' filename
```

awk Pattern Matching

```bash
# Pattern matching
awk '/pattern/ {print}' filename
awk '/pattern/ {print $1}' filename
awk '$1 ~ /pattern/ {print}' filename    # First field matches pattern
awk '$1 !~ /pattern/ {print}' filename   # First field doesn't match

# Numeric comparisons
awk '$3 > 100 {print}' filename        # Third field greater than 100
awk '$2 == "value" {print}' filename   # Second field equals value
awk 'NR > 1 {print}' filename          # Skip header line
```

awk Programming Constructs

```bash
# Variables and calculations
awk '{sum += $1} END {print sum}' filename   # Sum first column
awk '{count++} END {print count}' filename   # Count lines

# Conditional statements
awk '{if ($1 > 100) print "High: " $0; else print "Low: " $0}' filename

# Loops
awk '{for (i=1; i<=NF; i++) print $i}' filename   # Print each field on its own line

# Built-in variables
awk '{print NR, NF, $0}' filename   # Line number, field count, whole line
awk 'END {print NR}' filename       # Total line count
```
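A quick check of the column-sum idiom on inline sample numbers:

```bash
printf '10\n20\n30\n' | awk '{sum += $1} END {print sum}'
# Output: 60
```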

Advanced awk Features

```bash
# Range patterns
awk '/start/,/end/ {print}' filename   # Print from start to end pattern

# User-defined functions
awk 'function square(x) {return x*x} {print square($1)}' filename

# Arrays
awk '{count[$1]++} END {for (word in count) print word, count[word]}' filename

# String functions
awk '{print length($0)}' filename          # Line length
awk '{print substr($0, 1, 10)}' filename   # First 10 characters
awk '{print toupper($0)}' filename         # Convert to uppercase
```
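The array idiom deserves a worked example; the input words are illustrative, and the traversal order of `for (word in count)` is unspecified:

```bash
printf 'apple\nbanana\napple\n' | awk '{count[$1]++} END {for (w in count) print w, count[w]}'
# Output (in some order): apple 2 / banana 1
```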

Sorting and Uniqueness

Basic Sorting

```bash
# Sort lines alphabetically
sort filename
sort -r filename    # Reverse order
sort -u filename    # Remove duplicates

# Numeric sorting
sort -n filename    # Numeric sort
sort -nr filename   # Numeric reverse sort
sort -h filename    # Human-numeric sort (1K, 2M, etc.)

# Sort by specific field
sort -k2 filename     # Sort by second field
sort -k2,2 filename   # Sort by second field only
sort -k2n filename    # Numeric sort by second field
```

Advanced Sorting

```bash
# Multiple sort keys
sort -k1,1 -k2n filename   # Sort by field 1, then numerically by field 2

# Custom field separator
sort -t: -k3n /etc/passwd   # Sort passwd by UID

# Sort by character positions
sort -k1.2,1.4 filename    # Sort by characters 2-4 of first field

# Stable sort
sort -s -k2 filename       # Maintain relative order of equal elements
```

uniq Operations

```bash
# Remove duplicate lines
uniq filename             # Remove consecutive duplicates
sort filename | uniq      # Remove all duplicates

# Count occurrences
uniq -c filename          # Count consecutive duplicates
sort filename | uniq -c   # Count all duplicates

# Show only duplicates or unique lines
uniq -d filename   # Show only duplicate lines
uniq -u filename   # Show only unique lines

# Compare fields
uniq -f1 filename   # Skip first field when comparing
uniq -s5 filename   # Skip first 5 characters when comparing
```
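Because uniq only collapses adjacent duplicates, unsorted input keeps repeats; a small illustrative comparison:

```bash
printf 'a\nb\na\n' | uniq          # Prints a, b, a (the repeat survives)
printf 'a\nb\na\n' | sort | uniq   # Prints a, b
```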

Text Transformation

Character Translation

```bash
# Character replacement
tr 'a-z' 'A-Z' < filename   # Convert to uppercase
tr 'A-Z' 'a-z' < filename   # Convert to lowercase
tr ' ' '_' < filename       # Replace spaces with underscores

# Delete characters
tr -d '0-9' < filename         # Delete all digits
tr -d '\n' < filename          # Remove newlines
tr -d '[:punct:]' < filename   # Remove punctuation

# Squeeze repeated characters
tr -s ' ' < filename    # Squeeze multiple spaces to one
tr -s '\n' < filename   # Collapse blank lines
```
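A one-line demonstration of squeezing (the sample string is made up):

```bash
echo "too    many     spaces" | tr -s ' '
# Output: too many spaces
```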

Cutting and Pasting

```bash
# Extract columns by character position
cut -c1-10 filename     # Characters 1-10
cut -c1,5,10 filename   # Characters 1, 5, and 10
cut -c10- filename      # From character 10 to end

# Extract fields
cut -d: -f1 /etc/passwd   # First field (colon delimiter)
cut -d, -f1,3 file.csv    # Fields 1 and 3 (comma delimiter)
cut -f2- filename         # From field 2 to end (tab delimiter)

# Paste files together
paste file1 file2       # Merge lines side by side
paste -d, file1 file2   # Use comma as delimiter
paste -s filename       # Merge all lines into one
```

Join Operations

```bash
# Join files on common field
join file1 file2             # Join on first field
join -1 2 -2 1 file1 file2   # Join field 2 of file1 with field 1 of file2
join -t: file1 file2         # Use colon as field separator

# Outer joins
join -a1 file1 file2       # Include unmatched lines from file1
join -a2 file1 file2       # Include unmatched lines from file2
join -a1 -a2 file1 file2   # Full outer join
```
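Note that join requires both inputs to be sorted on the join field. A minimal worked example with made-up file names and contents:

```bash
printf '1 alice\n2 bob\n'  > ids.txt     # id name
printf '1 admin\n2 user\n' > roles.txt   # id role
join ids.txt roles.txt
# Output:
# 1 alice admin
# 2 bob user
```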

Text Analysis and Statistics

Word and Line Counting

```bash
# Count lines, words, characters
wc filename
wc -l filename   # Lines only
wc -w filename   # Words only
wc -c filename   # Bytes
wc -m filename   # Characters (multibyte-aware)

# Count specific patterns
grep -c "pattern" filename           # Count matching lines
grep -o "pattern" filename | wc -l   # Count pattern occurrences
```

Frequency Analysis

```bash
# Word frequency
tr ' ' '\n' < filename | sort | uniq -c | sort -nr

# Character frequency
fold -w1 filename | sort | uniq -c | sort -nr

# Line frequency
sort filename | uniq -c | sort -nr

# Field frequency
awk '{print $1}' filename | sort | uniq -c | sort -nr
```
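For instance, running the word-frequency pipeline on an inline sample sentence:

```bash
echo "the cat sat on the mat" | tr ' ' '\n' | sort | uniq -c | sort -nr
# First line of output is "2 the"; each remaining word appears once
```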

Advanced Text Processing

Multi-File Operations

```bash
# Process multiple files
grep "pattern" *.txt
awk '{print FILENAME, $0}' *.txt
sed 's/old/new/g' *.txt

# Combine files
cat file1 file2 > combined
sort -m sorted1 sorted2 > merged   # Merge already-sorted files
```

Complex Pipelines

```bash
# Log analysis: top client IPs generating 404s
grep "404" access.log | awk '{print $1}' | sort | uniq -c | sort -nr

# CSV processing: extract columns, drop blank lines, deduplicate
cut -d, -f2,4 data.csv | grep -v "^$" | sort -u

# Text statistics: ten most frequent words
tr -d '[:punct:]' < document.txt | tr ' ' '\n' | grep -v "^$" | sort | uniq -c | sort -nr | head -10
```

Regular Expression Tools

```bash
# Perl-style regex
perl -pe 's/pattern/replacement/g' filename
perl -ne 'print if /pattern/' filename

# Extended grep alternatives
egrep "pattern1|pattern2" filename   # Same as grep -E (obsolescent)
fgrep "literal string" filename      # Same as grep -F; no regex interpretation
```

Text Processing Troubleshooting

Common Issues

```bash
# Handle different line endings
dos2unix filename       # Convert DOS to Unix line endings
unix2dos filename       # Convert Unix to DOS line endings
tr -d '\r' < filename   # Remove carriage returns

# Encoding issues
iconv -f ISO-8859-1 -t UTF-8 filename   # Convert encoding
file filename                           # Check file type and encoding

# Large file processing
split -l 1000 largefile prefix             # Split into 1000-line chunks
head -n 1000000 largefile | tail -n 1000   # Process a middle section
```
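To spot DOS line endings before converting, cat -A renders the carriage return as ^M (shown here on a made-up line):

```bash
printf 'dos line\r\n' | cat -A
# Output: dos line^M$
```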

Performance Optimization

```bash
# Faster alternatives for large files
LC_ALL=C sort filename   # Use the C locale for faster sorting
# mawk is a faster awk implementation; ripgrep (rg) is a faster grep alternative

# Memory-efficient processing
sort -S 1G filename      # Use 1 GB of memory for sorting
# For very large files, split the input and process it in chunks
```
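A rough way to measure the locale effect yourself, assuming a large input file named largefile:

```bash
time sort largefile > /dev/null            # Locale-aware comparison
time LC_ALL=C sort largefile > /dev/null   # Byte-order comparison, typically faster
```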


---

*This cheat sheet provides comprehensive text processing commands for Linux systems. Practice on sample data before applying these commands to important files.*