Linux Text Processing Cheat Sheet
Overview
Linux text processing tools provide powerful capabilities for manipulating, analyzing, and transforming text data. This comprehensive guide covers essential tools such as grep, awk, sed, sort, and many others that form the foundation of text processing and data analysis workflows.
Warning: Text processing commands can permanently modify files. Always back up important files before performing bulk text operations.
File Viewing and Navigation
Basic File Display
```bash
# Display entire file
cat filename
cat -n filename              # With line numbers
cat -b filename              # Number non-blank lines only
cat -A filename              # Show all characters including non-printing

# Display multiple files
cat file1 file2 file3

# Create file with content
cat > newfile << EOF
Line 1
Line 2
EOF
```
Paginated Viewing
```bash
# Page through file
less filename
more filename

# less navigation:
#   Space/f  - next page
#   b        - previous page
#   /pattern - search forward
#   ?pattern - search backward
#   n        - next search result
#   N        - previous search result
#   q        - quit

# More options
less +F filename             # Follow file like tail -f
less +/pattern filename      # Start at first match
```
Partial File Display
```bash
# First lines of file
head filename
head -n 20 filename          # First 20 lines
head -c 100 filename         # First 100 characters

# Last lines of file
tail filename
tail -n 20 filename          # Last 20 lines
tail -f filename             # Follow file changes
tail -F filename             # Follow with retry

# Specific line ranges
sed -n '10,20p' filename     # Lines 10-20
awk 'NR>=10 && NR<=20' filename
```
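As a quick sanity check, the `sed` and `awk` range commands above print the same slice of a file. This sketch generates sample data with `seq`; the `/tmp/sample.txt` path is purely illustrative:

```shell
# Generate 30 numbered lines as sample data (temporary path is an assumption).
seq 1 30 > /tmp/sample.txt

# Both commands print lines 10 through 12.
sed -n '10,12p' /tmp/sample.txt
awk 'NR>=10 && NR<=12' /tmp/sample.txt
```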
Pattern Searching with grep
Basic grep Usage
```bash
# Search for pattern
grep "pattern" filename
grep "pattern" file1 file2 file3

# Case-insensitive search
grep -i "pattern" filename

# Show line numbers
grep -n "pattern" filename

# Show only matching part
grep -o "pattern" filename

# Count matching lines
grep -c "pattern" filename
```
Advanced grep Options
```bash
# Recursive search
grep -r "pattern" /path/to/directory
grep -R "pattern" /path/to/directory  # Also follow symlinks

# Search in specific file types
grep -r --include="*.txt" "pattern" /path
grep -r --exclude="*.log" "pattern" /path

# Invert match (show non-matching lines)
grep -v "pattern" filename

# Show context around matches
grep -A 3 "pattern" filename  # 3 lines after
grep -B 3 "pattern" filename  # 3 lines before
grep -C 3 "pattern" filename  # 3 lines before and after

# Multiple patterns
grep -E "pattern1|pattern2" filename
grep -e "pattern1" -e "pattern2" filename
```
Regular Expressions with grep
```bash
# Extended regular expressions (braces are not escaped with -E)
grep -E "^start.*end$" filename
grep -E "[0-9]{3}-[0-9]{3}-[0-9]{4}" filename  # Phone numbers
grep -E "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b" filename  # Email addresses

# Perl-compatible regular expressions
grep -P "\d{3}-\d{3}-\d{4}" filename

# Word boundaries
grep -w "word" filename      # Match whole word only
grep "\bword\b" filename     # Same as -w (GNU grep)

# Character classes
grep "[0-9]" filename        # Any digit
grep "[a-zA-Z]" filename     # Any letter
grep "[^0-9]" filename       # Not a digit
```
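A small worked example of character classes and word boundaries; the sample data and the `/tmp/people.txt` path are illustrative assumptions:

```shell
# Sample data: one record per line.
printf 'alice 42\nbob ninety\ncarol 7\n' > /tmp/people.txt

# Lines containing at least one digit:
grep '[0-9]' /tmp/people.txt    # prints "alice 42" and "carol 7"

# Whole-word match only:
grep -w 'bob' /tmp/people.txt   # prints "bob ninety"
```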
Stream Editing with sed
Basic sed Operations
```bash
# Substitute (replace)
sed 's/old/new/' filename    # First occurrence per line
sed 's/old/new/g' filename   # All occurrences
sed 's/old/new/2' filename   # Second occurrence per line

# In-place editing
sed -i 's/old/new/g' filename
sed -i.bak 's/old/new/g' filename  # Create backup

# Case-insensitive substitution (GNU sed)
sed 's/old/new/gi' filename
```
Advanced sed Commands
```bash
# Delete lines
sed '5d' filename            # Delete line 5
sed '5,10d' filename         # Delete lines 5-10
sed '/pattern/d' filename    # Delete lines matching pattern

# Print specific lines
sed -n '5p' filename         # Print line 5 only
sed -n '5,10p' filename      # Print lines 5-10
sed -n '/pattern/p' filename # Print matching lines

# Insert and append
sed '5i\New line' filename   # Insert before line 5
sed '5a\New line' filename   # Append after line 5

# Multiple commands
sed -e 's/old1/new1/g' -e 's/old2/new2/g' filename
sed 's/old1/new1/g; s/old2/new2/g' filename
```
sed with Regular Expressions
```bash
# Address ranges with patterns
sed '/start/,/end/d' filename   # Delete from start to end pattern
sed '/pattern/,+5d' filename    # Delete matching line and next 5 (GNU sed)

# Backreferences (groups must be escaped in basic regular expressions)
sed 's/\([0-9]*\)-\([0-9]*\)/\2-\1/' filename  # Swap numbers around dash

# Multi-line operations
sed 'N;s/\n/ /' filename        # Join pairs of lines
```
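The backreference swap above can be checked on a single input line; the second variant uses `-E` (extended regex, where groups are unescaped), which is a stylistic alternative rather than a different technique:

```shell
# Swap the two numbers around the dash, e.g. "12-34" becomes "34-12".
echo '12-34' | sed 's/\([0-9]*\)-\([0-9]*\)/\2-\1/'      # prints "34-12"

# Same substitution with extended regex syntax:
echo '12-34' | sed -E 's/([0-9]+)-([0-9]+)/\2-\1/'       # prints "34-12"
```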
Text Processing with awk
Basic awk Usage
```bash
# Print specific fields
awk '{print $1}' filename        # First field
awk '{print $1, $3}' filename    # First and third fields
awk '{print $NF}' filename       # Last field
awk '{print $(NF-1)}' filename   # Second-to-last field

# Field separator
awk -F: '{print $1}' /etc/passwd # Use colon as separator
awk -F',' '{print $2}' file.csv  # Use comma as separator

# Print with custom formatting
awk '{printf "%-10s %s\n", $1, $2}' filename
```
awk Pattern Matching
```bash
# Pattern matching
awk '/pattern/ {print}' filename
awk '/pattern/ {print $1}' filename
awk '$1 ~ /pattern/ {print}' filename   # First field matches pattern
awk '$1 !~ /pattern/ {print}' filename  # First field doesn't match

# Numeric comparisons
awk '$3 > 100 {print}' filename         # Third field greater than 100
awk '$2 == "value" {print}' filename    # Second field equals value
awk 'NR > 1 {print}' filename           # Skip header line
```
awk Programming Constructs
```bash
# Variables and calculations
awk '{sum += $1} END {print sum}' filename   # Sum first column
awk '{count++} END {print count}' filename   # Count lines

# Conditional statements
awk '{if ($1 > 100) print "High: " $0; else print "Low: " $0}' filename

# Loops
awk '{for (i = 1; i <= NF; i++) print $i}' filename  # Print each field on its own line

# Built-in variables
awk '{print NR, NF, $0}' filename  # Line number, field count, whole line
awk 'END {print NR}' filename      # Total line count
```
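The sum and line-count one-liners above can be verified on a small numeric column; the sample data and `/tmp/nums.txt` path are assumptions for illustration:

```shell
# Sample numeric column.
printf '10\n20\n30\n' > /tmp/nums.txt

# Sum of the first column and total line count:
awk '{sum += $1} END {print sum}' /tmp/nums.txt   # prints 60
awk 'END {print NR}' /tmp/nums.txt                # prints 3
```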
Advanced awk Features
```bash
# Range patterns
awk '/start/,/end/ {print}' filename  # Print from start to end pattern

# User-defined functions
awk 'function square(x) {return x*x} {print square($1)}' filename

# Arrays
awk '{count[$1]++} END {for (word in count) print word, count[word]}' filename

# String functions
awk '{print length($0)}' filename        # Line length
awk '{print substr($0, 1, 10)}' filename # First 10 characters
awk '{print toupper($0)}' filename       # Convert to uppercase
```
Sorting and Uniqueness
Basic Sorting
```bash
# Sort lines alphabetically
sort filename
sort -r filename             # Reverse order
sort -u filename             # Remove duplicates

# Numeric sorting
sort -n filename             # Numeric sort
sort -nr filename            # Numeric reverse sort
sort -h filename             # Human-numeric sort (1K, 2M, etc.)

# Sort by specific field
sort -k2 filename            # Sort by second field
sort -k2,2 filename          # Sort by second field only
sort -k2n filename           # Numeric sort by second field
```
Advanced Sorting
```bash
# Multiple sort keys
sort -k1,1 -k2n filename     # Sort by field 1, then numerically by field 2

# Custom field separator
sort -t: -k3n /etc/passwd    # Sort passwd by UID

# Sort by specific characters
sort -k1.2,1.4 filename      # Sort by characters 2-4 of the first field

# Stable sort
sort -s -k2 filename         # Maintain relative order of equal elements
```
Uniqueness Operations
```bash
# Remove duplicate lines
uniq filename                # Remove consecutive duplicates
sort filename | uniq         # Remove all duplicates

# Count occurrences
uniq -c filename             # Count consecutive duplicates
sort filename | uniq -c      # Count all duplicates

# Show only duplicates or unique lines
uniq -d filename             # Show only duplicated lines
uniq -u filename             # Show only unique lines

# Compare fields
uniq -f1 filename            # Skip first field when comparing
uniq -s5 filename            # Skip first 5 characters
```
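Because `uniq` only collapses consecutive duplicates, the `sort | uniq -c` idiom above is the standard way to count all occurrences. A sketch with sample data (the `/tmp/fruit.txt` path is an assumption):

```shell
printf 'apple\nbanana\napple\ncherry\napple\n' > /tmp/fruit.txt

# Count every distinct line, most frequent first.
# "apple" appears 3 times, so it heads the output.
sort /tmp/fruit.txt | uniq -c | sort -nr
```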
Text Transformation
Character Translation
```bash
# Character replacement
tr 'a-z' 'A-Z' < filename    # Convert to uppercase
tr 'A-Z' 'a-z' < filename    # Convert to lowercase
tr ' ' '_' < filename        # Replace spaces with underscores

# Delete characters
tr -d '0-9' < filename       # Delete all digits
tr -d '\n' < filename        # Remove newlines
tr -d '[:punct:]' < filename # Remove punctuation

# Squeeze repeated characters
tr -s ' ' < filename         # Squeeze multiple spaces to one
tr -s '\n' < filename        # Remove blank lines
```
Cutting and Pasting
```bash
# Extract columns by character position
cut -c1-10 filename          # Characters 1-10
cut -c1,5,10 filename        # Characters 1, 5, and 10
cut -c10- filename           # From character 10 to end

# Extract fields
cut -d: -f1 /etc/passwd      # First field (colon delimiter)
cut -d, -f1,3 file.csv       # Fields 1 and 3 (comma delimiter)
cut -f2- filename            # From field 2 to end (tab delimiter)

# Paste files together
paste file1 file2            # Merge lines side by side
paste -d, file1 file2        # Use comma as delimiter
paste -s filename            # Merge all lines into one
```
Join Operations
```bash
# Join files on a common field (both inputs must be sorted on that field)
join file1 file2             # Join on first field
join -1 2 -2 1 file1 file2   # Join field 2 of file1 with field 1 of file2
join -t: file1 file2         # Use colon as field separator

# Outer joins
join -a1 file1 file2         # Include unmatched lines from file1
join -a2 file1 file2         # Include unmatched lines from file2
join -a1 -a2 file1 file2     # Full outer join
```
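A minimal join sketch; the two sample files (sorted on the join field, as `join` requires) and their `/tmp` paths are assumptions:

```shell
printf '1 alice\n2 bob\n' > /tmp/names.txt
printf '1 admin\n3 guest\n' > /tmp/roles.txt

# Inner join on the first field:
join /tmp/names.txt /tmp/roles.txt       # prints "1 alice admin"

# Left outer join also keeps the unmatched "2 bob" from file1:
join -a1 /tmp/names.txt /tmp/roles.txt
```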
Text Analysis and Statistics
Word and Line Counting
```bash
# Count lines, words, characters
wc filename
wc -l filename               # Lines only
wc -w filename               # Words only
wc -c filename               # Bytes only
wc -m filename               # Characters (multibyte aware)

# Count specific patterns
grep -c "pattern" filename          # Count matching lines
grep -o "pattern" filename | wc -l  # Count pattern occurrences
```
Frequency Analysis
```bash
# Word frequency
tr ' ' '\n' < filename | sort | uniq -c | sort -nr

# Character frequency
fold -w1 filename | sort | uniq -c | sort -nr

# Line frequency
sort filename | uniq -c | sort -nr

# Field frequency
awk '{print $1}' filename | sort | uniq -c | sort -nr
```
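The word-frequency pipeline above, run end to end on a one-line sample (the sentence and the `/tmp/text.txt` path are illustrative):

```shell
printf 'the cat and the dog and the bird\n' > /tmp/text.txt

# Most frequent words first: "the" (3), then "and" (2), then the rest.
tr ' ' '\n' < /tmp/text.txt | sort | uniq -c | sort -nr
```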
Advanced Text Processing
Multi-file Operations
```bash
# Process multiple files
grep "pattern" *.txt
awk '{print FILENAME, $0}' *.txt
sed 's/old/new/g' *.txt

# Combine files
cat file1 file2 > combined
sort -m sorted1 sorted2 > merged  # Merge pre-sorted files
```
Complex Pipelines
```bash
# Log analysis: count 404s per client IP, most frequent first
cat access.log | grep "404" | awk '{print $1}' | sort | uniq -c | sort -nr

# CSV processing: unique, non-empty values from fields 2 and 4
cut -d, -f2,4 data.csv | grep -v "^$" | sort -u

# Text statistics: top 10 words in a document
cat document.txt | tr -d '[:punct:]' | tr ' ' '\n' | grep -v "^$" | sort | uniq -c | sort -nr | head -10
```
Regular Expression Tools
```bash
# Perl-style regex
perl -pe 's/pattern/replacement/g' filename
perl -ne 'print if /pattern/' filename

# Extended grep alternatives
egrep "pattern1|pattern2" filename   # Deprecated alias for grep -E
fgrep "literal string" filename      # Deprecated alias for grep -F (no regex interpretation)
```
Text Processing Troubleshooting
Common Issues
```bash
# Handle different line endings
dos2unix filename            # Convert DOS to Unix line endings
unix2dos filename            # Convert Unix to DOS line endings
tr -d '\r' < filename        # Remove carriage returns

# Encoding issues
iconv -f ISO-8859-1 -t UTF-8 filename  # Convert encoding
file filename                # Check file type and encoding

# Large file processing
split -l 1000 largefile prefix            # Split into 1000-line chunks
head -n 1000000 largefile | tail -n 1000  # Process a middle section
```
Performance Optimization
```bash
# Faster alternatives for large files
LC_ALL=C sort filename       # Use the C locale for faster sorting
# mawk is a faster awk implementation
# ripgrep (rg) is a faster search tool than grep

# Memory-efficient processing
sort -S 1G filename          # Use up to 1 GB of memory for sorting
# For very large files, split the input and process it in chunks
```
Resources
- GNU Text Utilities Manual
- AWK Programming Guide
- sed Manual
- Regular Expressions Tutorial
- Text processing examples
---
*This cheat sheet provides comprehensive text processing commands for Linux systems. Practice with sample data before applying these commands to important files.*