
Twint Twitter OSINT Tool Cheat Sheet

Overview

Twint is an advanced Twitter scraping tool written in Python that collects tweets from Twitter profiles without using the Twitter API. It can retrieve tweets, followers, following, retweets, and more while bypassing most of Twitter's restrictions. Twint is particularly useful for OSINT investigations, social media monitoring, and research.

Legal notice: Use Twint only for legitimate research, OSINT investigations, or authorized security testing. Respect Twitter's Terms of Service and applicable data protection laws.

Installation

Python Pip Installation

```bash
# Install via pip
pip3 install twint

# Install development version
pip3 install --user --upgrade git+https://github.com/twintproject/twint.git@origin/master#egg=twint

# Install with additional dependencies
pip3 install twint[all]

# Verify installation
twint --help
```

Docker Installation

```bash
# Pull Docker image
docker pull twintproject/twint

# Run with Docker
docker run -it --rm twintproject/twint

# Build from source
git clone https://github.com/twintproject/twint.git
cd twint
docker build -t twint .

# Run with volume mount
docker run -it --rm -v $(pwd)/output:/output twint
```

Manual Installation

```bash
# Clone repository
git clone https://github.com/twintproject/twint.git
cd twint

# Install dependencies
pip3 install -r requirements.txt

# Install package
python3 setup.py install

# Alternative: Run directly
python3 -m twint --help
```

Virtual Environment Setup

```bash
# Create virtual environment
python3 -m venv twint-env
source twint-env/bin/activate

# Install Twint
pip install twint

# Verify installation
twint --version
```

Basic Usage

Command-Line Interface

```bash
# Basic tweet scraping
twint -u username

# Scrape tweets with specific search term
twint -s "search term"

# Scrape tweets from specific user
twint -u elonmusk

# Limit number of tweets
twint -u username --limit 100

# Save to file
twint -u username -o tweets.csv --csv

# Search with date range
twint -s "cybersecurity" --since "2023-01-01" --until "2023-12-31"
```

Python API Usage

```python
import twint

# Configure Twint
c = twint.Config()
c.Username = "username"
c.Limit = 100
c.Store_csv = True
c.Output = "tweets.csv"

# Run search
twint.run.Search(c)
```

Advanced Search Options

User-Based Searches

```bash
# Get user's tweets
twint -u username

# Get user's followers
twint -u username --followers

# Get user's following
twint -u username --following

# Get user's favorites/likes
twint -u username --favorites

# Get user information
twint -u username --user-full

# Get verified users only
twint -s "search term" --verified
```
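
The same lookups are available from the Python API. A minimal sketch, assuming the pandas storage backend behaves as in the examples further below, where follower and following lists land in `twint.storage.panda.Follow_df`:

```python
import twint

def get_followers(username, limit=100):
    """Fetch up to `limit` followers of a user via the Python API."""
    c = twint.Config()
    c.Username = username
    c.Limit = limit
    c.Store_pandas = True   # results land in twint.storage.panda.Follow_df
    c.Hide_output = True

    twint.run.Followers(c)  # use twint.run.Following(c) for accounts the user follows
    return twint.storage.panda.Follow_df

followers = get_followers("elonmusk", limit=50)
print(followers.head() if followers is not None else "No data returned")
```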

Content-Based Searches

```bash
# Search by keyword
twint -s "cybersecurity"

# Search with hashtag
twint -s "#infosec"

# Search with multiple keywords
twint -s "cybersecurity OR infosec"

# Search for exact phrase
twint -s '"exact phrase"'

# Search excluding terms
twint -s "cybersecurity -spam"

# Search for tweets with links
twint -s "cybersecurity" --links

# Search for tweets with media
twint -s "cybersecurity" --media
```

Geographic and Language Filters

```bash
# Search by location
twint -s "cybersecurity" --near "New York"

# Search with specific language
twint -s "cybersecurity" --lang en

# Search with geolocation
twint -s "cybersecurity" --geo "40.7128,-74.0060,10km"

# Search popular tweets only
twint -s "cybersecurity" --popular

# Search for tweets with minimum likes
twint -s "cybersecurity" --min-likes 10

# Search for tweets with minimum retweets
twint -s "cybersecurity" --min-retweets 5
```

Date and Time Filters

```bash
# Search with date range
twint -s "cybersecurity" --since "2023-01-01" --until "2023-12-31"

# Search tweets from specific year
twint -s "cybersecurity" --year 2023

# Search tweets from specific hour
twint -s "cybersecurity" --hour 14

# Search tweets from today
twint -s "cybersecurity" --since $(date +%Y-%m-%d)

# Search tweets from last week
twint -s "cybersecurity" --since $(date -d '7 days ago' +%Y-%m-%d)
```

Output Formats and Storage

File Output Options

```bash
# Save as CSV
twint -u username -o output.csv --csv

# Save as JSON
twint -u username -o output.json --json

# Save as text file
twint -u username -o output.txt

# Custom CSV format
twint -u username --csv --output tweets.csv --custom-csv "date,time,username,tweet"

# Hide output (silent mode)
twint -u username --hide-output

# Debug mode
twint -u username --debug
```

Database Storage

```bash
# Store in Elasticsearch
twint -u username --elasticsearch localhost:9200

# Store in SQLite database
twint -u username --database tweets.db

# Store with custom database table
twint -u username --database tweets.db --table-tweets custom_tweets
```
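
Once tweets are in SQLite they can be queried with standard tools. A small sketch using pandas; the table name `tweets` reflects Twint's default schema and should be verified against the actual database (for example with `.tables` in the sqlite3 shell):

```python
import sqlite3
import pandas as pd

# Open the database created by: twint -u username --database tweets.db
conn = sqlite3.connect("tweets.db")

# List the tables Twint created (schema may vary between Twint versions)
tables = pd.read_sql_query("SELECT name FROM sqlite_master WHERE type='table'", conn)
print(tables)

# Load the tweet table into a DataFrame (assumes the default table name 'tweets')
df = pd.read_sql_query("SELECT * FROM tweets LIMIT 100", conn)
print(df.columns.tolist())

conn.close()
```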

Advanced Output Configuration

```python
import twint

# Configure advanced output
c = twint.Config()
c.Username = "username"
c.Store_csv = True
c.Output = "detailed_tweets.csv"
c.Custom_csv = ["date", "time", "username", "tweet", "replies_count", "retweets_count", "likes_count", "hashtags", "urls"]
c.Hide_output = True

# Run search
twint.run.Search(c)
```

Advanced Python API Usage

Basic Configuration

```python
import twint
import pandas as pd

def scrape_user_tweets(username, limit=100):
    """Scrape tweets from specific user"""
    c = twint.Config()
    c.Username = username
    c.Limit = limit
    c.Store_pandas = True
    c.Hide_output = True

    twint.run.Search(c)

    # Get pandas dataframe
    tweets_df = twint.storage.panda.Tweets_df
    return tweets_df

# Usage
tweets = scrape_user_tweets("elonmusk", 50)
print(f"Scraped {len(tweets)} tweets")
```

Advanced Search Configuration

```python
import twint
import pandas as pd
from datetime import datetime, timedelta

def advanced_search(search_term, days_back=7, min_likes=5):
    """Advanced search with multiple filters"""
    c = twint.Config()

    # Search configuration
    c.Search = search_term
    c.Lang = "en"
    c.Min_likes = min_likes
    c.Popular_tweets = True

    # Date range (last N days)
    end_date = datetime.now()
    start_date = end_date - timedelta(days=days_back)
    c.Since = start_date.strftime("%Y-%m-%d")
    c.Until = end_date.strftime("%Y-%m-%d")

    # Output configuration
    c.Store_pandas = True
    c.Hide_output = True

    # Run search
    twint.run.Search(c)

    # Process results
    if twint.storage.panda.Tweets_df is not None:
        return twint.storage.panda.Tweets_df
    else:
        return pd.DataFrame()

# Usage
cybersec_tweets = advanced_search("cybersecurity", days_back=30, min_likes=10)
print(f"Found {len(cybersec_tweets)} popular cybersecurity tweets")
```

User Analysis Functions

```python
import twint
import pandas as pd
from collections import Counter

class TwitterOSINT:
    def __init__(self):
        self.tweets_df = None
        self.users_df = None

    def analyze_user(self, username):
        """Comprehensive user analysis"""
        # Get user tweets
        c = twint.Config()
        c.Username = username
        c.Limit = 1000
        c.Store_pandas = True
        c.Hide_output = True

        twint.run.Search(c)
        self.tweets_df = twint.storage.panda.Tweets_df

        if self.tweets_df is not None and not self.tweets_df.empty:
            analysis = {
                'username': username,
                'total_tweets': len(self.tweets_df),
                'date_range': {
                    'earliest': self.tweets_df['date'].min(),
                    'latest': self.tweets_df['date'].max()
                },
                'engagement': {
                    'avg_likes': self.tweets_df['likes_count'].mean(),
                    'avg_retweets': self.tweets_df['retweets_count'].mean(),
                    'avg_replies': self.tweets_df['replies_count'].mean()
                },
                'top_hashtags': self.get_top_hashtags(),
                'top_mentions': self.get_top_mentions(),
                'posting_patterns': self.analyze_posting_patterns()
            }
            return analysis
        else:
            return None

    def get_top_hashtags(self, top_n=10):
        """Extract top hashtags from tweets"""
        if self.tweets_df is None:
            return []

        all_hashtags = []
        for hashtags in self.tweets_df['hashtags'].dropna():
            if hashtags:
                all_hashtags.extend(hashtags)

        return Counter(all_hashtags).most_common(top_n)

    def get_top_mentions(self, top_n=10):
        """Extract top mentions from tweets"""
        if self.tweets_df is None:
            return []

        all_mentions = []
        for mentions in self.tweets_df['mentions'].dropna():
            if mentions:
                all_mentions.extend(mentions)

        return Counter(all_mentions).most_common(top_n)

    def analyze_posting_patterns(self):
        """Analyze posting time patterns"""
        if self.tweets_df is None:
            return {}

        # Convert time to hour
        self.tweets_df['hour'] = pd.to_datetime(self.tweets_df['time']).dt.hour

        patterns = {
            'hourly_distribution': self.tweets_df['hour'].value_counts().to_dict(),
            'most_active_hour': self.tweets_df['hour'].mode().iloc[0] if not self.tweets_df['hour'].empty else None,
            'daily_tweet_count': self.tweets_df.groupby('date').size().mean()
        }

        return patterns

    def search_and_analyze(self, search_term, limit=500):
        """Search for tweets and analyze patterns"""
        c = twint.Config()
        c.Search = search_term
        c.Limit = limit
        c.Store_pandas = True
        c.Hide_output = True

        twint.run.Search(c)
        self.tweets_df = twint.storage.panda.Tweets_df

        if self.tweets_df is not None and not self.tweets_df.empty:
            analysis = {
                'search_term': search_term,
                'total_tweets': len(self.tweets_df),
                'unique_users': self.tweets_df['username'].nunique(),
                'top_users': self.tweets_df['username'].value_counts().head(10).to_dict(),
                'engagement_stats': {
                    'total_likes': self.tweets_df['likes_count'].sum(),
                    'total_retweets': self.tweets_df['retweets_count'].sum(),
                    'avg_engagement': (self.tweets_df['likes_count'] + self.tweets_df['retweets_count']).mean()
                },
                'top_hashtags': self.get_top_hashtags(),
                'sentiment_indicators': self.basic_sentiment_analysis()
            }
            return analysis
        else:
            return None

    def basic_sentiment_analysis(self):
        """Basic sentiment analysis using keyword matching"""
        if self.tweets_df is None:
            return {}

        positive_words = ['good', 'great', 'excellent', 'amazing', 'love', 'best', 'awesome']
        negative_words = ['bad', 'terrible', 'awful', 'hate', 'worst', 'horrible', 'disgusting']

        positive_count = 0
        negative_count = 0

        for tweet in self.tweets_df['tweet'].str.lower():
            if any(word in tweet for word in positive_words):
                positive_count += 1
            if any(word in tweet for word in negative_words):
                negative_count += 1

        total_tweets = len(self.tweets_df)
        return {
            'positive_tweets': positive_count,
            'negative_tweets': negative_count,
            'neutral_tweets': total_tweets - positive_count - negative_count,
            'positive_ratio': positive_count / total_tweets if total_tweets > 0 else 0,
            'negative_ratio': negative_count / total_tweets if total_tweets > 0 else 0
        }

# Usage example
osint = TwitterOSINT()

# Analyze specific user
user_analysis = osint.analyze_user("elonmusk")
if user_analysis:
    print(f"User Analysis for {user_analysis['username']}:")
    print(f"Total tweets: {user_analysis['total_tweets']}")
    print(f"Average likes: {user_analysis['engagement']['avg_likes']:.2f}")
    print(f"Top hashtags: {user_analysis['top_hashtags'][:5]}")

# Search and analyze topic
topic_analysis = osint.search_and_analyze("cybersecurity", limit=200)
if topic_analysis:
    print(f"\nTopic Analysis for '{topic_analysis['search_term']}':")
    print(f"Total tweets: {topic_analysis['total_tweets']}")
    print(f"Unique users: {topic_analysis['unique_users']}")
    print(f"Average engagement: {topic_analysis['engagement_stats']['avg_engagement']:.2f}")
```

OSINT Investigation Workflows

Target User Investigation

```python
#!/usr/bin/env python3
# twitter-user-investigation.py

import twint
import pandas as pd
import json
from datetime import datetime, timedelta
import matplotlib.pyplot as plt
import seaborn as sns

class TwitterUserInvestigation:
    def __init__(self, username):
        self.username = username
        self.tweets_df = None
        self.followers_df = None
        self.following_df = None
        self.results = {}

    def collect_user_data(self):
        """Collect comprehensive user data"""
        print(f"Investigating Twitter user: {self.username}")

        # Collect tweets
        self.collect_tweets()

        # Collect followers (limited)
        self.collect_followers()

        # Collect following (limited)
        self.collect_following()

        # Analyze collected data
        self.analyze_data()

    def collect_tweets(self, limit=1000):
        """Collect user tweets"""
        print("Collecting tweets...")

        c = twint.Config()
        c.Username = self.username
        c.Limit = limit
        c.Store_pandas = True
        c.Hide_output = True

        try:
            twint.run.Search(c)
            self.tweets_df = twint.storage.panda.Tweets_df
            print(f"Collected {len(self.tweets_df)} tweets")
        except Exception as e:
            print(f"Error collecting tweets: {e}")

    def collect_followers(self, limit=100):
        """Collect user followers"""
        print("Collecting followers...")

        c = twint.Config()
        c.Username = self.username
        c.Limit = limit
        c.Store_pandas = True
        c.Hide_output = True

        try:
            twint.run.Followers(c)
            self.followers_df = twint.storage.panda.Follow_df
            print(f"Collected {len(self.followers_df)} followers")
        except Exception as e:
            print(f"Error collecting followers: {e}")

    def collect_following(self, limit=100):
        """Collect users being followed"""
        print("Collecting following...")

        c = twint.Config()
        c.Username = self.username
        c.Limit = limit
        c.Store_pandas = True
        c.Hide_output = True

        try:
            twint.run.Following(c)
            self.following_df = twint.storage.panda.Follow_df
            print(f"Collected {len(self.following_df)} following")
        except Exception as e:
            print(f"Error collecting following: {e}")

    def analyze_data(self):
        """Analyze collected data"""
        if self.tweets_df is not None and not self.tweets_df.empty:
            self.results = {
                'basic_stats': self.get_basic_stats(),
                'temporal_analysis': self.analyze_temporal_patterns(),
                'content_analysis': self.analyze_content(),
                'network_analysis': self.analyze_network(),
                'behavioral_patterns': self.analyze_behavior()
            }

    def get_basic_stats(self):
        """Get basic statistics"""
        return {
            'total_tweets': len(self.tweets_df),
            'date_range': {
                'first_tweet': self.tweets_df['date'].min(),
                'last_tweet': self.tweets_df['date'].max()
            },
            'engagement': {
                'total_likes': self.tweets_df['likes_count'].sum(),
                'total_retweets': self.tweets_df['retweets_count'].sum(),
                'total_replies': self.tweets_df['replies_count'].sum(),
                'avg_likes': self.tweets_df['likes_count'].mean(),
                'avg_retweets': self.tweets_df['retweets_count'].mean()
            }
        }

    def analyze_temporal_patterns(self):
        """Analyze posting time patterns"""
        # Convert to datetime
        self.tweets_df['datetime'] = pd.to_datetime(self.tweets_df['date'] + ' ' + self.tweets_df['time'])
        self.tweets_df['hour'] = self.tweets_df['datetime'].dt.hour
        self.tweets_df['day_of_week'] = self.tweets_df['datetime'].dt.day_name()

        return {
            'hourly_pattern': self.tweets_df['hour'].value_counts().to_dict(),
            'daily_pattern': self.tweets_df['day_of_week'].value_counts().to_dict(),
            'most_active_hour': self.tweets_df['hour'].mode().iloc[0],
            'most_active_day': self.tweets_df['day_of_week'].mode().iloc[0],
            'posting_frequency': len(self.tweets_df) / max(1, (self.tweets_df['datetime'].max() - self.tweets_df['datetime'].min()).days)
        }

    def analyze_content(self):
        """Analyze tweet content"""
        # Extract hashtags, mentions, and URLs
        all_hashtags = []
        all_mentions = []
        all_urls = []

        for _, row in self.tweets_df.iterrows():
            if row['hashtags']:
                all_hashtags.extend(row['hashtags'])
            if row['mentions']:
                all_mentions.extend(row['mentions'])
            if row['urls']:
                all_urls.extend(row['urls'])

        return {
            'top_hashtags': pd.Series(all_hashtags).value_counts().head(10).to_dict(),
            'top_mentions': pd.Series(all_mentions).value_counts().head(10).to_dict(),
            'url_domains': self.extract_domains(all_urls),
            'tweet_length_stats': {
                'avg_length': self.tweets_df['tweet'].str.len().mean(),
                'max_length': self.tweets_df['tweet'].str.len().max(),
                'min_length': self.tweets_df['tweet'].str.len().min()
            }
        }

    def extract_domains(self, urls):
        """Extract domains from URLs"""
        from urllib.parse import urlparse

        domains = []
        for url in urls:
            try:
                domain = urlparse(url).netloc
                if domain:
                    domains.append(domain)
            except:
                continue

        return pd.Series(domains).value_counts().head(10).to_dict()

    def analyze_network(self):
        """Analyze network connections"""
        network_data = {}

        if self.followers_df is not None:
            network_data['followers_count'] = len(self.followers_df)

        if self.following_df is not None:
            network_data['following_count'] = len(self.following_df)

        # Analyze interaction patterns
        if self.tweets_df is not None:
            reply_users = []
            for mentions in self.tweets_df['mentions'].dropna():
                if mentions:
                    reply_users.extend(mentions)

            network_data['frequent_interactions'] = pd.Series(reply_users).value_counts().head(10).to_dict()

        return network_data

    def analyze_behavior(self):
        """Analyze behavioral patterns"""
        if self.tweets_df is None:
            return {}

        # Retweet vs original content ratio
        retweet_count = self.tweets_df['tweet'].str.startswith('RT @').sum()
        original_count = len(self.tweets_df) - retweet_count

        # Reply patterns
        reply_count = self.tweets_df['tweet'].str.startswith('@').sum()

        return {
            'content_type_distribution': {
                'original_tweets': original_count,
                'retweets': retweet_count,
                'replies': reply_count
            },
            'retweet_ratio': retweet_count / len(self.tweets_df),
            'engagement_patterns': {
                'high_engagement_threshold': self.tweets_df['likes_count'].quantile(0.9),
                'viral_tweets': len(self.tweets_df[self.tweets_df['likes_count'] > self.tweets_df['likes_count'].quantile(0.95)])
            }
        }

    def generate_report(self):
        """Generate investigation report"""
        report = {
            'investigation_target': self.username,
            'investigation_date': datetime.now().isoformat(),
            'data_summary': {
                'tweets_collected': len(self.tweets_df) if self.tweets_df is not None else 0,
                'followers_collected': len(self.followers_df) if self.followers_df is not None else 0,
                'following_collected': len(self.following_df) if self.following_df is not None else 0
            },
            'analysis_results': self.results
        }

        # Save to JSON
        with open(f'twitter_investigation_{self.username}_{datetime.now().strftime("%Y%m%d")}.json', 'w') as f:
            json.dump(report, f, indent=2, default=str)

        # Generate HTML report
        self.generate_html_report(report)

        return report

    def generate_html_report(self, report):
        """Generate HTML investigation report"""
        html_content = f"""
        <html>
        <head><title>Twitter Investigation Report - {self.username}</title></head>
        <body>
        <h1>Twitter OSINT Investigation Report</h1>
        <h2>Investigation Summary</h2>
        <p>Target: @{self.username}</p>
        <p>Date: {report['investigation_date']}</p>
        <p>Tweets Analyzed: {report['data_summary']['tweets_collected']}</p>
        """

        if 'basic_stats' in self.results:
            stats = self.results['basic_stats']
            html_content += f"""
            <h2>Basic Statistics</h2>
            <p>Total Tweets: {stats['total_tweets']}</p>
            <p>Total Likes: {stats['engagement']['total_likes']}</p>
            <p>Total Retweets: {stats['engagement']['total_retweets']}</p>
            <p>Average Likes: {stats['engagement']['avg_likes']:.2f}</p>
            """

        if 'content_analysis' in self.results:
            content = self.results['content_analysis']
            html_content += """
            <h2>Content Analysis</h2>
            <h3>Top Hashtags</h3>
            <table>
            <tr><th>Hashtag</th><th>Count</th></tr>
            """
            for hashtag, count in list(content['top_hashtags'].items())[:10]:
                html_content += f"<tr><td>#{hashtag}</td><td>{count}</td></tr>"
            html_content += """
            </table>
            <h3>Top Mentions</h3>
            <table>
            <tr><th>User</th><th>Count</th></tr>
            """
            for user, count in list(content['top_mentions'].items())[:10]:
                html_content += f"<tr><td>@{user}</td><td>{count}</td></tr>"
            html_content += "</table>"

        html_content += """
        </body>
        </html>
        """

        with open(f'twitter_investigation_{self.username}_{datetime.now().strftime("%Y%m%d")}.html', 'w') as f:
            f.write(html_content)

def main():
    import sys

    if len(sys.argv) != 2:
        print("Usage: python3 twitter-user-investigation.py <username>")
        sys.exit(1)

    username = sys.argv[1].replace('@', '')  # Remove @ if present

    investigation = TwitterUserInvestigation(username)
    investigation.collect_user_data()
    report = investigation.generate_report()

    print(f"\nInvestigation completed for @{username}")
    print(f"Report saved as: twitter_investigation_{username}_{datetime.now().strftime('%Y%m%d')}.json")
    print(f"HTML report saved as: twitter_investigation_{username}_{datetime.now().strftime('%Y%m%d')}.html")

if __name__ == "__main__":
    main()
```

Hashtag and Trend Analysis

```python
#!/usr/bin/env python3
# twitter-hashtag-analysis.py

import twint
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
from collections import Counter
import networkx as nx

class HashtagAnalysis:
    def __init__(self):
        self.tweets_df = None
        self.hashtag_network = None

    def analyze_hashtag(self, hashtag, days_back=7, limit=1000):
        """Analyze specific hashtag usage"""
        print(f"Analyzing hashtag: #{hashtag}")

        # Configure search
        c = twint.Config()
        c.Search = f"#{hashtag}"
        c.Limit = limit
        c.Store_pandas = True
        c.Hide_output = True

        # Set date range
        end_date = datetime.now()
        start_date = end_date - timedelta(days=days_back)
        c.Since = start_date.strftime("%Y-%m-%d")
        c.Until = end_date.strftime("%Y-%m-%d")

        # Run search
        twint.run.Search(c)
        self.tweets_df = twint.storage.panda.Tweets_df

        if self.tweets_df is not None and not self.tweets_df.empty:
            analysis = {
                'hashtag': hashtag,
                'total_tweets': len(self.tweets_df),
                'unique_users': self.tweets_df['username'].nunique(),
                'date_range': f"{start_date.strftime('%Y-%m-%d')} to {end_date.strftime('%Y-%m-%d')}",
                'engagement_stats': self.calculate_engagement_stats(),
                'top_users': self.get_top_users(),
                'related_hashtags': self.get_related_hashtags(),
                'temporal_patterns': self.analyze_temporal_patterns(),
                'influence_metrics': self.calculate_influence_metrics()
            }

            return analysis
        else:
            print(f"No tweets found for #{hashtag}")
            return None

    def calculate_engagement_stats(self):
        """Calculate engagement statistics"""
        return {
            'total_likes': self.tweets_df['likes_count'].sum(),
            'total_retweets': self.tweets_df['retweets_count'].sum(),
            'total_replies': self.tweets_df['replies_count'].sum(),
            'avg_likes': self.tweets_df['likes_count'].mean(),
            'avg_retweets': self.tweets_df['retweets_count'].mean(),
            'avg_replies': self.tweets_df['replies_count'].mean(),
            'engagement_rate': (self.tweets_df['likes_count'] + self.tweets_df['retweets_count'] + self.tweets_df['replies_count']).mean()
        }

    def get_top_users(self, top_n=10):
        """Get top users by tweet count and engagement"""
        user_stats = self.tweets_df.groupby('username').agg({
            'tweet': 'count',
            'likes_count': 'sum',
            'retweets_count': 'sum',
            'replies_count': 'sum'
        }).reset_index()

        user_stats['total_engagement'] = user_stats['likes_count'] + user_stats['retweets_count'] + user_stats['replies_count']

        return {
            'by_tweet_count': user_stats.nlargest(top_n, 'tweet')[['username', 'tweet']].to_dict('records'),
            'by_engagement': user_stats.nlargest(top_n, 'total_engagement')[['username', 'total_engagement']].to_dict('records')
        }

    def get_related_hashtags(self, top_n=20):
        """Get hashtags that appear with the target hashtag"""
        all_hashtags = []

        for hashtags in self.tweets_df['hashtags'].dropna():
            if hashtags:
                all_hashtags.extend(hashtags)

        hashtag_counts = Counter(all_hashtags)
        return hashtag_counts.most_common(top_n)

    def analyze_temporal_patterns(self):
        """Analyze temporal posting patterns"""
        self.tweets_df['datetime'] = pd.to_datetime(self.tweets_df['date'] + ' ' + self.tweets_df['time'])
        self.tweets_df['hour'] = self.tweets_df['datetime'].dt.hour
        self.tweets_df['day'] = self.tweets_df['datetime'].dt.date

        return {
            'hourly_distribution': self.tweets_df['hour'].value_counts().sort_index().to_dict(),
            'daily_volume': self.tweets_df['day'].value_counts().sort_index().to_dict(),
            'peak_hour': self.tweets_df['hour'].mode().iloc[0],
            'peak_day': self.tweets_df['day'].value_counts().index[0].strftime('%Y-%m-%d')
        }

    def calculate_influence_metrics(self):
        """Calculate influence and reach metrics"""
        # Identify influential tweets (top 10% by engagement)
        engagement_threshold = self.tweets_df['likes_count'].quantile(0.9)
        influential_tweets = self.tweets_df[self.tweets_df['likes_count'] >= engagement_threshold]

        return {
            'influential_tweets_count': len(influential_tweets),
            'influential_users': influential_tweets['username'].unique().tolist(),
            'viral_threshold': engagement_threshold,
            'reach_estimate': self.tweets_df['retweets_count'].sum() * 100  # Rough estimate
        }

    def create_hashtag_network(self, min_cooccurrence=2):
        """Create network of co-occurring hashtags"""
        hashtag_pairs = []

        for hashtags in self.tweets_df['hashtags'].dropna():
            if hashtags and len(hashtags) > 1:
                # Create pairs of hashtags that appear together
                for i in range(len(hashtags)):
                    for j in range(i + 1, len(hashtags)):
                        pair = tuple(sorted([hashtags[i], hashtags[j]]))
                        hashtag_pairs.append(pair)

        # Count co-occurrences
        pair_counts = Counter(hashtag_pairs)

        # Create network graph
        G = nx.Graph()

        for (hashtag1, hashtag2), count in pair_counts.items():
            if count >= min_cooccurrence:
                G.add_edge(hashtag1, hashtag2, weight=count)

        self.hashtag_network = G
        return G

    def visualize_hashtag_network(self, output_file="hashtag_network.png"):
        """Visualize hashtag co-occurrence network"""
        if self.hashtag_network is None:
            self.create_hashtag_network()

        plt.figure(figsize=(12, 8))

        # Calculate node sizes based on degree
        node_sizes = [self.hashtag_network.degree(node) * 100 for node in self.hashtag_network.nodes()]

        # Draw network
        pos = nx.spring_layout(self.hashtag_network, k=1, iterations=50)
        nx.draw(self.hashtag_network, pos,
                node_size=node_sizes,
                node_color='lightblue',
                font_size=8,
                font_weight='bold',
                with_labels=True,
                edge_color='gray',
                alpha=0.7)

        plt.title("Hashtag Co-occurrence Network")
        plt.axis('off')
        plt.tight_layout()
        plt.savefig(output_file, dpi=300, bbox_inches='tight')
        plt.close()

        print(f"Network visualization saved as: {output_file}")

def main():
    import sys

    if len(sys.argv) < 2:
        print("Usage: python3 twitter-hashtag-analysis.py <hashtag> [days_back] [limit]")
        sys.exit(1)

    hashtag = sys.argv[1].replace('#', '')  # Remove # if present
    days_back = int(sys.argv[2]) if len(sys.argv) > 2 else 7
    limit = int(sys.argv[3]) if len(sys.argv) > 3 else 1000

    analyzer = HashtagAnalysis()
    analysis = analyzer.analyze_hashtag(hashtag, days_back, limit)

    if analysis:
        print(f"\nHashtag Analysis Results for #{hashtag}")
        print("=" * 50)
        print(f"Total tweets: {analysis['total_tweets']}")
        print(f"Unique users: {analysis['unique_users']}")
        print(f"Average engagement: {analysis['engagement_stats']['engagement_rate']:.2f}")
        print(f"Peak hour: {analysis['temporal_patterns']['peak_hour']}:00")

        # Create network visualization
        analyzer.visualize_hashtag_network(f"hashtag_network_{hashtag}.png")

        # Save detailed results
        import json
        with open(f"hashtag_analysis_{hashtag}_{datetime.now().strftime('%Y%m%d')}.json", 'w') as f:
            json.dump(analysis, f, indent=2, default=str)

        print(f"\nDetailed analysis saved as: hashtag_analysis_{hashtag}_{datetime.now().strftime('%Y%m%d')}.json")

if __name__ == "__main__":
    main()
```

Best Practices and OPSEC

Operational Security

```bash
#!/bin/bash
# twint-opsec-setup.sh

echo "Twint OPSEC Configuration"
echo "========================"

# Use VPN or proxy
echo "1. Network Security:"
echo "   □ Configure VPN connection"
echo "   □ Use SOCKS proxy if needed"
echo "   □ Rotate IP addresses periodically"

# Rate limiting
echo -e "\n2. Rate Limiting:"
echo "   □ Add delays between requests"
echo "   □ Limit concurrent searches"
echo "   □ Monitor for rate limiting"

# Data security
echo -e "\n3. Data Security:"
echo "   □ Encrypt stored data"
echo "   □ Use secure file permissions"
echo "   □ Regular data cleanup"

# Legal compliance
echo -e "\n4. Legal Compliance:"
echo "   □ Verify investigation scope"
echo "   □ Document methodology"
echo "   □ Respect privacy laws"
```
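
To route Twint traffic through a SOCKS proxy (for example a local Tor or VPN gateway), the Python API exposes proxy settings on the config object. A minimal sketch; the `Proxy_host`, `Proxy_port`, and `Proxy_type` options and the example address 127.0.0.1:9050 are assumptions to verify against your Twint version and adapt to your own setup:

```python
import twint

c = twint.Config()
c.Search = "cybersecurity"
c.Limit = 50
c.Hide_output = True

# Route requests through a local SOCKS5 proxy (e.g. Tor on 127.0.0.1:9050)
# Proxy_host / Proxy_port / Proxy_type are assumed Twint config options
c.Proxy_host = "127.0.0.1"
c.Proxy_port = 9050
c.Proxy_type = "socks5"

twint.run.Search(c)
```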

Rate Limits and Delays

```python
import twint
import time
import random

def safe_twint_search(config, delay_range=(1, 3)):
    """Run Twint search with random delays"""
    try:
        # Add random delay
        delay = random.uniform(delay_range[0], delay_range[1])
        time.sleep(delay)

        # Run search
        twint.run.Search(config)
        return True

    except Exception as e:
        print(f"Search failed: {e}")
        # Longer delay on failure
        time.sleep(random.uniform(5, 10))
        return False

def batch_user_analysis(usernames, delay_range=(2, 5)):
    """Analyze multiple users with delays"""
    results = {}

    for username in usernames:
        print(f"Analyzing @{username}")

        c = twint.Config()
        c.Username = username
        c.Limit = 100
        c.Store_pandas = True
        c.Hide_output = True

        if safe_twint_search(c, delay_range):
            if twint.storage.panda.Tweets_df is not None:
                results[username] = len(twint.storage.panda.Tweets_df)
            else:
                results[username] = 0
        else:
            results[username] = "Failed"

        # Clear storage for next user
        twint.storage.panda.Tweets_df = None

    return results
```

Troubleshooting

Common Issues and Solutions

```bash
# Issue: No tweets returned
# Solution: Check if the user exists and has public tweets
twint -u username --debug

# Issue: Rate limiting
# Solution: Add delays and reduce request frequency
twint -u username --limit 50

# Issue: SSL/TLS errors
# Solution: Update certificates or disable SSL verification
pip install --upgrade certifi

# Issue: Pandas storage not working
# Solution: Clear storage and reinitialize
python3 -c "import twint; twint.storage.panda.Tweets_df = None"
```

Debugging and Logging

```python
import twint
import logging

# Enable debug logging
logging.basicConfig(level=logging.DEBUG)

# Configure with debug mode
c = twint.Config()
c.Username = "username"
c.Debug = True
c.Verbose = True

# Run with error handling
try:
    twint.run.Search(c)
except Exception as e:
    print(f"Error: {e}")
    import traceback
    traceback.print_exc()
```

---

*This cheat sheet provides comprehensive guidance for using Twint in Twitter OSINT investigations. Always ensure proper authorization and legal compliance before conducting social media intelligence gathering.*