# API Scraper Commands
Complete API Scraper commands and workflows for data collection and web scraping.
## Basic API Requests
|Command|Description|
|---------|-------------|
|`curl "http://api.scraperapi.com?api_key=YOUR_KEY&url=https://example.com"`|Basic scraping request|
|`curl "http://api.scraperapi.com?api_key=YOUR_KEY&url=https://example.com&render=true"`|Render JavaScript|
|`curl "http://api.scraperapi.com?api_key=YOUR_KEY&url=https://example.com&country_code=US"`|Use specific country|
|`curl "http://api.scraperapi.com?api_key=YOUR_KEY&url=https://example.com&premium=true"`|Use premium proxies|
## Python Implementation
|Command|Description|
|---------|-------------|
|`pip install requests`|Install requests library|
|`import requests`|Import requests module|
|`response = requests.get('http://api.scraperapi.com', params={'api_key': 'YOUR_KEY', 'url': 'https://example.com'})`|Basic Python request|
|`response = requests.get('http://api.scraperapi.com', params={'api_key': 'YOUR_KEY', 'url': 'https://example.com', 'render': 'true'})`|Python with JavaScript rendering|
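A minimal end-to-end sketch of the `requests` calls above, assuming the `requests` library is installed and `YOUR_KEY` is replaced with a valid API key:

```python
import requests

API_KEY = "YOUR_KEY"            # replace with your API key
TARGET_URL = "https://example.com"

# Basic request routed through the API
response = requests.get(
    "http://api.scraperapi.com",
    params={"api_key": API_KEY, "url": TARGET_URL},
    timeout=60,
)
print(response.status_code)
print(response.text[:500])      # first 500 characters of the scraped HTML

# Same request with JavaScript rendering enabled
rendered = requests.get(
    "http://api.scraperapi.com",
    params={"api_key": API_KEY, "url": TARGET_URL, "render": "true"},
    timeout=60,
)
print(rendered.status_code)
```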
## Node.js Implementation
|Command|Description|
|---------|-------------|
|`npm install axios`|Install axios library|
|`const axios = require('axios')`|Import axios module|
|`axios.get('http://api.scraperapi.com?api_key=YOUR_KEY&url=https://example.com')`|Basic Node.js request|
|`axios.get('http://api.scraperapi.com', {params: {api_key: 'YOUR_KEY', url: 'https://example.com', render: true}})`|Node.js with parameters|
## PHP Implementation
|Command|Description|
|---------|-------------|
|`$response = file_get_contents('http://api.scraperapi.com?api_key=YOUR_KEY&url=https://example.com')`|Basic PHP request|
|`$context = stream_context_create(['http' => ['timeout' => 60]])`|Set timeout context|
|`$response = file_get_contents('http://api.scraperapi.com?api_key=YOUR_KEY&url=https://example.com', false, $context)`|PHP with timeout|
## Ruby Implementation
|Command|Description|
|---------|-------------|
|`require 'net/http'`|Import HTTP library|
|`require 'uri'`|Import URI library|
|`uri = URI('http://api.scraperapi.com?api_key=YOUR_KEY&url=https://example.com')`|Create URI object|
|`response = Net::HTTP.get_response(uri)`|Make Ruby request|
## Java Implementation
|Command|Description|
|---------|-------------|
|`import java.net.http.HttpClient`|Import HTTP client|
|`import java.net.http.HttpRequest`|Import HTTP request|
|`HttpClient client = HttpClient.newHttpClient()`|Create HTTP client|
|`HttpRequest request = HttpRequest.newBuilder().uri(URI.create("http://api.scraperapi.com?api_key=YOUR_KEY&url=https://example.com")).build()`|Build request|
## Advanced Parameters
|Parameter|Description|
|---------|-------------|
|`render=true`|Enable JavaScript rendering|
|`country_code=US`|Use specific country proxy|
|`premium=true`|Use premium proxy pool|
|`session_number=123`|Use session for sticky IP|
|`keep_headers=true`|Keep original headers|
|`device_type=desktop`|Set device type|
|`autoparse=true`|Enable automatic parsing|
|`format=json`|Return structured JSON|
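A sketch combining several of the parameters above in one Python request; which combinations your plan supports may vary, so treat the exact set as illustrative:

```python
import requests

params = {
    "api_key": "YOUR_KEY",
    "url": "https://example.com/products",
    "render": "true",           # JavaScript rendering
    "country_code": "US",       # route through a US proxy
    "premium": "true",          # premium proxy pool
    "session_number": "123",    # sticky IP across requests
    "device_type": "desktop",   # desktop device profile
    "keep_headers": "true",     # forward the headers sent with this request
}

response = requests.get("http://api.scraperapi.com", params=params, timeout=60)
print(response.status_code)
```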
## Geolocation Options
|Country Code|Description|
|---------|-------------|
|`country_code=US`|United States|
|`country_code=UK`|United Kingdom|
|`country_code=CA`|Canada|
|`country_code=AU`|Australia|
|`country_code=DE`|Germany|
|`country_code=FR`|France|
|`country_code=JP`|Japan|
|`country_code=BR`|Brazil|
## Session Management
|Parameter|Description|
|---------|-------------|
|`session_number=1`|Use session 1|
|`session_number=2`|Use session 2|
|`session_number=123`|Use custom session|
|`session_number=random`|Use random session|
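The sketch below reuses a fixed `session_number` so consecutive requests should leave from the same IP, as described above; the helper name and URLs are illustrative:

```python
import requests

API_KEY = "YOUR_KEY"
SESSION = 123  # any integer; reuse it to keep the same IP

def fetch(url):
    # Requests sharing the same session_number are routed through the same IP
    return requests.get(
        "http://api.scraperapi.com",
        params={"api_key": API_KEY, "url": url, "session_number": SESSION},
        timeout=60,
    )

first = fetch("https://example.com/login")
second = fetch("https://example.com/account")  # same session, so same IP
print(first.status_code, second.status_code)
```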
## Error Handling
|Status Code|Description|
|---------|-------------|
|`200`|Success|
|`400`|Bad Request|
|`401`|Unauthorized (invalid API key)|
|`403`|Forbidden|
|`404`|Not Found|
|`429`|Rate limit exceeded|
|`500`|Internal Server Error|
|`503`|Service Unavailable|
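A hedged retry sketch built around these status codes: `429`, `500`, and `503` are retried with backoff, while client errors such as `401` are raised immediately. The helper name and backoff policy are illustrative, not part of the API:

```python
import time
import requests

RETRYABLE = {429, 500, 503}  # rate limit and transient server errors

def scrape_with_retry(url, api_key="YOUR_KEY", max_attempts=3):
    for attempt in range(1, max_attempts + 1):
        response = requests.get(
            "http://api.scraperapi.com",
            params={"api_key": api_key, "url": url},
            timeout=60,
        )
        if response.status_code == 200:
            return response.text
        if response.status_code in RETRYABLE and attempt < max_attempts:
            time.sleep(2 ** attempt)  # simple exponential backoff
            continue
        # 400/401/403/404, or retries exhausted: surface the error
        response.raise_for_status()

html = scrape_with_retry("https://example.com")
```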
## Response Formats
|Format|Description|
|---------|-------------|
|`format=html`|Return raw HTML (default)|
|`format=json`|Return structured JSON|
|`format=text`|Return plain text|
## Custom Headers
|Parameter|Description|
|---------|-------------|
|`custom_headers={"User-Agent": "Custom Bot"}`|Set custom user agent|
|`custom_headers={"Accept": "application/json"}`|Set accept header|
|`custom_headers={"Referer": "https://google.com"}`|Set referer header|
## JavaScript Rendering
|Parameter|Description|
|---------|-------------|
|`render=true`|Enable JavaScript rendering|
|`wait_for_selector=.content`|Wait for specific element|
|`wait_for=2000`|Wait for milliseconds|
|`screenshot=true`|Take screenshot|
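A sketch combining `render=true` with `wait_for_selector`; rendered requests take noticeably longer, so the client-side timeout is raised here (the value is illustrative):

```python
import requests

response = requests.get(
    "http://api.scraperapi.com",
    params={
        "api_key": "YOUR_KEY",
        "url": "https://example.com/app",
        "render": "true",                 # execute the page's JavaScript
        "wait_for_selector": ".content",  # wait until this element exists
    },
    timeout=90,  # rendered requests need a longer client timeout
)
print(response.status_code)
```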
## Batch Processing
|Command|Description|
|---------|-------------|
|`curl -X POST "http://api.scraperapi.com/batch" -H "Content-Type: application/json" -d '{"api_key": "YOUR_KEY", "urls": ["url1", "url2"]}'`|Batch request|
|`async_batch=true`|Asynchronous batch processing|
|`callback_url=https://yoursite.com/callback`|Set callback URL|
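The batch `curl` command above, rewritten with Python `requests`; the shape of the JSON response depends on the provider, so it is only printed for inspection:

```python
import requests

payload = {
    "api_key": "YOUR_KEY",
    "urls": ["https://example.com/page1", "https://example.com/page2"],
}

response = requests.post(
    "http://api.scraperapi.com/batch",
    json=payload,  # sends the body as JSON with Content-Type: application/json
    timeout=120,
)
print(response.status_code)
print(response.json())  # response shape depends on the provider
```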
## Account Management
|Command|Description|
|---------|-------------|
|`curl "http://api.scraperapi.com/account?api_key=YOUR_KEY"`|Check account status|
|`curl "http://api.scraperapi.com/usage?api_key=YOUR_KEY"`|Check usage statistics|
## Rate Limiting
|Parameter|Description|
|---------|-------------|
|`concurrent_requests=5`|Set concurrent request limit|
|`delay=1000`|Add delay between requests|
|`throttle=true`|Enable automatic throttling|
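Whether `concurrent_requests`, `delay`, and `throttle` are honored server-side depends on the provider; a common complementary approach is to enforce the limits client-side, as in this sketch (the limits shown are illustrative):

```python
import time
import requests
from concurrent.futures import ThreadPoolExecutor

API_KEY = "YOUR_KEY"
MAX_CONCURRENT = 5   # stay under your plan's concurrency limit
DELAY_SECONDS = 1.0  # pause between submissions

def scrape(url):
    response = requests.get(
        "http://api.scraperapi.com",
        params={"api_key": API_KEY, "url": url},
        timeout=60,
    )
    return response.status_code

urls = [f"https://example.com/page/{i}" for i in range(10)]

with ThreadPoolExecutor(max_workers=MAX_CONCURRENT) as pool:
    futures = []
    for url in urls:
        futures.append(pool.submit(scrape, url))
        time.sleep(DELAY_SECONDS)  # client-side delay between submissions
    print([f.result() for f in futures])
```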
## Proxy Configuration
|Parameter|Description|
|---------|-------------|
|`proxy_type=datacenter`|Use datacenter proxies|
|`proxy_type=residential`|Use residential proxies|
|`proxy_type=mobile`|Use mobile proxies|
|`sticky_session=true`|Enable sticky sessions|
## Data Extraction
|Parameter|Description|
|---------|-------------|
|`extract_rules={"title": "h1"}`|Extract title from h1|
|`extract_rules={"links": "a@href"}`|Extract all links|
|`extract_rules={"text": "p"}`|Extract paragraph text|
|`css_selector=.product-price`|Use CSS selector|
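A sketch assuming `extract_rules` accepts a JSON-encoded map of field names to CSS selectors, as the table suggests; the target URL and rule names are illustrative:

```python
import json
import requests

# Field names mapped to CSS selectors, following the table above
extract_rules = {
    "title": "h1",
    "links": "a@href",
    "price": ".product-price",
}

response = requests.get(
    "http://api.scraperapi.com",
    params={
        "api_key": "YOUR_KEY",
        "url": "https://example.com/product/1",
        "extract_rules": json.dumps(extract_rules),
    },
    timeout=60,
)
print(response.json())  # structured fields instead of raw HTML
```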
## Webhook Configuration
|Parameter|Description|
|---------|-------------|
|`webhook_url=https://yoursite.com/webhook`|Set webhook URL|
|`webhook_method=POST`|Set webhook method|
|`webhook_headers={"Authorization": "Bearer token"}`|Set webhook headers|
## Monitoring and Debugging
|Parameter|Description|
|---------|-------------|
|`debug=true`|Enable debug mode|
|`log_level=verbose`|Set verbose logging|
|`trace_id=custom123`|Set custom trace ID|
## Performance Optimization
|Parameter|Description|
|---------|-------------|
|`cache=true`|Enable response caching|
|`cache_ttl=3600`|Set cache TTL in seconds|
|`compression=gzip`|Enable compression|
|`timeout=30`|Set request timeout|
## Security Features
|Parameter|Description|
|---------|-------------|
|`stealth_mode=true`|Enable stealth mode|
|`anti_captcha=true`|Enable CAPTCHA solving|
|`fingerprint_randomization=true`|Randomize browser fingerprint|
## Integration Examples
|Framework|Command|
|---------|-------------|
|Scrapy|`SCRAPEOPS_API_KEY = 'YOUR_KEY'`|
|Selenium|`proxy = "api.scraperapi.com:8001"`|
|Puppeteer|`args: ['--proxy-server=api.scraperapi.com:8001']`|
|BeautifulSoup|`response = requests.get(scraperapi_url)`|
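Expanding the BeautifulSoup row into a runnable sketch (`beautifulsoup4` and `requests` assumed installed):

```python
import requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4

response = requests.get(
    "http://api.scraperapi.com",
    params={"api_key": "YOUR_KEY", "url": "https://example.com"},
    timeout=60,
)

soup = BeautifulSoup(response.text, "html.parser")
print(soup.title.get_text() if soup.title else "no <title> found")
for link in soup.select("a[href]")[:10]:   # first ten links on the page
    print(link["href"])
```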
## Troubleshooting
|Issue|Solution|
|---------|-------------|
|Rate limited|Reduce concurrent requests|
|Blocked IP|Use different country code|
|JavaScript not loading|Enable render=true|
|Timeout errors|Increase timeout value|
|Invalid response|Check URL encoding|