ScraperAPI Commands

Complete ScraperAPI commands and workflows for web scraping and data collection.

Basic API Requests

| Command | Description |
| --- | --- |
| `curl "http://api.scraperapi.com?api_key=YOUR_KEY&url=https://example.com"` | Basic scraping request |
| `curl "http://api.scraperapi.com?api_key=YOUR_KEY&url=https://example.com&render=true"` | Render JavaScript |
| `curl "http://api.scraperapi.com?api_key=YOUR_KEY&url=https://example.com&country_code=US"` | Use a specific country |
| `curl "http://api.scraperapi.com?api_key=YOUR_KEY&url=https://example.com&premium=true"` | Use premium proxies |

Python Implementation

| Command | Description |
| --- | --- |
| `pip install requests` | Install the requests library |
| `import requests` | Import the requests module |
| `response = requests.get('http://api.scraperapi.com', params={'api_key': 'YOUR_KEY', 'url': 'https://example.com'})` | Basic Python request |
| `response = requests.get('http://api.scraperapi.com', params={'api_key': 'YOUR_KEY', 'url': 'https://example.com', 'render': 'true'})` | Python request with JavaScript rendering |
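
A minimal end-to-end sketch of the pattern in the table above, with a timeout and a status check added; `YOUR_KEY` and the target URL are placeholders:

```python
import requests

API_KEY = "YOUR_KEY"  # placeholder: your ScraperAPI key
TARGET = "https://example.com"

# Proxied fetches can be slow, so set a generous timeout.
response = requests.get(
    "http://api.scraperapi.com",
    params={"api_key": API_KEY, "url": TARGET, "render": "true"},
    timeout=60,
)
response.raise_for_status()  # raises on 4xx/5xx (see Error Handling below)
print(response.text[:500])   # first 500 characters of the rendered HTML
```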

Node.js Implementation

| Command | Description |
| --- | --- |
| `npm install axios` | Install the axios library |
| `const axios = require('axios')` | Import the axios module |
| `axios.get('http://api.scraperapi.com?api_key=YOUR_KEY&url=https://example.com')` | Basic Node.js request |
| `axios.get('http://api.scraperapi.com', {params: {api_key: 'YOUR_KEY', url: 'https://example.com', render: true}})` | Node.js request with parameters |

PHP Implementation

| Command | Description |
| --- | --- |
| `$response = file_get_contents('http://api.scraperapi.com?api_key=YOUR_KEY&url=https://example.com')` | Basic PHP request |
| `$context = stream_context_create(['http' => ['timeout' => 60]])` | Create a timeout context |
| `$response = file_get_contents('http://api.scraperapi.com?api_key=YOUR_KEY&url=https://example.com', false, $context)` | PHP request with timeout |

Ruby Implementation

| Command | Description |
| --- | --- |
| `require 'net/http'` | Import the HTTP library |
| `require 'uri'` | Import the URI library |
| `uri = URI('http://api.scraperapi.com?api_key=YOUR_KEY&url=https://example.com')` | Create a URI object |
| `response = Net::HTTP.get_response(uri)` | Make the request |

Java Implementation

| Command | Description |
| --- | --- |
| `import java.net.URI` | Import the URI class |
| `import java.net.http.HttpClient` | Import the HTTP client |
| `import java.net.http.HttpRequest` | Import the HTTP request builder |
| `import java.net.http.HttpResponse` | Import the HTTP response types |
| `HttpClient client = HttpClient.newHttpClient()` | Create an HTTP client |
| `HttpRequest request = HttpRequest.newBuilder().uri(URI.create("http://api.scraperapi.com?api_key=YOUR_KEY&url=https://example.com")).build()` | Build the request |
| `HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString())` | Send the request and read the body |

Advanced Parameters

| Parameter | Description |
| --- | --- |
| `render=true` | Enable JavaScript rendering |
| `country_code=US` | Use a proxy in a specific country |
| `premium=true` | Use the premium proxy pool |
| `session_number=123` | Use a session for a sticky IP |
| `keep_headers=true` | Forward your original request headers |
| `device_type=desktop` | Set the device type |
| `autoparse=true` | Enable automatic parsing |
| `format=json` | Return structured JSON |
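
A sketch combining several of the parameters above in a single Python request. All parameter names come from the table; whether a given combination is honored depends on your plan:

```python
import requests

params = {
    "api_key": "YOUR_KEY",
    "url": "https://example.com/products",
    "render": "true",          # JavaScript rendering
    "country_code": "US",      # US proxy pool
    "premium": "true",         # premium proxies
    "device_type": "desktop",  # desktop user agents
}
response = requests.get("http://api.scraperapi.com", params=params, timeout=70)
print(response.status_code)
```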

Geotargeting Options

| Country Code | Country |
| --- | --- |
| `country_code=US` | United States |
| `country_code=UK` | United Kingdom |
| `country_code=CA` | Canada |
| `country_code=AU` | Australia |
| `country_code=DE` | Germany |
| `country_code=FR` | France |
| `country_code=JP` | Japan |
| `country_code=BR` | Brazil |
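
A short sketch that fetches the same page through several of the country codes above, e.g. to compare localized content:

```python
import requests

# Fetch the same URL via proxies in different countries and compare sizes.
for code in ["US", "UK", "DE", "JP"]:
    r = requests.get(
        "http://api.scraperapi.com",
        params={"api_key": "YOUR_KEY", "url": "https://example.com", "country_code": code},
        timeout=60,
    )
    print(code, r.status_code, len(r.text))
```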

Session Management

| Parameter | Description |
| --- | --- |
| `session_number=1` | Use session 1 |
| `session_number=2` | Use session 2 |
| `session_number=123` | Use a custom session number |
| `session_number=random` | Use a random session |
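
A sketch of a sticky-session flow: reusing the same `session_number` keeps consecutive requests on the same IP, which helps multi-step flows such as a login page followed by an account page. The session number 123 is an arbitrary choice:

```python
import requests

def fetch(url):
    # Same session_number on every call -> same proxy IP ("sticky" session).
    return requests.get(
        "http://api.scraperapi.com",
        params={"api_key": "YOUR_KEY", "url": url, "session_number": 123},
        timeout=60,
    )

login_page = fetch("https://example.com/login")
data_page = fetch("https://example.com/account")  # arrives from the same IP
print(login_page.status_code, data_page.status_code)
```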

Error Handling

| Status Code | Description |
| --- | --- |
| `200` | Success |
| `400` | Bad request |
| `401` | Unauthorized (invalid API key) |
| `403` | Forbidden |
| `404` | Not found |
| `429` | Rate limit exceeded |
| `500` | Internal server error |
| `503` | Service unavailable |
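
A sketch of a retry loop keyed to the status codes above: transient codes (429, 500, 503) are retried with exponential backoff, while permanent ones such as 401 fail immediately:

```python
import time
import requests

def scrape_with_retry(url, retries=3):
    for attempt in range(retries):
        r = requests.get(
            "http://api.scraperapi.com",
            params={"api_key": "YOUR_KEY", "url": url},
            timeout=60,
        )
        if r.status_code == 200:
            return r.text
        if r.status_code in (429, 500, 503):
            time.sleep(2 ** attempt)  # back off: 1s, 2s, 4s
            continue
        r.raise_for_status()  # 400/401/403/404: fail immediately
    raise RuntimeError(f"giving up on {url} after {retries} attempts")
```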

Response Formats

| Parameter | Description |
| --- | --- |
| `format=html` | Return raw HTML (default) |
| `format=json` | Return structured JSON |
| `format=text` | Return plain text |
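
A sketch that requests structured output and falls back to raw HTML; automatic parsing only works for sites the service can parse, so the fallback path matters:

```python
import requests

r = requests.get(
    "http://api.scraperapi.com",
    params={"api_key": "YOUR_KEY", "url": "https://example.com", "autoparse": "true"},
    timeout=60,
)
try:
    data = r.json()          # structured JSON when parsing succeeded
except ValueError:
    data = {"html": r.text}  # raw HTML fallback for unsupported sites
print(type(data))
```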

Custom Headers

| Parameter | Description |
| --- | --- |
| `custom_headers={"User-Agent": "Custom Bot"}` | Set a custom user agent |
| `custom_headers={"Accept": "application/json"}` | Set the Accept header |
| `custom_headers={"Referer": "https://google.com"}` | Set the Referer header |

JavaScript Rendering

| Parameter | Description |
| --- | --- |
| `render=true` | Enable JavaScript rendering |
| `wait_for_selector=.content` | Wait for a specific element |
| `wait_for=2000` | Wait a fixed number of milliseconds |
| `screenshot=true` | Take a screenshot |
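
A sketch of a rendered request that waits for a selector before returning; rendered requests are slower, so the client-side timeout is raised accordingly:

```python
import requests

r = requests.get(
    "http://api.scraperapi.com",
    params={
        "api_key": "YOUR_KEY",
        "url": "https://example.com/spa",
        "render": "true",
        "wait_for_selector": ".content",  # return once this element exists
    },
    timeout=90,  # rendering adds several seconds per request
)
print(r.text[:200])
```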

Batch Processing

| Command | Description |
| --- | --- |
| `curl -X POST "http://api.scraperapi.com/batch" -H "Content-Type: application/json" -d '{"api_key": "YOUR_KEY", "urls": ["url1", "url2"]}'` | Batch request |
| `async_batch=true` | Asynchronous batch processing |
| `callback_url=https://yoursite.com/callback` | Set a callback URL |
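
A Python sketch mirroring the batch curl example above; the `/batch` endpoint and the `callback_url` field are taken from this page's tables, and the response shape is not specified here:

```python
import requests

payload = {
    "api_key": "YOUR_KEY",
    "urls": ["https://example.com/a", "https://example.com/b"],
    "callback_url": "https://yoursite.com/callback",  # results delivered here
}
r = requests.post("http://api.scraperapi.com/batch", json=payload, timeout=60)
print(r.status_code, r.text)
```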

Account Management

| Command | Description |
| --- | --- |
| `curl "http://api.scraperapi.com/account?api_key=YOUR_KEY"` | Check account status |
| `curl "http://api.scraperapi.com/usage?api_key=YOUR_KEY"` | Check usage statistics |

Rate Limiting

| Parameter | Description |
| --- | --- |
| `concurrent_requests=5` | Set the concurrent request limit |
| `delay=1000` | Add a delay between requests (ms) |
| `throttle=true` | Enable automatic throttling |
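
A client-side throttling sketch that mirrors `concurrent_requests=5` and `delay=1000`: a thread pool caps concurrency at five workers and each worker sleeps one second between requests:

```python
import time
from concurrent.futures import ThreadPoolExecutor

import requests

URLS = [f"https://example.com/page/{i}" for i in range(20)]

def fetch(url):
    time.sleep(1.0)  # 1000 ms spacing between a worker's requests
    r = requests.get(
        "http://api.scraperapi.com",
        params={"api_key": "YOUR_KEY", "url": url},
        timeout=60,
    )
    return r.status_code

# At most 5 requests in flight at once.
with ThreadPoolExecutor(max_workers=5) as pool:
    for status in pool.map(fetch, URLS):
        print(status)
```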

Proxy Configuration

| Parameter | Description |
| --- | --- |
| `proxy_type=datacenter` | Use datacenter proxies |
| `proxy_type=residential` | Use residential proxies |
| `proxy_type=mobile` | Use mobile proxies |
| `sticky_session=true` | Enable sticky sessions |
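
A sketch of proxy mode, routing an ordinary request through the proxy endpoint listed under Integration Examples below (`api.scraperapi.com:8001`). The `scraperapi:YOUR_KEY` credential format is an assumption here; check your dashboard for the exact proxy string:

```python
import requests

# Assumed credential format: username "scraperapi", password = API key.
proxies = {
    "http": "http://scraperapi:YOUR_KEY@api.scraperapi.com:8001",
    "https": "http://scraperapi:YOUR_KEY@api.scraperapi.com:8001",
}
# The proxy re-terminates TLS, so certificate verification is disabled here.
r = requests.get("https://example.com", proxies=proxies, verify=False, timeout=60)
print(r.status_code)
```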

Data Extraction

| Parameter | Description |
| --- | --- |
| `extract_rules={"title": "h1"}` | Extract the title from `h1` |
| `extract_rules={"links": "a@href"}` | Extract all link URLs |
| `extract_rules={"text": "p"}` | Extract paragraph text |
| `css_selector=.product-price` | Use a CSS selector |
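
If server-side extraction rules are not available on your plan, the same selectors can be applied locally: fetch the raw HTML through the API and parse it with BeautifulSoup (also listed under Integration Examples below):

```python
import requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4

r = requests.get(
    "http://api.scraperapi.com",
    params={"api_key": "YOUR_KEY", "url": "https://example.com"},
    timeout=60,
)
soup = BeautifulSoup(r.text, "html.parser")

title = soup.select_one("h1")                           # {"title": "h1"}
links = [a.get("href") for a in soup.select("a[href]")] # {"links": "a@href"}
price = soup.select_one(".product-price")               # css_selector example

print(title.get_text(strip=True) if title else None)
print(price.get_text(strip=True) if price else None)
print(len(links), "links")
```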

Webhook Configuration

| Parameter | Description |
| --- | --- |
| `webhook_url=https://yoursite.com/webhook` | Set the webhook URL |
| `webhook_method=POST` | Set the webhook HTTP method |
| `webhook_headers={"Authorization": "Bearer token"}` | Set webhook headers |
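
A minimal sketch of a receiver for the webhook URL configured above, using Flask. The payload shape is an assumption; inspect a real delivery and adjust. The Authorization check matches the `webhook_headers` example:

```python
from flask import Flask, request  # pip install flask

app = Flask(__name__)

@app.route("/webhook", methods=["POST"])
def webhook():
    # Reject deliveries without the header configured via webhook_headers.
    if request.headers.get("Authorization") != "Bearer token":
        return "forbidden", 403
    job = request.get_json(silent=True) or {}  # assumed JSON payload
    print("received result for:", job.get("url"))
    return "ok", 200

if __name__ == "__main__":
    app.run(port=8000)
```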

Monitoring and Debugging

| Parameter | Description |
| --- | --- |
| `debug=true` | Enable debug mode |
| `log_level=verbose` | Enable verbose logging |
| `trace_id=custom123` | Set a custom trace ID |

Performance Optimization

| Parameter | Description |
| --- | --- |
| `cache=true` | Enable response caching |
| `cache_ttl=3600` | Set the cache TTL in seconds |
| `compression=gzip` | Enable compression |
| `timeout=30` | Set the request timeout (seconds) |
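
A sketch that reuses one connection pool across many calls and bounds each call with a client-side timeout slightly above the API-side `timeout=30` from the table:

```python
import requests

session = requests.Session()  # reuses TCP connections across calls

def fetch(url):
    return session.get(
        "http://api.scraperapi.com",
        params={"api_key": "YOUR_KEY", "url": url, "timeout": "30"},
        timeout=35,  # client-side cap slightly above the API-side one
    )

print(fetch("https://example.com").status_code)
```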

Security Features

| Parameter | Description |
| --- | --- |
| `stealth_mode=true` | Enable stealth mode |
| `anti_captcha=true` | Enable CAPTCHA solving |
| `fingerprint_randomization=true` | Randomize the browser fingerprint |

Integration Examples

| Framework | Command |
| --- | --- |
| Scrapy | `SCRAPEOPS_API_KEY = 'YOUR_KEY'` |
| Selenium | `proxy = "api.scraperapi.com:8001"` |
| Puppeteer | `args: ['--proxy-server=api.scraperapi.com:8001']` |
| BeautifulSoup | `response = requests.get(scraperapi_url)` |
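
A sketch of the Selenium row above, pointing Chrome at the proxy endpoint; proxy authentication is omitted because Chrome's `--proxy-server` flag does not carry credentials, so consult your dashboard for the recommended setup:

```python
from selenium import webdriver  # pip install selenium

options = webdriver.ChromeOptions()
# Same endpoint the Puppeteer row uses; credentials not included here.
options.add_argument("--proxy-server=http://api.scraperapi.com:8001")
options.add_argument("--ignore-certificate-errors")  # proxy re-signs TLS

driver = webdriver.Chrome(options=options)
driver.get("https://example.com")
print(driver.title)
driver.quit()
```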

Troubleshooting

| Issue | Solution |
| --- | --- |
| Rate limited | Reduce concurrent requests |
| Blocked IP | Try a different country code |
| JavaScript not loading | Enable `render=true` |
| Timeout errors | Increase the timeout value |
| Invalid response | Check the URL encoding |