Seldon Core Cheat Sheet

Overview

Seldon Core is an open-source platform for deploying, scaling, and monitoring machine learning models on Kubernetes. It converts ML models into production-ready microservices with REST and gRPC APIs, providing enterprise-grade features like canary deployments, A/B testing, multi-armed bandit routing, inference graphs, model explainability, and outlier/drift detection.

Seldon Core uses a custom Kubernetes resource called SeldonDeployment to declaratively define model serving configurations. It supports models from any framework (Scikit-learn, TensorFlow, PyTorch, XGBoost, custom) through pre-packaged model servers or custom Docker containers. The platform integrates with Istio or Ambassador for traffic management, Prometheus for monitoring, and Grafana for dashboards. Seldon Core V2 introduces a new architecture with an inference scheduler, multi-model serving, and pipeline capabilities.

Installation

Install with Helm

# Add Seldon Helm chart repo
helm repo add seldon https://storage.googleapis.com/seldon-charts
helm repo update

# Install Seldon Core operator
kubectl create namespace seldon-system

helm install seldon-core seldon/seldon-core-operator \
    --namespace seldon-system \
    --set usageMetrics.enabled=true \
    --set istio.enabled=true \
    --set certManager.enabled=false

# Verify installation
kubectl get pods -n seldon-system

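# Alternative: install with Ambassador instead of Istio
# (run this in place of the Istio install above; both use the release name seldon-core)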
helm install seldon-core seldon/seldon-core-operator \
    --namespace seldon-system \
    --set ambassador.enabled=true \
    --set istio.enabled=false

Install Seldon Core V2

# Install V2 with Helm
helm install seldon-core-v2 seldon/seldon-core-v2-setup \
    --namespace seldon-mesh \
    --create-namespace

helm install seldon-v2 seldon/seldon-core-v2-runtime \
    --namespace seldon-mesh

Prerequisites

# Install Istio (if using Istio ingress)
istioctl install --set profile=default
kubectl label namespace default istio-injection=enabled

# Install cert-manager (optional, for TLS)
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.14.0/cert-manager.yaml

Core Deployments

Simple Model Deployment

# sklearn-deployment.yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: iris-model
  namespace: default
spec:
  name: iris
  predictors:
    - name: default
      replicas: 2
      graph:
        name: classifier
        implementation: SKLEARN_SERVER
        modelUri: gs://seldon-models/v1.19.0-dev/sklearn/iris
        envSecretRefName: seldon-init-container-secret
      componentSpecs:
        - spec:
            containers:
              - name: classifier
                resources:
                  requests:
                    cpu: "0.5"
                    memory: "1Gi"
                  limits:
                    cpu: "1"
                    memory: "2Gi"
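
The same manifest can also be generated programmatically, which helps when many deployments differ only in name and model URI. The following is a minimal sketch using only the Python standard library. `build_seldon_deployment` is a hypothetical helper, not part of any Seldon SDK, and it mirrors the fields of the YAML above while leaving out the resource settings.

```python
import json


def build_seldon_deployment(name, model_uri, implementation="SKLEARN_SERVER",
                            replicas=1, namespace="default"):
    """Return a SeldonDeployment manifest as a plain dict (hypothetical helper)."""
    return {
        "apiVersion": "machinelearning.seldon.io/v1",
        "kind": "SeldonDeployment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "predictors": [
                {
                    "name": "default",
                    "replicas": replicas,
                    "graph": {
                        "name": "classifier",
                        "implementation": implementation,
                        "modelUri": model_uri,
                    },
                }
            ]
        },
    }


manifest = build_seldon_deployment(
    "iris-model", "gs://seldon-models/v1.19.0-dev/sklearn/iris", replicas=2
)
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

Because kubectl accepts JSON as well as YAML, the printed output can be piped to `kubectl apply -f -`.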

XGBoost Model

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: xgboost-model
spec:
  predictors:
    - name: default
      graph:
        name: model
        implementation: XGBOOST_SERVER
        modelUri: s3://models/xgboost/model
        parameters:
          - name: method
            type: STRING
            value: predict_proba

Custom Model Server

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: custom-model
spec:
  predictors:
    - name: default
      graph:
        name: model
        type: MODEL
      componentSpecs:
        - spec:
            containers:
              - name: model
                image: my-registry/my-model:v1.0
                ports:
                  - containerPort: 9000
                    name: http
                env:
                  - name: MODEL_PATH
                    value: /models/model.pkl

Pre-Packaged Model Servers

| Server | Implementation | Supported Formats |
|--------|----------------|-------------------|
| SKLearn | SKLEARN_SERVER | Pickle, Joblib |
| XGBoost | XGBOOST_SERVER | BST, JSON, UBJ |
| TensorFlow | TENSORFLOW_SERVER | SavedModel |
| MLflow | MLFLOW_SERVER | MLflow model artifacts |
| Triton | TRITON_SERVER | ONNX, TensorRT, TF, PyTorch |
| Tempo | TEMPO_SERVER | Custom Python |
| Custom | N/A (use Docker image) | Any |

kubectl Commands

| Command | Description |
|---------|-------------|
| kubectl get seldondeployments | List all Seldon deployments |
| kubectl get sdep | Short alias for above |
| kubectl describe sdep <name> | Show deployment details |
| kubectl apply -f deployment.yaml | Create/update deployment |
| kubectl delete sdep <name> | Delete deployment |
| kubectl get pods -l seldon-deployment-id=<name> | List pods for deployment |
| kubectl logs -l seldon-deployment-id=<name> -c <container> | View model logs |
| kubectl get svc -l seldon-deployment-id=<name> | List services |

Making Predictions

# REST prediction (via Istio gateway)
curl -X POST http://<istio-gateway>/seldon/default/iris-model/api/v1.0/predictions \
    -H "Content-Type: application/json" \
    -d '{
        "data": {
            "ndarray": [[5.1, 3.5, 1.4, 0.2]]
        }
    }'

# REST prediction (direct service)
curl -X POST http://iris-model-default.default.svc.cluster.local:8000/api/v1.0/predictions \
    -H "Content-Type: application/json" \
    -d '{"data": {"ndarray": [[5.1, 3.5, 1.4, 0.2]]}}'

# Feedback endpoint
curl -X POST http://<gateway>/seldon/default/iris-model/api/v1.0/feedback \
    -H "Content-Type: application/json" \
    -d '{
        "request": {"data": {"ndarray": [[5.1, 3.5, 1.4, 0.2]]}},
        "reward": 1.0,
        "truth": {"data": {"ndarray": [0]}}
    }'

# gRPC prediction (the deployment service exposes gRPC on port 5001; forward it to localhost first)
grpcurl -d '{"data": {"ndarray": [[5.1, 3.5, 1.4, 0.2]]}}' \
    -plaintext localhost:5001 seldon.protos.Seldon/Predict
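
The REST calls above translate directly to Python. This is a minimal sketch using only the standard library. The helper names and the `localhost:8080` gateway address are assumptions for illustration, while the URL path and payload shape are copied from the curl examples.

```python
import json
import urllib.request


def build_predict_request(host, namespace, deployment, rows):
    """Build (but do not send) a REST prediction request for a SeldonDeployment."""
    url = f"http://{host}/seldon/{namespace}/{deployment}/api/v1.0/predictions"
    body = json.dumps({"data": {"ndarray": rows}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(host, namespace, deployment, rows, timeout=10.0):
    """Send the request and return the decoded JSON response."""
    req = build_predict_request(host, namespace, deployment, rows)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


# Hypothetical gateway address; replace with the real ingress host.
req = build_predict_request(
    "localhost:8080", "default", "iris-model", [[5.1, 3.5, 1.4, 0.2]]
)
print(req.full_url)
```

Building the request separately from sending it makes the URL and payload easy to inspect or unit test without a running cluster. Calling `predict(...)` with the same arguments performs the actual HTTP call.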

Advanced Deployments

Canary Deployment

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: canary-example
spec:
  predictors:
    - name: stable
      traffic: 80
      graph:
        name: model
        implementation: SKLEARN_SERVER
        modelUri: gs://models/v1
    - name: canary
      traffic: 20
      graph:
        name: model
        implementation: SKLEARN_SERVER
        modelUri: gs://models/v2
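
Traffic weights across predictors are expected to sum to 100, so a quick check before applying a manifest can catch a misconfigured split. A small sketch, where `check_traffic` is a hypothetical helper and the predictor dicts carry only the fields needed for the check:

```python
def check_traffic(predictors):
    """Return the summed traffic weights, raising if they do not total 100."""
    total = sum(p.get("traffic", 0) for p in predictors)
    if total != 100:
        raise ValueError(f"traffic weights sum to {total}, expected 100")
    return total


predictors = [
    {"name": "stable", "traffic": 80},
    {"name": "canary", "traffic": 20},
]
print(check_traffic(predictors))  # prints 100
```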

A/B Test (Static 50/50 Traffic Split)

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: ab-test
spec:
  predictors:
    - name: model-a
      traffic: 50
      graph:
        name: model
        implementation: SKLEARN_SERVER
        modelUri: gs://models/model-a
    - name: model-b
      traffic: 50
      graph:
        name: model
        implementation: SKLEARN_SERVER
        modelUri: gs://models/model-b
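
The manifest above splits traffic statically. A multi-armed bandit instead shifts traffic toward the better-performing model based on rewards, such as those sent to the feedback endpoint. In Seldon Core this kind of logic would typically live in a router component of the inference graph. The sketch below only illustrates the epsilon-greedy idea in plain Python: `EpsilonGreedyRouter` and its method names are hypothetical, and the rewards are simulated.

```python
import random


class EpsilonGreedyRouter:
    """Toy epsilon-greedy bandit: explore with probability epsilon, else exploit."""

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms      # feedback events seen per arm
        self.values = [0.0] * n_arms    # running mean reward per arm
        self.rng = random.Random(seed)

    def route(self):
        """Pick the arm (model index) that should serve the next request."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def send_feedback(self, arm, reward):
        """Update the running mean reward of the chosen arm."""
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


router = EpsilonGreedyRouter(n_arms=2, epsilon=0.1, seed=0)
for _ in range(1000):
    arm = router.route()
    # Simulated outcome: arm 1 is correct more often in this toy setup.
    reward = 1.0 if router.rng.random() < (0.6, 0.8)[arm] else 0.0
    router.send_feedback(arm, reward)
print(router.counts, [round(v, 2) for v in router.values])
```

With a small epsilon, most requests go to the arm with the highest observed mean reward while the other arm still receives occasional traffic, so its estimate keeps updating.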

Inference Graph (Preprocessing + Model)

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: pipeline
spec:
  predictors:
    - name: default
      graph:
        # Requests enter at the root node and flow down to its children,
        # so the preprocessor is the root and the model is its child.
        name: preprocessor
        type: TRANSFORMER
        implementation: SKLEARN_SERVER
        modelUri: gs://models/transformer
        children:
          - name: model
            implementation: SKLEARN_SERVER
            modelUri: gs://models/classifier
      componentSpecs:
        - spec:
            containers:
              - name: model
                resources:
                  requests: {cpu: "1", memory: "2Gi"}
              - name: preprocessor
                resources:
                  requests: {cpu: "0.5", memory: "1Gi"}
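
A Seldon graph is a tree: a request enters at the root node and is passed down to its children. Printing the nodes in that order is a quick way to check that a graph is wired as intended. The sketch below walks a graph given as a plain dict. `walk_graph` and the two-node example graph are hypothetical and are not parsed from the manifest.

```python
def walk_graph(node, depth=0):
    """Yield (depth, name, type) for each node, parents before children."""
    yield depth, node["name"], node.get("type", "UNSPECIFIED")
    for child in node.get("children", []):
        yield from walk_graph(child, depth + 1)


# Hypothetical graph: a transformer at the root feeding a model.
graph = {
    "name": "preprocessor",
    "type": "TRANSFORMER",
    "children": [{"name": "model", "type": "MODEL"}],
}

for depth, name, node_type in walk_graph(graph):
    print("  " * depth + f"{name} ({node_type})")
```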

Explainer (Model Interpretability)

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: model-with-explainer
spec:
  predictors:
    - name: default
      graph:
        name: model
        implementation: SKLEARN_SERVER
        modelUri: gs://models/classifier
      explainer:
        type: AnchorTabular
        modelUri: gs://models/explainer
        containerSpec:
          resources:
            requests: {cpu: "0.5", memory: "1Gi"}

# Request explanation
curl -X POST http://<gateway>/seldon/default/model-with-explainer-explainer/default/api/v1.0/explain \
    -H "Content-Type: application/json" \
    -d '{"data": {"ndarray": [[5.1, 3.5, 1.4, 0.2]]}}'

Configuration

Custom Python Model

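# MyModel.py
import joblib


class MyModel:
    def __init__(self):
        # Defer loading so the constructor stays cheap.
        self.model = None

    def load(self):
        # Load the serialized model from disk.
        self.model = joblib.load("/models/model.pkl")

    def predict(self, X, features_names=None):
        # X holds the request rows; features_names is optional column metadata.
        return self.model.predict(X)

    def predict_proba(self, X, features_names=None):
        return self.model.predict_proba(X)

    def tags(self):
        # Static metadata attached to responses.
        return {"version": "1.0", "framework": "sklearn"}

    def metrics(self):
        # Custom metrics as type/key/value entries.
        return [
            {"type": "COUNTER", "key": "predictions_total", "value": 1},
        ]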
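
A class like this can be smoke-tested locally before it is packaged into an image. The sketch below uses a stand-in class with the same method names, so it runs without joblib or a model file. `ThresholdModel` and its threshold rule are hypothetical and exist only to exercise the interface.

```python
class ThresholdModel:
    """Hypothetical stand-in exposing the same methods as MyModel."""

    def __init__(self, threshold=2.5):
        self.threshold = threshold
        self.ready = False

    def load(self):
        # A real model would read its weights here.
        self.ready = True

    def predict(self, X, features_names=None):
        if not self.ready:
            self.load()
        # Toy rule: class 1 when the third feature exceeds the threshold.
        return [[1 if row[2] > self.threshold else 0] for row in X]

    def tags(self):
        return {"version": "1.0", "framework": "stub"}

    def metrics(self):
        return [{"type": "COUNTER", "key": "predictions_total", "value": 1}]


model = ThresholdModel()
model.load()
print(model.predict([[5.1, 3.5, 1.4, 0.2], [6.3, 3.3, 6.0, 2.5]]))  # prints [[0], [1]]
```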

Monitoring with Prometheus

# ServiceMonitor for Prometheus
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: seldon-metrics
spec:
  selector:
    matchLabels:
      seldon-deployment-id: iris-model
  endpoints:
    - port: http
      path: /prometheus
      interval: 15s
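
To confirm that a metrics endpoint exposes what Prometheus should scrape, its text output can be fetched and inspected. The sketch below parses a simplified form of the Prometheus text exposition format. `parse_metrics` is a hypothetical helper that ignores optional timestamps, and the sample text, including its label, is illustrative rather than real Seldon output.

```python
def parse_metrics(text):
    """Map each 'name{labels}' sample to its float value, skipping comments."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated token (no timestamp assumed).
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Illustrative sample, not captured from a real deployment.
sample = """\
# HELP predictions_total Hypothetical custom counter
# TYPE predictions_total counter
predictions_total{deployment_name="iris-model"} 42
"""

metrics = parse_metrics(sample)
print(metrics)
```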

Troubleshooting

| Issue | Solution |
|-------|----------|
| Pod stuck in Init | Check init container logs for model download errors. Verify modelUri and credentials |
| CrashLoopBackOff | Check container logs. Verify model file compatibility with server version |
| 503 Service Unavailable | Check pod readiness. Verify Istio/Ambassador gateway configuration |
| Model download fails from S3 | Create secret seldon-init-container-secret with AWS credentials |
| Prediction returns 500 | Check model server logs. Verify input format matches model expectations |
| Canary not routing correctly | Verify traffic percentages sum to 100. Check Istio VirtualService |
| Explainer timeout | Increase explainer resources. Reduce explanation sample size |
| Scaling not working | Verify HPA is configured. Check Seldon deployment annotations for autoscaling |
| gRPC connection refused | Ensure gRPC port (5001) is exposed. Check service port mapping |
| Metrics not appearing | Verify Prometheus ServiceMonitor is created. Check /prometheus endpoint |