# Define pipeline stages
stages:
  - build
  - test
  - deploy

# Global variables
variables:
  DOCKER_DRIVER: overlay2
  DATABASE_URL: "postgres://localhost/db"

# Global scripts executed before each job
before_script:
  - echo "Pipeline started at $(date)"
  - export PATH=$PATH:/custom/bin

# Global scripts executed after each job
after_script:
  - echo "Cleaning up..."

# Basic job definition
build_app:
  stage: build
  image: node:16-alpine
  script:
    - npm install
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 week
  cache:
    key: ${CI_COMMIT_REF_SLUG}
    paths:
      - node_modules/
  only:
    - main
    - merge_requests
  tags:
    - docker

# Include external configurations
include:
  - project: 'my-group/ci-templates'
    ref: main
    file: '/templates/.gitlab-ci-template.yml'
  - remote: 'https://example.com/ci-template.yml'
  - local: '/templates/security-scan.yml'
  - template: Security/SAST.gitlab-ci.yml

# Workflow rules for pipeline execution
workflow:
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == "main"'
    - if: '$CI_COMMIT_TAG'
    - when: never

# Job with complex rules
deploy_production:
  stage: deploy
  script:
    - ./deploy.sh production
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: manual
    - if: '$CI_COMMIT_TAG =~ /^v[0-9]+\.[0-9]+\.[0-9]+$/'
      when: on_success
  environment:
    name: production
    url: https://prod.example.com
    on_stop: stop_production

stages:
  - build
  - test
  - deploy

variables:
  NODE_ENV: production

build:
  stage: build
  image: node:16-alpine
  script:
    # Point npm's download cache at .npm/ so the cache: entry below has something to store
    - npm ci --cache .npm --prefer-offline
    - npm run build
  artifacts:
    paths:
      - dist/
      - node_modules/
    expire_in: 1 hour
  cache:
    key: ${CI_COMMIT_REF_SLUG}
    paths:
      - .npm/

test:unit:
  stage: test
  image: node:16-alpine
  dependencies:
    - build
  script:
    - npm run test:unit
  coverage: '/Lines\s*:\s*(\d+\.\d+)%/'
  artifacts:
    reports:
      junit: junit.xml
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-coverage.xml

test:integration:
  stage: test
  image: node:16-alpine
  services:
    - postgres:13
  variables:
    POSTGRES_DB: test_db
    POSTGRES_USER: test_user
    POSTGRES_PASSWORD: test_password
  script:
    - npm run test:integration
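Use cache for dependencies and artifacts for build outputs: Cache speeds up subsequent pipeline runs by storing downloaded dependencies such as node_modules/ or .npm/, while artifacts pass build outputs between stages. Do not rely on cache to hand build outputs to later jobs: a cache is best-effort and may be missing, whereas artifacts are stored by GitLab and downloaded by the jobs that need them.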
Implement proper workflow rules to avoid unnecessary pipeline runs: Use workflow:rules to control when pipelines execute, preventing waste on draft MRs or documentation-only changes. This saves runner resources and reduces costs.
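As a rough sketch of the draft-MR case, the rules below skip merge request pipelines while the MR title starts with `Draft:`. It assumes the predefined `CI_MERGE_REQUEST_TITLE` variable is available in your GitLab version and that your team marks drafts with that title prefix; the remaining rules are one possible branching model, not the only one.

```yaml
workflow:
  rules:
    # Skip merge request pipelines while the MR is still marked as a draft
    # (assumes the "Draft:" title prefix convention)
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event" && $CI_MERGE_REQUEST_TITLE =~ /^Draft:/'
      when: never
    # Run for all other merge requests
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    # Run for the default branch
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'
    # Anything else does not start a pipeline
    - when: never
```

Rules are evaluated top to bottom and the first match wins, so the `when: never` rule for drafts has to come before the general merge request rule.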
Tag runners and jobs appropriately: Use specific tags (docker, kubernetes, gpu) to route jobs to appropriate runners. This ensures jobs run on infrastructure with required capabilities and prevents resource contention.
Use the needs keyword for DAG pipelines: Instead of waiting for every job in the previous stage to finish, a job with needs: starts as soon as the jobs it lists complete, forming a directed acyclic graph (DAG). This can noticeably reduce total pipeline time when stages contain independent jobs.
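A minimal sketch of a DAG pipeline follows. The job names and `echo` commands are placeholders invented for illustration; the point is only that each test job lists the single build job it depends on.

```yaml
stages:
  - build
  - test
  - deploy

build_frontend:
  stage: build
  script:
    - echo "build frontend"

build_backend:
  stage: build
  script:
    - echo "build backend"

test_frontend:
  stage: test
  # Starts as soon as build_frontend finishes, even if build_backend is still running
  needs: ["build_frontend"]
  script:
    - echo "test frontend"

test_backend:
  stage: test
  needs: ["build_backend"]
  script:
    - echo "test backend"

deploy_all:
  stage: deploy
  # Waits only for the two test jobs it names
  needs: ["test_frontend", "test_backend"]
  script:
    - echo "deploy"
```

Without `needs:`, both test jobs would wait for the slower of the two builds before starting.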
Store sensitive data in CI/CD variables, never in code: Use protected and masked variables for secrets like API keys, passwords, and tokens. Enable protection to restrict access to protected branches/tags only.
Implement security scanning early in the pipeline: Include SAST, dependency scanning, and container scanning in early stages. Use allow_failure: true initially to avoid blocking development while teams address findings.
Use rules:changes (or the legacy only:changes) for monorepos: Trigger jobs only when relevant files change, preventing unnecessary builds and tests. Prefer rules:, since only/except are no longer actively developed. This is critical for large monorepos with multiple applications.
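For illustration, assume a hypothetical monorepo with `apps/frontend/` and `apps/backend/` directories (these paths and the `make test` command are not from any real project). Each job below runs in merge request pipelines only when files under its own directory change.

```yaml
frontend_tests:
  stage: test
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
      changes:
        # Hypothetical path; matches any file below apps/frontend/
        - apps/frontend/**/*
  script:
    - cd apps/frontend
    - npm ci
    - npm test

backend_tests:
  stage: test
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
      changes:
        # Hypothetical path; matches any file below apps/backend/
        - apps/backend/**/*
  script:
    - cd apps/backend
    - make test
```

Pairing `changes:` with an `if:` on the pipeline source keeps the comparison meaningful, because in merge request pipelines the changed files are determined from the MR diff.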
Set appropriate artifact expiration times: Default artifacts to short expiration (1-7 days) to save storage costs. Use expire_in: never only for release artifacts that need permanent retention.
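A small sketch of the two retention policies side by side. The job names are illustrative, and the three-day value is just one choice within the 1-7 day range suggested above.

```yaml
build:
  stage: build
  script:
    - npm run build
  artifacts:
    paths:
      - dist/
    # Short-lived: only needed by later jobs in the same pipeline
    expire_in: 3 days

release_build:
  stage: deploy
  rules:
    # Only runs for tag pipelines
    - if: '$CI_COMMIT_TAG'
  script:
    - npm run build
  artifacts:
    paths:
      - dist/
    # Release output that should be retained permanently
    expire_in: never
```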
Leverage includes and templates for DRY configuration: Create reusable templates in separate repositories and include them using include:project or include:remote. This ensures consistency across projects.
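One way this might look in a consuming project, reusing the `my-group/ci-templates` path from the earlier include example. The hidden job `.node_build` is hypothetical: it is assumed to be defined in the included template file and is not part of any GitLab-provided template.

```yaml
include:
  - project: 'my-group/ci-templates'
    ref: main
    file: '/templates/.gitlab-ci-template.yml'

build_app:
  # .node_build is a hypothetical hidden job assumed to exist in the included file
  extends: .node_build
  variables:
    # Project-specific override; everything else comes from the template
    NODE_ENV: production
```

Pinning `ref:` to a tag or commit SHA instead of a branch is a common way to keep template changes from reaching every project at once.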
Monitor runner capacity and scale appropriately: Track runner queue times and job wait times. Configure the concurrent setting in config.toml based on available resources, and scale runners horizontally during peak times.
Verify runner is active: gitlab-runner verify. Check runner tags match job tags. Ensure runner isn't paused in GitLab UI. Check concurrent setting in /etc/gitlab-runner/config.toml.
"This job is stuck" error
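No runner available with matching tags. Either change the job's tags to match an available runner, register a runner with those tags, or allow the runner to pick up untagged jobs (the "Run untagged jobs" option in the runner's settings in the GitLab UI, or --run-untagged at registration).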
Docker-in-Docker (DinD) permission errors
Set privileged = true under [runners.docker] in the runner's config.toml, or use Docker socket binding instead: --docker-volumes /var/run/docker.sock:/var/run/docker.sock at registration. Ensure the runner has proper permissions.
Cache not working between jobs
Verify cache key is consistent: use ${CI_COMMIT_REF_SLUG} or file-based keys. Check cache storage configuration (S3, GCS, etc.). Ensure cache:policy is set correctly (pull-push for read/write).
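A sketch of a file-based key with explicit policies, assuming an npm project with a `package-lock.json` at the repository root. The job names are illustrative.

```yaml
install_deps:
  stage: build
  cache:
    key:
      files:
        # The key changes only when the lockfile changes
        - package-lock.json
    paths:
      - .npm/
    # Read the cache at job start and upload it again at job end
    policy: pull-push
  script:
    - npm ci --cache .npm --prefer-offline

unit_tests:
  stage: test
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - .npm/
    # Read-only: this job never uploads the cache
    policy: pull
  script:
    - npm ci --cache .npm --prefer-offline
    - npm test
```

Both jobs must use the same key and paths, otherwise they read and write different caches.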
Pipeline fails with "yaml invalid"
Validate syntax with glab ci lint or GitLab's CI Lint tool (CI/CD > Pipelines > CI Lint). Check indentation (use spaces, not tabs). Verify all required fields are present.
Artifacts not available in downstream jobs
Use dependencies: or needs: to explicitly declare artifact dependencies. Check artifact paths are correct. Verify artifacts haven't expired (expire_in).
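The sketch below shows both keywords. Job names and the `npm run lint` script are assumptions made for the example.

```yaml
build:
  stage: build
  script:
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 day

publish:
  stage: deploy
  needs:
    # Wait for build and download its artifacts
    - job: build
      artifacts: true
  script:
    # dist/ is present because it was downloaded from the build job
    - ls dist/

lint:
  stage: test
  # An empty list means this job downloads no artifacts at all
  dependencies: []
  script:
    - npm run lint
```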
Jobs timing out
Increase timeout in job definition: timeout: 3h. Check for hanging processes. Review runner's concurrent setting if system resources are exhausted.
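A short sketch of a per-job timeout. The job name and `./run-long-tests.sh` script are hypothetical. Note that a job-level timeout cannot exceed the maximum job timeout configured on the runner.

```yaml
long_running_tests:
  stage: test
  # Overrides the project-level timeout for this job only
  timeout: 3h
  script:
    # Hypothetical script name
    - ./run-long-tests.sh
```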
"Cannot connect to Docker daemon" error
Ensure the Docker service is running on the runner host. For the Docker executor with socket binding, mount /var/run/docker.sock:/var/run/docker.sock through the volumes setting in [runners.docker]. For DinD, add the docker:dind service to the job.
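For the DinD option, a job might look roughly like this. It assumes a runner using the Docker executor with `privileged = true` and a `/certs/client` volume configured; the job name, image tag, and `my-image` name are illustrative.

```yaml
build_image:
  stage: build
  image: docker:24.0.5
  services:
    # The dind service provides the Docker daemon the job talks to
    - docker:24.0.5-dind
  variables:
    # Tells the dind service where to generate TLS certificates
    # (assumes the runner shares /certs/client between service and job)
    DOCKER_TLS_CERTDIR: "/certs"
  script:
    # Fails fast with a clear error if the daemon is unreachable
    - docker info
    - docker build -t my-image .
```

Keeping the `docker` image and the `dind` service on the same version avoids client/daemon mismatches.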
Kubernetes runner pods failing to start
Check namespace exists and runner has permissions. Verify resource requests/limits. Review pod logs: kubectl logs -n gitlab-runner <pod-name>. Check image pull secrets for private registries.
Variables not being passed to jobs
Check variable scope (project, group, instance). Ensure variables aren't masked when trying to print them. For protected variables, job must run on protected branch/tag. Use $ prefix: $VARIABLE_NAME.
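A sketch of a job that relies on a protected variable. `DEPLOY_TOKEN` is a hypothetical variable name assumed to be defined in Settings > CI/CD > Variables with the Protected flag enabled; `CI_COMMIT_REF_PROTECTED` is a predefined variable.

```yaml
deploy_with_secret:
  stage: deploy
  rules:
    # Protected variables are only injected on protected branches and tags
    - if: '$CI_COMMIT_REF_PROTECTED == "true"'
  script:
    # Check for presence without printing the value (DEPLOY_TOKEN is hypothetical)
    - 'if [ -z "$DEPLOY_TOKEN" ]; then echo "DEPLOY_TOKEN is not set"; exit 1; fi'
    - ./deploy.sh production
```

Testing for an empty value is a safer debugging step than echoing the variable, since masked values show up as `[MASKED]` in the job log anyway.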
Runner registration token invalid
Token may have expired or been revoked. Get new token from GitLab UI: Settings > CI/CD > Runners. For project runners, use project-specific token. For group/instance runners, use appropriate token.
High runner CPU/memory usage
Reduce concurrent value in config.toml. Implement job resource limits. Use cache to reduce redundant downloads. Consider distributing load across multiple runners.
SSL certificate verification failures
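Install proper CA certificates on the runner host, or point the runner at a CA bundle with tls-ca-file in config.toml. Inside jobs, the CI_SERVER_TLS_CA_FILE variable specifies the CA bundle. Disabling verification is not recommended for production; note that tls_verify in [runners.docker] controls the connection to the Docker daemon, not to the GitLab server.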