Code Scanning with CodeQL

CodeQL is GitHub's powerful static analysis engine that finds security vulnerabilities in your code. It treats code as data, letting you write queries to identify patterns that lead to bugs and security issues—all integrated into your CI/CD pipeline.

What is CodeQL?

CodeQL is GitHub's static analysis engine, and it treats code as data. Unlike tools that rely on textual pattern matching alone, CodeQL models the structure of your code: its syntax, data flow, and control flow. This allows it to find complex vulnerabilities that simple pattern matching would miss.

At its core, CodeQL works by extracting a relational database from your codebase. This database represents the code's syntax trees, data flow, and control flow. You then write queries against this database in QL, a declarative query language with SQL-like syntax. These queries can find anything from simple SQL injection vulnerabilities to complex logic bugs that span multiple functions and files.

GitHub provides hundreds of pre-written security queries covering the most common vulnerability types: SQL injection, cross-site scripting (XSS), path traversal, command injection, insecure deserialization, and many more. These queries are maintained by GitHub's security research team and are constantly updated as new vulnerability patterns emerge.

CodeQL powers GitHub's code scanning feature. It's used by thousands of organizations, including Google, Microsoft, and the Linux kernel project, to find security vulnerabilities before they reach production.
Supported Languages

CodeQL supports a wide range of programming languages, with different levels of maturity:

Comprehensive support (all security queries available): C/C++, C#, Go, Java, JavaScript/TypeScript, Python, Ruby, and Kotlin.

Beta support: Swift and Rust. These languages are actively being developed and have growing query coverage.

For each supported language, CodeQL understands the language's unique semantics, including type systems, package structures, and common frameworks. This enables it to find framework-specific vulnerabilities—like finding XSS in React applications or SQL injection in Django ORM.

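# Language identifiers for the `languages` input of github/codeql-action/init in codeql.yml
# (one of these per analysis job)
languages: javascript # JavaScript/TypeScript (also accepted: javascript-typescript)
languages: python # Python
languages: java # Java/Kotlin (also accepted: java-kotlin)
languages: csharp # C#
languages: cpp # C/C++ (also accepted: c-cpp)
languages: go # Go
languages: ruby # Ruby
languages: swift # Swift (beta)
languages: rust # Rust (beta)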
How CodeQL Works: Code as Data

CodeQL's approach is fundamentally different from traditional static analysis. First, it builds a database of your code through a process called extraction. During extraction, CodeQL parses your source code and builds a relational database that captures every element—functions, classes, variables, expressions, and their relationships.
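
To make the idea of extraction concrete, here is a minimal sketch in Python. It is not how CodeQL's extractors are implemented; it only illustrates the principle, using Python's standard `ast` and `sqlite3` modules. The `calls` table and the sample `SOURCE` snippet are assumptions made up for this illustration.

```python
import ast
import sqlite3

# Hypothetical code under analysis (never executed, only parsed).
SOURCE = '''
def get_user(conn, name):
    query = "SELECT * FROM users WHERE name = '" + name + "'"
    return conn.execute(query)

def main(conn):
    return get_user(conn, input())
'''

def extract_calls(source):
    """Return (enclosing_function, callee, line) rows for every call in source."""
    rows = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, ast.FunctionDef):
            continue
        for node in ast.walk(func):
            if isinstance(node, ast.Call):
                rows.append((func.name, ast.unparse(node.func), node.lineno))
    return rows

def build_database(rows):
    """Load the extracted rows into an in-memory relational table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE calls (caller TEXT, callee TEXT, line INTEGER)")
    db.executemany("INSERT INTO calls VALUES (?, ?, ?)", rows)
    return db

db = build_database(extract_calls(SOURCE))

# "Query the code": which functions call something named execute?
hits = db.execute(
    "SELECT caller, line FROM calls WHERE callee LIKE '%execute' ORDER BY line"
).fetchall()
print(hits)
```

A real CodeQL database stores far more than call sites, but the workflow has the same shape: parse once, store facts as relations, then ask questions with queries.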

Once the database is built, CodeQL runs queries against it. These queries are written in QL, a declarative language that resembles SQL. A query might look for data flows from user input to a dangerous sink (like an SQL query) without passing through a sanitizer. Because CodeQL understands data flow across function boundaries and through objects, it can find vulnerabilities that span hundreds of lines of code.
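
The source-to-sink idea can be sketched as a graph search. The following Python example is a toy model, not CodeQL's algorithm: the `FLOWS` graph, node names, and the `find_unsanitized_paths` helper are all hypothetical, and the search reports only one shortest path per reachable sink.

```python
from collections import deque

# Hypothetical data flow graph: each edge means "value flows from A to B".
FLOWS = {
    "request.args['name']": ["name"],
    "name": ["build_query(name)", "escape(name)"],
    "build_query(name)": ["cursor.execute(query)"],
    "escape(name)": ["render(safe_name)"],
}
SOURCES = {"request.args['name']"}          # where untrusted data enters
SINKS = {"cursor.execute(query)", "render(safe_name)"}  # dangerous uses
SANITIZERS = {"escape(name)"}               # flow through these is considered safe

def find_unsanitized_paths(flows, sources, sinks, sanitizers):
    """Breadth-first search from each source; never step into a sanitizer."""
    paths = []
    for source in sorted(sources):
        queue = deque([[source]])
        seen = {source}
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in sinks:
                paths.append(path)
                continue
            for nxt in flows.get(node, []):
                if nxt in sanitizers or nxt in seen:
                    continue
                seen.add(nxt)
                queue.append(path + [nxt])
    return paths

tainted = find_unsanitized_paths(FLOWS, SOURCES, SINKS, SANITIZERS)
for path in tainted:
    print(" -> ".join(path))
```

Only the path ending in `cursor.execute(query)` is reported; the path through `escape(name)` is cut off at the sanitizer, which mirrors the "without passing through a sanitizer" condition described above.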

Finally, CodeQL produces results with precise locations, showing exactly where the vulnerability originates, where it flows, and where it's used unsafely. These results appear as alerts in GitHub's code scanning interface, directly in the code view, and as pull request checks.

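// Example QL query concept (simplified; not a drop-in query)
// Find SQL injection: data flowing from user input into a SQL query.
// SqlInjectionFlowConfig is a placeholder for a data flow configuration;
// the real class and module names depend on the CodeQL library version.
import javascript
import DataFlow::PathGraph

from SqlInjectionFlowConfig config, DataFlow::PathNode source, DataFlow::PathNode sink
where config.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "This query depends on a user-provided value."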
Setting Up CodeQL Code Scanning

Setting up CodeQL code scanning is straightforward. Go to your repository's Security tab, click "Set up code scanning," and choose CodeQL. GitHub offers two options: default setup, which enables scanning without adding a workflow file, and advanced setup, which creates a workflow file at .github/workflows/codeql.yml that you can edit.

The default configuration works for most projects. It automatically detects your languages, builds your code where needed, and runs the default security query suite. For compiled languages (C/C++, C#, Java, Go), CodeQL needs to observe your build process. The autobuild step works for many common project structures. For complex builds, you can replace it with your own build steps.

You can also configure CodeQL to run only on specific languages, use a custom query suite, or run on a schedule. The workflow runs on every push to the main branch and on every pull request, providing continuous security feedback.

# .github/workflows/codeql.yml
name: "CodeQL"

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]
  schedule:
    - cron: '0 2 * * 0' # Weekly, Sundays at 02:00 UTC

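jobs:
  analyze:
    name: Analyze
    runs-on: ubuntu-latest
    permissions:
      security-events: write # Required to upload results to code scanning
      contents: read # Required by actions/checkout
    strategy:
      fail-fast: false # Let the other languages finish if one fails
      matrix:
        language: [javascript, python, java]
    steps:
      - uses: actions/checkout@v4
      # A JDK is only needed to build the Java code
      - uses: actions/setup-java@v4
        if: matrix.language == 'java'
        with:
          distribution: 'temurin'
          java-version: '17'
      - uses: github/codeql-action/init@v3
        with:
          languages: ${{ matrix.language }}
          queries: security-extended
      - uses: github/codeql-action/autobuild@v3
      - uses: github/codeql-action/analyze@v3
        with:
          category: "/language:${{ matrix.language }}" # Keeps results per language separate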
Query Suites: Default, Extended, and Custom

CodeQL organizes security queries into suites. The default suite runs a balanced set of queries that produce few false positives. The extended suite runs more aggressive queries that may produce more results, including potential false positives, but catches more vulnerabilities. You can also create custom query suites for your specific needs.

You can also add custom queries. If your organization has specific security requirements, you can write QL queries and include them in your workflow. These queries can be checked into your repository, making your custom security rules version-controlled and shared with the team.

Results are categorized by severity: error, warning, and note. Alerts from security queries also carry a security severity level: critical, high, medium, or low. Errors and critical or high security findings should be fixed first. Warnings are potential issues that deserve investigation. Notes are informational: they might highlight code patterns that are unusual but not necessarily problematic.

# Using a custom configuration file
# Note: a `queries` input here would override the queries listed in the config file
- uses: github/codeql-action/init@v3
  with:
    languages: javascript
    config-file: .github/codeql/custom-config.yml

# .github/codeql/custom-config.yml
queries:
  - uses: security-extended
  - uses: ./custom-queries/ # Local custom queries
paths-ignore:
  - '**/test/**'
  - '**/vendor/**'
Interpreting CodeQL Results

When CodeQL finds a vulnerability, it appears in the Security tab and as a check on pull requests. Each alert includes a description of the vulnerability, a severity rating, and exact locations in your code. Clicking on an alert shows the data flow path—how untrusted data enters your application, how it flows through the code, and where it reaches a dangerous sink without sanitization.

Understanding this path is crucial for fixing the vulnerability. For example, an SQL injection alert might show that user input from a query parameter flows through a function call and ends up concatenated into an SQL query. The fix is to use parameterized queries instead of string concatenation.
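
The fix described above can be sketched with Python's built-in `sqlite3` module. The `users` table, the function names, and the payload are assumptions chosen for illustration; the same principle applies to any database driver that supports bound parameters.

```python
import sqlite3

def make_db():
    """Create a small in-memory database to query against."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (name TEXT, role TEXT)")
    db.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "admin"), ("bob", "user")],
    )
    return db

def find_user_unsafe(db, name):
    # Vulnerable: user input is concatenated into the SQL text.
    sql = "SELECT name FROM users WHERE name = '" + name + "'"
    return db.execute(sql).fetchall()

def find_user_safe(db, name):
    # Fixed: the driver binds the value, so it is never parsed as SQL.
    return db.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()

db = make_db()
payload = "x' OR '1'='1"  # classic injection attempt

print(len(find_user_unsafe(db, payload)))  # the injected OR clause matches every row
print(len(find_user_safe(db, payload)))    # the payload is treated as a literal name
```

In the unsafe version the payload changes the meaning of the query. In the safe version it is only ever compared as a string, so no row matches.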

Not every alert requires immediate action. Some may be false positives, where CodeQL couldn't determine that input was safely sanitized. Others may be in test code or legacy code that's being phased out. You can dismiss alerts with a reason (false positive, won't fix, used in tests) and optionally add a comment explaining your decision.
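
Dismissals can also be scripted through GitHub's REST API for code scanning alerts. The sketch below only builds the request and does not send it. The owner, repository, alert number, comment, and token are placeholders, and `build_dismissal_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Dismissal reasons accepted by the code scanning alerts endpoint.
DISMISS_REASONS = {"false positive", "won't fix", "used in tests"}

def build_dismissal_request(owner, repo, alert_number, reason, comment, token):
    """Build (but do not send) a PATCH request that dismisses one alert."""
    if reason not in DISMISS_REASONS:
        raise ValueError(f"unknown dismissal reason: {reason!r}")
    body = {
        "state": "dismissed",
        "dismissed_reason": reason,
        "dismissed_comment": comment,
    }
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/code-scanning/alerts/{alert_number}"
    )
    return urllib.request.Request(
        url=url,
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )

# Placeholder values; nothing is sent over the network here.
req = build_dismissal_request(
    "octo-org", "octo-repo", 42, "used in tests",
    "Fixture intentionally builds an unsafe query.", "YOUR_TOKEN",
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the dismissal, assuming the token has permission to write security events for the repository.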

CodeQL's data flow analysis is what makes it so powerful. It can trace a variable from its origin (like a URL parameter) through ten function calls across three files, showing exactly how an attacker could reach a vulnerable sink.
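
CodeQL reports results in the SARIF format, where a path result records its steps under `codeFlows`. As a rough sketch, the Python below walks one such result and lists each step. The `SARIF` document is a hand-written, heavily trimmed example with made-up file names and line numbers, and `data_flow_path` is a hypothetical helper.

```python
def _loc(uri, line):
    """Build a minimal SARIF location object (helper for the sample data)."""
    return {
        "physicalLocation": {
            "artifactLocation": {"uri": uri},
            "region": {"startLine": line},
        }
    }

# Hypothetical, trimmed SARIF output containing a single path result.
SARIF = {
    "version": "2.1.0",
    "runs": [{
        "results": [{
            "ruleId": "js/sql-injection",
            "level": "error",
            "message": {"text": "This query depends on a user-provided value."},
            "locations": [_loc("src/db/users.js", 30)],
            "codeFlows": [{
                "threadFlows": [{
                    "locations": [
                        {"location": _loc("src/routes/users.js", 12)},
                        {"location": _loc("src/services/lookup.js", 8)},
                        {"location": _loc("src/db/users.js", 30)},
                    ]
                }]
            }],
        }]
    }],
}

def data_flow_path(result):
    """Return the (file, line) steps of every code flow in a SARIF result."""
    steps = []
    for code_flow in result.get("codeFlows", []):
        for thread_flow in code_flow.get("threadFlows", []):
            for step in thread_flow.get("locations", []):
                phys = step["location"]["physicalLocation"]
                steps.append(
                    (phys["artifactLocation"]["uri"], phys["region"]["startLine"])
                )
    return steps

for result in SARIF["runs"][0]["results"]:
    print(result["ruleId"], result["level"])
    for uri, line in data_flow_path(result):
        print(f"  {uri}:{line}")
```

The first step is the source, the last is the sink, and the steps in between are the places to look for a missing sanitizer.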
Integrating CodeQL with Your CI/CD

CodeQL integrates with GitHub's pull request workflow. When you enable code scanning, CodeQL runs on every pull request and reports new alerts as a check. By default, the check fails for new alerts with severity error or security severity critical or high. A failing check blocks merging only when a branch protection rule or ruleset requires it. You can configure which severity levels cause the check to fail in the repository's code scanning settings.
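
The gating logic behind a severity threshold can be pictured with a few lines of Python. This is only a model of the decision, not GitHub's implementation: the alert dictionaries and the `failing_alerts` helper are hypothetical, and the levels used are the security severity levels shown on code scanning alerts.

```python
# Security severity levels, ordered from least to most severe.
SECURITY_LEVELS = ["low", "medium", "high", "critical"]

def failing_alerts(new_alerts, threshold="high"):
    """Return the new alerts whose security severity meets or exceeds threshold."""
    minimum = SECURITY_LEVELS.index(threshold)
    return [
        alert for alert in new_alerts
        if SECURITY_LEVELS.index(alert["security_severity"]) >= minimum
    ]

# Hypothetical alerts introduced by a pull request.
new_alerts = [
    {"rule": "js/sql-injection", "security_severity": "high"},
    {"rule": "js/log-injection", "security_severity": "medium"},
    {"rule": "js/code-injection", "security_severity": "critical"},
]

blocking = failing_alerts(new_alerts, threshold="high")
print("check fails" if blocking else "check passes")
for alert in blocking:
    print(" ", alert["rule"], alert["security_severity"])
```

Raising the threshold to `critical` would let the pull request through with only the code injection alert blocking, while lowering it to `low` would fail the check on any new security alert.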

You can also run CodeQL on a schedule. A weekly scan on the main branch catches vulnerabilities in code that has not changed recently, including issues detected by security queries that were added or improved after the code was merged.

Analysis time varies by language and codebase size, from a few minutes for small projects to around an hour for large monorepos. The matrix strategy allows analyzing multiple languages in parallel, reducing total time.

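# Enforce code scanning on pull requests
# In branch protection rules or rulesets, require the code scanning check (for example, "CodeQL")

# The failure threshold is a repository setting, not an input of the analyze action.
# Set it in the repository's code scanning settings (check runs failure threshold).
- uses: github/codeql-action/analyze@v3
  with:
    category: "/language:javascript" # Labels the results of this analysis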
CodeQL Best Practices

Enable the extended query suite for critical applications. It catches more vulnerabilities, which is especially useful for security-sensitive codebases.

Run on every pull request. Catching vulnerabilities before they reach main is much cheaper than fixing them later.

Review alerts promptly. Don't let alerts accumulate. Set a goal to review all new alerts within 24-48 hours.

Document dismissals. When you dismiss an alert, add a comment explaining why. This helps future reviewers understand the reasoning.

Build your custom queries. If your application has domain-specific security patterns, writing custom QL queries can catch bugs unique to your codebase.

Use path filters. Exclude test files, generated code, and third-party libraries from analysis to reduce noise and speed up analysis.

Combine with other security tools. CodeQL is powerful, but no tool catches everything. Combine it with dependency scanning, secret scanning, and manual reviews.

Organizations that adopt CodeQL report catching vulnerabilities earlier in the development cycle, when they are far cheaper to fix than in production.
Frequently Asked Questions
Is CodeQL free for public repositories?
Yes! CodeQL code scanning is free for all public repositories. For private repositories, it requires a paid license: GitHub Advanced Security, which GitHub now sells as the GitHub Code Security add-on.
How long does a CodeQL analysis take?
Analysis time depends on your codebase size and languages. Small to medium projects typically complete in 5-15 minutes. Large monorepos might take 30-60 minutes. The matrix strategy runs languages in parallel.
What types of vulnerabilities can CodeQL find?
CodeQL finds hundreds of vulnerability types including: SQL injection, cross-site scripting (XSS), path traversal, command injection, insecure deserialization, authentication bypass, and many more. It also finds code quality issues like dead code and performance problems.
Can I write my own CodeQL queries?
Yes! CodeQL includes a query language for writing custom security checks. You can write queries to detect patterns specific to your organization. These queries can be checked into your repository and run alongside the default queries.
What's the difference between default and extended query suites?
Default runs a balanced set of queries with few false positives. Extended includes additional queries that may produce more results, including potential false positives, but catches more vulnerabilities. Extended is recommended for security-critical applications.
How do I handle false positives?
You can dismiss alerts with a reason (false positive, won't fix, used in tests). Adding a comment explaining why helps future reviewers. For recurring false positives, consider adding path filters or excluding specific queries with query filters in the configuration file.
Does CodeQL work with compiled languages?
Yes! For C/C++, C#, Java, and Go, CodeQL needs to understand the build process. It includes autobuild that works for many common build systems (CMake, Maven, Gradle, dotnet build). For complex builds, you can provide custom build commands.
How do I view CodeQL results in pull requests?
CodeQL results appear as a check in the pull request. Click on the check to see detailed results. New alerts are highlighted, and you can see the exact code locations where issues were found.

CodeQL transforms security testing from a periodic audit to a continuous, automated part of your development workflow. Find vulnerabilities before they find you.