The GitHub Blog, January 10
How to secure your GitHub Actions workflows with CodeQL

The article describes work to strengthen the security of GitHub Actions in open source projects: numerous vulnerabilities were found and disclosed, and new CodeQL support was added to detect and fix issues in GitHub Actions, covering taint tracking, Bash support, models, and queries.

🎯 Numerous vulnerabilities found and remediated across open source projects

💻 New CodeQL support added for detection and remediation

🔍 Taint tracking identifies complex injection vulnerabilities

📜 New CodeQL packs support Bash and can surface related vulnerabilities

📊 Thousands of third-party actions analyzed to identify models and queries

In the last few months, we secured more than 75 GitHub Actions workflows in open source projects, disclosing more than 90 different vulnerabilities. Out of this research, we produced new support for workflows in CodeQL, empowering you to secure yours.

The situation: growing number of insecure workflows

If you have read our series about keeping your GitHub Actions and workflows secure, you already have a good understanding of common vulnerabilities in GitHub Actions and how to solve them.

Unfortunately, we found that these vulnerabilities are still quite common, mostly because of a lack of awareness of how the moving parts interact with each other and what the impact of these vulnerabilities may be for your organization or repository.

To help prevent the introduction of vulnerabilities, identify them in existing workflows, and even fix them using GitHub Copilot Autofix, CodeQL support has been added for GitHub Actions. The new CodeQL packs can be used by code scanning to scan both existing and new workflows. As code scanning and Copilot Autofix are free for OSS repositories, all public GitHub repositories will have access to these new queries, empowering detection and remediation of these vulnerabilities.

In the rest of this post, we’ll walk through the new CodeQL support for GitHub Actions and the results of running it at scale.

Previous attempt

Previously, there was a single CodeQL query capable of identifying simplistic code injections in GitHub workflows. However, this query had several limitations. First, it was bundled with the JavaScript QL packs, meaning users had to enable JavaScript scanning even if they had no JavaScript code in their repositories, which was confusing and misleading. Additionally, the representation of GitHub workflow syntax and grammar was incomplete, making it difficult to express more complex patterns using the existing Abstract Syntax Tree (AST) of GitHub Actions, which is used by static analysis tools such as CodeQL. Most importantly, the CodeQL support for GitHub workflows did not previously include Taint Tracking support and models for non-straightforward sources of untrusted data or dangerous operations.

Taint Tracking is key!

So what is Taint Tracking and how important is it?

Through the previous query for Code Injection, we were able to identify simplistic vulnerabilities such as those cases where a known user-controlled property gets directly interpolated into a Run script:
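A minimal sketch of that simplistic pattern (hypothetical workflow; github.event.issue.title stands in for any known user-controlled property):

```yaml
on: issues

jobs:
  greet:
    runs-on: ubuntu-latest
    steps:
      # Vulnerable: the attacker-controlled issue title is expanded directly
      # into the shell script before it runs, so a title containing
      # $(...) or backticks becomes arbitrary shell code.
      - run: echo "New issue: ${{ github.event.issue.title }}"
```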

That was a great starting point, but what about cases such as the following?
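Consider a hedged sketch of such a workflow (hypothetical artifact, file, and step names), which the path description below walks through:

```yaml
on:
  workflow_run:
    workflows: ["CI"]
    types: [completed]

jobs:
  process:
    runs-on: ubuntu-latest
    steps:
      # (1) Download an artifact produced by the triggering, attacker-controllable run.
      - uses: actions/download-artifact@v4
        with:
          name: pr-data
          run-id: ${{ github.event.workflow_run.id }}
          github-token: ${{ secrets.GITHUB_TOKEN }}

      # (2)-(3) Read a file from the artifact and expose it as a step output.
      - id: pr
        run: echo "title=$(cat pr-title.txt)" >> "$GITHUB_OUTPUT"

      # (4) The untrusted output is interpolated into a run script: code injection.
      - run: echo "Processing ${{ steps.pr.outputs.title }}"
```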

The first step of the path is the download of an artifact.

The second and third steps set the content of a file from the artifact as the output of a workflow step.

In the last step, the value from the previous step is interpolated in an unsafe manner into a Run script, leading to a potential code injection.

In the case above, the source of untrusted data is not simply a GitHub Event Context access of a known untrusted property (for example, github.event.pull_request.body) but rather the download of an artifact. Should all artifacts be considered untrusted? Certainly not. However, in this instance, where the workflow is triggered by a workflow_run event with no branch filters and where an artifact is downloaded from the triggering workflow (github.event.workflow_run.workflow_id), the artifact should be considered untrusted. When decompressed, it may pollute the Runner’s workspace by writing to files in unexpected locations. Consequently, from that step onward, all files in the workspace should be considered untrusted. This example highlights a non-trivial pattern that we need to express using the new actions AST representation to identify sources of untrusted data.

Identifying sources of untrusted data is only the first step towards uncovering more complex injection vulnerabilities. In the example above, it is crucial to understand Bash scripts to determine if they are reading from untrusted files and inserting data into shell variables. It is essential to comprehend how these variables may flow into the output of a step, and, subsequently, how the output flows through different steps, jobs, composite actions, or reusable workflows until they reach a potentially dangerous sink. This understanding is what Taint Tracking and Control Flow will achieve.
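For instance, a hedged sketch (hypothetical job and step names) of a flow that crosses a job boundary before reaching the sink:

```yaml
on: pull_request_target

jobs:
  gather:
    runs-on: ubuntu-latest
    outputs:
      branch: ${{ steps.meta.outputs.branch }}
    steps:
      # The untrusted branch name enters a Bash variable via the environment...
      - id: meta
        env:
          HEAD_REF: ${{ github.event.pull_request.head.ref }}
        run: echo "branch=$HEAD_REF" >> "$GITHUB_OUTPUT"

  build:
    needs: gather
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # ...and flows through the step output and the job output into a
      # run script in another job, where it is interpolated unsafely.
      - run: git switch ${{ needs.gather.outputs.branch }}
```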

In summary, as illustrated in the CodeQL alert above, we can now identify non-obvious sources of untrusted data (for example, git or gh commands or third-party actions) and, more importantly, track this untrusted data throughout complex workflows. These workflows involve multiple steps, jobs, actions, and even entire workflows, allowing us to better understand where this data is used and to report potential vulnerabilities effectively.

Bash support

GitHub’s workflows can execute various scripts, with Bash scripts being among the most common. The new CodeQL packs for GitHub Actions offer basic support for Bash, helping to identify tainted data originating from Bash scripts. For example, commands such as git diff-tree return a list of changed files, whose names are controlled by whoever authored the pull request. Tainted data can flow through a script when an attacker-controlled file or environment variable is read into a step’s output or another environment variable. A pertinent example of such a vulnerability could be found in a workflow of the Azure CLI repository.
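A simplified, hedged sketch of that kind of pattern (not the actual Azure CLI workflow; step and variable names are illustrative):

```yaml
on: pull_request_target

jobs:
  triage:
    runs-on: ubuntu-latest
    steps:
      - env:
          TITLE: ${{ github.event.pull_request.title }}
        run: |
          # The untrusted title flows through ordinary Bash assignments and pipelines...
          message=$(echo "$TITLE" | head -c 200)
          # ...and is then redirected into GITHUB_ENV. A crafted multiline title
          # lets an attacker define arbitrary environment variables for every
          # subsequent step (environment variable injection).
          echo "message=$message" >> "$GITHUB_ENV"
```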

In the alert above, we can see how untrusted data, such as a pull request’s title, is assigned to the TITLE environment variable. This variable is then read and processed by several commands, resulting in a new message variable that gets redirected to the special file pointed to by the GITHUB_ENV variable. A malicious actor could craft a title that results in a multiline message, allowing them to inject arbitrary environment variables into subsequent steps. This, in turn, would enable the attacker to exfiltrate secrets used in the workflow.

The new CodeQL packs are able to parse Bash scripts. While they don’t yet generate a full AST, they already allow us to understand elements such as assignments, pipelines, and redirections, enabling us to report subtle vulnerabilities like the one mentioned above.

Models

As explained in “Keeping your GitHub Actions and workflows secure Part 2: Untrusted input,” GitHub’s event context is the most common source of untrusted data. Properties such as github.event.issue.title, github.event.pull_request.head.ref, or github.event.comment.body are typical sources of untrusted data. However, any third-party action may introduce untrusted data. For instance, an action that returns a list of filenames changed in a pull request should be considered a source of untrusted data. Similarly, actions that parse an issue body or comment for a command or structured data should also be treated as sources of untrusted data.
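As a hedged illustration, assume a hypothetical third-party action, example/list-changed-files, that exposes the changed file names as an output:

```yaml
on: pull_request_target

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      # The file names in a pull request are chosen by the contributor, so this
      # output is a source of untrusted data even though it never touches
      # github.event directly.
      - id: changed
        uses: example/list-changed-files@v1   # hypothetical action

      # Interpolating the untrusted file list into a script is a code injection sink.
      - run: ls -la ${{ steps.changed.outputs.files }}
```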

The same applies to actions that pass data from one of their inputs to their outputs or into an environment variable, thereby acting as taint steps (summaries). Actions such as actions/github-script, azure/cli, or azure/powershell should be considered sinks for Code Injection, just like a Run step’s script.
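For example, a hedged sketch of why actions/github-script behaves like a Run step for injection purposes: the expression is expanded into the JavaScript source of the script input before it executes.

```yaml
on: issue_comment

jobs:
  respond:
    runs-on: ubuntu-latest
    steps:
      # A comment body containing a backtick and ${...} escapes the template
      # literal and runs as JavaScript with access to the workflow's token.
      - uses: actions/github-script@v7
        with:
          script: |
            const body = `${{ github.event.comment.body }}`
            core.info(body)
```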

We have analyzed thousands of popular third-party actions and identified a number of models that are now incorporated into the analysis.

Queries

The previous support for GitHub Actions contained a single query for code injection, whereas the new CodeQL packs incorporate 18 new queries, including the Code Injection and Environment Variable Injection queries mentioned above.

Results

For the past few months, we have been testing the new queries on thousands of open source projects to validate their accuracy and performance. The results have been very impressive, allowing us to identify and report vulnerabilities in numerous critical organizations and repositories, such as Microsoft, Azure, GitHub, Eclipse, Jupyter, Adobe, AWS, Cloudflare, Discord, Hibernate, HuggingFace, and Apache.

The table below shows all the repositories affected, along with their GitHub stars, to give an idea of the impact that a supply chain attack could have had on these projects:

Repository | Stars
ant-design/ant-design | 92,412
Excalidraw/excalidraw | 84,021
apache/superset | 62,589
withastro/astro | 46,604
Stirling-Tools/Stirling-PDF | 44,988
geekan/MetaGPT | 44,901
Kong/kong | 39,221
LAION-AI/Open-Assistant | 37,045
appsmithorg/appsmith | 34,352
gradio-app/gradio | 33,709
DIYgod/RSSHub | 33,432
calcom/cal.com | 32,282
milvus-io/milvus | 30,299
k3s-io/k3s | 28,010
discordjs/discord.js | 25,390
element-plus/element-plus | 24,488
cilium/cilium | 20,150
monkeytypegame/monkeytype | 15,635
amplication/amplication | 15,196
docker-mailserver/docker-mailserver | 14,643
jupyterlab/jupyterlab | 14,167
openimsdk/open-im-server | 14,041
quarkusio/quarkus | 13,771
espressif/arduino-esp32 | 13,609
sympy/sympy | 12,967
ionic-team/stencil | 12,561
zephyrproject-rtos/zephyr | 10,819
qgis/QGIS | 10,569
trinodb/trino | 10,413
OpenFeign/feign | 9,490
marimo-team/marimo | 7,583
dream-num/univer | 7,021
aws/karpenter-provider-aws | 6,782
hibernate/hibernate-orm | 5,976
ant-design-blazor/ant-design-blazor | 5,809
litestar-org/litestar | 5,511

New vulnerability patterns

Having triaged and reported numerous alerts, we have identified some common patterns that often lead to vulnerabilities in GitHub workflows:

Misuse of pull_request_target trigger

The pull_request_target event trigger, while offering powerful automation capabilities in GitHub Actions, harbors a dark side filled with potential security pitfalls. This event trigger, designed to execute workflows in the context of the pull request’s base branch, presents special characteristics that severely increase the impact of any vulnerability. A workflow activated by pull_request_target and triggered from a fork operates with significant privileges, in contrast to the pull_request event: it can run with a write-scoped repository token and has access to the repository’s secrets, even though the change that triggered it came from a fork.

When working with pull_request_target-triggered workflows, we have to be very careful and pay special attention to any scenario in which the workflow checks out, builds, or runs code from the pull request head, or feeds other attacker-controlled input into a privileged step.

If we really need to use this trigger event, there are a few ways to harden the workflows and prevent abuse, such as limiting the token’s permissions, not checking out or executing code from the pull request head unless it has been reviewed, and gating privileged steps behind explicit maintainer approval.
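For cases where contributor code really must be executed, a minimal sketch of the label-gating approach (hypothetical label and script names):

```yaml
on:
  pull_request_target:
    types: [labeled]

permissions: read-all

jobs:
  test:
    # Run only after a maintainer has reviewed the change and applied the label.
    if: contains(github.event.pull_request.labels.*.name, 'safe-to-test')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # Pin to the exact commit that was reviewed when the label was applied.
          ref: ${{ github.event.pull_request.head.sha }}
      - run: ./ci/run-tests.sh
```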

Security boundaries and workflow_run event

The workflow_run event trigger in GitHub Actions is designed to automate tasks based on the execution or completion of another workflow. It may grant write permissions and access to secrets even if the triggering workflow doesn’t have such privileges. While this is beneficial for tasks like labeling pull requests based on test results, it poses significant security risks if not used carefully.

The workflow_run trigger poses a risk because it can often be initiated by an attacker. Some maintainers were surprised by this, believing that their triggering workflows, which were run on events such as release, were safe. This assumption was based on the idea that since an attacker couldn’t trigger a new release, they shouldn’t be able to initiate the triggering workflow or the subsequent workflow_run workflow.

The reality is that an attacker can submit a pull request that modifies the triggering workflow and even replace the triggering events. Since pull_request workflows run in the context of the pull request’s HEAD branch, the modified workflow will run and, upon completion, will be able to trigger an existing workflow_run workflow. The danger arises from the fact that even if the triggering pull_request workflow is not privileged, the triggered workflow_run workflow will have access to secrets and write-scoped tokens, even if the initial workflow did not have those privileges. This enables privilege escalation attacks, allowing attackers to execute malicious code with elevated permissions within the CI/CD pipeline.
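A hedged sketch of the shape of this escalation, using two hypothetical workflow files: the first can be freely modified from a fork, yet its completion still fires the second, privileged one.

```yaml
# .github/workflows/ci.yml: runs on pull_request in the context of the PR's
# HEAD, so a fork can modify this file, including its trigger events.
name: CI
on: pull_request
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - run: echo "unprivileged CI"
---
# .github/workflows/post-ci.yml: fires when CI completes, with access to
# secrets and a write-scoped token that the triggering workflow never had.
name: Post CI
on:
  workflow_run:
    workflows: ["CI"]
    types: [completed]
jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - env:
          TOKEN: ${{ secrets.DEPLOY_TOKEN }}   # hypothetical secret
        run: ./publish.sh
```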

Another significant pitfall with the workflow_run event trigger is artifact poisoning. Artifacts are files generated during a workflow run that can be shared with other workflows. Attackers can poison these artifacts by uploading malicious content through a pull request. When a workflow_run workflow downloads and uses these poisoned artifacts, it can lead to arbitrary code execution or other malicious activities within the privileged workflow. The issue is that many workflow_run workflows do not verify the contents of downloaded artifacts before using them, making them vulnerable to various attacks.
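A minimal sketch of the kind of check that is often missing, assuming the artifact is expected to contain only a pull request number (hypothetical file and output names):

```yaml
on:
  workflow_run:
    workflows: ["CI"]
    types: [completed]

jobs:
  comment:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: pr-number
          run-id: ${{ github.event.workflow_run.id }}
          github-token: ${{ secrets.GITHUB_TOKEN }}
      - id: pr
        run: |
          pr=$(cat pr-number.txt)
          # Treat artifact contents as untrusted: accept a plain number only,
          # instead of passing the raw file contents to privileged steps.
          if ! [[ "$pr" =~ ^[0-9]+$ ]]; then
            echo "unexpected artifact contents" >&2
            exit 1
          fi
          echo "number=$pr" >> "$GITHUB_OUTPUT"
```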

Securing workflow_run workflows requires a multi-faceted approach. By understanding the inherent risks and implementing the recommended mitigations, developers can leverage the automation benefits of workflow_run while minimizing the potential for security compromises.

Effective mitigations

Non-effective mitigations

IssueOops: Security pitfalls with issue_comment trigger

The issue_comment event trigger in GitHub Actions is a powerful tool for automating workflows based on comments on issues and pull requests. When applied in the context of IssueOps, it can streamline tasks like running commands in response to specific comments. However, this convenience comes with significant security risks that must be carefully considered.
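A hedged sketch of a typical IssueOps shape and where the risk creeps in (hypothetical command and script names):

```yaml
on: issue_comment

jobs:
  deploy:
    # Anyone who can comment can satisfy this check, and issue_comment
    # workflows run on the default branch with access to secrets.
    if: contains(github.event.comment.body, '/deploy')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # On a pull request comment, this checks out the contributor's
          # untrusted code, which the next step then executes with privileges.
          ref: refs/pull/${{ github.event.issue.number }}/head
      - run: ./deploy.sh
```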

Mitigating the risks
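One commonly recommended safeguard, sketched here with hypothetical names, is to verify that the commenter actually has sufficient access to the repository before doing anything privileged:

```yaml
on: issue_comment

jobs:
  deploy:
    if: >
      contains(github.event.comment.body, '/deploy') &&
      contains(fromJSON('["OWNER", "MEMBER"]'),
               github.event.comment.author_association)
    runs-on: ubuntu-latest
    steps:
      # Only reached when the comment author is the repository owner or an
      # organization member, not an arbitrary commenter.
      - run: ./deploy.sh
```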

Ineffective or incomplete mitigations

Wrapping up

The new CodeQL support for GitHub Actions is in public preview. The new QL packs allow you to scan your repository for a variety of vulnerabilities in GitHub Actions, helping prevent supply chain attacks in the OSS software we all depend on! If you want to give them a try, enable CodeQL code scanning for GitHub Actions on your repository, either through code scanning default setup or by adding the actions language to an advanced setup.
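For example, in an advanced setup the new language can be enabled with a standard CodeQL workflow along these lines (a sketch, assuming the usual github/codeql-action setup):

```yaml
name: CodeQL (GitHub Actions)
on:
  push:
    branches: [main]
  pull_request:

jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
        with:
          languages: actions
      - uses: github/codeql-action/analyze@v3
```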

Stay secure!

The post How to secure your GitHub Actions workflows with CodeQL appeared first on The GitHub Blog.
