The GitHub Blog, December 24, 2024
Announcing CodeQL Community Packs

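The CodeQL Community Packs are a collection of queries and models that enhance code analysis. They include several types of packs, can be used in multiple scenarios such as code scanning, and rely heavily on community involvement.

📦 The CodeQL Community Packs are a collection that enhances code analysis capabilities, comprising several types of packs such as model packs, query packs, and library packs.

💻 The community packs have been used extensively by the GitHub Security Lab, and their audit queries are valuable in manual code reviews.

🌐 The community packs provide additional queries and models for multiple languages, such as Java queries covering CVEs, security, and audit and exploration.

🎁 The community packs can be used in GitHub code scanning workflows and with the CodeQL CLI, and community contributions are encouraged.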

We are excited to introduce the new CodeQL Community Packs, a comprehensive set of queries and models designed to enhance your code analysis capabilities. These packs are tailored to augment the standard set of CodeQL queries, providing additional resources for security researchers and developers alike.

Why?

CodeQL is a semantic code analysis tool that allows developers to query their codebases as databases, enabling the identification of vulnerabilities, bugs, and patterns efficiently.

The standard set of CodeQL queries is focused on accuracy and low false positive rates, which is ideal for integration into CI/CD pipelines where alerts are primarily handled by developers. However, when alerts are handled by security engineers or researchers, the balance between false positives and false negatives can be shifted to prioritize low false negatives, so that no bugs are left behind, at the cost of more triage.

What?

The CodeQL Community Packs are a set of CodeQL packs that augment the standard set of queries. They include three main types of packs: model packs, query packs, and library packs.

How?

The GitHub Security Lab has used these packs extensively for the last few years, and our records show they have been very fruitful.

Beyond the extra queries and models provided by the community packs, we have also relied on the audit queries, which proved invaluable during deep-dive manual code reviews such as the ones we did for Datahub and Home Assistant. Being able to list all the files that introduce untrusted data into the application or that perform security-relevant operations was really helpful when exploring huge, unfamiliar codebases such as Home Assistant's.

What’s in the community packs?

The CodeQL Community Packs offer a variety of additional queries and models for languages such as Java, C#, and Python. These packs are designed to move the signal-to-noise ratio (SNR) closer to the low false negatives end of the spectrum, making them particularly useful for security researchers.

For example, the Java packs include:

Library extension models

Remember Log4Shell? It was relatively easy for a SAST tool to detect, as the JNDI injection sink was well-known and covered by existing CodeQL models at that time. However, CodeQL’s default threat model, like most SAST tools, is based on modeling untrusted data as data that comes from the network. Therefore, CodeQL could have reported Log4Shell if we had analyzed an application that took untrusted data from the network (for example, a web application) and passed this untrusted data to Log4J logger methods.

To enable CodeQL to report such a data flow path, we would have needed to provide CodeQL with the source code of both the web application and Log4J. Could we have reported Log4Shell by analyzing only the Log4J source code? Certainly! But we would have needed a different threat model, one in which the arguments of logger methods such as info or error were considered sources of untrusted data. But how could CodeQL know that these methods could introduce untrusted data in the first place?

To support such a threat model, we developed the library source packs. We analyzed thousands of applications that took untrusted data and passed it to third-party APIs (such as Log4J’s error method). This analysis resulted in a list of third-party library methods used in real applications that are passed untrusted data.

Once we collected this list, which contained API methods such as Log4J’s AbstractLogger.error, we used it to define new sources of untrusted data to be used when scanning library code, such as Log4J code. By doing this with Log4J code, we were able to first identify that logger methods can be called with untrusted data from network requests and second, report a JNDI injection in Log4J code when using the new library source QL packs!
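As a rough illustration of the idea above, a library source can be expressed as a CodeQL data extension that marks a method's own parameter as untrusted. The sketch below is an assumption about what such an entry could look like, not the actual contents of the community packs; the file name and the choice of the error(String) overload are hypothetical.

```yaml
# Hypothetical data extension file, e.g. log4j-library-sources.model.yml.
# It treats the first parameter of AbstractLogger.error(String) as a source of
# remote (untrusted) data when Log4J's own code is being analyzed.
extensions:
  - addsTo:
      pack: codeql/java-all
      extensible: sourceModel
    data:
      # package, type, subtypes, name, signature, ext, output, kind, provenance
      - ["org.apache.logging.log4j.spi", "AbstractLogger", True, "error", "(String)", "", "Parameter[0]", "remote", "manual"]
```

With a model along these lines, a taint-tracking query run against the library alone has a starting point for data flow, which is what lets it reach sinks such as a JNDI lookup.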

Exploration queries

Reviewing a new, unfamiliar codebase is a difficult and lengthy process. Reducing the review surface to the most significant and relevant files is crucial to making this process as efficient as possible.

When faced with similar reviews, the GitHub Security Lab likes to first map out the new codebase. We do this by listing all the entry points where potentially untrusted data enters the application and identifying operations that can be hazardous, such as file reads/writes, deserialization operations, or network requests.

To achieve this, we use the RemoteFlowSources.ql query, which provides a list of all places identified by CodeQL where untrusted data enters the application. We also use the HotSpots query, which returns a list of all hazardous sinks in the application, regardless of evidence of untrusted data flowing into them.

In addition to providing a good initial heat map of the codebase, this approach helps us better understand how well CodeQL covers the used libraries and whether additional modeling is needed.

How to use them?

The community packs are regular CodeQL packs and can be used both as part of GitHub’s code scanning workflows and with the CodeQL CLI.

To use the CodeQL community packs in code scanning, specify a with: packs: entry in the uses: github/codeql-action/init@v3 section of your CodeQL code scanning workflow. See the examples below.

Adding the community packs library extension models to a scan:

- name: Initialize CodeQL
  uses: github/codeql-action/init@v3
  with:
    languages: java
    packs: githubsecuritylab/codeql-java-library-sources,githubsecuritylab/codeql-java-extensions

Running the community packs additional security queries:

- name: Initialize CodeQL
  uses: github/codeql-action/init@v3
  with:
    languages: java
    queries: java
    packs: githubsecuritylab/codeql-java-queries

Running the community packs additional security queries with the additional community packs extension models:

- name: Initialize CodeQL
  uses: github/codeql-action/init@v3
  with:
    languages: java
    queries: java
    packs: githubsecuritylab/codeql-java-extensions,githubsecuritylab/codeql-java-queries
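For context, the Initialize CodeQL step sits inside a larger workflow file. The following is a minimal sketch of such a workflow, assuming a Java project that the autobuild step can compile; the file name, workflow name, triggers, and branch name are illustrative assumptions, and only the packs entry comes from the steps above.

```yaml
# Hypothetical workflow file, e.g. .github/workflows/codeql.yml
name: CodeQL with community packs

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      security-events: write   # required to upload results to code scanning
      contents: read
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: java
          # Community extension models plus the community query pack
          packs: githubsecuritylab/codeql-java-extensions,githubsecuritylab/codeql-java-queries

      - name: Autobuild
        uses: github/codeql-action/autobuild@v3

      - name: Perform CodeQL analysis
        uses: github/codeql-action/analyze@v3
```

Projects with a custom build would likely replace the Autobuild step with their own build commands.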

Similarly, you can use the community packs from the CLI.

Adding the community packs library extension models to a scan:

codeql database analyze --download <CodeQL DB> --model-packs githubsecuritylab/codeql-java-extensions --model-packs githubsecuritylab/codeql-java-library-sources codeql/java-queries --format=sarif-latest --output=scan.sarif --sarif-add-file-contents

Running the community packs additional security queries:

codeql database analyze --download <CodeQL DB> githubsecuritylab/codeql-java-queries --format=sarif-latest --output=scan.sarif --sarif-add-file-contents

Running the community packs additional security queries with the additional community packs extension models:

codeql database analyze --download <CodeQL DB> --model-packs githubsecuritylab/codeql-java-extensions githubsecuritylab/codeql-java-queries --format=sarif-latest --output=scan.sarif --sarif-add-file-contents

How to contribute?

The most important aspect of the community packs is the community involvement! Sharing your models and queries with the community is the best way to help secure the open source software we all depend on. Contributions can range from simple Model As Data (MaD) lines to existing extension files or even the creation of new queries that model new vulnerability classes. Every contribution is welcome!
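To give a sense of how small a Model As Data contribution can be, here is a sketch of a single sink line in a data extension file. The library, class, and method (com.example.db.QueryRunner.run) are invented for illustration and do not refer to a real model in the community packs.

```yaml
# Hypothetical data extension file, e.g. example-db.model.yml.
# com.example.db.QueryRunner is an invented class used only for illustration.
extensions:
  - addsTo:
      pack: codeql/java-all
      extensible: sinkModel
    data:
      # package, type, subtypes, name, signature, ext, input, kind, provenance
      - ["com.example.db", "QueryRunner", True, "run", "(String)", "", "Argument[0]", "sql-injection", "manual"]
```

One line like this tells CodeQL that the first argument of the method is a SQL injection sink, so existing queries can report flows into it without any new QL code.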

The post Announcing CodeQL Community Packs appeared first on The GitHub Blog.
