Palo Alto Networks Blog, October 2, 2024
Palo Alto Networks Prevents Data Loss at Enterprise Scale with NVIDIA

 

This article examines the data management and security challenges enterprises face amid the rapid growth of GenAI applications, and the role NVIDIA's technology plays in addressing them. Topics include using AI and machine learning to enhance detection capabilities, solving problems in data discovery and classification, and the performance and cost advantages of NVIDIA Triton Inference Server.

🧐 Enterprises adopting GenAI face the challenge of managing and protecting massive volumes of data. While AI boosts workforce productivity, it also plays a critical role in data security, particularly classification; NVIDIA's technology provides the computational power and software deployment tools needed to protect vast amounts of data effectively.

💪 Palo Alto Networks uses NVIDIA's technology to improve the efficiency and accuracy of its DLP machine learning models, addressing data security challenges with advanced AI and machine learning models and a three-phased approach to enhanced detection.

🎯 NVIDIA Triton Inference Server excels at improving performance, boosting throughput, intelligently managing batching and optimizing GPU utilization, while also lowering operational costs and improving cost efficiency.

The rapid adoption of generative AI (GenAI) applications is driving a seismic shift within the SaaS application ecosystem. As enterprises leverage more GenAI, they face the formidable challenge of managing and safeguarding massive volumes of data. Artificial intelligence (AI) not only transforms workforce productivity, but it also plays a critical role in powering data security, particularly with classification. The performance and cost-effectiveness of NVIDIA’s full-stack accelerated computing provide the computational power and AI software deployment tools needed to secure vast amounts of data effectively.

Prasidh Srikanth, Senior Director of Product Management, Data Security, at Palo Alto Networks, said:

Today’s enterprises face increased challenges in securing their data, largely driven by the massive volume and complexity of diverse data formats. That’s why Palo Alto Networks is using NVIDIA’s groundbreaking GPU and Triton technology to improve the efficiency and accuracy of our DLP machine learning models, leading to faster response times and better customer outcomes. Together with NVIDIA, we’re providing enterprises with the best-in-class security and the most advanced AI technology needed to protect their modern workforce.

Chris Arangio, Cybersecurity Developer Relations Lead, at NVIDIA, said:

As enterprises harness the power of generative AI, the need for robust data security scales exponentially. With NVIDIA accelerated computing and AI software, cybersecurity leaders like Palo Alto Networks can safeguard vast amounts of sensitive information with unprecedented speed and accuracy, ushering in a new era of AI-driven data protection.

At Palo Alto Networks, we understand that traditional data loss prevention (DLP) alone is unable to keep up with the demands of today’s modern connected enterprise. That’s why we’re leveraging best-in-class NVIDIA accelerated computing, which includes the NVIDIA AI Enterprise platform for the development and deployment of enterprise co-pilots and other GenAI apps. This powers our AI models for enhanced data discovery and classification for superior customer outcomes. As we like to say, we’re Securing AI by Design.

Challenges in Data Discovery and Classification

Traditional DLP methods are increasingly outpaced by the complexities of today’s digital environment. The modern landscape is marked by an explosion of new data types, formats and sources, alongside an unprecedented volume of sensitive information, all of which pose significant challenges to effective data security. Data sprawl can encompass a vast array of formats, including image files, documents with intricate layouts, proprietary datasets and dynamic web content.

One of the most daunting challenges in modern data protection is detecting sensitive data within complex data types. Traditional detection methods fall short in capturing the subtle and nuanced natural language contexts in which sensitive information resides. Sensitive data embedded within visual formats, such as scanned documents or images, can prove evasive due to the inherent complexities of visual data processing.

False positives further compound these challenges. Traditional techniques frequently misidentify benign data as sensitive, resulting in unnecessary alerts and wasted resources. This issue can lead to alert fatigue, where critical notifications are overlooked and can disrupt business operations by inadvertently blocking legitimate data transfers. Addressing these challenges requires a more sophisticated approach that leverages context-aware detection technologies to enhance accuracy and reduce false positives.

Enhancing Detection Capabilities with Machine Learning

At Palo Alto Networks, we employ advanced AI and machine learning (ML) models to address the aforementioned challenges through a sophisticated, three-phased approach:

    1. Augmenting existing detection capabilities with AI and ML.
    2. Utilizing generative AI for synthetic data creation.
    3. Increasing accuracy with LLM and context-aware detections.

Augmenting Existing Detection Capabilities with AI and ML

Regulated industries, like healthcare, often require the detection of unique documents that contain sensitive patient information. To meet these requirements, we provide trainable classifiers that excel in identifying specific data types. Classifiers can be specifically trained with labels to accurately recognize and categorize patient records, like personally identifiable information (PII), patient profiles, medical diagnosis reports and prescription data.

Palo Alto Networks ML-based detections also include optical character recognition (OCR). OCR technology is pivotal for identifying sensitive information embedded within image files, such as scanned driver's licenses, passports and other forms of PII. Machine learning enhances OCR by training algorithms on extensive image datasets to improve text detection and recognition. This process involves preprocessing images to improve their quality, using deep learning models to accurately identify and interpret characters and words, and applying postprocessing techniques to further refine the output.
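The preprocess-recognize-postprocess flow can be sketched around a pluggable OCR engine. The post does not name a specific engine or the exact preprocessing steps, so the steps below (grayscale, autocontrast, upscaling, digit-noise cleanup) and the dummy engine are assumptions; only Pillow is used for image handling:

```python
# Sketch of an OCR pipeline: preprocess -> recognize -> postprocess.
# The recognizer is a pluggable callable; a real deployment would plug
# in an OCR model here.
import re
from PIL import Image, ImageOps

def preprocess(img: Image.Image) -> Image.Image:
    """Normalize an input scan: grayscale, autocontrast, upscale 2x."""
    g = ImageOps.autocontrast(img.convert("L"))
    return g.resize((g.width * 2, g.height * 2), Image.LANCZOS)

def postprocess(raw_text: str) -> str:
    """Clean common OCR noise: collapse whitespace, fix O/0 in digit runs."""
    text = re.sub(r"\s+", " ", raw_text).strip()
    # Replace a letter O sandwiched between digits with a zero.
    return re.sub(r"(?<=\d)O(?=\d)", "0", text)

def extract_text(img: Image.Image, recognize) -> str:
    """Run the full pipeline around any OCR engine `recognize(img) -> str`."""
    return postprocess(recognize(preprocess(img)))

# Usage with a dummy engine that returns typically noisy OCR output:
fake_engine = lambda img: "DL No:  D12O4   567"
print(extract_text(Image.new("L", (40, 40), 255), fake_engine))
```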

Utilizing Generative AI for Synthetic Data Creation

Large language models (LLMs), such as OpenAI o1 or GPT-4o, have demonstrated remarkable abilities to understand and generate text. With just a few examples, LLMs can create diverse datasets that can be used to help build robust ML models. This is particularly useful for datasets that historically contain very limited samples, such as EU driver's licenses and national IDs. The ability to generate synthetic data has proven to be highly effective in training our AI/ML detection models.
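The few-shot pattern described above can be sketched as a prompt builder. The document type, seed example and any downstream client call are invented for illustration; the post does not disclose the actual prompts or generation pipeline:

```python
# Sketch of few-shot prompting for synthetic training data.
def build_fewshot_prompt(doc_type: str, examples: list[str], n: int) -> str:
    """Build a few-shot prompt asking an LLM for n synthetic records."""
    shots = "\n".join(f"Example {i + 1}: {e}" for i, e in enumerate(examples))
    return (
        f"Generate {n} new synthetic {doc_type} records in the same format "
        f"as the examples below. Use only fictional people and numbers.\n"
        f"{shots}"
    )

# Hypothetical seed example; a real pipeline would pass a handful of
# format-representative samples and send the prompt to an LLM such as GPT-4o.
prompt = build_fewshot_prompt(
    "EU driver's license",
    ["Name: Anna Muster, Licence No: B072RRE2I55, Issued: 2019-05-01"],
    n=5,
)
print(prompt)
```

The generated records can then be labeled automatically (the label is known by construction) and folded into the training set for the detection models.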

Increasing Accuracy with LLM and Context-Aware Detections

Palo Alto Networks Data Security addresses the issue of false positives that have often plagued data security administrators. Regular expression patterns and keyword-based methods are inherently prone to generating false positives, which can overwhelm security teams and dilute the effectiveness of data protection efforts. To mitigate this, we leverage LLM-powered ML models to provide context of the detected sensitive data and establish ground truth.

In the example below, a regular expression might mistakenly flag any 9-digit number as a Social Security number. However, by using context-aware ML detection, we significantly enhance the signal-to-noise ratio. LLM-powered improvements have led to a remarkable reduction in potential false positives by over 90%. This is achieved through advanced pattern recognition that understands the context and semantics of data. With advanced ML, we can accurately differentiate between true positives and benign data, empowering InfoSec teams to respond to real incidents.

Sensitive data detection with and without LLM-powered detections. All data in this example is synthetic.
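The false-positive contrast described above can be sketched with a toy detector. The patterns, keyword window and sample text below are illustrative inventions; the production system uses LLM-powered context understanding rather than a simple keyword window:

```python
# Toy contrast: a naive pattern matcher vs. a context-aware one.
import re

NINE_DIGITS = re.compile(r"\b\d{3}-?\d{2}-?\d{4}\b")
SSN_CONTEXT = re.compile(r"\b(ssn|social security)\b", re.IGNORECASE)

def naive_ssn_hits(text: str) -> list[str]:
    """Flag every 9-digit number, SSN-formatted or not."""
    return NINE_DIGITS.findall(text)

def context_aware_ssn_hits(text: str, window: int = 20) -> list[str]:
    """Flag a 9-digit number only if SSN-related context appears nearby."""
    hits = []
    for m in NINE_DIGITS.finditer(text):
        nearby = text[max(0, m.start() - window): m.end() + window]
        if SSN_CONTEXT.search(nearby):
            hits.append(m.group())
    return hits

doc = "Invoice number 123456789 was paid in full. Employee SSN: 078-05-1120 is on file."
print(naive_ssn_hits(doc))          # ['123456789', '078-05-1120'] - one false positive
print(context_aware_ssn_hits(doc))  # ['078-05-1120'] - only the number near "SSN"
```

The invoice number is a false positive under the naive rule but is correctly ignored once context is taken into account, which is the signal-to-noise improvement the text describes.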

The Role of NVIDIA AI in Machine Learning

Computational power is essential to effectively train and deploy deep learning models. Two critical factors in deciding whether to host on a GPU are cost and response time. We have consistently found that transitioning to GPUs, especially for more computationally intensive models, not only reduces costs but significantly decreases response times compared to hosting the same models on CPUs or alternative inference hardware.

Our GPU deployments are aided by NVIDIA Triton Inference Server, part of NVIDIA AI Enterprise, which provides a flexible serving interface and supports hosting many kinds of models.

Performance Enhancement: Triton boosts throughput by 20% for both CPU and GPU hosting. This is achieved through dynamic batch processing and optimal utilization of GPU cores, leading to faster response times and more efficient use of computational resources.
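Dynamic batching is configured per model in Triton's `config.pbtxt`. The fragment below is an illustrative sketch (the model name, backend, batch sizes, queue delay and instance count are assumptions, not values from the post) showing how individual requests can be grouped into preferred batch sizes after a short queueing delay:

```protobuf
name: "dlp_classifier"            # illustrative model name
platform: "onnxruntime_onnx"      # any Triton-supported backend
max_batch_size: 32
dynamic_batching {
  preferred_batch_size: [ 8, 16, 32 ]
  max_queue_delay_microseconds: 100
}
instance_group [
  { kind: KIND_GPU, count: 2 }    # two model instances per GPU
]
```

The trade-off is between latency and utilization: a longer queue delay lets Triton form fuller batches for higher GPU throughput, at the cost of a small added wait per request.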

Below is a performance table comparing the model inferencing key performance metrics across instances using CPU with Inferentia, NVIDIA GPU and NVIDIA GPU with Triton. The data demonstrates a substantial reduction in response time with GPU and Triton-based instances, showcasing the efficiency gains achieved through advanced GPU hosting and optimization with Triton.

Performance comparison table highlighting CPU, GPU and GPU with Triton metrics.

Cost Efficiency: Triton improves performance and drives down operational costs. By optimizing GPU usage and repurposing excess CPU resources, Triton enables substantial cost savings, making it a highly cost-efficient solution for deploying and managing machine learning models.

The chart below illustrates both the compute cost per hour and the response time (adjusted to deciseconds for clarity). The key insight: GPUs optimized with Triton deliver both lower cost and faster response times.

Conclusion

The integration of NVIDIA Triton Inference Server and GPU technology into Palo Alto Networks Data Security marks a significant advancement in our ability to handle complex data security challenges. By leveraging the exceptional computational power of NVIDIA accelerated computing, we have dramatically improved the efficiency and accuracy of our ML models, leading to faster response times and a substantial reduction in operational costs. Triton Inference Server optimizes GPU utilization and enhances throughput, enabling us to scale our services more effectively while maintaining cost-efficiency. As we continue to expand our capabilities, the use of NVIDIA AI and accelerated computing will play a pivotal role in driving the future of AI-powered data protection, ensuring our solutions remain at the cutting edge of innovation.

Contact your Palo Alto Networks representative to explore how our integrated data security solution can empower your business to thrive in today’s dynamic digital landscape.

