Turbocharging premium audit capabilities with the power of generative AI: Verisk’s journey toward a sophisticated conversational chat platform to enhance customer support

This post is co-written with Sajin Jacob, Jerry Chen, Siddarth Mohanram, Luis Barbier, Kristen Chenowith, and Michelle Stahl from Verisk.

Verisk (Nasdaq: VRSK) is a leading data analytics and technology partner for the global insurance industry. Through advanced analytics, software, research, and industry expertise across more than 20 countries, Verisk helps build resilience for individuals, communities, and businesses. The company is committed to ethical and responsible AI development with human oversight and transparency. Verisk is using generative AI to enhance operational efficiencies and profitability for insurance clients while adhering to its ethical AI principles.

Verisk’s Premium Audit Advisory Service (PAAS®) is the leading source of technical information and training for premium auditors and underwriters. PAAS helps users classify exposure for commercial casualty insurance, including general liability, commercial auto, and workers’ compensation. PAAS offers a wide range of essential services, including more than 40,000 classification guides and more than 500 bulletins. PAAS now includes PAAS AI, the first commercially available interactive generative-AI chats specifically developed for premium audit, which reduces research time and empower users to make informed decisions by answering questions and quickly retrieving and summarizing multiple PAAS documents like class guides, bulletins, rating cards, etc.

In this post, we describe the development of the customer support process in PAAS, incorporating generative AI, the data, the architecture, and the evaluation of the results. Conversational AI assistants are rapidly transforming customer and employee support. Verisk has embraced this technology and developed its own PAAS AI, which provides an enhanced self-service capability to the PAAS platform.

The opportunity

The Verisk PAAS platform houses a vast array of documents—including class guides, advisory content, and bulletins—that aid Verisk’s customers in determining the appropriate rules and classifications for workers’ compensation, general liability, and commercial auto business. When premium auditors need accurate answers within this extensive document repository, the challenges they face are:

Overwhelming volume

Slow response times

Inconsistent quality of responses

To address this issue, Verisk PAAS AI is designed to alleviate the burden by providing round-the-clock support for business processing and delivering precise and quick responses to customer queries. This technology is deeply integrated into Verisk’s newly reimagined PAAS platform, using all of Verisk’s documentation, training materials, and collective expertise. It employs a retrieval augmented generation (RAG) approach and a combination of AWS services alongside proprietary evaluations to promptly answer most user questions about the capabilities of the Verisk PAAS platform.

When deployed at scale, this PAAS AI will enable Verisk staff to dedicate more time to complex issues, critical projects, and innovation, thereby enhancing the overall customer experience. Throughout the development process, Verisk encountered several considerations, key findings, and decisions that provide valuable insights for any enterprise looking to explore the potential of generative AI.

The approach

When creating an interactive agent using large language models (LLMs), two common approaches are RAG and model fine-tuning. The choice between these methods depends on the specific use case and available data. Verisk PAAS began developing a RAG pipeline for its PAAS AI and has progressively improved this solution. Here are some reasons why continuing with a RAG architecture was beneficial for Verisk:

Dynamic data access

Multiple data sources

Reduced hallucinations

LLM linguistics

Transparency

Data governance

Although both RAG and fine-tuning have their pros and cons, RAG is the best approach for building a PAAS AI on the PAAS platform, given Verisk’s needs for real-time accuracy, explainability, and configurability. The pipeline architecture supports iterative enhancement as the use cases for the Verisk PAAS platform develop.

Solution overview

The following diagram showcases a high-level architectural data flow that highlights various AWS services used in constructing the solution. Verisk’s system demonstrates a complex AI setup, where multiple components interact and frequently call on the LLM to provide user responses. Employing the PAAS platform to manage these varied components was an intuitive decision.

The key components are as follows:

Amazon ElastiCache

Amazon Bedrock

Amazon OpenSearch Service

Amazon ElastiCache

Verisk’s PAAS team determined that ElastiCache is the ideal solution for storing all chat history. This storage approach allows for seamless integration in conversational chats and enables the display of recent conversations on the website, providing an efficient and responsive user experience.

Amazon Bedrock

Anthropic’s Claude, available in Amazon Bedrock, played various roles within Verisk’s solution:

Response generation

Conversation summarization

Keyword extraction

Amazon OpenSearch Service

Primarily used for the storage of text embeddings, OpenSearch facilitates efficient document retrieval by enabling rapid access to indexed data. These embeddings serve as semantic representations of documents, allowing for advanced search capabilities that go beyond simple keyword matching. This semantic search functionality enhances the system’s ability to retrieve relevant documents that are contextually similar to the search queries, thereby improving the overall accuracy and speed of data queries. Additionally, OpenSearch functions as a semantic cache for similarity searches, optimizing performance by reducing the computational load and improving response times during data retrieval operations. This makes it an indispensable tool in the larger PAAS ecosystem, where the need for quick and precise information access is paramount.

Snowflake in Amazon

The integration of Snowflake in the PAAS AI ecosystem helps provide scalable and real-time access to data, allowing Verisk to promptly address customer concerns and improve its services. By using Snowflake’s capabilities, Verisk can perform advanced analytics, including sentiment analysis and predictive modeling, to better understand customer needs and enhance user experiences. This continuous feedback loop is vital for refining the PAAS AI and making sure it remains responsive and relevant to user demands.

Structuring and retrieving the data

An essential element in developing the PAAS AI’s knowledge base was properly structuring and effectively querying the data to deliver accurate answers. Verisk explored various techniques to optimize both the organization of the content and the methods to extract the most relevant information:

Chunking

Hybrid query

Data separation and filters

By thoroughly experimenting and optimizing both the knowledge base powering the PAAS AI and the queries to extract answers from it, Verisk was able to achieve very high answer accuracy during the proof of concept, paving the way for further development. The techniques explored—hybrid querying, HTML section chunking, and index filtering—became core elements of Verisk’s approach for extracting quality contexts.

LLM parameters and models

Experimenting with prompt structure, length, temperature, role-playing, and context was key to improving the quality and accuracy of the PAAS AI’s Claude-powered responses. The prompt design guidelines provided by Anthropic were incredibly helpful.

Verisk crafted prompts that provided Anthropic’s Claude with clear context and set roles for answering user questions. Setting the temperature to 0 helped reduce the randomness and indeterministic nature of LLM-generated responses.

Verisk also experimented with different models to improve the efficiency of the overall solution. For scenarios where latency was more important and less reasoning was required, Anthropic’s Claude Haiku was the perfect solution. For other scenarios such as question answering using provided contexts where it was more important for the LLM to be able to understand every detail given in the prompt, Anthropic’s Claude Sonnet was the better choice to balance latency, performance, and cost.

Guardrails

LLM guardrails were implemented in the PAAS AI project using both the guardrails provided by Amazon Bedrock and specialized sections within the prompt to detect unrelated questions and prompt attack attempts. Amazon Bedrock guardrails can be attached to any Amazon Bedrock model invocation call and automatically detect if the given model input and output are in violation of the language filters that are set (violence, misconduct, sexual, and so on), which helps with screening user inputs. The specialized prompts further improve LLM security by creating a second net that uses the power of the LLMs to catch any inappropriate inputs from the users.

This allows Verisk to be confident that the model will only answer to its intended purpose surrounding premium auditing services and will not be misused by threat actors.

After validating several evaluation tools such as Deepeval, Ragas, Trulens, and so on, the Verisk PAAS team realized that there were certain limitations to using these tools for their specific use case. Consequently, the team decided to develop its own evaluation API, shown in the following figure.

This custom API evaluates the answers based on three major metrics:

Answer relevancy score

Context relevancy score

Faithfulness score

This custom evaluation approach helps make sure that the answers generated are not only relevant and contextually appropriate but also faithful to the established generative AI knowledge base, minimizing the risk of misinformation. By incorporating these metrics, Verisk has enhanced the robustness and reliability of their PAAS AI, providing customers with accurate and trustworthy responses.

The Verisk PAAS team has implemented a comprehensive feedback loop mechanism, shown in the following figure, to support continuous improvement and address any issues that might arise.

This feedback loop is structured around the following key components:

Customer feedback analysis

Issue categorization

QA test case updates

Ground truth agreements

Ongoing evaluations

This robust feedback loop mechanism enables Verisk to continuously fine-tune the PAAS AI, making sure that it delivers precise, relevant, and contextually appropriate answers to customer queries. By integrating customer feedback, categorizing issues efficiently, updating test scenarios, and adhering to stringent evaluation protocols, Verisk maintains a high standard of service and drives continuous improvement in its generative AI capabilities.

Business impact

Verisk initially rolled out the PAAS AI to one beta customer to demonstrate real-world performance and impact. Supporting a customer in this way is a stark contrast to how Verisk has historically engaged with and supported customers in the past, where Verisk would typically have a team allocated to interact with the customer directly. Verisk’s PAAS AI has revolutionized the way subject matter experts (SMEs) work and cost-effectively scales while still providing high-quality assistance. What previously took hours of manual review can now be accomplished in minutes, resulting in an extraordinary 96–98% reduction in processing time per specialist. This dramatic improvement in efficiency not only streamline operations but also allows Verisk’s experts to focus on more strategic initiatives that drive greater value for the organization.

In analyzing this early usage data, Verisk uncovered additional areas where it can drive business value for its customers. As Verisk collects additional information, this data will help uncover what will be needed to improve results and prepare to roll out to a wider customer base of approximately 15,000 users.

Ongoing development will focus on expanding these capabilities, prioritized based on the collected questions. Most exciting, though, are the new possibilities on the horizon with generative AI. Verisk knows this technology is rapidly advancing and is eager to harness innovations to bring even more value to customers. As new models and techniques emerge, Verisk plans to adapt the PAAS AI to take advantage of the latest capabilities. Although the PAAS AI currently focuses on responding to user questions, this is only the starting point. Verisk plans to quickly improve its capabilities to proactively make suggestions and configure functionality directly in the system itself. The Verisk PAAS team is inspired by the challenge of pushing the boundaries of what’s possible with generative AI and is excited to test those boundaries.

Conclusion

Verisk’s development of a PAAS AI for its PAAS platform demonstrates the transformative power of generative AI in customer support and operational efficiency. Through careful data harvesting, structuring, retrieval, and the use of LLMs, semantic search functionalities, and stringent evaluation protocols, Verisk has crafted a robust system that delivers accurate, real-time answers to user questions. By continuing to enhance the PAAS AI’s features while maintaining ethical and responsible AI practices, Verisk is set to provide increased value to its customers, enable staff to concentrate on innovation, and establish new benchmarks for customer service in the insurance sector.

For more information, see the following resources:

Explore generative AI on AWS

unlocking the business value of generative AI

Anthropic’s Claude 3 models on Amazon Bedrock

Learn about Amazon Bedrock

generative AI quickstart proofs of concept

About the Authors

Sajin Jacob is the Director of Software Engineering at Verisk, where he leads the Premium Audit Advisory Service (PAAS) development team. In this role, Sajin plays a crucial part in designing the architecture and providing strategic guidance to eight development teams, optimizing their efficiency and ensuring the maintainability of all solutions. He holds an MS in Software Engineering from Periyar University, India.

Jerry Chen is a Lead Software Developer at Verisk, based in Jersey City. He leads the GenAi development team, working on solutions for projects within the Verisk Underwriting department to enhance application functionalities and accessibility. Within PAAS, he has worked on the implementation of the conversational RAG architecture with enhancements such as hybrid search, guardrails, and response evaluations. Jerry holds a degree in Computer Science from Stevens Institute of Technology.

Sid Mohanram is the Senior Vice President of Core Lines Technology at Verisk. His area of expertise includes data strategy, analytics engineering, and digital transformation. Sid is head of the technology organization with global teams across five countries. He is also responsible for leading the technology transformation for the multi-year Core Lines Reimagine initiative. Sid holds an MS in Information Systems from Stevens Institute of Technology.

Luis Barbier is the Chief Technology Officer (CTO) of Verisk Underwriting at Verisk. He provides guidance to the development teams’ architectures to maximize efficiency and maintainability for all underwriting solutions. Luis holds an MBA from Iona University.

Kristen Chenowith, MSMSL, CPCU, WCP, APA, CIPA, AIS, is PAAS Product Manager at Verisk. She is currently the product owner for the Premium Audit Advisory Service (PAAS) product suite, including PAAS AI, a first to market generative AI chat tool for premium audit that accelerates research for many consultative questions by 98% compared to traditional methods. Kristen holds an MS in Management, Strategy and Leadership at Michigan State University and a BS in Business Administration at Valparaiso University. She has been in the commercial insurance industry and premium audit field since 2006.

Michelle Stahl, MBA, CPCU, AIM, API, AIS, is a Digital Product Manager with Verisk. She has over 20 years of experience building and transforming technology initiatives for the insurance industry. She has worked as a software developer, project manager, and product manager throughout her career.

Arun Pradeep Selvaraj is a Senior Solutions Architect at AWS. Arun is passionate about working with his customers and stakeholders on digital transformations and innovation in the cloud while continuing to learn, build, and reinvent. He is creative, fast-paced, deeply customer-obsessed, and uses the working backward process to build modern architectures to help customers solve their unique challenges. Connect with him on LinkedIn.

Ryan Doty is a Solutions Architect Manager at AWS, based out of New York. He helps financial services customers accelerate their adoption of the AWS Cloud by providing architectural guidelines to design innovative and scalable solutions. Coming from a software development and sales engineering background, the possibilities that the cloud can bring to the world excite him.

Apoorva Kiran, PhD, is a Senior Solutions Architect at AWS, based out of New York. He is aligned with the financial service industry, and is responsible for providing architectural guidelines to design innovative and scalable fintech solutions. He specializes in developing and commercializing artificial intelligence and machine learning products. Connect with him on LinkedIn.

The opportunity

The approach

Solution overview

Amazon ElastiCache

Amazon Bedrock

Amazon OpenSearch Service

Snowflake in Amazon

Structuring and retrieving the data

LLM parameters and models

Guardrails

Business impact

Conclusion

About the Authors

Fish AI Reader

FishAI

联系邮箱 441953276@qq.com

相关标签