Accenture scales video analysis with Amazon Nova and Amazon Bedrock Agents

This post was written with Ilan Geller, Kamal Mannar, Debasmita Ghosh, and Nakul Aggarwal of Accenture.

Video highlights offer a powerful way to boost audience engagement and extend content value for content publishers. These short, high-impact clips capture key moments that drive viewer retention, amplify reach across social media, reinforce brand identity, and open new avenues for monetization. However, traditional highlight creation workflows are slow and labor-intensive. Editors must manually review footage, identify significant moments, cut clips, and add transitions or narration—followed by manual quality checks and formatting for distribution. Although this provides editorial control, it creates bottlenecks that don’t scale efficiently.

This post showcases how Accenture Spotlight delivers a scalable, cost-effective video highlight generation solution using Amazon Nova and Amazon Bedrock Agents. Amazon Nova foundation models (FMs) deliver frontier intelligence and industry-leading price-performance. With Spotlight, content owners can configure AI models and agents to support diverse use cases across the media industry while offering a human-in-the-loop option for quality assurance and collaborative refinement. This maintains accuracy, editorial oversight, and alignment with brand guidelines—without compromising on speed or scalability.

Real-world use cases

Spotlight has been applied across a range of industry scenarios, including:

Personalized short-form video generation

Sports editing and highlights

Content matching for stakeholders

Real-time retail offer generation

Spotlight’s architecture

Spotlight’s architecture addresses the challenge of scalable video processing, efficiently analyzing and generating content while maintaining speed and quality. It incorporates both task-specific models and Amazon Nova FMs that are orchestrated by specialized Amazon Bedrock agents. Key architectural highlights include:

Task-driven model selection

Agent orchestration

Scalable and adaptable

Spotlight uses a multi-layered agent workflow to automate video processing and generation while maintaining quality control. For example, to generate dynamic video highlights, Spotlight uses three specialized “super agents” that work in coordination under a central orchestrator agent’s supervision. Each super agent is powered by Amazon Nova models, and is supported by a collection of utility agents (see the following diagram). These agents work together to understand video content, generate high-quality highlights, and maintain alignment with user requirements and brand standards.

The workflow consists of the following super agents and utility agents:

Video processing agent

Research agent

Visual analysis agent

Audio analysis agent

Short video generation agent

Section of interest (SOI) agent

Video generation agent

Video postprocessing agent

Reviewer agent

Relevance check agent

Abruptness check agent

See Spotlight in action:

Solution overview

To interact with Spotlight, users access a frontend UI where they provide natural language input to specify their objective. Spotlight then employs its agentic workflow powered by Amazon Nova to achieve its given task. The following diagram illustrates the solution architecture for video highlight generation.

The workflow consists of the following key components (as numbered in the preceding diagram):

Amazon Cognito

Amazon CloudFront

Amazon API Gateway

AWS Elemental MediaLive

AWS Lambda

AWS Step Functions

Amazon SageMaker

AWS Elemental Media Convert

Amazon Simple Storage Service

Amazon CloudWatch

Key benefits

Spotlight’s approach to video processing and generation creates dynamic value. Additionally, its technical design using Amazon Nova and an integrated agentic workflow helps content owners realize gains in their video processing and editorial operations. Key benefits for Spotlight include:

Cross-industry application

Real-time processing

Cost-efficient deployment

Efficiency

The following table provides is a comparative analysis of Spotlight’s video processing approach to conventional approaches for video highlight creation.

Metric	Spotlight Performance	Conventional Approach
Video Processing Latency	Minutes for 2–3-hour sessions	Hours to days
Highlight Review Cost (3–5 minutes)	10 times lower with Amazon Nova	High cost using conventional approaches
Overall Highlight Generation Cost	10 times lower using serverless and on-demand LLM deployment	Manual workflows with high operational overhead
Deployment Architecture	Fully serverless with scalable LLM invocation	Typically resource-heavy and statically provisioned
Use Case Flexibility	Sports, media editing, retail personalization, and more	Often tailored to a single use case

Conclusion

Spotlight represents a cutting-edge agentic solution designed to tackle complex media processing and customer personalization challenges using generative AI. With modular, multi-agent workflows built on Amazon Nova, Spotlight seamlessly enables dynamic short-form video generation. The solution’s core framework is also extensible to diverse industry use cases that require multimodal content analysis at scale.

As an AWS Premier Tier Services Partner and Managed Services Provider (MSP), Accenture brings deep cloud and industry expertise. Accenture and AWS have worked together for more than a decade to help organizations realize value from their applications and data. Accenture brings its industry understanding and generative AI specialists to build and adapt generative AI solutions to client needs. Together with AWS, through the Accenture AWS Business Group (AABG), we help enterprises unlock business value by rapidly scaling generative AI solutions tailored to their needs—driving innovation and transformation in the cloud.

Try out Spotlight for your own use case, and share your feedback in the comments.

About the authors

Ilan Geller is a Managing Director in the Data and AI practice at Accenture. He is the Global AWS Partner Lead for Data and AI and the Center for Advanced AI. His roles at Accenture have primarily been focused on the design, development, and delivery of complex data, AI/ML, and most recently Generative AI solutions.

Dr. Kamal Mannar is a Global Computer Vision Lead at Accenture’s Center for Advanced AI, with over 20 years of experience applying AI across industries like agriculture, healthcare, energy, and telecom. He has led large-scale AI transformations, built scalable GenAI and computer vision solutions, and holds 10+ patents in areas including deep learning, wearable AI, and vision transformers. Previously, he headed AI at Vulcan AI, driving cutting-edge innovation in precision agriculture. Kamal holds a Ph.D. in Industrial & Systems Engineering from the University of Wisconsin–Madison.

Debasmita Ghosh is working as Associate Director in Accenture with 21 years of experience in Information Technology (8 years in AI/Gen AI capability), who currently among multiple responsibilities leads Computer Vision practice in India. She has presented her paper on Handwritten Text Recognition in multiple conferences including MCPR 2020, GHCI 2020. She has patent granted on Handwritten Text Recognition solution and received recognition from Accenture under the Accenture Inventor Award Program being named as an inventor on a granted patent. She has multiple papers on Computer Visions solutions like Table Extraction including non-uniform and borderless tables accepted and presented in the ComPE 2021 and CCVPR 2021 international conferences. She has managed projects across multiple technologies (Oracle Apps, SAP). As a programmer, she has worked during various phases of SDLC with experience on Oracle Apps Development across CRM, Procurement, Receivables, SCM, SAP Professional Services, SAP CRM. Debasmita holds M.Sc. in Statistics from Calcutta University.

Nakul Aggarwal is a Subject Matter Expert in Computer Vision and Generative AI at Accenture, with around 7 years of experience in developing and delivering cutting-edge solutions across computer vision, multimodal AI, and agentic systems. He holds a Master’s degree from the Indian Institute of Technology (IIT) Delhi and has authored several research papers presented at international conferences. He holds two patents in AI and currently leads multiple projects focused on multimodal and agentic AI. Beyond technical delivery, he plays a key role in mentoring teams and driving innovation by bridging advanced research with real-world enterprise applications.

Aramide Kehinde is Global Partner Solutions Architect for Amazon Nova at AWS. She works with high growth companies to build and deliver forward thinking technology solutions using AWS Generative AI. Her experience spans multiple industries, including Media & Entertainment, Financial Services, and Healthcare. Aramide enjoys building the intersection of AI and creative arenas and spending time with her family.

Rajdeep Banerjee is a Senior Partner Solutions Architect at AWS helping strategic partners and clients in the AWS cloud migration and digital transformation journey. Rajdeep focuses on working with partners to provide technical guidance on AWS, collaborate with them to understand their technical requirements, and designing solutions to meet their specific needs. He is a member of Serverless technical field community. Rajdeep is based out of Richmond, Virginia.

Real-world use cases

Spotlight’s architecture

Solution overview

Key benefits

Conclusion

About the authors

Fish AI Reader

FishAI

联系邮箱 441953276@qq.com

相关标签