Zooming Past the Competition

Imagine you’re walking down the street and you see a nice car you’re thinking of buying. Just by pointing your phone camera, you can see relevant content about that car. How cool is that?!

That was the idea that won our team first place in the recent Taboola R&D hackathon, aptly named Taboola Zoom!

Every year Taboola holds a global R&D hackathon for its 350+ engineers, aimed at creating ideas for cool potential products or just some fun experiments in general.

This year, 33 teams worked for 36 hours to come up with ideas that are both awesome and helpful to Taboola. Some of the highlights included a tool that can accurately predict a user’s gender based on their browsing activity, and a social-network integration for the Taboola Feed.

Our team decided to create an AR (Augmented Reality) application that allows a user to get content recommendations, much like Taboola’s recommendations widget, based on whatever they’re pointing their phone camera at.

What is Zoom?

The app itself is an AR experience similar to that of Google Glass, which allows you to interact with the world using your phone camera. Using the app, one just needs to point their camera at an object to immediately get a list of stories from the web related to that object.

To make this idea a reality we used technologies from several domains: AI, web applications and microservices.

Under the Hood

The flow is pretty simple:

    1. The user opens a web application on their phone: an HTML5 page that behaves like a native app.
    2. The app sends a screenshot of the captured video every second to a remote server.
    3. The server processes the image using computer vision technologies.
    4. The server searches for web articles with thumbnails that are the most similar to the processed image.
    5. The retrieved articles are filtered using a similarity threshold, and are sent back to the user to be shown in a slick widget atop the user’s display.
    6. When the user clicks the widget, the relevant article is opened in the browser.
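The server-side half of this flow can be sketched as a minimal Flask endpoint. This is only an illustration of the shape of the code, not our actual implementation; the `embed` and `search_similar` helpers are stubs standing in for the computer vision service and the similarity search described in the sections below.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def embed(image_bytes):
    # Stub for the computer vision service described below:
    # it would turn the screenshot into a numerical embedding vector.
    return [0.0] * 2048

def search_similar(query_embedding):
    # Stub for the nearest-neighbor lookup described below:
    # it would return articles whose thumbnails resemble the query.
    return [{"title": "example article", "url": "https://example.com"}]

@app.route("/recommend", methods=["POST"])
def recommend():
    frame = request.files["frame"].read()    # one screenshot per second
    query = embed(frame)                     # image -> embedding
    return jsonify(search_similar(query))    # articles for the widget
```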

Zooming In

Each component of the system is implemented using different technologies:

Web Application

We decided to implement the user interface using HTML5, which gave us access to native capabilities of the phone such as the camera. Additionally, we used the WebRTC API to capture the camera’s video stream and the Canvas API to grab frames from it.

Computer Vision Service

The service processes every image it receives and returns an embedding: a numerical vector representing information extracted from the image’s content.

Understanding the content of an image is a well-known problem with plenty of solutions. We chose to use Google’s Inception model, which is a DNN (Deep Neural Network) trained to classify the object found in an image.

A DNN is a construct of layers of neurons, similar to nodes in a graph, where each layer learns certain patterns in the image. The first layers learn to output simple patterns such as edges and corners, while the last layer outputs the type of the object, e.g. dog, cat, etc.

We chose to use the output of the layer before last, as that produced the best results.
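A minimal sketch of that extraction, assuming TensorFlow/Keras with pretrained ImageNet weights: constructing `InceptionV3` with `include_top=False` and `pooling="avg"` drops the final classifier and returns the pooled activations of the layer before it, a 2048-dimensional vector that serves as the embedding.

```python
import numpy as np
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input
from tensorflow.keras.preprocessing import image

# No classification head; global average pooling over the last
# convolutional block yields a 2048-d embedding per image.
model = InceptionV3(weights="imagenet", include_top=False, pooling="avg")

def embed(img_path):
    img = image.load_img(img_path, target_size=(299, 299))  # Inception's input size
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    return model.predict(x)[0]  # shape: (2048,)
```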

Database

The only component that was already available to us was Taboola’s internal database of articles, containing, among other things, each article’s thumbnail and URL.

If such a database had not been readily available to us, we could have built our own by scraping images using a library such as BeautifulSoup.
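A hedged sketch of that fallback with requests + BeautifulSoup; the `og:image` convention and the function names are our own illustration, not code from the project.

```python
import requests
from bs4 import BeautifulSoup

def extract_thumbnails(html):
    soup = BeautifulSoup(html, "html.parser")
    # Most article pages expose their thumbnail via an og:image meta tag.
    return [tag["content"] for tag in soup.find_all("meta", property="og:image")]

def scrape_thumbnails(page_url):
    # Fetch a page and pull out its thumbnail URLs.
    return extract_thumbnails(requests.get(page_url, timeout=10).text)
```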

Server

We used Flask as our web server. On startup, the server queries Taboola’s internal database of articles from the web. It then sends each image to the computer vision service, which returns an embedding. The embeddings are then stored in a designated data structure called FAISS (Facebook AI Similarity Search), which allows us to perform a nearest-neighbors search.

Each image sent from the user is similarly transformed into an embedding. It is then used as a query against the above data structure to retrieve its nearest embeddings, meaning images with similar patterns and content. Only images that are considered similar above a predefined threshold are then returned to the user.

So to recap, the entire architecture relies on three main components:

    1. web application
    2. computer vision service
    3. articles database

Do it Yourself

The entire project was developed in under 36 hours by a team of 5 people.

This app touches several interesting and exciting domains that are “hot” in the industry: AR and AI. It was a breeze to implement thanks to the commonly available tools and libraries.

If there’s one thing we want you to take away from this, it’s that it’s not that difficult.

We invite you to be aware of the potential that lies in these domains and to be on the lookout for interesting and exciting ideas. Once you find one, go and have your own private hackathon!


Finally

If you want to play around with the app, open https://zoom.taboola.com on your phone (use Chrome on Android and Safari on iOS) and try hovering over different objects to see the various results. Keep in mind this is a pre-alpha hackathon project.

We want to thank our wonderful teammates who worked tirelessly to create this amazing app:

Amir Keren, Yoel Zeldes, Elina Revzin, Aviv Rotman and Ofri Mann.


Originally published at engineering.taboola.com by me and Amir Keren.
