{"id":15369,"date":"2023-11-30T18:49:09","date_gmt":"2023-11-30T18:49:09","guid":{"rendered":"http:\/\/scannn.com\/a-decade-of-advancing-the-state-of-the-art-in-ai-through-open-research\/"},"modified":"2023-11-30T18:49:09","modified_gmt":"2023-11-30T18:49:09","slug":"a-decade-of-advancing-the-state-of-the-art-in-ai-through-open-research","status":"publish","type":"post","link":"https:\/\/scannn.com\/lv\/a-decade-of-advancing-the-state-of-the-art-in-ai-through-open-research\/","title":{"rendered":"A Decade of Advancing the State-of-the-Art in AI Through Open Research"},"content":{"rendered":"<p> <br \/>\n<\/p>\n<div>\n<p><span style=\"font-weight: 400;\">Today we\u2019re celebrating the 10-year anniversary of Meta\u2019s Fundamental AI Research (FAIR) team. For the last decade, <\/span><a href=\"https:\/\/ai.meta.com\/blog\/fair-10-year-anniversary-open-science-meta\/\"><span style=\"font-weight: 400;\">FAIR<\/span><\/a><span style=\"font-weight: 400;\"> has been the source of many AI breakthroughs and a beacon for doing research in an open and responsible way. We are committed to open science and sharing our work, whether it be papers, code, models, demos or responsible use guides.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">We\u2019ve made impressive strides in the past 10 years in object detection with <\/span><a href=\"https:\/\/ai.meta.com\/blog\/segment-anything-foundation-model-image-segmentation\/\"><span style=\"font-weight: 400;\">Segment Anything<\/span><\/a><span style=\"font-weight: 400;\">, which recognizes objects in images. Additionally, we were among the first to pioneer techniques for unsupervised machine translation, allowing us to build a model that can translate across 100 languages without relying on English. This led to our <\/span><a href=\"https:\/\/ai.meta.com\/research\/no-language-left-behind\/\"><span style=\"font-weight: 400;\">No Language Left Behind<\/span><\/a><span style=\"font-weight: 400;\"> breakthrough, which most recently expanded text-to-speech and speech-to-text technology to more than <\/span><a href=\"https:\/\/about.fb.com\/news\/2023\/05\/ai-massively-multilingual-speech-technology\/\"><span style=\"font-weight: 400;\">1,000 languages<\/span><\/a><span style=\"font-weight: 400;\">.\u00a0\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Earlier this year we released <\/span><a href=\"https:\/\/ai.meta.com\/blog\/large-language-model-llama-meta-ai\/\"><span style=\"font-weight: 400;\">Llama<\/span><\/a><span style=\"font-weight: 400;\">, an open, pre-trained large language model, followed by <\/span><a href=\"https:\/\/about.fb.com\/news\/2023\/07\/llama-2\/\"><span style=\"font-weight: 400;\">Llama 2<\/span><\/a><span style=\"font-weight: 400;\">, which is free for research and commercial use. 
## Generating Voices and Sound Effects With Audiobox

Earlier this year, we introduced [Voicebox](https://about.fb.com/news/2023/06/introducing-voicebox-ai-for-speech-generation/), a generative AI model that can help with audio editing, sampling and styling. Now [Audiobox](https://ai.meta.com/blog/audiobox-generating-audio-voice-natural-language-prompts), its successor, advances generative AI for audio even further. With Audiobox, you can use voice prompts or text descriptions to describe sounds or types of speech you'd like to generate. For example, you could create a soundtrack with a prompt like, "a running river and birds chirping." You can even generate a voice by saying, "a young woman speaks with a high pitch and fast pace." Audiobox makes it easy to create custom audio for all of your projects.
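As a rough illustration of this prompt-to-audio workflow, here is a small Python sketch. The `generate_audio` helper is a stand-in written for this post, not the Audiobox API: it returns quiet noise so the script runs end to end, and a real model call would go in its place.

```python
import numpy as np
import soundfile as sf  # pip install soundfile


def generate_audio(description: str, duration_s: float = 5.0,
                   sample_rate: int = 16000) -> np.ndarray:
    """Stand-in for a text-prompted audio model. A real generator would map
    the description (e.g. "a running river and birds chirping") to a
    waveform; here we return quiet noise so the script is runnable."""
    rng = np.random.default_rng(0)
    n_samples = int(duration_s * sample_rate)
    return (0.01 * rng.standard_normal(n_samples)).astype(np.float32)


# Sound-effect prompt from the post.
sf.write("river_birds.wav", generate_audio("a running river and birds chirping", 8.0), 16000)

# Voice-style prompt from the post.
sf.write("voice.wav", generate_audio("a young woman speaks with a high pitch and fast pace"), 16000)
```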
## Unlocking Seamless Language Translation

Building on our work with [SeamlessM4T](https://about.fb.com/news/2023/08/seamlessm4t-ai-translation-model/), we're now introducing [Seamless Communication](https://ai.meta.com/blog/seamless-communication): a suite of AI translation models that better preserve expression across languages and translate while the speaker is still talking to improve speed.

Earlier language translation services often struggled to capture tone of voice, pauses and emphasis, missing important signals that help us share emotions and intent. SeamlessExpressive is the first publicly available system that unlocks expressive cross-lingual communication. It uses a model that preserves the speaker's emotion and style, and addresses the rate and rhythm of speech. The model currently works for English, Spanish, German, French, Italian and Chinese.

SeamlessStreaming unlocks real-time conversations with someone who speaks a different language. In contrast to conventional systems, which translate only once the speaker has finished their sentence, SeamlessStreaming translates while the speaker is still talking, allowing the listener to hear a translation faster.
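The latency difference is easy to see in a toy simulation. The sketch below contrasts a wait-for-the-whole-sentence strategy with a chunked, streaming-style one; `fake_translate` is a placeholder for a real translation model, and none of this reflects the SeamlessStreaming implementation or API.

```python
from typing import Iterable, Iterator


def fake_translate(text: str) -> str:
    """Placeholder for a real translation model; tags the text so the
    two strategies below are easy to compare."""
    return f"[translated] {text}"


def conventional_translation(words: Iterable[str]) -> Iterator[str]:
    """Wait for the whole utterance before emitting anything."""
    yield fake_translate(" ".join(words))


def streaming_translation(words: Iterable[str], chunk_size: int = 3) -> Iterator[str]:
    """Emit partial translations every few words, while the speaker is
    still talking, so the listener hears output sooner."""
    buffer = []
    for word in words:
        buffer.append(word)
        if len(buffer) == chunk_size:
            yield fake_translate(" ".join(buffer))
            buffer = []
    if buffer:
        yield fake_translate(" ".join(buffer))


utterance = "the bike tire needs a new inner tube before we can ride".split()
print(list(conventional_translation(utterance)))  # one result, only at the end
print(list(streaming_translation(utterance)))     # several partial results along the way
```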
Meta is uniquely poised to solve AI's biggest challenges. Our investments in software, hardware and infrastructure allow us to weave learnings from our research into products that can benefit billions of people.

FAIR is a critical piece of Meta's success, and one of the only groups in the world with all the requirements to deliver true breakthroughs: some of the brightest minds in the industry, a culture of openness, and most importantly, the freedom to conduct exploratory research. This freedom has helped us stay agile and contribute to building the future of social connection.

## Responsible AI Research

We value responsible AI research and openness because sharing thoughtful work through the scrutiny of peers pushes us towards excellence and builds trust in our advances. It also allows us to collaborate with the wider community, which brings faster progress and a more diverse set of contributors. Learn more about how we're [conducting AI research responsibly](https://ai.meta.com/blog/fair-progress-and-learnings-across-socially-responsible-ai-research).

[Source link](https://about.fb.com/news/2023/11/decade-of-advancing-ai-through-open-research/)