<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Headline Edit</title>
    <description>Your weekly briefing on the most important news, informed by conversations with founders and industry leaders, and trusted by top executives at Fortune 500 companies.</description>
    
    <link>https://edit.headline.com/</link>
    <atom:link href="https://rss.beehiiv.com/feeds/nos8BOBDcf.xml" rel="self"/>
    
    <lastBuildDate>Thu, 16 Apr 2026 19:01:02 +0000</lastBuildDate>
    <pubDate>Tue, 17 Dec 2024 01:33:42 +0000</pubDate>
    <atom:published>2024-12-17T01:33:42Z</atom:published>
    <atom:updated>2026-04-16T19:01:02Z</atom:updated>
    
      <category>Venture Capital</category>
      <category>News</category>
      <category>Artificial Intelligence</category>
    <copyright>Copyright 2026, Headline Edit</copyright>
    
    <image>
      <url>https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/publication/logo/15e8942c-5b23-4f88-9e3b-405cd7e39417/h-logo.png</url>
      <title>Headline Edit</title>
      <link>https://edit.headline.com/</link>
    </image>
    
    <docs>https://www.rssboard.org/rss-specification</docs>
    <generator>beehiiv</generator>
    <language>en-us</language>
    <webMaster>support@beehiiv.com (Beehiiv Support)</webMaster>

      <item>
  <title>AI: OpenAI &amp; Google ship big updates, World Labs&#39; 3D/spatial demo, Liquid challenges transformers, and breaking down the data scarcity hype... (12.16.24)</title>
  <description>OpenAI, Google, World Labs, Liquid AI</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/0db820a7-9f0e-48a1-9527-c4ad3f75a9fd/DALL_E_2024-12-16_15.59.08_-_A_computer_screen_generating_3D_ships_dynamically_sailing_outward_from_the_screen__with_the_ships_appearing_to_come_alive_as_they_emerge._The_ships_ar.jpg" length="588344" type="image/jpeg"/>
  <link>https://edit.headline.com/p/ai-openai-google-are-shipping-big-updates-the-rise-of-a-transformer-challenger-and-the-data-scarcity</link>
  <guid isPermaLink="true">https://edit.headline.com/p/ai-openai-google-are-shipping-big-updates-the-rise-of-a-transformer-challenger-and-the-data-scarcity</guid>
  <pubDate>Tue, 17 Dec 2024 01:33:42 +0000</pubDate>
  <atom:published>2024-12-17T01:33:42Z</atom:published>
    <dc:creator>Sasha Krecinic</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #c7bab0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #161618; font-family: 'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#fcf0e8; }
  .bh__table_header p { color: #161618; font-family:'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><p class="paragraph" style="text-align:left;">The last few weeks have been packed with exciting releases, and in this edition, we’ll dive into the major updates, along with a short thought piece on a persistent narrative: have we really “run out of data,” and is AI progress slowing down as a result?</p><p class="paragraph" style="text-align:left;">Happy Holidays!</p><p class="paragraph" style="text-align:left;">Sasha Krecinic</p><hr class="content_break"><p id="is-ai-progress-starting-to-slow-dow" class="paragraph" style="text-align:left;"><span style="font-size:1.5rem;"><b>Is AI progress starting to &#39;slow down&#39; because we have &#39;consumed&#39; all of the data?</b></span></p><p class="paragraph" style="text-align:left;"><b>Short Answer:</b> No.</p><p class="paragraph" style="text-align:left;"><b>Long Answer:</b> AI&#39;s momentum is parallelized now, meaning there are several viable pathways to explore, both on the research and scaling side. Yes, some models are huge and underperform expectations, and some models aren&#39;t released due to competitive concerns or safety and alignment concerns. However, it isn’t wise to judge progress based on headlines or one variable like ‘total publicly available data’.</p><p class="paragraph" style="text-align:left;">Unfortunately, the headlines have focused on the fact that industry figures like Ilya Sutskever and other researchers have stated that the era of pre-training is coming to a close. However, the best way to track the frontier is to follow the research developments. 
What they might not mention is <a class="link" href="https://www.reuters.com/technology/artificial-intelligence/openai-rivals-seek-new-path-smarter-ai-current-methods-hit-limitations-2024-11-11/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=ai-openai-google-ship-big-updates-world-labs-3d-spatial-demo-liquid-challenges-transformers-and-breaking-down-the-data-scarcity-hype-12-16-24" target="_blank" rel="noopener noreferrer nofollow">Ilya also recently said, &quot;Scaling the right thing matters more now than ever.&quot;</a> His point is that there are several different doors to explore, and some might be dead ends, hence the focus on parallelization among the major AI labs. Both <a class="link" href="https://www.youtube.com/live/-cq3O4t0qQc?t=723s&utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=ai-openai-google-ship-big-updates-world-labs-3d-spatial-demo-liquid-challenges-transformers-and-breaking-down-the-data-scarcity-hype-12-16-24" target="_blank" rel="noopener noreferrer nofollow">Sam Altman </a>and <a class="link" href="https://www.youtube.com/watch?v=ugvHCXCOmm4&utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=ai-openai-google-ship-big-updates-world-labs-3d-spatial-demo-liquid-challenges-transformers-and-breaking-down-the-data-scarcity-hype-12-16-24" target="_blank" rel="noopener noreferrer nofollow">Dario Amodei </a>have stated that they have a clear line of sight on where to build for the next 18-24 months. Beyond that, it is arguably hard to plan because the frontier is moving so quickly.</p><p class="paragraph" style="text-align:left;"><b>Why are people saying this? </b><br>These comments are usually taken out of context. They often reflect one small part of the picture and, sadly, can be quite misleading at times. Some of the biggest developments have occurred in the last two months and even came sooner than many in the field expected. 
Here are a few examples: <br>- <a class="link" href="https://openai.com/index/learning-to-reason-with-llms/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=ai-openai-google-ship-big-updates-world-labs-3d-spatial-demo-liquid-challenges-transformers-and-breaking-down-the-data-scarcity-hype-12-16-24" target="_blank" rel="noopener noreferrer nofollow">Test time training/compute</a><br>- <a class="link" href="https://openai.com/index/introducing-the-realtime-api/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=ai-openai-google-ship-big-updates-world-labs-3d-spatial-demo-liquid-challenges-transformers-and-breaking-down-the-data-scarcity-hype-12-16-24" target="_blank" rel="noopener noreferrer nofollow">Real-time voice APIs</a> / <a class="link" href="https://aistudio.google.com/live?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=ai-openai-google-ship-big-updates-world-labs-3d-spatial-demo-liquid-challenges-transformers-and-breaking-down-the-data-scarcity-hype-12-16-24" target="_blank" rel="noopener noreferrer nofollow">Live AI screen vision </a> <br>- <a class="link" href="https://www.anthropic.com/news/3-5-models-and-computer-use?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=ai-openai-google-ship-big-updates-world-labs-3d-spatial-demo-liquid-challenges-transformers-and-breaking-down-the-data-scarcity-hype-12-16-24" target="_blank" rel="noopener noreferrer nofollow">AI Computer Use</a><br><br>Each of these has the potential to transform industries and the overall surface area AI is touching here is expanding much faster than the data is being ‘consumed’. The thing that stumps most people in this industry is how little coverage these developments have gotten. 
So when someone says AI progress is &quot;losing steam,&quot; ask them what they think about the research pathways and how quickly AI’s surface area is expanding… </p><p class="paragraph" style="text-align:left;">[<a class="link" href="https://www.linkedin.com/feed/update/urn:li:activity:7263656881494081536/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=ai-openai-google-ship-big-updates-world-labs-3d-spatial-demo-liquid-challenges-transformers-and-breaking-down-the-data-scarcity-hype-12-16-24" target="_blank" rel="noopener noreferrer nofollow">Why AI is not losing steam...</a>] Share this story <a class="link" href="mailto:?subject=%20.%20.%20AI%20Development%20Accelerates%20with%20Multiple%20Research%20Pathways%20%20&body=%0A%0A%0AAI%20Development%20Accelerates%20with%20Multiple%20Research%20Pathways%20%20%0AFirst%20impacted%3A%20%20%2F%2F%20Time%20to%20impact%3A%20%0A%0AIs%20AI%20progress%20starting%20to%20%27slow%20down%27%3F%20Have%20we%20%27tapped%20out%27%20all%20the%20data%3F%0A%0AShort%20Answer%3A%20lol%2C%20no%20%F0%9F%98%82%20%0A%0ALong%20Answer%3A%20AI%27s%20momentum%20is%20parallelized%20now%2C%20meaning%20there%20are%20several%20viable%20pathways%20to%20build%20on%20(both%20on%20the%20research%20and%20scaling%20side).%20Some%20models%20underperform%20expectations%2C%20and%20some%20models%20aren%27t%20released%20due%20to%20competitive%20concerns%20or%20safety%20and%20alignment%20concerns.%20You%20cannot%20judge%20progress%20based%20on%20consumer-facing%20product%20releases.%20The%20best%20way%20to%20track%20the%20frontier%20is%20through%20research%20developments.%20As%20Ilya%20said%2C%20%22Scaling%20the%20right%20thing%20matters%20more%20now%20than%20ever%2C%22%20hence%20the%20focus%20on%20parallelization.%20Both%20OpenAI%20and%20Anthropic%20have%20said%20they%20have%20a%20clear%20line%20of%20sight%20on%20where%20to%20build%20for%20the%20next%2018-24%20months.%20Beyond%20that%2C%20it%20is%20hard%20to%20plan%20because%20the%20frontier%20
is%20actually%20moving%20so%20quickly%20(except%20for%20things%20like%20large%20infrastructure%20projects%20which%20can%20have%20long%20lead%20times).%0A%0AWhy%20are%20people%20saying%20this%20then%3F%0AThese%20comments%20are%20usually%20taken%20out%20of%20context.%20They%20often%20reflect%20one%20small%20part%20of%20the%20picture%20and%20can%20be%20misleading.%20A%20quick%20litmus%20test%20is%20to%20ask%20if%20they%20know%20what%20Arxiv%20is%20(https%3A%2F%2Flnkd.in%2FgZWd7gwY)%20or%20what%20the%20most%20recent%20research%20paper%20they%20read%20was.%20If%20they%20can%27t%20answer%2C%20it%27s%20unlikely%20they%20are%20tracking%20the%20broader%20AI%20landscape.%0A%0A%22Not%20much%20has%20happened%20since%20ChatGPT%22%20%E2%80%94%20what%20do%20you%20say%20to%20this%3F%0ASome%20of%20the%20biggest%20developments%20have%20occurred%20in%20the%20last%20two%20months%20and%20came%20sooner%20than%20many%20in%20the%20field%20expected.%20Here%20are%20a%20few%20examples%3A%0A-%20Test%20time%20training%2Fcompute%3A%20https%3A%2F%2Flnkd.in%2FgSeFqG4b%0A-%20Real-time%20voice%20API%3A%20https%3A%2F%2Flnkd.in%2FgK8bKeEK%0A-%20Computer%20Use%3A%20https%3A%2F%2Flnkd.in%2Fgn_8f222%0A%0AEach%20of%20these%20has%20the%20potential%20to%20transform%20industries.%20The%20thing%20that%20stumps%20most%20people%20in%20this%20industry%20is%20how%20little%20coverage%20these%20developments%20have%20gotten.%20%0A%0ASo%20when%20someone%20says%20AI%20progress%20is%20%22losing%20steam%2C%22%20ask%20them%20what%20research%20papers%20they%20read%20to%20form%20this%20opinion...%20%F0%9F%99%83%0A%0A%20%5BAI%20losing%20steam%20due%20to%20data%20running%20out...%20https%3A%2F%2Fwww.linkedin.com%2Ffeed%2Fupdate%2Furn%3Ali%3Aactivity%3A7263656881494081536%2F%5D" target="_blank" rel="noopener noreferrer nofollow">by email</a></p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="open-a-is-12-days-of-product-shipma"><b><a class="link" 
href="https://openai.com/12-days/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=ai-openai-google-ship-big-updates-world-labs-3d-spatial-demo-liquid-challenges-transformers-and-breaking-down-the-data-scarcity-hype-12-16-24" target="_blank" rel="noopener noreferrer nofollow">OpenAI’s 12 days of product ship-mass</a></b></h3><p class="paragraph" style="text-align:left;">In case you missed it, OpenAI’s 12 Days of back-to-back releases (so far):</p><ul><li><p class="paragraph" style="text-align:left;"><b>Day 1:</b> o1 & ChatGPT Pro (Premium features with $20-$200 plans)</p></li><li><p class="paragraph" style="text-align:left;"><b>Day 2:</b> Reinforcement Fine-Tuning Program (Research applications open) </p></li><li><p class="paragraph" style="text-align:left;"><b>Day 3:</b> Sora (Video generation and remixing)</p></li><li><p class="paragraph" style="text-align:left;"><b>Day 4:</b> Canvas (Collaborative coding and writing)</p></li><li><p class="paragraph" style="text-align:left;"><b>Day 5:</b> ChatGPT in Apple Intelligence (Ecosystem integration)</p></li><li><p class="paragraph" style="text-align:left;"><b>Day 6:</b> Advanced Voice & Santa Mode (Voice+video and festive fun)</p></li><li><p class="paragraph" style="text-align:left;"><b>Day 7:</b> Projects in ChatGPT (Organize and manage projects)</p></li><li><p class="paragraph" style="text-align:left;"><b>Day 8:</b> ChatGPT Search (Real-time web answers with links)</p></li></ul><p class="paragraph" style="text-align:left;">Some of these were expected, and others have been a complete surprise to me, like the Reinforcement Fine-Tuning Program. It’s a big development because it lets developers shape model behavior through iterative feedback loops rather than being stuck with static datasets. 
Traditional fine-tuning methods rely on adjusting model parameters with fixed training examples, but RL FT incorporates evaluative signals—like user feedback or predefined reward criteria—directly into the training process. This makes it possible to optimize a model’s responses toward desired outcomes more dynamically, improving its ability to handle complex tasks, follow specific instructions, and maintain quality and alignment over time! </p><p class="paragraph" style="text-align:left;">[<a class="link" href="https://openai.com/12-days/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=ai-openai-google-ship-big-updates-world-labs-3d-spatial-demo-liquid-challenges-transformers-and-breaking-down-the-data-scarcity-hype-12-16-24" target="_blank" rel="noopener noreferrer nofollow">12 Days of OpenAI </a>] Share this story <a class="link" href="mailto:?subject=%20.%20.%20ChatGPT%20Search%20launches%20for%20timely%20web-based%20answers%20%20&body=%0A%0A%0AChatGPT%20Search%20launches%20for%20timely%20web-based%20answers%20%20%0AFirst%20impacted%3A%20%20%2F%2F%20Time%20to%20impact%3A%20%0A%0AOpenAI%20has%20launched%20ChatGPT%20Search%20in%20October%202024%2C%20a%20feature%20that%20provides%20quick%20answers%20using%20relevant%20web%20sources%20and%20includes%20clear%20links%20to%20these%20sources.%20This%20launch%20follows%20the%20introduction%20of%20Canvas%20for%20collaborative%20writing%20and%20coding%2C%20and%20Sora%20for%20video%20creation%2C%20both%20designed%20to%20enhance%20creative%20expression%20and%20storytelling.%0A%0A%20%5B12%20Days%20of%20OpenAI%0A%20https%3A%2F%2Fopenai.com%2F12-days%2F%5D" target="_blank" rel="noopener noreferrer nofollow">by email</a></p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="google-launches-gemini-20-ai-model"><b><a class="link" 
href="https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=ai-openai-google-ship-big-updates-world-labs-3d-spatial-demo-liquid-challenges-transformers-and-breaking-down-the-data-scarcity-hype-12-16-24#ceo-message" target="_blank" rel="noopener noreferrer nofollow">Google Launches Gemini 2.0 AI Model</a></b></h3><p class="paragraph" style="text-align:left;">Google has launched Gemini 2.0, described in its recent blog post as an “AI model for the agentic era.” The model offers advanced multimodal capabilities and native tool use (e.g., search), with an experimental version called Gemini 2.0 Flash now available to developers, reportedly doubling the speed of its predecessor. Potentially most impressive, Google AI Studio introduces a screen-sharing capability that lets users start a live video session and receive real-time assistance with anything on their screen. Google also announced it is testing browser control for tasks like collecting contact information from web pages via a chat-based control panel. 
</p><p class="paragraph" style="text-align:left;">[<a class="link" href="https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=ai-openai-google-ship-big-updates-world-labs-3d-spatial-demo-liquid-challenges-transformers-and-breaking-down-the-data-scarcity-hype-12-16-24#ceo-message" target="_blank" rel="noopener noreferrer nofollow">Introducing Gemini 2.0: our new AI model for the agentic era</a>] Share this story <a class="link" href="mailto:?subject=%20.%20.%20Google%20Launches%20Gemini%202.0%20AI%20Model%20%20&body=%0A%0A%0AGoogle%20Launches%20Gemini%202.0%20AI%20Model%20%20%0AFirst%20impacted%3A%20%20%2F%2F%20Time%20to%20impact%3A%20%0A%0AGoogle%20and%20Alphabet%20have%20launched%20Gemini%202.0%2C%20an%20AI%20model%20designed%20for%20the%20agentic%20era%2C%20according%20to%20a%20blog%20post.%20They%20say%20this%20model%20offers%20advanced%20multimodal%20capabilities%20and%20native%20tool%20use%2C%20with%20an%20experimental%20version%20called%20Gemini%202.0%20Flash%20now%20available%20to%20developers%2C%20reportedly%20doubling%20the%20speed%20of%20its%20predecessor.%0A%0A%20%5BIntroducing%20Gemini%202.0%3A%20our%20new%20AI%20model%20for%20the%20agentic%20era%20https%3A%2F%2Fblog.google%2Ftechnology%2Fgoogle-deepmind%2Fgoogle-gemini-ai-update-december-2024%2F%23ceo-message%5D" target="_blank" rel="noopener noreferrer nofollow">by email</a></p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="world-labs-spatial-intelligence-is-"><b><a class="link" href="https://www.worldlabs.ai/blog?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=ai-openai-google-ship-big-updates-world-labs-3d-spatial-demo-liquid-challenges-transformers-and-breaking-down-the-data-scarcity-hype-12-16-24" target="_blank" rel="noopener noreferrer nofollow">World Labs: Spatial intelligence is on the horizon</a></b></h3><p class="paragraph" style="text-align:left;">A 
major step forward for spatial intelligence: World Labs has introduced an AI system capable of generating interactive 3D worlds from a single 2D image. The company raised $230M in September 2024 and is already showcasing impressive demos. Unlike conventional tools that produce static visuals, this new technology allows users to fully explore scenes, peering around corners and examining details in real time. Early demos show how the tool can transform creative workflows for artists, filmmakers, and game developers, offering unprecedented control and fidelity in digital environments. Check out the impressive demo here: </p><p class="paragraph" style="text-align:left;">[<a class="link" href="https://www.worldlabs.ai/blog?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=ai-openai-google-ship-big-updates-world-labs-3d-spatial-demo-liquid-challenges-transformers-and-breaking-down-the-data-scarcity-hype-12-16-24" target="_blank" rel="noopener noreferrer nofollow">3D AI worlds coming soon</a>] Share this story <a class="link" href="mailto:?subject=%20.%20.%203D%20AI%20worlds%20coming%20soon&body=%0A%0A%0A3D%20AI%20worlds%20coming%20soon%0AFirst%20impacted%3A%20%20%2F%2F%20Time%20to%20impact%3A%20%0A%0AIn%20a%20major%20step%20forward%20for%20spatial%20intelligence%2C%20World%20Labs%20has%20introduced%20an%20AI%20system%20capable%20of%20generating%20interactive%203D%20worlds%20from%20a%20single%202D%20image.%20Unlike%20conventional%20tools%20that%20produce%20static%20visuals%2C%20this%20new%20technology%20allows%20users%20to%20fully%20explore%20scenes%2C%20peering%20around%20corners%20and%20examining%20details%20in%20real-time.%20Early%20demos%20show%20how%20the%20tool%20can%20transform%20creative%20workflows%20for%20artists%2C%20filmmakers%2C%20and%20game%20developers%2C%20offering%20unprecedented%20control%20and%20fidelity%20in%20digital%20environments.%0A%0A%20%5B3D%20AI%20worlds%20coming%20soon%20https%3A%2F%2Fwww.worldlabs.ai%2Fblog%5D" target="_blank" 
rel="noopener noreferrer nofollow">by email</a></p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="liquid-ai-secures-250-m-to-expand-a"><b><a class="link" href="https://www.liquid.ai/blog/we-raised-250m-to-scale-capable-and-efficient-general-purpose-ai?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=ai-openai-google-ship-big-updates-world-labs-3d-spatial-demo-liquid-challenges-transformers-and-breaking-down-the-data-scarcity-hype-12-16-24" target="_blank" rel="noopener noreferrer nofollow">Liquid AI secures $250M to expand AI model infrastructure</a></b></h3><p class="paragraph" style="text-align:left;">Liquid AI has raised $250 million in a Series A round led by AMD Ventures to advance its Liquid Foundation Models, which are lightweight and general-purpose. The company says it will use the funds to enhance its computing infrastructure, expedite product readiness for edge and on-premise applications and fine-tuning, and bring its solution to a broader audience. 
</p><p class="paragraph" style="text-align:left;">[<a class="link" href="https://www.liquid.ai/blog/we-raised-250m-to-scale-capable-and-efficient-general-purpose-ai?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=ai-openai-google-ship-big-updates-world-labs-3d-spatial-demo-liquid-challenges-transformers-and-breaking-down-the-data-scarcity-hype-12-16-24" target="_blank" rel="noopener noreferrer nofollow">We raised $250M to scale capable and efficient general-purpose AI</a>] Share this story <a class="link" href="mailto:?subject=%20.%20.%20Liquid%20AI%20secures%20%24250M%20to%20expand%20AI%20model%20infrastructure%20%20&body=%0A%0A%0ALiquid%20AI%20secures%20%24250M%20to%20expand%20AI%20model%20infrastructure%20%20%0AFirst%20impacted%3A%20%20%2F%2F%20Time%20to%20impact%3A%20%0A%0ALiquid%20AI%20has%20raised%20%24250%20million%20in%20a%20Series%20A%20round%20led%20by%20AMD%20Ventures%20to%20advance%20its%20Liquid%20Foundation%20Models%2C%20which%20are%20lightweight%2C%20general-purpose%20AI%20models.%20The%20company%20says%20it%20will%20use%20the%20funds%20to%20enhance%20its%20computing%20infrastructure%20and%20expedite%20product%20readiness%20for%20edge%20and%20on-premise%20applications%2C%20with%20AMD%27s%20Mathew%20Hein%20expressing%20enthusiasm%20about%20the%20collaboration.%0A%0A%20%5BWe%20raised%20%24250M%20to%20scale%20capable%20and%20efficient%20general-purpose%20AI%20https%3A%2F%2Fwww.liquid.ai%2Fblog%2Fwe-raised-250m-to-scale-capable-and-efficient-general-purpose-ai%5D" target="_blank" rel="noopener noreferrer nofollow">by email</a></p></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=56a838f4-fa73-483d-8a5a-18c705399c81&utm_medium=post_rss&utm_source=headline_edit">Powered by beehiiv</a></div></div>
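A footnote on the Reinforcement Fine-Tuning item above: the core loop it describes (sample an output, score it with an evaluative reward signal, and nudge the policy toward higher-reward behavior) can be sketched in a few lines of Python. This is a toy illustration under our own assumptions, not OpenAI's actual RFT API: a weighted sampler stands in for the model, and the grader is a hard-coded dictionary.

```python
import random

# Toy stand-in for a model's policy: a weighted sampler over answer styles.
# In real reinforcement fine-tuning this would be an LLM. (Illustrative only.)
weights = {"concise": 1.0, "rambling": 1.0, "wrong": 1.0}

def sample(weights):
    """Draw one answer style with probability proportional to its weight."""
    total = sum(weights.values())
    r = random.uniform(0, total)
    for answer, w in weights.items():
        r -= w
        if r <= 0:
            return answer
    return answer  # numerical edge case: fall back to the last key

def grade(answer):
    """Evaluative signal (the 'grader'): reward good behavior, punish bad."""
    return {"concise": 1.0, "rambling": 0.2, "wrong": -1.0}[answer]

random.seed(0)
learning_rate = 0.1
for _ in range(500):
    answer = sample(weights)
    # Reinforcement update: scale the sampled option by its reward,
    # so high-reward behavior becomes more likely on the next draw.
    weights[answer] = max(1e-3, weights[answer] * (1 + learning_rate * grade(answer)))

best = max(weights, key=weights.get)  # the behavior the loop converged toward
```

The contrast with static fine-tuning is the point: a fixed dataset pins the model to whatever examples it was given, while this loop keeps reshaping behavior as the evaluative signal comes in.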
  ]]></content:encoded>
</item>

      <item>
  <title>Huge Week in AI: OpenAI launches &#39;Strawberry&#39; and Google&#39;s Real-Time AI Gaming engine (9.12.24)</title>
  <description>OpenAI, Google, DOOM, Magic, Cursor, Replit</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/585a0358-3867-472f-a1ae-d3f7268a4516/u6161822635_An_isometric_retro-futuristic_strawberry_robot_wi_dead3bca-1387-4fcb-a98b-eef6555cd171_1.png" length="1527582" type="image/png"/>
  <link>https://edit.headline.com/p/googles-realtime-ai-gaming-engine-openai-strawberry-details-9924</link>
  <guid isPermaLink="true">https://edit.headline.com/p/googles-realtime-ai-gaming-engine-openai-strawberry-details-9924</guid>
  <pubDate>Fri, 13 Sep 2024 00:00:12 +0000</pubDate>
  <atom:published>2024-09-13T00:00:12Z</atom:published>
    <dc:creator>Sasha Krecinic</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #c7bab0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #161618; font-family: 'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#fcf0e8; }
  .bh__table_header p { color: #161618; font-family:'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/7284f95e-59eb-450a-84d0-59be1bb001df/image.png?t=1726184862"/></div><p class="paragraph" style="text-align:left;">You may notice a slight change in this week’s edition of our newsletter. We are excited to announce the official relaunch of our newsletter under a new name: &quot;Headline Edit.&quot; And what an action-packed week to introduce it! Our mission remains the same: distill the week&#39;s biggest stories in AI —just with a fresh new look.</p><p class="paragraph" style="text-align:left;">This week, we shine a spotlight on a critical challenge in AI: reasoning as a major bottleneck for current large language models (LLMs). We delve into how recent advancements aim to overcome this hurdle, including OpenAI&#39;s latest efforts to enhance reasoning capabilities in their models. We also explore Magic&#39;s remarkable 100 million token context window, Google&#39;s GameNGen potentially redefining real-time gaming through cutting-edge AI, and impressive updates in the AI software engineering tooling space from Replit, Cursor, and GitHub&#39;s Copilot. With context windows now reaching unprecedented lengths and code generation tools becoming increasingly capable, these developments are converging to significantly accelerate software development and enhance AI&#39;s reasoning abilities across industries. Read on and strap in because it’s going to be an exciting couple of years ahead! 
</p><p class="paragraph" style="text-align:left;">— Sasha Krecinic</p><p class="paragraph" style="text-align:left;"></p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="open-ai-and-anthropic-announce-earl"><a class="link" href="https://openai.com/index/learning-to-reason-with-llms/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=huge-week-in-ai-openai-launches-strawberry-and-google-s-real-time-ai-gaming-engine-9-12-24" target="_blank" rel="noopener noreferrer nofollow">OpenAI launches Strawberry aka ChatGPT-o1 and reveals even more…</a></h3><p class="paragraph" style="text-align:left;">OpenAI&#39;s new o1 series represents a massive breakthrough in AI capabilities, designed to handle complex, multi-step tasks in fields like science, math, and coding. What makes o1 stand out is its ability to &quot;think before it responds,&quot; using a chain-of-thought approach to solve problems more effectively than previous models. This is a major unlock for large language models (LLMs), which traditionally excel at generating text but struggle with deeper reasoning.</p><p class="paragraph" style="text-align:left;">In tests, o1 vastly outperforms both GPT-4o and human experts. For example, on the AIME (the American Invitational Mathematics Examination, a qualifier for the USA Math Olympiad), GPT-4o solved just 13% of problems, while o1 scored 83%, placing it among the top 500 students nationally. On PhD-level science benchmarks in physics, chemistry, and biology, o1 not only outperformed GPT-4o but also surpassed human PhD experts, becoming the first model to do so. 
In coding, o1 ranks in the 89th percentile in Codeforces challenges, far exceeding GPT-4o&#39;s 11th percentile performance.</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/ada7ed56-489a-4af8-bfd5-1f0ad78f7fd9/image.png?t=1726185167"/><div class="image__source"><span class="image__source_text"><p>Source: OpenAI - Learning to Reason with LLMs</p></span></div></div><p class="paragraph" style="text-align:left;"><b>Why Reasoning is a Big Unlock for LLMs</b></p><p class="paragraph" style="text-align:left;">Reasoning is the next frontier for LLMs, enabling them to tackle more complex, real-world problems that require logic, planning, and decision-making over time. While previous models like GPT-4o were great at generating coherent responses, they lacked the depth to break down intricate problems, reconsider approaches, and learn from mistakes in real-time. o1 changes this by leveraging reinforcement learning to refine its thinking process—similar to how humans problem-solve by rethinking steps and adapting strategies when things don’t work.</p><p class="paragraph" style="text-align:left;">This unlock means LLMs like o1 can be used in more sophisticated scenarios, from annotating genomic data in healthcare research to solving complicated quantum physics equations or optimizing complex code workflows. 
With reasoning, LLMs transition from being assistants that generate text to tools capable of deep analytical tasks, allowing them to rival human experts in specialized domains.</p><p class="paragraph" style="text-align:left;">o1-preview and the more efficient o1-mini are now available in ChatGPT, with plans for further updates to enhance these reasoning models’ functionality across broader applications.[<a class="link" href="https://openai.com/index/learning-to-reason-with-llms/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=huge-week-in-ai-openai-launches-strawberry-and-google-s-real-time-ai-gaming-engine-9-12-24" target="_blank" rel="noopener noreferrer nofollow">OpenAI Launches O1 Model Series</a>] Share this story by email</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="google-launches-game-n-gen-a-ground"><a class="link" href="https://gamengen.github.io/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=huge-week-in-ai-openai-launches-strawberry-and-google-s-real-time-ai-gaming-engine-9-12-24" target="_blank" rel="noopener noreferrer nofollow">Google launches GameNGen, a groundbreaking game engine using AI</a></h3><p class="paragraph" style="text-align:left;">In a significant development, Google has unveiled GameNGen, a groundbreaking game engine that leverages advanced neural models to recreate classic games like DOOM in real-time. This innovative technology allows for high-quality, interactive gameplay at over 20 frames per second on a single TPU. GameNGen&#39;s unique training process involves a reinforcement learning agent that learns to play the game, followed by a diffusion model that predicts the next frame based on previous actions. This could revolutionize gaming by seamlessly blending real and simulated experiences. We highly recommend checking out the demo which is almost indistinguishable from the original game. 
[<a class="link" href="https://gamengen.github.io/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=huge-week-in-ai-openai-launches-strawberry-and-google-s-real-time-ai-gaming-engine-9-12-24" target="_blank" rel="noopener noreferrer nofollow">GameNGen</a>]</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/50cd7f07-397c-4471-93ec-72c2c8eb235f/image.png?t=1726184939"/></div><hr class="content_break"><h3 class="heading" style="text-align:left;" id="magic-ltm-2-mini-launches-with-unpr"><a class="link" href="https://magic.dev/blog/100m-token-context-windows?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=huge-week-in-ai-openai-launches-strawberry-and-google-s-real-time-ai-gaming-engine-9-12-24" target="_blank" rel="noopener noreferrer nofollow">Magic LTM-2-Mini launches with unprecedented 100 million token capacity</a></h3><p class="paragraph" style="text-align:left;">In a significant advancement, Magic&#39;s LTM-2-Mini model now supports a remarkable 100 million tokens, equivalent to 10 million lines of code or 750 novels. CEO Eric Steinberger also announced a $320 million funding round aimed at realizing their vision of autonomous AI while acknowledging the considerable challenges ahead.</p><p class="paragraph" style="text-align:left;">The best example of why this is relevant is arguably in software development. Imagine a developer working with a massive codebase that includes millions of lines of code, along with various libraries and documentation. A traditional AI coding assistant might struggle to keep track of all this information, leading to mistakes, poor troubleshooting, or incomplete solutions. However, Magic&#39;s AI, with its ability to handle ultra-long contexts, can keep all of this information in mind at once. 
This means it can accurately suggest code improvements, debug complex issues, or even generate new code by understanding the entire context. For instance, if the AI needs to fix a bug, it can consider the entire codebase, identify the exact spot where the issue occurs, and suggest a precise fix, all without losing track of other important details. This makes the AI a powerful tool for developers working on large, complex projects. [<a class="link" href="https://magic.dev/blog/100m-token-context-windows?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=huge-week-in-ai-openai-launches-strawberry-and-google-s-real-time-ai-gaming-engine-9-12-24" target="_blank" rel="noopener noreferrer nofollow">100M Token Context Windows</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="study-shows-that-generative-ai-tool"><a class="link" href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4945566&utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=huge-week-in-ai-openai-launches-strawberry-and-google-s-real-time-ai-gaming-engine-9-12-24" target="_blank" rel="noopener noreferrer nofollow">Study shows that generative AI tools boost software developer productivity by over 26%</a></h3><p class="paragraph" style="text-align:left;">A recent study indicates that GitHub Copilot can boost software developer productivity by over 26%. However, a significant portion of developers, 30 to 40 percent, have yet to try the AI tool, underscoring the need to consider individual preferences in workplace technology adoption. The productivity gains are most pronounced among less experienced developers, who achieve a remarkable 39% increase in output, while their more seasoned counterparts see only marginal improvements. 
[<a class="link" href="https://ssrn.com?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=huge-week-in-ai-openai-launches-strawberry-and-google-s-real-time-ai-gaming-engine-9-12-24" target="_blank" rel="noopener noreferrer nofollow">ssrn.com</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="replit-launches-replit-agent-to-sim"><a class="link" href="https://twitter.com/amasad/status/1831730911685308857?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=huge-week-in-ai-openai-launches-strawberry-and-google-s-real-time-ai-gaming-engine-9-12-24" target="_blank" rel="noopener noreferrer nofollow">Replit launches Replit Agent to simplify software development tasks</a></h3><p class="paragraph" style="text-align:left;">Replit has launched Replit Agent in early access, which aims to automate software development. Users are praising its ability to simplify the setup and deployment of applications, potentially transforming how software is created and launched. The Replit Agent not only streamlines the coding process but also integrates fully featured development and production environments, allowing users to build and deploy applications seamlessly from a single platform. Early testers are already expressing excitement about the potential of Replit Agent to enable the creation of entire web apps directly from mobile devices, marking a significant leap in accessibility for developers on the go. 
[<a class="link" href="https://twitter.com/amasad/status/1831730911685308857?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=huge-week-in-ai-openai-launches-strawberry-and-google-s-real-time-ai-gaming-engine-9-12-24" target="_blank" rel="noopener noreferrer nofollow">via @amasad</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="cursor-also-accelerates-app-develop"><a class="link" href="https://twitter.com/mckaywrigley/status/1831429674582602198?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=huge-week-in-ai-openai-launches-strawberry-and-google-s-real-time-ai-gaming-engine-9-12-24" target="_blank" rel="noopener noreferrer nofollow">Cursor also accelerates app development, replicating complex software in minutes</a></h3><p class="paragraph" style="text-align:left;">The main difference between Replit and Cursor is that Replit is a cloud-based, collaborative coding environment, ideal for quick prototyping and small projects without the need for local setup. Cursor, on the other hand, is a local code editor designed for more complex, long-term projects that require scalability, better performance, and deeper integration with your machine&#39;s development environment. One Cursor user created a Perplexity clone in just eight minutes and fewer than 14 interactions. This innovation excites users, as features that previously required significant resources to develop can now potentially be completed in a fraction of the time. 
[<a class="link" href="https://twitter.com/mckaywrigley/status/1831429674582602198?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=huge-week-in-ai-openai-launches-strawberry-and-google-s-real-time-ai-gaming-engine-9-12-24" target="_blank" rel="noopener noreferrer nofollow">via @mckaywrigley</a>]</p><p class="paragraph" style="text-align:left;">__</p><p class="paragraph" style="text-align:left;"><span style="color:rgb(29, 28, 29);font-family:Slack-Lato, Slack-Fractions, appleLogo, sans-serif;font-size:15px;">The views and opinions expressed in this newsletter are those of the individual authors and do not necessarily reflect the official policy or position of Headline. All content is intended for informational purposes only and should not be construed as professional advice. Headline disclaims any responsibility for the accuracy, completeness, or reliability of the information presented.</span></p></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=82cbc6c2-33ba-4f54-814e-c3fbf9d21b11&utm_medium=post_rss&utm_source=headline_edit">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>The AI Scientist, Another Step Closer to AGI?</title>
  <description>Sakana AI, Grok, OpenAI, Anthropic</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/fdc8f28f-16d7-4185-a98a-0eed5cee0831/An_isometric_retrofu.jpg" length="136588" type="image/jpeg"/>
  <link>https://edit.headline.com/p/ai-ai-scientist-another-step-closer-agi-81624</link>
  <guid isPermaLink="true">https://edit.headline.com/p/ai-ai-scientist-another-step-closer-agi-81624</guid>
  <pubDate>Fri, 16 Aug 2024 21:25:11 +0000</pubDate>
  <atom:published>2024-08-16T21:25:11Z</atom:published>
    <dc:creator>Sasha Krecinic</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #c7bab0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #161618; font-family: 'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#fcf0e8; }
  .bh__table_header p { color: #161618; font-family:'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><p class="paragraph" style="text-align:left;"></p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/fdc8f28f-16d7-4185-a98a-0eed5cee0831/An_isometric_retrofu.jpg?t=1724172562"/></div><p class="paragraph" style="text-align:left;">The research community is closely monitoring the advancements needed to unlock AGI (Artificial General Intelligence). Today, we may be one step closer with the latest research from <a class="link" href="https://Sakana.ai?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=the-ai-scientist-another-step-closer-to-agi" target="_blank" rel="noopener noreferrer nofollow">Sakana.ai</a>, which outlines how they have automated the scientific research lifecycle. Additionally, we see notable progress in autonomous software engineering AI agents, demonstrated by the team at Cosine. Meanwhile, the latest chatbot rankings highlight a highly competitive landscape in terms of speed, price, and capability.</p><p class="paragraph" style="text-align:left;">We’ve also launched a podcast covering some of the breaking news stories. If you like to consume content in video format, subscribe to our YouTube channel! 
<a class="link" href="https://www.youtube.com/watch?v=2nd_k4Lj080&utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=the-ai-scientist-another-step-closer-to-agi" target="_blank" rel="noopener noreferrer nofollow">Check out this week’s episode</a> where we cover OpenAI Strawberry, Google’s Pingpong robot, and the pricing cuts from OpenAI and Google.</p><p class="paragraph" style="text-align:left;">— Sasha Krecinic</p><div class="image"><a class="image__link" href="https://www.youtube.com/watch?v=2nd_k4Lj080&utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=the-ai-scientist-another-step-closer-to-agi" rel="noopener" target="_blank"><img alt="" class="image__image" style="" src="https://lh7-rt.googleusercontent.com/docsz/AD_4nXcJqntbkee7akNuBlTc0ylwq1Xszk66pF00hqmwErkcBQ8rlCGRBLRh61TrNuIU9TB_PNoR_Xo2QatiOJiQIln6W92FyPQkS7A-bB-qMf0wqcht_e5bh02hoLLjifxRjsNYFNgBLUmquUiY8ZnOecPRkag?key=rEkcOMhtlAqm3SJL2aKClw"/></a></div><hr class="content_break"><h3 class="heading" style="text-align:left;" id="sakana-ai-launches-the-ai-scientist"><a class="link" href="https://arxiv.org/abs/2408.06292?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=the-ai-scientist-another-step-closer-to-agi" target="_blank" rel="noopener noreferrer nofollow">Sakana AI launches The AI Scientist for automated scientific research</a></h3><p class="paragraph" style="text-align:left;">Sakana AI has launched The AI Scientist, an AI system that automates scientific research and claims it can manage the entire machine learning research lifecycle. Developed with researchers from the University of Oxford and the University of British Columbia, the system has produced papers in areas such as language modeling and diffusion and is open-sourced. 
<span style="color:rgba(0, 0, 0, 0.9);font-family:-apple-system, system-ui, system-ui, Segoe UI, Roboto, Helvetica Neue, Fira Sans, Ubuntu, Oxygen, Oxygen Sans, Cantarell, Droid Sans, Apple Color Emoji, Segoe UI Emoji, Segoe UI Emoji, Segoe UI Symbol, Lucida Grande, Helvetica, Arial, sans-serif;font-size:16px;">Many in the research community see this as a potential flywheel for AI systems to unlock self-improvement.</span> [<a class="link" href="https://arxiv.org/abs/2408.06292?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=the-ai-scientist-another-step-closer-to-agi" target="_blank" rel="noopener noreferrer nofollow">The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="cosine-ai-claims-to-have-built-the-"><a class="link" href="https://twitter.com/AlistairPullen/status/1822981361608888619?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=the-ai-scientist-another-step-closer-to-agi" target="_blank" rel="noopener noreferrer nofollow">Cosine AI claims to have built the most capable AI software engineer</a></h3><p class="paragraph" style="text-align:left;">Cosine AI says it has built an AI software engineer that scored 30.08% on the SWE-Bench benchmark, surpassing Amazon and Cognition. The model is designed to mimic human software engineering behavior and claims to perform at 50% on the easier SWE-Bench Lite benchmark. 
[<a class="link" href="https://twitter.com/AlistairPullen/status/1822981361608888619?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=the-ai-scientist-another-step-closer-to-agi" target="_blank" rel="noopener noreferrer nofollow">via @AlistairPullen</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="gpt-4-o-reclaims-top-spot-and-grok-"><a class="link" href="http://leaderboard.lmsys.org/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=the-ai-scientist-another-step-closer-to-agi" target="_blank" rel="noopener noreferrer nofollow">GPT-4o reclaims top spot and Grok 2 ranks #3 on Chatbot Arena leaderboard</a></h3><p class="paragraph" style="text-align:left;">OpenAI&#39;s ChatGPT-4o has taken the lead in the Chatbot Arena with a score of 1314, surpassing Google&#39;s Gemini-1.5-Pro-Exp after over 11,000 community votes under the masked title of &quot;anonymous chatbot&quot;. The model ranks highly in Math, Coding, and Instruction-Following, with a notable improvement in coding, scoring over 30 points higher than its predecessor. Meanwhile, Grok 2, a recent addition, reached third place overall, ranking #2 in Coding, #2 in Math, and #4 in Hard Prompts. Super impressive results from the newcomer! 
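Arena scores are Elo ratings, so the gap between two models' scores maps directly to an expected head-to-head win rate. A minimal sketch of that conversion (the 1286 rating below is illustrative):

```python
def elo_win_probability(rating_a, rating_b):
    """Expected head-to-head win rate of A over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

# A 28-point gap is only about a 54% expected win rate, so "taking the
# lead" on the leaderboard means winning narrowly, but consistently.
print(round(elo_win_probability(1314, 1286), 3))  # 0.54
```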
[<a class="link" href="https://lmsys.org?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=the-ai-scientist-another-step-closer-to-agi" target="_blank" rel="noopener noreferrer nofollow">lmsys.org</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="anthropic-unveils-prompt-caching-90"><a class="link" href="https://www.anthropic.com/news/prompt-caching?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=the-ai-scientist-another-step-closer-to-agi" target="_blank" rel="noopener noreferrer nofollow">Anthropic Unveils Prompt Caching: 90% Cost Savings, 85% Faster AI Responses</a></h3><p class="paragraph" style="text-align:left;">Anthropic has introduced a prompt caching feature for its Claude AI models, now available in public beta, which allows developers to cache frequently used contexts between API calls. This feature reduces costs by up to 90% and improves latency by up to 85%, making it highly efficient for use cases like conversational agents, coding assistants, and large document processing. The pricing model involves a slightly higher cost for caching inputs but offers significantly cheaper access to cached content, with early adopters like Notion already seeing substantial benefits. 
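Mechanically, caching is opt-in per content block: you mark a large, stable prefix (such as a long system context) with a `cache_control` field, and repeat requests reuse it at the discounted cache-read rate. A sketch of the request shape based on the beta as announced (model name and prompt contents are illustrative; at launch the feature also required an `anthropic-beta: prompt-caching-2024-07-31` header):

```python
LONG_DOCUMENT = "<full text of a large, rarely changing document>"

# The large shared prefix is marked with cache_control so follow-up
# requests hit the cache instead of paying the full input-token price.
request = {
    "model": "claude-3-5-sonnet-20240620",
    "max_tokens": 512,
    "system": [
        {"type": "text", "text": "Answer questions about the document."},
        {
            "type": "text",
            "text": LONG_DOCUMENT,
            "cache_control": {"type": "ephemeral"},
        },
    ],
    "messages": [{"role": "user", "content": "Summarize section 2."}],
}
# Passed as client.messages.create(**request) with the Anthropic SDK;
# only the trailing user message changes between calls.
```

The savings come precisely from this split between a stable prefix and a short varying suffix, which is why chat agents and document Q&A benefit most.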
[<a class="link" href="https://www.anthropic.com/news/prompt-caching?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=the-ai-scientist-another-step-closer-to-agi" target="_blank" rel="noopener noreferrer nofollow">Prompt caching with Claude</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="gemini-makes-your-mobile-device-a-p"><a class="link" href="https://dpmd.ai/46RToL9?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=the-ai-scientist-another-step-closer-to-agi" target="_blank" rel="noopener noreferrer nofollow">Gemini Makes Your Mobile Device a Powerful AI Assistant</a></h3><p class="paragraph" style="text-align:left;">Google DeepMind has introduced &quot;Gemini Live,&quot; a new feature that enhances conversational AI interactions on Android devices. Available to Gemini Advanced subscribers, Gemini Live allows for more natural conversations with the ability to brainstorm ideas, interrupt to ask questions, and pause chats to resume later. This feature is now rolling out in English, making mobile AI interactions smoother and more responsive, with significant potential for accessibility and productivity improvements. [<a class="link" href="https://dpmd.ai/46RToL9?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=the-ai-scientist-another-step-closer-to-agi" target="_blank" rel="noopener noreferrer nofollow">Gemini makes your mobile device a powerful AI assistant</a>]</p></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=55484d94-47f8-41f8-b2ff-dd23e105e324&utm_medium=post_rss&utm_source=headline_edit">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>Gemini Leads LLM Leaderboard, Character.AI Founders Return to Google, Groq Series D and More (8.8.24)</title>
  <description>Google, Gemini, Character.ai, Groq, AI Safety, Argentina</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/a939baee-594c-4d57-8d4d-f2a6d74d94d6/An_isometric_retrofu__1_.jpg" length="190909" type="image/jpeg"/>
  <link>https://edit.headline.com/p/ai-gemini-leads-llm-leaderboard-characterai-founders-return-google-groq-series-d-8524</link>
  <guid isPermaLink="true">https://edit.headline.com/p/ai-gemini-leads-llm-leaderboard-characterai-founders-return-google-groq-series-d-8524</guid>
  <pubDate>Fri, 09 Aug 2024 00:54:17 +0000</pubDate>
  <atom:published>2024-08-09T00:54:17Z</atom:published>
    <dc:creator>Sasha Krecinic</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #c7bab0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #161618; font-family: 'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#fcf0e8; }
  .bh__table_header p { color: #161618; font-family:'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><p class="paragraph" style="text-align:left;"></p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/a939baee-594c-4d57-8d4d-f2a6d74d94d6/An_isometric_retrofu__1_.jpg?t=1724172898"/></div><p class="paragraph" style="text-align:left;">Google Gemini 1.5 Pro is making waves as it outperforms GPT-4o and Claude-3.5 in the LMSYS Chatbot Arena. Many predicted it was only a matter of time before Google caught up, and it&#39;s the first time we&#39;ve seen them take the lead, somewhat reversing sentiments that Google was getting &#39;left behind&#39;. We also see hints of a broader strategy towards higher personalization with Google&#39;s partnership with <a class="link" href="https://Character.AI?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=gemini-leads-llm-leaderboard-character-ai-founders-return-to-google-groq-series-d-and-more-8-8-24" target="_blank" rel="noopener noreferrer nofollow">Character.AI</a>, known for its expertise in creating advanced conversational models and personalized AI interactions. In other news, Groq has announced new funding, and Argentina is pursuing AI applications reminiscent of sci-fi, drawing comparisons to the 2002 movie Minority Report.</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="google-gemini-15-pro-tops-chatbot-a"><a class="link" href="https://chat.lmsys.org/?leaderboard=&utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=gemini-leads-llm-leaderboard-character-ai-founders-return-to-google-groq-series-d-and-more-8-8-24" target="_blank" rel="noopener noreferrer nofollow">Google Gemini 1.5 Pro tops Chatbot Arena</a></h3><p class="paragraph" style="text-align:left;">Google&#39;s Gemini 1.5 Pro (Experimental Version 0801) has claimed the top spot in Chatbot Arena, surpassing GPT-4o and Claude-3.5 with a score of 1300. 
The score of 1300 for Google&#39;s Gemini 1.5 Pro in Chatbot Arena reflects its Elo rating, a ranking system originally developed for chess, and indicates a win percentage of 54% against GPT-4o and 59% against Claude-3.5 Sonnet, showcasing its superior performance in head-to-head comparisons. The model excels in multilingual tasks and technical areas like Math and Instruction-Following, though it trails Claude 3.5 Sonnet and GPT-4o in domains like Coding and Hard Prompts. Google Cloud says this experimental version is now available for early testing and feedback in Google AI Studio and the Gemini API. [<a class="link" href="https://chat.lmsys.org/?leaderboard=&utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=gemini-leads-llm-leaderboard-character-ai-founders-return-to-google-groq-series-d-and-more-8-8-24" target="_blank" rel="noopener noreferrer nofollow">For the first time, Google Gemini has claimed the #1 spot, surpassing GPT-4o/Claude-3.5 with an impressive score of 1300 (!), and also achieving #1 on our Vision Leaderboard.</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="character-ai-gets-funding-from-goog"><a class="link" href="https://Character.AI?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=gemini-leads-llm-leaderboard-character-ai-founders-return-to-google-groq-series-d-and-more-8-8-24" target="_blank" rel="noopener noreferrer nofollow">Character.AI</a><a class="link" href="https://n-4ycu2pnnxihsfrjsogoakm2f2q5acpq5n7eoqrq-1lu-script.googleusercontent.com/userCodeAppPanel?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=gemini-leads-llm-leaderboard-character-ai-founders-return-to-google-groq-series-d-and-more-8-8-24" target="_blank" rel="noopener noreferrer nofollow"> Gets Funding from Google and Co-Founders Return to Google</a></h3><p class="paragraph" style="text-align:left;">Noam Shazeer and Daniel De Freitas, co-founders of
<a class="link" href="https://Character.AI?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=gemini-leads-llm-leaderboard-character-ai-founders-return-to-google-groq-series-d-and-more-8-8-24" target="_blank" rel="noopener noreferrer nofollow">Character.AI</a>, are returning to Google, with Shazeer joining the DeepMind research team, while <a class="link" href="https://Character.AI?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=gemini-leads-llm-leaderboard-character-ai-founders-return-to-google-groq-series-d-and-more-8-8-24" target="_blank" rel="noopener noreferrer nofollow">Character.AI</a>’s general counsel Dominic Perella will serve as interim CEO. <a class="link" href="https://Character.ai?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=gemini-leads-llm-leaderboard-character-ai-founders-return-to-google-groq-series-d-and-more-8-8-24" target="_blank" rel="noopener noreferrer nofollow">Character.ai</a> is a platform where users can create and interact with AI-driven virtual characters that adapt to individual inputs, tailoring their responses and behaviors for personalized, engaging interactions across various applications. According to a TechCrunch interview, Google has signed a non-exclusive agreement to use <a class="link" href="https://Character.AI?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=gemini-leads-llm-leaderboard-character-ai-founders-return-to-google-groq-series-d-and-more-8-8-24" target="_blank" rel="noopener noreferrer nofollow">Character.AI</a>’s technology, which Shazeer says will provide funding for <a class="link" href="https://Character.AI?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=gemini-leads-llm-leaderboard-character-ai-founders-return-to-google-groq-series-d-and-more-8-8-24" target="_blank" rel="noopener noreferrer nofollow">Character.AI</a>’s continued growth and focus on building personalized AI products. 
[<a class="link" href="https://n-4ycu2pnnxihsfrjsogoakm2f2q5acpq5n7eoqrq-1lu-script.googleusercontent.com/userCodeAppPanel?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=gemini-leads-llm-leaderboard-character-ai-founders-return-to-google-groq-series-d-and-more-8-8-24" target="_blank" rel="noopener noreferrer nofollow">Exclusive: </a><a class="link" href="https://Character.AI?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=gemini-leads-llm-leaderboard-character-ai-founders-return-to-google-groq-series-d-and-more-8-8-24" target="_blank" rel="noopener noreferrer nofollow">Character.AI</a><a class="link" href="https://n-4ycu2pnnxihsfrjsogoakm2f2q5acpq5n7eoqrq-1lu-script.googleusercontent.com/userCodeAppPanel?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=gemini-leads-llm-leaderboard-character-ai-founders-return-to-google-groq-series-d-and-more-8-8-24" target="_blank" rel="noopener noreferrer nofollow"> CEO Noam Shazeer returns to Google as the tech giant invests in the AI company</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="groq-raises-640-million-valuation-h"><a class="link" href="https://www.reuters.com/technology/artificial-intelligence/ai-chip-startup-groq-valued-28-bln-after-latest-funding-round-2024-08-05/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=gemini-leads-llm-leaderboard-character-ai-founders-return-to-google-groq-series-d-and-more-8-8-24" target="_blank" rel="noopener noreferrer nofollow">Groq raises $640 million, valuation hits $2.8 billion</a></h3><p class="paragraph" style="text-align:left;">Groq has raised $640 million in Series D funding, led by Cisco Investments, Samsung Catalyst Fund, and BlackRock Private Equity Partners, bringing its valuation to $2.8 billion. Groq&#39;s chips are designed with a unique architecture called the Tensor Streaming Processor (TSP). 
This architecture processes data in a highly parallel manner, allowing it to handle multiple tasks simultaneously with the benefit of both speed and efficiency. Unlike traditional GPUs that are designed for a wide range of tasks, Groq&#39;s TSP is specifically optimized for AI workloads, making it exceptionally fast and energy-efficient for inference and tasks like running large language models. The company says its chips, which now run Meta Platforms&#39; LLaMA, are four times faster, five times cheaper, and three times more energy-efficient than Nvidia&#39;s GPUs for AI inference tasks. [<a class="link" href="https://www.reuters.com/technology/artificial-intelligence/ai-chip-startup-groq-valued-28-bln-after-latest-funding-round-2024-08-05/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=gemini-leads-llm-leaderboard-character-ai-founders-return-to-google-groq-series-d-and-more-8-8-24" target="_blank" rel="noopener noreferrer nofollow">AI chip startup Groq valued at $2.8 bln after latest funding round</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="argentinas-president-launches-ai-se"><a class="link" href="https://www.theguardian.com/world/article/2024/aug/01/argentina-ai-predicting-future-crimes-citizen-rights?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=gemini-leads-llm-leaderboard-character-ai-founders-return-to-google-groq-series-d-and-more-8-8-24" target="_blank" rel="noopener noreferrer nofollow">Argentina&#39;s President Launches AI Security Unit</a></h3><p class="paragraph" style="text-align:left;">Argentina&#39;s President Javier Milei has launched the AI Applied to Security Unit, raising concerns about potential human rights violations. The Ministry of Security says the unit will use AI to predict crimes and monitor social media, while human rights groups fear it could lead to over-surveillance and profiling. 
[<a class="link" href="https://www.theguardian.com/world/article/2024/aug/01/argentina-ai-predicting-future-crimes-citizen-rights?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=gemini-leads-llm-leaderboard-character-ai-founders-return-to-google-groq-series-d-and-more-8-8-24" target="_blank" rel="noopener noreferrer nofollow">Argentina will use AI to ‘predict future crimes’ but experts worry for citizens’ rights</a>]</p></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=cde0144d-9f8c-4b58-bd8e-4d8c5d7fcf4f&utm_medium=post_rss&utm_source=headline_edit">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>OpenAI launches Search, Meta&#39;s ‘Segment Anything Model’, and Llama 3.1 Climbs the Leaderboard (7.31.24)</title>
  <description>OpenAI, Meta, Search, SAM</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/dd429db8-aa10-4f56-a838-d63cfab69749/An_isometric_retrofu.jpg" length="96137" type="image/jpeg"/>
  <link>https://edit.headline.com/p/ai-openai-launches-search-metas-segment-anything-model-llama-31-climbs-leaderboard-73024</link>
  <guid isPermaLink="true">https://edit.headline.com/p/ai-openai-launches-search-metas-segment-anything-model-llama-31-climbs-leaderboard-73024</guid>
  <pubDate>Thu, 01 Aug 2024 00:40:00 +0000</pubDate>
  <atom:published>2024-08-01T00:40:00Z</atom:published>
    <dc:creator>Sasha Krecinic</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #c7bab0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #161618; font-family: 'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#fcf0e8; }
  .bh__table_header p { color: #161618; font-family:'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/dd429db8-aa10-4f56-a838-d63cfab69749/An_isometric_retrofu.jpg?t=1724181654"/></div><p class="paragraph" style="text-align:left;">This week, we saw two new product launches that advance search and video processing capabilities. OpenAI&#39;s Search product has finally been released in closed beta after many months of rumors, and Meta&#39;s SAM 2 (Segment Anything Model 2) promises to redefine real-time object segmentation and tracking in images and videos. </p><p class="paragraph" style="text-align:left;">– Sasha Krecinic</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="open-ai-launches-search-gpt-prototy"><a class="link" href="https://openai.com/index/searchgpt-prototype/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-launches-search-meta-s-segment-anything-model-and-llama-3-1-climbs-the-leaderboard-7-31-24" target="_blank" rel="noopener noreferrer nofollow">OpenAI launches SearchGPT prototype</a></h3><p class="paragraph" style="text-align:left;">OpenAI is testing a new prototype called SearchGPT, designed to provide fast and timely answers with clear and relevant sources. According to a blog post, the prototype is being launched with a small group of users for feedback, with plans to integrate it into ChatGPT to enhance real-time search capabilities. The launch comes just a few weeks after OpenAI&#39;s acquisition of Rockset, and follows weeks of swirling rumors that a search product was on the way. 
A notable OpenAI employee, Noam Brown, also commented that this is &quot;another step toward general AI personal assistants for all.&quot; [<a class="link" href="https://openai.com?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-launches-search-meta-s-segment-anything-model-and-llama-3-1-climbs-the-leaderboard-7-31-24" target="_blank" rel="noopener noreferrer nofollow">openai.com</a>] Share this story by email</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="metas-llama-31405-b-ranks-third-in-"><a class="link" href="http://leaderboard.lmsys.org/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-launches-search-meta-s-segment-anything-model-and-llama-3-1-climbs-the-leaderboard-7-31-24" target="_blank" rel="noopener noreferrer nofollow">Meta&#39;s Llama-3.1-405B Ranks Third in AI Leaderboard</a></h3><p class="paragraph" style="text-align:left;">Meta&#39;s Llama-3.1-405B has reached #3 on the Overall Arena leaderboard, marking the first time an open model has made the top 3. The model was tested over the past week, receiving over 10K community votes. 
[<a class="link" href="https://lmsys.org?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-launches-search-meta-s-segment-anything-model-and-llama-3-1-climbs-the-leaderboard-7-31-24" target="_blank" rel="noopener noreferrer nofollow">lmsys.org</a>] Share this story by email</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="meta-launches-sam-2-for-realtime-ob"><a class="link" href="https://go.fb.me/yck7bu?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-launches-search-meta-s-segment-anything-model-and-llama-3-1-climbs-the-leaderboard-7-31-24" target="_blank" rel="noopener noreferrer nofollow">Meta launches SAM 2 for real-time object segmentation</a></h3><p class="paragraph" style="text-align:left;">Meta has launched the Segment Anything Model 2 (SAM 2), a groundbreaking unified model for real-time, promptable object segmentation in both images and videos. Following the success of SAM, SAM 2 offers state-of-the-art performance and introduces a novel &quot;memory attention&quot; feature that uses a transformer with memory across frames. It stores special &quot;object pointer&quot; tokens in a &quot;memory bank&quot; FIFO queue of recent and prompted frames. SAM 2 can segment any object in any video or image, even those it has not seen before, enabling a diverse range of use cases without custom adaptation. It is open source under the Apache 2.0 license and includes the SA-V dataset, containing approximately 51,000 real-world videos and more than 600,000 masklets. Meta provides a web demo, research paper, and datasets that are worth checking out! 
[<a class="link" href="https://fb.me?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-launches-search-meta-s-segment-anything-model-and-llama-3-1-climbs-the-leaderboard-7-31-24" target="_blank" rel="noopener noreferrer nofollow">fb.me</a>] Share this story by email</p></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=368225ac-7b5b-41e6-85e5-085a793397a4&utm_medium=post_rss&utm_source=headline_edit">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>A Big Week for AI: Meta&#39;s New SOTA Model, UBI Study, GPT-4o Mini + Free Finetuning, and Voice Standards</title>
  <description>OpenAI, Meta, Daily, UBI, Sam Altman</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/97321f69-9dc4-4be6-a22a-faea0db5ff8c/An_isometric_retrofu__1_.jpg" length="123122" type="image/jpeg"/>
  <link>https://edit.headline.com/p/new-postai-big-week-ai-metas-new-sota-model-ubi-study-gpt4o-mini-free-finetuning-voice-standards-726</link>
  <guid isPermaLink="true">https://edit.headline.com/p/new-postai-big-week-ai-metas-new-sota-model-ubi-study-gpt4o-mini-free-finetuning-voice-standards-726</guid>
  <pubDate>Fri, 26 Jul 2024 19:15:00 +0000</pubDate>
  <atom:published>2024-07-26T19:15:00Z</atom:published>
    <dc:creator>Sasha Krecinic</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #c7bab0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #161618; font-family: 'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#fcf0e8; }
  .bh__table_header p { color: #161618; font-family:'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><h1 class="heading" style="text-align:start;" id="heading-1"></h1><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/97321f69-9dc4-4be6-a22a-faea0db5ff8c/An_isometric_retrofu__1_.jpg?t=1724181799"/></div><p class="paragraph" style="text-align:start;">It has been another big week in AI. Most notably, there is a paradigm shift in the competition between open and closed-source models. Meta&#39;s latest release, Llama 3.1 405B, sets a new bar for open-source AI performance with enhanced reasoning and multimodal capabilities. The release is state-of-the-art for the open-source community, radically improving the toolkits available to AI startups and developers. But that&#39;s not all: today&#39;s edition also covers the broader implications of advancing AI, with the release of results from the Sam Altman-backed UBI (universal basic income) study. OpenAI has introduced GPT-4o mini, which is reportedly smarter and 60% cheaper than GPT-3.5 Turbo, and has also launched free fine-tuning for GPT-4o mini until September. We also saw Daily release an open standard for Real-time Voice and Video Inference (RTVI-AI). These developments are significant because they make cutting-edge AI technology more accessible and affordable, while the UBI results offer an early look at the social implications. Based on today’s updates, it isn&#39;t crazy to imagine a world where your next dentist&#39;s appointment is booked by speaking to an AI agent. 
People could work less and potentially have a steady stream of money coming in each month.</p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="meta-launches-sota-open-source-mode"><a class="link" href="https://go.fb.me/vq04tr?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-big-week-for-ai-meta-s-new-sota-model-ubi-study-gpt-4o-mini-free-finetuning-and-voice-standards" target="_blank" rel="noopener noreferrer nofollow">Meta launches SOTA Open Source Model</a></h3><p class="paragraph" style="text-align:start;">Meta has introduced Llama 3.1, a new set of foundation models designed to rival leading closed-source models in various tasks. These models, including a 405 billion parameter version, boast enhanced reasoning capabilities and a larger 128,000-token context window, along with multimodal features for image and video processing. Key to the model&#39;s size and performance are its improved data quality and scale, with training conducted on a diverse and high-quality dataset of 15 trillion multilingual tokens. Meta has made these models publicly available, including both pre-trained and post-trained versions, to foster innovation in the research community and promote the responsible development of artificial general intelligence (AGI). 
[<a class="link" href="https://fb.me?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-big-week-for-ai-meta-s-new-sota-model-ubi-study-gpt-4o-mini-free-finetuning-and-voice-standards" target="_blank" rel="noopener noreferrer nofollow">fb.me</a>] Share this story <a class="link" href="mailto:?subject=%20.%20.%20Meta%20launches%20SOTA%20Open%20Source%20Model&body=%0A%0A%0AMeta%20launches%20SOTA%20Open%20Source%20Model%0AFirst%20impacted%3A%20AI%20researchers%2C%20Data%20scientists%20%2F%2F%20Time%20to%20impact%3A%20%0A%0AMeta%20has%20introduced%20Llama%203.1%2C%20a%20new%20set%20of%20foundation%20models%20designed%20to%20rival%20leading%20closed-source%20models%20in%20various%20tasks.%20These%20models%2C%20including%20a%20405%20billion%20parameter%20version%2C%20boast%20enhanced%20reasoning%20capabilities%20and%20a%20larger%20128%2C000-token%20context%20window%2C%20along%20with%20multimodal%20features%20for%20image%20and%20video%20processing.%20Key%20to%20the%20model%27s%20size%20and%20performance%20are%20its%20improved%20data%20quality%20and%20scale%2C%20with%20training%20conducted%20on%20a%20diverse%20and%20high-quality%20dataset%20of%2015%20trillion%20multilingual%20tokens.%20Meta%20has%20made%20these%20models%20publicly%20available%2C%20including%20both%20pre-trained%20and%20post-trained%20versions%2C%20to%20foster%20innovation%20in%20the%20research%20community%20and%20promote%20the%20responsible%20development%20of%20artificial%20general%20intelligence%20(AGI).%0A%0A%20%5Bfb.me%20https%3A%2F%2Fgo.fb.me%2Fvq04tr%5D" target="_blank" rel="noopener noreferrer nofollow">by email</a></p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="open-ai-launches-gpt-4-o-mini-with-"><a class="link" href="https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-big-week-for-ai-meta-s-new-sota-model-ubi-study-gpt-4o-mini-free-finetuning-and-voice-standards" 
target="_blank" rel="noopener noreferrer nofollow">OpenAI Launches GPT-4o Mini with Free Training Tokens</a></h3><p class="paragraph" style="text-align:start;">OpenAI has launched GPT-4o mini, which it says is smarter and 60% cheaper than GPT-3.5 Turbo. GPT-4o mini excels in reasoning, math, coding, and multimodal tasks, outperforming GPT-3.5 Turbo and other small models on several key benchmarks. OpenAI also announced free fine-tuning for the model, with the first 2 million training tokens per day free until September 23! These smaller, cheaper models are important for highly repetitive tasks that don’t need a larger or more expensive model (and because they use significantly less power, they are much better for the wallet and the planet!) [<a class="link" href="https://openai.com?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-big-week-for-ai-meta-s-new-sota-model-ubi-study-gpt-4o-mini-free-finetuning-and-voice-standards" target="_blank" rel="noopener noreferrer nofollow">openai.com</a>] Share this story <a class="link" 
href="mailto:?subject=%20.%20.%20OpenAI%20Launches%20GPT-4o%20mini%20with%20Free%20Training%20Tokens&body=%0A%0A%0AOpenAI%20Launches%20GPT-4o%20mini%20with%20Free%20Training%20Tokens%0AFirst%20impacted%3A%20AI%20developers%2C%20tech%20startups%20%2F%2F%20Time%20to%20impact%3A%20%0A%0AOpenAI%20has%20launched%20GPT-4o%20mini%2C%20which%20they%20say%20is%20smarter%20and%2060%25%20cheaper%20than%20GPT-3.5%20Turbo.%20OpenAI%20has%20also%20launched%20fine-tuning%20for%20GPT-4o%20mini.%20GPT-4o%20mini%20excels%20in%20reasoning%2C%20math%2C%20coding%2C%20and%20multimodal%20tasks%2C%20outperforming%20GPT-3.5%20Turbo%20and%20other%20small%20models%20on%20several%20key%20benchmarks.%20OpenAI%20also%20mentioned%20in%20another%20release%20that%20they%20will%20be%20offering%20free%20fine-tuning%20for%20the%20model%20with%20the%20first%202%20million%20training%20tokens%20per%20day%20are%20free%20until%20September%2023!%20These%20super%20small%20and%20cheaper%20models%20are%20important%20for%20highly%20repetitive%20tasks%20that%20don%E2%80%99t%20need%20a%20larger%20or%20more%20expensive%20model%20(and%20because%20they%20use%20significantly%20less%20power%2C%20they%20are%20much%20better%20for%20the%20the%20wallet%20and%20the%20planet!)%0A%0A%20%5Bopenai.com%20https%3A%2F%2Fopenai.com%2Findex%2Fgpt-4o-mini-advancing-cost-efficient-intelligence%2F%5D" target="_blank" rel="noopener noreferrer nofollow">by email</a></p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="daily-launches-open-standard-for-re"><a class="link" href="https://demo.rtvi.ai/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-big-week-for-ai-meta-s-new-sota-model-ubi-study-gpt-4o-mini-free-finetuning-and-voice-standards" target="_blank" rel="noopener noreferrer nofollow">Daily Launches Open Standard for Real-time Voice and Video AI</a></h3><p class="paragraph" style="text-align:start;">Daily has launched an open standard for Real-time Voice and Video Inference (RTVI-AI) 
along with open-source JavaScript and React SDKs, with iOS and Android SDKs coming soon. According to the release, RTVI-AI defines how client applications communicate with inference services, enabling use cases like voice chat with LLMs, enterprise voice workflows, video avatars, voice-driven user interfaces, and high-framerate image generation. The demo leverages Llama 3.1 running on @GroqInc and has impressive 500ms voice-to-voice response times (which is comparable to real-life conversations!) and shows how far the frontier of tech for live voice agents has come in a short time. [<a class="link" href="https://github.com?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-big-week-for-ai-meta-s-new-sota-model-ubi-study-gpt-4o-mini-free-finetuning-and-voice-standards" target="_blank" rel="noopener noreferrer nofollow">github.com</a>] Share this story <a class="link" href="mailto:?subject=%20.%20.%20Daily%20Launches%20Open%20Standard%20for%20Real-time%20Voice%20and%20Video%20AI%20&body=%0A%0A%0ADaily%20Launches%20Open%20Standard%20for%20Real-time%20Voice%20and%20Video%20AI%20%0AFirst%20impacted%3A%20Developers%2C%20AI%20researchers%20%2F%2F%20Time%20to%20impact%3A%20%0A%0ADaily%20has%20launched%20an%20open%20standard%20for%20Real-time%20Voice%20and%20Video%20Inference%20(RTVI-AI)%20along%20with%20open-source%20JavaScript%20and%20React%20SDKs%2C%20with%20iOS%20and%20Android%20SDKs%20coming%20soon.%20According%20to%20the%20release%2C%20RTVI-AI%20defines%20how%20client%20applications%20communicate%20with%20inference%20services%2C%20enabling%20use%20cases%20like%20voice%20chat%20with%20LLMs%2C%20enterprise%20voice%20workflows%2C%20video%20avatars%2C%20voice-driven%20user%20interfaces%2C%20and%20high-framerate%20image%20generation.%20The%20demo%20leverages%20Llama%203.1%20running%20on%20%40GroqInc%20and%20has%20impressive%20500ms%20voice-to-voice%20response%20times%20(which%20is%20comparable%20to%20real-life%20conversations!)%20and%20shows%20how%20far%20the
%20frontier%20of%20tech%20for%20live%20voice%20agents%20has%20come%20in%20a%20short%20time.%20%0A%0A%20%5Bgithub.com%20https%3A%2F%2Fdemo.rtvi.ai%2F%5D" target="_blank" rel="noopener noreferrer nofollow">by email</a></p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="ubi-study-findings-released"><a class="link" href="https://www.openresearchlab.org/findings?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-big-week-for-ai-meta-s-new-sota-model-ubi-study-gpt-4o-mini-free-finetuning-and-voice-standards" target="_blank" rel="noopener noreferrer nofollow">UBI Study </a><a class="link" href="https://www.openresearchlab.org/findings?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-big-week-for-ai-meta-s-new-sota-model-ubi-study-gpt-4o-mini-free-finetuning-and-voice-standards" target="_blank" rel="noopener noreferrer nofollow">Findings Released</a></h3><p class="paragraph" style="text-align:start;">A study by OpenResearch with backing from Sam Altman examined the effects of giving $1,000 per month to low-income individuals. The study explored the impact on spending, agency, employment, health, and moving. The summary findings stated that: &quot;The program resulted in a 2.0 percentage point decrease in labor market participation for participants and a 1.3-1.4 hour per week reduction in labor hours, with participants’ partners reducing their hours worked by a comparable amount. The transfer generated the largest increases in time spent on leisure, as well as smaller increases in time spent in other activities such as transportation and finances. Despite asking detailed questions about amenities, we find no impact on quality of employment, and our confidence intervals can rule out even small improvements. We observe no significant effects on investments in human capital, though younger participants may pursue more formal education. 
Overall, our results suggest a moderate labor supply effect that does not appear offset by other productive activities.&quot; [<a class="link" href="https://openresearchlab.org?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-big-week-for-ai-meta-s-new-sota-model-ubi-study-gpt-4o-mini-free-finetuning-and-voice-standards" target="_blank" rel="noopener noreferrer nofollow">openresearchlab.org</a>] Share this story <a class="link" href="mailto:?subject=%20.%20.%20UBI%20Study%20Finds%20Reduced%20Labor%20Participation&body=%0A%0A%0AUBI%20Study%20Finds%20Reduced%20Labor%20Participation%0AFirst%20impacted%3A%20Policy%20makers%2C%20social%20scientists%20%2F%2F%20Time%20to%20impact%3A%20%0A%0AA%20study%20by%20OpenResearch%20with%20backing%20from%20Sam%20Altman%20examined%20the%20effects%20of%20giving%20%241%2C000%20per%20month%20to%20low-income%20individuals.%20The%20study%20explored%20the%20impact%20on%20spending%2C%20agency%2C%20employment%2C%20health%2C%20and%20moving.%20The%20summary%20findings%20stated%20that%3A%20%22The%20program%20resulted%20in%20a%202.0%20percentage%20point%20decrease%20in%20labor%20market%20participation%20for%20participants%20and%20a%201.3-1.4%20hour%20per%20week%20reduction%20in%20labor%20hours%2C%20with%20participants%E2%80%99%20partners%20reducing%20their%20hours%20worked%20by%20a%20comparable%20amount.%20The%20transfer%20generated%20the%20largest%20increases%20in%20time%20spent%20on%20leisure%2C%20as%20well%20as%20smaller%20increases%20in%20time%20spent%20in%20other%20activities%20such%20as%20transportation%20and%20finances.%20Despite%20asking%20detailed%20questions%20about%20amenities%2C%20we%20find%20no%20impact%20on%20quality%20of%20employment%2C%20and%20our%20confidence%20intervals%20can%20rule%20out%20even%20small%20improvements.%20We%20observe%20no%20significant%20effects%20on%20investments%20in%20human%20capital%2C%20though%20younger%20participants%20may%20pursue%20more%20formal%20education.%20Overall%2C%20our%20results%
20suggest%20a%20moderate%20labor%20supply%20effect%20that%20does%20not%20appear%20offset%20by%20other%20productive%20activities.%22%0A%0A%20%5Bopenresearchlab.org%20https%3A%2F%2Fwww.openresearchlab.org%2Ffindings%5D" target="_blank" rel="noopener noreferrer nofollow">by email</a></p></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=d4f35dbc-79bd-4f78-875c-4206cba79966&utm_medium=post_rss&utm_source=headline_edit">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>OpenAI &quot;On the Cusp&quot; of Level 2 AGI, Generalist Robotics Model, New TTT Architecture, and Better Tiny Local Models</title>
  <description>OpenAI, SkildAI, Meta, AGI, Test-Time Training</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/03c7a4b6-6a0b-45cf-a173-a8d18b2f4341/An_isometric_retrofu__2_.jpg" length="136208" type="image/jpeg"/>
  <link>https://edit.headline.com/p/ai-openai-cusp-level-2-agi-generalist-robotics-model-new-ttt-architecture-better-tiny-local-models-7</link>
  <guid isPermaLink="true">https://edit.headline.com/p/ai-openai-cusp-level-2-agi-generalist-robotics-model-new-ttt-architecture-better-tiny-local-models-7</guid>
  <pubDate>Wed, 17 Jul 2024 01:01:00 +0000</pubDate>
  <atom:published>2024-07-17T01:01:00Z</atom:published>
    <dc:creator>Sasha Krecinic</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #c7bab0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #161618; font-family: 'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#fcf0e8; }
  .bh__table_header p { color: #161618; font-family:'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/03c7a4b6-6a0b-45cf-a173-a8d18b2f4341/An_isometric_retrofu__2_.jpg?t=1724182114"/></div><p class="paragraph" style="text-align:left;">This week&#39;s AI developments make up for a slow week last week. We see notable commentary, research, and applications. First, OpenAI is allegedly making strides towards Level 2 AI, which they have labeled as &quot;Reasoners,&quot; and defined as performing human-level problem-solving tasks. Research on Test-Time Training (TTT) layers shows that models can adapt and improve in real-time, potentially outperforming traditional models in long-context tasks. Skild AI&#39;s recent funding underscores the buy-in and investment in generalist AI models to drive robotics forward, and finally, Meta&#39;s MobileLLM models demonstrate efficient and capable, on-device AI solutions that address limitations in mobile technology. Happy reading!</p><p class="paragraph" style="text-align:left;">--Sasha Krecinic</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="open-ai-on-the-cusp-of-level-2-agi"><a class="link" href="https://www.bloomberg.com/news/articles/2024-07-11/openai-sets-levels-to-track-progress-toward-superintelligent-ai?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-on-the-cusp-of-level-2-agi-generalist-robotics-model-new-ttt-architecture-and-better-tiny-local-models" target="_blank" rel="noopener noreferrer nofollow">OpenAI &quot;On the Cusp&quot; of Level 2 AGI</a></h3><p class="paragraph" style="text-align:left;">According to a recent Bloomberg article, OpenAI executives have said the company is “on the cusp” of Level 2 AGI in its five-tier AI progress tracking system. 
It is also rumored that OpenAI demonstrated GPT-4 with improved reasoning capabilities at a recent all-hands meeting. Level 2 involves AI systems performing problem-solving tasks at the level of a human with doctorate-level education, without using any tools. The full scale comprises the following stages of artificial intelligence: </p><p class="paragraph" style="text-align:left;">Level 1: Chatbots - AI with conversational language abilities </p><p class="paragraph" style="text-align:left;">Level 2: Reasoners - AI with human-level problem-solving capabilities </p><p class="paragraph" style="text-align:left;">Level 3: Agents - Systems that can take actions </p><p class="paragraph" style="text-align:left;">Level 4: Innovators - AI that can aid in invention </p><p class="paragraph" style="text-align:left;">Level 5: Organizations - AI that can perform the work of an entire organization </p><p class="paragraph" style="text-align:left;">[<a class="link" href="https://www.bloomberg.com/news/articles/2024-07-11/openai-sets-levels-to-track-progress-toward-superintelligent-ai?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-on-the-cusp-of-level-2-agi-generalist-robotics-model-new-ttt-architecture-and-better-tiny-local-models" target="_blank" rel="noopener noreferrer nofollow">OpenAI Scale Ranks Progress Toward ‘Human-Level’ Problem Solving</a>] Share this story by email</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="open-ai-researcher-comments-that-co"><a class="link" href="https://twitter.com/polynoamial/status/1810675986549428306?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-on-the-cusp-of-level-2-agi-generalist-robotics-model-new-ttt-architecture-and-better-tiny-local-models" target="_blank" rel="noopener noreferrer nofollow">OpenAI Researcher Comments That Company Focus is 
Still on Ambitious Research</a></h3><p class="paragraph" style="text-align:left;">OpenAI researcher Noam Brown, a specialist in AI reasoning, tweeted, &quot;When I joined @OpenAI a year ago, I feared ChatGPT&#39;s success might shift focus from long-term research to incremental product tweaks. But it quickly became clear that wasn&#39;t the case. @OpenAI excels at placing big bets on ambitious research directions driven by strong conviction. They remain committed to ambitious research despite the success of ChatGPT.&quot; This comment emphasizes that OpenAI continues to prioritize long-term research over incremental product tweaks. It also suggests that they are not solely measuring themselves against existing benchmarks but are focused on paradigm-shifting developments through research, such as Noam Brown&#39;s work. This stance contrasts with the messaging of the former alignment team, who recently departed OpenAI. [<a class="link" href="https://twitter.com/polynoamial/status/1810675986549428306?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-on-the-cusp-of-level-2-agi-generalist-robotics-model-new-ttt-architecture-and-better-tiny-local-models" target="_blank" rel="noopener noreferrer nofollow">via @polynoamial</a>] Share this story by email</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="ttt-layers-match-performance-of-tra"><a class="link" href="https://arxiv.org/abs/2407.04620?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-on-the-cusp-of-level-2-agi-generalist-robotics-model-new-ttt-architecture-and-better-tiny-local-models" target="_blank" rel="noopener noreferrer nofollow">TTT Layers Match Performance of Transformers and Mamba RNNs</a></h3><p class="paragraph" style="text-align:left;">According to a recent research paper, Test-Time Training (TTT) layers match or exceed the performance of strong Transformers and Mamba RNNs in long-context tasks. 
RNN stands for recurrent neural network, a type of artificial neural network designed to recognize patterns in sequences of data, such as text, genomes, handwriting, or numerical time series. TTT is a method in which a model continues to update its parameters during the testing phase using self-supervised learning, allowing it to adapt to new data in real time and improve performance dynamically. The paper highlights that TTT-Linear is faster than Transformers at 8k context and matches Mamba RNNs in wall-clock time. Wall-clock time refers to the actual elapsed time it takes to complete a task, as opposed to the number of operations or computational steps, and is crucial for evaluating the real-world efficiency of algorithms. Despite facing memory I/O challenges, TTT-MLP also shows significant potential. Because TTT layers use a machine learning model itself as the hidden state, updated through self-supervised learning even on test sequences, they are highly adaptive and efficient for long-context tasks and show promise for scalable applications. 
[<a class="link" href="https://arxiv.org/abs/2407.04620?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-on-the-cusp-of-level-2-agi-generalist-robotics-model-new-ttt-architecture-and-better-tiny-local-models" target="_blank" rel="noopener noreferrer nofollow">Learning to (Learn at Test Time): RNNs with Expressive Hidden States</a>] Share this story by email</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="skild-ai-raises-300-million-in-seri"><a class="link" href="https://www.forbes.com/sites/rashishrivastava/2024/07/09/this-15-billion-ai-company-is-building-a-general-purpose-brain-for-robots/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-on-the-cusp-of-level-2-agi-generalist-robotics-model-new-ttt-architecture-and-better-tiny-local-models" target="_blank" rel="noopener noreferrer nofollow">Skild AI Raises $300 Million in Series A Funding To Develop Universal Robots</a></h3><p class="paragraph" style="text-align:left;">Pittsburgh-based robotics startup Skild AI has raised $300 million at a $1.5 billion valuation in a Series A funding round led by Lightspeed Ventures, SoftBank, Coatue, and Jeff Bezos. Skild AI&#39;s models enable robots to perform tasks in unfamiliar environments, such as climbing stairs and recovering objects that slip out of hand. The robots demonstrated emergent capabilities, showcasing abilities they weren&#39;t explicitly taught. The AI model was trained on a database 1,000 times larger than those used by competitors, using diverse data collection techniques. According to the company&#39;s press release: &quot;Skild’s model serves as a shared, general-purpose brain for a diverse embodiment of robots, scenarios, and tasks, including manipulation, locomotion, and navigation. 
From resilient quadrupeds mastering adverse physical conditions to vision-based humanoids performing dexterous manipulation of objects for complex household and industrial tasks, the company’s model will enable the use of low-cost robots across a broad range of industries and applications.&quot; [<a class="link" href="https://www.forbes.com/sites/rashishrivastava/2024/07/09/this-15-billion-ai-company-is-building-a-general-purpose-brain-for-robots/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-on-the-cusp-of-level-2-agi-generalist-robotics-model-new-ttt-architecture-and-better-tiny-local-models" target="_blank" rel="noopener noreferrer nofollow">This $1.5 Billion AI Company Is Building A ‘General Purpose Brain’ For Robots</a>] Share this story by email</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="metas-mobile-llm-models-boost-on-de"><a class="link" href="https://arxiv.org/abs/2402.14905?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-on-the-cusp-of-level-2-agi-generalist-robotics-model-new-ttt-architecture-and-better-tiny-local-models" target="_blank" rel="noopener noreferrer nofollow">Meta&#39;s MobileLLM Models Boost On-Device Accuracy Without Increasing Size</a></h3><p class="paragraph" style="text-align:left;">Meta has introduced MobileLLM models designed for efficient on-device large language models (LLMs). With fewer than a billion parameters, these models perform competitively with larger models in specific tasks, showing significant improvements in chat benchmarks and API calling tasks. They use deep and thin architectures with embedding sharing and grouped-query attention mechanisms, enhancing accuracy without increasing model size. The practicality of these small models is highlighted for mobile devices, addressing concerns such as memory capacity and energy consumption, making them suitable for common on-device use cases. 
[<a class="link" href="https://arxiv.org/abs/2402.14905?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-on-the-cusp-of-level-2-agi-generalist-robotics-model-new-ttt-architecture-and-better-tiny-local-models" target="_blank" rel="noopener noreferrer nofollow">MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases</a>] Share this story by email</p></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=ae0e432c-dd3f-4ec8-aa93-fc4ace2032b2&utm_medium=post_rss&utm_source=headline_edit">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>OpenAI Goes Shopping; Anthropic Edges Into First Place; and Microsoft&#39;s Tiny Vision Model</title>
  <description></description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/372456c8-e7e1-422c-bc0d-fc0af5f92361/An_isometric_retrofu__3_.jpg" length="138026" type="image/jpeg"/>
  <link>https://edit.headline.com/p/ai-openai-goes-shopping-anthropic-edges-first-place-microsofts-tiny-vision-model-62624</link>
  <guid isPermaLink="true">https://edit.headline.com/p/ai-openai-goes-shopping-anthropic-edges-first-place-microsofts-tiny-vision-model-62624</guid>
  <pubDate>Thu, 27 Jun 2024 00:54:50 +0000</pubDate>
  <atom:published>2024-06-27T00:54:50Z</atom:published>
    <dc:creator>Sasha Krecinic</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #c7bab0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #161618; font-family: 'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#fcf0e8; }
  .bh__table_header p { color: #161618; font-family:'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/840ebc00-6a5c-4623-9768-44b9082566a8/An_isometric_retrofu__3_.jpg?t=1724182207"/></div><h1 class="heading" style="text-align:left;" id="ai-open-ai-goes-shopping-anthropic-"><b>…AI: OpenAI Goes Shopping; Anthropic Edges Into First Place; and Microsoft&#39;s Tiny Vision Model (6.26.24)</b></h1><p class="paragraph" style="text-align:left;"><span style="color:rgb(34, 34, 34);">This week brought substantial infrastructure and performance enhancements to the leading models. OpenAI’s acquisitions of Multi and Rockset are strategic moves to enhance its remote control and data retrieval capabilities, highlighting a broader industry shift towards more versatile and powerful agentic workflows. It makes you wonder not if, but when, they will release a search product. On the state-of-the-art (SOTA) front, we observed Anthropic’s enhancements with Claude 3.5 Sonnet and Microsoft’s debut of the Florence vision model, both of which represent significant advancements in AI model efficiency and effectiveness. Lastly, Ilya Sutskever, co-founder and former Chief Scientist at OpenAI, has news on his new project. Given his departure from OpenAI last month, this week’s acquisition headlines focused on &#39;Search&#39; and &#39;remote control&#39; may shed some light on why he left. 
Is it possible this direction did not align with the former Chief Scientist’s emphasis on creating safe AI?</span></p><p class="paragraph" style="text-align:left;">-- Sasha Krecinic</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="open-ai-acquires-rockset-to-boost-d"><a class="link" href="https://openai.com/index/openai-acquires-rockset/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-goes-shopping-anthropic-edges-into-first-place-and-microsoft-s-tiny-vision-model" target="_blank" rel="noopener noreferrer nofollow">OpenAI Acquires Rockset To Boost &quot;Data Infrastructure&quot; and &quot;Retrieval&quot; (aka Search)</a></h3><p class="paragraph" style="text-align:left;">OpenAI has acquired Rockset to enhance its data retrieval systems and plans to integrate Rockset&#39;s technology into its products to turn data into &quot;useful insights&quot;. If you read some of Rockset&#39;s documentation you will get a sense of what the team is working to solve. It would not be crazy to imagine a world where they power search with this. It could be the new generation of ad targeting and personalized recommendations. The capability to deliver nuanced, context-aware insights and recommendations opens new revenue streams, enhancing OpenAI&#39;s competitive and commercial edge online and within agentic workflows. 
[<a class="link" href="https://openai.com/index/openai-acquires-rockset/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-goes-shopping-anthropic-edges-into-first-place-and-microsoft-s-tiny-vision-model" target="_blank" rel="noopener noreferrer nofollow">OpenAI Acquires Rockset</a>] </p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="open-ai-acquires-multi-a-company-po"><a class="link" href="https://twitter.com/itsandrewgao/status/1805264567548748151?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-goes-shopping-anthropic-edges-into-first-place-and-microsoft-s-tiny-vision-model" target="_blank" rel="noopener noreferrer nofollow">OpenAI Acquires Multi, A Company Powering Remote Computer Control</a></h3><p class="paragraph" style="text-align:left;">OpenAI has acquired Multi, a startup specializing in remote computer control, and announced that Multi will stop its services. Multi has closed new team signups, and existing users can access the app until July 24, 2024, after which all user data will be deleted. 
[<a class="link" href="https://twitter.com/itsandrewgao/status/1805264567548748151?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-goes-shopping-anthropic-edges-into-first-place-and-microsoft-s-tiny-vision-model" target="_blank" rel="noopener noreferrer nofollow">via @itsandrewgao</a>] </p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="anthropics-claude-35-sonnet-climbs-"><a class="link" href="https://www.anthropic.com/news/claude-3-5-sonnet?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-goes-shopping-anthropic-edges-into-first-place-and-microsoft-s-tiny-vision-model" target="_blank" rel="noopener noreferrer nofollow">Anthropic&#39;s Claude 3.5 Sonnet Climbs Leaderboard</a></h3><p class="paragraph" style="text-align:left;">Anthropic says its Claude 3.5 Sonnet outperforms competitor models and Claude 3 Opus, operating at twice the speed and one-fifth the cost. Claude 3.5 Sonnet is their strongest vision model, surpassing Claude 3 Opus on standard vision benchmarks but still slightly behind OpenAI&#39;s offering. 
[<a class="link" href="https://www.anthropic.com/news/claude-3-5-sonnet?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-goes-shopping-anthropic-edges-into-first-place-and-microsoft-s-tiny-vision-model" target="_blank" rel="noopener noreferrer nofollow">Introducing Claude 3.5 Sonnet</a>] </p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="microsoft-launches-florence-2-visio"><a class="link" href="https://huggingface.co/collections/microsoft/florence-6669f44df0d87d9c3bfb76de?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-goes-shopping-anthropic-edges-into-first-place-and-microsoft-s-tiny-vision-model" target="_blank" rel="noopener noreferrer nofollow">Microsoft Launches Florence-2 Vision Model</a></h3><p class="paragraph" style="text-align:left;">Microsoft has launched Florence-2, a vision model released in 200M and 800M parameter versions, which they say matches the quality of models 100 times larger. Florence-2 is designed to handle a wide range of computer vision tasks, including captioning, object detection, and segmentation, using a unified, prompt-based representation. The model demonstrates strong zero-shot and fine-tuning performance, addressing challenges in spatial hierarchy and semantic granularity, and achieving state-of-the-art results on various vision benchmarks. 
[<a class="link" href="https://huggingface.co/collections/microsoft/florence-6669f44df0d87d9c3bfb76de?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-goes-shopping-anthropic-edges-into-first-place-and-microsoft-s-tiny-vision-model" target="_blank" rel="noopener noreferrer nofollow">Florence - a microsoft Collection</a>] </p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="ilya-sutskever-launches-safe-superi"><a class="link" href="http://ssi.inc/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-goes-shopping-anthropic-edges-into-first-place-and-microsoft-s-tiny-vision-model" target="_blank" rel="noopener noreferrer nofollow">Ilya Sutskever Launches Safe Superintelligence Inc.</a></h3><p class="paragraph" style="text-align:left;">Safe Superintelligence Inc. says it is focused on creating the first safe superintelligence by addressing safety and capabilities together. Sutskever announced that they will focus on &quot;one goal and one product&quot; and also mentioned that they are hiring. [<a class="link" href="http://ssi.inc/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-goes-shopping-anthropic-edges-into-first-place-and-microsoft-s-tiny-vision-model" target="_blank" rel="noopener noreferrer nofollow">Safe Superintelligence Inc.</a>] </p></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=3febc58f-69a0-40cb-9a0e-92e04bfa30c8&utm_medium=post_rss&utm_source=headline_edit">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>What If Our Assumptions On Compute Requirements Are Wrong? How A Tiny 8B Model Outperforms GPT-4 (6.21.24)</title>
  <description>Neuromorphic Computing, SakanaAI, AIResearch, HippoRAG</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/4d000625-63f8-4434-959b-5ee8540bba2c/An_isometric_retrofu__4_.jpg" length="124199" type="image/jpeg"/>
  <link>https://edit.headline.com/p/ai-assumptions-compute-requirements-wrong-tiny-8b-model-outperforms-gpt4-62124</link>
  <guid isPermaLink="true">https://edit.headline.com/p/ai-assumptions-compute-requirements-wrong-tiny-8b-model-outperforms-gpt4-62124</guid>
  <pubDate>Sat, 22 Jun 2024 02:55:00 +0000</pubDate>
  <atom:published>2024-06-22T02:55:00Z</atom:published>
    <dc:creator>Sasha Krecinic</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #c7bab0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #161618; font-family: 'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#fcf0e8; }
  .bh__table_header p { color: #161618; font-family:'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><p class="paragraph" style="text-align:left;"></p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/4d000625-63f8-4434-959b-5ee8540bba2c/An_isometric_retrofu__4_.jpg?t=1724182305"/></div><p class="paragraph" style="text-align:left;">This week&#39;s edition focuses on the AI/Nature parallels, and one of the hottest fields of research: neuromorphic computing (the method of computer engineering in which elements of a computer are modeled after systems in the human brain and nervous system). </p><p class="paragraph" style="text-align:left;">While nature isn&#39;t a perfect model, it provides some upper and lower bounds for us to benchmark what is theoretically possible. The human brain runs on something like 25 watts of power for all of its cognitive functions. A single GPT-4 query, by contrast, is estimated to draw something like 300 watts. This chasm of consumption hints that nature has found a way to solve for &#39;compute&#39; on a budget. Algorithmic optimization and neuromorphic computing, using nature as inspiration and a research catalyst, have the potential to satisfy our ever-increasing hunger for compute and to enable dynamically trained models.</p><p class="paragraph" style="text-align:left;">This week, we have some interesting developments on this front, raising an important question: &quot;What if our assumptions about the computing requirements to train and run these models are wrong?&quot; Researchers from the Shanghai AI Lab published a paper that sheds some light, achieving results from a tiny 8B parameter model that outperforms GPT-4 in math and coding. We also see a RAG methodology called HippoRAG that emulates aspects of the hippocampus in the human brain. 
Additionally, we see an LLM training methodology from the team at SakanaAI, who use large language models for discovering and optimizing new training algorithms, akin to natural evolution.</p><p class="paragraph" style="text-align:left;"><span style="text-decoration:underline;"><b>A final thought:</b></span> Is this AI wave just hype? Have we “used up all the data for training”? Will we run out of compute? There is a lot of commentary like this saying that AI is “losing steam.” However, there are two different things that people often conflate. On the surface, this notion might appear to be true if you measure it by today&#39;s commercial applications (which often still leverage legacy technology). From a &quot;frontier&quot; research perspective (which I would posit is the rate-limiting step here), the developments we cover this month, in tandem with commentary from the founders I have spoken to, paint a very different picture. </p><p class="paragraph" style="text-align:left;">To paraphrase some sentiments from experienced founders, &quot;The frontier is moving so quickly that you have to choose what to build very carefully so you are not made obsolete by the rapid movements at the frontier.&quot; Sadly, the investment and hype at the wrapper/surface layer are overshadowing the large strides that continue to be made in the research space. Tying this back to this week’s title: this research is how an 8B parameter model outperformed GPT-4. 
</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="llama-3-model-with-mcts-outperforms"><a class="link" href="https://arxiv.org/abs/2406.07394?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=what-if-our-assumptions-on-compute-requirements-are-wrong-how-a-tiny-8b-model-outperforms-gpt-4-6-21-24" target="_blank" rel="noopener noreferrer nofollow">Llama-3 Model with MCTS Outperforms GPT-4 in Math Tasks</a></h3><p class="paragraph" style="text-align:left;">A small 8B Llama-3 model combined with Monte Carlo Tree Search (MCTS) reportedly outperforms GPT-4 in complex mathematical reasoning tasks. The approach systematically explores and refines candidate solutions using heuristic methods (practical rules of thumb and educated guesses that find satisfactory solutions quickly, even if they are not perfect or optimal). The algorithm builds a search tree by selecting, refining, and evaluating candidate answers, balancing its decisions with an enhanced Upper Confidence Bound (UCB) formula. Testing shows that MCTS significantly improves success rates on challenging math problems, advancing LLMs in complex tasks for more accurate and reliable AI-driven decision-making. 
[<a class="link" href="https://twitter.com?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=what-if-our-assumptions-on-compute-requirements-are-wrong-how-a-tiny-8b-model-outperforms-gpt-4-6-21-24" target="_blank" rel="noopener noreferrer nofollow">twitter.com</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="sakana-ai-shares-disco-pop-evolutio"><a class="link" href="https://sakana.ai/llm-squared/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=what-if-our-assumptions-on-compute-requirements-are-wrong-how-a-tiny-8b-model-outperforms-gpt-4-6-21-24" target="_blank" rel="noopener noreferrer nofollow">Sakana AI Shares DiscoPOP - Evolution for LLMs by LLMs</a></h3><p class="paragraph" style="text-align:left;">By leveraging LLMs to automatically create and test new optimization algorithms, Sakana AI developed Discovered Preference Optimization (DiscoPOP), a novel method combining logistic and exponential losses. This approach, which showed state-of-the-art performance, marks a step towards using AI to advance AI, reducing human intervention and computational resources. The research highlights the potential of LLM-driven discovery to continuously improve AI models, opening new avenues for innovation and efficiency in AI development. 
[<a class="link" href="https://sakana.ai/llm-squared/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=what-if-our-assumptions-on-compute-requirements-are-wrong-how-a-tiny-8b-model-outperforms-gpt-4-6-21-24" target="_blank" rel="noopener noreferrer nofollow">Sakana AI</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="hippo-rag-rag-that-mimics-human-mem"><a class="link" href="https://arxiv.org/abs/2405.14831?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=what-if-our-assumptions-on-compute-requirements-are-wrong-how-a-tiny-8b-model-outperforms-gpt-4-6-21-24" target="_blank" rel="noopener noreferrer nofollow">HippoRAG: RAG That Mimics Human Memory, Enhancing Speed and Efficiency</a></h3><p class="paragraph" style="text-align:left;">New research from Ohio State University introduces HippoRAG, a retrieval framework inspired by human memory, which they say outperforms existing methods by up to 20%, while being 10-30 times cheaper and 6-13 times faster. The study highlights HippoRAG&#39;s integration of LLMs, knowledge graphs, and the Personalized PageRank algorithm, demonstrating improvements in multi-hop question answering and single-step retrieval. This combination mimics the human brain&#39;s memory processes, similar to how the hippocampus and neocortex work together to store and integrate knowledge efficiently and effectively. 
[<a class="link" href="https://arxiv.org/abs/2405.14831?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=what-if-our-assumptions-on-compute-requirements-are-wrong-how-a-tiny-8b-model-outperforms-gpt-4-6-21-24" target="_blank" rel="noopener noreferrer nofollow">HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models </a>]</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="greenblatt-achieves-50-accuracy-on-"><a class="link" href="https://redwoodresearch.substack.com/p/getting-50-sota-on-arc-agi-with-gpt?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=what-if-our-assumptions-on-compute-requirements-are-wrong-how-a-tiny-8b-model-outperforms-gpt-4-6-21-24" target="_blank" rel="noopener noreferrer nofollow">Greenblatt Achieves 50% Accuracy on ARC-AGI Test with GPT-4o</a></h3><p class="paragraph" style="text-align:left;">Ryan Greenblatt says he achieved 50% accuracy on the ARC-AGI public test set using GPT-4o, surpassing the previous state-of-the-art of 34%. He claims his solution reached 72% accuracy on a subset of the train set, compared to human performance of 85%, by using specialized few-shot prompts and better grid representations. 
[<a class="link" href="https://redwoodresearch.substack.com/p/getting-50-sota-on-arc-agi-with-gpt?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=what-if-our-assumptions-on-compute-requirements-are-wrong-how-a-tiny-8b-model-outperforms-gpt-4-6-21-24" target="_blank" rel="noopener noreferrer nofollow">Getting 50% (SoTA) on ARC-AGI with GPT-4o</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="deep-seek-coder-v-2-surpasses-gpt-4"><a class="link" href="https://github.com/deepseek-ai/DeepSeek-Coder-V2/blob/main/paper.pdf?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=what-if-our-assumptions-on-compute-requirements-are-wrong-how-a-tiny-8b-model-outperforms-gpt-4-6-21-24" target="_blank" rel="noopener noreferrer nofollow">DeepSeek-Coder-V2 surpasses GPT4-Turbo in Coding and Math</a></h3><p class="paragraph" style="text-align:left;">DeepSeek-Coder-V2, an open-source model, has reportedly outperformed GPT4-Turbo in coding and math, supporting 338 programming languages and extending context length to 128K. According to a paper posted on GitHub, it achieved 90.2% on HumanEval and 75.7% on MATH, surpassing GPT-4-Turbo-0409. [<a class="link" href="https://github.com/deepseek-ai/DeepSeek-Coder-V2/blob/main/paper.pdf?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=what-if-our-assumptions-on-compute-requirements-are-wrong-how-a-tiny-8b-model-outperforms-gpt-4-6-21-24" target="_blank" rel="noopener noreferrer nofollow">DeepSeek-Coder-V2/paper.pdf at main · deepseek-ai/DeepSeek-Coder-V2</a>] </p></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=ed57e235-7f6c-436a-b34f-3f704d13dff8&utm_medium=post_rss&utm_source=headline_edit">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>Apple Integrates ChatGPT, Unbabel&#39;s SOTA Translation LLM, and a Mixture of Agents Paper</title>
  <description>OpenAI, Unbabel, Apple, Hugging Face</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/edc9d74d-32b8-49fa-8a8a-75209c00fbce/An_isometric_retrofu__5_.jpg" length="125319" type="image/jpeg"/>
  <link>https://edit.headline.com/p/ai-apple-integrates-chatgpt-unbabels-sota-translation-llm-mixture-agents-paper-61124</link>
  <guid isPermaLink="true">https://edit.headline.com/p/ai-apple-integrates-chatgpt-unbabels-sota-translation-llm-mixture-agents-paper-61124</guid>
  <pubDate>Wed, 12 Jun 2024 00:31:08 +0000</pubDate>
  <atom:published>2024-06-12T00:31:08Z</atom:published>
    <dc:creator>Sasha Krecinic</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #c7bab0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #161618; font-family: 'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#fcf0e8; }
  .bh__table_header p { color: #161618; font-family:'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><h2 class="heading" style="text-align:start;" id="heading-2"></h2><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/f2f33f4d-0a03-450b-86e4-5303bdc67c09/An_isometric_retrofu__5_.jpg?t=1724184953"/></div><p class="paragraph" style="text-align:start;">This week we had WWDC, where Apple and OpenAI announced the integration of ChatGPT with Siri on Apple devices. This move didn&#39;t surprise many, as OpenAI and Apple&#39;s talks have been an open secret for months. We feature a great summary by Andrej Karpathy that captures what many are thinking about the announcement. We also see new state-of-the-art (SOTA) models in translation, an open-source robotics offering, and a research paper on MoA (Mixture of Agents) showing how the framework is pushing the frontier of AI capabilities yet again.</p><p class="paragraph" style="text-align:start;">— Sasha Krecinic</p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="chat-gpt-integration-announced-for-"><a class="link" href="https://twitter.com/sama/status/1800237314360127905?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=apple-integrates-chatgpt-unbabel-s-sota-translation-llm-and-a-mixture-of-agents-paper" target="_blank" rel="noopener noreferrer nofollow">ChatGPT Integration Announced for Apple Devices</a></h3><p class="paragraph" style="text-align:start;">Apple announced at WWDC 2024 that ChatGPT will be integrated into Siri and available for free in iOS 18 and macOS Sequoia later this year. This partnership with OpenAI aims to enhance Apple&#39;s AI features, making advanced AI accessible while maintaining a commitment to safety and innovation. Sam Altman also says he is excited about partnering with Apple to integrate ChatGPT into their devices later this year, which he believes users will greatly appreciate. 
The highlight of the conference was the new Siri demo. [<a class="link" href="https://twitter.com/sama/status/1800237314360127905?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=apple-integrates-chatgpt-unbabel-s-sota-translation-llm-and-a-mixture-of-agents-paper" target="_blank" rel="noopener noreferrer nofollow">via @sama</a>] Share this story <a class="link" href="mailto:?subject=%20.%20.%20ChatGPT%20Integration%20Announced%20for%20Apple%20Devices&body=%0A%0A%0AChatGPT%20Integration%20Announced%20for%20Apple%20Devices%0AFirst%20impacted%3A%20Apple%20device%20users%2C%20AI%20technology%20enthusiasts%20%2F%2F%20Time%20to%20impact%3A%20%0A%0AApple%20announced%20at%20WWDC%202024%20that%20ChatGPT%20will%20be%20integrated%20into%20Siri%20and%20available%20for%20free%20in%20iOS%2018%20and%20macOS%20Sequoia%20later%20this%20year.%20This%20partnership%20with%20OpenAI%20aims%20to%20enhance%20Apple%27s%20AI%20features%2C%20making%20advanced%20AI%20accessible%20while%20maintaining%20a%20commitment%20to%20safety%20and%20innovation.%20Sam%20Altman%20also%20says%20he%20is%20excited%20about%20partnering%20with%20Apple%20to%20integrate%20ChatGPT%20into%20their%20devices%20later%20this%20year%2C%20which%20he%20believes%20users%20will%20greatly%20appreciate.%20This%20announcement%20was%20made%20via%20a%20post%20on%20X%2C%20where%20Altman%20expressed%20his%20enthusiasm%20for%20the%20upcoming%20integration.%0A%0A%20%5Bvia%20%40sama%20https%3A%2F%2Ftwitter.com%2Fsama%2Fstatus%2F1800237314360127905%5D" target="_blank" rel="noopener noreferrer nofollow">by email</a></p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="andrejs-breakdown-of-apple-intellig"><a class="link" href="https://twitter.com/karpathy/status/1800242310116262150?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=apple-integrates-chatgpt-unbabel-s-sota-translation-llm-and-a-mixture-of-agents-paper" target="_blank" rel="noopener noreferrer 
nofollow">Andrej&#39;s Breakdown of ‘Apple Intelligence</a>’</h3><p class="paragraph" style="text-align:start;">Andrej Karpathy praised Apple&#39;s Intelligence announcement, highlighting the integration of AI across the entire OS. He outlined key themes: enabling multimodal I/O, seamless inter-operation of OS and apps, and a frictionless user experience. He also emphasized the potential for proactive AI features, on-device intelligence, and modular support for various function calling while maintaining privacy with on-device computing. Zooming out, we agree that this is possibly the first step in a fully autonomous and highly personalized AI strategy and capable agent model, where computer vision takes on-screen data as visual context and potentially moves in the direction of Microsoft&#39;s Copilot/Recall feature by &#39;seeing&#39; or recording on-screen activity. [<a class="link" href="https://twitter.com/karpathy/status/1800242310116262150?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=apple-integrates-chatgpt-unbabel-s-sota-translation-llm-and-a-mixture-of-agents-paper" target="_blank" rel="noopener noreferrer nofollow">via @karpathy</a>] Share this story <a class="link" 
href="mailto:?subject=%20.%20.%20Andrej%27s%20Breakdown%20of%20Apple%27s%20ChatGPT%20Integration&body=%0A%0A%0AAndrej%27s%20Breakdown%20of%20Apple%27s%20ChatGPT%20Integration%0AFirst%20impacted%3A%20Apple%20device%20users%2C%20developers%20%2F%2F%20Time%20to%20impact%3A%20%0A%0AAndrej%20Karpathy%20praised%20Apple%27s%20Intelligence%20announcement%2C%20highlighting%20the%20integration%20of%20AI%20across%20the%20entire%20OS.%20He%20outlined%20key%20themes%3A%20enabling%20multimodal%20I%2FO%2C%20seamless%20inter-operation%20of%20OS%20and%20apps%2C%20and%20a%20frictionless%20user%20experience.%20He%20also%20emphasized%20the%20potential%20for%20proactive%20AI%20features%2C%20on-device%20intelligence%2C%20and%20modular%20support%20for%20various%20function%20calling%20while%20maintaining%20privacy%20with%20on-device%20computing.%20Zooming%20out%2C%20we%20agree%20that%20this%20is%20possibly%20the%20first%20step%20in%20a%20fully%20autonomous%20and%20highly%20personalized%20AI%20strategy%20and%20capable%20agent%20model%2C%20where%20computer%20vision%20takes%20on-screen%20data%20as%20visual%20context%20and%20potentially%20moves%20in%20the%20direction%20of%20Microsoft%27s%20Copilot%2FRecall%20feature%20by%20%27seeing%27%20or%20recording%20on-screen%20activity.%0A%0A%20%5Bvia%20%40karpathy%20https%3A%2F%2Ftwitter.com%2Fkarpathy%2Fstatus%2F1800242310116262150%5D" target="_blank" rel="noopener noreferrer nofollow">by email</a></p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="unbabel-launches-tower-llm-and-reac"><a class="link" href="https://unbabel.com/meet-towerllm/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=apple-integrates-chatgpt-unbabel-s-sota-translation-llm-and-a-mixture-of-agents-paper" target="_blank" rel="noopener noreferrer nofollow">Unbabel Launches TowerLLM and Reaches SOTA Translation Performance</a></h3><p class="paragraph" style="text-align:start;">Unbabel says it has launched TowerLLM, a new translation LLM that 
outperforms competitors like GPT-4, GPT-3.5, Google, and DeepL in accuracy and cost-efficiency. The company highlights that TowerLLM, built on billions of words of high-quality translation data, offers features such as source correction and named entity recognition, and supports 18 language pairs across various domains. [<a class="link" href="https://unbabel.com/meet-towerllm/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=apple-integrates-chatgpt-unbabel-s-sota-translation-llm-and-a-mixture-of-agents-paper" target="_blank" rel="noopener noreferrer nofollow">Introducing TowerLLM</a>] Share this story <a class="link" href="mailto:?subject=%20.%20.%20Unbabel%20Launches%20TowerLLM%20and%20Reaches%20SOTA%20Translation%20Performance%20%20&body=%0A%0A%0AUnbabel%20Launches%20TowerLLM%20and%20Reaches%20SOTA%20Translation%20Performance%20%20%0AFirst%20impacted%3A%20multilingual%20content%20managers%2C%20software%20developers%20%2F%2F%20Time%20to%20impact%3A%20%0A%0AUnbabel%20says%20it%20has%20launched%20TowerLLM%2C%20a%20new%20translation%20LLM%20that%20outperforms%20competitors%20like%20GPT-4%2C%20GPT-3.5%2C%20Google%2C%20and%20DeepL%20in%20accuracy%20and%20cost-efficiency.%20The%20company%20highlights%20that%20TowerLLM%2C%20built%20on%20billions%20of%20words%20of%20high-quality%20translation%20data%2C%20offers%20features%20such%20as%20source%20correction%2C%20and%20named%20entity%20recognition%2C%20and%20supports%2018%20language%20pairs%20across%20various%20domains.%0A%0A%20%5BIntroducing%20TowerLLM%2C%20Multilingual%20by%20design%0A%0AUnbabel%E2%80%99s%20Generative%20AI%20model%20is%20the%20best%20performing%20machine%20translation%20on%20the%20market%2C%20enabling%20our%20customers%20to%20scale%20globally%20with%20lower%20costs%20and%20higher%20accuracy.%20https%3A%
2F%2Funbabel.com%2Fmeet-towerllm%2F%5D" target="_blank" rel="noopener noreferrer nofollow">by email</a></p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="hugging-face-expands-locally-hosted"><a class="link" href="https://twitter.com/julien_c/status/1800153076994801929?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=apple-integrates-chatgpt-unbabel-s-sota-translation-llm-and-a-mixture-of-agents-paper" target="_blank" rel="noopener noreferrer nofollow">Hugging Face Expands Locally Hosted AI App Offerings</a></h3><p class="paragraph" style="text-align:start;">Hugging Face has launched a second batch of local Generative AI apps, now available on compatible model pages. The company welcomes new additions to its community, a sentiment echoed by retweets from its CEO, Clement Delangue. [<a class="link" href="https://twitter.com/julien_c/status/1800153076994801929?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=apple-integrates-chatgpt-unbabel-s-sota-translation-llm-and-a-mixture-of-agents-paper" target="_blank" rel="noopener noreferrer nofollow">via @julien_c</a>] Share this story <a class="link" href="mailto:?subject=%20.%20.%20Hugging%20Face%20Expands%20Locally%20Hosted%20AI%20App%20Offerings&body=%0A%0A%0AHugging%20Face%20Expands%20Locally%20Hosted%20AI%20App%20Offerings%0AFirst%20impacted%3A%20AI%20developers%2C%20tech-savvy%20consumers%20%2F%2F%20Time%20to%20impact%3A%20%0A%0AHugging%20Face%20has%20launched%20a%20second%20batch%20of%20local%20Generative%20AI%20apps%2C%20now%20available%20on%20compatible%20model%20pages.%20The%20company%20welcomes%20new%20additions%20to%20its%20community%2C%20a%20sentiment%20echoed%20by%20retweets%20from%20its%20CEO%2C%20Clement%20Delangue.%0A%0A%20%5Bvia%20%40julien_c%20https%3A%2F%2Ftwitter.com%2Fjulien_c%2Fstatus%2F1800153076994801929%5D" target="_blank" rel="noopener noreferrer nofollow">by email</a></p><hr class="content_break"><h3 class="heading" 
style="text-align:start;" id="le-robot-launches-on-py-torch-to-de"><a class="link" href="https://x.com/RemiCadene/status/1799000991876178038?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=apple-integrates-chatgpt-unbabel-s-sota-translation-llm-and-a-mixture-of-agents-paper" target="_blank" rel="noopener noreferrer nofollow">LeRobot Launches on PyTorch to Democratize Robotics</a></h3><p class="paragraph" style="text-align:start;">LeRobot, developed on PyTorch, has been launched on the Hugging Face community page to enhance accessibility in robotics using advanced AI tools and models. According to their press release, LeRobot provides pre-trained models, datasets, and simulation environments to facilitate learning complex tasks in robotics without the need for physical robot assembly. [<a class="link" href="https://x.com/RemiCadene/status/1799000991876178038?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=apple-integrates-chatgpt-unbabel-s-sota-translation-llm-and-a-mixture-of-agents-paper" target="_blank" rel="noopener noreferrer nofollow">via @RemiCadene</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="mixtureof-agents-methodology-outper"><a class="link" href="https://arxiv.org/abs/2406.04692?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=apple-integrates-chatgpt-unbabel-s-sota-translation-llm-and-a-mixture-of-agents-paper" target="_blank" rel="noopener noreferrer nofollow">Mixture-of-Agents Methodology Outperforms GPT-4 Omni</a></h3><p class="paragraph" style="text-align:start;">In a recent research paper, scientists introduced the Mixture-of-Agents methodology, which combines multiple LLMs to enhance language model performance, achieving a 65.1% score on AlpacaEval 2.0 and surpassing GPT-4 Omni&#39;s 57.5%. This method utilizes a layered architecture where each layer&#39;s LLM agents refine responses based on the previous layer&#39;s outputs, demonstrating improved performance using only open-source LLMs. [<a class="link" href="https://arxiv.org/abs/2406.04692?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=apple-integrates-chatgpt-unbabel-s-sota-translation-llm-and-a-mixture-of-agents-paper" target="_blank" rel="noopener noreferrer nofollow">Mixture-of-Agents Enhances Large Language Model Capabilities</a>]</p></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=438c54b7-bfac-416b-8f6d-7fbb07d49c72&utm_medium=post_rss&utm_source=headline_edit">Powered by beehiiv</a></div></div>
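The layered refinement the paper describes is easy to sketch. Below is a minimal, hypothetical Python illustration of a Mixture-of-Agents pipeline; the `run_moa`, `make_agent`, and toy string-returning "agents" are stand-ins of our own invention, not the paper's implementation, and a real system would replace them with LLM API calls:

```python
def run_moa(prompt, layers, aggregator):
    """Toy Mixture-of-Agents pipeline.

    layers: list of layers; each layer is a list of agent functions
    with signature (prompt, prior_responses) -> response string.
    aggregator: (prompt, final_responses) -> synthesized answer.
    """
    prior = []  # responses produced by the previous layer
    for layer in layers:
        # Every agent in this layer sees the full set of responses
        # from the previous layer and produces a refined response.
        prior = [agent(prompt, prior) for agent in layer]
    return aggregator(prompt, prior)

def make_agent(name):
    # Stand-in for an LLM call: records which agent ran and how many
    # prior responses it was shown.
    return lambda prompt, prior: f"{name}({prompt}|saw {len(prior)})"

layers = [
    [make_agent("a1"), make_agent("a2")],  # layer 1 sees no prior output
    [make_agent("b1"), make_agent("b2")],  # layer 2 refines layer 1's output
]
aggregator = lambda prompt, prior: " + ".join(prior)

print(run_moa("Q", layers, aggregator))  # b1(Q|saw 2) + b2(Q|saw 2)
```

The point of the structure is visible even in the toy: first-layer agents answer cold, while later layers always condition on the previous layer's outputs before the aggregator synthesizes a final response.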
  ]]></content:encoded>
</item>

      <item>
  <title>Why Humans Struggle to Estimate AI Progress, Antitrust Woes, OpenAI Hiring Robotics, and Datasets (6.6.24)</title>
  <description>AGI, Antitrust, AI Robotics, Investment, Datasets</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/db7ce8af-8c8f-4f52-95fc-da5d1bd5be67/An_isometric_retrofu__6_.jpg" length="159115" type="image/jpeg"/>
  <link>https://edit.headline.com/p/ai-humans-suck-estimating-ai-progress-antitrust-woes-openai-hiring-robotics-datasets-6624</link>
  <guid isPermaLink="true">https://edit.headline.com/p/ai-humans-suck-estimating-ai-progress-antitrust-woes-openai-hiring-robotics-datasets-6624</guid>
  <pubDate>Fri, 07 Jun 2024 00:42:14 +0000</pubDate>
  <atom:published>2024-06-07T00:42:14Z</atom:published>
    <dc:creator>Sasha Krecinic</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #c7bab0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #161618; font-family: 'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#fcf0e8; }
  .bh__table_header p { color: #161618; font-family:'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/afe25ea0-b216-4a83-babe-7090744a2db2/An_isometric_retrofu__6_.jpg?t=1724185061"/></div><p class="paragraph" style="text-align:left;">This week&#39;s edition focuses on the training and progression of AI towards AGI. We also see activity by regulators for antitrust investigations, some new roles in Robotics advertised by the OpenAI team, and a new dataset for model training with detailed commentary. Short, sweet, and enlightening. Enjoy!</p><p class="paragraph" style="text-align:left;">-- Sasha Krecinic<br></p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="why-do-we-struggle-to-estimate-ai-a"><a class="link" href="https://situational-awareness.ai/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=why-humans-struggle-to-estimate-ai-progress-antitrust-woes-openai-hiring-robotics-and-datasets-6-6-24" target="_blank" rel="noopener noreferrer nofollow">Why do we struggle to estimate AI Advancements?</a></h3><p class="paragraph" style="text-align:left;">Humans inherently struggle with understanding exponential relationships and growth. The wheat and chessboard problem illustrates this. The story goes that a king agrees to place one grain of wheat on the first square of a chessboard, doubling it on each subsequent square. By the 64th square, the total exceeds 18 quintillion grains, enough to bankrupt the kingdom. Similarly, during COVID-19, many couldn&#39;t grasp how quickly the virus could spread, leading to delayed responses and widespread impacts. We all had that friend who warned, “It’s coming, start preparing,” but most didn’t listen. </p><p class="paragraph" style="text-align:left;">It may not be obvious, but AI development is experiencing similar exponential growth and the information asymmetry is also increasing. 
To understand how it will evolve, you need to look at the sub-components, roadmaps, and rate-limiting steps in the hardware, software, data, people, and capital. There are those in the know who quietly acknowledge this exponential growth, aware of its potential, often keeping their ‘extreme views’ to themselves, once again analogous to early COVID-19. </p><p class="paragraph" style="text-align:left;">Leopold Aschenbrenner&#39;s extensive analysis, &quot;Situational Awareness,&quot; highlights the accelerating pace of AI capabilities across the various layers that drive innovation. From massive compute clusters to evolving AI models, the trajectory toward Artificial General Intelligence (AGI) is still a function of the sum of its parts. Leopold posits that AI advancements will continue to be exponential, driven by computing power, algorithmic efficiency, and new methodologies. Despite potential bottlenecks like data scarcity and unknown challenges, the path to AGI is becoming clearer. Improvements in hardware, software, data, skills, and capital are compounding and driving transformative impacts that are often hiding in plain sight. </p><p class="paragraph" style="text-align:left;">While mainstream views often downplay AI ‘end game’ scenarios, a small group of experts is busily preparing for and predicting an ‘imminent’ (read this as within the next 3-5 years) AGI breakthrough. As an investor in this space, you see a range of people&#39;s reactions to this; unfortunately, some are still a little too skeptical, in my opinion. Naivety on this scale hasn’t served societies well historically. The consistent revision of AGI forecasts also points to AGI being much closer than society thinks. Check out the full text if you&#39;d like to see the detailed breakdown; I think it&#39;s worth a read! 
[<a class="link" href="https://situational-awareness.ai/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=why-humans-struggle-to-estimate-ai-progress-antitrust-woes-openai-hiring-robotics-and-datasets-6-6-24" target="_blank" rel="noopener noreferrer nofollow">Situational awareness</a>]</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/596fab38-57f7-429b-9046-6f3ef665620e/Screenshot_2024-06-06_at_5.35.21_PM.png?t=1717720530"/></div><hr class="content_break"><h3 class="heading" style="text-align:left;" id="regulators-target-ai-giants-for-ant"><a class="link" href="https://www.nytimes.com/2024/06/05/technology/nvidia-microsoft-openai-antitrust-doj-ftc.html?unlocked_article_code=1.xk0.Ynfh.VR9THK54Llzm&smid=url-share&utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=why-humans-struggle-to-estimate-ai-progress-antitrust-woes-openai-hiring-robotics-and-datasets-6-6-24" target="_blank" rel="noopener noreferrer nofollow">Regulators Target AI Giants for Antitrust Probes</a></h3><p class="paragraph" style="text-align:left;">Federal regulators have agreed to initiate antitrust investigations into Microsoft, OpenAI, and Nvidia to examine their dominant positions in the AI industry, according to the New York Times, referencing two individuals familiar with the confidential discussions. The Justice Department will investigate Nvidia, while the FTC will focus on OpenAI and Microsoft, reflecting a broader initiative to address potential monopolistic practices in the AI sector. [<a class="link" href="https://www.nytimes.com/2024/06/05/technology/nvidia-microsoft-openai-antitrust-doj-ftc.html?unlocked_article_code=1.xk0.Ynfh.VR9THK54Llzm&smid=url-share&utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=why-humans-struggle-to-estimate-ai-progress-antitrust-woes-openai-hiring-robotics-and-datasets-6-6-24" target="_blank" rel="noopener noreferrer nofollow">U.S. 
Clears Way for Antitrust Inquiries of Nvidia, Microsoft and OpenAI</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="open-ai-recruits-for-robotics"><a class="link" href="https://openai.com/careers/research-engineer-robotics/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=why-humans-struggle-to-estimate-ai-progress-antitrust-woes-openai-hiring-robotics-and-datasets-6-6-24" target="_blank" rel="noopener noreferrer nofollow">OpenAI Recruits for Robotics</a></h3><p class="paragraph" style="text-align:left;">OpenAI is seeking a Research Engineer for its Robotics team in San Francisco to focus on training and fine-tuning large multimodal LLMs, as detailed in their job posting. The role involves collaborating with industry partners to enhance robotics applications and marks the company&#39;s first robotics hiring of this kind since 2020. [<a class="link" href="https://openai.com/careers/research-engineer-robotics/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=why-humans-struggle-to-estimate-ai-progress-antitrust-woes-openai-hiring-robotics-and-datasets-6-6-24" target="_blank" rel="noopener noreferrer nofollow">OpenAI recruits for robotics talent</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:left;" id="fine-web-edu-sets-new-dataset-stand"><a class="link" href="https://twitter.com/Thom_Wolf/status/1797178777820127724?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=why-humans-struggle-to-estimate-ai-progress-antitrust-woes-openai-hiring-robotics-and-datasets-6-6-24" target="_blank" rel="noopener noreferrer nofollow">FineWeb-Edu Sets New Dataset Standard</a></h3><p class="paragraph" style="text-align:left;">Guilherme Penedo from Hugging Face shared a report on the release of FineWeb and its educational 
subset, FineWeb-Edu, which incorporates 1.3 trillion tokens from a high-quality filtered Common Crawl dataset. According to the report, FineWeb-Edu outperforms all other publicly available web-scale datasets on benchmarks such as MMLU, ARC, and OpenBookQA, prompting a reassessment of the perceived quality of internet data. The post also drew praise from industry leaders like Andrej Karpathy and Thomas Wolf, who called it potentially the best 45 minutes of reading in the space for understanding how high-performing models work. Check out the full text if you&#39;re working in this area! [<a class="link" href="https://twitter.com/Thom_Wolf/status/1797178777820127724?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=why-humans-struggle-to-estimate-ai-progress-antitrust-woes-openai-hiring-robotics-and-datasets-6-6-24" target="_blank" rel="noopener noreferrer nofollow">via @Thom_Wolf</a>]</p><p class="paragraph" style="text-align:left;">—</p></div></div>
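The chessboard arithmetic from the opening story is worth checking directly; a short Python sketch makes the scale concrete:

```python
# Wheat-and-chessboard: one grain on square 1, doubling on each of
# the 64 squares, so square i holds 2^(i-1) grains.
total = sum(2 ** i for i in range(64))

# The running total over all 64 squares is 2^64 - 1: roughly
# 1.8 * 10^19 grains, the classic case of an exponential outrunning
# linear intuition.
assert total == 2 ** 64 - 1 == 18_446_744_073_709_551_615
print(f"total grains: {total:.3e}")  # total grains: 1.845e+19
```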
  ]]></content:encoded>
</item>

      <item>
  <title>Smaller Models Get Better and China Invests in Chip Independence</title>
  <description>China, Mistral, Llama, Andrej Karpathy, Sonic</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/f74e9f4e-af5a-46f2-bc2e-ad0723514548/An_isometric_retrofu__7_.jpg" length="174475" type="image/jpeg"/>
  <link>https://edit.headline.com/p/smaller-models-get-better-china-invests-chip-independence-52924</link>
  <guid isPermaLink="true">https://edit.headline.com/p/smaller-models-get-better-china-invests-chip-independence-52924</guid>
  <pubDate>Thu, 30 May 2024 06:14:32 +0000</pubDate>
  <atom:published>2024-05-30T06:14:32Z</atom:published>
    <dc:creator>Sasha Krecinic</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #c7bab0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #161618; font-family: 'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#fcf0e8; }
  .bh__table_header p { color: #161618; font-family:'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><p class="paragraph" style="text-align:left;"></p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/f74e9f4e-af5a-46f2-bc2e-ad0723514548/An_isometric_retrofu__7_.jpg?t=1724185745"/></div><p class="paragraph" style="text-align:left;">This week&#39;s newsletter has a theme: the miniaturization of models while maintaining comparable results to larger models. This once again shows a potential path to locally hosted models. We have also intentionally left out all the drama/arguing happening on X between Elon and Yann, as it lacked substance. We also spotlight China&#39;s huge $47.5 billion USD investment in its semiconductor industry in an attempt to gain independence from US chip manufacturers and regulations.</p><p class="paragraph" style="text-align:left;">— Sasha</p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="llama-3-v-claims-superior-performan"><a class="link" href="https://aksh-garg.medium.com/llama-3v-building-an-open-source-gpt-4v-competitor-in-under-500-7dd8f1f6c9ee?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=smaller-models-get-better-and-china-invests-in-chip-independence" target="_blank" rel="noopener noreferrer nofollow">Llama3-V Claims Superior Performance at Reduced Size</a></h3><p class="paragraph" style="text-align:start;">The team from AmbientGPT has released Llama3-V, a new AI model developed on the Llama3 platform. It shows a 10-20% improvement in benchmarks over Llava, the previous state-of-the-art open-source model for multimodal understanding, while being 100 times smaller than the current SOTA models. The team has also released a Mac interface, which uses the context from the screen to help improve prompt responses and can run using local or cloud models. 
[Llama 3-V: Matching GPT4-V with a 100x smaller model and 500 dollars]</p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="andrej-karpathy-trains-gpt-2-in-90-"><a class="link" href="https://github.com/karpathy/llm.c/discussions/481?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=smaller-models-get-better-and-china-invests-in-chip-independence" target="_blank" rel="noopener noreferrer nofollow">Andrej Karpathy Trains GPT-2 in 90 Minutes for $20 of Compute</a></h3><p class="paragraph" style="text-align:start;">According to a recent post, Andrej Karpathy has replicated the GPT-2 (124M) model in approximately 90 minutes of training at a cost of around $20. The training utilized the FineWeb dataset with 10 billion tokens and achieved a HellaSwag accuracy of 29.9, surpassing the original GPT-2 model&#39;s score of 29.4. The exercise highlights the replicability of GPT models and the limited costs of training small models. [<a class="link" href="https://github.com/karpathy/llm.c/discussions/481?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=smaller-models-get-better-and-china-invests-in-chip-independence" target="_blank" rel="noopener noreferrer nofollow">Reproducing GPT-2 (124M) in llm.c in 90 minutes for $20 · karpathy llm.c · Discussion #481</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="codestral-launches-sets-new-ai-benc"><a class="link" href="https://mistral.ai/news/codestral/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=smaller-models-get-better-and-china-invests-in-chip-independence" target="_blank" rel="noopener noreferrer nofollow">Codestral Launches, Sets New AI Benchmark For Model Size</a></h3><p class="paragraph" style="text-align:start;">Mistral AI introduces Codestral, a 22B open-weight generative AI model designed for code generation, fluent in over 80 programming languages, and equipped with a 32k context window. 
Codestral, now available under a Non-Production License and accessible via an API, outperforms similarly sized competitors on benchmarks and, based on user testimony, even challenges significantly larger models. [<a class="link" href="https://mistral.ai/news/codestral/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=smaller-models-get-better-and-china-invests-in-chip-independence" target="_blank" rel="noopener noreferrer nofollow">Codestral: Hello, World!</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="cartesia-launches-sonic-for-real-ti"><a class="link" href="https://www.loom.com/share/72b8bd84009443d5926eb97f92d53a9f?sid=975fa7d0-bdc5-4670-b4f2-80c1eba1c4eb&utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=smaller-models-get-better-and-china-invests-in-chip-independence" target="_blank" rel="noopener noreferrer nofollow">Cartesia Launches Sonic for Real-Time Voice Applications</a></h3><p class="paragraph" style="text-align:start;">Cartesia has launched Sonic, a new voice model with a latency of 135ms, designed for real-time applications such as customer support and entertainment. According to Cartesia, Sonic has demonstrated superior performance in tests, achieving twice the accuracy in audio generation and delivering initial audio output 1.5 times faster than existing Transformer models, with a fourfold increase in processing speed. 
[<a class="link" href="https://www.loom.com/share/72b8bd84009443d5926eb97f92d53a9f?sid=975fa7d0-bdc5-4670-b4f2-80c1eba1c4eb&utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=smaller-models-get-better-and-china-invests-in-chip-independence" target="_blank" rel="noopener noreferrer nofollow">Sonic Demo</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="china-launches-largest-semiconducto"><a class="link" href="https://www.reuters.com/technology/china-sets-up-475-bln-state-fund-boost-semiconductor-industry-2024-05-27/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=smaller-models-get-better-and-china-invests-in-chip-independence" target="_blank" rel="noopener noreferrer nofollow">China Launches Largest Semiconductor Fund For Self-Sufficiency</a></h3><p class="paragraph" style="text-align:start;">China has launched its largest state-backed investment fund to date, totaling 344B yuan ($47.5B USD), to bolster its semiconductor industry, according to official sources. 
The fund, established on May 24, includes significant contributions from China&#39;s finance ministry and major Chinese banks, and its launch drove a 3% increase in the CES CN Semiconductor Index. [<a class="link" href="https://www.reuters.com/technology/china-sets-up-475-bln-state-fund-boost-semiconductor-industry-2024-05-27/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=smaller-models-get-better-and-china-invests-in-chip-independence" target="_blank" rel="noopener noreferrer nofollow">China sets up third fund with $47.5 bln to boost semiconductor sector</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="tougher-prompts-and-benchmarking-fo"><a class="link" href="https://lmsys.org/blog/2024-05-17-category-hard/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=smaller-models-get-better-and-china-invests-in-chip-independence" target="_blank" rel="noopener noreferrer nofollow">Tougher Prompts and Benchmarking for the Chatbot Arena</a></h3><p class="paragraph" style="text-align:start;">Chatbot Arena has launched a new &quot;Hard Prompts&quot; category on its leaderboard, responding to community interest in more complex challenges for AI language models. The harder prompts also distinguish models with more robust capabilities and show that not all models&#39; performance degrades equally as prompts become harder. Overfitting and fine-tuning towards benchmark prompts are common issues, so the prompts in this instance were carefully curated to give a richer picture of model performance. 
[<a class="link" href="https://lmsys.org/blog/2024-05-17-category-hard/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=smaller-models-get-better-and-china-invests-in-chip-independence" target="_blank" rel="noopener noreferrer nofollow">Introducing Hard Prompts Category in Chatbot Arena | LMSYS Org</a>]</p></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=c7a005ef-a8f4-40ee-a336-8547c77104c4&utm_medium=post_rss&utm_source=headline_edit">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>A New Era for Personal Computing</title>
  <description>Agents, AI Privacy, Latency, OpenAI, Google, Microsoft</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/e33fffa4-d714-43d1-9ce1-1b7b3cd79b17/An_isometric_retrofu__8_.jpg" length="112620" type="image/jpeg"/>
  <link>https://edit.headline.com/p/new-era-personal-computing-52224</link>
  <guid isPermaLink="true">https://edit.headline.com/p/new-era-personal-computing-52224</guid>
  <pubDate>Thu, 23 May 2024 02:33:41 +0000</pubDate>
  <atom:published>2024-05-23T02:33:41Z</atom:published>
    <dc:creator>Sasha Krecinic</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #c7bab0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #161618; font-family: 'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#fcf0e8; }
  .bh__table_header p { color: #161618; font-family:'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/e33fffa4-d714-43d1-9ce1-1b7b3cd79b17/An_isometric_retrofu__8_.jpg?t=1724185983"/></div><p class="paragraph" style="text-align:left;">Microsoft has launched its AI-Optimized Copilot, along with a new line of PCs, marking a new chapter in personal and business computing. We break down why this is such a big deal and also share other players in the space working on agents and computer vision to improve contextual awareness. In case you missed it, other headlines from earlier in the week were about Google&#39;s I/O event and Gemini 1.5 results. Since then, we&#39;ve seen their search results incorporate some of the new functionality, which has been a step up in the search experience. We also see headlines in the world of &#39;agents,&#39; and we suspect this will continue to be the buzzword that garners attention for a while!</p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="microsoft-launches-ai-optimized-cop"><a class="link" href="https://blogs.microsoft.com/blog/2024/05/20/introducing-copilot-pcs/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-new-era-for-personal-computing" target="_blank" rel="noopener noreferrer nofollow">Microsoft Launches AI-Optimized Copilot+ for PCs and Business Applications</a></h3><p class="paragraph" style="text-align:start;"><i>First impacted:</i> Everyone</p><p class="paragraph" style="text-align:start;">Microsoft has launched Copilot+ and a new line of Windows PCs designed specifically for AI applications. These devices feature powerful new silicon, including ARM processors with Neural Processing Units (NPUs) capable of a whopping 40+ TOPS, all-day battery life, and access to advanced AI models. 
Starting at $999, these PCs are within reach of average consumers and set an interesting stage for a world that can accommodate &#39;agent&#39;-centric workflows, or what Microsoft is calling their &#39;Copilot&#39;. </p><p class="paragraph" style="text-align:start;"><b>Hardware &gt; Local Models &gt; Privacy &gt; HAL9000? </b><br>The new Copilot+ PCs introduce AI-driven experiences like Recall and Cocreator. This innovation, along with advancements in multimodal AI and privacy, is transforming our interactions with computers. Users will be able to share significantly more data with a locally hosted model than they otherwise might feel comfortable with. Instead of the usual &quot;Don&#39;t insert any personal data,&quot; soon you might find yourself comfortable enough to share your social security or credit card details. The core mechanics of locally hosted models, combined with enhanced privacy and computer vision, could change most human-to-computer interactions. </p><p class="paragraph" style="text-align:start;"><b>Enhancements to Microsoft Copilot for Business </b><br>Microsoft has also announced significant updates to its Copilot feature, now integrating with Microsoft Teams and Planner to improve team collaboration by managing agendas and tracking action items. Additionally, the company introduced Copilot Studio, which allows users to create custom AI agents that automate business processes and integrate with business data systems. The ramifications for workflow management software are quite interesting, and it is impressive how quickly the landscape is changing. </p><p class="paragraph" style="text-align:start;"><b>Why Is This Important? </b><br>In the context of highly capable large language models (LLMs) with low latency, computer vision, and hardware optimized to host local LLMs/Agents, advancements are occurring faster than most anticipated (even those of us who were saying &quot;it&#39;s coming sooner than you think&quot;). 
We are quickly approaching a future where your local computer will be able to perform tasks autonomously and assist you proactively. What the local LLM cannot do on the local machine, the bigger, more powerful version in the cloud helps with, potentially guiding the local LLM as required. For instance, imagine needing to change your desktop background. Instead of searching for instructions online, you could simply ask, &quot;Hey, find how to change the desktop background to X,&quot; and your computer would handle it for you. Or if you are a software vendor, instead of having customer support, you could provide agent instructions that prompt users for help if they are clicking around for more than X seconds, offering guidance on the relevant workflow. This could potentially change most workflows significantly. [<a class="link" href="https://blogs.microsoft.com/blog/2024/05/20/introducing-copilot-pcs/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-new-era-for-personal-computing" target="_blank" rel="noopener noreferrer nofollow">Introducing Copilot+ PCs - The Official Microsoft Blog</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="internal-and-external-views-on-open"><a class="link" href="https://twitter.com/gdb/status/1791869138132218351?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-new-era-for-personal-computing" target="_blank" rel="noopener noreferrer nofollow">Internal and External Views on OpenAI&#39;s Alignment Problem</a></h3><p class="paragraph" style="text-align:start;"><i>First impacted:</i> Everyone</p><p class="paragraph" style="text-align:start;">In a significant shakeup at OpenAI, key team members Ilya Sutskever and Jan Leike resigned, with Leike highlighting that “safety culture and processes have taken a backseat to shiny products” at the company. Their departure follows the disbanding of OpenAI’s Superalignment team, which aimed to address long-term AI risks. Leike emphasized the need for serious preparations for AGI to benefit humanity. Meanwhile, AI leader Yann LeCun criticized the urgency around controlling superintelligent AI, arguing that we need to focus on developing systems smarter than basic animals first. He likened the current urgency to trying to ensure the safety of advanced aircraft before the fundamental technology even exists, emphasizing a gradual, iterative approach to AI development and safety. 
[<a class="link" href="https://twitter.com/gdb/status/1791869138132218351?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-new-era-for-personal-computing" target="_blank" rel="noopener noreferrer nofollow">via @gdb</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="wafer-shares-demo-of-agent-mobile-o"><a class="link" 
href="https://wafer.systems/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-new-era-for-personal-computing" target="_blank" rel="noopener noreferrer nofollow">Wafer Shares Demo of Agent + Mobile OS Capabilities</a></h3><p class="paragraph" style="text-align:start;"><i>First impacted:</i> mobile developers, smartphone users</p><p class="paragraph" style="text-align:start;">Developers at Wafer are creating AI at the OS level, enabling AI agents to use the same device interfaces as users, such as virtual keyboards and touchscreens, to enhance efficiency and user experience. They say this integration allows AI agents to access extensive app data and user interactions, potentially tripling efficiency by predicting and automating user actions without third-party app integrations. Check out the video; it&#39;s super interesting! [<a class="link" href="https://wafer.systems/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-new-era-for-personal-computing" target="_blank" rel="noopener noreferrer nofollow">Wafer</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="ambient-gpt-launches-mac-os-app-and"><a class="link" href="https://github.com/siddrrsh/ambientGPT?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-new-era-for-personal-computing" target="_blank" rel="noopener noreferrer nofollow">AmbientGPT Launches MacOS App and Enhances Contextual Understanding with Screen Vision Data</a></h3><p class="paragraph" style="text-align:start;"><i>First impacted:</i> MacOS developers, privacy-conscious MacBook users</p><p class="paragraph" style="text-align:start;">Awni Hannun says AmbientGPT, a new MacOS app that integrates GPT-4o for enhanced contextual understanding directly from your screen, is set to launch soon. The app, which operates entirely on-device to ensure data privacy, requires an ARM64 MacBook and a specific OpenAI API key, and is pending Apple certification. 
[<a class="link" href="https://github.com/siddrrsh/ambientGPT?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-new-era-for-personal-computing" target="_blank" rel="noopener noreferrer nofollow">GitHub - siddrrsh/ambientGPT</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="google-shares-gemini-15-pro-results"><a class="link" href="https://goo.gle/GeminiV1-5?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-new-era-for-personal-computing" target="_blank" rel="noopener noreferrer nofollow">Google Shares Gemini 1.5 Pro Results In Technical Report</a></h3><p class="paragraph" style="text-align:start;"><i>First impacted:</i> AI researchers, software developers</p><p class="paragraph" style="text-align:start;">A report from Google&#39;s Gemini team highlights that the Gemini 1.5 Pro model demonstrates improved performance over the previous 1.0 Ultra, particularly in text and vision benchmarks, achieving a 91.7% score on the MMLU benchmark. The model has enhanced in-context learning capabilities, especially in low-resource language translation and mixed-modal learning, as detailed in the updated 153-page technical report. Check it out if you&#39;d like to get into the details. [<a class="link" href="https://goo.gle/GeminiV1-5?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-new-era-for-personal-computing" target="_blank" rel="noopener noreferrer nofollow">goo.gle/GeminiV1-5</a>]</p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="scale-ai-secures-1-billion-in-fundi"><a class="link" href="https://scale.com/careers?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-new-era-for-personal-computing" target="_blank" rel="noopener noreferrer nofollow">Scale AI Secures $1 Billion in Funding</a></h3><p class="paragraph" style="text-align:start;"><i>First impacted:</i> AI developers, 
investors, tech industry developers</p><p class="paragraph" style="text-align:start;">Scale AI, led by CEO Alexandr Wang, has raised $1 billion in a financing round with Accel and existing investors, boosting its valuation to $13.8 billion. The company says the funds will be used to further develop its frontier data and enhance its Data Engine, which supports advanced LLMs and computer vision models. [<a class="link" href="https://scale.com/careers?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=a-new-era-for-personal-computing" target="_blank" rel="noopener noreferrer nofollow">Careers | Scale AI</a>]</p></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=1345cdfc-0249-4f0a-9833-ea7daa203fbb&utm_medium=post_rss&utm_source=headline_edit">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>OpenAI Unveils GPT-4o: A Breakdown of This Year&#39;s Biggest Day in AI</title>
  <description>#GPT-4o Launch, Agents, Robots, Apple</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/866f9e0a-ab95-4a5a-891e-8aeea94270fa/An_isometric_retrofu__9_.jpg" length="87106" type="image/jpeg"/>
  <link>https://edit.headline.com/p/openai-unveils-gpt4o-breakdown-years-biggest-day-ai-51324</link>
  <guid isPermaLink="true">https://edit.headline.com/p/openai-unveils-gpt4o-breakdown-years-biggest-day-ai-51324</guid>
  <pubDate>Tue, 14 May 2024 03:07:06 +0000</pubDate>
  <atom:published>2024-05-14T03:07:06Z</atom:published>
    <dc:creator>Sasha Krecinic</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #c7bab0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #161618; font-family: 'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#fcf0e8; }
  .bh__table_header p { color: #161618; font-family:'Roboto',-apple-system,BlinkMacSystemFont,Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><p class="paragraph" style="text-align:start;"></p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/866f9e0a-ab95-4a5a-891e-8aeea94270fa/An_isometric_retrofu__9_.jpg?t=1724186268"/></div><p class="paragraph" style="text-align:start;">Today&#39;s newsletter is a bit different. We have a diverse audience, and it&#39;s clear that the information gap between pioneers and the mainstream media is growing. Sometimes, we need to highlight what&#39;s just below the surface. OpenAI&#39;s release today was spectacular for many reasons, most of which weren&#39;t immediately apparent, mentioned by OpenAI, or covered by mainstream news. We&#39;ll dissect some of these and provide a brief breakdown of why they matter. Plus, there&#39;s more traction in the rumor mill about Apple&#39;s partnership with OpenAI and a new humanoid robot that costs less than a Toyota Yaris! </p><p class="paragraph" style="text-align:start;">Also, a quick administrative update: we&#39;ve decided to change the cadence of our newsletter from daily to weekly. This lets us deliver a hyper-curated roundup of the AI news that really matters and its implications, so you can stay informed without the clutter. If there&#39;s a breaking story you can&#39;t miss, we&#39;ll make an exception and fill you in immediately. 
Thanks for keeping up with us!</p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="open-ai-launches-gpt-4-o-and-its-mu"><a class="link" href="https://openai.com/index/hello-gpt-4o/?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-unveils-gpt-4o-a-breakdown-of-this-year-s-biggest-day-in-ai" target="_blank" rel="noopener noreferrer nofollow">OpenAI Launches GPT-4o And It&#39;s Much More Than Meets The Eye</a></h3><p class="paragraph" style="text-align:start;"><i>First impacted:</i> Everyone!<br><i>Time to impact:</i> <b>Short</b></p><p class="paragraph" style="text-align:start;">OpenAI has launched GPT-4o, now available in ChatGPT, which integrates text, audio, and vision processing in real time. We recommend checking out the demo video. OpenAI also highlighted that the new models had ranked as &#39;GPT2&#39; on the LMSYS chatbot leaderboards, which attracted a lot of attention and speculation. These models achieve roughly a +100 Elo rating over previous models (Elo being a rating system originally devised for chess players); since a rating gap of D points implies an expected win probability of 1 / (1 + 10^(-D/400)), a 100-point gap translates to about a 64% win rate against the next-closest models, with win rates reportedly reaching 78% in some comparisons. The model can respond to audio in as little as 232 milliseconds, operating at twice the speed and half the cost of GPT-4 Turbo. Here are some highlights we found particularly impressive: </p><p class="paragraph" style="text-align:start;"><b>Reasoning</b>: This is something that the AI community has been anticipating eagerly. The ability to break down a problem into its subcomponents and plan how to solve them is crucial. There is a significant increase in the model&#39;s &#39;logic&#39; capability, which has received little coverage to date. </p><p class="paragraph" style="text-align:start;"><b>Desktop App</b>: OpenAI has shown they are masters of tech and distribution. 
The ability to distribute their future products and functionality to desktops will be an amazing channel because they won’t be restricted to a web browser or mobile anymore. Having the ability to watch, learn, and interact with a user’s workstation will be pivotal in expanding their value to users. </p><p class="paragraph" style="text-align:start;"><b>Multimodality</b>: The ability to see and hear is a huge leap forward, and more important than it initially seems. If it can see the world, it will soon be able to see your computer screen. Once it can do that, it will have the potential to help you interact with your daily work. What does that mean? We are much closer to a world where AI can manage the flow of information and productive work than meets the eye. </p><p class="paragraph" style="text-align:start;"><b>On Mobile</b>: With ChatGPT in every pocket and a consumer-led distribution strategy, OpenAI has dominated the consumer market. The more users they get, the better the intent data and the better the model, creating a flywheel effect that will serve them well in the years ahead. </p><p class="paragraph" style="text-align:start;"><b>Latency</b>: The final limiting factor to pulling all this off was making it feel natural. To do all of the above in a reasonable turnaround time is an underappreciated engineering feat; the algorithmic optimization required is nothing short of a marvel. </p><p class="paragraph" style="text-align:start;">What does this potentially change? Most industries. It&#39;s a bigger change than GPT-4 was from GPT-3 and GPT-3 was from GPT-2. It&#39;s also confirmation that we are still on the exponential development ramp. The foundation is set to change most of our day-to-day routines, including search (yes, Google), communication (the 2013 movie <i>Her</i> example isn&#39;t lost on anyone), education, etc. We hope you are as excited about today&#39;s update as we are! 
[<a class="link" href="http://twitter.com?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-unveils-gpt-4o-a-breakdown-of-this-year-s-biggest-day-in-ai" target="_blank" rel="noopener noreferrer nofollow">twitter.com</a>] Share this story <a class="link" href="mailto:?subject=%20.%20.%20OpenAI%20Launches%20GPT-4o%20And%20It%27s%20Much%20More%20Than%20Meets%20The%20Eye&body=%0A%0A%0AOpenAI%20Launches%20GPT-4o%20And%20It%27s%20Much%20More%20Than%20Meets%20The%20Eye%0AFirst%20impacted%3A%20developers%2C%20content%20creators%20%2F%2F%20Time%20to%20impact%3A%20Short%0A%0AOpenAI%20has%20launched%20GPT-4o%2C%20now%20available%20in%20ChatGPT%2C%20which%20integrates%20text%2C%20audio%2C%20and%20vision%20processing%20in%20real%20time.%20We%20recommend%20checking%20out%20the%20demo%20video.%20OpenAI%20also%20highlighted%20that%20the%20new%20models%20ranked%20as%20%27GPT2%27%20on%20the%20Lymsys%20chatbot%20leaderboards%2C%20which%20attracted%20a%20lot%20of%20attention%20and%20speculation.%20These%20models%20achieve%20a%20%2B100%20ELO%20rating%20over%20previous%20models%20(which%20is%20a%20scoring%20model%20originally%20used%20for%20Chess%20players)%2C%20representing%20a%2064%25%20win%20rate%20against%20the%20next%20closest%20models%20(which%20is%20a%20significant%20increase%20in%20win%20rate%20percentage%20of%2078%25).%20It%20accomplishes%20this%20in%20as%20little%20as%20232%20milliseconds%2C%20operating%20at%20twice%20the%20speed%20and%20half%20the%20cost%20of%20GPT-4%20Turbo.%20Here%20are%20some%20highlights%20we%20found%20particularly%20impressive%3A%0A%0AReasoning%3A%20This%20is%20something%20that%20the%20AI%20community%20has%20been%20anticipating%20eagerly.%20The%20ability%20to%20break%20down%20a%20problem%20into%20its%20subcomponents%20and%20plan%20on%20how%20to%20solve%20them%20is%20crucial.%20There%20is%20a%20significant%20increase%20in%20the%20model%27s%20%27logic%27%20capability%20which%20received%20little%20coverage%20to%20date.%0A%0ADesktop%20App
%3A%20OpenAI%20has%20shown%20they%20are%20masters%20of%20tech%20and%20distribution.%20The%20ability%20to%20distribute%20their%20future%20products%20and%20functionality%20to%20desktops%20will%20be%20an%20amazing%20channel%20because%20they%20won%E2%80%99t%20be%20restricted%20to%20a%20web%20browser%20or%20mobile%20anymore.%20Having%20the%20ability%20to%20watch%2C%20learn%2C%20and%20interact%20with%20a%20user%E2%80%99s%20workstation%20will%20be%20pivotal%20in%20expanding%20their%20value%20to%20users.%20%0A%0AMultimodality%3A%20The%20ability%20to%20see%20and%20hear%20is%20a%20huge%20leap%20forward%2C%20and%20more%20important%20than%20it%20initially%20seems.%20If%20it%20can%20see%20the%20world%2C%20it%20will%20soon%20be%20able%20to%20see%20your%20computer%20screen.%20Once%20it%20can%20do%20that%2C%20it%20will%20have%20the%20potential%20to%20help%20you%20interact%20with%20your%20daily%20work.%20What%20does%20that%20mean%3F%20We%20are%20much%20closer%20to%20a%20world%20where%20AI%20can%20manage%20the%20flow%20of%20information%20and%20productive%20work%20than%20meets%20the%20eye.%20%0A%0AOn%20Mobile%3A%20In%20every%20pocket%20and%20focusing%20on%20consumer-led%20distribution%20first%2C%20OpenAI%20has%20really%20dominated%20the%20consumer%20market%20in%20this%20respect.%20The%20more%20users%20they%20get%2C%20the%20better%20the%20intent%20data%20and%20the%20better%20the%20model%2C%20creating%20a%20flywheel%20effect%20that%20will%20serve%20them%20well%20in%20the%20years%20ahead.%0A%0ALatency%3A%20The%20final%20limiting%20factor%20to%20pulling%20all%20this%20off%20was%20it%20feeling%20natural.%20To%20do%20all%20of%20the%20above%20in%20a%20reasonable%20turnaround%20time%20is%20an%20engineering%20feat%20that%20will%20be%20underappreciated.%20The%20engineering%20and%20algorithmic%20optimization%20required%20to%20achieve%20this%20is%20nothing%20short%20of%20a%20marvel.%0A%0AWhat%20does%20this%20potentially%20change%3F%20Most%20industries.%20It%27s%20a%20bigger%20change%20than%20GPT-
4%20was%20from%20GPT-3%20and%20GPT-3%20was%20from%20GPT-2.%20It%27s%20also%20confirmation%20that%20we%20are%20still%20on%20the%20exponential%20development%20ramp.%20The%20foundation%20is%20set%20to%20change%20most%20of%20our%20day-to-day%20routines%2C%20including%20search%20(yes%2C%20Google)%2C%20communication%20(the%202013%20movie%20Her%20example%20isn%27t%20lost%20on%20anyone)%2C%20education%2C%20etc.%20We%20hope%20you%20are%20as%20excited%20about%20today%27s%20update%20as%20we%20are!%0A%0A%0A%20%5Btwitter.com%20https%3A%2F%2Fopenai.com%2Findex%2Fhello-gpt-4o%2F%5D" target="_blank" rel="noopener noreferrer nofollow">by email</a></p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="unitree-launches-g-1-humanoid-agent"><a class="link" href="https://twitter.com/UnitreeRobotics/status/1789931753974517820?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-unveils-gpt-4o-a-breakdown-of-this-year-s-biggest-day-in-ai" target="_blank" rel="noopener noreferrer nofollow">Unitree Launches G1 Humanoid Agent That Costs Less Than a Toyota Yaris</a></h3><p class="paragraph" style="text-align:start;"><i>First impacted:</i> Everyone!<br><i>Time to impact:</i> <b>Short</b></p><p class="paragraph" style="text-align:start;">Unitree has launched the G1 Humanoid Agent, a $16,000 AI-driven robot with up to 34 joints, designed for agile, sports-like movement, according to the company. The robot also features AI-controlled dexterous hands capable of precise operations, potentially broadening its application to automating physical labor. 
[<a class="link" href="https://twitter.com/UnitreeRobotics/status/1789931753974517820?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-unveils-gpt-4o-a-breakdown-of-this-year-s-biggest-day-in-ai" target="_blank" rel="noopener noreferrer nofollow">via @UnitreeRobotics</a>] Share this story <a class="link" href="mailto:?subject=%20.%20.%20Unitree%20Launches%20G1%20Humanoid%20Agent%20that%20Costs%20Less%20Than%20Toyota%20Yaris&body=%0A%0A%0AUnitree%20Launches%20G1%20Humanoid%20Agent%20that%20Costs%20Less%20Than%20Toyota%20Yaris%0AFirst%20impacted%3A%20sports%20coaches%2C%20physical%20laborers%20%2F%2F%20Time%20to%20impact%3A%20Short%0A%0AUnitree%20has%20launched%20the%20G1%20Humanoid%20Agent%2C%20a%20%2416%2C000%20AI-driven%20robot%20that%20boasts%20up%20to%2034%20joints%20and%20advanced%20joint%20movement%2C%20designed%20to%20enhance%20sports%20capabilities%2C%20according%20to%20the%20company.%20The%20robot%20features%20AI-controlled%20dexterous%20hands%20capable%20of%20precise%20operations%2C%20potentially%20broadening%20its%20application%20in%20automating%20physical%20labor.%0A%0A%20%5Bvia%20%40UnitreeRobotics%20https%3A%2F%2Ftwitter.com%2FUnitreeRobotics%2Fstatus%2F1789931753974517820%5D" target="_blank" rel="noopener noreferrer nofollow">by email</a></p><hr class="content_break"><h3 class="heading" style="text-align:start;" id="rumors-of-apple-and-open-ai-partner"><a class="link" href="https://www.forbes.com/sites/kateoflahertyuk/2024/05/13/apples-new-chatgpt-deal-heres-what-it-means-for-your-iphone/?sh=43ee010583ef&utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-unveils-gpt-4o-a-breakdown-of-this-year-s-biggest-day-in-ai" target="_blank" rel="noopener noreferrer nofollow">Rumors of an Apple and OpenAI Partnership Continue to Grow</a></h3><p class="paragraph" style="text-align:start;"><i>First impacted:</i> iPhone users, privacy-conscious consumers<br><i>Time to impact:</i> <b>Short</b></p><p class="paragraph" 
style="text-align:start;">Rumors persist that Apple is nearing a partnership with OpenAI to integrate ChatGPT technology into its upcoming iOS 18, with a focus on user privacy by potentially processing data on-device. The collaboration could introduce a new chatbot feature for iPhones while maintaining Apple&#39;s commitment to privacy. [<a class="link" href="http://twitter.com?utm_source=edit.headline.com&utm_medium=newsletter&utm_campaign=openai-unveils-gpt-4o-a-breakdown-of-this-year-s-biggest-day-in-ai" target="_blank" rel="noopener noreferrer nofollow">twitter.com</a>] Share this story <a class="link" href="mailto:?subject=%20.%20.%20Rumors%20of%20Apple%20and%20OpenAI%20partnership%20Continue%20to%20Grow&body=%0A%0A%0ARumors%20of%20Apple%20and%20OpenAI%20partnership%20Continue%20to%20Grow%0AFirst%20impacted%3A%20iPhone%20users%2C%20privacy-conscious%20consumers%20%2F%2F%20Time%20to%20impact%3A%20Short%0A%0ARumors%20are%20enduring%20that%20Apple%20is%20nearing%20a%20partnership%20with%20OpenAI%20to%20integrate%20ChatGPT%20technology%20into%20its%20upcoming%20iOS%2018%2C%20focusing%20on%20user%20privacy%20by%20potentially%20processing%20data%20on-device.%20This%20collaboration%20could%20introduce%20a%20new%20chatbot%20feature%20for%20iPhones%20while%20maintaining%20Apple%27s%20commitment%20to%20privacy.%0A%0A%20%5Btwitter.com%20https%3A%2F%2Fwww.forbes.com%2Fsites%2Fkateoflahertyuk%2F2024%2F05%2F13%2Fapples-new-chatgpt-deal-heres-what-it-means-for-your-iphone%2F%3Fsh%3D43ee010583ef%5D" target="_blank" rel="noopener noreferrer nofollow">by email</a></p></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=b701b196-85b4-4628-bece-cf21867a2844&utm_medium=post_rss&utm_source=headline_edit">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

  </channel>
</rss>
