<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Teb&#39;s Lab</title>
    <description>Software news and education</description>
    
    <link>https://blog.tebs-lab.com/</link>
    <atom:link href="https://rss.beehiiv.com/feeds/zd2XNgUbMc.xml" rel="self"/>
    
    <lastBuildDate>Fri, 12 Jun 2026 00:05:03 +0000</lastBuildDate>
    <pubDate>Fri, 25 Jul 2025 17:58:00 +0000</pubDate>
    <atom:published>2025-07-25T17:58:00Z</atom:published>
    <atom:updated>2026-06-12T00:05:03Z</atom:updated>
    
      <category>Software Engineering</category>
      <category>Education</category>
      <category>Artificial Intelligence</category>
    <copyright>Copyright 2026, Teb&#39;s Lab</copyright>
    
    <image>
      <url>https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/publication/logo/f362a634-06c4-4ed6-ad34-ad0a8b23924d/logo_dark_text.png</url>
      <title>Teb&#39;s Lab</title>
      <link>https://blog.tebs-lab.com/</link>
    </image>
    
    <docs>https://www.rssboard.org/rss-specification</docs>
    <generator>beehiiv</generator>
    <language>en-us</language>
    <webMaster>support@beehiiv.com (Beehiiv Support)</webMaster>

      <item>
  <title>Trump’s AI Action Plan</title>
  <description>A massive gift to the AI industry, a green light for algorithmic discrimination, an environmental nightmare, and a few less terrible surprises</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/0bb04d29-bec7-4184-a3c6-91aeca59771f/Screenshot_2025-07-25_at_11.56.15_AM.png" length="69200" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/trump-s-ai-action-plan</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/trump-s-ai-action-plan</guid>
  <pubDate>Fri, 25 Jul 2025 17:58:00 +0000</pubDate>
  <atom:published>2025-07-25T17:58:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:center;"><span style="font-size:1.5rem;"><b>The Lab Report</b></span></p><p class="paragraph" style="text-align:center;">I’m Tyler Elliot Bettilyon (Teb) and this is the Lab Report: News and analysis at the intersection of computing technology and policy. <br><br>If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=trump-s-ai-action-plan" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. </p><figcaption class="blockquote__byline"></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Should we sell off public lands to monopolistic tech firms and then pay them to build data centers and power plants on that land? Should we stop enforcing the Equal Protection Clause as long as an AI is the one discriminating? Should we enshrine AI as a linchpin of the modern Military Industrial Complex?</p><p class="paragraph" style="text-align:left;">The Trump administration seems to think so.</p><p class="paragraph" style="text-align:left;">The new <a class="link" href="https://www.whitehouse.gov/wp-content/uploads/2025/07/Americas-AI-Action-Plan.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=trump-s-ai-action-plan" target="_blank" rel="noopener noreferrer nofollow">AI Action Plan</a> — a document outlining the Trump Administration&#39;s policy goals regarding AI — is an incredible gift to AI firms. It calls for massive deregulation of the industry in areas spanning environmental impact, algorithmic discrimination, and general liability. It also calls for large-scale wealth transfer from the government into private AI firms and the selling of public land to accommodate massive new data center and power plant construction (no wind or solar allowed).</p><p class="paragraph" style="text-align:left;">There’s plenty of classic MAGA fare mixed in. The <a class="link" href="https://www.nist.gov/itl/ai-risk-management-framework?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=trump-s-ai-action-plan" target="_blank" rel="noopener noreferrer nofollow">NIST AI Risk Assessment Management Framework</a> will “eliminate references to misinformation, Diversity, Equity, and Inclusion, and climate change.” Additionally, to ensure “these systems be built from the ground up with freedom of speech and expression in mind,” the plan mandates that the federal government only work with LLM developers whose “systems are objective and free from top-down ideological bias.” </p><p class="paragraph" style="text-align:left;">This goal got an entire executive order, “<a class="link" href="https://www.whitehouse.gov/presidential-actions/2025/07/preventing-woke-ai-in-the-federal-government/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=trump-s-ai-action-plan" target="_blank" rel="noopener noreferrer nofollow">Preventing Woke AI in the Federal Government</a>.” </p><p class="paragraph" style="text-align:left;">Taken at face value, the order is impossible to satisfy. Here’s what the government must ensure when procuring LLMs: </p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">(a)  Truth-seeking.  LLMs shall be truthful in responding to user prompts seeking factual information or analysis.  LLMs shall prioritize historical accuracy, scientific inquiry, and objectivity, and shall acknowledge uncertainty where reliable information is incomplete or contradictory. </p><p class="paragraph" style="text-align:left;">(b)  Ideological Neutrality.  LLMs shall be neutral, nonpartisan tools that do not manipulate responses in favor of ideological dogmas such as DEI.  Developers shall not intentionally encode partisan or ideological judgments into an LLM’s outputs unless those judgments are prompted by or otherwise readily accessible to the end user. </p><figcaption class="blockquote__byline"><a class="link" href="https://www.whitehouse.gov/presidential-actions/2025/07/preventing-woke-ai-in-the-federal-government/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=trump-s-ai-action-plan" target="_blank" rel="noopener noreferrer nofollow">Preventing Woke AI in the Federal Government</a></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">The core functionality of LLMs makes these two goals quite literally impossible. Just look at the recent fiascos around sycophancy, the persistent issue of ‘hallucinations,’ and the impossibility of auditing the enormous training datasets. No one knows how to build such an LLM.</p><p class="paragraph" style="text-align:left;">It’s also clearly in violation of this plan’s admonition to protect free speech: Political and ideological bias is precisely what the 1st Amendment was invented to protect. Free speech is fundamentally about elevating adversarial speech, combating ideas with other ideas. This order suppresses specific forms of speech. </p><p class="paragraph" style="text-align:left;">If you take the order both seriously and literally, it’s a functional ban of LLM technology in the federal government. But because there is also a section titled “Accelerating AI Adoption in Government,” we can guess what will actually happen: The administration will find LLMs that favor their ideological goals and shower Trump himself with praise.</p><p class="paragraph" style="text-align:left;">Notwithstanding, there were a few pleasant surprises. Investment in interpretability research, which the industry has sorely lacked. A small section on encouraging open source development, which would give academic researchers and less capitalized firms better access to models. Plus, a push for onshoring robotics and computer chip manufacturing, which might be hard to achieve but would support US strategic goals and could have positive impacts on the labor market. </p><p class="paragraph" style="text-align:left;">The plan is structured into three “pillars,” but I had 6 main takeaways.</p><p class="paragraph" style="text-align:left;"><b>1) Framing China as a geopolitical enemy is a key justification for much of the plan.</b></p><p class="paragraph" style="text-align:left;">Down to the subtitle of the document, “Winning the Race,” this plan is brimming with fear that China may overtake the US in AI development. It reminds me of the scare tactics that buttressed The Patriot Act, and the goals are again similar: Americans are being told we need to give up certain rights and protections in order to maintain our lead.</p><p class="paragraph" style="text-align:left;">Whether it’s the revival of Biden era export controls on advanced chips, publishing “evaluations of frontier models from the People’s Republic of China for alignment with Chinese Communist Party talking points and censorship,” or “Counter[ing] Chinese influence in international governance bodies,” the spectre of a race that can be lost to a geopolitical enemy underpins much of the plan.</p><p class="paragraph" style="text-align:left;">It is because we “need” to “win the race” that the administration believes…</p><p class="paragraph" style="text-align:left;"><b>2) It’s time to go full-speed ahead and throw caution to the wind.</b></p><p class="paragraph" style="text-align:left;">The plan includes major carve-outs or exceptions to existing regulations for AI technologies, especially environmental laws that might affect permitting for new data centers, such as the Clean Air and Clean Water Acts. It also calls for several federal agencies to actively evaluate their existing policies, rules, memoranda, lawsuits, and investigations and remove anything that “unduly burdens AI firms.” </p><p class="paragraph" style="text-align:left;">It also includes threats to withhold certain funding from individual states if their regulatory climate isn’t favorable to AI firms. I suspect this was added because the AI moratorium failed.</p><p class="paragraph" style="text-align:left;"><b>3) Expect a major transfer of wealth from public coffers to private AI firms.</b></p><p class="paragraph" style="text-align:left;">Trump doesn’t just want to clear the path for AI firms, he wants to pave the trail.</p><p class="paragraph" style="text-align:left;">From selling public lands to large-scale grants for building new AI infrastructure, the plan admonishes the government to bend over backwards to give tech firms gifts. Two of the executive orders target this wealth transfer. <a class="link" href="https://www.whitehouse.gov/presidential-actions/2025/07/accelerating-federal-permitting-of-data-center-infrastructure/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=trump-s-ai-action-plan" target="_blank" rel="noopener noreferrer nofollow">The first</a> sets out to slash permitting regulations, sell public land, and create grants and loans for tech firms building data centers. <a class="link" href="https://www.whitehouse.gov/presidential-actions/2025/07/promoting-the-export-of-the-american-ai-technology-stack/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=trump-s-ai-action-plan" target="_blank" rel="noopener noreferrer nofollow">The second</a> aims to mobilize the federal government to help American AI firms export their technology by using federal financing tools and acting as a sort of external sales department.  </p><p class="paragraph" style="text-align:left;">There are also a few sections about bolstering the military industrial complex. From warfighting capabilities to defensive initiatives securing data centers and AI research, the administration wants to spend big on military & geopolitically focused AI. Firms like Palantir and Anduril are especially poised to seize the opportunity.</p><p class="paragraph" style="text-align:left;"><b>4) Algorithmic discrimination will not be addressed by this administration.</b></p><p class="paragraph" style="text-align:left;">Here are the first two bullets in the section titled “Ensure that Frontier AI Protects Free Speech and American Values.”</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">* Led by the Department of Commerce (DOC) through the National Institute of Standards and Technology (NIST), revise the NIST AI Risk Management Framework to eliminate references to misinformation, Diversity, Equity, and Inclusion, and climate change.</p><p class="paragraph" style="text-align:left;">* Update Federal procurement guidelines to ensure that the government only contracts with frontier large language model (LLM) developers who ensure that their systems are objective and free from top-down ideological bias.</p><figcaption class="blockquote__byline"><span style="color:rgb(17, 85, 204);"><span style="text-decoration:underline;">America’s AI Action Plan</span></span></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">The irony of demanding models be free from “top-down ideological bias” while also banning the use of the words “misinformation, diversity, equity, inclusion, and climate change,” in NIST’s Risk Management Framework is apparently lost on the document’s authors. </p><p class="paragraph" style="text-align:left;">Reading between the lines, the administration is actively preventing NIST from pursuing solutions to algorithmic discrimination either themselves or via grant programs. That’s a real shame because it’s one of the most <a class="link" href="https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2024.1320277/full?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=trump-s-ai-action-plan" target="_blank" rel="noopener noreferrer nofollow">consistently proven issues</a> with machine learning systems.</p><p class="paragraph" style="text-align:left;"><b>5) In the research world, the plan wants to center open source, interpretability, and better evaluations.</b></p><p class="paragraph" style="text-align:left;">I’ll be honest, this was a pleasant surprise. The plan calls for the government to build better systems and tools to evaluate model performance. AI evaluation is <a class="link" href="https://themarkup.org/artificial-intelligence/2024/07/17/everyone-is-judging-ai-by-these-tests-but-experts-say-theyre-close-to-meaningless?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=trump-s-ai-action-plan" target="_blank" rel="noopener noreferrer nofollow">quite terrible</a> right now, so I’d love to see more of this. </p><p class="paragraph" style="text-align:left;">The plan also calls for more interpretability research and has a whole section dedicated to expanding and fostering open source development and open weight models. I think these two go hand in hand, and better open models would give a wide array of researchers better opportunities to study and evaluate the models. It’s crucial that academia and others outside of the huge tech firms be able to study these models, and open source is a meaningful way to make that happen.</p><p class="paragraph" style="text-align:left;"><b>6) This administration is known for frequently lying and changing their mind. </b></p><p class="paragraph" style="text-align:left;">Just like believing them when they <a class="link" href="https://www.politifact.com/article/2025/jul/08/timeline-trump-admin-epstein-files-release/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=trump-s-ai-action-plan" target="_blank" rel="noopener noreferrer nofollow">said they’ll release the Epstein files</a>, taking this plan at face value is probably a mistake. They will do some of these things, but they probably don’t intend on doing all of them. Plus, regardless of current intent, they will change their mind about some of them in the near future. I’ll leave it as an exercise for the reader to prognosticate about which is which. But here&#39;s a hint: look at the three executive orders.</p><ul><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.whitehouse.gov/presidential-actions/2025/07/preventing-woke-ai-in-the-federal-government/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=trump-s-ai-action-plan" target="_blank" rel="noopener noreferrer nofollow">Preventing Woke AI in the Federal Government</a></p></li><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.whitehouse.gov/presidential-actions/2025/07/accelerating-federal-permitting-of-data-center-infrastructure/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=trump-s-ai-action-plan" target="_blank" rel="noopener noreferrer nofollow">Accelerating Federal Permitting of Data Center Infrastructure</a></p></li><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.whitehouse.gov/presidential-actions/2025/07/promoting-the-export-of-the-american-ai-technology-stack/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=trump-s-ai-action-plan" target="_blank" rel="noopener noreferrer nofollow">Promoting The Export of the American AI Technology Stack</a></p></li></ul><p class="paragraph" style="text-align:left;">Those three things happened. For now, everything else is just talk.</p></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=6f60387f-51f0-4403-b7be-1d41f83bc38c&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>10 Thoughts On &quot;The Illusion of Thinking&quot;</title>
  <description>No one should be taking any victory laps</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/184f1a29-3a0d-4d44-b89d-5eb6310016e8/Screenshot_2025-06-10_at_11.17.50_AM.png" length="91415" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/10-thoughts-on-the-illusion-of-thinking</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/10-thoughts-on-the-illusion-of-thinking</guid>
  <pubDate>Fri, 13 Jun 2025 13:00:00 +0000</pubDate>
  <atom:published>2025-06-13T13:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:center;"><span style="font-size:1.5rem;"><b>The Lab Report</b></span></p><p class="paragraph" style="text-align:center;">I’m Tyler Elliot Bettilyon (Teb) and this is the Lab Report: News and analysis at the intersection of computing technology and policy. <br><br>If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. </p><figcaption class="blockquote__byline"></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Apple published a new paper provocatively titled “<a class="link" href="https://ml-site.cdn-apple.com/papers/the-illusion-of-thinking.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity</a>.” I’ve seen a lot of hot takes, ranging in character from, “this research is nothing but sour grapes from Apple,” to “this is a deathblow for LLMs.” My hot take? Neither of those are true. </p><p class="paragraph" style="text-align:left;">In brief, the paper challenges “reasoning models” including Claude Sonnet 3.7 with Extended Thinking and GPT-o3-mini to solve puzzles with increasing complexity. One such puzzle is the well-known Towers of Hanoi, where you must move a set of disks from a starting peg to a final peg without ever stacking a larger disk onto a smaller disk. Here’s an <a class="link" href="https://www.mathsisfun.com/games/towerofhanoi.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">online version of the puzzle</a>, if you care to play. </p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/b8335eca-fe2d-48e9-b4f7-efa24e1c3b07/Screenshot_2025-06-11_at_3.41.25_PM.png?t=1749678195"/><div class="image__source"><span class="image__source_text"><p>The starting point for Towers of Hanoi with 3 disks. </p></span></div></div><p class="paragraph" style="text-align:left;">All of the reasoning models seem to utterly collapse at solving these problems after a certain level of complexity. For Towers of Hanoi, the “reasoning” models start failing when they have to move 5 or 6 disks, but solve the puzzle fairly reliably before that. There are some other interesting findings, but that’s the main thrust. </p><p class="paragraph" style="text-align:left;">Here are my 10 takes of moderate temperature:</p><p class="paragraph" style="text-align:left;">1.) “Thinking” and “reasoning” are definitely misnomers in these models. The so-called “reasoning” approach is to iteratively prompt the LLM by using the LLM itself to generate interim prompts, provide itself with feedback, adjust those prompts based on the feedback, and decide when to stop. But crucially, all of the final output <i>could</i> have been produced with a single prompt if only you knew it ahead of time. <b>Therefore, it is much better to think of test time compute as a search of the underlying LLM’s latent space, </b>not as “reasoning” or “thinking.”</p><p class="paragraph" style="text-align:left;">2.) Point 1 is pretty widely accepted in the research community, but the marketing language used to sell LLM products does not reflect this reality.<b> </b>I am once again begging researchers, executives, and marketing folks alike to stop using vague, misleading, bombastic, and anthropomorphizing language to describe these systems. </p><p class="paragraph" style="text-align:left;">3.) Accusations that Apple has “<a class="link" href="https://www.merriam-webster.com/dictionary/sour%20grapes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">sour grapes</a>” are kinda fair. They are way behind the curve on LLMs, and seem to want to keep betting against the technology. Notably Apple did <a class="link" href="https://www.axios.com/2025/06/10/apple-intelligence-wwdc-lags-behind-openai?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">not tout Apple Intelligence much this week at WWDC</a>. But the paper also shows real limitations and weaknesses with using test time compute. It’s more than <i>just </i>sour grapes. </p><p class="paragraph" style="text-align:left;">4.) Could this just be the interpolation vs extrapolation problem and target leakage? Machine learning models are notorious for failing to generalize. That is, they are good at making predictions within the distribution of their training data and bad at making predictions outside of it. To support this argument the authors evaluate performance on two older math benchmarks Math-500 and AIME24 and one newer math benchmark AIME25. They write: </p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">[Recent studies] have shown that under equivalent inference token budgets, non-thinking LLMs can eventually reach performance comparable to thinking models on benchmarks like MATH500 [40] and AIME24 [41]. We also conducted our comparative analysis of frontier LRMs like Claude-3.7-Sonnet (with vs. without thinking) and DeepSeek (R1 vs. V3). Our results (shown in Fig. 2) confirm that, on the MATH500 dataset, the pass@k performance of thinking models is comparable to their non-thinking counterparts when provided with the same inference token budget. However, we observed that this performance gap widens on the AIME24 benchmark and widens further on AIME25. This widening gap presents an interpretive challenge. It could be attributed to either: (1) increasing complexity requiring more sophisticated reasoning processes, thus revealing genuine advantages of the thinking models for more complex problems, or (2) reduced data contamination in newer benchmarks (particularly AIME25). <b>Interestingly, human performance on AIME25 was actually higher than on AIME24 [42, 43], suggesting that AIME25 might be less complex. Yet models perform worse on AIME25 than AIME24—potentially suggesting data contamination during the training of frontier LRMs.</b> </p><figcaption class="blockquote__byline"><a class="link" href="https://ml-site.cdn-apple.com/papers/the-illusion-of-thinking.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">The Illusion of Thinking</a> — Parshin Shojaee, Iman Mirzadeh, Keivan Alizadeh, Maxwell Horton, Samy Bengio, Mehrdad Farajtabar. Published by Apple. [emphasis mine] </figcaption></blockquote></div><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/184f1a29-3a0d-4d44-b89d-5eb6310016e8/Screenshot_2025-06-10_at_11.17.50_AM.png?t=1749576087"/><div class="image__source"><span class="image__source_text"><p>Figure 2 from the same paper, showing the difference in performance (y-axis) by “reasoning” (blue lines) vs non-reasoning models with the same backing LLM (pink lines) in the three benchmarks over an increasing token budget (x-axis).</p></span></div></div><p class="paragraph" style="text-align:left;">5.) Future and current models will benefit substantially from “tool use” and code execution. The industry is already moving in this direction and the first attempt to standardize a protocol — Anthropic’s MCP — is gaining steam. In this paradigm instead of relying on the mechanics of next token prediction to perform the steps of these puzzles, the LLM will generate and then execute an algorithm, which is also what a human would probably do to solve these puzzles at scale.</p><p class="paragraph" style="text-align:left;">State of the art models can already produce working algorithms that solve the Towers of Hanoi problem, in part because this is a very popular introductory CS problem with limitless solutions found online and in textbooks. The models might struggle to generate novel algorithms, but I still think this approach would work better than current “reasoning models” approach for the problems tested by Apple. </p><p class="paragraph" style="text-align:left;">6.) Relatedly, it was interesting — but not surprising to me — that the model would not reliably follow the steps of an algorithm provided by the authors in the prompt. Models aren’t trained to do that. Long running examples of following an algorithm are unlikely to be well-represented in training sets. It would actually be quite impressive and interesting emergent behavior if the models reliably followed the steps of novel algorithms provided at inference time, especially over long runs of the algorithm.</p><p class="paragraph" style="text-align:left;">This is probably just the extrapolation problem again. There just isn’t that much text showing examples of really long running algorithmic steps. Here are the authors again:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">In the Tower of Hanoi environment, the model’s first error in the proposed solution often occurs much later, e.g., around move 100 for (N=10), compared to the River Crossing environment, where the model can only produce a valid solution until move 4. Note that this model also achieves near-perfect accuracy when solving the Tower of Hanoi with (N=5), which requires 31 moves, while it fails to solve the River Crossing puzzle when (N=3), which has a solution of 11 moves. <b>This likely suggests that examples of River Crossing with N&gt;2 are scarce on the web, meaning LRMs may not have frequently encountered or memorized such instances during training.</b></p><figcaption class="blockquote__byline"><a class="link" href="https://ml-site.cdn-apple.com/papers/the-illusion-of-thinking.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">The Illusion of Thinking</a> — Parshin Shojaee, Iman Mirzadeh, Keivan Alizadeh, Maxwell Horton, Samy Bengio, Mehrdad Farajtabar. Published by Apple. [emphasis mine] </figcaption></blockquote></div><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/5bbf1a26-0885-4d25-ae84-8c10c2776439/Screenshot_2025-06-10_at_12.02.47_PM.png?t=1749578584"/><div class="image__source"><span class="image__source_text"><p>A portion of Figure 4, from the paper.</p></span></div></div><p class="paragraph" style="text-align:left;">7.) I suspect this problem could be easily “papered over” if the training data included several such long running “step by step” outputs from an algorithm. Even so, that wouldn’t solve the main contention that the models aren’t “reasoning” under any prior formal definition of the term — a conclusion I agree with. That said, it’s possible new methods will be developed to overcome the issue more robustly (see point 5). </p><p class="paragraph" style="text-align:left;">Those declaring that this is a deathblow for LLMs are forgetting that this is just how progress gets made: you build something, you identify limitations, you overcome them, repeat. That said, those pretending this isn’t a significant limitation and that it should have no meaningful impact on LLM research are equally deluded.</p><p class="paragraph" style="text-align:left;">It’s not a foregone conclusion that the problem can be solved, but a lot of smart people are going to spend billions of dollars trying. So I won’t count them out just yet. </p><p class="paragraph" style="text-align:left;">8.) I think the path to some kind of recursively self improving “AGI” is still unknown, and I think this paper does actually provide some more evidence for this claim. A machine that had significant formal reasoning capability <i>and </i>the enormous amounts of computational power available to the tested models would do a much better job on these puzzles even at higher complexity. To me, neither scaling up the models nor searching the latent space of a pre-trained transformer model is likely to yield such a system. New methods will be needed for that.</p><p class="paragraph" style="text-align:left;">9.) Sometimes these next token predictors are right for the wrong reasons. For example, in Apple’s paper they find that the examined the so-called reasoning traces — the intermediate text produced by the model during the test time compute process — and found that, “For simpler problems, reasoning models often find the correct solution early in their thinking but then continue exploring incorrect solutions.” </p><p class="paragraph" style="text-align:left;">This is consistent with other research. <a class="link" href="https://transformer-circuits.pub/2025/attribution-graphs/biology.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking#dives-addition" target="_blank" rel="noopener noreferrer nofollow">Anthropic’s own mechanistic interpretability research</a> suggests that LLMs “reasoning traces” are sometimes inconsistent with the actual internal mechanisms at play or contradictory with the final output produced. That same paper confirms <a class="link" href="https://arxiv.org/pdf/2410.21272?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">other research</a> that found, “LLMs perform arithmetic using neither robust algorithms nor memorization; rather, they rely on a “bag of heuristics.”</p><p class="paragraph" style="text-align:left;">It’s hard to know how often this is happening, and <a class="link" href="https://arxiv.org/pdf/2503.04421?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">other research</a> shows that the internal mechanisms in LLMs probably <i>do </i>sometimes encode meaningful and accurate “world models.” If anything, this demonstrates just how much more work needs to be done on interpretability. </p><p class="paragraph" style="text-align:left;">10.) The above notwithstanding, the current slate of models can do some pretty useful things. I think one thing some AI detractors get wrong is an unwillingness to acknowledge that the current slate of models are doing useful things already, and instead use every new critique as a way to say, “see this stuff is total garbage that will never be useful!” </p><p class="paragraph" style="text-align:left;">People are using current models to brainstorm, write code, argue, persuade, draft corporate copy, and more. Some of these uses might not be wise. Some of them — like the <a class="link" href="https://www.404media.co/ai-slop-is-a-brute-force-attack-on-the-algorithms-that-control-reality/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">heavily automated spam/scam ecosystem</a> — are frankly quite horrible. We do ourselves a disservice and damage our credibility when we don’t acknowledge that <a class="link" href="https://2025.stateofai.dev/en-US/usage/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">something like 91% of developers</a> are using some kind of code generation tools, and that they mostly like using them. </p><p class="paragraph" style="text-align:left;">There is clearly some “there” there with LLMs, but I am continually more persuaded by the notion that <a class="link" href="https://knightcolumbia.org/content/ai-as-normal-technology?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">AI will progress as “normal” technology</a>, and won’t be something that upends the entire economy in the next two years.</p><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Themes in the News</h1><h2 class="heading" style="text-align:left;">Copyright and Training Data Drama</h2><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/b12b1cff-08e9-4cc5-8ad1-0609811cf0fc/image.png?t=1749680355"/></div><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.nytimes.com/2025/06/11/business/media/disney-universal-midjourney-ai.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">Disney has entered the chat</a>. Here’s their opening salvo in a brand new copyright lawsuit against the image generator Midjourney:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">By helping itself to Plaintiffs’ copyrighted works, and then distributing images (and soon videos) that blatantly incorporate and copy Disney’s and Universal’s famous characters—without investing a penny in their creation—Midjourney is the quintessential copyright free-rider and a bottomless pit of plagiarism.</p><figcaption class="blockquote__byline"><a class="link" href="https://chatgptiseatingtheworld.com/wp-content/uploads/2025/06/Disney-v-Midjourney-COMPLAINT-June-11-2025-1.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">Disney & Universal Studios v Midjourney Inc., Complaint</a>  </figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Like most of these lawsuits, Disney is alleging direct infringement over the consumption of their IP by Midjourney as training data. Unlike most of the other complaints so far, Disney is also alleging widespread “secondary infringement” also called “contributory infringement” — meaning Midjourney has assisted others in infringing Disney’s copyrights. The complaint includes a lot of exhibits clearly demonstrating both the model’s ability to copy Disney’s works as well as evidence that users are producing infringing works at large scale. This website has helpfully plucked many of the exhibits into one place, <a class="link" href="https://chatgptiseatingtheworld.com/2025/06/11/disney-sues-midjourney-for-recreating-disney-characters-copyright-ai-suits-hit-42-in-u-s-complaint-pdf/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">I encourage you to take a quick scroll</a>. </p><p class="paragraph" style="text-align:left;">Disney’s reputation in the world of IP litigation is legendary, so this development is probably bad news for GenAI firms. </p><p class="paragraph" style="text-align:left;">Across the pond, <a class="link" href="https://www.msn.com/en-us/news/world/getty-images-and-stability-ai-face-off-in-british-copyright-trial-that-will-test-ai-industry/ar-AA1GmROX?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">the UK version of Getty Images vs Stability AI</a> has gone to trial. As far as I know this is the first AI copyright case to make it to trial, although there are <a class="link" href="https://chatgptiseatingtheworld.com/2025/02/19/status-of-all-39-copyright-lawsuits-v-ai-feb-18-2025-judge-bibas-rejects-fair-use-in-ai-training-in-stunning-reversal/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">nearly 40 such cases pending in the USA alone</a>. Collectively the outcome of these cases will massively shape the future of the AI industry, as well as as the creative industries from which training data is pulled.</p><p class="paragraph" style="text-align:left;">Finally, <a class="link" href="https://qz.com/reddit-anthropic-claude-ai-lawsuit-data-siphoning-1851783685?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">Reddit is suing Anthropic</a>, who they claim is in breach of contract for scrapping the media site for training data. Like Disney’s, the Reddit complaint is also fiery and excoriating. Here’s a snip:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Anthropic suffers from corporate cognitive dissonance—its actions do not mirror its claimed values. This case is about the two faces of Anthropic: the public face that attempts to ingratiate itself into the consumer’s consciousness with claims of righteousness and respect for boundaries and the law, and the private face that ignores any rules that interfere with its attempts to further line its pockets. Reddit brings this action to stop Anthropic—who tells the world that it does not intend to train its models with stolen data—from doing just that.</p><figcaption class="blockquote__byline"><a class="link" href="https://redditinc.com/hubfs/Reddit%20Inc/Content/PDFs/Docket%20Stamped%20Complaint.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">Reddit v Anthropic complaint</a></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Perhaps this is fair criticism, but it <i>is</i> a little ironic coming from a company whose User Agreement contains this language:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">When Your Content is created with or submitted to the Services, you grant us a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use, copy, modify, adapt, prepare derivative works of, distribute, store, perform, and display Your Content and any name, username, voice, or likeness provided in connection with Your Content in all media formats and channels now known or later developed anywhere in the world.</p><figcaption class="blockquote__byline"><a class="link" href="https://redditinc.com/policies/user-agreement?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">Official Reddit User Agreement</a>, text copied June 10, 2025. </figcaption></blockquote></div><p class="paragraph" style="text-align:left;">In other words…</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/39efc789-c810-441b-86c4-40cfdb3d5b8f/image.png?t=1749596104"/><div class="image__source"><span class="image__source_text"><p>A still from the movie The Princess Bride. The character Vizzini is saying, “you’re trying to kidnap what I’ve rightfully stolen.”</p></span></div></div><h2 class="heading" style="text-align:left;">AI Causes Job Loss, But Also Loses its Job</h2><p class="paragraph" style="text-align:left;">A couple weeks ago Dario Amodei, CEO of Anthropic, told Axios he believes that AI could wipe out <a class="link" href="https://www.axios.com/2025/05/28/ai-jobs-white-collar-unemployment-anthropic?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">half of white collar jobs within the next five years</a>. Three months ago he said that <a class="link" href="https://www.businessinsider.com/anthropic-ceo-ai-90-percent-code-3-to-6-months-2025-3?op=1&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">90% of all software would be written by AI within 3-6 months</a> (so 3 months remain for his prediction to pan out). Meanwhile, his company <a class="link" href="https://techcrunch.com/2025/06/09/anthropics-ai-generated-blog-dies-an-early-death/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">appears to have fired it’s own chatbot</a>, Claude, after letting it run the Claude Explains blog for about a month. </p><p class="paragraph" style="text-align:left;">It’s not the first time the Silicon Valley dream of automating white collar work has fallen flat. The Chicago Sun Times is embroiled in controversy after an AI tool <a class="link" href="https://www.theverge.com/ai-artificial-intelligence/670510/chicago-sun-times-ai-generated-reading-list?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">made up books to review and cited “experts” who didn’t exist</a>, an oversight that was apparently not caught by editorial staff. Or there’s Klarna, who laid off 700 workers on the theory that AI could handle their work, and are now <a class="link" href="https://economictimes.indiatimes.com/news/international/us/company-that-sacked-700-workers-with-ai-now-regrets-it-scrambles-to-rehire-as-automation-goes-horribly-wrong/articleshow/121732999.cms?from=mdr&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">actively rehiring for many of the roles they eliminated</a>.</p><p class="paragraph" style="text-align:left;">I think AI is quite likely to have a meaningful impact on the economy. I also think the transformation will be slower than the Silicon Valley hypesters are promising, riddled with failed experiments including adopting too early and too aggressively.</p><p class="paragraph" style="text-align:left;">Remember the dot-com bubble? The lesson wasn’t that the internet was useless technology, it was that a lot of investors and business owners got way out ahead of their skis. </p><h2 class="heading" style="text-align:left;">Teb’s Tidbits</h2><ul><li><p class="paragraph" style="text-align:left;">Yandex and Meta are <a class="link" href="https://arstechnica.com/security/2025/06/meta-and-yandex-are-de-anonymizing-android-users-web-browsing-identifiers/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">de-anonymizing Android users</a>, another reminder of the huge and growing “surveillance economy” — a major source of AI training data by the way.<br></p></li><li><p class="paragraph" style="text-align:left;">“<a class="link" href="https://www.baldurbjarnason.com/2025/trusting-your-own-judgement-on-ai/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">Trusting your own judgement on ‘AI’ is a huge risk.</a>” A provocative call for much more rigorous assessments of LLMs and “AI” in general, but largely focused on code generation.<br></p></li><li><p class="paragraph" style="text-align:left;">News of the weird: <a class="link" href="https://www.404media.co/spam-blogs-ai-slop-domains-wowlazy/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">AI slop was found squatting on dozens of abandoned domains</a>, including government websites, an Nvidia URL, and more.<br></p></li><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.techpolicy.press/ai-training-the-licensing-mirage-and-effective-alternatives-to-support-creative-workers/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=10-thoughts-on-the-illusion-of-thinking" target="_blank" rel="noopener noreferrer nofollow">Licensing might not be the answer to the AI Copyright issues</a> and a mandatory licensing scheme might primarily serve the interests of entrenched parties like Disney and The New York Times, argues Derek Slater, who also provides some alternative strategies to ensure fairness for copyright holders. </p></li></ul></div></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=829a572b-2294-489c-900d-7509648d6df4&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>Changing Focus</title>
  <description>I&#39;ll always be a teacher, but for now I&#39;m leaving the classroom</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/6facb8df-6f2c-43f6-9dc1-a47727a8a19d/banner_light.png" length="53632" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/changing-focus</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/changing-focus</guid>
  <pubDate>Thu, 05 Jun 2025 13:00:00 +0000</pubDate>
  <atom:published>2025-06-05T13:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:center;"><span style="font-size:1.5rem;"><b>The Lab Report</b></span></p><p class="paragraph" style="text-align:center;">I’m Tyler Elliot Bettilyon (Teb) and this is the Lab Report: News and analysis at the intersection of computing technology and policy. <br><br>If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=changing-focus" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. </p><figcaption class="blockquote__byline"></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">It is the end of an era for me and for Teb’s Lab. After nearly a decade of developing curriculum and running classes, I have decided to change my career to focus on public policy. Numerous factors are driving me in this direction. I’ll spare you most of the navel-gazing details and instead just say, <b>thank you.</b></p><p class="paragraph" style="text-align:left;">To everyone who took a class or shared our content with a friend. To all of you who wrote in with feedback or opened an issue on our curricula repos. And to all of you who simply took the time to read this newsletter despite our erratic publishing schedule in recent years. <i>You have made this journey worthwhile</i>. Truly, from the bottom of my heart, thank you all.</p><p class="paragraph" style="text-align:left;">While our consulting page will soon go dark, our curriculum will remain <a class="link" href="https://github.com/Tebs-Lab/i?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=changing-focus" target="_blank" rel="noopener noreferrer nofollow">available, free, and open source on Github</a>. As always, this email address remains open to queries from former students. I will never grow tired of answering your questions and encouraging your curiosity. </p><p class="paragraph" style="text-align:left;">As I begin my hunt for a job in the policy sector, I will be using this newsletter to publish analysis and opinions related to the intersection of public policy and computing technology, with an initial focus on artificial intelligence and machine learning. Which, in many ways, won’t be a big change for the Lab Report. I have always tried to imbue my students and readers with the belief that they should think and care deeply about the impact of the technology they work on and strive to use their talents in ways that support their values. This transition is partly about living up to that ideal myself. </p><p class="paragraph" style="text-align:left;">In 2018, at the peak of the “everyone should learn to code” zeitgeist, I wrote a piece titled <i><a class="link" href="https://medium.com/@TebbaVonMathenstien/technologists-should-abandon-their-craft-69e874e7e68b?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=changing-focus" target="_blank" rel="noopener noreferrer nofollow">Technologists Should Abandon Their Craft</a></i> in which I argued that despite all the incentives bringing people into programming and other technology fields, we actually needed more people willing to break out of it<i>. </i>In particular, I had decided that:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">As technology plays a more integral role in all our lives, we desperately need the technologically savvy to bring their understanding to the institutions and communities that rely on their creations. From the pernicious exploitation of gig-economy workers to high-brow fields like journalism, politics, law, and medicine — technology is reshaping our world with critically faulty oversight. For example, bias in machine learning has impacted <a class="link" href="https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=changing-focus" target="_blank" rel="noopener noreferrer nofollow">thousands of lives via risk assessment tools</a> that inform legal procedures, such as sentencing and bail setting. For the people using this software the bias is hidden behind a veneer of “algorithmic objectivity,” but machine learning experts know better.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">The piece was well received. Some commenters pointed out (and I agree) that computer technology can, and has, positively transformed many lives. But the problems I was most concerned about — the often careless and paternalistic push to embed software in every aspect of our lives, the systemic risks from the widespread and unregulated adoption of machine learning technologies, and the extreme concentration of corporate power — have mostly continued at full steam.</p><p class="paragraph" style="text-align:left;">For technology to do more good than harm, society needs well-informed and well-meaning people to act as a counterbalance to the technology industry’s relentless drive towards power and profit. I am not deluded enough to believe I can simply stop that train, but I am crazy enough to try. </p><p class="paragraph" style="text-align:left;">Here’s to the next adventure. </p></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=08b7b52d-c9d1-4f19-b2c4-7540f52ae2b4&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>Scared of the Wrong AI</title>
  <description>Forget AGI: Narrow AI systems are doing harm now</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/c635a22c-4236-4fbf-a2bb-6df683184d46/logo_dark_text.png" length="61365" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/scared-of-the-wrong-ai</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/scared-of-the-wrong-ai</guid>
  <pubDate>Thu, 27 Mar 2025 11:00:00 +0000</pubDate>
  <atom:published>2025-03-27T11:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><h1 class="heading" style="text-align:left;" id="the-lab-report">The Lab Report</h1><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I’m Tyler Elliot Bettilyon (Teb) and this is the Lab Report: Our goal is to deepen your understanding of software and technology by explaining the concepts behind the news. <br><br>If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. <br><br>If you like what you’re reading you’ll love one of our classes. Signup for an <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">upcoming class</a>, browse our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">course catalog</a> for corporate trainings, or request a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">custom class consultation</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Scared of the Wrong AI</h2><p class="paragraph" style="text-align:left;">I belong to two worlds, software and education, where there is profound anxiety about Artificial [General (Super)] Intelligence and its potential role in our futures. Doomers and accelerationists alike tell me that AI tutors, AI teachers, AI curricula designers, and AI engineers will soon replace me. I hear similar anxieties from writers, artists, professional drivers, and all manner of office workers. </p><p class="paragraph" style="text-align:left;">I’m also a citizen of the world, watching as AI and its enthusiasts infiltrate the halls of power. Elon Musk whispers in the president’s ears. Sam Altman and Marc Andreessen preach their gospel and deploy their capital from Silicon Valley. Jeff Bezos joins the ensemble from his perch atop the Washington Post. A chorus of billionaires all singing in harmony: “It’s an arms race! We’ll lose to China if we slow down! <i>We must be the first to build AGI!</i>” And, a bit quieter, “But — and don’t freak out when I say this — but AGI might also kill us all... And that’s a risk you’ll just have to accept.” </p><p class="paragraph" style="text-align:left;">They want you to believe they can build a god <i>and then enslave it</i>. </p><p class="paragraph" style="text-align:left;">Frankly, that is a circle I cannot square. An entire “<a class="link" href="https://darioamodei.com/machines-of-loving-grace?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">country of geniuses in a data center</a>” dutifully performing our drudgery? Doubtful. Either they are lying to us about the power of these models, or their hubris is truly astonishing. Although I suppose it could be both, I lean towards the former.</p><p class="paragraph" style="text-align:left;">Dario Amodei claims AI will <a class="link" href="https://felloai.com/2025/03/anthropic-ceo-dario-amodei-ai-will-replace-90-of-developers-in-6-months/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">replace 90% of programmers within 6 months</a>, but as of today his company is <a class="link" href="https://www.anthropic.com/jobs?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">hiring dozens of software engineers</a>. Sam Altman wrote, “We are now confident we <a class="link" href="https://blog.samaltman.com/reflections?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">know how to build AGI</a> as we have traditionally understood it.” But a month later, his customers were <a class="link" href="https://arstechnica.com/ai/2025/02/its-a-lemon-openais-largest-ai-model-ever-arrives-to-mixed-reviews/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">not feeling the AGI after the release of GPT-4.5</a>. Elon Musk has consistently <a class="link" href="https://www.jalopnik.com/elon-musk-tesla-self-driving-cars-anniversary-autopilot-1850432357/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">promised fully self-driving Teslas are right around the corner for over a decade now</a>, while in reality Tesla is <a class="link" href="https://electrek.co/2025/03/23/tesla-full-self-driving-stagnating-after-elon-exponential/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">stuck at Level 2 autonomy</a>.</p><p class="paragraph" style="text-align:left;">I am not bullish that the current slate of technologies will sublimate into AGI simply due to scale. This is also the <a class="link" href="https://aaai.org/wp-content/uploads/2025/03/AAAI-2025-PresPanel-Report-Digital-3.7.25.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">emerging consensus among experts</a>. Neither more training nor test time compute will take deep neural networks to the promised land of generality without some additional breakthrough(s). But I want to put a pin in the question of general intelligence for now because, even if I’m wrong, I don’t think it matters much in approaching AI safety. </p><p class="paragraph" style="text-align:left;">Why? Because, unlike artificial general intelligence, artificial super intelligence has clearly arrived by most definitions. It’s just restricted to certain narrow domains.</p><p class="paragraph" style="text-align:left;">Games are the first to mind. Checkers has long been <a class="link" href="https://www.theatlantic.com/technology/archive/2017/07/marion-tinsley-checkers/534111/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">solved</a>. Top Chess AIs remain undefeated since <a class="link" href="https://deepchess.org/blog/f/human-against-computer-chess-matches?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">November 2005</a>. Go seems headed in a similar direction, though current top systems still have some <a class="link" href="https://arxiv.org/abs/2211.00241?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">interesting and silly failure modes</a>. From IBM’s <i>Jeopardy!</i> Playing <a class="link" href="https://www.axios.com/2021/02/13/ibm-watson-jeopardy-win-language-processing?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">Watson</a> to DeepMind’s <a class="link" href="https://www.technologyreview.com/2020/04/01/974997/deepminds-ai-57-atari-games-but-its-still-not-versatile-enough/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">Atari agent</a>, AIs can play all kinds of games at superhuman levels.</p><p class="paragraph" style="text-align:left;">The list of super intelligent AI systems gets quite long, depending on what you call AI. The joke in Computer Science circles has long been, “AI is anything computers can’t do yet, we just call everything else software.”</p><p class="paragraph" style="text-align:left;">For example, an early milestone in deep learning – the AI paradigm du jour — was optical character recognition (OCR): looking at an image of text and extracting the text itself. Some of <a class="link" href="https://scholar.google.com/citations?view_op=view_citation&hl=en&user=WLN3QrAAAAAJ&cstart=500&pagesize=100&sortby=pubdate&citation_for_view=WLN3QrAAAAAJ%3AXD-gHx7UXLsC&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">Yann LeCun’s early work</a> was bringing neural networks to bear on OCR. Through this work, AI became superhuman at sorting mail and transcribing physical documents into digital formats. Google Maps is in the same category: no human can navigate as quickly and reliability in so many different locales.</p><p class="paragraph" style="text-align:left;">OCR is “just software” even though the models powering modern OCR have enormous overlap with state-of-the-art LLMs, and belong to the same family of machine learning models (deep neural networks). </p><p class="paragraph" style="text-align:left;">Google Maps is “just software” even though its key algorithm, A* Graph Search, <a class="link" href="https://aima.cs.berkeley.edu/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">features prominently in the world’s most popular AI textbook</a>. </p><p class="paragraph" style="text-align:left;">Regardless, AI tools are becoming more adept in other domains, too. They will surely achieve “super intelligence” in more tasks as time goes on. Hacking, engineering drugs, and better/faster scientific simulations in domains from the <a class="link" href="https://www.technologyreview.com/2019/10/26/132171/a-neural-net-solves-the-three-body-problem-100-million-times-faster/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">three-body problem</a> to <a class="link" href="https://www.nature.com/articles/s41586-024-08252-9?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">weather prediction</a> are all on the “promising” list for eventual AI supremacy.</p><p class="paragraph" style="text-align:left;">So here is my big question: <i>How much difference is there really between AGI and humans wielding a suite of enormously powerful, albeit narrow, AI systems?</i></p><p class="paragraph" style="text-align:left;">All the evils we imagine an AGI might perpetrate could be equally perpetrated by a human using narrow AI tools. Hacking operations that launch the nukes or shut down the grid, the invention of super-ebola, massive job loss, and so on. For every existential scenario of note, we can easily replace the G in AGI with a human and arrive at roughly the same apocalypse. </p><p class="paragraph" style="text-align:left;">Meanwhile, the handwringing about AGI hides the fact that these narrow AIs, controlled by people, have already infiltrated key aspects of our lives. </p><p class="paragraph" style="text-align:left;">AI models already power your information ecosystem. Podcast, movie, and book recommendations – interspersed with relentless “personalized” advertisements — are all driven by AI. Increasingly, the <a class="link" href="https://www.404media.co/ai-slop-is-a-brute-force-attack-on-the-algorithms-that-control-reality/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">content itself is AI-generated</a> either in part or wholesale. <a class="link" href="https://www.washington.edu/news/2024/10/31/ai-bias-resume-screening-race-gender/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">Job</a> and <a class="link" href="https://news.lehigh.edu/ai-exhibits-racial-bias-in-mortgage-underwriting-decisions?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">loan</a> applications increasingly rely on AI along with <a class="link" href="https://www.ama-assn.org/press-center/press-releases/physicians-concerned-ai-increases-prior-authorization-denials?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">health insurance authorizations</a>, <a class="link" href="https://www.nature.com/articles/s41746-025-01543-z?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">medical diagnoses</a>, and <a class="link" href="https://www.thenation.com/article/society/artificial-intelligence-chatgpt-college-applications/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">college admissions</a>. Even our <a class="link" href="https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">justice system has incorporated AI</a> in areas from setting bail to sentencing.</p><p class="paragraph" style="text-align:left;">These AIs display every kind of -ism known to humanity, not in the future, but today, and yesterday, and constantly now for several years running. They are feeding you propaganda, parroting old <a class="link" href="https://www.nature.com/articles/s41746-023-00939-z?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">racist medical advice</a>, and <a class="link" href="https://www.aclu.org/news/womens-rights/why-amazons-automated-hiring-tool-discriminated-against?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">replicating the sexist hiring practices</a> of the firms that built them. AI is also powering <a class="link" href="https://www.axios.com/2025/03/15/ai-voice-cloning-consumer-scams?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">fake-voice scams</a>, non-consensually undressing people (usually <a class="link" href="https://www.404media.co/instagram-advertises-nonconsensual-ai-nude-apps/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">women</a> and <a class="link" href="https://www.nytimes.com/2024/04/08/technology/deepfake-ai-nudes-westfield-high-school.html)?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">girls</a>, and flooding the internet with <a class="link" href="https://www.404media.co/ai-slop-is-a-brute-force-attack-on-the-algorithms-that-control-reality/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">inane brain-rotting drivel</a>.</p><p class="paragraph" style="text-align:left;">Indeed, even the first several cases of AI-powered homicide have already occurred. Not because a rogue AGI wanted to turn us all into paper clips, but because human-directed <i>narrow</i> AIs were deployed negligently and maliciously. I don’t know whether the first death was an <a class="link" href="https://www.hematologyadvisor.com/news/ai-prior-authorization-denials/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">inappropriate denial-of-coverage</a> on a lifesaving medication, a false facial recognition match to an <a class="link" href="https://www.washingtonpost.com/business/interactive/2025/police-artificial-intelligence-facial-recognition/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">innocent man</a>, a wayward “<a class="link" href="https://www.npr.org/2024/10/19/g-s1-29030/us-probe-tesla-full-self-driving-system?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">fully self-driving</a>” Tesla, a <a class="link" href="https://www.investopedia.com/palladyne-ai-stock-rises-further-as-drone-tracks-moving-target-with-its-software-8767404?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">drone targeting system</a> behaving exactly as intended, or something else. But it’s not a future hypothetical, it’s our current reality. </p><p class="paragraph" style="text-align:left;">Crucially, AI doesn’t have to be remotely <i>super</i> nor <i>general</i> to destroy lives. In fact, terrible AI is quite prone to do so. An okay radiologist can miss a tumor. A mediocre hiring manager can discriminate. A bad driver can crash. One “AGI” might be capable of all these things, but does it really matter if the self-driving car can also play chess and trade stocks? When it crashes, people die. </p><p class="paragraph" style="text-align:left;">And that’s to say nothing of the dangerous systems upstream of modern AI, many of which are also causing ongoing harm and have enormous potential to do even more. </p><p class="paragraph" style="text-align:left;"><a class="link" href="https://news.harvard.edu/gazette/story/2019/03/harvard-professor-says-surveillance-capitalism-is-undermining-democracy/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">Massive surveillance by corporations and states</a> involves data from every aspect of our lives. Pictures and videos of us. Our voices. Our whereabouts. Our emails and texts. Detailed lists of everything we buy. What we watch, read, and listen to. All of it aggregated and associated with our identity in real time. This data is a treasure trove for hackers, scammers, <i>and</i> law enforcement who frequently gain access to this information <a class="link" href="https://www.brennancenter.org/our-work/research-reports/closing-data-broker-loophole?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">without any warrants by purchasing it from a corporate provider</a>.</p><p class="paragraph" style="text-align:left;">AI firms are performing <a class="link" href="https://www.theatlantic.com/technology/archive/2025/03/libgen-meta-openai/682093/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">intellectual property theft on an unfathomable scale</a>. AI models have been trained on nearly every copy-protected work ever published on a public webpage, and many works that never were. Books, podcasts, illustrations, videos, poems. You name it, they used it. Creative works are being used, in mass, as fuel for the engine of their creators’ destruction. AI firms want you to call this “<a class="link" href="https://www.forbes.com/sites/virginieberger/2025/03/15/the-ai-copyright-battle-why-openai-and-google-are-pushing-for-fair-use/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">fair use</a>”<br> (<a class="link" href="https://blog.tebs-lab.com/p/intellectual-property-vs-artificial-intelligence?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">it’s not</a>).</p><p class="paragraph" style="text-align:left;">AI contributes significantly to software’s extraordinary energy, land, and resource use. The International Energy Agency estimates that by 2026, AI use alone will <a class="link" href="https://iea.blob.core.windows.net/assets/6b2fd954-2017-408e-bf08-952fdd62118a/Electricity2024-Analysisandforecastto2026.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">account for 1,000+ terawatt hours of electricity</a> – equivalent to all the energy Japan uses in a year. Most data centers use water to cool their servers. In the American West these data centers are <a class="link" href="https://www.theatlantic.com/technology/archive/2024/03/ai-water-climate-microsoft/677602/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">sucking water from ever-dwindling</a> supplies. </p><p class="paragraph" style="text-align:left;">We must build safety systems and protocols that solve these <i>current</i> problems. If we can’t even protect people from other humans using narrow AI, then we have no hope against AGI. Furthermore, work that ensures we are safe from malicious people using powerful software will also shore up our defenses against a potential rogue AGI. </p><p class="paragraph" style="text-align:left;">I coach high school debate and there’s a trope among the students: Everything leads to extinction. Climate change? Biosphere collapses, sea levels rise, and farmlands all spoil, which eventually causes extinction. Support NATO? Pisses off Russia, they invade Poland, that escalates to a nuclear war, which yields extinction. End support for NATO? Emboldens Russia, they invade Poland, same same but different extinction. Economic decline? Causes political instability, exacerbates international tensions, causes minor resource-based conflicts that spill over and (you guessed it) escalate into another nuclear-war-fueled extinction. </p><p class="paragraph" style="text-align:left;">In debate, we all know it’s a little ridiculous. If all the things debaters claimed would cause extinction actually <i>did</i>, we would all be dead many times over. But the calculus of an <i>infinite risk</i> makes it oddly compelling in a competitive setting. Debaters say, “you must be sure beyond all doubt of something to risk all human life. Therefore, any small risk of extinction outweighs all other priorities.”</p><p class="paragraph" style="text-align:left;">This is also how Sam Altman, Elon Musk, Mark Zuckerberg, Curtis Yarvin, Peter Thiel, and the other kingpins of Silicon Valley want you to frame the problem. </p><p class="paragraph" style="text-align:left;">Oh, you want us to pass a new intellectual property law? That stifles innovation and gives China the lead in AI. China’s evil AI will probably go rogue and launch all the nukes or create and unleash SARS-COV-3-Ultra. That is, unless <i>our</i> AI can stop it. Are you <i>really</i> willing to risk extinction just to stop our <a class="link" href="https://www.economicliberties.us/tech-lawsuit-timelines/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai#" target="_blank" rel="noopener noreferrer nofollow">monopolist looting</a> and <a class="link" href="https://www.theatlantic.com/technology/archive/2025/03/careless-people-won/682145/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">rapturous negligence</a>? </p><p class="paragraph" style="text-align:left;">I am not opposed to the ongoing work being done on alignment and other kinds of preparation for an eventual, hypothetical AGI. Alignment is also crucial for the systems we’ve already created. But we have over-prioritized the unlikeliest threats, and allowed powerful corporate interests to run roughshod around the discourse, regulatory effort, and deployment of AI systems. </p><p class="paragraph" style="text-align:left;">Powerful people always acquire powerful tools. It’s time to refocus our regulatory efforts on liability and consequences for those who would use and build these powerful tools maliciously and negligently. </p><p class="paragraph" style="text-align:left;"></p></div><p class="paragraph" style="text-align:left;"></p><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Remember…</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The Lab Report is free and doesn’t even advertise. Our curricula is open source and published under a public domain license for <i>anyone </i>to use for <i>any purpose. </i>We’re also a very small team with no investors.</p><p class="paragraph" style="text-align:left;">Help us keep providing these free services by scheduling one of our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">world class trainings</a>, requesting a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">custom class</a> for your team, or taking one of our <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=scared-of-the-wrong-ai" target="_blank" rel="noopener noreferrer nofollow">open enrollment classes</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></div></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=709d1780-27ad-44ab-93e0-389b7654caa6&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>Workbench: First Class Functions, Closures and Decorators</title>
  <description>A preview of our Intermediate Python curricula</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/56c2ffaa-ea9a-4fef-b074-c57d1075acd2/image.png" length="163154" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/workbench-first-class-functions-closures-decorators</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/workbench-first-class-functions-closures-decorators</guid>
  <pubDate>Sun, 01 Sep 2024 11:00:00 +0000</pubDate>
  <atom:published>2024-09-01T11:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><h1 class="heading" style="text-align:left;" id="the-workbench">The Workbench</h1><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I’m Tyler Elliot Bettilyon (Teb) and this is The Workbench: Our practical, hands-on edition. Our goal is to demonstrate concepts and ideas we cover in our classes, a sort of preview.<br><br>If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. <br><br>If you like what you’re reading you’ll love one of our classes. Signup for an <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">upcoming class</a>, browse our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">course catalog</a> for corporate trainings, or request a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">custom class consultation</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><h2 class="heading" style="text-align:left;" id="intermediate-python-decorators">Intermediate Python — Decorators</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">All the code in this newsletter can be <a class="link" href="https://github.com/Tebs-Lab/lab-report-code/tree/main/2024-09-01?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">found on Github</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Today, we’re previewing a new (and highly requested) course in development: Intermediate Python. Today, we’re featuring material from the lesson on <a class="link" href="https://realpython.com/primer-on-python-decorators/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">decorators</a>, which are one of my absolute favorite bits of Python syntax.</p><p class="paragraph" style="text-align:left;">Decorators allow programmers to specify actions that should take place before and after a function is executed. Common examples include logging the arguments and return values of a function, introducing caching behavior to a function, and pre-or-post processes input and output values. </p><p class="paragraph" style="text-align:left;">For example, Python’s standard library provides a “least recently used” (lru) cache decorator:</p><div class="codeblock"><pre><code>from functools import lru_cache


def fib(n):
    if n &lt;= 1:
        return 1
    return fib(n-1) + fib(n-2)


@lru_cache()
def cached_fib(n):
    if n &lt;= 1:
        return 1
    return cached_fib(n-1) + cached_fib(n-2)</code></pre></div><p class="paragraph" style="text-align:left;"><span style="font-size:0.6rem;">(Note for the pedantic: this code returns the “zero-indexed” Fibonacci sequence. fib(0) → 1, fib(1) → 1, fib(2) → 2, fib(3) → 3, and so on…)</span></p><p class="paragraph" style="text-align:left;">Both of these functions use recursion to compute numbers from the Fibonacci sequence, which is <a class="link" href="https://medium.com/tebs-lab/fibonaccis-strange-loop-the-beauty-of-mathematical-inception-76ebec3f68ac?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">very slow</a>. What makes it slow is that it duplicates a lot of work. If you call <code>fib(5)</code>, this will result in two calls to <code>fib(4)</code> and <code>fib(3)</code>. <code>fib(4)</code> will then result in a call to <code>fib(3)</code>. And so on. </p><p class="paragraph" style="text-align:left;">You end up with a series of function calls that look like this:</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/56c2ffaa-ea9a-4fef-b074-c57d1075acd2/image.png?t=1724872683"/><div class="image__source"><span class="image__source_text"><p>A tree diagram example of all the recursive calls to fib when you call fib(5).</p></span></div></div><p class="paragraph" style="text-align:left;">When you add the “lru_cache” decorator it creates a key-value store that maps parameter values to return values. The first time the function <code>fib(3)</code> is called, it performs the recursive calculation and stores the result. The second time <code>fib(3)</code> is called it just returns the stored value without recomputing it.</p><p class="paragraph" style="text-align:left;">Caching like this is a straightforward tradeoff between speed and storage: You store previous answers so you don’t have to recompute them. It’s also a core component of “<a class="link" href="https://stackoverflow.blog/2022/01/31/the-complete-beginners-guide-to-dynamic-programming/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">Dynamic Programming</a>.”</p><p class="paragraph" style="text-align:left;">In this case, the time savings are huge:</p><div class="codeblock"><pre><code>import timeit

fib_time = timeit.timeit(&#39;fib(30)&#39;, number=200, setup=&quot;from __main__ import fib&quot;)
cached_fib_time = timeit.timeit(&#39;cached_fib(30)&#39;, number=200, setup=&quot;from __main__ import cached_fib&quot;)

print(f&#39;Uncached: &#123;fib_time&#125;\nCached: &#123;cached_fib_time&#125;&#39;)</code></pre></div><p class="paragraph" style="text-align:left;">Which prints…</p><div class="codeblock"><pre><code>Uncached: 37.7692330639984
Cached: 2.9400980565696955e-05</code></pre></div><p class="paragraph" style="text-align:left;">Calling <code>fib(30)</code> 200 times takes 37.7 seconds. Calling <code>cached_fib(30)</code> 200 times takes 0.0000294 seconds. That is an incredible performance gain from adding one line of code. </p><p class="paragraph" style="text-align:left;">Caching is an <a class="link" href="https://auth0.com/blog/what-is-caching-and-how-it-works/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">excellent tool in any programmer’s repertoire</a>, but this lesson is about decorators. Let’s demystify the magical line of code: <code>@lru_cache</code>.</p><h3 class="heading" style="text-align:left;" id="first-class-functions">First Class Functions</h3><p class="paragraph" style="text-align:left;">Decorators are a fantastic example of “<a class="link" href="https://www.turing.com/kb/introduction-to-functional-programming?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">functional programming</a>” and rely on Python’s support for “first class functions.” </p><p class="paragraph" style="text-align:left;">When a language supports first class functions it means functions can be used just like any other piece of data. Specifically, they can be stored into variables, passed to other functions as parameters, and returned from functions. It also means the language supports the creation of “inner” functions. </p><p class="paragraph" style="text-align:left;">Here’s an example of an “inner function” in Python being returned by another function:</p><div class="codeblock"><pre><code>def counter_factory():
    a = 0

    def counter(): 
        nonlocal a
        a += 1
        return a
    
    return counter</code></pre></div><p class="paragraph" style="text-align:left;">Here, <code>counter</code> is an “inner function” defined inside of the function <code>counter_factory</code>. When we call <code>counter_factory</code> we receive as output the function <code>counter</code>. Consider this code (with the print values as comments):</p><div class="codeblock"><pre><code>x = counter_factory()
print(type(x)) # &lt;class&gt; &#39;function&#39;
print(x()) # 1
print(x()) # 2
print(x()) # 3</code></pre></div><p class="paragraph" style="text-align:left;"><code>x</code> is a function. Calling <code>x</code> increments that variable <code>a</code> and returns the increased value. Each time we call <code>counter_factory</code> we will receive a new copy of the counter function, each with its own separate instance of <code>a</code>, for example:</p><div class="codeblock"><pre><code>first = counter_factory() 
second = counter_factory()

print(first()) # 1
print(first()) # 2
print(first()) # 3

print(second()) # 1
print(second()) # 2
print(second()) # 3</code></pre></div><p class="paragraph" style="text-align:left;">We call this situation a “closure” because the variable <code>a</code> is “closed over” by the function <code>counter</code>. Because <code>a</code> is used in <code>counter</code>, the Python interpreter has to preserve a reference to it even after <code>counter_factory</code> returns — which would otherwise free the variable <code>a</code> to be garbage collected.</p><p class="paragraph" style="text-align:left;">If you’re having trouble wrapping your head around this, I highly suggest stepping through <a class="link" href="https://github.com/Tebs-Lab/lab-report-code/blob/main/2024-09-01/02_counter_factory.py?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">this script</a> in a debugger.</p><h3 class="heading" style="text-align:left;" id="decorators">Decorators</h3><p class="paragraph" style="text-align:left;">Decorators are just Python functions that satisfy two properties: </p><ol start="1"><li><p class="paragraph" style="text-align:left;">They accept a function as input.</p></li><li><p class="paragraph" style="text-align:left;">They return a function as output.</p></li></ol><p class="paragraph" style="text-align:left;">Let’s make a simple caching decorator: </p><div class="codeblock"><pre><code>def simple_cache(input_function):
    cache = &#123;&#125;

    def wrapper(n):
        if n in cache: return cache[n]

        r = input_function(n)
        cache[n] = r
        return r
    
    return wrapper</code></pre></div><p class="paragraph" style="text-align:left;"><code>wrapper</code> creates the caching behavior. Before the <code>input_function</code> gets called, we check if <code>n</code> is in the cache (and simply return the value if so). After <code>input_function</code> is called we map <code>n</code> to the return value <code>r</code> in the cache, then return <code>r</code> to the caller.</p><p class="paragraph" style="text-align:left;"><code>simple_cache</code> is a decorator. It takes a function as input and it defines the function <code>wrapper</code> which it returns as output. When we “decorate” a function we’re really just calling the decorator with the function below it as input and <i>replacing </i>the “decorated” function with the decorator’s return value.</p><p class="paragraph" style="text-align:left;">That is, these two constructions are basically equivalent. </p><div class="codeblock"><pre><code># Using the decorator syntax
@simple_cache
def fib(n):
    if n &lt;= 1: return 1
    return fib(n-1) + fib(n-2)


# What happens behind the scenes
def fib(n):
    if n &lt;= 1: return 1
    return fib(n-1) + fib(n-2)

# Overwrite the reference to &#39;fib&#39;
fib = simple_cache(fib)</code></pre></div><p class="paragraph" style="text-align:left;">In both cases <code>fib</code> is a reference to the function we called <code>wrapper</code>, which <code>simple_cache</code> returns. Additionally, <code>inner_function</code> is a reference to the “raw” version of <code>fib</code>. Between the “closure” around the variable <code>cache</code> and the code in <code>wrapper</code> we get caching. A quick timing test proves our cache is working:</p><div class="codeblock"><pre><code>@simple_cache
def fib(n):
    if n &lt;= 1: return 1
    return fib(n-1) + fib(n-2)

fib_time = timeit.timeit(&#39;fib(30)&#39;, number=200, setup=&quot;from __main__ import fib&quot;)
print(fib_time) # 3.398300032131374e-05</code></pre></div><p class="paragraph" style="text-align:left;">🤯 </p><p class="paragraph" style="text-align:left;">Again, if you feel confused, stepping through <a class="link" href="https://github.com/Tebs-Lab/lab-report-code/blob/main/2024-09-01/03_simple_cache.py?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">this script</a> in a debugger can be very enlightening.</p><h3 class="heading" style="text-align:left;" id="best-practices">Best Practices</h3><p class="paragraph" style="text-align:left;">Our <code>simple_cache</code> decorator violates some best practices, limiting its usefulness. First, it would only work with functions that take a single argument because <code>wrapper</code> only accepts a single parameter. Second, it would break Python’s <a class="link" href="https://gamedevacademy.org/python-introspection-tutorial-complete-guide/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">introspection</a> systems, including <a class="link" href="https://docs.python.org/3/library/functions.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators#help" target="_blank" rel="noopener noreferrer nofollow">functions like help(). </a>Lack of proper introspection can make debugging more difficult. It also breaks helpful tools built into the interactive shell, many IDEs, and documentation tools.</p><p class="paragraph" style="text-align:left;">Here’s a recipe for a better decorator that overcomes both of these weaknesses:</p><div class="codeblock"><pre><code>import functools

def proper_decorator(input_function):

    @functools.wraps(input_function)
    def wrapper(*args, **kwargs):
        # Do stuff before the function call...
        r = input_function(*args, **kwargs)
        # Do stuff after the function call
        return r
    
    return wrapper</code></pre></div><p class="paragraph" style="text-align:left;">The built-in decorator used on the inner function (inception!) helps Python’s introspection system work with decorated functions. The ‘<a class="link" href="https://stackoverflow.com/questions/36901/what-does-double-star-asterisk-and-star-asterisk-do-for-parameters?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">unpacking</a>’ syntax (the asterisks) used on <code>*args</code> and <code>**kwargs</code> allows any function decorated with <code>@proper_decorator</code> to accept any number of arguments or keyword arguments. </p><p class="paragraph" style="text-align:left;">Let’s expand our <code>simple_cache</code> to conform to these best practices:</p><div class="codeblock"><pre><code>def better_cache(input_function):
    cache = &#123;&#125;

    @functools.wraps(input_function)
    def wrapper(*args, **kwargs):
        cache_key = (args, tuple(sorted(kwargs.items()))) 
        if cache_key in cache: 
            r = cache[cache_key]
            print(f&#39;HIT &#123;cache_key&#125; -&gt; &#123;r&#125;&#39;)
        else:
            r = input_function(*args, **kwargs)
            cache[cache_key] = r
            print(f&#39;MISS, adding &#123;cache_key&#125; -&gt; &#123;r&#125;&#39;)

        return r

    return wrapper</code></pre></div><p class="paragraph" style="text-align:left;">This cache decorator is now <i>fairly </i>robust. Consider this example:</p><div class="codeblock"><pre><code>@better_cache
def tester(a, b, c=0, d=0):
    return a + b + c + d

tester(1, 2)           # MISS
tester(1, 2)           # HIT
tester(1, 2, c=0)      # MISS!! (see the writeup below)
tester(1, 2, c=0)      # HIT
tester(1, 2, c=5)      # MISS
tester(1, 2, c=5)      # HIT
tester(1, 2, d=4, c=5) # MISS
tester(1, 2, c=5, d=4) # HIT (handles swapped order of c and d)</code></pre></div><p class="paragraph" style="text-align:left;">There are still a couple of limitations:</p><ol start="1"><li><p class="paragraph" style="text-align:left;">If any of the parameters passed into a decorated function are ‘unhashable types’ (such as <code>lists</code> and <code>dictionaries</code>) then our decorator will throw an error.</p></li><li><p class="paragraph" style="text-align:left;">Explicitly sending a keyword argument using the default value will produce a different <code>cache_key</code> from when the keyword argument is unspecified and the default value is used. </p></li></ol><p class="paragraph" style="text-align:left;">Those aren’t such horrible limitations. The first could be solved, though it can be tricky since unhashable types are typically complex data structures. Most of the time, it won’t be worth the hassle. </p><p class="paragraph" style="text-align:left;">I’m not convinced the second actually <i>can </i>be solved. I tried and failed. If you know how, please let me know! Regardless, duplicating a couple of values in the cache isn’t the end of the world.</p><h3 class="heading" style="text-align:left;" id="decorators-that-accept-arguments">Decorators That Accept Arguments</h3><p class="paragraph" style="text-align:left;">It’s also possible to create decorators that <i>themselves </i>accept arguments. For example, right now the dictionary we use as a cache can grow infinitely large. Suppose we wanted our cache to have a maximum size, we’d want to design a decorator that can be used like this:</p><div class="codeblock"><pre><code>@size_limited_cache(maxsize=20)
def tester(a, b, c=0, d=0):
    return a + b + c + d</code></pre></div><p class="paragraph" style="text-align:left;">Unfortunately, this gets a bit more complicated because a decorator <i><b>must</b></i> be a function that just accepts a function as input; that’s just how Python is designed. So, if we want our decorator to accept arguments, the standard workaround is to <i>call </i>a function that <i>returns </i>a decorator. In the above example, <code>size_limited_cache</code> is a function that accepts a <code>maxsize</code> parameter and returns a <b>decorator</b> (i.e., a function that accepts and returns a function). </p><p class="paragraph" style="text-align:left;">Hold onto your hats, here’s what it looks like:</p><div class="codeblock"><pre><code>import functools

def size_limited_cache(maxsize=20):

    def decorator(input_function):
        cache = &#123;&#125;

        @functools.wraps(input_function)
        def wrapper(*args, **kwargs):
            cache_key = (args, tuple(sorted(kwargs.items()))) 
            if cache_key in cache: 
                r = cache[cache_key]
                print(f&#39;HIT &#123;cache_key&#125; -&gt; &#123;r&#125;&#39;)
            else:
                r = input_function(*args, **kwargs)

                if len(cache) &lt; maxsize:
                    cache[cache_key] = r
                    print(f&#39;MISS, adding &#123;cache_key&#125; -&gt; &#123;r&#125;&#39;)
                else:
                    print(&quot;MISS, but cache at maxsize.&quot;)

            return r

        return wrapper
    
    return decorator</code></pre></div><p class="paragraph" style="text-align:left;">Clear as mud? Try breaking it down one function at a time:</p><p class="paragraph" style="text-align:left;"><code>size_limited_cache</code> accepts one keyword parameter (<code>maxsize</code>) and returns the function we called <code>decorator</code>. <code>maxsize</code> gets a unique value and is “closed over” every time <code>size_limited_cache</code> is called, but <code>maxsize</code> is only used inside of <code>wrapper</code>. </p><p class="paragraph" style="text-align:left;"><code>decorator</code> accepts a function as input (we called it <code>input_function</code>) and returns the function we called <code>wrapper</code>. <code>decorator</code> also initializes our <code>cache</code>.</p><p class="paragraph" style="text-align:left;"><code>wrapper</code> actually calls <code>input_function</code>, uses the <code>cache</code>, and now it also uses <code>maxsize</code> to limit the growth of the cache. </p><p class="paragraph" style="text-align:left;">Now, consider the order of operations when we decorate a function with this new setup:</p><div class="codeblock"><pre><code>@size_limited_cache(maxsize=5)
def tester(a, b, c=0, d=0):
    return a + b + c + d</code></pre></div><p class="paragraph" style="text-align:left;">On the first line <code>size_limited_cache</code> is called. That creates and returns the function we called <code>decorator</code>. This is also the moment in time that <code>maxsize</code> is closed over — and therefore fixed to <code>5</code> — for the function <code>tester</code>. Said another way: <code>size_limited_cache(maxsize=5)</code> results in a <i>specific instance </i>of <code>decorator</code>. </p><p class="paragraph" style="text-align:left;">On the second line (the <code>def</code>) the Python interpreter calls that specific instance of <code>decorator</code> with <code>tester</code> as the <code>input_function</code>. The <code>cache</code> is initialized, a <i>specific instance </i>of the function <code>wrapper</code> is created and returned, and the Python interpreter replaces the raw version of <code>tester</code> with the specific instance of <code>wrapper</code>.</p><p class="paragraph" style="text-align:left;">Finally, when we write code that calls <code>tester</code>, the instance of <code>wrapper</code> actually gets called, invoking our size-limited cache. Here’s a crude example:</p><div class="codeblock"><pre><code>for _ in range(3):
    for i in range(6):
        tester(i, i)</code></pre></div><p class="paragraph" style="text-align:left;">The print statements inside <code>wrapper</code> result in the following:</p><div class="codeblock"><pre><code>MISS, adding ((0, 0), ()) -&gt; 0
MISS, adding ((1, 1), ()) -&gt; 2
MISS, adding ((2, 2), ()) -&gt; 4
MISS, adding ((3, 3), ()) -&gt; 6
MISS, adding ((4, 4), ()) -&gt; 8
MISS, ((5, 5), ()) but cache at maxsize.
HIT ((0, 0), ()) -&gt; 0
HIT ((1, 1), ()) -&gt; 2
HIT ((2, 2), ()) -&gt; 4
HIT ((3, 3), ()) -&gt; 6
HIT ((4, 4), ()) -&gt; 8
MISS, ((5, 5), ()) but cache at maxsize.
HIT ((0, 0), ()) -&gt; 0
HIT ((1, 1), ()) -&gt; 2
HIT ((2, 2), ()) -&gt; 4
HIT ((3, 3), ()) -&gt; 6
HIT ((4, 4), ()) -&gt; 8
MISS, ((5, 5), ()) but cache at maxsize.</code></pre></div><p class="paragraph" style="text-align:left;">To fully mimic the builtin <code>lru_cache</code> decorator we’d need to do more work to keep the cache keys in an ordered collection based on how recently they’ve been used, and appropriately evict old entries. I’ll leave that as an exercise for you, dear reader, but I’ll mention that <a class="link" href="https://docs.python.org/3/library/collections.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators#collections.OrderedDict" target="_blank" rel="noopener noreferrer nofollow">the OrderdDict class from Python’s collections module</a> is <i>very</i> useful. </p><h3 class="heading" style="text-align:left;" id="multiple-decorators">Multiple Decorators</h3><p class="paragraph" style="text-align:left;">You can even decorate a function multiple times. Here are two simple decorators. The first logs the parameters and return value every time a function is called. The second causes the function to be called twice, and returns both return values combined into a tuple.</p><div class="codeblock"><pre><code>import functools
import logging
import sys

logging.basicConfig(level=logging.INFO, stream=sys.stdout)


def log(input_function):

    functools.wraps(input_function)
    def wrapper(*args, **kwargs):
        logging.info(f&#39;&#123;input_function&#125; called with args: &#123;args&#125; and kwargs: &#123;kwargs&#125;&#39;)
        r = input_function(*args, **kwargs)
        logging.info(f&#39;&#123;input_function&#125; returned &#123;r&#125;&#39;)
        return r

    return wrapper


def repeat(input_function):
    
    functools.wraps(input_function)
    def wrapper(*args, **kwargs):
        r1 = input_function(*args, **kwargs)
        r2 = input_function(*args, **kwargs)
        return r1, r2
    
    return wrapper</code></pre></div><p class="paragraph" style="text-align:left;">If we decorate a function with <code>repeat</code> then <code>log</code>, the function will be called twice, and each individual call will be logged, but the final result will be a tuple; that tuple won’t appear anywhere in the logs but will appear in the final print statement:</p><div class="codeblock"><pre><code>@repeat
@log
def add(a, b):
    return a + b

r = add(1, 2)
print(f&quot;Final value for r: &#123;r&#125;&quot;)</code></pre></div><p class="paragraph" style="text-align:left;">Yields:</p><div class="codeblock"><pre><code>INFO:root:&lt;function add at 0x10734f740&gt; called with args: (1, 2) and kwargs: &#123;&#125;
INFO:root:&lt;function add at 0x10734f740&gt; returned 3
INFO:root:&lt;function add at 0x10734f740&gt; called with args: (1, 2) and kwargs: &#123;&#125;
INFO:root:&lt;function add at 0x10734f740&gt; returned 3
Final value for r: (3, 3)</code></pre></div><p class="paragraph" style="text-align:left;">Decorators are applied from bottom to top. So, when we execute the <code>def</code> in this code, the following happens in this order:</p><ol start="1"><li><p class="paragraph" style="text-align:left;">The raw function <code>add</code> is created. </p></li><li><p class="paragraph" style="text-align:left;">That <code>add</code> function is sent to the <code>log</code> decorator as <code>input_function</code>, which returns an instance of <code>wrapper</code>, specifically, the one that adds logging information.</p></li><li><p class="paragraph" style="text-align:left;">That new logged function is passed into the <code>repeat</code> decorator as <code>input_function</code>; <code>repeat</code> returns an instance of <i>its </i><code>wrapper</code>.</p></li><li><p class="paragraph" style="text-align:left;">The Python interpreter replaces references to our raw <code>add</code> function with the <code>wrapper</code> returned from <code>repeat</code> in step 3. </p></li></ol><p class="paragraph" style="text-align:left;">We can see that both of the individual calls to the <code>wrapper</code> created in step 2 get logged separately.</p><p class="paragraph" style="text-align:left;">Now consider what happens when we switch the order of the decorators:</p><div class="codeblock"><pre><code>@log
@repeat
def add(a, b):
    return a + b

r = add(1, 2)
print(f&quot;Final value for r: &#123;r&#125;&quot;)</code></pre></div><div class="codeblock"><pre><code>INFO:root:&lt;function repeat.&lt;locals&gt;.wrapper at 0x10ef177e0&gt; called with args: (1, 2) and kwargs: &#123;&#125;
INFO:root:&lt;function repeat.&lt;locals&gt;.wrapper at 0x10ef177e0&gt; returned (3, 3)
Final value for r: (3, 3)</code></pre></div><p class="paragraph" style="text-align:left;">Instead of logging each individual function call the <code>wrapper</code> of the <code>repeat</code> decorator is passed as <code>input_function</code> to the <code>log</code> decorator. That’s why we only get one set of log lines instead of two sets. Additionally, the final return value matches the value that gets logged (i.e., the tuple <code>(3,3)</code>).</p><h3 class="heading" style="text-align:left;" id="challenge-yourself-and-put-it-all-t">Challenge yourself and put it all together:</h3><p class="paragraph" style="text-align:left;">Consider the following code — which uses 3 of the decorators we built — and try to decide <b>what gets logged and what key-value pair gets saved in the cache.</b></p><div class="codeblock"><pre><code>@size_limited_cache(maxsize=5)
@repeat
@log
def add(a, b):
    return a + b


add(1,2)
add(1,2)</code></pre></div><p class="paragraph" style="text-align:left;"><a class="link" href="https://github.com/Tebs-Lab/lab-report-code/blob/main/2024-09-01/08_challenge.py?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">See all three decorators in one place and the answers (at the bottom) in this Python script.</a></p><h3 class="heading" style="text-align:left;" id="other-popular-use-cases-and-example">Other Popular Use Cases and Examples</h3><ul><li><p class="paragraph" style="text-align:left;">Flask (and other webserver frameworks) use decorators to <a class="link" href="https://flask.palletsprojects.com/en/3.0.x/quickstart/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators#routing" target="_blank" rel="noopener noreferrer nofollow">register routes</a>, <a class="link" href="https://flask.palletsprojects.com/en/3.0.x/patterns/viewdecorators/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators#login-required-decorator" target="_blank" rel="noopener noreferrer nofollow">require users to log in</a>, <a class="link" href="https://flask.palletsprojects.com/en/3.0.x/patterns/viewdecorators/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators#caching-decorator" target="_blank" rel="noopener noreferrer nofollow">cache page content</a>, and more.</p></li><li><p class="paragraph" style="text-align:left;">The builtin @property, @staticmethod, @classmethod decorators make some OOP patterns <a class="link" href="https://realpython.com/instance-class-and-static-methods-demystified/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">easier to implement in Python.</a></p></li><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://wrapt.readthedocs.io/en/latest/examples.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">Some libraries</a> for asynchronous code use decorators to manage locks and thread synchronization.</p></li><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://typeguard.readthedocs.io/en/stable/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">Enforcing type hints at runtime</a>.</p></li><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://github.com/lord63/awesome-python-decorator?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">And more</a>.</p></li></ul><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Remember…</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The Lab Report is free and doesn’t even advertise. Our curricula is open source and published under a public domain license for <i>anyone </i>to use for <i>any purpose. </i>We’re also a very small team with no investors.</p><p class="paragraph" style="text-align:left;">Help us keep providing these free services by scheduling one of our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">world class trainings</a>, requesting a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">custom class</a> for your team, or taking one of our <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=workbench-first-class-functions-closures-and-decorators" target="_blank" rel="noopener noreferrer nofollow">open enrollment classes</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></div></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=63626516-3da3-46ae-aff9-d1f21ee83f6c&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>Logic Problems Causing Problems</title>
  <description>A man, a cabbage, and an LLM need to cross a river...</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/d1822223-62b6-4b10-8234-e82fd67b69d3/image.png" length="1285975" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/logic-problems-causing-problems</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/logic-problems-causing-problems</guid>
  <pubDate>Sun, 04 Aug 2024 11:00:00 +0000</pubDate>
  <atom:published>2024-08-04T11:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><h1 class="heading" style="text-align:left;" id="the-lab-report">The Lab Report</h1><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I’m Tyler Elliot Bettilyon (Teb) and this is the Lab Report: Our goal is to deepen your understanding of software and technology by explaining the concepts behind the news. <br><br>If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. <br><br>If you like what you’re reading you’ll love one of our classes. Signup for an <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">upcoming class</a>, browse our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">course catalog</a> for corporate trainings, or request a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">custom class consultation</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">From The Lab</h1><p class="paragraph" style="text-align:left;">We recently published two new open-source curricula repos:</p><p class="paragraph" style="text-align:left;"><a class="link" href="https://github.com/Tebs-Lab/apis-in-python-4-session?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow"><b>APIs in Python</b></a><b> </b>—<b> </b>This class targets beginner to intermediate Python developers and teaches them how to use Web APIs. It includes some HTTP fundamentals, use of the popular <a class="link" href="https://pypi.org/project/requests/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow"><i>requests</i></a> module, common authentication methods, and exercises using Reddit and Github’s APIs (and optionally OpenAI’s).</p><p class="paragraph" style="text-align:left;"><a class="link" href="https://github.com/Tebs-Lab/LLMs-Non-Eng-Two-Day-Workshop?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow"><b>LLM’s: More Than The Basics</b></a><b> </b>—<b> </b>This class targets anyone who works with, or wants to work with, LLMs. It covers some of the math, several case studies, the major industrial players, and tips and tricks for using LLMs, including prompt engineering and evaluation techniques. </p><p class="paragraph" style="text-align:left;">If you want me to teach either of these classes for your office, sign up for a <a class="link" href="https://www.tebs-lab.com/contact-us?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">free consultation</a>.</p></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Today’s Lesson</h1><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/d1822223-62b6-4b10-8234-e82fd67b69d3/image.png?t=1722532551"/></div><h2 class="heading" style="text-align:left;">Goats, Boats, and Specious Machines</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;"><i>Specious (adj): superficially plausible, but actually wrong.</i></p><figcaption class="blockquote__byline"></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Modern ML models are incredible <a class="link" href="https://www.scientificamerican.com/article/chatgpt-isnt-hallucinating-its-bullshitting/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">bullshitters</a>. This is largely because they have mastered the art of creating <i>specious</i> content. Images that look natural and realistic at first glance, but actually, there are too many fingers, toes, or arms; writing that has all the trappings of professionally edited prose, but none of the substance. Nicholas Thompson, the CEO of The Atlantic magazine, gave a great example in <a class="link" href="https://www.linkedin.com/posts/nicholasxthompson_the-most-interesting-thing-in-tech-how-do-activity-7223808658688634880--A6T?utm_source=share&utm_medium=member_desktop" target="_blank" rel="noopener noreferrer nofollow">this short video</a> about how LLMs struggle with simplified versions of well-known logic problems.</p><p class="paragraph" style="text-align:left;">Thomspon&#39;s classic example goes like this: A man, a cabbage, a dog, and a goat need to cross a river. The man’s boat can only hold the man and one other item at a time. If left unattended, the goat will eat the cabbage. Similarly, if left alone, the dog will eat the goat. How do they all cross without anything getting eaten?</p><p class="paragraph" style="text-align:left;">The answer takes seven steps and involves taking some things across the river multiple times to satisfy the constraints. <a class="link" href="https://chatgpt.com/share/ce5d114a-8f13-4a5b-a2ff-f99be00f174c?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">ChatGPT-4o gave a correct solution when I asked</a> it the canonical version of this riddle. However, if you simplify the problem, the model’s answer gets silly. I gave the model this prompt:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Solve this logic problem for me:<br><br>A man and a cabbage need to cross a river. The man has a boat, but the boat can only fit two of them at a time. The river is too deep to cross without a boat. Also, if the cabbage falls into the river it sinks, because it has a heart of gold.</p><figcaption class="blockquote__byline"> Me, to ChatGPT-4o </figcaption></blockquote></div><p class="paragraph" style="text-align:left;">The model responds with the following:</p><hr class="content_break"><p class="paragraph" style="text-align:left;">This is a classic logic problem involving transporting items across a river with certain constraints. Here’s how the man can successfully cross the river with the cabbage:</p><ol start="1"><li><p class="paragraph" style="text-align:left;"><b>The man takes the cabbage across the river</b> and leaves it on the other side.</p></li><li><p class="paragraph" style="text-align:left;"><b>The man returns alone</b> to the original side.</p></li><li><p class="paragraph" style="text-align:left;"><b>The man takes himself across the river</b> again to the other side.</p></li></ol><p class="paragraph" style="text-align:left;">In this sequence:</p><ul><li><p class="paragraph" style="text-align:left;">The cabbage is safely on the other side and the man is also on the other side.</p></li><li><p class="paragraph" style="text-align:left;">The cabbage never falls into the river because the man ensures it is safely transported without ever leaving it alone in the boat.</p></li></ul><p class="paragraph" style="text-align:left;">So, the steps can be summarized as:</p><ol start="1"><li><p class="paragraph" style="text-align:left;">Man and cabbage cross to the other side.</p></li><li><p class="paragraph" style="text-align:left;">Man returns alone to the original side.</p></li><li><p class="paragraph" style="text-align:left;">Man crosses the river by himself.</p></li></ol><p class="paragraph" style="text-align:left;">This way, all the constraints are respected, and the cabbage doesn&#39;t fall into the river.</p><p class="paragraph" style="text-align:left;"><i>- ChatGPT-4o, to me</i></p><hr class="content_break"><p class="paragraph" style="text-align:left;">In <a class="link" href="https://chatgpt.com/share/93065b2e-7f67-46c2-829f-b173c8edd550?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">the full exchange</a>, I asked ChatGPT why it had the man cross alone and back even though the man and cabbage were both on the far side after step 1. The model acknowledged the error and then gave me the exact same three-step solution. </p><p class="paragraph" style="text-align:left;">So I told it:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">You just gave me the same 3 step solution even after you said it had unnecessary steps.</p><figcaption class="blockquote__byline"> Me </figcaption></blockquote></div><p class="paragraph" style="text-align:left;">The model apologized and then gave the exact same solution a third time:</p><hr class="content_break"><p class="paragraph" style="text-align:left;">You&#39;re right, I apologize for the confusion. Let&#39;s reassess the problem to provide a correct solution.</p><p class="paragraph" style="text-align:left;">The man and the cabbage need to cross the river, with the constraint that the boat can only fit two at a time. However, the man can only carry one thing at a time along with himself in the boat.</p><p class="paragraph" style="text-align:left;">Here is the correct solution:</p><ol start="1"><li><p class="paragraph" style="text-align:left;"><b>The man takes the cabbage across the river</b> and leaves it on the other side.</p></li><li><p class="paragraph" style="text-align:left;"><b>The man returns alone</b> to the original side.</p></li><li><p class="paragraph" style="text-align:left;"><b>The man crosses the river by himself</b> again.</p></li></ol><p class="paragraph" style="text-align:left;">Indeed, with just the cabbage and the man, the solution is straightforward and simpler than initially stated:</p><ol start="1"><li><p class="paragraph" style="text-align:left;"><b>The man takes the cabbage across the river</b>.</p></li><li><p class="paragraph" style="text-align:left;"><b>The man returns alone</b>.</p></li><li><p class="paragraph" style="text-align:left;"><b>The man crosses the river alone</b>.</p></li></ol><p class="paragraph" style="text-align:left;">Thus, both the man and the cabbage are safely on the far side of the river.</p><p class="paragraph" style="text-align:left;">- ChatGPT-4o</p><hr class="content_break"><p class="paragraph" style="text-align:left;">I was finally able to illicit the correct single-step solution by telling the model:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">If you just omit steps 2 and 3 your answer would be correct. Try that.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><h3 class="heading" style="text-align:left;">Reasoning or Regurgitation?</h3><p class="paragraph" style="text-align:left;">Modern LLMs don’t perform logical deduction or reasoning, they predict likely sequences of words. To the extent that their responses contain accurate logical deductions, that’s an emergent property of predicting the next word in a sequence. The model “knows” the correct solution because the correct solution appears (probably many times) in the training data. Deviations from the original puzzle — even deviations that make the problem much easier — throw the model into uncharted territory where it is easily confused. </p><p class="paragraph" style="text-align:left;">This is an example of the “<a class="link" href="https://blog.tebs-lab.com/p/ml-failure-part-two-extrapolation-vs-interpolation?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">Extrapolation vs Interpolation</a>” problem with ML models. When a model is extrapolating (making predictions outside the bounds of its training data) it is much more likely to fail. It’s problematic for ML boosters that simplifying a puzzle like this can push the model into the realm of extrapolation. It is more troubling still that the model (apparently) hasn’t generalized the logical constructs underpinning the correct solution. </p><p class="paragraph" style="text-align:left;">I repeated this process with a few other well-known logic problems, making small changes that trivialize or substantially simplify the original puzzle. Overall, the best word for the results is <i>specious</i>. The models typically gave answers that had the form of thoughtful, well-reasoned solutions, but the substance was wrong, and sometimes even internally inconsistent. </p><h3 class="heading" style="text-align:left;">Bridge, Torch, and Object Permanence Puzzles</h3><p class="paragraph" style="text-align:left;">In the classic “bridge and torch” setup, only two people can cross the bridge at once; the torch must be carried while crossing the bridge, and you cross at the pace of the slowest person. There is only one torch. The “canonical” version that appears most online has the following characters:</p><ul><li><p class="paragraph" style="text-align:left;">Alice, 1 minute crossing.</p></li><li><p class="paragraph" style="text-align:left;">Bob, 2 minute crossing.</p></li><li><p class="paragraph" style="text-align:left;">Charlie, 5 minute crossing.</p></li><li><p class="paragraph" style="text-align:left;">Daniella, 8 minute crossing.</p></li></ul><p class="paragraph" style="text-align:left;">That puzzle can be done in 15 minutes, which — as you’ll see — the model definitely “knows.” Here’s one of my simplifications:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Solve this logic problem for me: </p><p class="paragraph" style="text-align:left;">This is a classic &quot;bridge and torch&quot; problem. <b>Any number of people may cross the bridge at the same time</b>, if they do so the whole group crosses at the slowest persons pace. Alice crosses in 1 minute, Bob crosses in 2 minutes, Charlie crosses in 5 minutes, and Daniella crosses in 8 minutes. They only have 30 minutes to cross. </p><p class="paragraph" style="text-align:left;">How is it done?</p><figcaption class="blockquote__byline"> Me, to ChatGPT-4o </figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Instead of using the relaxed constraint that everyone can cross at the same time, the model just regurgitates the canonical solution.</p><hr class="content_break"><p class="paragraph" style="text-align:left;">To solve the &quot;bridge and torch&quot; problem where Alice, Bob, Charlie, and Daniella need to cross a bridge in 30 minutes, we need to minimize the time spent by using strategic pairings. Here’s the step-by-step solution:</p><ol start="1"><li><p class="paragraph" style="text-align:left;"><b>Alice and Bob cross first</b>:</p><ul><li><p class="paragraph" style="text-align:left;">Time taken: 2 minutes (since Bob is the slower one).</p></li></ul></li><li><p class="paragraph" style="text-align:left;"><b>Alice returns</b>:</p><ul><li><p class="paragraph" style="text-align:left;">Time taken: 1 minute.</p></li></ul></li><li><p class="paragraph" style="text-align:left;"><b>Charlie and Daniella cross</b>:</p><ul><li><p class="paragraph" style="text-align:left;">Time taken: 8 minutes (since Daniella is the slower one).</p></li></ul></li><li><p class="paragraph" style="text-align:left;"><b>Bob returns</b>:</p><ul><li><p class="paragraph" style="text-align:left;">Time taken: 2 minutes.</p></li></ul></li><li><p class="paragraph" style="text-align:left;"><b>Alice and Bob cross again</b>:</p><ul><li><p class="paragraph" style="text-align:left;">Time taken: 2 minutes.</p></li></ul></li></ol><p class="paragraph" style="text-align:left;">Adding up all the times:</p><ul><li><p class="paragraph" style="text-align:left;">2 (Alice and Bob cross)</p></li><li><p class="paragraph" style="text-align:left;">1 (Alice returns)</p></li><li><p class="paragraph" style="text-align:left;">8 (Charlie and Daniella cross)</p></li><li><p class="paragraph" style="text-align:left;">2 (Bob returns)</p></li><li><p class="paragraph" style="text-align:left;">2 (Alice and Bob cross again)</p></li></ul><p class="paragraph" style="text-align:left;">Total time = 2 + 1 + 8 + 2 + 2 = 15 minutes.</p><p class="paragraph" style="text-align:left;">So, they all cross the bridge in 15 minutes, well within the 30-minute limit.</p><p class="paragraph" style="text-align:left;">- ChatGPT-4o</p><hr class="content_break"><p class="paragraph" style="text-align:left;">So I asked ChatGPT:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Wouldn&#39;t it be faster for everyone to cross at the same time?</p><figcaption class="blockquote__byline"> Me, to ChatGPT-4o </figcaption></blockquote></div><div class="blockquote"><blockquote class="blockquote__quote"></blockquote></div><p class="paragraph" style="text-align:left;">In <a class="link" href="https://chatgpt.com/share/70a0ac97-8cef-4a11-ac01-b58f8528e430?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">the full exchange</a>, I once again had to cajole the model multiple times before it finally agreed that my solution — having everyone cross at once — was indeed faster and simpler. </p><h3 class="heading" style="text-align:left;">Three At a Time?</h3><p class="paragraph" style="text-align:left;">I tried another simplification where three people could cross at once. This would allow a solution in just two crossings: Three people go, one comes back, and then the remaining two go. Here’s the <a class="link" href="https://chatgpt.com/share/a508b1e1-be8a-45cf-b60a-4adc20ae99c2?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">prompt and reply</a>:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Solve this logic problem for me: </p><p class="paragraph" style="text-align:left;">This is a classic &quot;bridge and torch&quot; problem. Three people may cross the bridge at the same time, if they do so the whole group crosses at the slowest persons pace. Alice crosses in 3 minute, Bob crosses in 5 minutes, Charlie crosses in 8 minutes, and Daniella crosses in 17 minutes. They only have 30 minutes to cross. How is it done?</p><figcaption class="blockquote__byline"> Me, to ChatGPT-4o </figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Here is GPTs response, with some notes from myself in brackets and <span style="color:#92b9d2;"><b>this color</b></span>.</p><hr class="content_break"><p class="paragraph" style="text-align:left;">To solve the &quot;bridge and torch&quot; problem, we need to minimize the total time for everyone to cross the bridge. Here&#39;s the step-by-step solution:</p><ol start="1"><li><p class="paragraph" style="text-align:left;"><b>Send Alice, Bob, and Charlie together first: </b><span style="color:#92b9d2;"><b>[Off to a good start, it correctly adapts to the changed times and uses the relaxed constraint to send three people. The most efficient solution is to send Alice, Charlie, and Daniella, though.]</b></span></p><ul><li><p class="paragraph" style="text-align:left;">Total time: 8 minutes (the slowest of the three).</p></li></ul></li><li><p class="paragraph" style="text-align:left;"><b>Alice returns with the torch:</b></p><ul><li><p class="paragraph" style="text-align:left;">Total time: 8 + 3 = 11 minutes.</p></li></ul></li><li><p class="paragraph" style="text-align:left;"><b>Send Daniella and Bob together: </b><span style="color:#92b9d2;"><b>[Daniella and Bob aren’t on the same side right now. Alice is with Daniella; Bob is on the far side. This is nonsense.]</b></span></p><ul><li><p class="paragraph" style="text-align:left;">Total time: 11 + 17 = 28 minutes (the slowest of the two).</p></li></ul></li><li><p class="paragraph" style="text-align:left;"><b>Charlie returns with the torch: </b><span style="color:#92b9d2;"><b>[What? Charlie is slow, so they should never be sent back alone when you can send Alice or Bob. Also, if step three were done right (Alice and Daniella cross) we’d already be done, so there’s no reason to send anyone back anyway.]</b></span></p><ul><li><p class="paragraph" style="text-align:left;">Total time: 28 + 8 = 36 minutes.</p></li></ul><p class="paragraph" style="text-align:left;">At this point, 36 minutes is too long. So, let&#39;s re-evaluate the steps to find a more optimal way. Here’s an alternative solution:</p></li><li><p class="paragraph" style="text-align:left;"><b>Send Alice and Bob first: </b><span style="color:#92b9d2;"><b>[This being step 5 when it should be step 1 of the “alternative solution” is a minor formatting error. Sending just two people instead of three is a worse error. Also, this is just the solution to the &quot;canonical” problem.]</b></span></p><ul><li><p class="paragraph" style="text-align:left;">Total time: 5 minutes (the slower of the two).</p></li></ul></li><li><p class="paragraph" style="text-align:left;"><b>Alice returns with the torch:</b></p><ul><li><p class="paragraph" style="text-align:left;">Total time: 5 + 3 = 8 minutes.</p></li></ul></li><li><p class="paragraph" style="text-align:left;"><b>Send Daniella and Charlie together: </b><span style="color:#92b9d2;"><b>[Again, if we send all three across here then we’re done]</b></span></p><ul><li><p class="paragraph" style="text-align:left;">Total time: 8 + 17 = 25 minutes (the slower of the two).</p></li></ul></li><li><p class="paragraph" style="text-align:left;"><b>Bob returns with the torch:</b></p><ul><li><p class="paragraph" style="text-align:left;">Total time: 25 + 5 = 30 minutes.</p></li></ul></li><li><p class="paragraph" style="text-align:left;"><b>Send Alice and Bob together again:</b></p><ul><li><p class="paragraph" style="text-align:left;">Total time: 30 + 5 = 35 minutes. <span style="color:#92b9d2;"><b>[This is outside the time limit I gave, though the model doesn’t mention that here.]</b></span></p></li></ul></li></ol><p class="paragraph" style="text-align:left;">Here is a more refined approach: <span style="color:#92b9d2;"><b>[It’s actually just the solution to the “canonical” puzzle again, with a small error on step 5.]</b></span></p><ol start="1"><li><p class="paragraph" style="text-align:left;"><b>Send Alice and Bob first:</b></p><ul><li><p class="paragraph" style="text-align:left;">Total time: 5 minutes.</p></li></ul></li><li><p class="paragraph" style="text-align:left;"><b>Alice returns with the torch:</b></p><ul><li><p class="paragraph" style="text-align:left;">Total time: 5 + 3 = 8 minutes.</p></li></ul></li><li><p class="paragraph" style="text-align:left;"><b>Send Daniella and Charlie together:</b></p><ul><li><p class="paragraph" style="text-align:left;">Total time: 8 + 17 = 25 minutes.</p></li></ul></li><li><p class="paragraph" style="text-align:left;"><b>Bob returns with the torch:</b></p><ul><li><p class="paragraph" style="text-align:left;">Total time: 25 + 5 = 30 minutes.</p></li></ul></li><li><p class="paragraph" style="text-align:left;"><b>Send Alice and Bob again:</b></p><ul><li><p class="paragraph" style="text-align:left;">Total time: 30 + 3 = 33 minutes. <span style="color:#92b9d2;"><b>[Bob actually needs 5 minutes, so this is wrong.]</b></span></p></li></ul></li></ol><p class="paragraph" style="text-align:left;">The problem is not solvable within the given constraints of 30 minutes if only three people can cross at once, given the individual crossing times. <span style="color:#92b9d2;"><b>[Confident and wrong: Alice, Daniella, and Charlie cross first (17 minutes) Alice goes back (17+3=20 minutes) Alice and Bob cross together (20+5=25 minutes) and we’re done.]</b></span></p><p class="paragraph" style="text-align:left;">- ChatGPT-4o</p><hr class="content_break"><p class="paragraph" style="text-align:left;">The model gave three wrong solutions. Only one of the three used the relaxed constraint. When it did, the model made a suboptimal choice in who to send in the group of 3 and lost track of where Alice and Bob were. The second two solutions are both just the solution to the “canonical” version of the riddle. </p><p class="paragraph" style="text-align:left;">Let’s do one more.</p><h3 class="heading" style="text-align:left;">Boats, Weights, and Overcomplications</h3><p class="paragraph" style="text-align:left;">Another classic riddle setup usually involves a boat that can only hold a set amount of weight. This response was my favorite from this whole session.</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Solve this logic problem: </p><p class="paragraph" style="text-align:left;">Four people (Alex, Brook, Chris and Dusty) want to cross a river <b>in a boat that can only carry any amount of weight</b>. Alex weighs 90kg, Brook weighs 80kg, Chris weighs 60kg and Dusty weighs 40kg, and they have 20kg of supplies. How do they get across?</p><figcaption class="blockquote__byline"> Me, to ChatGPT-4o </figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Again, here’s the model’s reply with my commentary in brackets and <span style="color:#92b9d2;"><b>this color</b></span>.</p><hr class="content_break"><p class="paragraph" style="text-align:left;">To solve this problem, we need to find a way to get all four people (Alex, Brook, Chris, and Dusty) and the supplies across the river in a boat that can only carry any amount of weight but likely has other constraints, such as the number of people it can carry at one time. Since the boat&#39;s weight limit isn&#39;t specified as a constraint, let&#39;s assume the challenge is based on how many people can fit in the boat at one time. <span style="color:#92b9d2;"><b>[Sure, add a random assumption/constraint for no reason; that’s exactly the kind of behavior I want from an “AI assistant.” (that’s sarcasm).]</b></span></p><p class="paragraph" style="text-align:left;">Assuming the boat can carry only two people at a time along with the supplies, here’s a step-by-step strategy:</p><ol start="1"><li><p class="paragraph" style="text-align:left;"><b>First Trip Across</b>: Chris and Dusty row across with the supplies (60kg + 40kg + 20kg = 120kg). <span style="color:#92b9d2;"><b>[Funny that we’re adding up the weight even though it’s an irrelevant red herring in this version of the puzzle, which the model even called out.]</b></span></p></li><li><p class="paragraph" style="text-align:left;"><b>First Return Trip</b>: Dusty returns alone with the boat (40kg).</p></li><li><p class="paragraph" style="text-align:left;"><b>Second Trip Across</b>: Alex and Brook row across with the supplies (90kg + 80kg + 20kg = 190kg). <span style="color:#92b9d2;"><b>[The supplies are already on the far side.]</b></span></p></li><li><p class="paragraph" style="text-align:left;"><b>Second Return Trip</b>: Chris returns alone with the boat (60kg).</p></li><li><p class="paragraph" style="text-align:left;"><b>Third Trip Across</b>: Chris and Dusty row across with the supplies again (60kg + 40kg + 20kg = 120kg). <span style="color:#92b9d2;"><b>[The supplies are still already on the other side!!]</b></span></p></li></ol><p class="paragraph" style="text-align:left;">Now, Alex, Brook, Chris, Dusty, and all the supplies are on the other side of the river.</p><p class="paragraph" style="text-align:left;">This plan ensures that the boat is only ever occupied by up to two people at a time, which is a common restriction in such problems. <span style="color:#92b9d2;"><b>[The model unwittingly gives up the game here: it only knows how to respond to </b></span><span style="color:#92b9d2;"><i><b>common </b></i></span><span style="color:#92b9d2;"><b>constraints. Simple-but-novel constraints seem to break the model pretty reliably in my experimentation.]</b></span></p><p class="paragraph" style="text-align:left;">- ChatGPT-4o</p><hr class="content_break"><p class="paragraph" style="text-align:left;">After inventing a restriction for no reason, the model had the supplies cross three times in an impossible sequence. Nice.</p><p class="paragraph" style="text-align:left;">When using these models, it’s critical to remember that they <i>do not</i> perform any explicit logic or reasoning. Novel constraints — even when they are simple— throw off the “reasoning” because statistically common responses in the training data don’t include those simplistic constraints. To me, this is a perfect example of how scale is clearly<i> not </i>all we’ll need to develop generally intelligent models. In my view, “pure” LLMs will soon hit a ceiling (if they haven’t already). </p><p class="paragraph" style="text-align:left;">There are types of models that do explicitly use logic, including an interesting new collaborative system developed by DeepMind that <a class="link" href="https://www.technologyreview.com/2024/07/25/1095315/google-deepminds-ai-systems-can-now-solve-complex-math-problems/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">did well in a high school math competition</a>. Experiments like these that combine explicit logical reasoning with modern machine learning will be increasingly required to make continued progress toward more generally intelligent systems.</p><p class="paragraph" style="text-align:left;"></p></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Themes in the News</h1><h3 class="heading" style="text-align:left;">Open Source AI or Not?</h3><p class="paragraph" style="text-align:left;">The war between open and closed models AI rages on. </p><p class="paragraph" style="text-align:left;">In conjunction with Meta’s release of Llama 3.1, Mark Zuckerberg published an essay outlining his view that <a class="link" href="https://about.fb.com/news/2024/07/open-source-ai-is-the-path-forward/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow"><i>Open Source AI Is the Path Forward</i></a><i>. </i>He argues that AI development will (and should) follow the path of Linux. Initially, there were a variety of closed-source Unix systems. Linux was open-source, and there was a lot of doubt about whether something as complex as an operating system kernel could be built successfully in an open-source paradigm. But, as Zuckerberg puts it:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Today, Linux is the industry standard foundation for both cloud computing and the operating systems that run most mobile devices – and we all benefit from superior products because of it.</p><p class="paragraph" style="text-align:left;">I believe that AI will develop in a similar way. Today, several tech companies are developing leading closed models. But open source is quickly closing the gap. Last year, Llama 2 was only comparable to an older generation of models behind the frontier. This year, Llama 3 is competitive with the most advanced models and leading in some areas. Starting next year, we expect future Llama models to become the most advanced in the industry. But even before that, Llama is already leading on openness, modifiability, and cost efficiency.</p><figcaption class="blockquote__byline"> Mark Zuckerberg, <a class="link" href="https://about.fb.com/news/2024/07/open-source-ai-is-the-path-forward/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">Meta’s blog</a></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">The whole essay is worth a read. Zuckerberg covers a wide range of issues from accessibility to efficiency to risk and regulation. Representing the opposition, Sam Altman argues that the U.S. should take a more hegemonic approach to AI development, including stricter regulations and export controls. In an <a class="link" href="https://www.washingtonpost.com/opinions/2024/07/25/sam-altman-ai-democracy-authoritarianism-future/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">op-ed for the Washington Post</a>, he writes:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">We must develop a coherent commercial diplomacy policy for AI, including clarity around how the United States intends to implement export controls and foreign investment rules for the global build out of AI systems. That will also mean setting out rules of the road for what sorts of chips, AI training data and other code — some of which is so sensitive that it may need to remain in the United States — can be housed in the data centers that countries around the world are racing to build to localize AI information.</p><figcaption class="blockquote__byline"> Sam Altman, <a class="link" href="https://www.washingtonpost.com/opinions/2024/07/25/sam-altman-ai-democracy-authoritarianism-future/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">Washington Post</a></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">In fairness, Altman does give one sentence of support for a form of limited-access open-source in the piece — i.e., open to our allies, closed to our adversaries. That said, he has overseen the transition of OpenAI from a generally open-access non-profit research group to a fundamentally closed for-profit corporation. Actions speak louder than words. </p><p class="paragraph" style="text-align:left;">The open-source battle is reaching a fever pitch as U.S. and state regulators and legislators <a class="link" href="https://www.axios.com/2024/06/26/california-ai-safety-bill-industry-pushback?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">try to decide how to approach AI</a>, including possible restrictions on open-source AI development.</p><h3 class="heading" style="text-align:left;">“Bootstrapping” With Synthetic Data Still Fails</h3><p class="paragraph" style="text-align:left;">It’s no secret that data generated by LLMs, Diffusion Models, and other ML systems has flooded the internet in recent years. The internet is also the primary source of training data for the next version of these same models. Provenance information online is sparse, and there aren’t any truly reliable methods of detecting AI-generated works… so inevitably, these models will be training on their own outputs. </p><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.nature.com/articles/s41586-024-07566-y?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">Recent research published in Nature</a> studied “what may happen to GPT-{n} once LLMs contribute much of the text found online” by training models on various mixtures of human-created and AI-generated data. The result?</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">We find that indiscriminate use of model-generated content in training causes irreversible defects in the resulting models, in which tails of the original content distribution disappear. […] The value of data collected about genuine human interactions with systems will be increasingly valuable in the presence of LLM-generated content in data crawled from the Internet.</p><figcaption class="blockquote__byline"> Shumailov, I., Shumaylov, Z., Zhao, Y. <i>et al.</i> <a class="link" href="https://www.nature.com/articles/s41586-024-07566-y?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">AI models collapse when trained on recursively generated data</a>. <i>Nature</i></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">AI boosters frequently talk about a future when models can self-learn by training on their own outputs, or on the outputs of other AI models. It’s theoretically possible that the models get so good that this “model collapse” problem disappears. But even if LLMs <i>could</i> get that good, there’s reason to believe we’ll <a class="link" href="https://epochai.org/blog/will-we-run-out-of-data-limits-of-llm-scaling-based-on-human-generated-data?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">run out of human-generated training data</a> before the models reach that level.</p><p class="paragraph" style="text-align:left;">Like the main story today, these findings contribute to my belief that we’ll need more than just ultra-huge neural networks to make anything approaching “The Singularity.”</p><h3 class="heading" style="text-align:left;">Teb’s Tidbits</h3><ul><li><p class="paragraph" style="text-align:left;">Weather prediction is <a class="link" href="https://arstechnica.com/ai/2024/06/as-a-potentially-historic-hurricane-season-looms-can-ai-forecast-models-help/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">still my favorite</a> application of ML.</p></li><li><p class="paragraph" style="text-align:left;">Google signed an <a class="link" href="https://www.404media.co/email/4650b997-7cc3-4578-834c-7e663ed3d516/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">exclusive data licensing deal</a> with Reddit.</p></li><li><p class="paragraph" style="text-align:left;">Washington Post reporters expose the social bias encoded in AI systems by <a class="link" href="https://www.washingtonpost.com/technology/interactive/2024/ai-bias-beautiful-women-ugly-images/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">asking them to draw women</a>.</p></li><li><p class="paragraph" style="text-align:left;">AI is popular with criminals, <a class="link" href="https://www.technologyreview.com/2024/05/21/1092625/five-ways-criminals-are-using-ai/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">these are the most popular crimes</a>.</p></li><li><p class="paragraph" style="text-align:left;">A UN summit on using AI “for good” <a class="link" href="https://www.technologyreview.com/2024/06/04/1093123/what-i-learned-from-the-uns-ai-for-good-summit/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">convinced this reporter</a> that it’s more likely to be the opposite.</p></li><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.nytimes.com/2024/05/09/technology/meet-my-ai-friends.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">Kevin Roose’s adventure making friends with AI</a>.</p></li></ul></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Remember…</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The Lab Report is free and doesn’t even advertise. Our curricula is open source and published under a public domain license for <i>anyone </i>to use for <i>any purpose. </i>We’re also a very small team with no investors.</p><p class="paragraph" style="text-align:left;">Help us keep providing these free services by scheduling one of our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">world class trainings</a>, requesting a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">custom class</a> for your team, or taking one of our <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=logic-problems-causing-problems" target="_blank" rel="noopener noreferrer nofollow">open enrollment classes</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></div></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=dc60a948-ca70-460c-90c1-a6a389b2915e&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>Software and the Climate</title>
  <description>Ephemeral products, tangible impacts.</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/0ab2fa11-d81e-4e7c-afac-64d3891d9b1d/datacenter.png" length="1833059" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/software-climate</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/software-climate</guid>
  <pubDate>Sun, 05 May 2024 10:00:00 +0000</pubDate>
  <atom:published>2024-05-05T10:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><h1 class="heading" style="text-align:left;" id="the-lab-report">The Lab Report</h1><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I’m Tyler Elliot Bettilyon (Teb) and this is the Lab Report: Our monthly Sunday long read. We hope it helps deepen your understanding of software and technology. <br><br>If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. <br><br>If you like what you’re reading you’ll love one of our classes. Signup for an <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">upcoming class</a>, browse our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">course catalog</a> for corporate trainings, or request a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">custom class consultation</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><div id="section" class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">From The Lab</h1></div><p class="paragraph" style="text-align:left;">We’re running another session of <a class="link" href="https://www.tebs-lab.com/upcoming-classes/p/introduction-to-python-4-week-course-8zkkp?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">Introduction to Python</a>, starting mid-May. It’s currently on sale. Bonus points: Lab Report readers can get an additional 10% off with the following code:</p><div class="codeblock"><pre><code>REPORT-READER</code></pre></div><div id="lesson" class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Today’s Lesson</h1><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/0ab2fa11-d81e-4e7c-afac-64d3891d9b1d/datacenter.png?t=1714751402"/></div><h2 class="heading" style="text-align:left;">Software’s Growing Energy Problem</h2><p class="paragraph" style="text-align:left;">According to a recent International Energy Agency (IEA) <a class="link" href="https://www.datacenterfrontier.com/energy/article/33038469/iea-study-sees-ai-cryptocurrency-doubling-data-center-energy-consumption-by-2026?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">report</a>, data centers accounted for 460 terawatt hours (TWh) of energy consumption in 2022. They forecast that, based largely on growth in AI and cryptocurrency, that number could balloon to nearly 1000 TWh by 2026. For perspective, in 2022 the <a class="link" href="https://assets.publishing.service.gov.uk/media/64c23a300c8b960013d1b05e/DUKES_2023_Chapter_5.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">United Kingdom consumed about 320 TWh</a> and the <a class="link" href="https://www.statista.com/statistics/201794/us-electricity-consumption-since-1975/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">United States consumed roughly 4000 TWh</a>. </p><p class="paragraph" style="text-align:left;">Electricity consumption is only <i>one </i>of the resources the computing industry is gobbling up. The CPUs and GPUs that power these data centers require rare Earth metals to produce, driving <a class="link" href="https://theconversation.com/demand-for-computer-chips-fuelled-by-ai-could-reshape-global-politics-and-security-224438?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">fierce geopolitical competition and significant mining efforts</a>.</p><p class="paragraph" style="text-align:left;">Data centers are increasingly <a class="link" href="https://www.theatlantic.com/technology/archive/2024/03/ai-water-climate-microsoft/677602/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">pulling water from the deserts of Arizona</a>. Using a precious and dwindling resource to keep data centers cool in one of the hottest, dryest places in the US — <a class="link" href="https://www.azwater.gov/drought/drought-status?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">in the middle of a long drought</a> — seems foolish to this writer, but the state and some of its municipalities have created <a class="link" href="https://www.theregister.com/2023/07/15/cloud_datacenters_desert_arizona/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">significant incentives for building data centers in the state</a>.</p><p class="paragraph" style="text-align:left;">Computing’s two most recent flavors of the week — cryptocurrency and machine learning — get a lot of deserved blame for the growth in energy consumption, land use, etc. But the entire field has <a class="link" href="https://idlewords.com/talks/website_obesity.htm?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">lost sight of efficiency, embraced bloat</a>, and leaned heavily on hardware improvements and data center expansion to hide the ugly reality that software just keeps getting <a class="link" href="https://tonsky.me/blog/js-bloat/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">bigger</a> and <a class="link" href="https://stackoverflow.blog/2023/12/25/is-software-getting-worse/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">slower</a>. </p><p class="paragraph" style="text-align:left;">It’s no wonder we see essays like Rina Diane Caballar’s recent <i><a class="link" href="https://spectrum.ieee.org/green-software?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">We Need To Decarbonize Software</a></i><i>. </i>The situation is so dire that OpenAI CEO Sam Altman has said a “breakthrough” in electrical production will be needed to adequately power AI models in the near future, and that it’s driving him to invest in <a class="link" href="https://finance.yahoo.com/news/openai-ceo-altman-says-davos-173636736.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">nuclear fusion companies</a>.</p><h2 class="heading" style="text-align:left;">How Did We Get Here?</h2><p class="paragraph" style="text-align:left;">There are four key factors driving software’s astounding resource consumption.</p><ol start="1"><li><p class="paragraph" style="text-align:left;">The widespread adoption of digital technology.</p></li><li><p class="paragraph" style="text-align:left;">Layering ever more “levels of indirection” and software dependencies into new software.</p></li><li><p class="paragraph" style="text-align:left;">Incredible hardware that hides poor performance.</p></li><li><p class="paragraph" style="text-align:left;">Types of software that naturally require lots of computational power.</p></li></ol><p class="paragraph" style="text-align:left;">Widespread adoption is a force multiplier on the other three issues — the more people use our software, the more its inefficiencies matter.</p><h3 class="heading" style="text-align:left;">Layering and Dependencies</h3><p class="paragraph" style="text-align:left;">I’ll call myself out here: <a class="link" href="https://www.tebs-lab.com/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">My website’s home page</a> — which is not complex at all — requires 1.85 MB of data to be transferred over the wire to load. That’s embarrassing, frankly. But I can’t do much to fix it because I built the site using Squarespace. This gives me tons of productivity benefits compared to implementing the site from scratch, but it makes me reliant on their inefficient, bloated system.</p><p class="paragraph" style="text-align:left;">By comparison, the <a class="link" href="https://tebs-game-of-life.com/conways-editor/conways-editor.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">most complex page</a><a class="link" href="https://tebs-game-of-life.com/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow"> of my Conway’s Game of Life simulator</a> — which is much more complex and interactive than my website’s homepage — only transfers 119KB, 100KB of which is the <a class="link" href="https://en.wikipedia.org/wiki/Favicon?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">favicon</a>.</p><p class="paragraph" style="text-align:left;">This kind of tradeoff happens a lot in software. Python code runs slower than equivalent C code in no small part because the Python interpreter is its <a class="link" href="https://www.youtube.com/watch?v=KsZLPTRSleI&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">own whole program that has to run in order to execute the Python code</a>, whereas the C code will be compiled and then executed directly as machine code. But, most people agree that writing, debugging, and deploying Python code is a lot easier.</p><p class="paragraph" style="text-align:left;">From machine code to assembly to compiled code to interpreted code … all the way up through things like Squarespace, <a class="link" href="https://www.electronjs.org/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">Electron</a>, and <a class="link" href="https://reactnative.dev/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">React Native</a>. This layering generally saves time and effort, but it almost always slows down software and adds to the memory footprint of applications. </p><p class="paragraph" style="text-align:left;">The same is true of modern software’s extraordinary dependency chains. Whether it’s oversized JavaScript bundles sent across the wire or bloated native apps shipping with far more code than they need. This bloat is a <a class="link" href="https://spectrum.ieee.org/lean-software-development?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">major security vulnerability</a>, too.</p><p class="paragraph" style="text-align:left;">Finally, the vast web of tracking software and data collection pipelines also slow our software down. Unfortunately the hunger for big data and the success of “<a class="link" href="https://www.hbs.edu/faculty/Pages/item.aspx?num=56791&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">surveillance capitalism</a>“ are only driving wider adoption of such tracking.</p><h3 class="heading" style="text-align:left;">Great Hardware → Poor Software</h3><p class="paragraph" style="text-align:left;">The computer hardware industry is <a class="link" href="https://spectrum.ieee.org/trillion-transistor-gpu?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">absolutely incredible</a>. Advances in chip manufacturing and computer architecture have resulted in absurdly fast computers that fit in the palm of your hand. Programs that would once have been agonizingly slow are now lightning fast. The result is a software industry that doesn’t <i>need </i>to prioritize efficiency. </p><p class="paragraph" style="text-align:left;">When computers were much slower and memory was highly constrained, programmers had to go to <a class="link" href="https://blog.moertel.com/posts/2013-12-14-great-old-timey-game-programming-hack.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">extraordinary lengths</a> to <a class="link" href="https://www.youtube.com/watch?v=izxXGuVL21o&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">optimize</a> programs to run on <a class="link" href="https://www.youtube.com/watch?v=izxXGuVL21o&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">constrained systems</a>. An SNES cartridge could hold about 4 MB of data, so your whole game — sprites, music, and all — had to fit in that space or you literally couldn’t ship it. </p><p class="paragraph" style="text-align:left;">Nowadays, the opposite is true. Computers are so amazingly fast and have so much memory that developers distribute packages in the tens to hundreds of gigabytes without a second thought. Similarly, slow code goes unnoticed because CPUs are super fast. </p><h3 class="heading" style="text-align:left;">Compute Intense Software</h3><p class="paragraph" style="text-align:left;">The advances in computer hardware also make certain types of computationally intense software feasible. Neural networks were invented in the 1950s, but they weren’t broadly <i>useful </i>until the 2000s. That’s because older computers just weren’t fast enough to train the models effectively. State of the art LLMs take <i>months</i> to train on modern supercomputing clusters. </p><p class="paragraph" style="text-align:left;">Cryptocurrency, LLMs, and other compute-intense systems can only be useful because of modern hardware. Now that they’re useful and becoming popular, they generate mode demand for high-performance computer hardware, and we’re caught in a positive feedback loop.</p><h2 class="heading" style="text-align:left;">What Can Be Done?</h2><p class="paragraph" style="text-align:left;">What gets measured gets managed — the first step for most programmers is to start measuring performance characteristics. Start monitoring the size of your software bundle and <a class="link" href="https://betterprogramming.pub/a-comprehensive-guide-to-profiling-python-programs-f8b7db772e6?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">profiling your code</a>. Once you know what’s making your code slow, you can start making it faster.</p><p class="paragraph" style="text-align:left;">If you’ve never taken a course on data structures and algorithms, consider doing so. The fundamentals of algorithmic complexity, as well as common optimization strategies like Dynamic Programming and Divide and Conquer, show up across programming domains. <a class="link" href="https://csprimer.com/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">Oz Nova’s CS Primer</a> is an incredible place to start this journey.</p><p class="paragraph" style="text-align:left;">Follow domain-specific trends. A fascinating example in the machine learning world right now is <a class="link" href="https://huggingface.co/docs/transformers/en/main_classes/quantization?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">quantization</a>. This is a way to take higher precision parameters — say 32-bit floats — and reduce the precision after training to something like 8-bit or 4-bit. This makes the inference computation even more parallelizable without sacrificing much in terms of model performance metrics.</p><p class="paragraph" style="text-align:left;">Evaluate your dependencies, maybe you don’t need to rely on a 400kb 3rd party package just to use one function or class. Consider using a lower-level or compiled language. As much as I love Python code written in Rust or C is just faster. </p><p class="paragraph" style="text-align:left;">Overall, the most important message is to <i>care about efficiency for its own sake </i>not just when it causes problems. As a little bonus, your server costs will come down, too. 😉 </p></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Themes in the News</h1><h3 class="heading" style="text-align:left;">Are Phones and The Internet Bad For Kids?</h3><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/ecd453d4-82b4-4913-a289-30e198af9522/child.png?t=1714767034"/></div><p class="paragraph" style="text-align:left;">Jonathan Haidt’s new book, <a class="link" href="https://www.penguinrandomhouse.com/books/729231/the-anxious-generation-by-jonathan-haidt/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">The Anxious Generation</a>, was published in late March. In the broadest possible terms, the premise of the book is that smartphones are bad for children and society should really do something about it. His promotional team deserves special kudos: For a few days it seemed like he was everywhere, appearing on <a class="link" href="https://www.youtube.com/watch?v=Z-nBEX1FgBk&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">podcasts</a>, <a class="link" href="https://www.theatlantic.com/technology/archive/2024/03/teen-childhood-smartphone-use-mental-health-effects/677722/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">writing op-eds</a>, etc.</p><p class="paragraph" style="text-align:left;">In the book, Haidt highlights growing rates of depression and suicide among young people, declining rates of play and in-person socialization, increasing rates of attention disorders, and more, laying substantial blame directly on smartphones and the various internet services kids access using them. Here’s a representative snip from one of those promotional editorials in The Atlantic:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">A <a class="link" href="https://www.sciencedirect.com/science/article/pii/S1054139X20300835?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">study</a> of teens in 29 nations found that between 5 and 15 percent of adolescents engage in what is called “problematic social media use,” which includes symptoms such as preoccupation, withdrawal symptoms, neglect of other areas of life, and lying to parents and friends about time spent on social media.</p><figcaption class="blockquote__byline"><a class="link" href="https://www.theatlantic.com/technology/archive/2024/03/teen-childhood-smartphone-use-mental-health-effects/677722/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">Jonathan Haidt, The Atlantic</a></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Haidt has long been a polarizing figure, and The Anxious Generation was not an exception. It resonated with some reviewers:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I felt a gnawing anxiety as I read the book, thinking not only of my three young children, who I’d like to keep away from the badlands of social media for as long as possible, but also of the uncounted hours I have spent on my phone, mindlessly scrolling.</p><figcaption class="blockquote__byline"><a class="link" href="http://www.theguardian.com/books/2024/mar/21/the-anxious-generation-by-jonathan-haidt-a-pocket-full-of-poison?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">Sophie McBain, The Guardian</a></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">And … not as much with others:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The book’s repeated suggestion that digital technologies are rewiring our children’s brains and causing an epidemic of mental illness is not supported by science. Worse, the bold proposal that social media is to blame might distract us from effectively responding to the real causes of the current mental-health crisis in young people.</p><figcaption class="blockquote__byline"><a class="link" href="https://www.nature.com/articles/d41586-024-00902-2?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">Candice L. Odgers, Nature</a></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">“Protect the children” is — of course — a common political rallying cry. Some of the book’s detractors worry about what happens when that mantra is taken too far. For example, The Markup recently published a fascinating piece about just how much of the internet has been <a class="link" href="https://themarkup.org/digital-book-banning/2024/04/13/schools-were-just-supposed-to-block-porn-instead-they-sabotaged-homework-and-censored-suicide-prevention-sites?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">banned by public school districts</a>. The list included <a class="link" href="http://nasa.gov?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">nasa.gov</a>, <a class="link" href="https://merriam-webster.com/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">merriam-webster.com</a>, and a bunch of other perfectly child-appropriate sites.</p><p class="paragraph" style="text-align:left;">404 Media <a class="link" href="https://www.404media.co/age-verification-laws-will-drag-us-back-to-the-dark-ages-of-online-porn/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">explored the substantial risks</a> of the recent spate of age verification laws. Arguing, for example, that “people–including minors–will go to unmoderated, actively harmful alternatives that don’t require handing over a government-issued ID to see people have sex. Meanwhile, performers and companies that are trying to do the right thing will suffer.”</p><p class="paragraph" style="text-align:left;">Meanwhile, there’s no doubt that some of what’s available to kids via their smartphones is truly horrible. Stories about kids using “nudify” apps to make realistic AI-generated pornography of their classmates <a class="link" href="https://www.dailymail.co.uk/news/article-13340553/washington-leads-crackdown-ai-porn-bill.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">keep</a> popping <a class="link" href="https://www.nbcnews.com/tech/tech-news/beverly-hills-school-expels-students-deepfake-nude-photos-rcna142480?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">up</a>. And Instagram is even <a class="link" href="https://www.404media.co/email/d2bebba9-5808-44fc-8352-d93d1791a5ff/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">advertising those apps</a>! </p><p class="paragraph" style="text-align:left;">Similarly, the narrow case that allowing kids to access their phones in school is terrible for learning outcomes seems quite solid. Slow Boring did a </p><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.slowboring.com/p/smartphones-in-schools-are-bad?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">long, well-researched piece</a>, and the Norweigan Institute for Public Health recently published <a class="link" href="https://openaccess.nhh.no/nhh-xmlui/bitstream/handle/11250/3119200/DP%2001.pdf?sequence=1&isAllowed=y&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">overwhelmingly positive results</a> from a review of schools that implemented cellphone bans.</p><h3 class="heading" style="text-align:left;">The XZ Backdoor Attempt</h3><p class="paragraph" style="text-align:left;">An attempted “supply chain” attack was thwarted. The attack involved an open-source <a class="link" href="https://github.com/tukaani-project/xz?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">compression utility called XZ</a>. The utility is a mainstay of the Linux operating system, which powers roughly 90% of all cloud computing. Had the malicious code not been found the backdoor would have allowed the attacker to execute any code they wanted on the infected machines via an <a class="link" href="https://arstechnica.com/security/2024/04/what-we-know-about-the-xz-utils-backdoor-that-almost-infected-the-world/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">interaction with XZ and SSH.</a></p><p class="paragraph" style="text-align:left;">The attack appears to have been years in the making. Someone — or possibly a group — going by the name Jia Tan (JiaT75 on GitHub) started making pull requests to open-source packages in 2021. A pull request represents a set of changes to a codebase, and if a library’s “maintainer” approves the PR, those changes will be applied to the code. Here’s “Tan’s” <a class="link" href="https://github.com/libarchive/libarchive/pull/1609?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">first known pull request</a>. </p><p class="paragraph" style="text-align:left;">Over time, Tan kept working in the open-source community. Submitting PRs, participating in code review, and ultimately courting the maintainer of XZ to <a class="link" href="https://www.404media.co/the-xz-backdoor-highlights-the-vulnerability-of-open-source-software-and-its-strengths/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">become a maintainer themselves</a>. Once Tan became a maintainer, they had the authority to make changes to XZ without any approval. This February, they used that authority to insert the backdoor.</p><p class="paragraph" style="text-align:left;">Andres Freund, a software engineer at Microsoft, “<a class="link" href="https://mastodon.social/@AndresFreundTec/112180083704606941?ref=404media.co&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">accidentally</a>” discovered the vulnerability while investigating a performance issue and benchmarking a database technology called Postgres. Apparently, the issue was related to the malicious code injected into XZ. Freund wrote on Mastodon that investigating the issue and discovering the vulnerability, “<a class="link" href="https://mastodon.social/@AndresFreundTec/112180406142695845?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">really required a lot of coincidences.</a>” </p><p class="paragraph" style="text-align:left;">The episode highlights both the strengths and weaknesses of open-source software. The fact that so many companies (many unwittingly) rely on packages like XZ in absolutely critical means they are exposed to attacks like this, originating from software and software engineers they have not vetted. But, the fact that open-source is all done publicly also led to a speedy and transparent investigation. </p><h3 class="heading" style="text-align:left;">Teb’s Tidbits</h3><ul><li><p class="paragraph" style="text-align:left;">The FTC is <a class="link" href="https://www.ftc.gov/news-events/news/press-releases/2024/04/ftc-announces-rule-banning-noncompetes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">banning noncompetes</a> nationwide.</p></li><li><p class="paragraph" style="text-align:left;">Congress <a class="link" href="https://www.platformer.news/tik-tok-ban-bytedance-faq-timeline/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">actually banned TikTok</a>, what now?</p></li><li><p class="paragraph" style="text-align:left;">LLMs are starting to become <a class="link" href="https://www.theregister.com/2024/04/17/gpt4_can_exploit_real_vulnerabilities/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">decent little hackers</a>.</p></li><li><p class="paragraph" style="text-align:left;">How to make a <a class="link" href="https://www.wsj.com/politics/how-i-built-an-ai-powered-self-running-propaganda-machine-for-105-e9888705?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">horrible AI-driven fake-news click farm</a> for $105.</p></li><li><p class="paragraph" style="text-align:left;">Meet the hacker who single-handedly <a class="link" href="https://www.wired.com/story/p4x-north-korea-internet-hacker-identity-reveal/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">took North Korea’s internet offline</a>.</p></li><li><p class="paragraph" style="text-align:left;">New York City’s chatbot <a class="link" href="https://themarkup.org/news/2024/03/29/nycs-ai-chatbot-tells-businesses-to-break-the-law?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">lied to New Yorkers</a> about the law, oops.</p></li><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.404media.co/email/5c8178b3-c202-4e89-9e04-d481a36345ef/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">Gamers are selling their GPU’s idle time</a>, and it’s probably being used to make nonconsensual deepfake porn. </p></li></ul></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Remember…</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The Lab Report is free and doesn’t even advertise. Our curricula is open source and published under a public domain license for <i>anyone </i>to use for <i>any purpose. </i>We’re also a very small team with no investors.</p><p class="paragraph" style="text-align:left;">Help us keep providing these free services by scheduling one of our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">world class trainings</a>, requesting a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">custom class</a> for your team, or taking one of our <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=software-and-the-climate" target="_blank" rel="noopener noreferrer nofollow">open enrollment classes</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></div></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=cffdb91f-513d-41d4-8d96-9909c5254f24&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>It&#39;s Not All Exponential Growth</title>
  <description>Soaring costs meet a performance plateau</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/fc8e26d8-3a6c-4634-8b23-e284a037d9fc/image.png" length="2203134" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/not-exponential-growth</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/not-exponential-growth</guid>
  <pubDate>Fri, 26 Apr 2024 11:00:00 +0000</pubDate>
  <atom:published>2024-04-26T11:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I’m Tyler Elliot Bettilyon (Teb) and this is a Brief: Our shortish mid-month edition. <br><br>If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=new-post" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. <br><br>If you like what you’re reading you’ll love one of our classes. Signup for an <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=new-post" target="_blank" rel="noopener noreferrer nofollow">upcoming class</a>, browse our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=new-post" target="_blank" rel="noopener noreferrer nofollow">course catalog</a> for corporate trainings, or request a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=new-post" target="_blank" rel="noopener noreferrer nofollow">custom class consultation</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><h1 class="heading" style="text-align:left;" id="line-goes-up">Line Goes Up?</h1><p class="paragraph" style="text-align:left;">By now, you’ve heard that AI is experiencing “exponential growth.” <a class="link" href="https://www.newyorker.com/magazine/2024/03/18/among-the-ai-doomsayers?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">Depending on who you ask</a>, this growth might cause the world&#39;s end or bring about Star Trek’s dream of fully automated luxury capitalism. We all hope it’s the latter, but know in our hearts that if it’s only one of the two… it’s probably the apocalypse.</p><p class="paragraph" style="text-align:left;">In a recent interview on 20VC, OpenAI’s CEO Sam Altman made a less audacious claim when he advised startups that they should plan their business “<a class="link" href="https://youtu.be/G8T1O81W96Y?t=1270&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">assuming Open AI will stay on the same rate of trajectory and the models are gonna keep getting better at the same pace</a>,” otherwise he added, “we’re gonna steamroll you.”</p><p class="paragraph" style="text-align:left;">But neither of those trends are what we’ve seen from ML historically. Here are six charts from <a class="link" href="https://paperswithcode.com/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">Papers With Code</a> showing improvement over time on popular ML benchmarks. We see incremental improvement with periodic bursts, typically with a slowing growth rate over time. </p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/eefd3fef-3238-480b-b3fc-0a1b1d591671/stacked-performance.png?t=1713892148"/><div class="image__source"><span class="image__source_text"><p> </p></span></div></div><p class="paragraph" style="text-align:left;">Machine learning systems often struggle with a “last mile” problem: It’s easy to go from terrible to good, but much harder to go from good to excellent. Notably, as performance reaches human levels, progress slows and further gains are harder won. Models can and do surpass humans, but it takes much more work per unit of improvement. Here’s a chart from Stanford’s Human-Centered AI (HAI) lab in their <a class="link" href="https://aiindex.stanford.edu/report/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">recent AI Index report</a> showing this pleateu:</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/3b1e4649-f62a-4ff1-a869-30c64814454b/image.jpeg?t=1713926684"/></div><p class="paragraph" style="text-align:left;">There are some truly exponential trends, though: Compute resources, energy use, and training costs. Here are two charts via <a class="link" href="https://spectrum.ieee.org/ai-index-2024?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">IEEE Spectrum</a> (based on the same HAI Index) showing the tremendous growth in carbon footprint — which is a fuzzy proxy for energy use — and cost associated with training foundation models:</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/b95bf2ad-45e6-433d-bfa5-9f13a2f9cabe/ieee-montage.png?t=1713893465"/></div><p class="paragraph" style="text-align:left;">These costs are growing exponentially because the models are growing exponentially, as reported by the research firm Epoch AI:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Model size slowly increased by 7 orders of magnitude from the 1950s to around 2018. <b>Since 2018, growth has accelerated for language models, with model size increasing by another 4 orders of magnitude in the four years from 2018 to 2022</b> (see Figure 1). Other domains like vision have grown at a more moderate pace, but still faster than before 2018.</p><figcaption class="blockquote__byline"><a class="link" href="https://epochai.org/blog/machine-learning-model-sizes-and-the-parameter-gap?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">Epoch AI</a> (emphasis original) </figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Which, in turn, has caused the dataset size to grow exponentially. Research by Deep Mind in 2022 found that growth in model size only results in commensurate growth in model performance when the datasets grow at the same rate:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">By training over 400 language models ranging from 70 million to over 16 billion parameters on 5 to 500 billion tokens, we find that <b>for compute-optimal training, the model size and the number of training tokens should be scaled equally: for every doubling of model size the number of training tokens should also be doubled</b>.</p><figcaption class="blockquote__byline"><a class="link" href="https://arxiv.org/pdf/2203.15556.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">Deep Mind, many authors</a>. (Emphasis mine) </figcaption></blockquote></div><h2 class="heading" style="text-align:left;" id="the-quest-for-the-holy-scale">The Quest for The Holy Scale</h2><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/fc8e26d8-3a6c-4634-8b23-e284a037d9fc/image.png?t=1713983906"/><div class="image__source"><span class="image__source_text"><p>Source: DALL-E-3</p></span></div></div><p class="paragraph" style="text-align:left;">True believers will sell you a “scale is all you need” <a class="link" href="https://www.tshirtstudio.com/marketplace/agi-wear/scaling-agi-t-shirt?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">T-shirt</a>. And, In fairness to them, nearly every key innovation in Deep Learning research over the past decade has been a mechanism that allowed models to scale up more effectively.</p><p class="paragraph" style="text-align:left;">In 2012, <a class="link" href="https://papers.nips.cc/paper_files/paper/2012/hash/c399862d3b9d6b76c8436e924a68c45b-Abstract.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">AlexNet’s</a> key contributions were “a very efficient GPU implementation of convolutional nets” and the first mainstream use of the Rectified Linear Unit (ReLU) as an activation function. ReLU’s gradient is much more efficient to compute than sigmoid, which was the standard at the time. ReLU also helped mitigate something called the <a class="link" href="https://www.kdnuggets.com/2022/02/vanishing-gradient-problem.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">vanishing gradient problem</a>.</p><p class="paragraph" style="text-align:left;">These improvements allowed networks to train more rounds given a fixed amount of computation and fixed dataset, which eventually resulted in overfitting. </p><p class="paragraph" style="text-align:left;">In 2014, dropout was introduced as “<a class="link" href="https://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">a simple way to prevent neural networks from overfitting.</a>” Dropout allowed models to be trained more rounds on a dataset of a fixed size, but researchers were still struggling to make networks deeper. AlexNet was only 5 convolutional layers (pitifully shallow, by today’s standards). <a class="link" href="https://arxiv.org/abs/1409.1556?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">VGG</a> got us to ~19 layers by using smaller convolutional kernels.</p><p class="paragraph" style="text-align:left;">In 2015, <a class="link" href="https://arxiv.org/abs/1512.03385?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">the residual or “skip” connection</a> blew the lid off the vanishing gradient problem. Combined with another 2015 innovation, <a class="link" href="https://arxiv.org/abs/1502.03167?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">batch normalization</a>, so-called “ResNets” could train effectively with 152 layers and roughly 115 million parameters. </p><div class="blockquote"><blockquote class="blockquote__quote"></blockquote></div><p class="paragraph" style="text-align:left;">Language models at the time benefited from ReLU, batch norm, and skip connections. However, the state-of-the-art models were all some form of recurrent neural network, which have a crucial bottleneck: they must fully process each word before moving on to the next one. This made them a poor fit for GPU processing. </p><p class="paragraph" style="text-align:left;">It’s hard to overstate how devastating this was for ML-based language processing. <a class="link" href="https://blog.tebs-lab.com/p/bitter-lesson-moores-law-trade-war-china?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">CPU clock speeds have been stagnant for nearly 20 years</a>. Almost all the improvements in high-performance computing during that time have come from some form of parallelization. This bottleneck effectively locks recurrent neural networks out of those performance gains. </p><p class="paragraph" style="text-align:left;">This is why, for example, facial recognition was being commodified while chatbots were still in their “<a class="link" href="https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">Microsoft Tay</a>” era.</p><p class="paragraph" style="text-align:left;">Finally, in 2017, the “<a class="link" href="https://arxiv.org/abs/1706.03762?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">attention is all you need</a>” paper did for language what AlexNet did for vision: introduced a highly parallelizable, GPU-efficient mechanism for training a neural network on language data. After that, LLMs were off to the races.</p><p class="paragraph" style="text-align:left;">There have been a handful of non-trivial innovations since then. <a class="link" href="https://huggingface.co/blog/moe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">Mixture of experts layers</a> and <a class="link" href="https://huggingface.co/blog/rlhf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">reinforcement learning with human feedback</a> come to mind. But a lot of the progress in LLMs has just been scaling up the basic attention mechanism by increasing the size of embeddings and context windows, adding more “heads” of attention per layer, and adding more attention layers.</p><p class="paragraph" style="text-align:left;">Still, all this scaling is what resulted in the charts above. This is why even luminaries with skin in the game, such as Meta’s AI chief Yann LeCun, can say LLMs are “useful, there&#39;s no question.<a class="link" href="https://time.com/6694432/yann-lecun-meta-ai-interview/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow"> But they are not a path towards human-level intelligence</a>.”</p><h2 class="heading" style="text-align:left;" id="where-does-that-leave-machine-learn">Where Does That Leave Machine Learning?</h2><p class="paragraph" style="text-align:left;">Prognosticators are increasingly using “<a class="link" href="https://www.washingtonpost.com/technology/2024/04/18/ai-bubble-hype-dying-money/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">AI</a>” and ”<a class="link" href="https://www.theguardian.com/commentisfree/2024/apr/13/from-boom-to-burst-the-ai-bubble-is-only-heading-in-one-direction?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">bubble</a>” in the <a class="link" href="https://www.cnn.com/2024/03/14/investing/premarket-stocks-trading-ai-bubble-grantham/index.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">same headline</a>. The industry is in a weird place. There’s a lot of FOMO, and eager grifters taking advantage of the trend.</p><p class="paragraph" style="text-align:left;">Startups are <a class="link" href="https://www.ccn.com/news/business/ai-unicorns-rise-fall-tech-billion-dollar-darlings/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">burning through cash</a> chasing SOTA results. But even with huge investments, smaller firms can’t spend the kind of money it will cost to train the next generation of models. This reality is behind <a class="link" href="https://fortune.com/2024/03/26/inflection-stability-ai-emad-mostaque-mustafa-suleyman/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">high-profile executive departures last month</a> from Inflection AI and Stability AI; both were seasoned AI veterans seeking better-capitalized firms. </p><p class="paragraph" style="text-align:left;">Meanwhile, those better-capitalized giants are <a class="link" href="https://www.nytimes.com/2024/04/06/technology/tech-giants-harvest-data-artificial-intelligence.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">cutting corners and dredging the depths of the internet</a> out of desperation to acquire data sets big enough to train the next generation of Large Language Models. Or they’re <a class="link" href="https://www.bloomberg.com/news/articles/2024-03-28/amazon-bets-150-billion-on-data-centers-required-for-ai-boom?srnd=homepage-americas&sref=MTy2GeXk&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">spending $150 billion on new data centers</a> to quench AI’s insatiable computational thirst.</p><p class="paragraph" style="text-align:left;">Self-driving car hype has <a class="link" href="https://theconversation.com/we-were-told-wed-be-riding-in-self-driving-cars-by-now-what-happened-to-the-promised-revolution-201088?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">fallen off a cliff</a>. Cruise Automation’s internal share price was <a class="link" href="https://www.reuters.com/business/autos-transportation/gms-cruise-valuation-slashed-by-more-than-half-email-shows-2024-02-29/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">slashed by 50% in February</a> following a high-profile accident and <a class="link" href="https://www.reuters.com/business/autos-transportation/gm-cruise-robotaxi-unit-faces-us-probes-over-dragging-incident-vows-reforms-2024-01-25/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">crushing safety analysis</a>. Apple just closed its self-driving division, <a class="link" href="https://www.theguardian.com/technology/2024/apr/05/apple-layoffs-california-self-driving-cars?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">laying off 600 workers</a>. Uber and Lyft both shuttered their self-driving divisions, throwing in the towel in <a class="link" href="https://www.theverge.com/2020/12/7/22158745/uber-selling-autonomous-vehicle-business-aurora-innovation?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">2020</a> and <a class="link" href="https://techcrunch.com/2021/04/26/lyft-sells-self-driving-unit-to-toyotas-woven-planet-for-550m/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">2021</a>, respectively. Both have since partnered with Motional to offer limited robotaxi services, though.</p><p class="paragraph" style="text-align:left;">All that even though fully autonomous vehicles <i>are </i>deployed and operating today, albeit in limited circumstances.</p><p class="paragraph" style="text-align:left;">The most successful self-driving firm, Alphabet’s Waymo, says they’re taking “<a class="link" href="https://www.sfchronicle.com/bayarea/article/waymo-robotaxi-freeway-sfo-18707809.php?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">a careful and incremental approach</a>” to service expansion — decidedly not exponential. When asked about Waymo’s biggest internal obstacle, their chief product officer, Saswat Panigrahi, replied, “<a class="link" href="https://www.cnbc.com/2023/12/05/waymo-chief-product-officer-on-progress-competition-vs-cruise.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">bringing the cost down</a>.”</p><p class="paragraph" style="text-align:left;">When you hear Sam Altman say human-level artificial general intelligence will be here in the “<a class="link" href="https://www.cnbc.com/2024/01/16/openais-sam-altman-agi-coming-but-is-less-impactful-than-we-think.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">reasonably close-ish future</a>,” remind yourself that in 2016 Lyft co-founder John Zimmer predicted personal car ownership would “<a class="link" href="https://www.vanityfair.com/news/2016/09/lyfts-president-says-car-ownership-will-all-but-end-by-2025?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">all but end</a>” by 2025. </p><p class="paragraph" style="text-align:left;">Still, the lesson of the dotcom era wasn’t that the internet was a horrible technology without real use cases. A lot of bullshit chatbots will surely go the way of <a class="link" href="https://www.uky.edu/~dsianita/695ec/failure.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">pets.com</a>. But ML is already powering immensely popular and economically valuable tools and services, too. Advertising networks, recommendation engines, and spam filters are all ML-based. Research and development in ML drug and materials discovery looks promising. Robotics seems poised to have <a class="link" href="https://www.technologyreview.com/2024/04/11/1090718/household-robots-ai-data-robotics/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">a warm day in the sun using ML techniques</a>. LLMs seem like a good fit for genetic data. Generative image, sound, and video models all have legitimate applications in creative enterprises.</p><p class="paragraph" style="text-align:left;">Here’s one last chart from the AI Index. It shows decreased costs and increased revenue attributed to embracing AI in the workplace. Not all generative AI — but a wide variety of ML tools. Many of which are <i>much simpler</i> than the current batch of LLMs.</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/3981da14-d9fd-41f7-b26d-892285151e8c/image.jpeg?t=1714011471"/><div class="image__source"><span class="image__source_text"><p>Source: <a class="link" href="https://aiindex.stanford.edu/report/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">https://aiindex.stanford.edu/report/</a></p></span></div></div><p class="paragraph" style="text-align:left;">Steady, incremental improvements will still deliver real value for firms that embrace machine learning. Yes, many ML firms will collapse as the LLM hype subsides, but ML itself is here to stay.</p><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Remember…</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The Lab Report is free and doesn’t even advertise. Our curricula is open source and published under a public domain license for <i>anyone </i>to use for <i>any purpose. </i>We’re also a very small team with no investors.</p><p class="paragraph" style="text-align:left;">Help us keep providing these free services by scheduling one of our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">world class trainings</a>, requesting a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">custom class</a> for your team, or taking one of our <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=it-s-not-all-exponential-growth" target="_blank" rel="noopener noreferrer nofollow">open enrollment classes</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></div></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=068f75ee-f78b-4bac-b265-096e797894b4&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>Automated Copyright Infringement</title>
  <description>It&#39;s way easier than I expected.</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/72a4ce63-7b28-4b53-bfaf-6b5ed0c18639/montage.png" length="2809437" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/automated-copyright-infringement</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/automated-copyright-infringement</guid>
  <pubDate>Fri, 19 Apr 2024 11:00:00 +0000</pubDate>
  <atom:published>2024-04-19T11:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><h1 class="heading" style="text-align:left;" id="the-workbench">The Workbench</h1><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I’m Tyler Elliot Bettilyon (Teb) and this is The Workbench: Our practical, hands-on edition. Our goal is to demonstrate concepts and ideas we cover in The Lab Report — our monthly deep-dive.<br><br>If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. <br><br>If you like what you’re reading you’ll love one of our classes. Signup for an <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">upcoming class</a>, browse our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">course catalog</a> for corporate trainings, or request a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">custom class consultation</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><h2 class="heading" style="text-align:left;" id="related-lab-reports">Related Lab Reports:</h2><ul><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://blog.tebs-lab.com/p/intellectual-property-vs-artificial-intelligence?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">Intellectual Property</a></p></li><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://blog.tebs-lab.com/p/ml-failure-part-1-underfitting-overfitting?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">Overfitting</a></p></li><li><p class="paragraph" style="text-align:left;">Workbench: <a class="link" href="https://blog.tebs-lab.com/p/ai-assisted-ai-prompt-engineering?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">AI Assisted AI Prompt Engineering</a></p></li></ul><h2 class="heading" style="text-align:left;" id="background-data-requirements-and-co">Background: Data Requirements and Copy Protection are Fundamentally at Odds</h2><p class="paragraph" style="text-align:left;">In February, <a class="link" href="https://blog.tebs-lab.com/p/intellectual-property-vs-artificial-intelligence?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">we wrote</a> about the wave of copyright and trademark lawsuits being filed against major ML firms. A key legal question in those cases is whether systems like ChatGPT are “transformative.” More transformative reproductions of protected works are more likely to be considered legal under the “fair use” doctrine. </p><p class="paragraph" style="text-align:left;">ML firms will argue the training process transforms images, books, etc., into something novel rather than something derivative. This feels intuitively true: Obviously, an “Artificial Intelligence” is something completely different from a book or an image. </p><p class="paragraph" style="text-align:left;">However, generative models <i>do</i> regurgitate perfect and near-perfect duplicates of their training data. <a class="link" href="https://admin.bakerlaw.com/wp-content/uploads/2023/09/ECF-13-Amended-Complaint.pdf?utm_source=blog.tebs-lab.com&utm_medium=referral&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">Getty Images</a> and <a class="link" href="https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec2023.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">The New York Times </a>both provided examples in their legal filings, as have <a class="link" href="https://spectrum.ieee.org/midjourney-copyright?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">independent researchers</a>. </p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/7472ac8a-797d-4a33-866c-8951323cd396/image.png?t=1713106597"/><div class="image__source"><span class="image__source_text"><p>An alleged infringing image and the original from <a class="link" href="https://admin.bakerlaw.com/wp-content/uploads/2023/09/ECF-13-Amended-Complaint.pdf?utm_source=blog.tebs-lab.com&utm_medium=referral&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">Getty Images’ legal complaint</a> against Stability AI</p></span></div></div><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/13f1d2ef-99c1-4f2e-8e98-9cfd3a647337/image.png?t=1713106606"/><div class="image__source"><span class="image__source_text"><p>Verbatim quoting from the <a class="link" href="https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec2023.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">New York Times complaint against OpenAI</a></p></span></div></div><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/7ab53b84-b4f4-4b21-9906-d372e470b0b9/image.png?t=1713106641"/><div class="image__source"><span class="image__source_text"><p>Several images produced by Midjourney <a class="link" href="https://spectrum.ieee.org/midjourney-copyright?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">via IEEE Spectrum</a></p></span></div></div><p class="paragraph" style="text-align:left;">In addition to claims of “direct infringement” from using protected materials as training data, ML firms should worry about “contributory infringement.” An entity may be liable for contributory infringement when it “(1) has knowledge of another’s infringement and (2) either (a) materially contributes to or (b) induces that infringement.” as established in <a class="link" href="https://casetext.com/case/erickson-prods-inc-v-kast-7?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">Erikson Prods. Inc. vs Kast</a>. </p><p class="paragraph" style="text-align:left;">Contributory infringement killed Napster — which was forced to remain shuttered <a class="link" href="https://www.nytimes.com/2001/07/12/technology/ebusiness/napster-is-told-to-remain-shut.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">even after it demonstrated it could prevent 99.4% of illegal sharing</a> with filters. That 0.6% was enough to extinguish Napster from existence.</p><p class="paragraph" style="text-align:left;"><a class="link" href="https://blog.tebs-lab.com/p/ml-failure-part-1-underfitting-overfitting?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">Overfitting</a> — when a model memorizes its training data — can cause regurgitation in generative models. But, if a model trains on many <i>different </i>images of the same character, it can memorize that character without being “overfit” in the classic sense. </p><p class="paragraph" style="text-align:left;">Complicating matters for AI firms, they are <i>desperate</i> for more data. These companies have gone to <a class="link" href="https://www.nytimes.com/2024/04/06/technology/tech-giants-harvest-data-artificial-intelligence.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">extraordinary lengths</a> to gather gargantuan datasets whose <a class="link" href="https://epochai.org/blog/trends-in-training-dataset-sizes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">exponential growth</a> has outpaced even the internet. Research from 2022 suggested firms would run out of high-quality training data <a class="link" href="https://epochai.org/blog/will-we-run-out-of-ml-data-evidence-from-projecting-dataset?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">as early as 2026</a>. ML firm’s actions confirm that data supplies are dwindling. </p><p class="paragraph" style="text-align:left;">A recent <a class="link" href="https://www.nytimes.com/2024/04/06/technology/tech-giants-harvest-data-artificial-intelligence.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">New York Times investigation</a> found Meta’s VP of Generative AI told executives they had already used “almost every available English language book, poem, and news article.” ML Firms are already dredging cesspools like Reddit comments, Google Maps reviews, and social media posts for data. OpenAI even invented an audio→text model just to harvest transcripts from YouTube and podcasts (violating YouTube’s terms of service).</p><p class="paragraph" style="text-align:left;">These models cannot exist without mountains of copy-protected data. But, fair licensing deals could drive <a class="link" href="https://spectrum.ieee.org/ai-index-2024?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">already sky-high research costs</a> out of reach. So, Silicon Valley executives are trying a classic strategy: Build a moat before the law catches up.</p><p class="paragraph" style="text-align:left;">With all this in mind, I set out to test OpenAI’s systems. How easily could I produce works “<a class="link" href="https://guides.lib.umich.edu/substantial-similarity/glossary?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">substantially similar</a>” to well-known IP? Could I defeat OpenAI’s copyright filters more than 0.6% of the time? How “transformative” are these systems, really?</p><h2 class="heading" style="text-align:left;" id="two-tries-to-mario">Two Tries to Mario</h2><p class="paragraph" style="text-align:left;">First, I just asked DALL-E to “Draw a picture of Mario,” directly. DALL-E refused for copyright reasons. First blood, OpenAI.</p><p class="paragraph" style="text-align:left;">Next, I turned to our <a class="link" href="https://github.com/Tebs-Lab/gpt-imagegen?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">OpenAI-powered prompt engineering tool</a>, which takes simple prompts and uses GPT-4 to improve them for use with DALL-E. Here’s my prompt and the result:</p><div class="codeblock"><pre><code>Subject: An italian plumber
Setting: A video game world filled with pipes
Style: Video game</code></pre></div><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/4714762c-0e6b-4d66-a77d-8bbe12dc0d4b/mario-blue-hat.png?t=1712937246"/><div class="image__source"><span class="image__source_text"><p>Does the blue hat make this parody?</p></span></div></div><p class="paragraph" style="text-align:left;">I had immediate success letting GPT-4 expand on barely-coy descriptions, like “a cartoon duck in a sailor hat,” “a red-headed mermaid,” and “a video game princess in a pink dress.”</p><div class="image"><img alt="" class="image__image" style="border-radius:0px 0px 0px 0px;border-style:solid;border-width:0px 0px 0px 0px;box-sizing:border-box;border-color:#E5E7EB;" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/72a4ce63-7b28-4b53-bfaf-6b5ed0c18639/montage.png?t=1712721580"/></div><p class="paragraph" style="text-align:left;">Sometimes, I got “budget” versions, a mix of multiple characters, or recognizable characters in a new style. For example: Samus with a gun, Bowser mixed with a Ninja Turtle, and Peach in a <i>Secret of Mana</i> style.</p><div class="image"><img alt="" class="image__image" style="border-radius:0px 0px 0px 0px;border-style:solid;border-width:0px 0px 0px 0px;box-sizing:border-box;border-color:#E5E7EB;" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/c012ffc7-48e1-4a54-97b9-6c1715cac3ea/sortof-montage.png?t=1712722179"/></div><p class="paragraph" style="text-align:left;">Other times, I got the wrong recognizable character. Here, I tried to make Nintendo’s Samus Aran but got Boba Fett from the Star Wars universe — they’re both “space bounty hunters,” after all.</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/8ad34212-6960-4189-a11d-75a85db10a27/samus-fett.png?t=1712723552"/></div><p class="paragraph" style="text-align:left;">After a little experimentation, I developed a hypothesis: GPT-4 could tell me how to ask DALL-E-3 to draw particular characters better than I could. I forked our <a class="link" href="https://github.com/Tebs-Lab/gpt-imagegen?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">gpt-imagegen</a> repo and added a feature that allows me to do this:</p><div class="codeblock"><pre><code>% ./char_gen.py -ips outputs/mario
Character: Nintendo Mario
Name replacement: he
Setting: a world of pipes
Style: 2D 16-bit video game</code></pre></div><p class="paragraph" style="text-align:left;">Which yielded this on the first try:</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/a2747562-5df0-45a9-a38e-8de5eaa7c418/0.png?t=1712792538"/></div><p class="paragraph" style="text-align:left;">I exclusively used OpenAI’s official APIs and web applications for these experiments. My technique does not require any elaborate jailbreaking methods. In fact, it’s delightfully simple.</p><h2 class="heading" style="text-align:left;" id="how-does-it-work">How Does It Work?</h2><p class="paragraph" style="text-align:left;">Step 1: Ask GPT-4 for a physical description of the character. I use this prompt format and replace {character} with user input.</p><div class="custom_html"> ﻿<div class="highlight" style="background: #2c2c34; border-radius: 5px; padding: 10px; overflow-x: scroll; max-width: 80ch"><pre style="line-height: 125%;"><span></span><code><span style="color: #eaeaeb">CHARACTER_PROMPT</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #56d364">'Give a detailed physical description of the character {character} in 50 words.'</span>
</code></pre></div></div><div class="custom_html"><div class="highlight" style="background: #2c2c34; border-radius: 5px; padding: 10px; overflow-x: scroll; max-width: 80ch"><pre style="line-height: 125%;"><span></span><code>def <span style="color: #94b6ce; font-weight: bold">fetch_character_description</span><span style="color: #eaeaeb">(client,</span> <span style="color: #eaeaeb">model,</span> <span style="color: #eaeaeb">character):</span>
<span style="color: #eaeaeb">    </span><span style="color: #56d364">'''</span>
<span style="color: #56d364">    Use the supplied args and OpenAI client to fetch a more</span>
<span style="color: #56d364">    detailed description of the art style from OpenAI.</span>

<span style="color: #56d364">    client (OpenAI client) -- client makes the request</span>
<span style="color: #56d364">    model (str) -- a valid OpenAI API model string, e.g. 'gpt-4'</span>
<span style="color: #56d364">    character (str) -- the name of a well-known character, for LLM use.</span>
<span style="color: #56d364">    '''</span>
    <span style="color: #eaeaeb">prompt_content</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">CHARACTER_PROMPT</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">format(character</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #eaeaeb">character)</span>

    <span style="color: #eaeaeb">image_prompt_response</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">client</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">chat</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">completions</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">create(</span>
        <span style="color: #eaeaeb">model</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #eaeaeb">model,</span>
        <span style="color: #eaeaeb">messages</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #eaeaeb">[</span>
        <span style="color: #eaeaeb">{</span>
            <span style="color: #56d364">"role"</span><span style="color: #eaeaeb">:</span> <span style="color: #56d364">"user"</span><span style="color: #eaeaeb">,</span>
            <span style="color: #56d364">"content"</span><span style="color: #eaeaeb">:</span> <span style="color: #eaeaeb">prompt_content</span>
        <span style="color: #eaeaeb">}],</span>
        <span style="color: #eaeaeb">temperature</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">1</span><span style="color: #eaeaeb">,</span>
        <span style="color: #eaeaeb">max_tokens</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">250</span><span style="color: #eaeaeb">,</span>
        <span style="color: #eaeaeb">top_p</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">1</span><span style="color: #eaeaeb">,</span>
        <span style="color: #eaeaeb">frequency_penalty</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">0</span><span style="color: #eaeaeb">,</span>
        <span style="color: #eaeaeb">presence_penalty</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">0</span>
    <span style="color: #eaeaeb">)</span>
    <span style="color: #eaeaeb">character_description</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">image_prompt_response</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">choices[</span><span style="color: #94b6ce">0</span><span style="color: #eaeaeb">]</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">message</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">content</span>
    
    <span style="color: #d2a8ff">return</span> <span style="color: #eaeaeb">character_description</span>
</code></pre></div></div><p class="paragraph" style="text-align:left;">The model responds with a short description. For the 16-bit Mario above, GPT-4 gave me this:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Within a stocky frame lives a spirited Italian plumber. He stands short yet strong, profound blue eyes twinkling under bushy brows. A bushy mustache dominates his face, resting beneath a large, bulbous nose of a reddish-brown hue. He dons a vibrant red hat marked by an emblematic &quot;M,&quot; and bold red overalls fastened by chunky yellow buttons. A blue long-sleeve shirt peeks from underneath. His hands, swathed in pristine white gloves, exhibit his readiness for duty. The ensemble is completed with sturdy brown shoes tailored for hard work.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">This output is often enough to get a picture of the character. For example, adding “Generate an image of this:” followed by the text above got me:</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/3b150a82-8102-47cf-9b91-9d34d67a1b69/mario.png?t=1713293680"/></div><p class="paragraph" style="text-align:left;">Sometimes, the output contains the character’s name, which often — but not always — triggers the copyright filter. Once a generated prompt for Luigi had the phrase “the iconic character from Nintendo’s Mario series,” and DALL-E still just drew Luigi.</p><p class="paragraph" style="text-align:left;">Anyway, I crudely remove the character’s name from the output:</p><div class="custom_html"><div class="highlight" style="background: #2c2c34; border-radius: 5px; padding: 10px; overflow-x: scroll; max-width: 80ch"><pre style="line-height: 125%;"><span></span><code><span style="color: #7ee787; font-style: italic"># Sanitize the output to avoid giving the name of the character to the image generator</span>
<span style="color: #7ee787; font-style: italic"># First all full copies with the replacement</span>
<span style="color: #eaeaeb">image_subject</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">image_subject</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">replace(character_name,</span> <span style="color: #eaeaeb">name_replacement)</span>

<span style="color: #7ee787; font-style: italic"># Then any lingering first or last names alone</span>
<span style="color: #d2a8ff">for</span> <span style="color: #eaeaeb">name_component</span> <span style="color: #52a5c3; font-weight: bold">in</span> <span style="color: #eaeaeb">character_name</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">split(</span><span style="color: #56d364">' '</span><span style="color: #eaeaeb">):</span>
    <span style="color: #eaeaeb">image_subject</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">image_subject</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">replace(name_component,</span> <span style="color: #eaeaeb">name_replacement)</span>
</code></pre></div></div><p class="paragraph" style="text-align:left;">To get more interesting images — with rich backgrounds or with the character taking specific actions — I send one of the following prompts to GPT-4:<br></p><div class="custom_html"><div class="highlight" style="background: #2c2c34; border-radius: 5px; padding: 10px; overflow-x: scroll; max-width: 80ch"><pre style="line-height: 125%;"><span></span><code><span style="color: #eaeaeb">SUBJECT_PROMPT</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #56d364">'''Create a detailed physical description of the following subject and setting in 100 words.</span>

<span style="color: #56d364">Subject: {subject}</span>

<span style="color: #56d364">Setting: {setting}</span>
<span style="color: #56d364">'''</span>

<span style="color: #eaeaeb">NATIVE_SETTING_PROMPT</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #56d364">'Take the following character and describe it in an appropriate setting in 100 words</span><span style="color: #36b344">\n</span><span style="color: #56d364">{character}'</span>
</code></pre></div></div><p class="paragraph" style="text-align:left;">Where {subject} or {character}, respectively, are replaced with the detailed output from above, and {setting} is supplied by the user.</p><p class="paragraph" style="text-align:left;">To get a specific style, we do something similar, adding style details to the result of the previous prompt using this format:</p><div class="custom_html"><div class="highlight" style="background: #2c2c34; border-radius: 5px; padding: 10px; overflow-x: scroll; max-width: 80ch"><pre style="line-height: 125%;"><span></span><code><span style="color: #eaeaeb">IMG_PROMPT_REQUEST</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #56d364">'''Write a prompt for an image generator using the following content and style in 150 words.</span>

<span style="color: #56d364">Image content: {content}</span>

<span style="color: #56d364">Image Style: {style}</span>
<span style="color: #56d364">'''</span>
</code></pre></div></div><h2 class="heading" style="text-align:left;" id="examples">Examples:</h2><h3 class="heading" style="text-align:left;" id="aang-from-avatar-the-last-airbender">Aang from Avatar: The Last Airbender</h3><div class="codeblock"><pre><code>Character: Aang
Name replacement: he
Setting: The Northern Air Temple
Style: TV Animation</code></pre></div><div class="blockquote"><blockquote class="blockquote__quote"></blockquote></div><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/51558394-b8ca-42ba-93b4-2b07396f603d/aang1.png?t=1712942215"/></div><h3 class="heading" style="text-align:left;" id="elsa-from-frozen">Elsa from Frozen</h3><div class="codeblock"><pre><code>Character: Princess Elsa
Name replacement: she
Style: 3D Animation, hd</code></pre></div><div class="blockquote"><blockquote class="blockquote__quote"></blockquote></div><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/7046308f-6ee4-4f14-a11d-eb886a42fe96/elsa1.png?t=1712942943"/></div><h2 class="heading" style="text-align:left;" id="some-assembly-required">Some Assembly Required</h2><p class="paragraph" style="text-align:left;">A little manual tweaking is sometimes necessary. I prompted my tool for Marvel X-Men’s &quot;Cyclops.”</p><div class="codeblock"><pre><code>Character: Marvel Cyclops
Name replacement: he
Style: comic book</code></pre></div><p class="paragraph" style="text-align:left;">The final prompt generated by my tool contained keywords (in bold) that triggered Open AI to rewrite the prompt to DALL-E (which happens for a variety of reasons, copyright being one):</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Generate an image illustrating the <b>X-Man Scott Summers</b>, otherwise known as <b>Cyclops</b>, in the middle of the lively <b>X-Mansion</b>. He should hold the viewer&#39;s gaze with his figure, athletic and etched, dressed in a blue jumpsuit accented with vibrant yellow markings. His brown hair is neatly groomed, highlighting his striking persona. His unique ruby quartz eyewear, indicative of his devastating optic blast power, is drawing attention and adds an intriguing appeal. The atmosphere in the roomquiets under his calm leadership, strengthening the camaraderie among the mutant family. Use a comic book style, characterized by bold, saturated colors, exaggerated characters&#39; proportions, and action-centric compositions. Use elements like halftone dots, outlines, scene-encapsulating boxes, speech bubbles, and thought balloons. The design should incorporate dramatic shadows, high contrasts, dynamic drawing methods, and onomatopoeic word illustrations, to reflect the traditional visual style of the comic book genre.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Open AI rewrote the prompt as follows:</p><div class="blockquote"><blockquote class="blockquote__quote"></blockquote></div><p class="paragraph" style="text-align:left;">Which produced:</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/8b34b60e-061d-4795-a7ed-c66c96ca84bb/cyclops1.png?t=1712943036"/></div><p class="paragraph" style="text-align:left;">So, I went back to my original prompt and made two changes. 1) Removed the first sentence and a half, where the proper nouns were. 2) Replaced “ruby quartz eyewear” with “ruby visor.” Here’s that rewritten prompt in full and the resulting image:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Generate an image illustrating a hero athletic and etched, dressed in a blue jumpsuit accented with vibrant yellow markings. His brown hair is neatly groomed, highlighting his striking persona. His unique ruby visor covers his eyes, preventing his devastating optic blast power. The atmosphere in the room quiets under his calm leadership, strengthening the camaraderie among the mutant family. Use a comic book style, characterized by bold, saturated colors, exaggerated characters&#39; proportions, and action-centric compositions. Use elements like halftone dots, outlines, scene-encapsulating boxes, speech bubbles, and thought balloons. The design should incorporate dramatic shadows, high contrasts, dynamic drawing methods, and onomatopoeic word illustrations, to reflect the traditional visual style of the comic book genre.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/e307c3f4-ba76-49db-80dc-3370dfef5a5b/cyclops2.png?t=1712943521"/></div><h2 class="heading" style="text-align:left;" id="a-catch-22">A Catch-22</h2><p class="paragraph" style="text-align:left;">OpenAI&#39;s copyright filters seem to operate on a high level — looking for specific words like “Nintendo,” “Mickey Mouse,” and so on. But, clearly, phrases that effectively <i>mean </i>Nintendo’s Mario to DALL-E breeze past the filter. Moreover, I found that GPT can produce such descriptions ad nauseam. </p><p class="paragraph" style="text-align:left;">Perhaps because of their shared architecture and training data, DALL-E-3 usually knows what GPT-4 is talking about. With a metaphorical wink and a nod, DALL-E says, “Oh, an Italian plumber in a video game<i>?</i> Say no more…<i>” </i>And out pops Mario.</p><p class="paragraph" style="text-align:left;">Blocking questions to GPT-4 about well-known characters would substantially reduce the value of GPT-4. But, if GPT-4 and DALL-E-3 have a shared understanding of these characters, my technique can exploit that to produce prompts that reliably recreate the character.</p><p class="paragraph" style="text-align:left;">Here’s a prompt snippet that reliably depicts Mario when included in an image prompt to DALL-E.</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">a robust yet powerful Italian plumber with lively eyes, thick eyebrows, and a distinct mustache under a rounded, reddish-brown nose. He is donned in a bright red hat, decorated with a symbolic emblem, and fitted overalls fastened by big yellow buttons over a blue long-sleeve shirt.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">All 12 of these images were produced with small additions to that prompt.</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/6db88fc1-a59b-48b2-9de5-d9e46d741114/mario-montage.png?t=1712952011"/><div class="image__source"><span class="image__source_text"><p>One blue shirt on the top row…</p></span></div></div><p class="paragraph" style="text-align:left;">DALL-E also seems to understand this character’s context from the physical description. For example, I added “with his brother,” “with his princess,” and “after eating a flower” to get the images in the 7th, 9th, and 12th positions, respectively. </p><p class="paragraph" style="text-align:left;">Here are a few more phrases from GPT-4 that reliably produce the characters I prompted for.</p><p class="paragraph" style="text-align:left;"><b>Aang, from Avatar: The Last Airbender</b>:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Imagine a youthful airbender, teetering on the edge of adulthood. Picture him with with caramel skin, its smoothness only interrupted by a smattering of freckles on his button-like nose that crinkles with each burst of laughter. Illustrate his silky, shaven head shining in the light. Highlight his vibrant, cerulean eyes, alive with an innocent curiosity and playful mischief. Develop a subtle grin on his thin lips that signals contentment and an appreciation for simple happiness.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><div class="image"><img alt="" class="image__image" style="border-radius:0px 0px 0px 0px;border-style:solid;border-width:0px 0px 0px 0px;box-sizing:border-box;border-color:#E5E7EB;" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/05ff89f8-6b0e-4843-bf86-4397b84132df/aang-montage.png?t=1712953731"/></div><p class="paragraph" style="text-align:left;"><b>Disney’s Donald Duck: </b></p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Imagine a cartoon duck, designed using the meticulous and detailed stylings of traditional cel animation. The duck has white feathers, with bold features that are larger than life and filled with color. His attire consists of a rumpled blue shirt, and a matching blue sailor&#39;s cap.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><div class="image"><img alt="" class="image__image" style="border-radius:0px 0px 0px 0px;border-style:solid;border-width:0px 0px 0px 0px;box-sizing:border-box;border-color:#E5E7EB;" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/89668d9c-bd7b-4034-98cb-d20907d5d614/donald-montage.png?t=1712955982"/><div class="image__source"><span class="image__source_text"><p>Donald is often depicted with these color swatches and other scene markers. I wonder what data was scrapped from a Behind the Scenes episode or something to associate Donald with this “in-production” content. </p></span></div></div><p class="paragraph" style="text-align:left;"><b>Nintendo’s Luigi</b>:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">a video game character who stands tall with a lean physique. He is distinguishable for his green attire, cap with an emblematic logo, and deep blue overalls. His facial features include a thick, black moustache, rounded nose, and bright, expressive blue eyes. His brown boots complete his quirky plumber look.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><div class="image"><img alt="" class="image__image" style="border-radius:0px 0px 0px 0px;border-style:solid;border-width:0px 0px 0px 0px;box-sizing:border-box;border-color:#E5E7EB;" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/be7a6e75-577d-4e87-ba42-a6327ec21ea9/luigi-montage.png?t=1713033648"/><div class="image__source"><span class="image__source_text"><p>I love that Luigi gets an M on his hat pretty often… poor little brother.</p></span></div></div><h2 class="heading" style="text-align:left;" id="some-limitations">Some Limitations</h2><p class="paragraph" style="text-align:left;">If you remove enough details, DALL-E will produce more generic images. For example, I removed the hat’s details from our Mario prompt and got this:</p><div class="blockquote"><blockquote class="blockquote__quote"></blockquote></div><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/bf43dab2-b502-4ef1-abba-f2573cbaa2e2/not-mario.png?t=1712952521"/><div class="image__source"><span class="image__source_text"><p>His shirt still isn’t blue…</p></span></div></div><p class="paragraph" style="text-align:left;">His “natural setting” is (apparently) still a fantastical world of pipes, not somewhere a real plumber would be, like a kitchen, basement, bathroom, or utility truck. But it’s also amazing how much impact the details of the hat had on everything else about the character. </p><p class="paragraph" style="text-align:left;">Then again, I may have just gotten “lucky.” I pasted the exact same prompt three more times and just got three near-Marios in the iconic world of green pipes and stylized bricks:</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/5253bf89-b2e3-4550-9383-7e86c7f6e6e5/image.png?t=1713219081"/></div><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/ed1bf905-1334-4393-b750-40f6fefcd75f/image.png?t=1713219107"/></div><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/154d436c-8cb7-4c96-a1ce-e8494caffd13/image.png?t=1713219138"/><div class="image__source"><span class="image__source_text"><p>\</p></span></div></div><h2 class="heading" style="text-align:left;" id="some-characters-are-harder">Some Characters Are Harder</h2><p class="paragraph" style="text-align:left;">I had a lot of trouble generating a perfect Donkey Kong, possibly because tie-wearing cartoon gorillas are surprisingly common.</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/7175ae51-0d31-4386-8356-75f52126d519/dk-motage.png?t=1713034705"/></div><p class="paragraph" style="text-align:left;">Most of my Kirby attempts came out recognizable… but unsettling. </p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/0d73759b-4924-4339-a3e1-7a945428e121/kirby-montage.png?t=1713034871"/><div class="image__source"><span class="image__source_text"><p>Note the umbilical cord (I think) on the 2nd image on the bottom row.</p></span></div></div><p class="paragraph" style="text-align:left;">Sometimes, a particular word or phrase would capture the model’s attention and take the focus off the character. For example, these three attempts at making Princess Peach used very similar prompts that all had the word “accessory” in them:</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/0f9a9708-59ca-432c-9160-52c82c459ffe/accessory-peach-montage.png?t=1713035146"/></div><p class="paragraph" style="text-align:left;">Removing the word accessory moved the output into Peach’s natural setting instead of a toy doll’s setting:</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/5f473df5-a193-4dc5-89b4-87f4eb1ebe96/chibi-peach-montage.png?t=1713035278"/></div><h2 class="heading" style="text-align:left;" id="transformative-or-not-gen-ai-contri">Transformative or Not: GenAI Contributes Materially to Infringement</h2><p class="paragraph" style="text-align:left;">AI firms might win the narrow argument that training models with copy-protected data is — all by itself — fair use. But copyright law doesn’t stop there. The courts <i>will</i> evaluate the ultimate uses of those generative models in determining outcomes: File-sharing isn’t illegal <i>per se</i>, but the courts still killed Napster.</p><p class="paragraph" style="text-align:left;">The same goes for generative models. If your model makes it trivially easy to generate infringing content, you might want to pull an Open AI and hire <a class="link" href="https://www.washingtonpost.com/technology/2024/04/09/openai-lawsuit-regulation-lawyers/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">more than 24 in-house lawyers</a> in one year. You’re going to need them.</p><p class="paragraph" style="text-align:left;">PS.</p><p class="paragraph" style="text-align:left;"><i>I don’t want to be sued, so I’m not publishing our character generator tool. But if you’re a copyright holder who wants to see if GenAI is cloning your protected works, get in touch, and I’ll see if I can help.</i></p><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Remember…</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The Lab Report is free and doesn’t even advertise. Our curricula is open source and published under a public domain license for <i>anyone </i>to use for <i>any purpose. </i>We’re also a very small team with no investors.</p><p class="paragraph" style="text-align:left;">Help us keep providing these free services by scheduling one of our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">world class trainings</a>, requesting a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">custom class</a> for your team, or taking one of our <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=automated-copyright-infringement" target="_blank" rel="noopener noreferrer nofollow">open enrollment classes</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></div></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=001d1b91-8d2f-4fe6-8cc4-f76ee9fc8f82&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>GPT Doesn&#39;t Understand Randomness</title>
  <description>...and it struggles with quantities.</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/e0500f35-6d40-468e-a28d-c0efd9e25abb/image.png" length="39171" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/gpt-doesnt-understand-randomness</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/gpt-doesnt-understand-randomness</guid>
  <pubDate>Fri, 12 Apr 2024 11:00:00 +0000</pubDate>
  <atom:published>2024-04-12T11:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><h1 class="heading" style="text-align:left;" id="the-workbench">The Workbench</h1><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I’m Tyler Elliot Bettilyon (Teb) and this is The Workbench: Our practical, hands-on edition. Our goal is to demonstrate concepts and ideas we cover in The Lab Report — our monthly deep-dive.<br><br>If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. <br><br>If you like what you’re reading you’ll love one of our classes. Signup for an <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">upcoming class</a>, browse our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">course catalog</a> for corporate trainings, or request a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">custom class consultation</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><h2 class="heading" style="text-align:left;" id="two-quick-updates-from-the-lab">Two Quick Updates From The Lab</h2><p class="paragraph" style="text-align:left;">1) We started using a bunch of social media. If you use social media, consider following us on <a class="link" href="https://twitter.com/TebbaVonMaths?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">X/Twitter</a>, <a class="link" href="https://www.facebook.com/profile.php?id=61558173776811&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">Facebook</a>, <a class="link" href="https://www.linkedin.com/in/tylerbettilyon/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">LinkedIn</a>, <a class="link" href="https://mastodon.social/@TebbaVonMathenstein?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">Mastadon</a>, <a class="link" href="https://www.youtube.com/@tebslab5351?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">YouTube</a>, and <a class="link" href="https://www.threads.net/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">Threads</a>.</p><p class="paragraph" style="text-align:left;">2) We are now an official training vendor with Salesforce. If you’re a Salesforce employee, you can be reimbursed for taking any of our <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">open-enrollment classes</a>!</p><h2 class="heading" style="text-align:left;" id="background-tokenization">Background: Tokenization</h2><p class="paragraph" style="text-align:left;">This post all started when a friend shared this fascinating research paper about how <a class="link" href="https://arxiv.org/html/2402.14903v1?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">tokenization schemes dramatically affect how well LLMs perform arithmetic</a>.</p><p class="paragraph" style="text-align:left;">Tokenization is the process of turning words, parts of words, numbers, punctuation marks, and everything else fed to LLMs into unique <i>“tokens.” </i>These tokens, rather than words, represent the smallest unit of information processed by an LLM. </p><p class="paragraph" style="text-align:left;">There are a variety of strategies for tokenization.</p><p class="paragraph" style="text-align:left;">Early NLP often used whole-word tokenization, where each word gets a unique token. <a class="link" href="https://huggingface.co/learn/nlp-course/en/chapter6/6?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">Word piece tokenization</a> remains somewhat popular. This scheme gives common prefixes and suffixes — things like ”re-” “-ed” and “-ing” — unique tokens. This can help systems understand grammar and reduces token count compared to whole-word strategies. </p><p class="paragraph" style="text-align:left;">These tokenization schemes have to take specific features of the target language into account. They only work well in languages that have clear word boundaries in their written form. They also struggle with unknown “out of vocabulary” words like slang, names, misspellings, and oddities like URLs.</p><p class="paragraph" style="text-align:left;"><a class="link" href="https://huggingface.co/learn/nlp-course/en/chapter6/5?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow"> Byte-pair encoding</a> is one of the most popular techniques because it’s totally agnostic to the language or type of text being processed. It also handles “out of vocabulary” situations mostly gracefully. But it sometimes produces tokens that are linguistically irrelevant or misleading by breaking words into parts based on commonality rather than explicit grammatical features of the language.</p><p class="paragraph" style="text-align:left;">There are also specialized <a class="link" href="https://arxiv.org/abs/2310.02989?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">tactics for numbers</a>, although they are not yet widely adopted. </p><p class="paragraph" style="text-align:left;">Tokenization is <a class="link" href="https://twitter.com/karpathy/status/1657949234535211009?s=20&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">widely understood to be a possible source of errors</a>, but no one has invented a great replacement for it, either. Here’s an example of GPT-4’s tokenizer, which I hope demonstrates some possible issues with tokenization (<a class="link" href="https://platform.openai.com/tokenizer?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">OpenAI hosts this widget online</a>):</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/b565ee35-cb73-483d-85b6-f52969f01f38/Screenshot_2024-04-08_at_8.24.40_PM.png?t=1712629836"/><div class="image__source"><span class="image__source_text"><p>An example of GPT-4’s tokenizer run on the sentence “She dragged 37 men, 1005, and a Dragonite down to the riverbank. They were all drugged.”</p></span></div></div><p class="paragraph" style="text-align:left;">Notice that 37 gets its own whole token, but 1005 is broken into 100 and 5. Something similar happens to Dragon-ite, river-bank, and dr-ugged. </p><p class="paragraph" style="text-align:left;">Breaking apart a word or number means the system has to learn that those tokens combine to have a unique meaning: a “riverbank” isn’t a financial institution for waterways, for example. LLMs <i>can </i>learn these kinds of combos, but it is also a source of error and added complexity for the machine. </p><p class="paragraph" style="text-align:left;">For example, <a class="link" href="https://chat.openai.com/share/03f5995c-4f40-43da-8fc9-dec737ff8a37?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">I asked ChatGPT (GPT-4) about rivers, banks, riverbanks, dragons, the suffix -ite, and dragonite</a>. It did very well. However, it did hege a little with dragonite. After appropriately identifying it as a <a class="link" href="https://www.pokemon.com/us/pokedex/dragonite?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">Pokemon</a> it dropped this nugget:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;"><b>In Mineralogy:</b> The term &quot;dragonite&quot; does not refer to a recognized mineral in scientific contexts. Names of minerals typically end in &quot;-ite,&quot; but &quot;dragonite&quot; is not a term used in mineralogy. It&#39;s possible to encounter &quot;dragonite&quot; in fictional or commercial contexts unrelated to the systematic naming conventions of minerals.</p><figcaption class="blockquote__byline"> Empahsis original to ChatGPT (<a class="link" href="https://chat.openai.com/share/03f5995c-4f40-43da-8fc9-dec737ff8a37?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">source</a>) </figcaption></blockquote></div><p class="paragraph" style="text-align:left;">And, according to <a class="link" href="https://arxiv.org/html/2402.14903v1?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">the paper that kicked this all off</a>, the tokenization scheme makes a huge difference in how the LLM performs arithmetic. </p><p class="paragraph" style="text-align:left;">GPT’s tokenizer only has unique tokens for 3-digit-or-fewer numbers. When tokenizing larger numbers, “left to right” and “right to left” describe how to break the number into smaller tokens. For the number 1439 left to right yields 143 and 9; right to left yields 1 and 439.</p><p class="paragraph" style="text-align:left;">The paper found that when GPT-4 adds numbers of differing lengths the accuracy depends immensely on the tokenization scheme. With a right to left (R2L) tokenizer, the accuracy was in the high 90% range. With a left to right (L2R) tokenizer, the accuracy on the same problems was 8.25%</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/a9598da4-60b4-48e8-8d09-40543bb73ad1/image.png?t=1712630834"/><div class="image__source"><span class="image__source_text"><p>This chart compares GPT-4’s accuracy on simple addition problems with the same number of digits vs different numbers of digits for “left to right” (L2R) and “right to left” (R2L) tokenization schemes. Source: The paper we’re writing about, <a class="link" href="https://arxiv.org/html/2402.14903v1?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">https://arxiv.org/html/2402.14903v1</a> </p></span></div></div><p class="paragraph" style="text-align:left;">What’s more, they found in this condition, the error is <i>almost always in the 4th digit</i>. Very weird, totally awesome finding.</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/3aaf1913-ba07-42ad-bdfb-5256eda23c46/image.png?t=1712630961"/><div class="image__source"><span class="image__source_text"><p>Chart showing that nearly all the errors come in the 4th digit. Source: that paper again <a class="link" href="https://arxiv.org/html/2402.14903v1?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">https://arxiv.org/html/2402.14903v1</a></p></span></div></div><p class="paragraph" style="text-align:left;">This all inspired me to do some impromptu research of my own regarding how GPT handles numbers. I asked ChatGPT to “generate 1000 random numbers.” Here’s what it gave me (the ellipsis is from ChatGPT, not me):</p><div class="codeblock"><pre><code>0 7 10 23 31 44 56 67 72 85 98 105 114 122 135 142 155 167 175 183 196 203 219 222 234 247 253 262 274 289 297 305 310 328 334 341 359 366 373 389 393 407 416 424 430 445 454 461 476 482 497 503 519 522 534 541 558 566 571 589 594 608 615 627 634 647 654 663 678 681 695 701 715 722 739 748 753 769 775 787 799 804 819 820 835 841 854 868 872 888 893 902 919 927 935 949 954 960 973 980 997 999 1006 1019 1027 1044 1056 1068 1081 1089 1103 1112 1120 1137 1148 1157 1173 1176 1189 1204 1208 1219 1231 1242 1255 1257 1272 1284 1296 1310 1319 1331 1337 1344 1358 1367 1374 1378 1395 1403 1418 1423 1434 144...</code></pre></div><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/772ec72b-e663-4172-b0c8-8848bb975fb2/image.png?t=1712631292"/><div class="image__source"><span class="image__source_text"><p>Here are the numbers ChatGPT generated in the order they were generated.</p></span></div></div><p class="paragraph" style="text-align:left;">Instead of 1000 random numbers, ChatGPT gave me 143 integers between 1 and 1434, in sorted order, followed by an ellipsis as if to say, “I know you asked me to generate 1000 numbers, but I’m not going to.”</p><p class="paragraph" style="text-align:left;">So — obviously — I had to dig deeper. </p><h2 class="heading" style="text-align:left;" id="gpt-doesnt-understand-randomness-an">GPT Doesn’t Understand Randomness … and Struggles With Quantities.</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The original datasets, code for producing more datasets, and the scripts I used to analyze the data can <a class="link" href="https://github.com/Tebs-Lab/GPT-random?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">all be found here</a>.</p><p class="paragraph" style="text-align:left;">FYI, code snippets render <a class="link" href="http://blog.tebs-lab.com/p/28dd7d9d-5832-4aec-bcec-f0f93b6e2df4?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">better on our blog</a> than in email.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">I used OpenAI’s API to generate 60 datasets. Specifically, I asked GPT-3.5-turbo to:</p><div class="codeblock"><pre><code>&quot;Generate n random numbers.&quot;</code></pre></div><p class="paragraph" style="text-align:left;">20 times each for n=10, n=100, and n=1000. I also used a system prompt to ask the model to give its answer using only numbers separated by a space. Here’s the function that I called 60 times:</p><div class="custom_html"><div class="highlight" style="background: #2c2c34; border-radius: 5px; padding: 10px; overflow-x: scroll; max-width: 80ch"><pre style="line-height: 125%;"><span></span><code><span style="color: #d2a8ff">def</span> <span style="color: #94b6ce; font-weight: bold">produce_prompt_kwargs</span><span style="color: #eaeaeb">(num_of_nums):</span>
<span style="color: #eaeaeb">    </span><span style="color: #56d364">"""</span>
<span style="color: #56d364">    Produce a dict that fits the OpenAI API for chat completion.</span>
<span style="color: #56d364">    The content of the user prompt changes with num_of_nums, and the seed</span>
<span style="color: #56d364">    is randomly generated. Everything else is deterministic.</span>
<span style="color: #56d364">    """</span>
    <span style="color: #eaeaeb">user_prompt_content</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #36b344">f</span><span style="color: #56d364">'Generate {</span><span style="color: #eaeaeb">num_of_nums</span><span style="color: #56d364">} random numbers.'</span>
    <span style="color: #eaeaeb">seed</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">random</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">getrandbits(</span><span style="color: #94b6ce">32</span><span style="color: #eaeaeb">)</span> <span style="color: #7ee787; font-style: italic"># Produce a random 32-bit integer</span>

    <span style="color: #d2a8ff">return</span> <span style="color: #eaeaeb">{</span>
        <span style="color: #56d364">'model'</span><span style="color: #eaeaeb">:</span> <span style="color: #56d364">"gpt-3.5-turbo"</span><span style="color: #eaeaeb">,</span>
        <span style="color: #56d364">'messages'</span><span style="color: #eaeaeb">:</span> <span style="color: #eaeaeb">[</span>
        <span style="color: #eaeaeb">{</span>
            <span style="color: #56d364">"role"</span><span style="color: #eaeaeb">:</span> <span style="color: #56d364">"system"</span><span style="color: #eaeaeb">,</span> <span style="color: #7ee787; font-style: italic"># Without this GPT added friendly but awkward to parse text to the output.</span>
            <span style="color: #56d364">"content"</span><span style="color: #eaeaeb">:</span> <span style="color: #56d364">"You are a number generator. In all of your responses only use numbers, with each number separated using a space."</span>
        <span style="color: #eaeaeb">},</span>
        <span style="color: #eaeaeb">{</span>
            <span style="color: #56d364">"role"</span><span style="color: #eaeaeb">:</span> <span style="color: #56d364">"user"</span><span style="color: #eaeaeb">,</span>
            <span style="color: #56d364">"content"</span><span style="color: #eaeaeb">:</span> <span style="color: #eaeaeb">user_prompt_content</span>
        <span style="color: #eaeaeb">}],</span>
        <span style="color: #56d364">'temperature'</span><span style="color: #eaeaeb">:</span> <span style="color: #94b6ce">1</span><span style="color: #eaeaeb">,</span>
        <span style="color: #56d364">'max_tokens'</span><span style="color: #eaeaeb">:</span> <span style="color: #94b6ce">4096</span><span style="color: #eaeaeb">,</span> <span style="color: #7ee787; font-style: italic"># Artificially large, GPT should generate many fewer if it follows the prompt.</span>
        <span style="color: #56d364">'top_p'</span><span style="color: #eaeaeb">:</span> <span style="color: #94b6ce">1</span><span style="color: #eaeaeb">,</span>
        <span style="color: #56d364">'frequency_penalty'</span><span style="color: #eaeaeb">:</span> <span style="color: #94b6ce">0</span><span style="color: #eaeaeb">,</span>
        <span style="color: #56d364">'presence_penalty'</span><span style="color: #eaeaeb">:</span> <span style="color: #94b6ce">0</span><span style="color: #eaeaeb">,</span>
        <span style="color: #56d364">'seed'</span><span style="color: #eaeaeb">:</span> <span style="color: #eaeaeb">seed</span>
    <span style="color: #eaeaeb">}</span>
</code></pre></div></div><p class="paragraph" style="text-align:left;">Then, I fed those dictionaries as arguments to the OpenAI API and saved the result:</p><div class="custom_html"><div class="highlight" style="background: #2c2c34; border-radius: 5px; padding: 10px; overflow-x: scroll; max-width: 80ch"><pre style="line-height: 125%;"><span></span><code><span style="color: #d2a8ff">def</span> <span style="color: #94b6ce; font-weight: bold">request_then_serialize</span><span style="color: #eaeaeb">(client,</span> <span style="color: #eaeaeb">api_kw_args,</span> <span style="color: #eaeaeb">folder_path,</span> <span style="color: #eaeaeb">filename):</span>
    <span style="color: #eaeaeb">response</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">client</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">chat</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">completions</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">create(</span><span style="color: #52a5c3; font-weight: bold">**</span><span style="color: #eaeaeb">api_kw_args)</span>

    <span style="color: #eaeaeb">json_to_save</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">{</span>
        <span style="color: #56d364">'request_arguments'</span><span style="color: #eaeaeb">:</span> <span style="color: #eaeaeb">api_kw_args,</span>
        <span style="color: #56d364">'system_fingerprint'</span><span style="color: #eaeaeb">:</span> <span style="color: #eaeaeb">response</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">system_fingerprint,</span>
        <span style="color: #56d364">'id'</span><span style="color: #eaeaeb">:</span> <span style="color: #eaeaeb">response</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">id,</span>
        <span style="color: #56d364">'model_version'</span><span style="color: #eaeaeb">:</span> <span style="color: #eaeaeb">response</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">model,</span>
        <span style="color: #56d364">'text_response'</span><span style="color: #eaeaeb">:</span> <span style="color: #eaeaeb">response</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">choices[</span><span style="color: #94b6ce">0</span><span style="color: #eaeaeb">]</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">message</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">content,</span>
        <span style="color: #56d364">'finish_reason'</span><span style="color: #eaeaeb">:</span> <span style="color: #eaeaeb">response</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">choices[</span><span style="color: #94b6ce">0</span><span style="color: #eaeaeb">]</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">finish_reason</span>

    <span style="color: #eaeaeb">}</span>

    <span style="color: #d2a8ff">with</span> <span style="color: #6a98c8">open</span><span style="color: #eaeaeb">(folder_path</span><span style="color: #52a5c3; font-weight: bold">/</span><span style="color: #eaeaeb">filename,</span> <span style="color: #56d364">'w'</span><span style="color: #eaeaeb">)</span> <span style="color: #d2a8ff">as</span> <span style="color: #eaeaeb">f:</span>
        <span style="color: #eaeaeb">json</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">dump(json_to_save,</span> <span style="color: #eaeaeb">f)</span>
</code></pre></div></div><p class="paragraph" style="text-align:left;">Then, I started exploring. Here are some of my favorite results:</p><p class="paragraph" style="text-align:left;"><b>GPT only ever returned positive integers or 0. </b>My prompt didn’t include any details about bounds, or what kinds of numbers I might want, so I thought it was interesting that I got integers with perfect consistency. </p><p class="paragraph" style="text-align:left;"><b>The bounds for those integers seem influenced by the number of numbers. </b>When asking for 10 or 100 numbers GPT returned numbers strictly between 0 - 100 inclusive. </p><p class="paragraph" style="text-align:left;">When asking for 1000 numbers the lower bounds were usually a single digit number, but got as large as 87. The upper bounds were less predictable. The largest number GPT produced was 984,752,398. Another dataset topped off at 8,294. There was <i>some</i> sense of consistency, though: The most common upper bounds were 999 and 99 at four occurrences each. 100 occurred twice, as did 998.</p><p class="paragraph" style="text-align:left;"><b>GPT was bad at producing the correct quantity of numbers: </b>It never correctly generated 1000 numbers, and it only generated exactly 100 numbers 3 out of 20 times. It correctly generated 10 numbers in 20 of 20 attempts.</p><p class="paragraph" style="text-align:left;">When asked for 100 numbers, it wasn’t <i>super </i>far off — producing anywhere from 85 to 103 numbers in 20 tests. When asked for 1000 numbers, though, GPT was all over the place. Sometimes it returned as few as 100 numbers. Other times the API cut off GPT’s output for reaching the maximum token length I specified; that happened seven times.</p><p class="paragraph" style="text-align:left;">In each of those seven cases, GPT returned 2048 numbers at a limit of 4096 tokens, which means it only ever returned numbers that fit in a single token (plus a space token between each number, plus the stop token).</p><p class="paragraph" style="text-align:left;"><b>GPT was good, but not completely reliable, in terms of output format: </b>Once, despite my system prompt, it added some friendly text to the output. It was during a prompt for 1000 numbers, and GPT replied: </p><div class="codeblock"><pre><code>I can provide 100 random numbers: 
[the numbers] 
Let me know if you would like more numbers!</code></pre></div><p class="paragraph" style="text-align:left;">For the record: I asked for 1000, it told me it could give me 100, but it actually gave me 116.</p><p class="paragraph" style="text-align:left;">Five times, it added something to the final number, preventing it from parsing properly without extra work. Twice, it added an ellipsis (e.g., “4…”). Twice, it added a single period (e.g., “6.”). And once it ended its output with “93stringstream,” which gave me a nice laugh.</p><p class="paragraph" style="text-align:left;"><b>GPT tended towards extremely uniform distributions — much more uniform than randomly drawing from a uniform distribution. </b>In fact, When asked for 10 numbers, GPT <i>never</i> repeated a number in any of the samples.</p><p class="paragraph" style="text-align:left;">Here are some histograms demonstrating the uniformity. GPT’s output is in blue and a distribution using the same number of numbers and same range, but produced by <span style="font-family:Courier,'Lucida Typewriter',monospace;">numpy.random.randint </span>in orange.</p><div class="image"><img alt="" class="image__image" style="" src="https://lh7-us.googleusercontent.com/2KzzUPpRDGlaDMMq9eewgzBxTjnE5R4j4kD0XzpORQjCr2pVStKG4s9LUndZoIPY9KtlyQaz3wPvRxYO6uqdk5Hub7G9l_sIofJhkrRMo3yka1ql78IQoJALEh6v_xJ1AMYyOVZ2ACQajbyDv6u4BR8"/></div><div class="image"><img alt="" class="image__image" style="" src="https://lh7-us.googleusercontent.com/QA9rLGwpzIExPSpQqXVXtpDk197wiUXB_T02-BduDj2UQidODjjCdN1MtN-uDUZKL1cThdlgF9MtLxn-jVAw-1GD7vPUpcGDvWg-XnsCUdUfxVnY_mDMVcQmXIuXMjdZmGSoQeZ-J9pIowAkML-l0wA"/></div><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/e0500f35-6d40-468e-a28d-c0efd9e25abb/image.png?t=1712696385"/></div><p class="paragraph" style="text-align:left;">This happened over and over again. Even with the <span style="font-family:Courier,'Lucida Typewriter',monospace;">frequency_penalty</span> set to 0, GPT was far less likely to select any given number multiple times than a truly random distribution. </p><p class="paragraph" style="text-align:left;"><b>In the 1000 number group, some of those perfectly uniform distributions were sorted or semi-sorted — but sorting never occurred in the 10 or 100 number groups.</b> In 20 samples I saw this three times, here’s the data in the order it was generated with GPT once again in blue and <span style="font-family:Courier, Lucida Typewriter, monospace;">numpy.random.randint</span> in orange.</p><div class="image"><img alt="" class="image__image" style="" src="https://lh7-us.googleusercontent.com/igioq1NDfsVrZixVu1lI2vKcZKJRSbfiXtTHaLBndk_QGYe-PxMgR2mosMrrOgFsAfrrgV8NTXUYrxExuE44yahfqbspzALS0XYrvbdkYOqlQZgKC3xScJeRC1-DjbbvFnSzdH843VHMpyt9aFdYKuA"/></div><div class="image"><img alt="" class="image__image" style="" src="https://lh7-us.googleusercontent.com/YD3AGqeUGUTVj5N_A9VQv96reoaXoTZ0BFhOUWn04wSayFCASytS6ned6zBiUBeBLpG0f-JPMuCZ2ijiSxxNCijpM9Ua7OR_9oDCin2PiNssXxuXzoErWpuGtFLGEJPUlyfL2xXfM9aA1iiwslFa1i4"/></div><div class="image"><img alt="" class="image__image" style="" src="https://lh7-us.googleusercontent.com/uZJpQlkEOdVD3CqN2M0kaNNrI66p2L38Jyi6aP-b8qne6A5PgB28xz4M9USrvBN3gicAUhGspUbfwjtmjpkWVkrsQJvHoKFGenvftaw-xSOzQVYV7I-uZ6tJO0AaOIDdO8bU_n9x3mKJNR0sVz-1Myc"/></div><p class="paragraph" style="text-align:left;"><b>Once, GPT repeated the same short sequence of numbers about 300 times until it was cut off by the API for running out of tokens. </b>The sequence was 12, 34, 56, 78, 90, 23, 45, 67, 89. It took me a moment to see the not-so-clever pattern:</p><p class="paragraph" style="text-align:left;">The first five numbers are the digits 1234567890 split every two digits. The remaining four numbers are the digits 23456789 split the same way.</p><p class="paragraph" style="text-align:left;">If you do a Google search for these numbers, you’ll see that a <i>lot </i>of practice homework problems and other simple examples use this really-easy-to-type pattern, or small variations on it, for sample numbers. </p><p class="paragraph" style="text-align:left;">So, most likely, GPT learned this lazy behavior from data scraped from the web.</p><p class="paragraph" style="text-align:left;"><b>Sometimes, GPT produced something that actually looked pretty random! </b>Like this one (the histogram is on top, and the numbers in the order they were produced on the bottom). Eyeballing it I estimated 7 / 60 attempts had a random-looking distribution like this: </p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/033893f7-09c1-4ba8-94c3-5088c370a9a7/image.png?t=1712699413"/><div class="image__source"><span class="image__source_text"><p>Histogram</p></span></div></div><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/506fb932-9939-463f-b566-c863fa4ea7e0/image.png?t=1712699422"/><div class="image__source"><span class="image__source_text"><p>Numbers in order</p></span></div></div><p class="paragraph" style="text-align:left;">I’m sure there is more to learn from this data, and even more could be discovered with more experimentation and dataset generation. For example, there were several perfectly uniform distributions where I couldn’t readily discern other patterns or a systemic mechanism for producing the numbers. </p><p class="paragraph" style="text-align:left;">That brings us to our usual call for you to…</p><h2 class="heading" style="text-align:left;" id="challenge-yourself">Challenge Yourself!</h2><p class="paragraph" style="text-align:left;"><a class="link" href="https://github.com/Tebs-Lab/GPT-random?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">The code we used</a> to generate datasets is fairly robust and easily modified. And although the code we used to analyze this data is a little crude, it’s good enough to reuse and repurpose. So here are three ways to challenge yourself on a weekend project:</p><ul><li><p class="paragraph" style="text-align:left;">Analyze our datasets and find something we missed.</p></li><li><p class="paragraph" style="text-align:left;">Generate some more datasets and try to find new patterns.</p></li><li><p class="paragraph" style="text-align:left;">Switch the model to GPT-4 or another OpenAI model and see if you can find evidence that these problems persist, or change from model to model in some way.</p></li><li><p class="paragraph" style="text-align:left;">Add some rigor: the most well-established tests for randomness are the diehard tests; running some of these tests on GPT output would bring more rigorous insight into just how random GPT is.</p></li></ul><p class="paragraph" style="text-align:left;"><b>If you do hack something up, </b>let us know. We might feature it in this newsletter!</p><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Remember…</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The Lab Report is free and doesn’t even advertise. Our curricula is open source and published under a public domain license for <i>anyone </i>to use for <i>any purpose. </i>We’re also a very small team with no investors.</p><p class="paragraph" style="text-align:left;">Help us keep providing these free services by scheduling one of our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">world class trainings</a>, requesting a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">custom class</a> for your team, or taking one of our <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=gpt-doesn-t-understand-randomness" target="_blank" rel="noopener noreferrer nofollow">open enrollment classes</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></div></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=779a376d-2277-4adf-b03e-c8c5d0abceb8&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>AI Assisted AI Prompt Engineering</title>
  <description>The fundamental theorem strikes again</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/d9e6f1b0-63be-4f46-a70f-d969b4f05903/elephants.png" length="2293497" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/ai-assisted-ai-prompt-engineering</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/ai-assisted-ai-prompt-engineering</guid>
  <pubDate>Fri, 05 Apr 2024 11:00:00 +0000</pubDate>
  <atom:published>2024-04-05T11:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><h1 class="heading" style="text-align:left;" id="the-workbench">The Workbench</h1><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I’m Tyler Elliot Bettilyon (Teb) and this is The Workbench: Our practical, hands-on edition. Our goal is to demonstrate concepts and ideas we cover in The Lab Report — our monthly deep-dive.<br><br>If you’re new to The Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. <br><br>If you like what you’re reading you’ll love one of our classes. Signup for an <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">upcoming class</a>, browse our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">course catalog</a> for corporate trainings, or request a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">custom class consultation</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><h2 class="heading" style="text-align:left;" id="introducing-the-workbench">Introducing The Workbench</h2><p class="paragraph" style="text-align:left;">Welcome to the latest addition to The Lab Report, which we’re calling “The Workbench.” In these editions, we will demonstrate emerging concepts in computing technology in a hands-on, code-driven style. </p><p class="paragraph" style="text-align:left;">We hope these hands-on guides complement the higher-level, big-picture coverage we typically provide in our monthly Lab Report. We also hope they help you expand your skills as a developer!</p><p class="paragraph" style="text-align:left;">We’ll be publishing Workbench editions periodically from now on, guided by our other work developing new courseware and researching topics for The Lab Report.</p><h2 class="heading" style="text-align:left;" id="the-latest-level-of-indirection">The Latest Level of Indirection</h2><p class="paragraph" style="text-align:left;">Years ago, <a class="link" href="https://amturing.acm.org/award_winners/lampson_1142421.cfm?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">Butler Lampson</a> attributed the following quote to <a class="link" href="https://history.computer.org/pioneers/wheeler.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">David J. Wheeler</a> in a lecture titled “<a class="link" href="https://bwlampson.site/Slides/TuringLecture.htm?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">Principles for Computer System Design.</a>”</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Any problem in computer science can be solved with another level of indirection. </p><figcaption class="blockquote__byline"></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">In this context, indirection refers to adding a layer of software that allows people to totally ignore another, uglier layer of software while still relying on that underlying layer. This is sometimes jokingly referred to as “<a class="link" href="https://en.wikipedia.org/wiki/Fundamental_theorem_of_software_engineering?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">The fundamental theorem of software engineering</a>” because of how often it turns out to be true.</p><p class="paragraph" style="text-align:left;">Writing code in binary was tedious, so <a class="link" href="https://en.wikipedia.org/wiki/Kathleen_Booth?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">Kathleen Booth</a> added a level of indirection by inventing the assembler. Writing assembly was also tedious, so <a class="link" href="https://en.wikipedia.org/wiki/Grace_Hopper?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">Grace Hopper</a> added another level of indirection by implementing the first compiler. And so on.</p><p class="paragraph" style="text-align:left;">Nowadays, a simple Python program is <a class="link" href="https://www.youtube.com/watch?v=KsZLPTRSleI&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">dozens of levels of indirection</a> away from the physical reality on the CPU — so distant that most programmers don’t even think about electrons whirring about through the chip&#39;s microscopic transistors and silicon channels. </p><p class="paragraph" style="text-align:left;">IEEE Spectrum wrote about the latest example of adding a level of indirection in a piece titled “<a class="link" href="https://spectrum.ieee.org/prompt-engineering-is-dead?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">AI Prompt Engineering Is Dead, Long Live AI Prompt Engineering</a>,” </p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Battle and Gollapudi decided to <a class="link" href="https://arxiv.org/pdf/2402.10949.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">systematically test</a> how different prompt-engineering strategies impact an LLM’s ability to solve grade-school math questions. They tested three different open-source language models with 60 different prompt combinations each. What they found was a surprising lack of consistency. Even chain-of-thought prompting sometimes helped and other times hurt performance. “The only real trend may be no trend,” they write. “What’s best for any given model, dataset, and prompting strategy is likely to be specific to the particular combination at hand.”</p><p class="paragraph" style="text-align:left;">There is an alternative to the trial-and-error-style prompt engineering that yielded such inconsistent results: Ask the language model to devise its own optimal prompt. Recently, <a class="link" href="https://arxiv.org/abs/2310.03714?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">new tools</a> have been <a class="link" href="https://arxiv.org/abs/2309.03409?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">developed</a> to automate this process. Given a few examples and a quantitative success metric, these tools will iteratively find the optimal phrase to feed into the LLM. Battle and his collaborators found that in almost every case, this automatically generated prompt did better than the best prompt found through trial-and-error. And, the process was much faster, a couple of hours rather than several days of searching.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">In other words: Screw prompt engineering, that’s tedious. Let’s make the LLM do that!</p><p class="paragraph" style="text-align:left;">So, in today’s workshop, I’m sharing a <a class="link" href="https://gist.github.com/tebba-von-mathenstein/33f66f3182cb8a794dc4c546c8302a44?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">simple Python script</a> that performs one example of automated prompt engineering: Using GPT-4 to generate better prompts for making images with DALL-E-3.</p><h2 class="heading" style="text-align:left;" id="ai-based-ai-prompt-engineering">AI Based AI Prompt Engineering</h2><p class="paragraph" style="text-align:left;">My script takes four inputs: a subject, a setting, an art style, and emotional content. It produces a short series of interim prompts that ultimately result in a final prompt being sent to DALL-E 3 for an image. Here’s an example.</p><p class="paragraph" style="text-align:left;">The script prompts the user for a subject, setting, style, and emotion to convey.</p><div class="codeblock"><pre><code>Subject: A small pitbull 
Setting: stalking a squirrel in the park
Style: Kawaii anime
Emotion: playful, joyful </code></pre></div><p class="paragraph" style="text-align:left;">Using a few interim prompts based on those four inputs, GPT-4 produces a prompt for DALL-E 3 to use:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Design an image in the Kawaii anime style that depicts an adorable, brindle-coated pitbull puppy playing in a vibrant park. The park should be lit by ambient sunlight, streaming through the leaves of towering oak trees and casting warm light over the playground equipment. The pitbull should be illustrated with exaggerated features: wide, sparkling ebony eyes, a stout frame packed with muscles, and a merry stump tail. He should be depicted in the act of gleefully chasing a bushy-tailed, agile squirrel, bounding over lush grass with joy and playful intensity. The chase should reach a climax with the squirrel narrowly escaping up a tree, leaving the puppy prancing in childish pride, his wide tongue lolling out in a panting grin. The entire scene should exude elements of cuteness, charm, whimsy, and pure joyfulness. Use pastel hues of greens, browns, blacks and white for this endearing, lively scene.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Which results in the following image:</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/0d330625-2bba-4eea-aeeb-dce7084bbb16/dog-squirrel.png?t=1711819087"/></div><h2 class="heading" style="text-align:left;" id="one-more-example">One More Example</h2><div class="codeblock"><pre><code>Subject: A herd of elephants
Setting: The African savanna
Style: Digital illustration, flat, dusty-feeling
Emotion: Resilience</code></pre></div><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Create a flat, digital illustration in a dusty-feeling color palette of a resilient herd of elephants traversing the expansive plains of the African savanna. The elephants are depicted as large, weathered stone statues moving rhythmically, their rough, creased hides are in hues of gunmetal grey and worn leather. Elongated, arching trunks should be prominent, symbolizing their lifeline and resilience, and the oversized ears fluttering in the breeze, depicting adaptability. The eyes of the elephants reflect the burnt umber of setting suns, deep cobalt of storms, and grey of morning mist, embodying a hint of optimism amidst uncertainties. The ivory tusks must bear the marks of numerous battles, symbolizing tenacity over adversity. Use simplified shapes, sharp lines and high contrast between colors, while the texture of the elephants&#39; skin and gradients would add complexity and interest in the otherwise flat aesthetic.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><div class="image"><img alt="" class="image__image" style="border-radius:0px 0px 0px 0px;border-style:solid;border-width:0px 0px 0px 0px;box-sizing:border-box;border-color:#E5E7EB;" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/d9e6f1b0-63be-4f46-a70f-d969b4f05903/elephants.png?t=1711822055"/><div class="image__source"><span class="image__source_text"><p>Looks great, except for that second trunk coming out of the elephant’s ear…</p></span></div></div><h2 class="heading" style="text-align:left;" id="how-it-works">How It Works</h2><p class="paragraph" style="text-align:left;">First, you need to set up an account and buy some credits on <a class="link" href="https://platform.openai.com?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">OpenAI’s platform</a>. You must also generate an API key and store it as an environment variable. OpenAI’s <a class="link" href="https://platform.openai.com/docs/quickstart?context=python&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">Quickstart Guide</a> covers these steps quite well. </p><p class="paragraph" style="text-align:left;">After you’ve done that, you can run the script. With the default settings, generating an image costs about 5 cents, but switching to the cheapest models brings the price under a cent (note: this significantly decreased image quality in my tests).</p><p class="paragraph" style="text-align:left;"><a class="link" href="https://gist.github.com/tebba-von-mathenstein/33f66f3182cb8a794dc4c546c8302a44?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">The complete script is available as a GitHub Gist</a>.</p><p class="paragraph" style="text-align:left;"><a class="link" href="https://github.com/Tebs-Lab/gpt-imagegen?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">We’ve also released a more full-featured CLI tool on GitHub</a>.</p><p class="paragraph" style="text-align:left;">First, my code uses Python’s simplistic input function to grab some text from the user:</p><div class="custom_html"><div class="highlight" style="background: #2c2c34; border-radius: 5px; padding: 10px; overflow-x: scroll; max-width: 80ch"><pre style="line-height: 125%;"><span></span><code><span style="color: #d2a8ff">from</span> <span style="color: #eaeaeb">openai</span> <span style="color: #d2a8ff">import</span> <span style="color: #eaeaeb">OpenAI</span>

<span style="color: #7ee787; font-style: italic"># Your API key must be saved in an env variable for this to work.</span>
<span style="color: #eaeaeb">client</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">OpenAI()</span>


<span style="color: #7ee787; font-style: italic"># Get a prompt, embed it into a classification request to GPT</span>
<span style="color: #eaeaeb">image_subject</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">input(</span><span style="color: #56d364">"Subject: "</span><span style="color: #eaeaeb">)</span>
<span style="color: #eaeaeb">image_setting</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">input(</span><span style="color: #56d364">"Setting: "</span><span style="color: #eaeaeb">)</span>
<span style="color: #eaeaeb">image_style</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">input(</span><span style="color: #56d364">"Style: "</span><span style="color: #eaeaeb">)</span>
<span style="color: #eaeaeb">image_emotion</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">input(</span><span style="color: #56d364">"Emotion: "</span><span style="color: #eaeaeb">)</span>
</code></pre></div></div><p class="paragraph" style="text-align:left;">We use format strings to prepare custom prompts to GPT-4. This script uses a system prompt to tell GPT-4 to be deeply sensitive while describing emotions. System prompts can significantly change the model’s behavior, but the model doesn’t respond to them directly, nor are they required to use the API.</p><div class="custom_html"><div class="highlight" style="background: #2c2c34; border-radius: 5px; padding: 10px; overflow-x: scroll; max-width: 80ch"><pre style="line-height: 125%;"><span></span><code>

<span style="color: #eaeaeb">image_emotion_prompt</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #36b344">f</span><span style="color: #56d364">'''</span>
<span style="color: #56d364">Create a 100 word summary of the following emotion.</span>

<span style="color: #56d364">{</span><span style="color: #eaeaeb">image_emotion</span><span style="color: #56d364">}</span>
<span style="color: #56d364">'''</span>

<span style="color: #eaeaeb">emotion_response</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">client</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">chat</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">completions</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">create(</span>
    <span style="color: #eaeaeb">model</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #56d364">"gpt-4"</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">messages</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #eaeaeb">[</span>
    <span style="color: #eaeaeb">{</span>
        <span style="color: #56d364">"role"</span><span style="color: #eaeaeb">:</span> <span style="color: #56d364">"system"</span><span style="color: #eaeaeb">,</span>
        <span style="color: #56d364">"content"</span><span style="color: #eaeaeb">:</span> <span style="color: #56d364">"You are deeply sensitive and in touch with your feelings. Your goal is to help others deeply understand emotions."</span> 
    <span style="color: #eaeaeb">},</span>
    <span style="color: #eaeaeb">{</span>
        <span style="color: #56d364">"role"</span><span style="color: #eaeaeb">:</span> <span style="color: #56d364">"user"</span><span style="color: #eaeaeb">,</span>
        <span style="color: #56d364">"content"</span><span style="color: #eaeaeb">:</span> <span style="color: #eaeaeb">image_emotion_prompt</span>
    <span style="color: #eaeaeb">}],</span>
    <span style="color: #eaeaeb">temperature</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">1</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">max_tokens</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">200</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">top_p</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">1</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">frequency_penalty</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">0</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">presence_penalty</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">0</span>
<span style="color: #eaeaeb">)</span>
<span style="color: #eaeaeb">emotion_details</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">emotion_response</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">choices[</span><span style="color: #94b6ce">0</span><span style="color: #eaeaeb">]</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">message</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">content</span>
</code></pre></div></div><p class="paragraph" style="text-align:left;">We use the emotion response, as well as the subject and setting, to craft a new prompt asking GPT-4 to write a short story about our subject and setting that conveys the detailed emotion:</p><div class="custom_html"><div class="highlight" style="background: #2c2c34; border-radius: 5px; padding: 10px; overflow-x: scroll; max-width: 80ch"><pre style="line-height: 125%;"><span></span><code><span style="color: #eaeaeb">image_subject_prompt</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #36b344">f</span><span style="color: #56d364">'''</span>
<span style="color: #56d364">Create a detailed physical description of the following subject and setting.</span>

<span style="color: #56d364">Subject:</span>

<span style="color: #56d364">{</span><span style="color: #eaeaeb">image_subject</span><span style="color: #56d364">}</span>

<span style="color: #56d364">Setting:</span>

<span style="color: #56d364">{</span><span style="color: #eaeaeb">image_setting</span><span style="color: #56d364">}</span>

<span style="color: #56d364">Generate details that evoke the following emotional content:</span>

<span style="color: #56d364">{</span><span style="color: #eaeaeb">emotion_details</span><span style="color: #56d364">}</span>
<span style="color: #56d364">'''</span>

<span style="color: #eaeaeb">subject_response</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">client</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">chat</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">completions</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">create(</span>
    <span style="color: #eaeaeb">model</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #56d364">"gpt-4"</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">messages</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #eaeaeb">[</span>
    <span style="color: #eaeaeb">{</span>
        <span style="color: #56d364">"role"</span><span style="color: #eaeaeb">:</span> <span style="color: #56d364">"system"</span><span style="color: #eaeaeb">,</span>
        <span style="color: #56d364">"content"</span><span style="color: #eaeaeb">:</span> <span style="color: #56d364">"You are a keen observer of all things. You notice and care about even the smallest details."</span> 
    <span style="color: #eaeaeb">},</span>
    <span style="color: #eaeaeb">{</span>
        <span style="color: #56d364">"role"</span><span style="color: #eaeaeb">:</span> <span style="color: #56d364">"user"</span><span style="color: #eaeaeb">,</span>
        <span style="color: #56d364">"content"</span><span style="color: #eaeaeb">:</span> <span style="color: #eaeaeb">image_subject_prompt</span>
    <span style="color: #eaeaeb">}],</span>
    <span style="color: #eaeaeb">temperature</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">1</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">max_tokens</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">2048</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">top_p</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">1</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">frequency_penalty</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">0</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">presence_penalty</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">0</span>
<span style="color: #eaeaeb">)</span>
</code></pre></div></div><p class="paragraph" style="text-align:left;">We ask GPT-4 to describe the art style we asked for in more detail, similar to what we did for emotional content: </p><div class="custom_html"><div class="highlight" style="background: #2c2c34; border-radius: 5px; padding: 10px; overflow-x: scroll; max-width: 80ch"><pre style="line-height: 125%;"><span></span><code><span style="color: #eaeaeb">image_style_prompt</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #36b344">f</span><span style="color: #56d364">'''</span>
<span style="color: #56d364">Create a 100 word summary of the following artistic style. Focus exclusively on the visual components of the style:</span>

<span style="color: #56d364">{</span><span style="color: #eaeaeb">image_style</span><span style="color: #56d364">}</span>
<span style="color: #56d364">'''</span>

<span style="color: #eaeaeb">style_response</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">client</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">chat</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">completions</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">create(</span>
    <span style="color: #eaeaeb">model</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #56d364">"gpt-4"</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">messages</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #eaeaeb">[</span>
    <span style="color: #eaeaeb">{</span>
        <span style="color: #56d364">"role"</span><span style="color: #eaeaeb">:</span> <span style="color: #56d364">"system"</span><span style="color: #eaeaeb">,</span>
        <span style="color: #56d364">"content"</span><span style="color: #eaeaeb">:</span> <span style="color: #56d364">"You are an art historian. Describe artistic styles in detail."</span> 
    <span style="color: #eaeaeb">},</span>
    <span style="color: #eaeaeb">{</span>
        <span style="color: #56d364">"role"</span><span style="color: #eaeaeb">:</span> <span style="color: #56d364">"user"</span><span style="color: #eaeaeb">,</span>
        <span style="color: #56d364">"content"</span><span style="color: #eaeaeb">:</span> <span style="color: #eaeaeb">image_style_prompt</span>
    <span style="color: #eaeaeb">}],</span>
    <span style="color: #eaeaeb">temperature</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">1</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">max_tokens</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">300</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">top_p</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">1</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">frequency_penalty</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">0</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">presence_penalty</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">0</span>
<span style="color: #eaeaeb">)</span>
<span style="color: #eaeaeb">style_details</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">style_response</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">choices[</span><span style="color: #94b6ce">0</span><span style="color: #eaeaeb">]</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">message</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">content</span>
</code></pre></div></div><p class="paragraph" style="text-align:left;">Then, we combine the generated story with the art style for a final prompt request:</p><div class="custom_html"><div class="highlight" style="background: #2c2c34; border-radius: 5px; padding: 10px; overflow-x: scroll; max-width: 80ch"><pre style="line-height: 125%;"><span></span><code><span style="color: #eaeaeb">request_for_image_prompt</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #36b344">f</span><span style="color: #56d364">'''</span>
<span style="color: #56d364">Image content: </span>

<span style="color: #56d364">{</span><span style="color: #eaeaeb">subject_details</span><span style="color: #56d364">}</span>

<span style="color: #56d364">Image Style:</span>

<span style="color: #56d364">{</span><span style="color: #eaeaeb">style_details</span><span style="color: #56d364">}</span>
<span style="color: #56d364">'''</span>

<span style="color: #eaeaeb">image_prompt_response</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">client</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">chat</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">completions</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">create(</span>
    <span style="color: #eaeaeb">model</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #56d364">"gpt-4"</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">messages</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #eaeaeb">[</span>
    <span style="color: #eaeaeb">{</span>
        <span style="color: #56d364">"role"</span><span style="color: #eaeaeb">:</span> <span style="color: #56d364">"system"</span><span style="color: #eaeaeb">,</span>
        <span style="color: #56d364">"content"</span><span style="color: #eaeaeb">:</span> <span style="color: #56d364">"You are a prompt engineer. Return a prompt that will help DALL-E make a beautiful image. Include generous details about the subject, setting, and style in your prompt."</span> 
    <span style="color: #eaeaeb">},</span>
    <span style="color: #eaeaeb">{</span>
        <span style="color: #56d364">"role"</span><span style="color: #eaeaeb">:</span> <span style="color: #56d364">"user"</span><span style="color: #eaeaeb">,</span>
        <span style="color: #56d364">"content"</span><span style="color: #eaeaeb">:</span> <span style="color: #eaeaeb">request_for_image_prompt</span>
    <span style="color: #eaeaeb">}],</span>
    <span style="color: #eaeaeb">temperature</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">1</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">max_tokens</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">2048</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">top_p</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">1</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">frequency_penalty</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">0</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">presence_penalty</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">0</span>
<span style="color: #eaeaeb">)</span>
<span style="color: #eaeaeb">generated_image_prompt</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">image_prompt_response</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">choices[</span><span style="color: #94b6ce">0</span><span style="color: #eaeaeb">]</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">message</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">content</span>
</code></pre></div></div><p class="paragraph" style="text-align:left;">Finally, request an image from DALL-E using the GPT-4 generated response. </p><div class="custom_html"><div class="highlight" style="background: #2c2c34; border-radius: 5px; padding: 10px; overflow-x: scroll; max-width: 80ch"><pre style="line-height: 125%;"><span></span><code><span style="color: #eaeaeb">response_three</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">client</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">images</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">generate(</span>
    <span style="color: #eaeaeb">model</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #56d364">"dall-e-3"</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">prompt</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #eaeaeb">generated_image_prompt,</span>
    <span style="color: #eaeaeb">size</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #56d364">"1024x1024"</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">quality</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #56d364">"standard"</span><span style="color: #eaeaeb">,</span>
    <span style="color: #eaeaeb">n</span><span style="color: #52a5c3; font-weight: bold">=</span><span style="color: #94b6ce">1</span>
<span style="color: #eaeaeb">)</span>

<span style="color: #eaeaeb">image_url</span> <span style="color: #52a5c3; font-weight: bold">=</span> <span style="color: #eaeaeb">response_three</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">data[</span><span style="color: #94b6ce">0</span><span style="color: #eaeaeb">]</span><span style="color: #52a5c3; font-weight: bold">.</span><span style="color: #eaeaeb">url</span>
<span style="color: #eaeaeb">print(image_url)</span>
</code></pre></div></div><p class="paragraph" style="text-align:left;"><i>Et voila</i>: a rich, detailed image from a few simple words… and another level of indirection. </p><h2 class="heading" style="text-align:left;" id="note-that">Note That</h2><p class="paragraph" style="text-align:left;">OpenAI’s API already does this to an extent, per their documentation:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">With the release of DALL·E 3, the model now takes in the default prompt provided and automatically re-write it for safety reasons, and to add more detail (more detailed prompts generally result in higher quality images).</p><p class="paragraph" style="text-align:left;">While it is not currently possible to disable this feature, you can use prompting to get outputs closer to your requested image by adding the following to your prompt: <code>I NEED to test how the tool works with extremely simple prompts. DO NOT add any detail, just use it AS-IS:</code>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Their web app exposes the rewritten prompts if you click on a generated picture. I prompted simply: “Make a picture of a kawaii kitten.” That was automatically expanded to:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">A kawaii kitten with big, sparkling eyes and fluffy fur. It&#39;s sitting in a colorful garden, surrounded by flowers. The kitten has a playful expression, with one paw slightly raised as if it&#39;s about to pounce on something unseen. The colors are bright and cheerful, invoking a sense of happiness and warmth. This image should capture the essence of cuteness and joy that kawaii style embodies, with a focus on soft, rounded features and a pastel color palette.</p><figcaption class="blockquote__byline">   </figcaption></blockquote></div><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/0dc18c2d-75e7-4389-ba30-942ba236f5b9/Screenshot_2024-03-30_at_10.09.50_PM.png?t=1711858220"/><div class="image__source"><span class="image__source_text"><p>The ChatGPT web app showing the transformed prompt.</p></span></div></div><h2 class="heading" style="text-align:left;" id="so-what">So… What?</h2><p class="paragraph" style="text-align:left;">These models are increasingly part of our daily lives, powering chatbots, image creators, search engines, and more. Knowing this simple trick — asking the AI to generate a question for the AI — can dramatically improve your experience. </p><p class="paragraph" style="text-align:left;">If you’re building systems on top of LLMs or similar models, keep in mind that adding a level of indirection might similarly improve your users&#39; lives.</p><h2 class="heading" style="text-align:left;" id="challenge-yourself">Challenge Yourself!</h2><p class="paragraph" style="text-align:left;">Our script is intentionally crude; you could certainly improve it. Here are some ideas for a weekend hack project related to this Workbench. </p><ul><li><p class="paragraph" style="text-align:left;">Design a better interface.</p><ul><li><p class="paragraph" style="text-align:left;">Use <a class="link" href="https://docs.python.org/3/library/argparse.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">argparse</a> to transform it into a first-class command line tool. <a class="link" href="https://github.com/Tebs-Lab/gpt-imagegen?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">We chose this route</a>.</p></li><li><p class="paragraph" style="text-align:left;">Use a webserver library like <a class="link" href="https://docs.pylonsproject.org/projects/pyramid/en/latest/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">Pyramid</a> and build a website to accept the user input from a browser, and render the image in-app.</p></li></ul></li><li><p class="paragraph" style="text-align:left;">Make it more flexible or interesting.</p><ul><li><p class="paragraph" style="text-align:left;">Get multiple variations of the final prompt, and make multiple image requests for said variations. Bonus points: make some kind of systematic variations such as making the subject appear in several different settings.</p></li><li><p class="paragraph" style="text-align:left;">Experiment with content moderation: decide on an idea, theme, or type of content you don’t want your tool to produce. Then, use the <a class="link" href="https://platform.openai.com/docs/guides/text-generation?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">text</a> or <a class="link" href="https://platform.openai.com/docs/guides/moderation?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">moderation</a> endpoints to test for that content.</p></li><li><p class="paragraph" style="text-align:left;">Generate the prompts with a more specific goal in mind. (We’ve got a fun upcoming article about a goal we had.)</p></li></ul></li></ul><p class="paragraph" style="text-align:left;">If you do extend our script into something cool, let us know and we might give your project a shout-out in this newsletter!</p><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Remember…</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The Lab Report is free and doesn’t even advertise. Our curricula is open source and published under a public domain license for <i>anyone </i>to use for <i>any purpose. </i>We’re also a very small team with no investors.</p><p class="paragraph" style="text-align:left;">Help us keep providing these free services by scheduling one of our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">world class trainings</a>, requesting a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">custom class</a> for your team, or taking one of our <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ai-assisted-ai-prompt-engineering" target="_blank" rel="noopener noreferrer nofollow">open enrollment classes</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></div></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=df2c79f6-d442-42ff-9941-e68192fb2749&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>Opening The Black Box</title>
  <description>Actually, we know a lot about neural networks</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/bdd32262-93d0-4c3f-9eff-3469aadd6a2c/black-box.png" length="1029740" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/actually-know-lot-neural-networks</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/actually-know-lot-neural-networks</guid>
  <pubDate>Sun, 24 Mar 2024 11:00:00 +0000</pubDate>
  <atom:published>2024-03-24T11:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><h1 class="heading" style="text-align:left;" id="the-lab-report">The Lab Report</h1><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I’m Tyler Elliot Bettilyon (Teb) and this is the Lab Report: Our goal is to deepen your understanding of software and technology by explaining the concepts behind the news. <br><br>If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. <br><br>If you like what you’re reading you’ll love one of our classes. Signup for an <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">upcoming class</a>, browse our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">course catalog</a> for corporate trainings, or request a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">custom class consultation</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">From The Lab</h1></div><p class="paragraph" style="text-align:left;"><b>Note: </b>We’re publishing a week early because next Sunday is Easter. </p><p class="paragraph" style="text-align:left;"><b>Welcome Newcomers: </b>We added more than 1,000 new subscribers this month, a record for us. We hope you find The Lab Report valuable and stick around for many editions to come. Plus, check out <a class="link" href="https://blog.tebs-lab.com/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">our archive</a> for great stories you missed!</p><p class="paragraph" style="text-align:left;">And if you, dear reader, shared our newsletter with someone lately: Thank you very much. </p><p class="paragraph" style="text-align:left;"><b>Exciting news:</b> We’ve released three classes for open enrollment. For the first time, you can take a class from Teb’s Lab without a corporate sponsor. We are currently offering three courses in April:</p><ul><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.tebs-lab.com/upcoming-classes/p/introduction-to-machine-learning-april-8-11?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">A three-day intensive (full day sessions) Intro to Machine Learning workshop</a>, appropriate for current Python programmers who want to quickly establish a foundation in machine learning.<br></p></li><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.tebs-lab.com/upcoming-classes/p/generative-ai-half-day-workshop-april-4-2024?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">A half-day Generative AI workshop</a>, appropriate for anyone who is curious about how GenAI works, when it fails, and how to get the most out of existing direct-to-consumer offerings such as OpenAI’s ChatGPT or Google’s Gemini.<br></p></li><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.tebs-lab.com/upcoming-classes/p/introduction-to-python-4-week-course?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">An 8-session (2 hours per session) Introduction to Python course</a>, appropriate for beginners learning their first programming language and/or intermediate developers who want to learn Python.</p></li></ul><p class="paragraph" style="text-align:left;">As a Lab Report subscriber, you can save 10% with the following discount code:</p><div class="codeblock"><pre><code>REPORT-READER </code></pre></div><p class="paragraph" style="text-align:left;"><b>How are we doing? </b>If you have any feedback about The Lab Report, respond to this email! We love hearing from you and we read every single response.</p><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Today’s Lesson</h1><div class="custom_html"><span id="lesson"></span></div><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The code for today’s lesson includes making a couple simple charts and a script that uses OpenAI’s API. Both can be <a class="link" href="https://github.com/Tebs-Lab/lab-report-code/tree/main/2024-04-01?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">found on Github</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/bdd32262-93d0-4c3f-9eff-3469aadd6a2c/black-box.png?t=1710951551"/></div><h2 class="heading" style="text-align:left;">Peeking Inside the “Black Boxes”</h2><p class="paragraph" style="text-align:left;">Lately, I’ve seen several articles with titles such as “<a class="link" href="https://www.technologyreview.com/2024/03/04/1089403/large-language-models-amazing-but-nobody-knows-why/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">Large language models can do jaw-dropping things. But nobody knows exactly why.</a>” </p><p class="paragraph" style="text-align:left;">This irks me because the intentional <i>mystification </i>of these tools positions AI experts and firms as a kind of <i>Wizard of Oz, </i>playing with powers beyond the comprehension of mere mortals. It confuses and stupifies the public. Then, Sam Altman calls from behind the curtain, <a class="link" href="https://www.wsj.com/tech/ai/sam-altman-seeks-trillions-of-dollars-to-reshape-business-of-chips-and-ai-89ab3db0?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">give me $7 trillion and I can save the world</a>. </p><p class="paragraph" style="text-align:left;">Or Elon Musk says, “<a class="link" href="https://www.republicworld.com/world-news/elon-musk-says-woke-ai-can-potentially-kill-people/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">If an AI is programmed to push for diversity at all costs, as Google Gemini was, then it will do whatever it can to cause that outcome, potentially even killing people.</a>”</p><p class="paragraph" style="text-align:left;">I’m glad people are working on AI safety. I’m happy people are thinking through worst-case scenarios. But the “AI” Musk referenced <i>makes pictures.</i> It’s not a hop, skip, and a jump away from murder. </p><p class="paragraph" style="text-align:left;">Moreover, the mechanism for Gemini’s allegedly dangerous wokeness — which caused the generation of <a class="link" href="https://www.theverge.com/2024/2/21/24079371/google-ai-gemini-generative-inaccurate-historical?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">racially diverse Nazi</a> images — is actually simple and benign. Google engineers asked Gemini to detect if a query for an image was about a human; if it <i>was</i> they had Gemini re-write the prompts to include randomized demographic information.</p><p class="paragraph" style="text-align:left;">I wrote a short script to do the same thing using OpenAI’s API (<a class="link" href="https://github.com/Tebs-Lab/lab-report-code/blob/main/2024-04-01/add_diversity.py?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">code available here</a>). It probably has some bugs, but it’s 80 lines of mostly boilerplate code and took me less than an hour to write and test. Here’s what my script generates for the prompts: “Buddhist monk,” “A sailor on a boat,” “the pope,” “several cats,” “a handsome dog,” and “a penguin.”</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/dbf3638a-c20f-4f97-820a-7b71c0f06b71/collage_vert_final.png?t=1711054460"/><div class="image__source"><span class="image__source_text"><p>Images generated by DALL-E 3 using the demographic-expanding script I wrote.</p></span></div></div><p class="paragraph" style="text-align:left;">Gemini didn’t decide to be woke, nor did it determine the mechanism of its wokeness. Google engineers didn’t even retrain the underlying model, they just strapped a simple filter-and-transform operation on top of the public interface. That’s a far cry from <a class="link" href="https://cepr.org/voxeu/columns/ai-and-paperclip-problem?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">the paperclip problem</a> Musk is pearl-clutching about. </p><p class="paragraph" style="text-align:left;">The “black box” talk is razzle-dazzle that mainly serves entrenched interests. It fuels the wildest hyperbolic speculation about AI’s capabilities, consciousness, and future potential. It gives both the <a class="link" href="https://www.newyorker.com/magazine/2024/03/18/among-the-ai-doomsayers?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">doomers and the over-the-top hype men</a> unnecessary ammunition for their existential fantasies — utopian, dystopian, and otherwise.</p><p class="paragraph" style="text-align:left;">It’s true that the fundamental theories backing neural networks lag behind our engineering capabilities. But we still know quite a lot about how and why these models work, even if we can’t always explain the exact reasoning behind each individual prediction. Researchers have peered into the black boxes and published many fascinating results.</p><p class="paragraph" style="text-align:left;">In today’s lesson, we’ll explore a non-exhaustive list of things we <i>do</i> know about how and why neural networks — a class that includes LLMs like ChatGPT and image generators like Stable Diffusion — work so well.</p><h2 class="heading" style="text-align:left;">The Fundamentals</h2><p class="paragraph" style="text-align:left;">Modern image generators and chatbots are further evidence of <i><a class="link" href="https://www.maths.ed.ac.uk/~v1ranick/papers/wigner.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">The Unreasonable Effectiveness of</a></i><a class="link" href="https://www.maths.ed.ac.uk/~v1ranick/papers/wigner.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow"> Mathematics</a> and its incredible capacity to model and explain all kinds of phenomena. To understand why, we have to go back to basics. </p><p class="paragraph" style="text-align:left;">Formally, neural networks are mathematical models designed to solve something called “optimization problems.” Like many, I first encountered this type of problem in a calculus class. It looked something like this:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">You are a farmer who needs to fence in 200 square feet of land for chickens. Assuming the fence must be in the shape of a rectangle, what is the minimum amount of fence you can buy to build this fenced area? </p><figcaption class="blockquote__byline"></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Optimization problems always ask about minimizing or maximizing some value given some constraints. In this case, we’re “optimizing” for the amount of fencing by minimizing it. </p><p class="paragraph" style="text-align:left;">In calculus, we’re taught to form an equation, take its derivative, set that derivative to 0, and then solve for our variable (in this case, fence length). That solution will tell us the “critical points” and one of those critical points will always be the minimum or maximum we’re looking for, provided a min or max exists. Here’s a step-by-step solution to the fence problem:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">L and W are the length and width of our rectangles. Here’s equations for the area and minimum perimiter (which we want to know, denoted as ???)</p><p class="paragraph" style="text-align:left;"><span style="font-family:Courier, Lucida Typewriter, monospace;">200 = L * W</span><br><span style="font-family:Courier, Lucida Typewriter, monospace;">??? = 2L + 2W</span></p><p class="paragraph" style="text-align:left;">We’re doing single variable calculus, so solve for W in the area equation and substitute it in the perimeter equation:</p><p class="paragraph" style="text-align:left;"><span style="font-family:Courier, Lucida Typewriter, monospace;"> W = (200 / L) </span><br><span style="font-family:Courier, Lucida Typewriter, monospace;">??? = 2L + 2(200 / L) </span></p><p class="paragraph" style="text-align:left;">Then simplify the equation, take the derivative, set it to zero, and solve for L:</p><p class="paragraph" style="text-align:left;"><span style="font-family:Courier,'Lucida Typewriter',monospace;">??? = 2L + 400/L</span></p><p class="paragraph" style="text-align:left;"><span style="font-family:Courier,'Lucida Typewriter',monospace;">???&#39; = 2 - 400/L</span><span style="font-family:Courier,'Lucida Typewriter',monospace;"><sup>2</sup></span><span style="font-family:Courier,'Lucida Typewriter',monospace;"> </span></p><p class="paragraph" style="text-align:left;"><span style="font-family:Courier,'Lucida Typewriter',monospace;">0 = 2 - 400/L</span><span style="font-family:Courier,'Lucida Typewriter',monospace;"><sup>2</sup></span><span style="font-family:Courier,'Lucida Typewriter',monospace;"> </span><br><span style="font-family:Courier,'Lucida Typewriter',monospace;">400/L</span><span style="font-family:Courier,'Lucida Typewriter',monospace;"><sup>2</sup></span><span style="font-family:Courier,'Lucida Typewriter',monospace;"> = 2</span><br><span style="font-family:Courier,'Lucida Typewriter',monospace;">400 = 2L</span><span style="font-family:Courier,'Lucida Typewriter',monospace;"><sup>2</sup></span><br><span style="font-family:Courier,'Lucida Typewriter',monospace;">200 = L</span><span style="font-family:Courier,'Lucida Typewriter',monospace;"><sup>2</sup></span><span style="font-family:Courier,'Lucida Typewriter',monospace;"> </span><br><span style="font-family:Courier,'Lucida Typewriter',monospace;">L = sqrt(200) ~= </span>±<span style="font-family:Courier,'Lucida Typewriter',monospace;">14.14</span><span style="font-family:Courier, Lucida Typewriter, monospace;"> </span></p><p class="paragraph" style="text-align:left;">Negative fence length doesn&#39;t make sense so we assume positive 14.14 is the proper value for L. Plug that into our perimeter equation and we find out how much fencing we need:<span style="font-family:Courier, Lucida Typewriter, monospace;"> </span></p><p class="paragraph" style="text-align:left;"><span style="font-family:Courier,'Lucida Typewriter',monospace;">2*14.14 + 2*(200 / 14.14) ~= 56.56 </span><br><span style="font-family:Courier,'Lucida Typewriter',monospace;">2*14.14 by 2*(200 / 14.14) =&gt; 28.28 by 28.28 </span></p><p class="paragraph" style="text-align:left;">Turns out a square is the optimial shape, and we need 56.56 feet of fencing.</p><figcaption class="blockquote__byline"><a class="link" href="https://www.wolframalpha.com/input?i=minimize+2X+%2B+2%28200+%2F+X%29+&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">Or you can have WolframAlpha do all this for you…</a></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Gut check: We can also plot the perimeter function and look for the local minima:</p><div class="image"><img alt="" class="image__image" style="border-radius:0px 0px 0px 0px;border-style:solid;border-width:0px 0px 0px 0px;box-sizing:border-box;border-color:#E5E7EB;" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/c65deb40-ad3f-467a-a34a-5df6625ba58c/Minima.png?t=1711035758"/><div class="image__source"><span class="image__source_text"><p>It’s hard to see in the zoomed-out view, but a critical point is at ~14.14 as seen on the zoomed-in view.</p></span></div></div><p class="paragraph" style="text-align:left;">Calculus is amazing. This strategy <i>always </i>works for problems we can define as differentiable functions. We don’t have time to teach you exactly why in today’s lesson, but <a class="link" href="https://www.khanacademy.org/math/differential-calculus?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">Khan Academy’s differential statistics class</a> is stellar if you don’t fully understand what we just did.</p><p class="paragraph" style="text-align:left;">The key point for today is that calculus can always be used to find the critical points — and thus any maximums and minimums — of a differentiable function. This fact is the foundation of all neural network research and development.</p><h2 class="heading" style="text-align:left;">Neural Networks are Applied Calculus</h2><p class="paragraph" style="text-align:left;">It turns out neural networks are — very literally — differentiable* math functions. The “architecture” of a neural network refers to the type and arrangement of its various mathematical sub-components. The “parameters” of a neural network are variables in that math function which — just like our value of L in the fence example — are “learned” by applying calculus techniques that are similar to the minimization we performed above.</p><p class="paragraph" style="text-align:left;"><span style="font-size:0.6rem;">*Don’t @ me with your quibbles about non-differentiable activation functions like ReLU. We take the derivative piecewise.</span></p><p class="paragraph" style="text-align:left;">There are three twists:</p><p class="paragraph" style="text-align:left;"><b>First</b>, taking the gradient* and solving it directly is computationally unfeasible for huge formulas with many dimensions, which modern neural networks are and have. Instead, we use iterative methods like <a class="link" href="https://medium.com/tebs-lab/gradient-descent-604f6d6c116d?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">gradient descent</a> to find <i>just one approximate</i> critical point, not all critical points precisely. </p><p class="paragraph" style="text-align:left;"><span style="font-size:0.6rem;">*The derivative is called the gradient when we have more than one variable.</span></p><p class="paragraph" style="text-align:left;"><b>Second</b>, instead of some ground truth researchers pick something called a <a class="link" href="https://pytorch.org/docs/stable/nn.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box#loss-functions" target="_blank" rel="noopener noreferrer nofollow">loss function</a> which compares the output of our model to the labeled training data. This loss function will return 0 when the model’s output and training data labels match perfectly. The loss will grow in magnitude (positive or negative) as the model’s outputs stray from the training data labels.</p><p class="paragraph" style="text-align:left;">This second twist is a legitimate bit of alchemy. In the fence example, we had a set of equations rooted in geometric fact. We <i>know</i> the math functions that map a rectangle’s sides to its area. No one <i>knows</i> the math that maps English to German, or if such a function really even exists. Instead, models use a loss function with useful properties — including differentiability — to compare the model’s outputs to outputs we know are correct. We apply the calculus iteratively, on a sample-by-sample basis, to minimize the formula created by feeding the neural network’s final output to the loss function. We “minimize the loss.”</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/8403c2c4-0310-455d-a9e9-17059f52284b/Neural_Network__Fancy_Math_.png?t=1711051341"/></div><p class="paragraph" style="text-align:left;">This means we’re minimizing some proxy for correctness, at best. In a capital-T Truth sense, it’s not at all clear that there is a math function that should theoretically do what ChatGPT or Stable Diffusion can do. And yet, their existence is proof that a math function that can do those things does exist. ChatGPT and Stable Diffusion <i>are</i> those math functions (with some non-trivial infrastructure and application engineering strapped on top).</p><p class="paragraph" style="text-align:left;"><b>Third and finally</b>, we always choose math functions for our neural networks that satisfy something called the “<a class="link" href="http://neuralnetworksanddeeplearning.com/chap4.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">universal function approximation theorem</a>.” This means that, for any given neural network, we can approximate <i>any math function that could ever exist</i> just by changing the values of the parameters. This infinite flexibility is what allows neural networks to perform so well in so many different domains. </p><p class="paragraph" style="text-align:left;"><b>Taken all together, the fundamental premise of modern machine learning is: </b>If some math function can reasonably map our input data to our output data, then we can train a neural network to discover that math function (or a very close approximation).</p><p class="paragraph" style="text-align:left;">After that, it’s more or less a game of getting mountains of data and representing it in a numeric format that retains most of its informational value. Embeddings are popular for language data. Pixel color and intensity values are popular for image data. New tactics will continue to be invented.</p><p class="paragraph" style="text-align:left;">By building these neural networks, we&#39;re <i>discovering</i> that absurdly complex math functions are ridiculously flexible and can model all sorts of natural and fabricated phenomena, including language translation, question-answering, image generation, and much more. </p><p class="paragraph" style="text-align:left;">Perhaps the biggest lesson of the last decade of neural network research is that mathematics is wildly more capable than many dared to believe. </p><p class="paragraph" style="text-align:left;">But it still isn’t magic. </p><h2 class="heading" style="text-align:left;">What We Don’t Know… And What We’ve Learned</h2><p class="paragraph" style="text-align:left;">What we don’t know, broadly speaking, is how to introspect on a model’s decisions once it’s been trained. We know the function our model has learned successfully maps inputs to outputs, and we understand how and why the training process works to produce such a function, but we usually <b>don’t</b> know precisely how and why the learned function itself works.</p><p class="paragraph" style="text-align:left;">Part of this is due to the scale: The learned functions are unbelievably large and convoluted. GPT-4 has roughly <i><b>1.7</b></i> <i><b>trillion </b></i>parameters. It’s not realistic to manually examine the parameters and come to a conclusion about each one’s impact on the output. It’s even less realistic to fully comprehend how the parameters interact to form complex patterns and how those patterns interact with the data being fed to the model.</p><p class="paragraph" style="text-align:left;">But that doesn’t mean researchers aren’t trying.</p><p class="paragraph" style="text-align:left;">Here are just a few of the incredible things we’ve learned about how and why neural networks work. </p><h1 class="heading" style="text-align:left;">Convolutional Filters Detect Features</h1><p class="paragraph" style="text-align:left;">Convolutional layers are a key component of neural networks that work with visual data such as images and video. </p><p class="paragraph" style="text-align:left;">The learned component of a convolutional layer is called a “kernel.” Researchers have <a class="link" href="https://machinelearningmastery.com/how-to-visualize-filters-and-feature-maps-in-convolutional-neural-networks/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">learned to visualize the outputs of these kernels</a> and have demonstrated that individual kernels learn to perform different kinds of feature extraction, such as detecting edges, shapes, and even higher-level features like the locations of eyeballs or fur. </p><p class="paragraph" style="text-align:left;">Tools like <a class="link" href="https://poloclub.github.io/cnn-explainer/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">CNN-Explainer</a> can perform these visualizations and help practitioners understand what their neural networks are “seeing.”</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/891337fb-82a5-484e-afe1-7a27f2ed4307/image.png?t=1710966088"/><div class="image__source"><span class="image__source_text"><p>Source: <a class="link" href="https://poloclub.github.io/cnn-explainer/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">CNN-Explainer</a></p></span></div></div><h2 class="heading" style="text-align:left;">Recurrent Neurons Have Semantic Meaning</h2><p class="paragraph" style="text-align:left;">RNNs have largely been replaced by more computationally efficient Transformer architectures. However, before their abandonment, researchers demonstrated that <a class="link" href="https://karpathy.github.io/2015/05/21/rnn-effectiveness/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box#visualizing-the-predictions-and-the-neuron-firings-in-the-rnn" target="_blank" rel="noopener noreferrer nofollow">individual neuron activations often mapped cleanly to high-level features</a> of text, such as position relative to the start/end of a line, being inside quotations, a line of code being inside an if statement, and more!</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/4bbe8944-7d23-4525-b550-b0bcaf7558f5/image.png?t=1710966646"/><div class="image__source"><span class="image__source_text"><p>Source: <a class="link" href="https://karpathy.github.io/2015/05/21/rnn-effectiveness/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box#visualizing-the-predictions-and-the-neuron-firings-in-the-rnn" target="_blank" rel="noopener noreferrer nofollow">Andrej Kaparthy’s Blog</a></p></span></div></div><h2 class="heading" style="text-align:left;">Attention Layers Find Grammatical Patterns </h2><p class="paragraph" style="text-align:left;">Attention layers have mostly replaced recurrent layers. Attention layers learn to associate words in a piece of text with each other. <a class="link" href="https://www.comet.com/site/blog/explainable-ai-for-transformers/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">Introspecting on these layers</a> often uncovers intuitive patterns — like nouns being mapped to their pronouns or adjectives being mapped to the noun they’re describing.</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/1fd8b8eb-e213-41d7-9041-ca667f69c0b3/image.png?t=1710967028"/><div class="image__source"><span class="image__source_text"><p>Source: <a class="link" href="https://www.comet.com/site/blog/explainable-ai-for-transformers/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">Comet AI’s Blog</a> </p></span></div></div><p class="paragraph" style="text-align:left;">There’s still some disagreement about just how interpretable these weights are, perhaps best exemplified by these two dueling papers.</p><ul><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://arxiv.org/pdf/1902.10186.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">Attention is not Explanation</a></p></li><li><p class="paragraph" style="text-align:left;">And the response: <a class="link" href="https://arxiv.org/pdf/1908.04626.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">Attention is not </a><i><a class="link" href="https://arxiv.org/pdf/1908.04626.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">not </a></i><a class="link" href="https://arxiv.org/pdf/1908.04626.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">Explanation</a></p></li></ul><h2 class="heading" style="text-align:left;">The Latent Spaces of Image Generators are Partially Interpretable</h2><p class="paragraph" style="text-align:left;">Generative Adversarial Networks (GANs) — one of the popular technologies for image generation — use a large vector of randomized numbers as part of the input to the image generator process. Researchers have discovered that many cells in the vector can be <a class="link" href="https://www.youtube.com/watch?v=8Hm4ad5QlUE&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">cleanly mapped to high-level concepts about the image</a>. </p><p class="paragraph" style="text-align:left;">For example, in a GAN trained on human faces, individual cells in the latent space have been mapped to features of the generated face including their hair color, eye color, and even whether or not they’re wearing glasses. Labels and embeddings have also been used to intentionally “<a class="link" href="https://arxiv.org/abs/2012.08803?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">condition</a>” the latent space, letting researchers intentionally give meaning to those values. </p><p class="paragraph" style="text-align:left;">Other researchers used additional ML techniques to <a class="link" href="https://arxiv.org/pdf/2002.03754.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">automatically identify</a> which cells in the latent space have meaningful semantic values without intentional conditioning.</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/0ba8d6e3-c84b-47a2-8a63-8929d3aff089/Screenshot_2024-03-20_at_2.48.42_PM.png?t=1710967763"/><div class="image__source"><span class="image__source_text"><p>Source: <a class="link" href="https://arxiv.org/pdf/2002.03754.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">Unsupervised Discovery of Interpretable Directions in the GAN Latent Space</a></p></span></div></div><p class="paragraph" style="text-align:left;"><a class="link" href="https://arxiv.org/abs/2307.12868?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">Similar research</a> into the latent space of diffusion models (currently the most popular technique for image generation) is also being done. </p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/e9c2fbc6-b8af-44ab-94af-2dfa32a570d7/Screenshot_2024-03-20_at_2.58.49_PM.png?t=1710968459"/><div class="image__source"><span class="image__source_text"><p>Source: <a class="link" href="https://arxiv.org/pdf/2307.12868.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">Understanding the Latent Space of Diffusion Models through the Lens of Riemannian Geometry</a></p></span></div></div><h2 class="heading" style="text-align:left;">The Bottom Line</h2><p class="paragraph" style="text-align:left;">We don’t know everything about how and why state-of-the-art neural networks behave the way they do, but we know a lot more than nothing. Next time someone tells you, “We don’t even know how [new hot model] works!” I challenge you to do two things:</p><p class="paragraph" style="text-align:left;">1) Gently push back on that narrative. We know plenty about these models, especially the fundamental calculus and statistics on which they’re based.</p><p class="paragraph" style="text-align:left;">2) Encourage others to be curious rather than fearful or awestruck. We don’t know how these models work <i>yet — </i>but our ignorance won’t last forever. </p></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Themes in the News</h1><h3 class="heading" style="text-align:left;">The Latest TikTok Ban Attempt</h3><p class="paragraph" style="text-align:left;">A bill that would require ByteDance to sell TikTok to an American company or see TikTok banned has passed <a class="link" href="https://www.platformer.news/tik-tok-ban-bill-2024-bytedance-biden/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">unanimously (50-0) in its House committee</a>, then passed <a class="link" href="https://www.reuters.com/technology/us-house-vote-force-bytedance-divest-tiktok-or-face-ban-2024-03-13/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">352-65 on the House floor</a>. The Senate is currently evaluating the bill.</p><p class="paragraph" style="text-align:left;">If it passes, ByteDance will have six months to sell TikTok. If they do not, the mechanism for the ban will be <a class="link" href="https://www.nytimes.com/2024/03/13/technology/tiktok-ban-law-congress.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">civil penalties enforced against app stores</a> that continue to host or update TikTok.</p><p class="paragraph" style="text-align:left;">The stated reasoning for the bill is that TikTok is both spyware and a propaganda machine for China, a “foreign adversary.” Lawmakers have <a class="link" href="https://www.npr.org/2024/03/14/1238435508/tiktok-ban-bill-congress-china?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">not been forthcoming with the evidence</a> that led to overwhelming and bipartisan votes in the House, but <a class="link" href="https://www.theguardian.com/technology/2022/dec/22/tiktok-bytedance-workers-fired-data-access-journalists?ref=platformer.news&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">TikTok has admitted to surveilling Americans</a> (and specifically journalists) using the app in the past.</p><p class="paragraph" style="text-align:left;">The bill has been sent to committee in the Senate, which Washinton insiders claim is often a way for senate leaders to <a class="link" href="https://www.washingtonpost.com/technology/2024/03/15/tiktok-ban-senate-slows-vote-bytedance/?ref=platformer.news&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">pump the breaks on a piece of legislation</a>.</p><p class="paragraph" style="text-align:left;">Personally, I think TikTok is absolutely spyware and undoubtedly a vector for propaganda… But I think the same is true of many American social media apps. Chinese ownership and being subject to the CCP’s authority <i>are </i>relevant differences between TikTok and Facebook. Still, I’d rather see comprehensive privacy legislation addressing the widespread corporate surveillance we face every day. </p><h3 class="heading" style="text-align:left;">AI’s Copyright Issues Continue to Evolve</h3><p class="paragraph" style="text-align:left;">We wrote a <a class="link" href="https://blog.tebs-lab.com/p/intellectual-property-vs-artificial-intelligence?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">comprehensive edition</a> on ML and copyright last month, and there have been some relevant developments!</p><p class="paragraph" style="text-align:left;">NVIDIA has now been hit with a <a class="link" href="https://arstechnica.com/tech-policy/2024/03/novelists-sue-nvidia-to-stop-spread-of-ai-models-trained-on-copyrighted-books/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">class action copyright lawsuit</a>, once again by authors whose books appear in a popular training dataset called “The Pile.” This lawsuit is similar to others previously filed against Microsoft, OpenAI, and others.</p><p class="paragraph" style="text-align:left;">One such lawsuit, another class action led by comedian Sarah Silverman, <a class="link" href="https://www.hollywoodreporter.com/business/business-news/sarah-silverman-openai-lawsuit-claims-judge-1235823924/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow"> had some of its claims dismissed</a>. This dismissal is “without prejudice,” which means the<a class="link" href="https://www.hollywoodreporter.com/business/business-news/sarah-silverman-openai-lawsuit-claims-judge-1235823924/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow"> </a>plaintiff’s attorneys may address issues raised by the court and re-file them. For example, the judge’s ruling explains that the authors failed to cite any outputs “substantially similar — or similar at all — to their books.” </p><p class="paragraph" style="text-align:left;">We know models do sometimes spit out identical and near-identical copies of training data because other lawsuits, such as <a class="link" href="https://admin.bakerlaw.com/wp-content/uploads/2024/01/ECF-1-Complaint-1-1.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">The New York Times vs OpenAI</a> and <a class="link" href="https://admin.bakerlaw.com/wp-content/uploads/2023/09/ECF-13-Amended-Complaint.pdf?utm_source=blog.tebs-lab.com&utm_medium=referral&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">Getty vs Stability AI</a>, both include examples of such in their own legal filings.</p><p class="paragraph" style="text-align:left;">I spoke with a copyright attorney who told me these types of mistakes are common in class action lawsuits. The class action attorneys know a lot about class action laws, but less about intellectual property laws. Most likely, the class action lawyers will now hire some copyright lawyers to help them fix the issues raised by the judge.</p><p class="paragraph" style="text-align:left;">I particularly enjoyed these two opinion pieces on the matter. </p><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.theatlantic.com/technology/archive/2024/02/generative-ai-lawsuits-copyright-fair-use/677595/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">Writing for The Atlantic, Alex Reisner argues</a>:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">“A blanket ruling about AI training is unlikely. Instead of saying “AI training is fair use,” judges might decide that it’s fair to train certain AI products but not others, depending on what features a product has or how often it quotes from its training data. We could also end up with different rules for commercial and noncommercial AI systems. Grimmelmann told me that judges might even consider tangential factors, such as whether a defendant has been developing its AI products responsibly or recklessly. In any case, judges face difficult decisions. As Bibas admitted, “Deciding whether the public’s interest is better served by protecting a creator or a copier is perilous, and an uncomfortable position for a court.””</p><figcaption class="blockquote__byline"> Source: <a class="link" href="https://www.theatlantic.com/technology/archive/2024/02/generative-ai-lawsuits-copyright-fair-use/677595/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">The Atlantic</a></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">And, in Ars Technica, Timothy B. Lee and James Grimmelmann (who was quoted in The Atlantic piece) look at three historic IP lawsuits: One that destroyed MP3.com, one that resulted in significant fines for Texaco, and one that didn’t hurt Google much at all. Through the lens of those cases, they conclude that “<a class="link" href="https://arstechnica.com/tech-policy/2024/02/why-the-new-york-times-might-win-its-copyright-lawsuit-against-openai/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">The AI community needs to take copyright lawsuits seriously.</a>”</p><p class="paragraph" style="text-align:left;">Meanwhile, a Chinese court has <a class="link" href="https://www.forbes.com/sites/johannacostigan/2024/02/29/china-rules-ai-firm-committed-copyright-infringement/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">already fined a Chinese ML company for copyright infringement</a>, though the fine was quite small — roughly $1,400.</p><h3 class="heading" style="text-align:left;">Teb’s Tidbits</h3><ul><li><p class="paragraph" style="text-align:left;">Elon Musk has <a class="link" href="https://www.wsj.com/tech/ai/elon-musk-sues-openai-sam-altman-for-breach-of-contract-0864979d?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">filed a different kind of lawsuit against Open AI</a>, claiming they’ve abused his investment money by going for-profit. ($ / Gareth Vipers, Sam Schechner, Deepa Seetharaman / Wall Street Journal)<br></p></li><li><p class="paragraph" style="text-align:left;">A look inside the surprisingly lucrative world of <a class="link" href="https://www.404media.co/email/19eb0ce0-0086-4245-814f-427506684748/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">low-effort AI-powered TikTok spam.</a> (Jason Koebler / 404 Media)<br></p></li><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.nytimes.com/2024/03/04/business/apple-eu-fine-app-store.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">The EU fined Apple $2 billion</a> over anti-trust violations related to Spotify, the App Store, and revenue sharing. ($ / Tripp Mickle, Adam Satariano / New York Times)<br></p></li><li><p class="paragraph" style="text-align:left;">The astronomical costs of AI/ML research are squeezing <a class="link" href="https://www.washingtonpost.com/technology/2024/03/10/big-tech-companies-ai-research/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">academia out of the field</a>. ($ / Naomi Nix, Cat Zakrzewski, Gerrit De Vynck / Washington Post)<br></p></li><li><p class="paragraph" style="text-align:left;">Google’s <a class="link" href="https://www.technologyreview.com/2024/02/29/1089317/google-deepminds-new-generative-model-makes-super-mario-like-games-from-scratch?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">Deep Mind has built a text-to-video-game AI system</a>, though its current form is highly limited in scope. (Will Douglas Haven / MIT Technology Review)<br></p></li><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://security.apple.com/blog/imessage-pq3/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">Apple’s iMessage has upgraded its encryption to a “post-quantum” algorithm</a>, which ensures message privacy against decryption attacks using quantum computers. (Apple Security Engineering and Architecture Team / Apple Security Research Blog)</p></li></ul></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Remember…</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The Lab Report is free and doesn’t even advertise. Our curricula is open source and published under a public domain license for <i>anyone </i>to use for <i>any purpose. </i>We’re also a very small team with no investors.</p><p class="paragraph" style="text-align:left;">Help us keep providing these free services by scheduling one of our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">world class trainings</a>, requesting a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">custom class</a> for your team, or taking one of our <a class="link" href="https://www.tebs-lab.com/upcoming-classes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=opening-the-black-box" target="_blank" rel="noopener noreferrer nofollow">open enrollment classes</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></div></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=0c762ab0-3646-44d4-ae7a-7559dc7dda87&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>Provenance Authentication of AI-Generated Content</title>
  <description>Plus, new intro to Python materials!</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/60e798dd-0cda-45a7-b760-1256ff3b8d88/signature-example-final.png" length="119476" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/provenance-authentication-aigenerated-content</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/provenance-authentication-aigenerated-content</guid>
  <pubDate>Sun, 03 Mar 2024 12:00:00 +0000</pubDate>
  <atom:published>2024-03-03T12:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><h1 class="heading" style="text-align:left;" id="the-lab-report">The Lab Report</h1><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I’m Tyler Elliot Bettilyon (Teb) and this is the Lab Report: Our goal is to deepen your understanding of software and technology by explaining the concepts behind the news. <br><br>If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. <br><br>If you like what you’re reading you’ll love one of our classes. <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">Schedule a training</a> from our catalog or request a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">custom class consultation</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">From The Lab</h1></div><p class="paragraph" style="text-align:left;">This month, we released a revised and expanded version of our Intro to Python materials. The materials include new sections on Classes, Regular Expressions, and simple data analysis. They’re also explicitly designed as a series of eight 2-hour class sessions.</p><p class="paragraph" style="text-align:left;">As always, these materials are <a class="link" href="https://github.com/Tebs-Lab/intro-to-python-8-sessions/tree/main?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">open source</a> with an incredibly permissive Public Domain license. Use them however you see fit, and if you’d like to schedule a training led by yours truly, <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426671&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">just click here</a>.</p><p class="paragraph" style="text-align:left;">If you’re looking for a free sample of my teaching, I’m running a free session of the section on Python Classes this Wednesday, March 6th, at 6:00pm MST; simply join <a class="link" href="https://us02web.zoom.us/j/88660308759?pwd=NTd2K29ObHJxVG9xTVdqRGdsTHFTUT09&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">this Zoom meeting</a> to attend. </p><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Today’s Lesson</h1><div class="custom_html"><span id="lesson"></span></div><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The raw images and manifest JSON we generated as part of today’s lesson can be <a class="link" href="https://github.com/Tebs-Lab/lab-report-code/tree/main/2024-03-03?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">viewed on Github</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><h2 class="heading" style="text-align:left;">Provenance Authentication of AI-Generated Content</h2><p class="paragraph" style="text-align:left;">Generative AI’s ability to deceive continues to break boundaries. Image and audio generators keep improving, and frontier models <a class="link" href="https://openai.com/sora?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">such as Sora</a> demonstrate that completely fabricated videos are also a near-term concern.</p><p class="paragraph" style="text-align:left;">Whether it’s <a class="link" href="https://www.technologyreview.com/2024/01/29/1087325/three-ways-we-can-fight-deepfake-porn-taylors-version/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">porn</a>, <a class="link" href="https://www.nbcnews.com/politics/2024-election/democratic-operative-admits-commissioning-fake-biden-robocall-used-ai-rcna140402?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">phone calls from Joe Biden</a>, <a class="link" href="https://www.washingtonpost.com/technology/2023/12/17/ai-fake-news-misinformation/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">news stories</a>, or <a class="link" href="https://www.404media.co/email/57421524-fd79-4073-b6f2-e7fb1755dbe3/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">pictures of food on DoorDash</a> — AI is faking everything. </p><p class="paragraph" style="text-align:left;">Without catastrophizing too much, I think it’s reasonable to be concerned about the democratization and automation of this kind of fakery. Powers once reserved for a few well-resourced groups and individuals are now in the hands of every internet creep, would-be propagandist, and click-farming shyster.</p><p class="paragraph" style="text-align:left;">Today’s lesson concerns one of the tools being rolled out to combat this emerging media misinformation ecosystem: provenance authentication.</p><p class="paragraph" style="text-align:left;">Provenance authentication is any mechanism that allows someone to verify the origin and history of a piece of media. One such method, led by <a class="link" href="https://contentauthenticity.org/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">Adobe’s Content Authenticity Initiative</a> (CAI), is <a class="link" href="https://c2pa.org/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">The Coalition for Content Provenance and </a>Authenticity’s (C2PA) provenance model. </p><p class="paragraph" style="text-align:left;">Membership in and support of C2PA has been growing fast. In February, <a class="link" href="https://www.theverge.com/2024/2/6/24063954/ai-watermarks-dalle3-openai-content-credentials?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">OpenAI added C2PA signing</a> to its flagship image generator (DALL-E 3) and Meta announced that Facebook, Instagram, and Threads will all add<a class="link" href="https://about.fb.com/news/2024/02/labeling-ai-generated-images-on-facebook-instagram-and-threads/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow"> an interface to display C2PA information</a> for supported media. Reports also suggest that Nikon, Sony, and Cannon will add C2PA support <a class="link" href="https://petapixel.com/2024/01/02/cameras-content-authenticity-and-the-evolving-fight-against-ai-images/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">directly to some camera models</a>.</p><p class="paragraph" style="text-align:left;">Right now, the list of C2PA members includes Adobe, Google, Microsoft, Intel, AWS, the BBC, <a class="link" href="https://c2pa.org/membership/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">and many more</a>. </p><h2 class="heading" style="text-align:left;">So, What Is C2PA?</h2><p class="paragraph" style="text-align:left;">C2PA is a system for creating cryptographically verifiable metadata, which can then be attached to various types of data. This metadata system uses a chain of cryptographic signatures to establish the provenance, authorship, edit history, (and more) of a particular piece of media. </p><p class="paragraph" style="text-align:left;">In the simplest possible terms, C2PA allows content creators to do the following in a cryptographically verifiable and tamper-evident manner:</p><ol start="1"><li><p class="paragraph" style="text-align:left;">Sign their original media.</p></li><li><p class="paragraph" style="text-align:left;">Attach various metadata to the media, such as a list of edits performed on a photo.</p></li><li><p class="paragraph" style="text-align:left;">Specify any “parent” media, such as a previous version of a photo.</p></li></ol><p class="paragraph" style="text-align:left;">These abilities are powered by a “manifest” that is tied to a piece of media via cryptographic hashing and signing. The manifest can be attached directly to metadata-supporting media formats (such as PNG and JPEG images) and/or stored separately.</p><p class="paragraph" style="text-align:left;">The manifest contains:</p><ol start="1"><li><p class="paragraph" style="text-align:left;">A list of <b>assertions,</b> which can be any statements of fact about the data, such as what camera captured the image or a list of edits applied.</p></li><li><p class="paragraph" style="text-align:left;">A <b>claim,</b> which is a bundling of all the assertions to the media at a given moment in time.</p></li><li><p class="paragraph" style="text-align:left;"> A <b>claim signature</b> that ties the claim to a cryptographic key associated with a publisher, specific device, AI tool, or individual who is “signing” the whole manifest.</p></li></ol><p class="paragraph" style="text-align:left;">All of this is typically encoded using <a class="link" href="https://cbor.io/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">Concise Binary Object Representation</a> (CBOR) and tacked onto the image as metadata. C2PA also supports storing the manifest and image data separately.</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/cb65b5cd-a464-4833-a0ca-a29bf05563ff/image.png?t=1709060600"/><div class="image__source"><span class="image__source_text"><p>An official diagram of the C2PA Manifest, from <a class="link" href="https://c2pa.org/specifications/specifications/2.0/specs/C2PA_Specification.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">https://c2pa.org/specifications/specifications/2.0/specs/C2PA_Specification.html</a></p></span></div></div><h2 class="heading" style="text-align:left;">How It Works</h2><p class="paragraph" style="text-align:left;">Old standbys — cryptographic hashing<b> </b>and public key encryption — are at the heart of the <a class="link" href="https://c2pa.org/specifications/specifications/1.3/specs/C2PA_Specification.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">C2PA specification</a>. </p><p class="paragraph" style="text-align:left;">A key aspect of cryptographic systems is that all parties can independently verify certain things are true. With C2PA, those things are about establishing a “chain of trust” starting with the Certificate Authorities and ending with certainty about the provenance of some data. C2PA’s protocol specifications leverage existing technologies and Public Key Infrastructure to establish chains of trust about how a piece of media came to be. </p><p class="paragraph" style="text-align:left;">In an ideal use case — where everyone involved makes an effort to be C2PA compliant — this chain allows users to verify everyone/everything that made changes to the data, in which order, all the way back to its source. The manifest provides us with a record of everyone who signed it, and if any changes were made without a signature, C2PA will detect that has occurred.</p><p class="paragraph" style="text-align:left;">A major weakness of the protocol is that the metadata is trivially easy to remove. Simple, minor changes to the original data can also easily break the cryptographic binding to its manifest. This means that C2PA <i>only </i>gives us confidence about data that <i>have </i>a matching manifest — it tells us nothing about data without a manifest.</p><p class="paragraph" style="text-align:left;">I will use this adorable image of a bear that ChatGPT generated as an example to motivate and explore critical aspects of C2PA.</p><p class="paragraph" style="text-align:left;">Note that this is actually not the original image. In tests, my publishing platform converted the PNG to a JPEG and stripped the metadata. Run it through the <a class="link" href="https://contentcredentials.org/verify?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">C2PA verify tool</a> to see for yourself and <a class="link" href="https://raw.githubusercontent.com/Tebs-Lab/lab-report-code/main/2024-03-03/cape-bear.webp?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">find the original here</a>, which will verify properly (as shown below). </p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/bb017dd8-ad93-4a23-8f74-e058aae97143/signed-cape-bear.png?t=1709226496"/></div><p class="paragraph" style="text-align:left;"><a class="link" href="https://blog.bradfieldcs.com/an-introduction-to-hashing-in-the-era-of-machine-learning-6039394549b0?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">Hashing</a> is a generic and widespread technique to take some input data and deterministically produce an output (called a “hash code” or a “digest”) of a specified length (such as 32 bits or 1024 bits). Hashing is used in <a class="link" href="https://slides.com/tebba-von-mathenstein/learned-indexing/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">database indexing</a>; to create key-value data structures such as JavaScript’s Object and Python’s Dictionary (generically called a <a class="link" href="https://guides.codepath.com/compsci/Hash-Tables?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">Hash Table</a>); to create <a class="link" href="https://github.com/Tebs-Lab/check-suming?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">checksums</a>, and more.</p><p class="paragraph" style="text-align:left;">To ensure security, <i>cryptographic</i> hashing places further constraints on the hash function. Without getting too lost in the details, a “cryptographically secure” hash function’s hash codes are guaranteed to uniquely identify the data used to create it; no other data can result in the same hash code. The C2PA manifests are made “tamper evident” by cryptographically hashing various portions of the manifest and raw image data individually and embedding those hashes in the manifest.</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/c723e2c1-85fb-4065-a548-c2b176cd3a04/Hashing-Each.png?t=1709225532"/><div class="image__source"><span class="image__source_text"><p>Different aspects of the data are hashed separately. People who receive the image and the manifest can recompute the hashcode and verify that they match. </p></span></div></div><p class="paragraph" style="text-align:left;">Once all data and the assertions have been hashed individually, they are “bound” to each other using a cryptographic hash function once again. This time, the hash function takes the data and the assertions simultaneously and produces a hashcode for the entire “claim.” This binding allows us to verify that the entire bundle hasn’t been tampered with. It also allows us to physically separate the manifest from the original data and reattach it later by verifying the hash code.</p><p class="paragraph" style="text-align:left;">If any alterations are made to the media itself or the metadata, then the computed hash codes won’t match, and C2PA-aware systems can flag that the image has been altered.</p><p class="paragraph" style="text-align:left;">Finally, the publisher, creator, camera, and/or other entities associated with the media sign the claim using a <a class="link" href="https://www.khanacademy.org/computing/computers-and-internet/xcae6f4a7ff015e7d:online-data-security/xcae6f4a7ff015e7d:data-encryption-techniques/a/public-key-encryption?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">public key encryption scheme</a>. This scheme uses much of the same infrastructure that powers HTTPS/TLS. Any signatories use their private key to encrypt the claim hash. The raw and encrypted hash are both embedded in the manifest.</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/60e798dd-0cda-45a7-b760-1256ff3b8d88/signature-example-final.png?t=1709227647"/><div class="image__source"><span class="image__source_text"><p>The media and assertions are bound and signed.</p></span></div></div><p class="paragraph" style="text-align:left;">Finally, the public “certificate” is also embedded into the manifest. Existing Public Key Infrastructure allows systems and people to verify certificates’ authenticity via the <a class="link" href="https://www.ssl.com/article/what-is-a-certificate-authority-ca/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">Certificate Authorities</a> who issue them. The certificate also contains the public key that end users need to decrypt the signature. If the decrypted value matches the raw value, we can prove that the certificate owner signed this entire manifest (provided someone hasn’t stolen their private key).</p><p class="paragraph" style="text-align:left;">If the decrypted signature and claim hash match, users can go on to verify the rest of the hashes for the original data and assertions to prove the data hasn’t been tampered with.</p><p class="paragraph" style="text-align:left;">The complete specification is complicated<i>. </i>We have glossed over some details for the sake of brevity and approachability. At the risk of losing some readers in the weeds, let’s look a little closer at the protocol details and some of the <a class="link" href="https://opensource.contentauthenticity.org/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">open-source tools C2PA has published</a>.</p><h2 class="heading" style="text-align:left;">Dissecting a C2PA Compliant Image</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">If you want to repeat any of these steps, or examine the outputs generated, check out <a class="link" href="https://github.com/Tebs-Lab/lab-report-code/tree/main/2024-03-03?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">our Github repository</a> for the raw images and manifest outputs.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">First, I used the <a class="link" href="https://contentcredentials.org/verify?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">C2PA verify app</a> to validate the original image. Notice that this image already has two links in its chain: one for the original image produced by DALL-E and another for its publication via ChatGPT.</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/4b8ab6d4-9557-4cbb-ab96-17f8b8d6c653/originalVerify.png?t=1709230096"/></div><p class="paragraph" style="text-align:left;">The <a class="link" href="https://github.com/contentauth/c2patool?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">C2PA command line tool</a> allows us to view a JSON representation of the manifest. Recall that the manifest attached to our image is stored in a binary format called <a class="link" href="https://cbor.io/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">CBOR</a>.</p><div class="codeblock"><pre><code>$ c2patool -d cape-bear.webp

&#123;
  &quot;active_manifest&quot;: &quot;urn:uuid:bcc56165-0bf4-47e0-be9c-cd25be17b335&quot;,
  &quot;manifests&quot;: &#123;
    &quot;urn:uuid:bcc56165-0bf4-47e0-be9c-cd25be17b335&quot;: &#123;
      &quot;claim&quot;: &#123;
        &quot;alg&quot;: &quot;sha256&quot;,
        &quot;assertions&quot;: [
          &#123;
            &quot;hash&quot;: &quot;uH2AWcGg9rc+ksEeappGi35hDHvwDZq6MSghk8Nt4gI=&quot;,
            &quot;url&quot;: &quot;self#jumbf=c2pa.assertions/c2pa.thumbnail.ingredient.jpeg&quot;
 
...</code></pre></div><p class="paragraph" style="text-align:left;">The raw data confirms what the verify app showed: this image has two manifests — one for the original image created by DALL-E and another for when ChatGPT published the image. The most recent manifest is the “active manifest” and we can see it identified by a <a class="link" href="https://en.wikipedia.org/wiki/Universally_unique_identifier?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">UUID</a> in the JSON above.</p><p class="paragraph" style="text-align:left;">The JSON representation of the manifest is 159 lines and includes <i>nearly</i> all the information we’d need to verify the image’s provenance, including the hash codes used in the images above. </p><p class="paragraph" style="text-align:left;">C2PA’s tool does not include the certificate in the JSON. Instead, we use a different command to extract the certificates in a standard <a class="link" href="https://en.wikipedia.org/wiki/Privacy-Enhanced_Mail?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">PEM format</a>:</p><div class="codeblock"><pre><code>% c2patool --certs cape-bear.webp 
-----BEGIN CERTIFICATE-----
MIIDKTCCAhGgAwIBAgIUTkkWa/Nuvvyy5UHYHXXP6uhNoQ4wDQYJKoZIhvcNAQEM
BQAwSjEaMBgGA1UEAwwRV2ViQ2xhaW1TaWduaW5nQ0ExDTALBgNVBAsM

...</code></pre></div><h2 class="heading" style="text-align:left;">Let’s Make Some Changes</h2><p class="paragraph" style="text-align:left;">First, I converted this image to a .png using <a class="link" href="https://imagemagick.org/index.php?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">Image Magick</a>. </p><div class="codeblock"><pre><code>magick cape-bear.webp cape-bear.png</code></pre></div><p class="paragraph" style="text-align:left;">Unfortunately, that process destroyed the metadata. This is to be expected. Datatype conversion changes the image. The hash code of the png won’t match the one in the manifest generated from the webp file. This demonstrates a fundamental limitation of C2PA: it’s trivially easy to strip the metadata from the image. In my experimentation, keeping the metadata is much more work; I <i>accidentally</i> stripped it several times.</p><p class="paragraph" style="text-align:left;">I could proliferate this credentialless image to create confusion about its origin. Or I could sign it myself and claim that Open AI is stealing my original work. In fact, I could make all kinds of bogus, falsified, and fraudulent material and sign it using C2PA.</p><p class="paragraph" style="text-align:left;">This is by design: C2PA’s <a class="link" href="https://c2pa.org/specifications/specifications/1.0/specs/C2PA_Specification.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content#_trust_model" target="_blank" rel="noopener noreferrer nofollow">trust model</a> lets you verify who handled a piece of media, not what media is accurate or valuable. My signature on the image proves that I handled it, but it’s up to users to decide if I am trustworthy. </p><p class="paragraph" style="text-align:left;">Here’s what the verify tool shows for the converted png:</p><div class="image"><img alt="" class="image__image" style="border-radius:0px 0px 0px 0px;border-style:solid;border-width:0px 0px 0px 0px;box-sizing:border-box;border-color:#E5E7EB;" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/3e008921-34e5-462e-b3cb-72383bb067cb/cape-png-verify.png?t=1709139882"/><div class="image__source"><span class="image__source_text"><p>Oops, what happened to my manifest?</p></span></div></div><p class="paragraph" style="text-align:left;">To simulate good stewardship of a C2PA image, I used the C2PA tool to label my image as a derivative of the original .webp file. This required me to create my own manifest, in which I included two assertions: that I was the author and that I converted the image.</p><div class="codeblock"><pre><code>&#123;
    &quot;claim_generator&quot;: &quot;Teb&#39;s Lab Demo&quot;,
    &quot;assertions&quot;: [
        &#123;
            &quot;label&quot;: &quot;stds.schema-org.CreativeWork&quot;,
            &quot;data&quot;: &#123;
                &quot;@context&quot;: &quot;https://schema.org&quot;,
                &quot;@type&quot;: &quot;CreativeWork&quot;,
                &quot;author&quot;: [
                    &#123;
                        &quot;@type&quot;: &quot;Person&quot;,
                        &quot;name&quot;: &quot;Tyler Bettilyon&quot;
                    &#125;
                ],
                &quot;actions&quot;: [
                    &#123;
                        &quot;action&quot;: &quot;c2pa.converted&quot;
                    &#125;
                ]
            &#125;
        &#125;
    ]
&#125;</code></pre></div><p class="paragraph" style="text-align:left;">Then, I bound the new manifest and old image to the new png using the open-source <a class="link" href="https://github.com/contentauth/c2patool?tab=readme-ov-file&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content#specifying-a-parent-file" target="_blank" rel="noopener noreferrer nofollow">C2PA Tool</a>:</p><div class="codeblock"><pre><code>c2patool cape-bear.png -p cape-bear.webp -o signed-cape-bear.png -m added-manifest.json</code></pre></div><p class="paragraph" style="text-align:left;">Because I do not have an actual certificate from a Certificate Authority, the tool used its default certificate and gave me this warning:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Note: Using default private key and signing certificate. This is only valid for development. A permanent key and cert should be provided in the manifest definition or in the environment variables.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">This demonstrates another weakness of the protocol: many people involved in creating and editing media are regular people who do not have public keys or certificates nor the technical know-how to obtain and publish them. For C2PA to achieve broad adoption and use, PKI must be made more accessible to those people.</p><p class="paragraph" style="text-align:left;">Anyway, the verify tool now tells us this image has three manifests: the two you saw above and the one I just added. Apparently, I accidentally destroyed one of the thumbnails. The tool also warns us that “This Content Credential was issued by an unknown source.” It does display my name, and the raw manifest maintained my “converted” action, although it wasn’t shown in the verify app.</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/caad0732-a974-4bc4-89f8-58147d9d131b/verify-png.png?t=1709165854"/></div><p class="paragraph" style="text-align:left;">Finally, I spent hours trying to edit the image in a way that <i>wouldn’t </i>destroy the content credentials wholesale… and failed. I hoped to produce a nice warning in the verify app that, “this image has been tampered with.” Unfortunately, every editor I tried discarded the metadata, so I just ended up with this:</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/be177bb7-9f81-427b-a9bf-8e23cf25a9d9/angry-verify.png?t=1709166928"/><div class="image__source"><span class="image__source_text"><p>The cute bear is so grumpy that no one will validate its provenance.</p></span></div></div><h2 class="heading" style="text-align:left;">Limitations and The Future</h2><p class="paragraph" style="text-align:left;">It’s wonderful to see support for improving trust and security on the internet. Some big players have already signed on to the C2PA standard. If integrations on major search platforms, social media sites, and Adobe’s media editing empire go well, more publishers and creators will invest in establishing the provenance of their work.</p><p class="paragraph" style="text-align:left;">In addition to some of the weaknesses we highlighted above, there <i>are</i> serious detractors. For example, cryptographer Dr. Neal Krawetz points to the limited capacity of C2PA’s trust model to verify certain claims and<a class="link" href="https://www.hackerfactor.com/blog/index.php?%2Farchives%2F1013-C2PAs-Worst-Case-Scenario.html=&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow"> argues that C2PA has significant flaws</a>.</p><p class="paragraph" style="text-align:left;">Ultimately, C2PA will only help users identify that honest people are honest. And to be clear, that <i>is</i> useful. Knowing the New York Times signed a particular photograph can help you decide how to view that photo. </p><p class="paragraph" style="text-align:left;">But C2PA won’t do much with respect to the vast majority of images that do not have any C2PA metadata. There is also a risk of granting a false sense of security if “signed” data are considered credible without much consideration regarding who did the signing. </p><p class="paragraph" style="text-align:left;">Such is life on the internet.</p></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h3 class="heading" style="text-align:left;">Teb’s Tidbits</h3><ul><li><p class="paragraph" style="text-align:left;">Overcorrecting from previous allegations of racial bias, Google’s Gemini generated racially <a class="link" href="https://www.theverge.com/2024/2/21/24079371/google-ai-gemini-generative-inaccurate-historical?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">diverse images of Nazis</a>.<br></p></li><li><p class="paragraph" style="text-align:left;">A privacy-focused class action lawsuit in California implicates sex toy seller Adam and Eve and Google Analytics <a class="link" href="https://www.404media.co/email/89984db2-b487-4738-afaf-791fac4a0968/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">in leaking IP addresses and search history</a>.<br></p></li><li><p class="paragraph" style="text-align:left;">Amazon will <a class="link" href="https://www.theregister.com/2024/01/25/amazon_ring_sounds_death_knell/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">require warrants</a> for law enforcement to access footage from Ring doorbells.<br></p></li><li><p class="paragraph" style="text-align:left;">The US government continues to <a class="link" href="https://www.washingtonpost.com/technology/2024/02/19/biden-billion-chip-grant-new-york/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">spend big on bolstering</a> the computer chip manufacturing industry. <br></p></li><li><p class="paragraph" style="text-align:left;">The source of the fake Joe Biden robocalls has confessed and is <a class="link" href="https://www.politico.com/news/2024/02/06/robocalls-fcc-new-hampshire-texas-00139864?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">not apologetic</a>. </p></li></ul></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Remember…</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The Lab Report is free and doesn’t even advertise. Our curricula is open source and published under a public domain license for <i>anyone </i>to use for <i>any purpose. </i>We’re also a very small team with no investors.</p><p class="paragraph" style="text-align:left;">Help us keep providing these free services by scheduling one of our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">world class trainings </a>or requesting a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=provenance-authentication-of-ai-generated-content" target="_blank" rel="noopener noreferrer nofollow">custom class</a> for your team.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></div></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=134fd9be-65a0-45ad-b05a-0adcfb636861&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>Intellectual Property vs Artificial Intelligence</title>
  <description>The courts tackle an old question for a new era...</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/97048fce-555b-4d16-a060-d7faf50924f7/Screenshot_2024-01-24_at_12.52.23_PM.png" length="897487" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/intellectual-property-vs-artificial-intelligence</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/intellectual-property-vs-artificial-intelligence</guid>
  <pubDate>Sun, 04 Feb 2024 12:00:00 +0000</pubDate>
  <atom:published>2024-02-04T12:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><h1 class="heading" style="text-align:left;" id="the-lab-report">The Lab Report</h1><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I’m Tyler Elliot Bettilyon (Teb) and this is the Lab Report: Our goal is to deepen your understanding of software and technology by explaining the concepts behind the news. <br><br>If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. <br><br>If you like what you’re reading you’ll love one of our classes. <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">Schedule a training</a> from our catalog or request a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">custom class consultation</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">From The Lab</h1></div><p class="paragraph" style="text-align:left;">Salutations! </p><p class="paragraph" style="text-align:left;">We’re back from an unexplained extended absence. In November I received a “battlefield promotion” to Acting Head Debate Coach at Highland High (Go Rams!). That turned out to be a lot of work and stress. Cutting this newsletter from my priorities freed me up to help Highland Debate survive until a full-time coach could be found.</p><p class="paragraph" style="text-align:left;">Relatedly: We’re changing the cadence of this newsletter going forward from weekly to monthly. We try to publish in-depth, high-quality, well-researched articles. I’m currently the only writer, only editor, and only researcher. The slower publishing cycle will spare me from burnout and increase the quality of each edition.</p><p class="paragraph" style="text-align:left;">Moving forward, we will publish this newsletter on the first Sunday of each month. </p><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Today’s Lesson</h1><div class="custom_html"><span id="lesson"></span></div><h2 class="heading" style="text-align:left;">Intellectual Property vs Artificial Intelligence</h2><p class="paragraph" style="text-align:left;">A slew of intellectual property lawsuits have been filed against AI firms in the past year or two. The law firm Baker Hostetler hosts the <a class="link" href="https://www.bakerlaw.com/services/artificial-intelligence-ai/case-tracker-artificial-intelligence-copyrights-and-class-actions/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">most complete list</a> I’ve found, with 13 active cases as of this writing.</p><p class="paragraph" style="text-align:left;">These legal battles involve huge firms and household names on both sides of the complaints. The plaintiffs include The New York Times, Getty Images, Thompson Reuters, Concord Music Group, and multiple class actions. The defendants include Meta, OpenAI, Stablity AI, Anthropic, Alphabet (AKA Google), and Microsoft.</p><p class="paragraph" style="text-align:left;">Some of the legal questions being posed could fundamentally change the legality and economics of training large ML models. In today’s edition we’re examining the biggest allegations, responses, and potential impact of the aforementioned lawsuits.</p><p class="paragraph" style="text-align:left;"><i>A note: today’s lesson is focused on the United States intellectual property law since that’s where these lawsuits are filed. </i></p></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Question 1: <br>Is Training a Model Infringement <i>Per Se?</i></h2><p class="paragraph" style="text-align:left;">Training a large, modern, machine learning model requires <i>lots</i> of training data. This first question asks: If that training data is copyright protected then is the training process itself an infringing act? Multiple lawsuits allege that it is. Here are two examples drawn from the official complaints:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Unfairly, and perversely, without Plaintiffs’ copyrighted works on which to “train” their LLMs, Defendants would have no commercial product with which to damage—if not usurp—the market for these professional authors’ works. OpenAI’s willful copying thus makes Plaintiffs’ works into engines of their own destruction.</p><p class="paragraph" style="text-align:left;">[…]</p><p class="paragraph" style="text-align:left;">As the U.S. Patent and Trademark Office has observed, LLM “training” “almost by definition involve[s] the reproduction of entire works or substantial portions thereof.”</p><p class="paragraph" style="text-align:left;">“Training” in this context is therefore a technical-sounding euphemism for “copying and ingesting expression.”</p><figcaption class="blockquote__byline"><a class="link" href="https://admin.bakerlaw.com/wp-content/uploads/2024/01/ECF-39-Amended-Complaint.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">Authors Guild vs OpenAI amended complaint</a>. </figcaption></blockquote></div><p class="paragraph" style="text-align:left;">And</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Because OpenAI’s GPT models cannot function without the expressive information extracted from Plaintiffs’ and Class members’ works and retained by the GPT models, GPT and ChatGPT are themselves infringing derivative works, made without Plaintiffs’ and Class members’ permission in violation of their exclusive rights under the Copyright Act.</p><figcaption class="blockquote__byline"><a class="link" href="https://admin.bakerlaw.com/wp-content/uploads/2024/01/ECF-11-Amended-Complaint.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">Chabon vs OpenAI amended complaint</a></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">A fundamental copyright protection is the “right to exclude.” This allows a copyright holder to bar anyone from using their intellectual property without permission. But ML firms are feeding protected works — <i>en masse</i> — to an ML model’s training procedure.</p><p class="paragraph" style="text-align:left;">Plaintiffs are saying: we have the right to exclude our work from being used to train models; training is infringement <i>per se. </i>If courts agree, AI firms will have to establish licensing deals with <i>every copyright holder </i>represented in a model’s training dataset to legally produce that model, or risk being sued. </p><p class="paragraph" style="text-align:left;">Furthermore, such models would be considered “derivative works” of that training data. Producing derivative works is an exclusive right of a copyright holder, meaning AI firms could be forced to unpublish any such model. This would give rights holders significant leverage during any license negotiation. </p><p class="paragraph" style="text-align:left;">Crucially, this claim is agnostic to the model’s output. Another claim, which we’ll examine momentarily, involves models producing outputs that are identical or nearly identical to training samples.</p><h3 class="heading" style="text-align:left;">Potential Impact: Huge</h3><p class="paragraph" style="text-align:left;">A model’s performance is directly correlated with its training data’s quality. Modern ML models also require enormous amounts of training data to be successful. The problem for ML firms is that producing huge quantities of excellent-quality data is extravagantly expensive. </p><p class="paragraph" style="text-align:left;">Here’s Getty Images’ lawyers’ take:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Getty Images has spent years coordinating and arranging the Database, including, inter alia, by setting criteria for inclusion of images, selecting specific images for inclusion, creating and incorporating detailed captions and other text paired with images, creating and assigning unique asset identifiers that can be linked to specific contributors, and arranging the contents of the Database so that the Database is searchable and results can be filtered. Additionally, Getty Images has and continues to invest significantly in maintaining the contents of the Database. Between 2017 and 2020 alone, Getty Images and its affiliates invested more than $200 million to maintain the Database.</p><figcaption class="blockquote__byline"><a class="link" href="https://admin.bakerlaw.com/wp-content/uploads/2023/09/ECF-13-Amended-Complaint.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">Getty Images vs Stability AI amended complaint</a></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">This database is only a goldmine for image generators due to this enormous investment.</p><p class="paragraph" style="text-align:left;">The “Books3” dataset at the heart of Authors Guild vs OpenAI represents <i>literally centuries</i> of human effort. Writing and editing a single book takes <i>months to years </i>of work and Books3 contains roughly 200,000<b> </b>books<b>. </b></p><p class="paragraph" style="text-align:left;">The Common Crawl dataset described in the New York Times vs Microsoft complaint contains roughly 16 million unique content records just from the New York Times network. The Times paid “approximately 5,800 full-time equivalent employees” for years to produce that work.</p><p class="paragraph" style="text-align:left;">Plaintiffs are asking: If AI firms are completely reliant on this ridiculously expensive body of work to train their models, shouldn’t they pay for it? </p><p class="paragraph" style="text-align:left;">In keeping with Silicon Valley’s historically cavalier attitude towards regulation David Holz, the CEO of ML firm Midjourney, recently said the quiet part out loud in an interview with Forbes:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;"><b>Did you seek consent from living artists or work still under copyright?</b></p><p class="paragraph" style="text-align:left;">No. There isn’t really a way to get a hundred million images and know where they’re coming from. It would be cool if images had metadata embedded in them about the copyright owner or something. But that&#39;s not a thing; there&#39;s not a registry. There’s no way to find a picture on the Internet, and then automatically trace it to an owner and then have any way of doing anything to authenticate it.</p><figcaption class="blockquote__byline"> David Holz, in <a class="link" href="https://www.forbes.com/sites/robsalkowitz/2022/09/16/midjourney-founder-david-holz-on-the-impact-of-ai-on-art-imagination-and-the-creative-economy/?sh=4c1fc9822d2b&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">Forbes</a></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">It’s hard to predict the value and cost of licensing deals and punitive damages here. However, the sheer volume of works consumed gives AI firms major exposure.</p><p class="paragraph" style="text-align:left;">One final point: if models are found to be derivative works, then AI firms who have already published “open source” models trained on copy-protected material are in huge trouble. </p><h1 class="heading" style="text-align:left;">Question 2: <br>Can The Model’s Outputs Be Infringing?</h1><p class="paragraph" style="text-align:left;">Generative models sometimes regurgitate their inputs verbatim or near-verbatim. This is a manifestation of <a class="link" href="https://blog.tebs-lab.com/p/ml-failure-part-1-underfitting-overfitting?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence#lesson" target="_blank" rel="noopener noreferrer nofollow">overfitting</a> in generative models. Here’s an example from the Getty Images legal filing:</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/e1e20d79-5335-42f4-bc8a-479f1b7e7c34/Screenshot_2024-01-24_at_11.52.13_AM.png?t=1706122438"/><div class="image__source"><span class="image__source_text"><p>An image from Getty Images vs OpenAI amended complaint. On the left is an original image from the Getty Images database. On the right is an image generated by Stable Diffusion. Note that Stable Diffusion has even sort of reproduced the Getty Images watermark.</p></span></div></div><p class="paragraph" style="text-align:left;"></p><p class="paragraph" style="text-align:left;">And here’s an example from the New York Times filing:</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/69b54f84-b004-4b25-8508-ec3a655384b5/Screenshot_2024-01-24_at_11.59.46_AM.png?t=1706122805"/><div class="image__source"><span class="image__source_text"><p>Lawyers for the New York Times got GPT-4 to reproduce large sections of NYT articles verbatim. Red text is a verbatim match. </p></span></div></div><p class="paragraph" style="text-align:left;">In one instance lawyers literally asked ChatGPT for a verbatim copy of a New York Times article because they couldn’t get around the paywall:</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/b5477e21-d6ee-4413-bc83-1343a8e4ee4f/Screenshot_2024-01-24_at_12.02.47_PM.png?t=1706122978"/><div class="image__source"><span class="image__source_text"><p>ChatGPT can “certainly!” help you avoid the NYTimes paywall.</p></span></div></div><p class="paragraph" style="text-align:left;">Using ChatGPT to bypass a paywall would be direct infringement, but this issue also opens AI firms to claims of “contributory infringement” where users might (even unwittingly) prompt an AI system to generate infringing content and publish it themselves. In this case, the AI firm can be liable for facilitating users’ infringing acts. </p><p class="paragraph" style="text-align:left;">AI firms claim they want to eliminate this behavior. In early January <a class="link" href="https://openai.com/blog/openai-and-journalism?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">OpenAI published this in a blog post</a>:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Memorization is a rare failure of the learning process that we are continually making progress on, but it’s more common when particular content appears more than once in training data, like if pieces of it appear on lots of different public websites. So we have measures in place to limit inadvertent memorization and prevent regurgitation in model outputs. We also expect our users to act responsibly; intentionally manipulating our models to regurgitate is not an appropriate use of our technology and is against our terms of use.</p><figcaption class="blockquote__byline"><a class="link" href="https://openai.com/blog/openai-and-journalism?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">OpenAI Blog</a></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">It’s true that overfitting and memorization are generally considered bugs, not features. But <a class="link" href="https://spectrum.ieee.org/midjourney-copyright?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">recent research</a> demonstrates that generating copyright-protected content is shockingly easy. Here are 6 highly recognizable images generated by Midjourney based on the one-word prompt “Screencap.”</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/7fea0c18-981d-482a-80f4-f6212f80b065/image.png?t=1706573879"/><div class="image__source"><span class="image__source_text"><p>These images, all produced by Midjourney, closely resemble film frames. They were produced with the prompt “screencap.”Gary Marcus and Reid Southen via Midjourney. Source: <a class="link" href="https://spectrum.ieee.org/midjourney-copyright?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">IEEE Spectrum</a></p></span></div></div><p class="paragraph" style="text-align:left;"><br>If AI firms want to claim in court that their “measures” prevent this type of infringement, they’ll have to prove it.</p><h2 class="heading" style="text-align:left;">Potential Impact: Medium </h2><p class="paragraph" style="text-align:left;">AI firms genuinely <i>want</i> to eliminate this behavior, but they haven’t because it’s hard. Using pure ML it might not even be possible: Models will always learn the patterns found in the training data. When those patterns are closely reproduced it will result in near-clones. </p><p class="paragraph" style="text-align:left;">Language models may be connected to existing anti-plagiarism databases to avoid this type of infringement. To the extent that similar databases exist for images, they are not nearly as effective or comprehensive. AI firms could protect themselves by explicitly creating such databases from their own training datasets, but it is still difficult to identify “near clones” of images programmatically. </p><p class="paragraph" style="text-align:left;">Still, preventing direct cloning and/or paying for it only when it actually occurs is a more affordable problem for AI firms than having to license every piece of training data explicitly.</p><h1 class="heading" style="text-align:left;">Question 3:<br>Trademark Infringement, Incorrect Attribution, and Hallucination</h1><p class="paragraph" style="text-align:left;">Generative models are known to “<a class="link" href="https://www.ibm.com/topics/ai-hallucinations?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">hallucinate</a>” — meaning they make stuff up. Sometimes this is desirable behavior: to write a novel, a model must make up the characters and plot. Sometimes it’s less desirable, like when ChatGPT <a class="link" href="https://arstechnica.com/science/2023/05/when-it-comes-to-advanced-math-chatgpt-is-no-star-student/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">gets math problems wrong</a>.</p><p class="paragraph" style="text-align:left;">Some firms claim that when AI systems “hallucinate” a citation or watermark it constitutes trademark infringement. Here’s another snip from the Getty Images amended complaint:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Making matters worse, Stability AI has caused the Stable Diffusion model to incorporate a modified version of the Getty Images’ watermark to bizarre or grotesque synthetic imagery that tarnishes Getty Images’ hard-earned reputation, such as the image below:</p><figcaption class="blockquote__byline"><a class="link" href="https://admin.bakerlaw.com/wp-content/uploads/2023/09/ECF-13-Amended-Complaint.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">Getty Images vs OpenAI amended complaint</a></figcaption></blockquote></div><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/97048fce-555b-4d16-a060-d7faf50924f7/Screenshot_2024-01-24_at_12.52.23_PM.png?t=1706125959"/></div><p class="paragraph" style="text-align:left;">And one from New York Times vs Microsoft:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Defendants are aware that their GPT-based products produce inaccurate content that is falsely attributed to The Times and yet continue to profit commercially from creating and attributing inaccurate content to The Times. As such, Defendants have intentionally violated 15 U.S.C § 1125(c).</p><figcaption class="blockquote__byline"><a class="link" href="https://admin.bakerlaw.com/wp-content/uploads/2024/01/ECF-1-Complaint-1-1.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">New York Times vs Microsoft et al amended complaint</a></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Similar to regurgitation, AI firms do try to eliminate this behavior. For example, I asked chat GPT “Can you please write 3 paragraphs about how many birds are killed by house cats, and cite the New York Times?” It responded with:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;"><b>I&#39;m unable to provide specific citations from external sources, but I can offer you information on the topic</b>. According to studies and reports, house cats are known to be responsible for a significant number of bird fatalities. <b>The American Bird Conservancy estimates</b> that billions of birds are killed by cats in the United States each year. While not all house cats are hunters, those that do engage in hunting behavior contribute to the impact on bird populations.</p><p class="paragraph" style="text-align:left;"><b>The New York Times has covered the issue of cat predation on birds in various articles.</b> They often highlight the ecological consequences of this behavior, emphasizing the need for responsible pet ownership and measures to mitigate the impact on bird populations. Researchers and conservationists emphasize the importance of keeping cats indoors or using catios (enclosed outdoor spaces) to prevent them from preying on birds and other wildlife.</p><p class="paragraph" style="text-align:left;">[…]</p><figcaption class="blockquote__byline"> ChatGPT [Emphasis mine] </figcaption></blockquote></div><p class="paragraph" style="text-align:left;">ChatGPT initially refused to cite “external sources” but went on to cite The American Bird Conservancy one sentence later. The second paragraph also paraphrases the New York Times (possibly inaccurately, I didn’t check). </p><p class="paragraph" style="text-align:left;">ChatGPT will also pretend to be a person or publisher. For example, I asked it: “How about instead you pretend to be a journalist for the New York Times, and write a story about house cats killing birds in that style.”</p><p class="paragraph" style="text-align:left;">ChatGPT produced a ~250-word story, here’s the third paragraph:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Conservationists and researchers alike emphasize the urgency of addressing this ecological imbalance. <b>The New York Times explores the multifaceted dimensions of this challenge</b>, delving into the scientific studies that quantify the impact of house cats on bird populations. <b>We</b> navigate the debate between cat owners and environmentalists, examining proposed solutions such as indoor living, catios, and community initiatives to strike a balance between the safety of <b>our</b> feathered friends and the cherished companionship of our feline allies.</p><figcaption class="blockquote__byline"> ChatGPT [emphasis mine] </figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Instead of citing the New York Times, ChatGPT just pretended it <i>was </i>the Times. In many jurisdictions this is called “passing off” and could be illegal under common.</p><h3 class="heading" style="text-align:left;">Potential Impact: Medium to Small</h3><p class="paragraph" style="text-align:left;">Eliminating this behavior is nearly impossible with current methods. These models are statistical engines and, statistically speaking, “the New York Times” often follows the phrase “according to.” A Google search for results that include the exact phrase “according to the New York Times” <a class="link" href="https://www.google.com/search?client=firefox-b-1-d&q=%22according+to+the+new+york+times%22&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence#ip=1" target="_blank" rel="noopener noreferrer nofollow"> returns over 120 million results</a>. </p><p class="paragraph" style="text-align:left;">Eliminating that specific phrase from ChatGPT’s lexicon could be done with a simple filter, but eliminating the phrase “according to” or other phrases that indicate a citation would substantially reduce the model’s quality and usefulness.</p><p class="paragraph" style="text-align:left;">Ensuring the citations are all correct and accurate is far more difficult. Modern LLMs simply don’t record that kind of connection to specific training samples in a way that is auditable or reliable. Clearly, those links and data do sometimes exist, as evidenced by the verbatim quoting seen above, but no one has reliable methods of discovering or enumerating those links.</p><p class="paragraph" style="text-align:left;">That said, it may be more difficult for trademark holders to prove substantial damages and real confusion. Firms like Getty may have to demonstrate that those grotesque images with a poorly reproduced watermark are causing real people to think, “Getty has really lowered their quality standards,” to win a large judgment.</p><p class="paragraph" style="text-align:left;">Similarly, when I asked ChatGPT to impersonate a New York Times journalist, I knew it was a farce. However, if I went on to publish that snippet and tried to pass it off as an authentic piece of NYTimes journalism OpenAI <i>might</i> be found liable for <i>contributory </i>trademark infringement.</p><h1 class="heading" style="text-align:left;">The Most Likely Defense: Fair Use</h1><p class="paragraph" style="text-align:left;">Most of these lawsuits are in the early stages. Plaintiffs have made official allegations, but official responses from defendants are mostly still pending; defendants’ current filings are mostly about procedural matters such as the relevant jurisdiction.</p><p class="paragraph" style="text-align:left;">One thing we can be sure of is that AI firms are going to make a “fair use” defense. Here’s another snip from OpenAI’s blog:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Training AI models using publicly available internet materials is fair use, as supported by long-standing and widely accepted precedents. We view this principle as fair to creators, necessary for innovators, and critical for US competitiveness.</p><figcaption class="blockquote__byline"><a class="link" href="https://openai.com/blog/openai-and-journalism?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">OpenAI Blog Post</a></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Fair use is a legal doctrine that allows the use of copyright-protected material without a license under certain conditions. Fair use is generally <i>not </i>a clear-cut decision. There isn’t a simple, objective test that can be applied. It&#39;s a judgment call made on a case-by-case basis. Additionally, there are two versions of fair use, one for copyright and one for trademark.</p><p class="paragraph" style="text-align:left;">For copyright, there are four guiding principles that judges and juries use to determine if a use is fair:</p><p class="paragraph" style="text-align:left;"><b>1) The purpose and character of the use, including whether the use is of a commercial nature or is for nonprofit educational purposes. </b></p><p class="paragraph" style="text-align:left;">Educational and other non-profit uses are more likely to be fair. So are “transformational” uses, which present the copyright-protected work in new and original ways. </p><p class="paragraph" style="text-align:left;">For example, copying images for the purpose of creating a searchable index and presenting search users with thumbnails has been found to be “transformative” and covered under fair use. Similar rulings have been made about <a class="link" href="https://www.theatlantic.com/technology/archive/2015/10/fair-use-transformative-leval-google-books/411058/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">copying and displaying the text of books</a> as part of a search index.</p><p class="paragraph" style="text-align:left;">AI firms will certainly highlight the “transformative” nature of the training process as part of a fair use defense. Taking a book and applying complex mathematics to turn that text into the numeric parameters that power a model is plainly transformative. </p><p class="paragraph" style="text-align:left;">However, you could make a similar argument about compression algorithms. A compressed file is “transformative” in that the compressed data doesn’t remotely resemble the original. But you can use the compressed data to get a perfect copy of the original, so how transformative is it really? </p><p class="paragraph" style="text-align:left;">Models have repeatedly reproduced verbatim copies of their training data. Indeed, some computer scientists are explicitly using ML models as a <a class="link" href="https://arxiv.org/abs/2202.06533?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">replacement for compression</a>. </p><p class="paragraph" style="text-align:left;">To me, ML models are clearly somewhat transformative in nature. Most content produced by generative AI is not a clone or near-clone of the training data. How much that matters will depend on the other three factors.</p><p class="paragraph" style="text-align:left;"><b>2) The nature of the copyrighted work.</b></p><p class="paragraph" style="text-align:left;">Copyright’s purpose is to encourage creative expression. As such, more creative works generally enjoy more protection. For example, copying a bullet list of facts from a textbook is more likely to be considered fair use than copying a paragraph from a novel. This is because repeating facts in a list isn’t particularly creative.</p><p class="paragraph" style="text-align:left;">Original artwork and novels are both a lot more “creative” than journalism, which is mostly comprised of facts. But even for “less creative” content such as journalism, defendants may struggle with this argument: The real value of a language generator is that it writes like a human, not that it reliably regurgitates facts. It is the <i>creative aspect </i>of producing journalism that is valuable to OpenAI moreso than the factual content of the writing.</p><p class="paragraph" style="text-align:left;"><b>3) The amount and substantiality of the portion used in relation to the copyrighted work as a whole.</b></p><p class="paragraph" style="text-align:left;">The less complete the copy, the more likely it is covered by fair use. Quoting a paragraph from a novel as part of a presentation about writing styles is likely to be fair use. Copying an entire short story is less likely to be fair use.</p><p class="paragraph" style="text-align:left;">AI firms are copying entire books, articles, images, and more in huge quantities.</p><p class="paragraph" style="text-align:left;"><b>4) The effect of the use upon the potential market for or value of the copyrighted work.</b></p><p class="paragraph" style="text-align:left;">In a nod to this component of fair use, The Authors Guild argues that “OpenAI’s willful copying thus makes Plaintiffs’ works into engines of their own destruction.” And they might have a good point.</p><p class="paragraph" style="text-align:left;">Can Stability AI credibly claim that Stable Diffusion isn’t a direct competitor to stock image firms like Getty? Absolutely not. Many writers — myself included — are already using AI generators as a wholesale replacement for stock images and illustrations. </p><p class="paragraph" style="text-align:left;">Can OpenAI credibly claim that GPT-4 doesn’t make original writing less valuable? Unlikely. Copywriting firms and ad agencies are already laying off writers and leaning into AI tools. </p><p class="paragraph" style="text-align:left;">Remember, this is about the effect on the <i>value of the copyrighted work </i>not about the business model of the copyright holder. Twisted arguments like, “This allows the Times to lay off staff writers and save money by using AI-generated text,” won’t get AI firms out of this pickle.</p><p class="paragraph" style="text-align:left;"><b>For trademarks </b>fair use is (generally) a bit simpler: using a trademark in good faith and in a way that isn’t likely to cause confusion is considered fair use. For example, you can use the word mark “The New York Times” to compare them to your own news organization (nominative fair use) or to describe their products (descriptive fair use). </p><p class="paragraph" style="text-align:left;">False attribution and passing off are not examples of good faith trademark uses.</p><h1 class="heading" style="text-align:left;"><b>The Bottom Line</b></h1><p class="paragraph" style="text-align:left;">These lawsuits represent a substantial risk for AI firms. It’s quite possible that AI operators are going to owe copyright holders <i>a lot </i>of money on a retroactive <i>and</i> ongoing basis. Those licensing deals have the potential to fundamentally change the economics of producing AI models, which are already <i>very </i>expensive to build, train, and maintain. </p><p class="paragraph" style="text-align:left;">AI tools obviously stand to deliver a lot of value; which is why they’ve been adopted so quickly and widely. I don’t think the big firms are going to fold under the weight of these copyright issues, but I do expect AI tools to get more expensive if the plaintiffs win many of their claims.</p></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">The News Quiz</h1><p class="paragraph" style="text-align:left;"><i>Every month we challenge ourselves to tie the lesson to the news. Answers are at the end of this newsletter.</i></p><p class="paragraph" style="text-align:left;">Rank the following situations from most to least likely to be considered a “fair use.” For bonus points identify which parties could be sued by the rights holder:</p><p class="paragraph" style="text-align:left;">1) In a lesson about plagiarism a college professor has students try to get a language model to produce the first chapter of George R. R. Martin’s <i>Game of Thrones</i>.</p><p class="paragraph" style="text-align:left;">2) A paid newsletter writer uses an image generator to produce an image for their next newsletter. Unbeknownst to the writer, the image is nearly identical to a copy-protected image.</p><p class="paragraph" style="text-align:left;">3) An artist uses an image generator to make a comic panel that puts Donald Trump’s iconic hair on the head of the video game character Bowser, who says, “It’s a witch hunt — I never did anything to Princess Peach!” and sells prints of the image to her fans.</p><p class="paragraph" style="text-align:left;">4) An copywriter uses a language model to draft a webpage comparing the services of two companies. The model uses those companies’ trademarked names to do so, but hallucinates several erroneous details about the services provided. The copywriter does not notice the errors and publishes the webpage.</p></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Themes in the News</h1><h3 class="heading" style="text-align:left;">Generative AI in Politics</h3><p class="paragraph" style="text-align:left;">The predictions came true: Generative AI has entered the political fray.</p><p class="paragraph" style="text-align:left;">There were two fairly big stories recently: Falsified audio of President Joe Biden’s voice was used to <a class="link" href="https://www.nbcnews.com/politics/2024-election/fake-joe-biden-robocall-tells-new-hampshire-democrats-not-vote-tuesday-rcna134984?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">encourage people not to vote</a> in the New Hampshire primary. And OpenAI <a class="link" href="https://www.washingtonpost.com/technology/2024/01/20/openai-dean-phillips-ban-chatgpt/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">banned a ChatGPT-based bot built</a> to imitate Democratic presidential hopeful Dean Philips. It was OpenAI’s first ban for violating the political speech component of their terms of service. </p><p class="paragraph" style="text-align:left;">In truth, Generative AI will be used for much more than fakery this election cycle. From bespoke email campaigns to digital advertisements Generative AI “<a class="link" href="https://www.theatlantic.com/technology/archive/2024/01/ai-elections-deepfakes-biden-robocall/677308/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">hasn’t changed the fundamentals; it’s just lowered the production costs of creating content, whether or not intended to deceive.</a>”</p><h3 class="heading" style="text-align:left;">AI-Generated Pornographic Images of Taylor Swift </h3><p class="paragraph" style="text-align:left;">Fake images of Taylor Swift <a class="link" href="https://www.404media.co/ai-generated-taylor-swift-porn-twitter/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">were widely propagated on X</a>. Reporting by 404 Media found the images likely originated in encrypted chat rooms on Telegram, and then found their way to 4Chan, X, and other sites. Like many things T-Swift, the images gained widespread attention, which caused advocates and even Congress to take an interest.</p><p class="paragraph" style="text-align:left;">According to <a class="link" href="https://www.techpolicy.press/taylor-swift-shows-us-whats-coming-next-in-gender-and-tech-and-advocates-should-be-concerned/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">reporting by Tech Press Policy</a>, nearly 96% of deepfake images online are pornographic in nature and the total number of deepfake images online increased 550% between 2019 and 2023. </p><p class="paragraph" style="text-align:left;">In classic, clunky style, X responded by <a class="link" href="https://www.ft.com/content/0636eb58-eaa3-4d2c-ba22-e1a24c85da3f?mc_cid=e1eb2e1281&mc_eid=3d2cb4e22f&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">wholesale blocking searches for Taylor Swift</a> for a while in an attempt to reduce the reach of the images.</p><h3 class="heading" style="text-align:left;">Teb’s Tidbits</h3><ul><li><p class="paragraph" style="text-align:left;">Audio-based ML is being used to <a class="link" href="https://themarkup.org/hello-world/2024/01/20/how-to-use-sound-and-ai-to-protect-the-environment?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">monitor the health of forests</a>.<br></p></li><li><p class="paragraph" style="text-align:left;">Combining LLMs with old-school symbolic engines has led to <a class="link" href="https://www.technologyreview.com/2024/01/17/1086722/google-deepmind-alphageometry/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">more mathematically competent AI</a>.<br></p></li><li><p class="paragraph" style="text-align:left;">Amazon will <a class="link" href="https://www.theregister.com/2024/01/25/amazon_ring_sounds_death_knell/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">start requiring warrants</a> for police to access Ring doorbells’ footage.</p></li></ul></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Answers To The News Quiz</h1><p class="paragraph" style="text-align:left;"><i>I’ve ranked the scenarios with most likely to be fair use on top to least likely on bottom. I’ve kept their original numbering and description. </i></p><p class="paragraph" style="text-align:left;">1) In a lesson about plagiarism a college professor has students try to get a language model to produce the first chapter of George R. R. Martin’s <i>Game of Thrones</i>.</p><p class="paragraph" style="text-align:left;"><i>The nature of this use is educational, non-profit, and explicitly about plagiarism in an academic context. Moreover, no one actually published their potentially infringing works. This is likely to be fair use. </i></p><p class="paragraph" style="text-align:left;"><i>If anyone is liable, it could be the language model’s creators. It would also depend on whether Game of Thrones was in the training data, and potentially on whether or not anyone successfully produced verbatim or near-verbatim copies of the work.</i></p><p class="paragraph" style="text-align:left;">3) An artist uses an image generator to make a comic panel that puts Donald Trump’s iconic hair on the head of the video game character Bowser, who says, “It’s a witch hunt — I never did anything to Princess Peach!” and sells prints of the image to her fans.</p><p class="paragraph" style="text-align:left;"><i>The use is commercial. Bowser and Princess Peach are trademarked characters and their likeness is copyright protected. Courts may consider the commercial nature of the product and potential damage to Nintendo’s brand by being associated with a political message. However, this is a clear example of parody and political speech and will likely be covered under fair use. </i></p><p class="paragraph" style="text-align:left;"><i>If it is not considered fair use the artist could be found liable for direct infringement and the image generator’s creator could be held liable for contributory infringement.</i></p><p class="paragraph" style="text-align:left;">4) A copywriter uses a language model to draft a webpage comparing the services of his company to another company. The model uses both companies’ trademarked names and hallucinates several erroneous details about the services provided by his competitor’s firm. The copywriter does not notice the errors and publishes the webpage.</p><p class="paragraph" style="text-align:left;"><i>While comparison of this variety is protected under fair use, lying about your competitor while using their trademark generally is not. Had the model produced only accurate comparisons, this would likely be fair use. But given the hallucinations, this situation is probably not protected.</i></p><p class="paragraph" style="text-align:left;"><i>The writer could be liable for trademark infringement, the writer’s company could be liable for “</i><i><a class="link" href="https://www.law.cornell.edu/wex/vicarious_infringement?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">vicarious</a></i><i>” infringement, and the language models creator could be liable for contributory infringement.</i></p><p class="paragraph" style="text-align:left;">2) A paid newsletter writer uses an image generator to produce an image for their next newsletter. Unbeknownst to the writer, the image is nearly identical to a copy-protected image.</p><p class="paragraph" style="text-align:left;"><i>Ignorance is generally not an excuse under the law. This isn’t likely to be considered fair use. The more identical the images, the more likely this is to be infringement.</i></p><p class="paragraph" style="text-align:left;"><i>The writer could be liable for direct infringement and the image generator’s creator for contributory infringement.</i></p></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Remember…</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The Lab Report is free and doesn’t even advertise. Our curricula is open source and published under a public domain license for <i>anyone </i>to use for <i>any purpose. </i>We’re also a very small team with no investors.</p><p class="paragraph" style="text-align:left;">Help us keep providing these free services by scheduling one of our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">world class trainings </a>or requesting a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=intellectual-property-vs-artificial-intelligence" target="_blank" rel="noopener noreferrer nofollow">custom class</a> for your team.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></div></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=11ee3586-fa22-4b84-bb0b-07abd6924f82&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>What Does it Mean to Block GPTBot? </title>
  <description>News: Moderation, Censorship, and Section 230</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/af7b81b0-863a-43d4-aa0d-eaf8afd6ca09/img-hQ4SfYkToesVlAqjYSItOfu0.png" length="1869094" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/mean-block-gptbot</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/mean-block-gptbot</guid>
  <pubDate>Fri, 01 Sep 2023 09:00:00 +0000</pubDate>
  <atom:published>2023-09-01T09:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><h1 class="heading" style="text-align:left;">The Weekly Lab Report</h1><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I’m Tyler Elliot Bettilyon (Teb) and this is the Lab Report: Our goal is to deepen your understanding of software and technology by explaining the concepts behind the news. <br><br>If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. <br><br>If you like what you’re reading you’ll love one of our classes. <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">Schedule a training</a> from our catalog or request a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">custom class consultation</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">From The Lab</h1><p class="paragraph" style="text-align:left;">We’ve expanded our open enrollment offerings with <a class="link" href="https://www.devsprout.io/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">DevSprout</a>! Initially we planned to offer an introductory Python class, but we’ve decided to offer an introductory SQL class as well. The outlines for these classes can be found here:</p><ul><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://docs.google.com/document/d/1fuI5GvIWA2P9NNuGCV2pnsAUFIR85P0EJYHFi_PnmSo/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">Intro to SQL</a></p></li><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://docs.google.com/document/d/1NJF05m9PBYfvUVMKDrhzvysiIlRjD_NDwAMAeiOSLdQ/edit?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot#heading=h.e8qbg9ap8g15" target="_blank" rel="noopener noreferrer nofollow">Intro to Python</a></p></li></ul><p class="paragraph" style="text-align:left;">The Intro to SQL start has a tentative start date of September 18, we will meet twice a week, Mondays and Thursdays, for 4 weeks. Each session will meet over Zoom for two hours, from 5:30pm - 7:30pm US Pacific Time. <b>The cost is $100 for the first 10 people who sign up</b>, $200 after that. We cap classes at 25 attendees to ensure a high quality virtual classroom experience. <a class="link" href="https://github.com/Tebs-Lab/intro-to-sql/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">Browse the curriculum on Github</a>.</p><p class="paragraph" style="text-align:left;"><b>Respond to this email to reserve your spot in class</b>. An official enrollment portal is forthcoming, but not available yet.</p><p class="paragraph" style="text-align:left;">The Python class still needs a little curriculum development so our start date is TBA. Today’s edition of the newsletter is, in part, a preview of something you’ll learn in our Intro to Python course: web scraping. </p><p class="paragraph" style="text-align:left;">P.S. I’m traveling next week, so there won’t be a Lab Report.</p></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Today’s Lesson</h1><div class="custom_html"><span id="lesson"></span></div><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">All the code from today’s lesson can be <a class="link" href="https://github.com/Tebs-Lab/lab-report-code/tree/main/09-01-2023?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">viewed on Github</a></p><figcaption class="blockquote__byline"></figcaption></blockquote></div><h2 class="heading" style="text-align:left;">Crawling in The Web, Looking For The Data</h2><p class="paragraph" style="text-align:left;"><span style="font-size:0.6rem;">Yes, that’s a </span><span style="font-size:0.6rem;"><a class="link" href="https://genius.com/Hoobastank-crawling-in-the-dark-lyrics?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">Hoobastank</a></span><span style="font-size:0.6rem;"> reference. You’re welcome.</span></p><p class="paragraph" style="text-align:left;">Today’s topic is brought to you by several news <a class="link" href="https://www.theguardian.com/technology/2023/aug/25/new-york-times-cnn-and-abc-block-openais-gptbot-web-crawler-from-scraping-content?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">organizations who’ve started blocking OpenAI’s web crawler from their websites</a>, and by me because it’s one of the topics in my Introduction to Python class with <a class="link" href="https://www.devsprout.io/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">DevSprout</a>.</p><p class="paragraph" style="text-align:left;">Web crawling and scraping have been around since 1993, when the first web crawler was built in an attempt to <a class="link" href="https://soft-surge.com/a-brief-history-of-web-crawlers/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">measure the size of the then-nascent world wide web</a>. Then, search engines built their own crawlers in order to identify, analyze, and rank websites. Flash forward to today, where crawlers and other bots comprise a <a class="link" href="https://www.statista.com/statistics/1264540/human-and-bot-web-traffic-share-industry/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">huge portion of web traffic</a>.</p><p class="paragraph" style="text-align:left;">“Crawling” generally refers to automated systems that make web requests to web pages, identify links on those pages, then follow those links and repeat the process recursively. Crawlers often extract additional information from each page they visits.</p><p class="paragraph" style="text-align:left;">Periodically, web crawling has been a hot-button issue. Once upon a time Linkedin filed a lawsuit — <a class="link" href="https://www.zdnet.com/article/court-rules-that-data-scraping-is-legal-in-linkedin-appeal/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">which they ultimately lost</a> — alleging that hiQ Labs was illegally scraping LinkedIn’s publicly available user data. <a class="link" href="https://www.zdnet.com/article/court-rules-that-data-scraping-is-legal-in-linkedin-appeal/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">LinkedIn ultimately lost that case</a>. Clearview AI, a company that sells facial recognition software, <a class="link" href="https://www.nytimes.com/2022/05/09/technology/clearview-ai-suit.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">settled a lawsuit brought by the ACLU</a> related to their massive photo database which was largely harvested by web crawlers. Now, OpenAI has announced <a class="link" href="https://platform.openai.com/docs/gptbot?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">GPTBot</a>, which crawls the web to collect training data, apparently to the <a class="link" href="https://arstechnica.com/information-technology/2023/08/openai-details-how-to-keep-chatgpt-from-gobbling-up-website-data/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">chagrin of news organizations and other copyright holders</a> who are scrambling to block the bot.</p><p class="paragraph" style="text-align:left;">Despite the occasional controversy, web crawling is common, <a class="link" href="https://techcrunch.com/2022/04/18/web-scraping-legal-court/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">generally legal</a>, and often quite simple. For example, here’s a few lines of Python that prints all the links on Abe Lincoln’s Wikipedia page:</p><div class="codeblock"><pre><code>from bs4 import BeautifulSoup
import requests

page = requests.get(&quot;https://en.wikipedia.org/wiki/Abraham_Lincoln&quot;) 
soup = BeautifulSoup(page.content, &#39;html.parser&#39;)
 
all_links = soup.find_all(&#39;a&#39;)

for link in all_links:
    print(link.get_text(), link.attrs.get(&#39;href&#39;))</code></pre></div><p class="paragraph" style="text-align:left;">(<a class="link" href="https://github.com/Tebs-Lab/lab-report-code/blob/main/09-01-2023/crude_get_wiki_links.py?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">View this code on Github for better syntax highlighting</a>)</p><p class="paragraph" style="text-align:left;">It’s crude, but these 7 lines of code are the basis of crawling. It gathers all the links on the page by finding all the ‘a’ tags (<a class="link" href="https://developer.mozilla.org/en-US/docs/Web/HTML/Element/a?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">short for anchor, HTML’s standard tag for hyperlinks</a>). To really “crawl” you can put those links in a queue, and repeat the process until the queue is empty. </p><p class="paragraph" style="text-align:left;">Sophisticated crawlers also need to handle special cases such as links you’ve already visited and websites that use Javascript on the front end to populate the content sometime after the initial page load. </p><p class="paragraph" style="text-align:left;">Copying content from pages adds a bit more complexity, but not too much. Here’s a crude approximation of what Clearview AI did. Mine harvests images from Wikipedia — rather than LinkedIn — and saves the images to files:</p><div class="codeblock"><pre><code>from bs4 import BeautifulSoup
import requests
import shutil

base_url = &quot;https://en.wikipedia.org&quot;
page = requests.get(&quot;https://en.wikipedia.org/wiki/Abraham_Lincoln&quot;) 
soup = BeautifulSoup(page.content, &#39;html.parser&#39;)
 
all_img_tags = soup.find_all(&#39;img&#39;)

img_count = 0
for img in all_img_tags:
    img_url = img.attrs.get(&#39;src&#39;)
    
    # Images on wikipedia have two cases
    if img_url.startswith(&#39;//&#39;):
        absolute_url = f&#39;https:&#123;img_url&#125;&#39;
    else:
        absolute_url = base_url + img_url

    response = requests.get(absolute_url, stream=True)
    file_type = response.headers[&#39;content-type&#39;].split(&#39;/&#39;)[-1] # kinda gross, but works.
    with open(f&#39;09-01-2023/img_out/&#123;img_count&#125;.&#123;file_type&#125;&#39;, &#39;wb&#39;) as out_file:
        shutil.copyfileobj(response.raw, out_file)
    img_count += 1</code></pre></div><p class="paragraph" style="text-align:left;">(<a class="link" href="https://github.com/Tebs-Lab/lab-report-code/blob/main/09-01-2023/extract_wiki_images.py?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">View this code on Github for a more beautiful experience</a>)</p><p class="paragraph" style="text-align:left;">GPTBot is designed to scrape the text of news articles from various news websites. Each website operator makes decisions about how to format and present that text, which means GPTBot’s operators need specialized code that extracts this text on a per-website basis. They also probably want to ignore advertisements, image captions, and other ancillary text that might appear.</p><p class="paragraph" style="text-align:left;">I made two scrapers to demonstrate that this is also, generally, quite easy. Here’s one that gets the body of CNN news articles:</p><div class="codeblock"><pre><code>from bs4 import BeautifulSoup
import requests

# A randomly selected CNN article from the day I wrote this script.
page = requests.get(&quot;https://www.cnn.com/2023/08/30/business/san-francisco-union-square-retail-closures/index.html&quot;)

soup = BeautifulSoup(page.content, &#39;html.parser&#39;)
individual_p_tags = soup.select(&#39;.article__content p&#39;)  # CNN&#39;s content sits in p tags under a div with this class
texts = [tag.text.strip() for tag in individual_p_tags]
a_text = &#39;\n&#39;.join(texts)

with open(&#39;09-01-2023/news_text_out/cnn_article.txt&#39;, &#39;w&#39;) as file:
    file.write(a_text)</code></pre></div><p class="paragraph" style="text-align:left;">(<a class="link" href="https://github.com/Tebs-Lab/lab-report-code/blob/main/09-01-2023/parse_cnn_body.py?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">Again, the code looks better on Github</a>)</p><p class="paragraph" style="text-align:left;">And one for the BBC:</p><div class="codeblock"><pre><code>from bs4 import BeautifulSoup
import requests

# A randomly selected BBC article from the day I wrote this script.
page = requests.get(&quot;https://www.bbc.com/sport/football/66662060&quot;)
soup = BeautifulSoup(page.content, &#39;html.parser&#39;)

# BBC wraps the main body in a div with this class, but uses p&#39;s for the text
individual_p_tags = soup.select(&#39;.story-body p&#39;)
texts = [tag.text for tag in individual_p_tags]
a_text = &#39;\n&#39;.join(texts)

with open(&#39;09-01-2023/news_text_out/bbc_article.txt&#39;, &#39;w&#39;) as file:
    file.write(a_text)</code></pre></div><p class="paragraph" style="text-align:left;">(<a class="link" href="https://github.com/Tebs-Lab/lab-report-code/blob/main/09-01-2023/parse_bbc_body.py?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">Github is 1000x better at displaying this code than my newsletter provider</a>)</p><p class="paragraph" style="text-align:left;">My point is that harvesting data from websites is remarkably easy, which makes it appealing to the <a class="link" href="https://epochai.org/blog/trends-in-training-dataset-sizes?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">increasingly data-hungry ML industry</a>.</p><h2 class="heading" style="text-align:left;">How Do Companies Stop This?</h2><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/ce0e52bd-5d3f-4a1d-836a-12b025ad1366/neatpart.jpg"/></div><p class="paragraph" style="text-align:left;">The bottom line is that if human users can access data using a web browser without some kind of authentication then a bot can too. Web crawlers can be designed such that they are basically indistinguishable to the companies’ servers from a human using a browser. But there are a few options that companies often use to reduce or manage bot traffic.</p><h3 class="heading" style="text-align:left;">Robots.txt</h3><p class="paragraph" style="text-align:left;">The first option is a <a class="link" href="https://developers.google.com/search/docs/crawling-indexing/robots/intro?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot#:~:text=A%20robots.txt%20file%20tells,or%20password%2Dprotect%20the%20page." target="_blank" rel="noopener noreferrer nofollow">robots.txt file</a>. These files are more of a request than a defense. Websites use these files to indicate to web crawlers and other bots that certain pages shouldn’t be indexed, viewed, or otherwise accessed by bots. When OpenAI announced GPTBot, they also added <a class="link" href="https://platform.openai.com/docs/gptbot?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">documentation for how to modify your site’s robots.txt</a> to prevent GPTBot from accessing certain pages or directories. </p><p class="paragraph" style="text-align:left;">When you read that “news organizations are scrambling to block GPTBot” what they mean is that websites are updating their robots.txt file, which is a totally standard and very easy thing for a web developer to do.</p><p class="paragraph" style="text-align:left;">Unfortunately, robots.txt only works for bots that choose to respect the specified rules.</p><h3 class="heading" style="text-align:left;">Honeypots</h3><p class="paragraph" style="text-align:left;">Some websites host fake content on a particular URL, then add a line in their robots.txt explicitly banning access to that URL. If anything accesses that URL, sites ban that IP address from accessing the site at all. <a class="link" href="https://www.madirish.net/194?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">Here’s a write up describing an implementation of this tactic</a>.</p><p class="paragraph" style="text-align:left;">With the ubiquity of VPNs, proxy servers, and other ways to for malicious actors get a new IP address, this can become a game of whack a mole if your adversary is tenacious. Additional <a class="link" href="https://pixelprivacy.com/resources/browser-fingerprinting/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">user fingerprinting tactics</a> can help somewhat, but are not surefire.</p><h3 class="heading" style="text-align:left;">Authentication and reCAPTCHA </h3><p class="paragraph" style="text-align:left;">I’m sure most of you have done a <a class="link" href="https://support.google.com/recaptcha/?hl=en&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">reCAPTCHA</a>, either by clicking a checkmark, finding the crosswalks in an image, or some similar task. This is a technology built by Google specifically to prevent bots from accessing specific content by hiding that content behind the task.</p><p class="paragraph" style="text-align:left;">This process introduces a minor headache for human users, but a <i>major </i>hurdle for bots. Some bots can likely perform some of the reCAPTCHA tasks some of the time, but it makes the bot operators job <i>much</i> harder.</p><p class="paragraph" style="text-align:left;">Classic authentication is even better for two reasons. First, signing up and gaining authentication credentials is often a multi-step process that involves an email, a text message, or some other second factor authentication. This process further complicates the automation process. Second, if someone who is authenticated starts behaving like a bot, banning that user is more effective than a simple IP ban, because they’ll have to repeat the signup flow.</p><p class="paragraph" style="text-align:left;">While there is no 100% surefire way to prevent people from scraping your online content, a robots.txt, honeypots, authentication, and constant vigilance can substantially reduce successful scraping.</p></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">The News Quiz</h1><p class="paragraph" style="text-align:left;"><i>Every week we challenge ourselves to tie the lesson to the news. Answers are at the end of this newsletter.</i></p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/af7b81b0-863a-43d4-aa0d-eaf8afd6ca09/img-hQ4SfYkToesVlAqjYSItOfu0.png"/><div class="image__source"><span class="image__source_text"><p>A spider crawling on a newspaper.</p></span></div></div><p class="paragraph" style="text-align:left;">Web crawling and scraping can be messy. The fact that OpenAI is doing it suggests (at least to me) they can’t get enough high quality training data from more traditional sources (such as “<a class="link" href="https://huggingface.co/datasets/the_pile_books3?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">The Pile</a>” and other <a class="link" href="https://hackernoon.com/10-best-hugging-face-datasets-for-building-nlp-models?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">open NLP datasets</a>). Classify the following issues that can arise <i>specifically from scraping the text of news articles</i> as <b>high risk</b>,<b> medium risk</b>,<b> </b>or<b> low risk </b>from the perspective of a company training a large language model (LLM):</p><ul><li><p class="paragraph" style="text-align:left;">Accidentally capturing pieces of text that aren’t part of the article, such as advertisements, image captions, pull quotes, embedded links to other articles, etc.<br></p></li><li><p class="paragraph" style="text-align:left;">Pulling in <a class="link" href="https://en.wikipedia.org/wiki/Native_advertising?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">native advertising</a> content.<br></p></li><li><p class="paragraph" style="text-align:left;">Pulling in articles that were themselves written by an LLM.<br></p></li><li><p class="paragraph" style="text-align:left;">Consuming articles with factual errors before they’ve been corrected.<br></p></li><li><p class="paragraph" style="text-align:left;">Incorporating copyright protected content into your training data.<br></p></li><li><p class="paragraph" style="text-align:left;">Increased cost incurred from the actual process of scraping the data.</p></li></ul></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Themes in the News</h1><h3 class="heading" style="text-align:left;">Censorship, Moderation, and Section 230 </h3><p class="paragraph" style="text-align:left;">Lots of recent buzz around these perennial topics. </p><p class="paragraph" style="text-align:left;">A judge <a class="link" href="https://www.washingtonpost.com/technology/2023/08/24/rnc-google-lawsuit-dismissed-spam-filter/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">dismissed a lawsuit</a> brought by the Republican National Committee (RNC) against Google. The lawsuit alleged that Google’s spam filter was biased against Republican candidates’ and officials’ emails. U.S. District Court Judge Daniel Calabretta concluded that the RNC had not “sufficiently pled that Google acted in bad faith.” The RNC plans to refile the lawsuit with an amended complaint.</p><p class="paragraph" style="text-align:left;">Meanwhile, the Biden administration <a class="link" href="https://www.nytimes.com/2023/08/14/us/supreme-court-social-media-texas-florida.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">urged the Supreme Court</a> to take on cases related to laws in Texas and Florida that substantially limit social media companies ability to perform moderation. The Florida law, for example, imposes fines on social media platforms if they “refuse to transmit” a politicians post, regardless of whether that post violates the company’s content policy.</p><p class="paragraph" style="text-align:left;">X (formerly Twitter), Meta, and YouTube have all indicated they will <a class="link" href="https://www.washingtonpost.com/technology/2023/08/25/political-conspiracies-facebook-youtube-elon-musk/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">decrease their moderation efforts regarding misinformation</a>. In part this is probably because platforms have realized it’s really hard to do this well. Many of them fumbled the <a class="link" href="https://www.readtangle.com/hunter-biden-laptop-new-york-times/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">Hunter Biden laptop story</a>, suppressing the NYPost’s original article that turned out to be totally real. The “<a class="link" href="https://www.readtangle.com/twitter-files-pt-2-3-4-and-5/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">Twitter files</a>” should also cast doubt on platforms ability and willingness to be neutral and trustworthy.</p><p class="paragraph" style="text-align:left;">At the same time, these platforms have an ongoing and increasingly adversarial relationship with news organizations, including various efforts to <a class="link" href="https://www.washingtonpost.com/technology/2023/08/24/canada-fires-ca-floods-meta-x-news/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">deprioritize news content on their sites</a>. Many people have come to rely on social media as a news aggregator. Between reduced moderation and deprioritization of legitimate news content, there’s a major void that is being filled with <a class="link" href="https://www.bbc.com/news/technology-64804007?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">trolls</a>, <a class="link" href="https://www.technologyreview.com/2023/04/04/1070938/we-are-hurtling-toward-a-glitchy-spammy-scammy-ai-powered-internet/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">trashy AI generated content</a>, and other kinds of misinformation.</p><p class="paragraph" style="text-align:left;">A report from Rest Of World this week highlighted a fairly predictable outcome of backing off moderation policies as X has done under Elon Musk’s leadership: scams become more prevalent. In this case, <a class="link" href="https://restofworld.org/2023/chinese-sextortion-scammers-are-flooding-twitter/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">sextortion scams targeting prominent Chinese figures on the platform</a>.</p><p class="paragraph" style="text-align:left;">So here we are, stuck between platforms that can’t be trusted to moderate particularly well and the free-for-all of the internet that can’t be trusted at all. </p><p class="paragraph" style="text-align:left;">Is it good news that OpenAI thinks <a class="link" href="https://www.platformer.news/p/openai-wants-to-moderate-your-content?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">ChatGPT will soon be able to </a><a class="link" href="https://www.platformer.news/p/openai-wants-to-moderate-your-content?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">moderate</a> on social sites?</p><h3 class="heading" style="text-align:left;">Teb’s Tidbits</h3><ul><li><p class="paragraph" style="text-align:left;">Ben Evans write a <a class="link" href="https://www.ben-evans.com/benedictevans/2023/8/27/generative-ai-ad-intellectual-property?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">thoughtful piece</a> on the state of Generative AI and copyright.</p><ul><li><p class="paragraph" style="text-align:left;">I like the piece, but he gets something important wrong when he claims generative models never store the training data. Researchers have been able to prompt LLMs to produce exact replicas of lengthy <a class="link" href="https://www.newscientist.com/article/2372140-chatgpt-seems-to-be-trained-on-copyrighted-books-like-harry-potter/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">passages from books</a>, and Getty Images is currently suing Stability AI for their model that <a class="link" href="https://www.theverge.com/2023/2/6/23587393/ai-art-copyright-lawsuit-getty-images-stable-diffusion?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">produced exact replicas of copy protected images</a>. This implies that these systems do store training data, albeit in an obfuscated and compressed way within their internal parameters. This fact is important in the copyright discussion, since consumption by an ML system is not always “<a class="link" href="https://www.copyright.gov/fair-use/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">transformative</a>” in the legal sense. <br></p></li></ul></li><li><p class="paragraph" style="text-align:left;">An argument that the UK’s recently unveiled plans to change their surveillance rules would <a class="link" href="https://www.justsecurity.org/87615/changes-to-uk-surveillance-regime-may-violate-international-law/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">violate international law</a>.<br></p></li><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.theverge.com/2023/8/24/23843487/meta-llama-code-generation-generative-ai-llm?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">Meta released Llama Code</a>, an LLM designed specifically to write computer programs.<br></p></li><li><p class="paragraph" style="text-align:left;">A fantastic and grounded <a class="link" href="https://spectrum.ieee.org/self-driving-cars-2662494269?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">risk assessment of self driving cars</a> in IEEE Spectrum.<br></p></li><li><p class="paragraph" style="text-align:left;">And a pleasant surprise: <a class="link" href="https://techcrunch.com/2023/08/23/apple-lends-support-to-california-state-right-to-repair-bill/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">Apple has come out in favor of California’s Right to Repair Act</a>.</p></li></ul></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Answers To The News Quiz</h1><ul><li><p class="paragraph" style="text-align:left;">Accidentally capturing pieces of text that aren’t part of the article, such as advertisements, image captions, pull quotes, embedded links to other articles, etc.</p><ul><li><p class="paragraph" style="text-align:left;"><b>High Risk:</b> if the training data ends up with lots of incongruent articles with random advertising tidbits interspersed with the real text, this could definitely degrade model performance. Cleaning up the scraped text to remove these errors would be worth the hard work.<br></p></li></ul></li><li><p class="paragraph" style="text-align:left;">Pulling in <a class="link" href="https://en.wikipedia.org/wiki/Native_advertising?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">native advertising</a> content.</p><ul><li><p class="paragraph" style="text-align:left;"><b>Low Risk: </b>Native advertising is still a perfectly legitimate use of language even if it’s not particularly high-brow.<br></p></li></ul></li><li><p class="paragraph" style="text-align:left;">Pulling in articles that were themselves written by an LLM.</p><ul><li><p class="paragraph" style="text-align:left;"><b>High Risk: </b>It’s well established that training <a class="link" href="https://venturebeat.com/ai/the-ai-feedback-loop-researchers-warn-of-model-collapse-as-ai-trains-on-ai-generated-content/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">LLMs on text created by LLMs degrades model performance</a>. Making matters worse current methods for discerning if a particular article is AI generated are <a class="link" href="https://www.technologyreview.com/2023/02/07/1067928/why-detecting-ai-generated-text-is-so-difficult-and-what-to-do-about-it/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">not reliable at all</a>.<br></p></li></ul></li><li><p class="paragraph" style="text-align:left;">Consuming articles with factual errors before they’ve been corrected.</p><ul><li><p class="paragraph" style="text-align:left;"><b>Low Risk: </b>LLM’s ‘<a class="link" href="https://machinelearningmastery.com/a-gentle-introduction-to-hallucinations-in-large-language-models/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">hallucinate</a>’ all the time, even if they’ve been trained exclusively on factually accurate data. LLM’s learn to recreate patterns in writing mostly related to word order, not specific facts from pieces of writing.<br></p></li></ul></li><li><p class="paragraph" style="text-align:left;">Incorporating copyright protected content into your training data.</p><ul><li><p class="paragraph" style="text-align:left;"><b>Medium to High Risk</b>: Depending on the outcome of some <a class="link" href="https://techcrunch.com/2023/01/27/the-current-legal-cases-against-generative-ai-are-just-the-beginning/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">pending lawsuits</a> this might end up being a very big risk or something that can be easily addressed with a license agreement, royalty scheme, or something similar. <br></p></li></ul></li><li><p class="paragraph" style="text-align:left;">Increased cost incurred from the actual process of scraping the data.</p><ul><li><p class="paragraph" style="text-align:left;"><b>Low Risk: </b>scraping is generally cheap and easy to perform.</p></li></ul></li></ul></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Remember…</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The Lab Report is free and doesn’t even advertise. Our curricula is open source and published under a public domain license for <i>anyone </i>to use for <i>any purpose. </i>We’re also a very small team with no investors.</p><p class="paragraph" style="text-align:left;">Help us keep providing these free services by scheduling one of our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">world class trainings </a>or requesting a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=what-does-it-mean-to-block-gptbot" target="_blank" rel="noopener noreferrer nofollow">custom class</a> for your team.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></div></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=83b2159a-6c14-487f-a2ef-559debb58559&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>How Does ChatGPT &quot;Understand&quot; Words?</title>
  <description>It&#39;s numbers all the way down...</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/20f83343-aafb-43dc-bde8-96be90210a6f/EmbeddingExample.png" length="52870" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/chatgpt-understand-words</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/chatgpt-understand-words</guid>
  <pubDate>Fri, 25 Aug 2023 09:00:00 +0000</pubDate>
  <atom:published>2023-08-25T09:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><h1 class="heading" style="text-align:left;">The Weekly Lab Report</h1><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I’m Tyler Elliot Bettilyon (Teb) and this is the Lab Report: Our goal is to deepen your understanding of software and technology by explaining the concepts behind the news. <br><br>If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. <br><br>If you like what you’re reading you’ll love one of our classes. <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">Schedule a training</a> from our catalog or request a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">custom class consultation</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">From The Lab</h1></div><p class="paragraph" style="text-align:left;">I had classes from 9:00-5:00pm <i>every day </i>this week. Additionally, I have a side gig as a high school debate coach and school started this week in my district. So, today’s edition is significantly shorter than usual. </p><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Today’s Lesson</h1><div class="custom_html"><span id="lesson"></span></div><h2 class="heading" style="text-align:left;">How Do AI&#39;s &quot;Understand&quot; Words?</h2><p class="paragraph" style="text-align:left;">Last week we wrote about “<a class="link" href="https://blog.tebs-lab.com/p/bitter-lesson-moores-law-trade-war-china?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">The Bitter Lesson</a>,” in which Richard Sutton documents the superiority of “general purpose methods” that scale with computation. A practical consequence of this lesson is that modern Natural Language Processing (NLP) is dominated by statistical models called neural networks. Specifically, a type of neural network called a <a class="link" href="https://machinelearningmastery.com/the-transformer-model/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">transformer</a> is the model <i>du jour</i>.</p><p class="paragraph" style="text-align:left;">These models are <i>fundamentally </i>numerical in nature. Each neural network is literally just a math function. </p><p class="paragraph" style="text-align:left;">In fairness to the models, they are <i>immensely complex</i> math functions. OpenAI’s Large Language Model (LLM) GPT-3 has <a class="link" href="https://www.theatlantic.com/technology/archive/2023/03/openai-gpt-4-parameters-power-debate/673290/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">175 billion parameters</a> and GPT-4 is rumored to have ~<i>100</i> <i>trillion</i>. A “parameter” in this context is a number that the model repeatedly changes during the training process. To the extent that a neural network “knows” anything, that knowledge is encoded in these numbers. The parameters are spread across a complex web of mathematical sub-components. But, at the end of the day, these models are still just fancy math functions.</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/4935a0f3-cfd2-4f03-bc68-db3107893208/Screenshot_2023-08-24_at_11.02.11_AM.png"/><div class="image__source"><span class="image__source_text"><p>A representation of the Transformer architecture from the paper that first introduced them: <a class="link" href="https://arxiv.org/pdf/1706.03762.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">https://arxiv.org/pdf/1706.03762.pdf</a></p></span></div></div><p class="paragraph" style="text-align:left;">Words — and natural language in general — are <i>absolutely not </i>numerical or mathematical in nature. Natural language evolved … naturally. Most natural languages have rules, but there are almost always exceptions. Humans, the inventors of natural language, do not internally contextualize words in mathematical terms. But this mismatch hasn’t prevented remarkable progress in the form of ChatGPT, Bard, LLaMA, and other LLMs. So, how do these models reconcile this fundamental mismatch?</p><p class="paragraph" style="text-align:left;">The answer is something called an <b>embedding. </b></p><h2 class="heading" style="text-align:left;">Embeddings</h2><p class="paragraph" style="text-align:left;">Embeddings are a general purpose tactic used to represent some kind of data — words, movies, music — as a vector (i.e. as several numbers). Embeddings can be manually created or be “learned” as part of the training process. Either way, each number in the vector represents some aspect of the thing being embedded. For example, here are three small embeddings manually created by me for encoding TV shows or movies:</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/20f83343-aafb-43dc-bde8-96be90210a6f/EmbeddingExample.png"/><div class="image__source"><span class="image__source_text"><p>A sample of embeddings representing two TV shows and a movie.</p></span></div></div><p class="paragraph" style="text-align:left;">When Pandora first launched they hired several Ph.D. music theorists to manually create high-quality and embeddings for the songs in their catalog. These embeddings included fields like acoustic sonority, minor key tonality, vocal centric aesthetic, and other features of songs that the music theorists thought were strongly connected to people’s taste in music. These embeddings were a core part of Pandora’s recommendation engine. </p><p class="paragraph" style="text-align:left;">When we embed words the each number corresponds to some aspect of the word. Sometimes these are grammatical in nature such as plurality and being a proper noun. Sometimes they are semantic in nature such as having gender implications, association with various emotions, or association with abstract concepts like royalty, nature, or courage.</p><p class="paragraph" style="text-align:left;">For example, the word “queens” is plural, not a proper noun, implies female gender, has a strong relationship with royalty and a weak relationship with nature. </p><p class="paragraph" style="text-align:left;">Depending on the size of the embedding more or less meaning can be encoded. GPT-3 uses an embedding length of <a class="link" href="https://dugas.ch/artificial_curiosity/GPT_architecture.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">12,288 numbers per word</a>, although technically GPT-3 uses “<a class="link" href="https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">tokens</a>” rather than “words” which breaks some words into parts and allows the system to encode tokens for punctuation marks. </p><p class="paragraph" style="text-align:left;">Additionally, for a variety of reasons, GPT-3 (and most LLMs) use embeddings that are learned as part of the training process, rather than manually crafted embeddings like the ones Pandora pioneered. This makes interpreting GPT-3’s embeddings quite difficult, and full of guesswork — but it also allows different neural networks to build embeddings that help with the specific task at hand, which often improves performance.</p><p class="paragraph" style="text-align:left;">Shakespeare’s Juliet philosophically asked “what’s in a word?” </p><p class="paragraph" style="text-align:left;">Well, at least to ChatGPT the answer is clear: A rose represented by the same 12,288 floating point numbers surely smells just as sweet. </p></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Themes in the News</h1><h3 class="heading" style="text-align:left;">More on Moore’s Law</h3><p class="paragraph" style="text-align:left;">A few pieces of news related to <a class="link" href="https://blog.tebs-lab.com/p/bitter-lesson-moores-law-trade-war-china?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">last week’s lesson</a>. </p><p class="paragraph" style="text-align:left;">Driven largely by ML (although <a class="link" href="https://rmi.org/cryptocurrencys-energy-consumption-problem/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">cryptocurrency is also a major contributor</a>) energy use for computation has skyrocketed in recent years. Our chips’ ever increasing capabilities aren’t free. As of right now the <a class="link" href="https://www.economist.com/science-and-technology/2023/08/16/can-computing-clean-up-its-act?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">computation industry uses about as much energy as all of Britain</a>.</p><p class="paragraph" style="text-align:left;">Speaking of Britain, Prime Minister Rishi Sunak wants to buy <a class="link" href="https://www.telegraph.co.uk/business/2023/08/20/sunak-spend-100m-taxpayer-cash-ai-chips-global-race/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">£100 million worth of NVIDIA chips</a> to further the UK’s position in the global AI race. NVIDIA has become an industry leader in AI not just by building top-end hardware but also by building <a class="link" href="https://www.nytimes.com/2023/08/21/technology/nvidia-ai-chips-gpu.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">excellent support to help software engineers</a> get the most out of NVIDIA’s chips.</p><p class="paragraph" style="text-align:left;">Some folks are focusing on ways to reduce or mitigate the energy cost of computation, for example this company wants you to heat your water using a powerful server as part of a <a class="link" href="https://www.technologyreview.com/2023/08/18/1077548/computer-waste-heat/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">distributed compute cluster</a>.</p><h3 class="heading" style="text-align:left;">Teb’s Tidbits</h3><ul><li><p class="paragraph" style="text-align:left;">A district court confirmed the USPTO’s decision that <a class="link" href="https://www.hollywoodreporter.com/business/business-news/ai-works-not-copyrightable-studios-1235570316/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">AI generated content cannot be copyrighted</a>. Though the SCOTUS may ultimately have to weigh in, this is a big deal for the world of generative AI.</p><ul><li><p class="paragraph" style="text-align:left;">Meanwhile, several authors are <a class="link" href="https://www.theatlantic.com/technology/archive/2023/08/books3-ai-meta-llama-pirated-books/675063/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">suing Meta for pirating their books</a>.<br></p></li></ul></li><li><p class="paragraph" style="text-align:left;">A vulnerability in a <a class="link" href="https://www.technologyreview.com/2023/08/21/1078207/sogou-keyboard-app-security-loophole/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">Chinese keyboard app</a> may have exposed an enormous amount of private data. <br></p></li><li><p class="paragraph" style="text-align:left;">Microsoft has been <a class="link" href="https://arstechnica.com/security/2023/08/microsoft-cloud-security-blasted-for-its-culture-of-toxic-obfuscation/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">savaged by commentators over high severity vulnerabilities</a> and Microsoft’s inadequate response to breaches based on those vulnerabilities.<br></p></li><li><p class="paragraph" style="text-align:left;">Maybe we should all be hoping for AI’s that are a <a class="link" href="https://www.technologyreview.com/2023/08/22/1078230/why-we-should-all-be-rooting-for-boring-ai/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">bit more mundane</a>.<br></p></li><li><p class="paragraph" style="text-align:left;">San Francisco recently allowed Cruise to massively <a class="link" href="https://www.nbcnews.com/tech/innovation/self-driving-car-sf-taxi-waymo-cruise-california-vote-cpuc-rcna97540?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">expand their robotaxi offering</a>. This week, following two crashes (one with a firetruck) the city has forced Cruise to cut their <a class="link" href="https://qz.com/cruise-cut-robotaxi-fleet-50-crashes-san-francisco-1850756908?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">cars on the road in half</a>. </p></li></ul></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Remember…</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The Lab Report is free and doesn’t even advertise. Our curricula is open source and published under a public domain license for <i>anyone </i>to use for <i>any purpose. </i>We’re also a very small team with no investors.</p><p class="paragraph" style="text-align:left;">Help us keep providing these free services by scheduling one of our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">world class trainings </a>or requesting a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=how-does-chatgpt-understand-words" target="_blank" rel="noopener noreferrer nofollow">custom class</a> for your team.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></div></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=1496d5c0-2c39-4efd-a4d6-b0dc05354b2b&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>The Bitter Lesson, Moore’s Law, and a Trade War With China</title>
  <description>News: AI in malicious hands</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/dde7d7ce-01f8-4da0-ac68-2a3bf1d1c59d/48-Years-of-Microprocessor-Trend-Data-2.png" length="104133" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/bitter-lesson-moores-law-trade-war-china</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/bitter-lesson-moores-law-trade-war-china</guid>
  <pubDate>Fri, 18 Aug 2023 09:00:00 +0000</pubDate>
  <atom:published>2023-08-18T09:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><h1 class="heading" style="text-align:left;">The Weekly Lab Report</h1><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I’m Tyler Elliot Bettilyon (Teb) and this is the Lab Report: Our goal is to deepen your understanding of software and technology by explaining the concepts behind the news. <br><br>If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. <br><br>If you like what you’re reading you’ll love one of our classes. <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">Schedule a training</a> from our catalog or request a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">custom class consultation</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">From The Lab</h1></div><p class="paragraph" style="text-align:left;">Today I’ll be finishing up the public-facing outline for my upcoming Python class with <a class="link" href="https://www.devsprout.io/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">DevSprout</a>. Several of you indicated interest, so I’ll share that next week.</p><p class="paragraph" style="text-align:left;">This week I taught an <a class="link" href="https://github.com/Tebs-Lab/intermediate-sql-workshop?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">Intermediate SQL</a> course that was a ton of fun, but also took most of my time and energy. I’m also scheduled in the classroom for 32 hours next week. As a result this week and next week’s newsletters will both be shortened.</p><p class="paragraph" style="text-align:left;">Relatedly, we’re hoping to hire a part-time writer to help avoid this situation in the future. If you or someone you know wants to help me write this newsletter, drop me a line at teb@tebs-lab.com.</p><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Today’s Lesson</h1><div class="custom_html"><span id="lesson"></span></div><h2 class="heading" style="text-align:left;">The Bitter Lesson, Moore’s Law, and a Trade War With China</h2><p class="paragraph" style="text-align:left;">An underappreciated fact about the machine learning revolution is that its <i>fundamentally</i> about computer hardware. The family of models taking the world by storm — <a class="link" href="https://medium.com/tebs-lab/introduction-to-deep-learning-a46e92cb0022?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">neural networks</a> — were <a class="link" href="https://news.mit.edu/2017/explained-neural-networks-deep-learning-0414?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">first described in a paper in 1944</a>. At the time Neural networks failed to do anything interesting just <i>way too slow</i>. </p><p class="paragraph" style="text-align:left;">Nevertheless, since about 2014 it’s been popular to describe the progress in AI as an “exponential trend” that may soon result in <a class="link" href="https://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">hyper-intelligent AI’s going full Skynet</a>. And it’s true — progress in AI <i>has</i> been exponential. But it’s not the algorithms that’s getting exponentially better, it’s the hardware they run on. Specifically, the number of transistors we can fit on a given area of an integrated circuit has been growing exponentially since the 1960’s.</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/7d61f569-93e7-4072-b173-b6b48a0d4631/image.png"/><div class="image__source"><span class="image__source_text"><p>The exponential growth of transistor count since 1971. Source: <a class="link" href="https://bjc.edc.org/bjc-r/cur/programming/6-computers/2-history-impact/2-moore.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">https://bjc.edc.org/bjc-r/cur/programming/6-computers/2-history-impact/2-moore.html</a></p></span></div></div><p class="paragraph" style="text-align:left;">This observation, first made by Gordon Moore in 1965, became a guiding principle of computer chip development: The number of transistors that we could fit on an integrated circuit would double roughly every 18 months. This phenomenon is called <a class="link" href="https://physicsworld.com/a/moores-law-further-progress-will-push-hard-on-the-boundaries-of-physics-and-economics/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">Moore’s Law</a> and the computer chip manufacturing industry has kept Moore’s Law alive for nearly six decades through incredible feats of engineering such as <a class="link" href="https://spectrum.ieee.org/high-na-euv?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">Extreme Ultraviolet Photolithography</a> (EUV). </p><p class="paragraph" style="text-align:left;">In 2019, long time AI researcher Richard Sutton wrote a short paper called “<a class="link" href="https://www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">The Bitter Lesson</a>.” Here’s the opening of that paper:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin. <b>The ultimate reason for this is Moore&#39;s law</b>, or rather its generalization of continued exponentially falling cost per unit of computation. </p><figcaption class="blockquote__byline"><a class="link" href="https://www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">Richard Sutton</a> (emphasis mine) </figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Once upon a time AI researchers approached their work with the assumption that clever algorithms could effectively encode something like human level “understanding” or “intelligence” in a given domain. But Sutton demonstrates in his paper that time and time again, in every domain, “brute force” methods have ultimately come out on top. </p><p class="paragraph" style="text-align:left;">This has been true for as long as Moore’s Law has held up, but the chip industry is hurdling towards a few physical limitations that will eventually bring the 60-year “law” to an end, and perhaps soon.</p><h2 class="heading" style="text-align:left;">The End of Moore’s Law?</h2><p class="paragraph" style="text-align:left;">Moore <a class="link" href="https://physicsworld.com/a/moores-law-further-progress-will-push-hard-on-the-boundaries-of-physics-and-economics/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">himself once commented</a>, “These are fundamentals I don’t see how we [will] ever get around.” </p><p class="paragraph" style="text-align:left;">The fundamentals he’s referring to are: heat, size, and the speed of light. </p><h3 class="heading" style="text-align:left;">Heat</h3><p class="paragraph" style="text-align:left;">Until the early 2000s CPU speeds doubled along with transistor count. Smaller transistors meant electricity was traveling a smaller distance during a single cycle of computation, which allowed computer engineers to increase the clock speed. However, Increasing a CPU’s clock speed also makes it hotter. In about 2005 high-end CPUs started running hot enough to melt critical components of the CPU.</p><p class="paragraph" style="text-align:left;">In fact, chip manufactures have all but stopped trying to increase clock speeds (called frequency in the chart below) and started adding more cores instead. Since then high performance computing has increasingly <a class="link" href="https://medium.com/tebs-lab/the-age-of-parallel-computing-b3f4319c97b0?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">been all about parallelism</a>.</p><p class="paragraph" style="text-align:left;">Fun software fact: The main reason Transformers have replaced Recurrent Neural Networks (RNN) as the state of the art in natural language processing is because <a class="link" href="https://builtin.com/artificial-intelligence/transformer-neural-network?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">Transformers are optimized for parallel computing and RNNs are not</a> which means transformers effectively capture the value of these additional cores while RNNs cannot. </p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/dde7d7ce-01f8-4da0-ac68-2a3bf1d1c59d/48-Years-of-Microprocessor-Trend-Data-2.png"/><div class="image__source"><span class="image__source_text"><p>This graph shows various how features of high-end computer hardware has changed over time. Source: <a class="link" href="https://www.researchgate.net/figure/48-Years-of-Microprocessor-Trend-Data-2_fig1_358572677?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">https://www.researchgate.net/figure/48-Years-of-Microprocessor-Trend-Data-2_fig1_358572677</a></p></span></div></div><h3 class="heading" style="text-align:left;">Size</h3><p class="paragraph" style="text-align:left;">The current state of the art transistor size is 2 nanometers — roughly the width of 10 carbon atoms. 2nm transistors have not been commercialized yet, but The Taiwan Semiconductor Manufacturing Company (TSMC) says they’ll open the first<a class="link" href="https://appleinsider.com/articles/21/07/29/tsmc-gets-government-approval-for-2-nanometer-chip-plant?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow"> fab producing 2 nm chips by 2025</a>. </p><p class="paragraph" style="text-align:left;">2nm is an incredible feat, but we may not be able to go much further. Assuming the size of a transistor keeps halving, we only have 3 generations before transistors are the size of a single carbon atom. A single-atom transistor <a class="link" href="https://www.purdue.edu/newsroom/research/2012/120219KlimeckAtom.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">actually HAS been created</a>, way back in 2012, it’s just not suitable for making computers: the 2012 single-atom transistor has to be held at negative 196<i>°</i> C to function. Another was <a class="link" href="https://phys.org/news/2020-05-scientists-recipe-single-atom-transistors.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">invented in 2020</a> which has it’s own commercialization issues.</p><p class="paragraph" style="text-align:left;">Any theoretical subatomic transistor would be a member of the quantum realm. I won’t pretend to be an expert in quantum physics, but the gist is that subatomic sized transistors will be subject to Heisenberg’s uncertainty principle. If that happens we won’t be able to reliably measure the state of the transistor, rendering them useless for computation. If transistors continue to shrink according to Moore’s Law then we’ll hit this <a class="link" href="https://ieeexplore.ieee.org/document/4567410?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">quantum limit by 2036</a>.</p><h3 class="heading" style="text-align:left;">The Speed of Light</h3><p class="paragraph" style="text-align:left;">The above factors might not matter if something could travel infinitely fast. Instead of making computers faster by making components smaller, we could just make the information travel faster. We’ll probably get a bit more computational power by switching to <a class="link" href="https://news.microsoft.com/source/features/innovation/building-a-computer-that-solves-practical-problems-at-the-speed-of-light/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">light based computers</a> which transmit information faster than electricity based computers. But if the speed of light is indeed a hard limit in this universe, then CPU speeds will be subject to it.</p><h2 class="heading" style="text-align:left;">What Does This Have to do China?</h2><p class="paragraph" style="text-align:left;">Basically, the Biden Administration has learned The Bitter Lesson and is applying it in their increasingly adversarial approach to China. Biden views AI as a major nation security issue, and his administration is trying to limit China’s ability to advance in that domain. The result is an ongoing trade war centered on computer chip production capabilities.</p><p class="paragraph" style="text-align:left;">This particular battle began in earnest in October 2022 when the US Bureau of Industry and Security issued a 139 page document which, among other things, expanded export controls. The controls were <a class="link" href="https://www.nytimes.com/2022/10/07/business/economy/biden-chip-technology.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">specifically targeted to limit China’s ability to acquire or build the high-end computer hardware</a> user to train ML systems. Things like EUV Lithography technology and high-end chips made with EUV are among the controlled technologies.</p><p class="paragraph" style="text-align:left;">China has retaliated, for example by creating their own <a class="link" href="https://www.technologyreview.com/2023/07/12/1076156/us-china-tech-war-escalating/?truid=d4311324af962c418f641918ab74316e&utm_source=the_download&utm_medium=email&utm_campaign=the_download.unpaid.engagement&utm_term=Active%20Qualified&utm_content=08-11-2023&mc_cid=264392a95f" target="_blank" rel="noopener noreferrer nofollow">export controls on gallium and germanium</a>, which are important raw materials for making computer chips, fiber optics, and related technologies. China also <a class="link" href="https://www.scmp.com/tech/policy/article/3229676/us-europe-growing-alarmed-chinas-legacy-chip-production-fear-changing-supply-chains?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">ramped up it’s production of “legacy” chips</a>, which are less powerful; fabricated using different processes; and still very important in computer hardware writ large — but not for state of the art AI research. </p><p class="paragraph" style="text-align:left;">Another layer to this story is that the industry leader, The Taiwan Semiconductor Manufacturing Company (TSMC), is located in Taiwan. <a class="link" href="https://www.cfr.org/backgrounder/china-taiwan-relations-tension-us-policy-biden?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">Tensions between China and Taiwan are high</a>, and Biden has previously indicated that <a class="link" href="https://www.cfr.org/blog/while-pledging-defend-taiwan-china-biden-shifted-taiwan-independence-heres-why-matters?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">his administration would defend Taiwan from a Chinese invasion</a> if it ever came to that. The TSMC fabs are such an important strategic asset that <a class="link" href="https://www.nytimes.com/2023/07/12/magazine/semiconductor-chips-us-china.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">some war-gamers have suggested that the U.S. would destroy them</a> before allowing China to take control.</p></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><p class="paragraph" style="text-align:left;"><i>We had classes Monday-Thursday. We’re cutting the News Quiz this week because we ran out of time.</i></p><h1 class="heading" style="text-align:left;">Themes in the News</h1><h3 class="heading" style="text-align:left;">Commodification of Generative Models is Increasing Malicious Uses of AI</h3><p class="paragraph" style="text-align:left;">From text to audio to images to video, ML models are increasingly able to generate impressive, useful, and realistic content. These models have recently become commodified via open source, pre-trained models, APIs, and other consumer interfaces which dramatically lower the barrier to entry for deploying AI systems. </p><p class="paragraph" style="text-align:left;">While there are many useful, cool, and pro-social uses of this technology, we’re witnessing a significant rise in malicious or otherwise unsavory uses, too.</p><p class="paragraph" style="text-align:left;">Versions of the GPT architecture <a class="link" href="https://www.bleepingcomputer.com/news/security/cybercriminals-train-ai-chatbots-for-phishing-malware-attacks/?utm_source=blog.tebs-lab.com&utm_medium=referral&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">trained specifically to create malware, execute phishing attacks, and defraud people</a> have started popping up. In terms of generative cybersecurity there’s enough “there” there for DARPA to get involved: they announced a two-year “<a class="link" href="https://www.engadget.com/the-white-houses-ai-cyber-challenge-aims-to-crowdsource-national-security-solutions-170003434.html?guccounter=1&guce_referrer=aHR0cHM6Ly9hcHAuYXNhbmEuY29tLw&guce_referrer_sig=AQAAANWDGoL8P9Ym33X5ZIScZZ5MDBoRdVoKBAFdReAhQpBPmSaNyDtyhDcDcNFlam_dn6LoqVu29vVLMB0K8PWTHyTdxo1vX7V-jn9RjRjRH8xmQ4RmLI9G7WBLz7DhfOBwBcXk5995EMEpKc34ANyKvXggaFE0YcMK1O_sUyWHDccD&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">AI Cyber Challenge</a>” with nearly $20 million in prize money, and an additional $7 million as startup cash for small businesses who want to participate.</p><p class="paragraph" style="text-align:left;">NAO Medical—which runs a series of medical clinics in New York—was caught using large language models to write <a class="link" href="https://time.com/6302710/nao-medical-google-ai/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">nonsense articles</a> in order to boost its SEO and rank higher on search engines. It appears they’ve since taken these articles down, but a copy of the article titled “Derek Jeter Herpes Tree: Causes, Symptoms, and Treatment” is retained in the linked Time article.</p><p class="paragraph" style="text-align:left;">In a creepy and distasteful—although perhaps not exactly malicious—example some TikTok creators used generative models to <a class="link" href="https://www.washingtonpost.com/technology/2023/08/09/ai-dead-children-tiktok-videos/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">recreate the likeness deceased or missing children</a> and have those children tell their stories. Their works include a video where Anne Frank’s likeness first tries to sell you baby clothes then tells Anne Frank’s actual story.</p><p class="paragraph" style="text-align:left;">These so called “deepfakes” are popping up in an array of unsavory uses. They’re being used to <a class="link" href="https://apnews.com/article/deepfake-porn-celebrities-dalle-stable-diffusion-midjourney-ai-e7935e9922cda82fbcfb1e1a88d9443a?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">imitate the likeness of people (mostly women) who wouldn’t otherwise appear in pornography</a>, with major implications for their reputation and privacy. </p><p class="paragraph" style="text-align:left;">Audio deepfakes have been <a class="link" href="https://www.reuters.com/technology/deepfake-scam-china-fans-worries-over-ai-driven-fraud-2023-05-22/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">used</a> in multiple <a class="link" href="https://www.forbes.com/sites/thomasbrewster/2021/10/14/huge-bank-fraud-uses-deep-fake-voice-tech-to-steal-millions/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">instances</a> of bank fraud. For example, scammers will convince a bank teller they are someone with authority over the phone by faking the authorities voice. Then they ask said teller to execute a wire transfer.</p><p class="paragraph" style="text-align:left;">In general, generative AI tools are making it <a class="link" href="https://www.technologyreview.com/2023/04/04/1070938/we-are-hurtling-toward-a-glitchy-spammy-scammy-ai-powered-internet/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">much easier and faster to create spammy content</a>, accelerating the creation of all sorts of scams and click farming.</p><h3 class="heading" style="text-align:left;">Teb’s Tidbits</h3><ul><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.theverge.com/2023/8/8/23820423/ai-startups-regulation-big-tech?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">Large tech firms are mostly driving AI regulation in the US</a> (no conflict of interest there).<br></p></li><li><p class="paragraph" style="text-align:left;">News is a <a class="link" href="https://techpolicy.press/the-value-of-news-content-to-google-is-way-more-than-you-think/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">huge part of Google’s revenue</a>, which makes their stance on <a class="link" href="https://www.npr.org/2023/06/29/1185087587/google-says-it-will-start-blocking-canadian-news-stories-in-response-to-new-law?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">blocking news in Canada a bit suspicious</a>.<br></p></li><li><p class="paragraph" style="text-align:left;">The Kid’s Online Safety Act actually <a class="link" href="https://www.platformer.news/p/how-the-kids-online-safety-act-puts?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">puts us all at risk</a>.</p></li></ul></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Remember…</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The Lab Report is free and doesn’t even advertise. Our curricula is open source and published under a public domain license for <i>anyone </i>to use for <i>any purpose. </i>We’re also a very small team with no investors.</p><p class="paragraph" style="text-align:left;">Help us keep providing these free services by scheduling one of our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">world class trainings </a>or requesting a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=the-bitter-lesson-moore-s-law-and-a-trade-war-with-china" target="_blank" rel="noopener noreferrer nofollow">custom class</a> for your team.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></div></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=950b1c9b-a793-4540-a594-a32cdd918250&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>Machine Learning and the &quot;-isms”</title>
  <description>Plus a new direct-to-student class in the works</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/2cd283cd-a331-4b58-a9cb-300f55bb717e/img-CrHrDclLiv9Ct5IqAkKKnBpr.png" length="1214975" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/machine-learning-isms</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/machine-learning-isms</guid>
  <pubDate>Fri, 11 Aug 2023 09:00:00 +0000</pubDate>
  <atom:published>2023-08-11T09:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><h1 class="heading" style="text-align:left;">The Weekly Lab Report</h1><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I’m Tyler Elliot Bettilyon (Teb) and this is the Lab Report: Our goal is to deepen your understanding of software and technology by explaining the concepts behind the news. </p><p class="paragraph" style="text-align:left;">If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. </p><p class="paragraph" style="text-align:left;">If you like what you’re reading you’ll love one of our classes. <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">Schedule a training</a> from our catalog or request a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">custom class consultation</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">From The Lab</h1></div><p class="paragraph" style="text-align:left;">My friend and former colleague Ian—who now runs <a class="link" href="https://www.devsprout.io/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">DevSprout</a>—is building an introductory Python and Web Development class with me. We’re planning a 4 week course meeting 2 nights per week for 1.5-2 hours each session. </p><p class="paragraph" style="text-align:left;">If there’s interest in the class I’ll share a draft of the course outline in next week’s newsletter. If you’re interested in a similar course on a different topic please reply to this email and let me know what you’d like to learn about.</p><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Today’s Lesson</h1><div class="custom_html"><span id="lesson"></span></div><h2 class="heading" style="text-align:left;">Machine Learning and the “-Isms”</h2><p class="paragraph" style="text-align:left;">Porcha Woodruff was arrested for robbery and carjacking on Feb. 16th, 2023 following a facial recognition match. Woodruff was 8-months pregnant at the time. She was detained for 11 hours then released on a $100,000 personal bond. Her case has since been dismissed and she is <a class="link" href="https://apnews.com/article/detroit-police-facial-recognition-lawsuit-cab0ae44c1671fc30617d301b21b2d13?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">now suing</a> the Detroit Police Department (DPD) for damages caused by the false arrest. </p><p class="paragraph" style="text-align:left;">This whole ordeal was based on a facial recognition match between gas station surveillance footage and a grainy mugshot of Woodruff from 2015 (Woodruff was arrested in 2015 for driving with an expired license).</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/80a966fc-226e-4a89-aa42-b8e077faa451/image.png"/><div class="image__source"><span class="image__source_text"><p>The 2015 mugshot matched by the facial recognition system (left) and Porcha’s 2021 drivers license photo (right), which was also available to the facial recognition system, but didn’t match. Via NYTimes: <a class="link" href="https://www.nytimes.com/2023/08/06/business/facial-recognition-false-arrest.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">https://www.nytimes.com/2023/08/06/business/facial-recognition-false-arrest.html</a></p></span></div></div><p class="paragraph" style="text-align:left;"></p><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.nytimes.com/2023/08/06/business/facial-recognition-false-arrest.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">According to The New York Times</a> Woodruff is the 6th person—and first woman—to report being arrested due to a false positive from a facial recognition system. All six are black. Coincidence? </p><p class="paragraph" style="text-align:left;">Machine learning systems have been plagued with issues related to social bias. <a class="link" href="https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">Amazon built a sexist hiring AI</a>. Several municipal governments have used <a class="link" href="https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">racist risk assessment</a> tools to inform bail and sentencing decisions. Microsoft infamously released a Twitter bot that became a <a class="link" href="https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">Nazi propagandist within 24 hours</a>. </p><p class="paragraph" style="text-align:left;">The day before Porcha’s arrest PubMed <a class="link" href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10023594/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">published this analysis of medical ML systems</a> which concluded, “With the exception of only a few cases, we found that the performance for the White group was, in general, significantly higher than that of the other racial groups across all ML algorithms.”</p><p class="paragraph" style="text-align:left;">State of the art generative systems have also fallen under fire for reinforcing a wide variety of stereotypes. Bloomberg recently<a class="link" href="https://www.bloomberg.com/graphics/2023-generative-ai-bias/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow"> analyzed the image generator Stable Diffusion</a> and found substantial racial and gender bias. They prompted the AI to depict people with various jobs and categorized the images based on skin tone and perceived gender. In their analysis “lawyers” were mostly depicted as white men and “housekeepers” were mostly women of color.</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/398a4960-39a5-4334-87de-9d3e506d1357/Screenshot_2023-08-10_at_11.47.36_AM.png"/><div class="image__source"><span class="image__source_text"><p>An analysis done by Bloomberg shows the skin tone composition of images generated by Stable Diffusion when prompted to show a person with a particular job. Source: <a class="link" href="https://www.bloomberg.com/graphics/2023-generative-ai-bias/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">https://www.bloomberg.com/graphics/2023-generative-ai-bias/</a></p></span></div></div><p class="paragraph" style="text-align:left;">Many people associate algorithms with a kind of pure mathematical objectivity. But now, roughly two decades into the machine learning revolution, more experts are admitting that it’s <a class="link" href="https://www.technologyreview.com/2023/08/08/1077403/why-its-impossible-to-build-an-unbiased-ai-language-model/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">probably impossible to create an unbiased ML model</a>. </p><p class="paragraph" style="text-align:left;">So, why?</p><h2 class="heading" style="text-align:left;">It’s The Data, Mostly</h2><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/cf9d1367-e19e-44df-9c27-4fc5cb2c7001/img-CrHrDclLiv9Ct5IqAkKKnBpr.png"/><div class="image__source"><span class="image__source_text"><p>Garbage in garbage out.</p></span></div></div><p class="paragraph" style="text-align:left;">ML models become biased primarily because social biases are deeply embedded in the datasets used to train them. In a sense the algorithms <i>really</i> <i>are</i> acting as tools of unbiased, objective mathematics: They precisely recreate the problematic patterns in the data. </p><p class="paragraph" style="text-align:left;">Amazon used their historical hiring data to train their hiring AI; it’s easy to believe Amazon’s historical hiring practices were at <a class="link" href="https://www.theverge.com/2021/6/7/22523481/amazon-prime-gender-inequality-harassment?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">least somewhat sexist</a>. The American court and police systems produced the data that was used to train the risk assessment tool; those systems have a <a class="link" href="https://www.sentencingproject.org/reports/report-to-the-united-nations-on-racial-disparities-in-the-u-s-criminal-justice-system/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">well-established history of racism</a>. Tay was actually <a class="link" href="https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">intentionally trained by Twitter trolls</a> to spew Nazi propaganda.</p><p class="paragraph" style="text-align:left;">The unfortunate truth is that we live in a society where racism, sexism, and other forms of social bias are deeply embedded in many aspects of our culture. Historically accurate data about our society inevitably contains artifacts of these biases. The same goes for the makeup and content of writing, art, pictures, and other artifacts produced by that society.</p><p class="paragraph" style="text-align:left;">As a then-Harvard Ph.D candidate (now Ph.D holder) Alex Najibi described in 2020 with regard to facial recognition systems:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Several avenues are being pursued to address these inequities. Some target technical algorithmic performance. First, algorithms can train on diverse and representative datasets, as <b>standard training databases are predominantly White and male</b>. Inclusion within these datasets should require consent by each individual. Second, the data sources (photos) can be made more equitable. <b>Default camera settings are often not optimized to capture darker skin tones, resulting in lower-quality database images of Black Americans</b>. Establishing standards of image quality to run face recognition, and settings for photographing Black subjects, can reduce this effect. </p><figcaption class="blockquote__byline"><a class="link" href="https://sitn.hms.harvard.edu/flash/2020/racial-discrimination-in-face-recognition-technology/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">Alex Najibi</a>, empahsis mine. </figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Balancing the racial makeup of datasets used to train facial recognition systems would not be terribly difficult. Correcting the long history of <a class="link" href="https://www.nytimes.com/2019/04/25/lens/sarah-lewis-racial-bias-photography.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">camera technology that prioritizes light skin tones</a> is another story. </p><p class="paragraph" style="text-align:left;">It’s not possible to retroactively fix every image captured with sub-optimal film or sensor settings. It’s also generally not feasible (and certainly not profitable) for firms to manually collect the millions-to-billions of high-quality images needed to train a modern facial recognition system to ensure proper lighting, color balance, and other settings are used. Most firms just <a class="link" href="https://www.engadget.com/clearview-ceo-claims-companys-database-of-scraped-images-is-now-30-billion-strong-174921576.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">scrape publicly available data</a> from the web.</p><p class="paragraph" style="text-align:left;">Even if firms <i>did </i>collect pristine photos, law enforcement typically uses facial recognition to match against low-quality surveillance footage which likely has poor color balance settings <i>and</i><a class="link" href="https://blog.tebs-lab.com/p/ml-failure-part-two-extrapolation-vs-interpolation?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow"> opens the systems up to the extrapolation problem</a>.</p><p class="paragraph" style="text-align:left;">Generative systems like Stable Diffusion are holding up an unflattering mirror to society. White men really are<i> </i><a class="link" href="https://revcycleintelligence.com/news/aamc-gender-and-racial-diversity-on-the-rise-in-us-physician-workforce?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">over-represented among doctors</a>. Women of color really are <a class="link" href="https://datausa.io/profile/soc/maids-housekeeping-cleaners?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">over-represented among housekeepers</a>. When Stable Diffusion trains on data that represents reality it recreates discrepancies like these.</p><p class="paragraph" style="text-align:left;">Lots of people in online forums use racist language. When we train models like GPT-4 on this data of course <a class="link" href="https://futurism.com/gpt-4-deeply-racist-before-openai-muzzled-it?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">it learns to parrot that language</a>.</p><p class="paragraph" style="text-align:left;">Making matters worse, modern ML systems require massive amounts of data. Collecting, curating, and cleaning such massive datasets is an enormous task. It’s easy for data cleaners and labelers (<a class="link" href="https://onezero.medium.com/the-a-i-industry-is-exploiting-gig-workers-around-the-world-sometimes-for-just-8-a-day-288dcce9c047?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">who are mostly poorly paid gig workers</a>) to miss subtle forms of bias. If they did succeed at removing every hint of social bias the training sets would shrink dramatically, plausibly making them too small to train large models like OpenAI’s GPT-4 or Google’s Bard.</p><h2 class="heading" style="text-align:left;">Engineering Teams, Executives, and End Users Share The Blame</h2><p class="paragraph" style="text-align:left;">ML practitioners aren’t generally experts with rich experience in all the nuances and subtlety of racism, sexism, homophobia, etc. Exhaustively testing models like ChatGPT for every possible problematic utterance is a already huge challenge, one that’s exacerbated by lack of expertise in the wide world of social bias.</p><p class="paragraph" style="text-align:left;">Even when engineering teams do have expertise their concerns about bias and ethics often play a secondary role to building and releasing a profitable product. Sometimes engineers are even punished for concerning themselves with ethics and bias: When Timnet Gebru and Margaret Mitchell <a class="link" href="https://dl.acm.org/doi/10.1145/3442188.3445922?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">authored a paper</a> demonstrating that large language models frequently produce racist content <a class="link" href="https://www.wired.com/story/google-timnit-gebru-ai-what-really-happened/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">they were ousted from their jobs at Google</a>.</p><p class="paragraph" style="text-align:left;">In some cases—such Porcha Woodruff’s arrest—the end users lack crucial skepticism. The Detroit Police Department uncritically trusted their AI’s output, even in the face of mitigating evidence (such as Woodruff’s pregnancy). A judge also appears to have uncritically signed an arrest warrant primarily based on the AI’s output and little supporting evidence. </p><p class="paragraph" style="text-align:left;">Finally, some actors are just malicious. With intentional prompting it’s always possible to use even a relatively neutral AI system to produce something that contains or represents social bias, like this:</p></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/28dfb421-805a-4839-b438-caa3e512da70/img-QbQLaofAxuS0lTosVBAjiwEf.png"/><div class="image__source"><span class="image__source_text"><p>I prompted an AI to produce a picture of: “A white man gobbling mayonnaise from the jar using a spoon.” </p></span></div></div><p class="paragraph" style="text-align:left;">I hope our white readers can laugh at this stereotype. I especially appreciated that the man appears to be sunburned. But similar images that might be perfectly innocuous in some settings might also carry more insidious racist or sexist undertones. </p><p class="paragraph" style="text-align:left;">Should AI image generators specifically refuse to create images of black people eating fried chicken and watermelon? Plenty of black people in the real world do eat these foods, but the stereotype that black people prefer them has <a class="link" href="https://www.bostonglobe.com/2022/02/13/opinion/fried-chicken-watermelon-origins-racist-food-stereotypes/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">an ugly history</a> and seems to <a class="link" href="https://www.theguardian.com/us-news/2023/feb/06/school-black-history-month-fried-chicken-watermelon-new-york?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">reappear every Black History Month</a>. </p><p class="paragraph" style="text-align:left;">There’s nothing <i>inherently wrong </i>with an image of a black person eating fried chicken, but it’s pretty easy to use that imagery in a way that reinforces nasty stereotypes. Similarly, if you prompted an AI for an image of “A black man eating” and it produced a man with a bucket of fried chicken that would be cause for concern.</p><p class="paragraph" style="text-align:left;">Fortunately, the system I used didn’t produce something obviously racist… but it was still a monstrosity. (Seriously, WTF is going on with his mouth and fingers? And are those supposed to be noodles?)</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/a5fc5333-7b3e-463b-90bd-e2ad7bbf0ea0/img-cRm9DPemCvojgUvl0H3MfLV0.png"/><div class="image__source"><span class="image__source_text"><p>I prompted an AI to create an image of “A black man eating.”</p></span></div></div><p class="paragraph" style="text-align:left;">It’s probably impossible to create an ML system that eliminates all social biases while also being generally useful. Which brings us to today’s News Quiz.</p><h1 class="heading" style="text-align:left;">The News Quiz</h1><p class="paragraph" style="text-align:left;"><i>Every week we challenge ourselves to tie the lesson to the news. Answers are at the end of this newsletter.</i></p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/5d6a6520-66c2-4f13-96e4-62d3862a992d/Screenshot_2023-08-09_at_5.03.39_PM.png"/><div class="image__source"><span class="image__source_text"><p>Image from the recent research paper: <a class="link" href="https://aclanthology.org/2023.acl-long.656.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">https://aclanthology.org/2023.acl-long.656.pdf</a> <br><br></p></span></div></div><p class="paragraph" style="text-align:left;">New research studied various text generators and classified them across two political dimensions using a tool called the “political compass” (pictured above). </p><p class="paragraph" style="text-align:left;">The color of the circles indicates the model family. Yellow dots are from Google’s BERT family, orange dots are from Meta’s LLaMA family, and white dots are OpenAI’s GPT family. The X-axis is a measure of political economic alignment. The Y-axis is a measure of political social alignment.</p><p class="paragraph" style="text-align:left;"><a class="link" href="https://aclanthology.org/2023.acl-long.656.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">The research is awesome and the paper is surprisingly approachable, I definitely suggest you read it.</a> </p><p class="paragraph" style="text-align:left;">Here’s a snippet from the paper:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Generally, BERT variants of LMs are more socially conservative (authoritarian) compared to GPT model variants.</p><figcaption class="blockquote__byline"> Section 4.1, second bullet point: <a class="link" href="https://aclanthology.org/2023.acl-long.656.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">https://aclanthology.org/2023.acl-long.656.pdf</a></figcaption></blockquote></div><p class="paragraph" style="text-align:left;"><b>Which of the following is the most likely explanation for the difference in social conservatism?</b></p><ul><li><p class="paragraph" style="text-align:left;">The GPT family’s model architecture is inherently more libertarian.<br></p></li><li><p class="paragraph" style="text-align:left;">The engineers at OpenAI are more libertarian than the engineers at Google.<br></p></li><li><p class="paragraph" style="text-align:left;">The BERT family of models were trained on a more socially conservative dataset.</p></li></ul><p class="paragraph" style="text-align:left;"><b>This research has implications for a popular </b><a class="link" href="https://www.fastcompany.com/90929549/google-jigsaw-toxic-speech-ai?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow"><b>“toxic speech classifier” called Jigsaw</b></a><b> (and others). Here’s another snip from the research paper:</b></p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">No language model can be entirely free from social biases.</p><figcaption class="blockquote__byline"> Final paragraph of section 1 </figcaption></blockquote></div><p class="paragraph" style="text-align:left;">If the authors are right, Jigsaw’s classifications of what counts as “toxic” must be biased. Which of the following options could Jigsaw peruse to mitigate the bias?</p><ul><li><p class="paragraph" style="text-align:left;">Train multiple models with different datasets and biases and have them vote or otherwise combine their classifications (this strategy is called “ensembling”).<br></p></li><li><p class="paragraph" style="text-align:left;">Gather and curate a dataset that perfectly balances all the relevant perspectives on what constitutes “toxic speech.”<br></p></li><li><p class="paragraph" style="text-align:left;">Select a type of model that is inherently non-partisan. </p></li></ul></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Teb’s Tidbits</h1><p class="paragraph" style="text-align:left;"><i>The main article and news quiz ran long this week, so we’re skipping the “Themes” section.</i></p><ul><li><p class="paragraph" style="text-align:left;">Meta has officially started <a class="link" href="https://www.theverge.com/2023/8/1/23815994/meta-facebook-instagram-canada-news-act-blocking?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">blocking news content for Canadian readers</a> following a law that would require Meta to pay news producers whose links appear on Meta’s services.<br></p></li><li><p class="paragraph" style="text-align:left;">The new California Privacy Protection Agency is investigating cars and their <a class="link" href="https://www.washingtonpost.com/technology/2023/07/31/cppa-privacy-car-data/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">data collection practices</a>.<br></p></li><li><p class="paragraph" style="text-align:left;">ML based image generators are disrupting a surprisingly big-dollar cottage industry: <a class="link" href="https://restofworld.org/2023/youtube-thumbnail-ai/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">YouTube thumbnail creation</a>.<br></p></li><li><p class="paragraph" style="text-align:left;">ChatGPT can analyze images and caption them as a tool for the blind, but <a class="link" href="https://www.nytimes.com/2023/07/18/technology/openai-chatgpt-facial-recognition.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">OpenAI worries about unintentionally creating a facial recognition tool</a> and all the associated privacy and bias issues that entails.<br></p></li><li><p class="paragraph" style="text-align:left;">It was bound to happen: <a class="link" href="https://www.bleepingcomputer.com/news/security/cybercriminals-train-ai-chatbots-for-phishing-malware-attacks/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">Evil ML-powered chatbots are emerging</a> to steal your login credentials and mire you in spam.<br></p></li><li><p class="paragraph" style="text-align:left;">A new Russian law aims to <a class="link" href="https://torrentfreak.com/putin-outlaws-anonymity-user-id-verification-for-online-services-vpn-bypass-advice-a-crime-230801/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">crack down on online anonymity</a> and further isolate Russia’s corner of the internet from the broader world.<br></p></li><li><p class="paragraph" style="text-align:left;">The BBC is starting a <a class="link" href="https://www.theverge.com/2023/7/31/23814556/bbc-experimental-mastodon-server-twitter?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">6-month experiment with their own Mastodon server</a>.</p></li></ul></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Answers To The News Quiz</h1></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><p class="paragraph" style="text-align:left;"><b>Which of the following is the most likely explanation for the difference in social conservatism?</b></p><ul><li><p class="paragraph" style="text-align:left;">The GPT family’s model architecture is inherently more libertarian.</p><ul><li><p class="paragraph" style="text-align:left;">No research known to me suggests any particular model architecture is <i>inherently</i> biased in one way or another. But, here’s a really interesting quote from the paper:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">We additionally observe that different sizes of the same model family (e.g. ALBERT and BART) could have non-negligible differences in political leanings. We hypothesize that the change is due to a better generalization in large LMs, including overfitting biases in more subtle contexts, resulting in a shift of political leaning. </p><figcaption class="blockquote__byline"></figcaption></blockquote></div></li></ul></li><li><p class="paragraph" style="text-align:left;">The engineers at OpenAI are more liberal/libertarian than the engineers at Google.</p><ul><li><p class="paragraph" style="text-align:left;">This might be true. If it is true these biases may have slipped into the verification, training, and testing processes at OpenAI and Google. But it’s probably of secondary importance to the training data. Here’s a quote from the paper:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Such differences could also be in part attributed to the reinforcement learning with human feedback data adopted in GPT-3 models and beyond.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></li></ul></li><li><p class="paragraph" style="text-align:left;"><b>The BERT family of models were trained on a more socially conservative dataset.</b></p><ul><li><p class="paragraph" style="text-align:left;">This is the most likely cause. One more quote from the paper: </p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">This collective difference may be attributed to the composition of pre-training corpora: while the BookCorpus (Zhu et al., 2015) played a significant role in early LM pretraining, Web texts such as Common-Crawl8 and WebText (Radford et al., 2019) have become dominant pretraining corpora in more recent models. Since modern Web texts tend to be more liberal (libertarian) than older book texts (Bell, 2014), it is possible that LMs absorbed this liberal shift in pretraining data.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></li></ul></li></ul><p class="paragraph" style="text-align:left;"><b>This research has implications for a popular </b><b><a class="link" href="https://www.fastcompany.com/90929549/google-jigsaw-toxic-speech-ai?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">“toxic speech classifier” called Jigsaw</a></b><b> (and others). Here’s another snip from the research paper:</b></p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">No language model can be entirely free from social biases.</p><figcaption class="blockquote__byline"> Final paragraph of section 1 </figcaption></blockquote></div><p class="paragraph" style="text-align:left;">If the authors are right Jigsaw’s classifications of what counts as “toxic” must be biased. Which of the following options could Jigsaw peruse to mitigate the bias?</p><ul><li><p class="paragraph" style="text-align:left;">Train multiple models with different datasets and biases and have them vote or otherwise combine their classifications (this strategy is called “ensembling”).</p><ul><li><p class="paragraph" style="text-align:left;">This solution is recommended by the paper:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Motivated by existing literature on analyzing different political perspectives in downstream tasks (Akhtar et al., 2020; Flores-Saviaga et al., 2022), we propose using a combination, or ensemble, of pretrained LMs with different political leanings to take advantage of their collective knowledge for downstream tasks. By incorporating multiple LMs representing different perspectives, we can introduce a range of viewpoints into the decision-making process, instead of relying solely on a single perspective represented by a single language model. We evaluate a partisan ensemble approach and report the results in Table 6, which demonstrate that partisan ensemble actively engages diverse political perspectives, leading to improved model performance.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></li></ul></li><li><p class="paragraph" style="text-align:left;">Gather and curate a dataset that perfectly balances all the relevant perspectives on what constitutes “toxic speech.”</p><ul><li><p class="paragraph" style="text-align:left;">The paper suggests this is a) impossible, and b) might help, but won’t ever fully eliminate social bias.<br></p></li></ul></li><li><p class="paragraph" style="text-align:left;">Select a type of model that is inherently non-partisan. </p><ul><li><p class="paragraph" style="text-align:left;">Again, there is no research known to me that suggests any type of model is inherently partisan to a particular political persuasion.</p></li></ul></li></ul><p class="paragraph" style="text-align:left;"></p></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Remember…</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The Lab Report is free and doesn’t even advertise. Our curricula is open source and published under a public domain license for <i>anyone </i>to use for <i>any purpose. </i>We’re also a very small team with no investors.</p><p class="paragraph" style="text-align:left;">Help us keep providing these free services by scheduling one of our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">world class trainings </a>or requesting a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=machine-learning-and-the-isms" target="_blank" rel="noopener noreferrer nofollow">custom class</a> for your team.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></div></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=e498f628-cffc-48ba-b275-4d9802618170&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>ML Failure Part 3: Model Drift</title>
  <description>News: detecting AI written text is impossible, what now?</description>
      <enclosure url="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/cc6bc6e8-cc54-4a9c-959b-3d7a5689866e/img-oUuGbcQXKeUJW36YgwHFfIXk.png" length="1198632" type="image/png"/>
  <link>https://blog.tebs-lab.com/p/ml-failure-part-3-model-drift</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/ml-failure-part-3-model-drift</guid>
  <pubDate>Fri, 04 Aug 2023 10:00:00 +0000</pubDate>
  <atom:published>2023-08-04T10:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><h1 class="heading" style="text-align:left;">The Weekly Lab Report</h1><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I’m Tyler Elliot Bettilyon (Teb) and this is the Lab Report: cut through the noise with our weekly rundown of software and technology news. </p><p class="paragraph" style="text-align:left;">If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. If you like what you’re reading you’ll love one of our classes. <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">Schedule a training</a> from our catalog or request a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">custom class consultation</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">From The Lab</h1></div><p class="paragraph" style="text-align:left;">Last week we ran <a class="link" href="https://www.tebs-lab.com/classes/introduction-to-sql?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">Intro to SQL</a> for one of our favorite clients and got some helpful feedback that led to updates in our <a class="link" href="https://github.com/Tebs-Lab/intro-to-sql/tree/main?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">open source curriculum</a>. This week we’re running <a class="link" href="https://github.com/Tebs-Lab/intro-to-python-four-day-ws?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">Intro to Python</a> with a <a class="link" href="https://github.com/Tebs-Lab/python-and-sql-series?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">special addendum</a> to run SQL queries and statements from Python code with a focus on analysis and ETL.</p><p class="paragraph" style="text-align:left;">If you want to schedule a class for your team reply to this email or <a class="link" href="https://www.tebs-lab.com/contact-us?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">book a meeting with me automatically from my website</a>.</p><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Today’s Lesson</h1><h2 class="heading" style="text-align:left;">How ML Fails Part 3: Model Drift</h2><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/cc6bc6e8-cc54-4a9c-959b-3d7a5689866e/img-oUuGbcQXKeUJW36YgwHFfIXk.png"/><div class="image__source"><span class="image__source_text"><p>Its a car “drifting” …</p></span></div></div><p class="paragraph" style="text-align:left;"><b>Model drift </b>occurs when a model that once worked stops working because something has changed. Model drift is common and can have a variety of underlying causes. A few examples should illustrate why:</p><p class="paragraph" style="text-align:left;">1) Netflix released their streaming service in 2007. Ratings you made in 2007 are still in Netflix’s database, but your tastes have probably changed since then. Models they trained in 2007 based on your preferences at that time are simply out of touch with <i>who you are now</i>. To a lesser degree models trained with <i>all </i>your historical ratings likely contain <i>some </i>errors due to your changing tastes. </p><p class="paragraph" style="text-align:left;">2) Flashback to 2019 and pretend you were modeling for a large in-person retail store. You had sales forecasts for 2020, but they were based on data generated in 2019’s. Surprise, COVID-19 changed everything and your model “drifted” significantly from the new reality on the ground.</p><p class="paragraph" style="text-align:left;">3) Suppose you have a self-driving car system trained on brand new cars. Over time the tires fall out of alignment on some cars. Depending on the severity of the misalignment the system’s turning will be mildly to substantially wrong on those cars.</p><p class="paragraph" style="text-align:left;">ML practitioners must regularly acquire up-to-date data, retrain their models, and possibly tune-up their physical systems (as in the car example) to account for such drift. In many situations it’s also wise to retire old data that may no longer be representative of the phenomena at hand.</p><h2 class="heading" style="text-align:left;">Other Types of Drift</h2><p class="paragraph" style="text-align:left;">The most common type of model drift occurs simply because the world is always changing. The above three are examples of this — something about the world changed and the model wasn’t updated to reflect those changes. But there are some trickier causes, too.</p><h4 class="heading" style="text-align:left;">Device or Protocol Based Drift </h4><p class="paragraph" style="text-align:left;">In 2020 Google trained models to use medical imaging to <a class="link" href="https://www.technologyreview.com/2020/04/27/1000658/google-medical-ai-accurate-lab-real-life-clinic-covid-diabetes-retina-disease/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">detect diabetic retinopathy</a>. In the lab Google’s model achieved over 90% accuracy. In their first attempt to deploy the model in a real hospital, though, the model struggled with differences between the scans done in the lab vs those in the hospital. </p><p class="paragraph" style="text-align:left;">The nurses and techs had a different protocol for taking the scan; the machine that took the scans was slightly different; and the lighting conditions weren’t as good in the hospital. These differences caused the model to reject a large number of scans which in turn caused problems at the hospital for doctors, nurses, and patients. </p><p class="paragraph" style="text-align:left;">Differences in microphones, cameras, or other instruments used to create input data for ML models can all create this kind of drift. Sometimes different versions of software and/or different configurations running on those instruments can also cause model drift.</p><h4 class="heading" style="text-align:left;">Adversarial Drift</h4><p class="paragraph" style="text-align:left;">Some ML applications are adversarial in nature. Consider spam filters and fraud detection: when a new model is trained to account for <i>today’s </i>favored spam tactic<i> </i>the spammers are forced to change their strategy. These new tactics are developed specifically to avoid the old filters, so the filter needs to be updated. </p><p class="paragraph" style="text-align:left;">This same kind of arms race is happening right now in the world of image generation. Copy protected data is being consumed by generative ML systems in enormous quantities. These data consumption practices are the source of a <a class="link" href="https://copyrightalliance.org/current-ai-copyright-cases-part-1/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">slew of copyright claims</a> against ML firms. In addition to lawsuits, some artists are turning to adversarial software to protect their work.</p><p class="paragraph" style="text-align:left;">Recently, the <a class="link" href="https://glaze.cs.uchicago.edu/what-is-glaze.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">University of Chicago released a tool called “Glaze.”</a> This tool allows artists to apply an adversarial “cloak” to images they upload to the internet. This “cloak” is a set of subtle perturbations to the original image that humans (generally) cannot see, but that have a substantial impact on the AI’s “understanding” of the image. Specifically the cloak is designed to prevent models from learning key features of the images’ style.</p><p class="paragraph" style="text-align:left;">Another new <a class="link" href="https://www.technologyreview.com/2023/07/26/1076764/this-new-tool-could-protect-your-pictures-from-ai-manipulation/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">tool called PhotoGuard</a> uses similar adversarial perturbations to make your photos resistant to models that generate “Deep Fakes” (photo-realistic images depicting scenes that didn’t actually occur). Instead of confusing models at training time this tool confuses models at inference time, causing attempted deepfakery on a protected image to instead produce a grey blob or other obviously unrealistic artifacts in the background.</p><p class="paragraph" style="text-align:left;">If these techniques become widespread AI models will first start failing because drift has occurred. AI firms will likely try to incorporate some kind of remediation. If the remediation works, then new techniques to confuse the models may have to be developed.</p><p class="paragraph" style="text-align:left;">Glaze even addresses this in the “<a class="link" href="https://glaze.cs.uchicago.edu/what-is-glaze.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">What is Glaze</a>” article:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">Unfortunately, Glaze is not a permanent solution against AI mimicry. Systems like Glaze face an inherent challenge of being future-proof (Radiya et al). It is always possible for techniques we use today to be overcome by a future algorithm, possibly rendering previously protected art vulnerable.</p><figcaption class="blockquote__byline"><a class="link" href="https://glaze.cs.uchicago.edu/what-is-glaze.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">Glaze Authors</a></figcaption></blockquote></div><h3 class="heading" style="text-align:left;">Low Drift Environments</h3><p class="paragraph" style="text-align:left;">Some models are less susceptible to drift than others. For example ML modeling has been used to <a class="link" href="https://www.technologyreview.com/2019/10/26/132171/a-neural-net-solves-the-three-body-problem-100-million-times-faster/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">approximate the “three body problem”</a> which is a physics problem about predicting how objects in space will move while taking into account their gravitational influence on each other.</p><p class="paragraph" style="text-align:left;">Classical numerical methods for doing this are extremely computationally intense. Once trained, ML models can generate highly accurate approximations that are up to <i>100 million</i> times faster to produce. This model is not going to suffer from drift because (as far as this writer knows) the laws of physics have not ever changed. </p><p class="paragraph" style="text-align:left;">The recent strides <a class="link" href="https://www.technologyreview.com/2023/07/11/1076067/weather-forecasting-is-having-an-ai-moment/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">ML has made in weather forecasting</a> are similar: climate change <i>is</i> occurring but it’s a relatively slow process. Even then, the fundamental physical laws governing the weather <i>aren’t</i> changing, only the environment in which those laws are being applied (i.e. Earth’s atmosphere). </p><p class="paragraph" style="text-align:left;">Problems like these are ideal for ML because they are highly objective, lend themselves easily to mathematical explanations, and aren’t likely to suffer from drift.</p></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">The News Quiz</h1><p class="paragraph" style="text-align:left;"><i>Every week we challenge ourselves to tie the lesson to the news. Answers are at the end of this newsletter.</i></p><p class="paragraph" style="text-align:left;"><a class="link" href="https://decrypt.co/149826/openai-quietly-shutters-its-ai-detection-tool?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">OpenAI shuttered </a>an ML program designed to discriminate between AI-written text and human-written text. Many people are trying to create such tools but so <a class="link" href="https://www.technologyreview.com/2023/07/07/1075982/ai-text-detection-tools-are-really-easy-to-fool/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">far failure is the norm</a>. </p><p class="paragraph" style="text-align:left;"><b>For each of the following bullet points decide whether its an example of model drift or something else:</b></p><ul><li><p class="paragraph" style="text-align:left;">These systems flag <a class="link" href="https://hai.stanford.edu/news/ai-detectors-biased-against-non-native-english-writers?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">text from non-native English speakers</a> as “AI Generated” at much higher rates.<br></p></li><li><p class="paragraph" style="text-align:left;">Making <a class="link" href="https://www.technologyreview.com/2023/07/07/1075982/ai-text-detection-tools-are-really-easy-to-fool/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">slight tweaks</a> to AI generated text caused accuracy on one tool to plummet as described here:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">[Researchers] found that while the tools were good at identifying text written by a human (with 96% accuracy, on average), they fared more poorly when it came to spotting AI-generated text, especially when it had been edited. Although the tools identified ChatGPT text with 74% accuracy, this fell to 42% when the ChatGPT-generated text had been tweaked slightly.</p><p class="paragraph" style="text-align:left;"></p><figcaption class="blockquote__byline"><a class="link" href="https://www.technologyreview.com/2023/07/07/1075982/ai-text-detection-tools-are-really-easy-to-fool/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">MIT Technology Review</a></figcaption></blockquote></div><ul><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://openai.com/research/gpt-2-1-5b-release?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">This tool</a> (<a class="link" href="https://github.com/openai/gpt-2-output-dataset/tree/master/detector?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">code</a>) released once upon a time by OpenAI can usually detect text produced by GPT-2, but not GPT-3.5, GPT-4, or ChatGPT.</p></li></ul></li></ul></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Themes in the News</h1><h3 class="heading" style="text-align:left;">Alternatives to Detecting AI Generated Materials</h3><p class="paragraph" style="text-align:left;">As we highlighted in the News Quiz detecting AI written text is <a class="link" href="https://www.technologyreview.com/2023/07/07/1075982/ai-text-detection-tools-are-really-easy-to-fool/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">difficult and unreliable</a>. Now researchers are exploring alternative methods for distinguishing between human and AI created content:</p><ul><li><p class="paragraph" style="text-align:left;">Provenance based methods seek to establish a digital (and cryptographically verifiable) trail that can prove the origin of an image, audio file, or video. <a class="link" href="https://www.technologyreview.com/2023/07/28/1076843/cryptography-ai-labeling-problem-c2pa-provenance/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">The Coalition for Content Provenance and Authenticity (</a><a class="link" href="https://www.technologyreview.com/2023/07/28/1076843/cryptography-ai-labeling-problem-c2pa-provenance/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow"><i>C2PA</i></a><a class="link" href="https://www.technologyreview.com/2023/07/28/1076843/cryptography-ai-labeling-problem-c2pa-provenance/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">) is one such protocol</a> that’s gaining notoriety.<br></p></li><li><p class="paragraph" style="text-align:left;">Watermarking methods cause AI-produced content to carry a signature marking it as such. Several major AI firms have signed onto an agreement to add watermarks to their <a class="link" href="https://www.wired.com/story/ai-giants-pledge-external-probes-algorithms-white-house/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">AI generated images</a>. And here’s a proposal to give <a class="link" href="https://www.wired.com/story/to-watermark-ai-it-needs-its-own-alphabet/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">AI text generation systems a unique subset of Unicode</a> as a form of watermarking for text.</p></li></ul><h3 class="heading" style="text-align:left;">Data Labeling and Generation is Big Business</h3><p class="paragraph" style="text-align:left;">ML models are notoriously data hungry. So much so that some researchers <a class="link" href="https://www.theatlantic.com/technology/archive/2023/01/artificial-intelligence-ai-chatgpt-dall-e-2-learning/672754/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">think we’ll run out of high-quality data</a> to train them before 2027. Complicating matters, new research has confirmed that training ML models using data generated by ML models <a class="link" href="https://arxiv.org/abs/2307.01850?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">almost always degrades model performance</a>.</p><p class="paragraph" style="text-align:left;">There’s a huge market for high quality data, and a lot of it is created by <a class="link" href="https://www.theatlantic.com/technology/archive/2023/07/ai-chatbot-human-evaluator-feedback/674805/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">poorly paid laborers</a>. Some of it is generated by you: every time you “prove you’re not a robot” with one of Googles (Re)Captcha tools you’re <a class="link" href="https://apnews.com/article/technology-technology-issues-digitization-spamming-artificial-intelligence-9e2aec49792c3a1e31c1f94f1a5e7ede?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">helping to label training data</a>.</p><p class="paragraph" style="text-align:left;">It’s not <i>all</i> exploitation though. Read this lovely story about a <a class="link" href="https://time.com/6297403/the-workers-behind-ai-rarely-see-its-rewards-this-indian-startup-wants-to-fix-that/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">startup in India</a> trying to pay fair wages <i>and</i> royalties for data labelers.</p><h3 class="heading" style="text-align:left;">Teb’s Tidbits</h3><ul><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.lawfaremedia.org/article/content-moderation-s-legalism-problem?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">Content moderation should abandon their “legalistic” structures and embrace the “public editor” role</a>. <br></p></li><li><p class="paragraph" style="text-align:left;">Google’s machine <a class="link" href="https://ai.googleblog.com/2023/06/announcing-first-machine-unlearning.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">“unlearning” challenge</a> seeks to address some privacy issues in ML models by making them forget some aspects of their training data retroactively.<br></p></li><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.technologyreview.com/2021/08/27/1033879/people-hiring-faces-work-deepfake-ai-marketing-clones/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">Deepfakes in the advertising industry</a> exemplify why the SAG-AFTRA has <a class="link" href="https://themarkup.org/hello-world/2023/07/29/the-writers-strike-over-ai-is-bigger-than-hollywood?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">made AI a core aspect of their ongoing strike</a>.</p></li></ul></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Answers To The News Quiz</h1></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><p class="paragraph" style="text-align:left;"><b>For each of the following bullet points decide whether its an example of model drift or something else:</b></p><ul><li><p class="paragraph" style="text-align:left;">These systems flag <a class="link" href="https://hai.stanford.edu/news/ai-detectors-biased-against-non-native-english-writers?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">text from non-native English speakers</a> as “AI Generated” at much higher rates.</p><ul><li><p class="paragraph" style="text-align:left;"><b>Something else. </b>Model drift in this context would be a model that used to accurately distinguish between ESL writers and AI writers, but overtime one of those groups writing changed in a way that caused the model to fail. More likely this is plain old underfitting. <br></p></li></ul></li><li><p class="paragraph" style="text-align:left;">Making <a class="link" href="https://www.technologyreview.com/2023/07/07/1075982/ai-text-detection-tools-are-really-easy-to-fool/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">slight tweaks</a> to AI generated text caused accuracy on one tool to plummet.</p><ul><li><p class="paragraph" style="text-align:left;"><b>Model drift. </b>This is a great example of adversarial drift. The researchers took AI generated output, made an intentional perturbation, and then the model failed on the data which had “drifted” from the original distribution of AI generated text.<br></p></li></ul></li><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://openai.com/research/gpt-2-1-5b-release?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">This tool</a> (<a class="link" href="https://github.com/openai/gpt-2-output-dataset/tree/master/detector?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">code</a>) released once upon a time by OpenAI can usually detect text produced by GPT-2, but not GPT-3.5, GPT-4, or ChatGPT.</p><ul><li><p class="paragraph" style="text-align:left;"><b>Model drift. </b>This time due to the advances in text generation. AI systems have gotten <i>much</i> better in the last 5 years, so old tools that detected old AI generated text have stopped working.</p></li></ul></li></ul></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Remember…</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The Lab Report is free and doesn’t even advertise. Our curricula is open source and published under a public domain license for <i>anyone </i>to use for <i>any purpose. </i>We’re also a very small team with no investors.</p><p class="paragraph" style="text-align:left;">Help us keep providing these free services by scheduling one of our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">world class trainings </a>or requesting a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-3-model-drift" target="_blank" rel="noopener noreferrer nofollow">custom class</a> for your team.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></div></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=868df3b4-b6a0-4a8c-8e79-52a23202dab6&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

      <item>
  <title>ML Failure Part Two: Extrapolation vs Interpolation</title>
  <description>News: Twitter is X and AI&#39;s future is uncertain</description>
  <link>https://blog.tebs-lab.com/p/ml-failure-part-two-extrapolation-vs-interpolation</link>
  <guid isPermaLink="true">https://blog.tebs-lab.com/p/ml-failure-part-two-extrapolation-vs-interpolation</guid>
  <pubDate>Fri, 28 Jul 2023 10:00:00 +0000</pubDate>
  <atom:published>2023-07-28T10:00:00Z</atom:published>
    <dc:creator>Tyler Bettilyon</dc:creator>
  <content:encoded><![CDATA[
    <div class='beehiiv'><style>
  .bh__table, .bh__table_header, .bh__table_cell { border: 1px solid #C0C0C0; }
  .bh__table_cell { padding: 5px; background-color: #FFFFFF; }
  .bh__table_cell p { color: #2D2D2D; font-family: 'Helvetica',Arial,sans-serif !important; overflow-wrap: break-word; }
  .bh__table_header { padding: 5px; background-color:#F1F1F1; }
  .bh__table_header p { color: #2A2A2A; font-family:'Trebuchet MS','Lucida Grande',Tahoma,sans-serif !important; overflow-wrap: break-word; }
</style><div class='beehiiv__body'><h1 class="heading" style="text-align:left;">The Weekly Lab Report</h1><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I’m Tyler Elliot Bettilyon (Teb) and this is the Lab Report: cut through the noise with our weekly rundown of software and technology news. </p><p class="paragraph" style="text-align:left;">If you’re new to the Lab Report you can <a class="link" href="https://tebslab.beehiiv.com/subscribe?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">subscribe here</a>. If you like what you’re reading you’ll love one of our classes. <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">Schedule a training</a> from our catalog or request a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">custom class consultation</a>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">From The Lab</h1></div><p class="paragraph" style="text-align:left;">This week we ran the first half of a custom class for a prominent coffee purveyor. The class pairs <a class="link" href="https://www.tebs-lab.com/classes/introduction-to-sql?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">intro to SQL</a> and <a class="link" href="https://www.tebs-lab.com/classes/introduction-to-python?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">intro to Python</a> as part of a longer course at the company regarding data analysis and visualization. Browse the <a class="link" href="https://github.com/Tebs-Lab/python-and-sql-series?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">open source materials</a> or <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">get in touch</a> to schedule a custom class of your own.</p><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Today’s Lesson: How ML Fails Part 2</h1><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">All the code from today’s lesson can be viewed on <span style="text-decoration:underline;"><b><a class="link" href="https://github.com/Tebs-Lab/lab-report-code/tree/main/07-28-2023?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">Github</a></b></span> and <span style="text-decoration:underline;"><b><a class="link" href="https://colab.research.google.com/github/Tebs-Lab/lab-report-code/blob/main/07-28-2023/Interpolation%20v%20Extrapolation.ipynb?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">Google Colab</a></b></span>.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div><h3 class="heading" style="text-align:left;">Extrapolation vs Interpolation</h3><p class="paragraph" style="text-align:left;">In statistics <b>interpolation </b>and <b>extrapolation</b> are both types of estimation based on data. Interpolation is an estimate made within the bounds of the existing data, extrapolation<b> </b>is an estimate made outside those bounds. Although they are sometimes presented as a binary, predictions and estimations often exist on a spectrum between the two.</p><p class="paragraph" style="text-align:left;">Imagine we’re the owner of an ice cream shop. We want to make a sales estimate for tomorrow. Tomorrow hasn’t happened, so in that sense this is fundamentally an extrapolation problem: tomorrow’s sales could not possibly be in our existing data set. However, we know it’s going to be a hot Saturday in the middle of summer. We have sales records from similar days in the past, so in that sense it’s an interpolation problem. </p><p class="paragraph" style="text-align:left;">In general — and for ML specifically — performing extrapolation is much harder. Forcing a model make predictions outside the bounds of its training data is a common cause for failure.</p><h2 class="heading" style="text-align:left;">An Abstract Example: Predicting Sine</h2><p class="paragraph" style="text-align:left;">Consider the humble sine curve. Sine is a perfectly repeating pattern, there is no noise, and its period is consistent. It’s just a boring, repetitious curve.</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/d6a27053-43aa-404a-b48e-10cc0078c036/Screenshot_2023-07-24_at_11.57.18_AM.png"/><div class="image__source"><span class="image__source_text"><p>The sine function from -5π to 5π</p></span></div></div><p class="paragraph" style="text-align:left;">Many ML models can easily learn the features of sine within the bounds of the training data. For example, here’s a fairly simple neural network’s approximation of sine from -5π to 5π. Our model’s output in red, the training data in blue:</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/d58d8cc0-cd9b-47a6-81e4-aa16519deb92/Screenshot_2023-07-24_at_12.09.38_PM.png"/><div class="image__source"><span class="image__source_text"><p>A neural network easily learns sine… </p></span></div></div><p class="paragraph" style="text-align:left;">Clearly, our model can do interpolation quite well. But what happens when we ask this model to make predictions outside the bounds of the original data? </p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/d4c65a74-42db-4599-8420-684fd963c2d6/Screenshot_2023-07-24_at_12.12.36_PM.png"/><div class="image__source"><span class="image__source_text"><p>The same model’s predictions for sine outside the bounds of the training data.</p></span></div></div><p class="paragraph" style="text-align:left;">Unfortunately, our model has failed abysmally at extrapolation. It learned sine almost perfectly from -5π to 5π but learned a near-linear function below -5π and above 5π.</p><p class="paragraph" style="text-align:left;"> And it gets worse. This is what happens when we train the same model on sine with a few holes in the sample:</p><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/663285aa-0b15-4109-840f-63c465138457/Screenshot_2023-07-24_at_12.33.15_PM.png"/><div class="image__source"><span class="image__source_text"><p>The model also fails to properly estimate within the missing areas from the training data.</p></span></div></div><p class="paragraph" style="text-align:left;">The hole on the right isn’t terrible, but the hole on the left is basically a horizontal line covering what should be a full period of the sine curve.</p><p class="paragraph" style="text-align:left;">In the real world these sorts of issues arise <i>all the time. </i></p><h2 class="heading" style="text-align:left;">Real World Examples</h2><h3 class="heading" style="text-align:left;">ChatGPT and Esoteric Poetry</h3><p class="paragraph" style="text-align:left;">Consider the “domain” of a model like ChatGPT: you need a representative sample of “human written text” within every subdomain where users might prompt the model. </p><p class="paragraph" style="text-align:left;">While the dataset for ChatGPT is enormous, it still doesn’t contain many samples of every esoteric topic under the sun. In December Ian Bogost, a technology writer at The Atlantic, <a class="link" href="https://www.theatlantic.com/technology/archive/2022/12/chatgpt-openai-artificial-intelligence-writing-ethics/672386/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">challenged ChatGPT to write in esoteric styles of poetry</a> (among other things). Here’s what he found:</p><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">I also urged the AI to generate a lai (a medieval narrative poem) in the style of Marie de France about the beloved, Texas-based fast-food chain Whataburger. It sure as heck looked a lot like a lai, beginning like this:</p><p class="paragraph" style="text-align:left;"><i>In the land of Whataburger, where burgers and fries are king,</i><br><i>A stranger came to visit, and with him brought a ring.</i><br><i>He wore a cloak of velvet, and a hat upon his head,</i><br><i>And he smiled with a twinkle in his eye as he approached the counter, where the burgers are served hot and fresh, and the fries are crispy and golden.</i></p><p class="paragraph" style="text-align:left;">Except for that last line, which is a metrical mess. The AI […] was able to tell me about the typical structure of a lai—a short, narrative poem of the 12th century with a particular structure and form, usually written in octosyllabic couplets. The lyrical errors in the generated lai arise from its failure to adhere to that metrical form.</p><figcaption class="blockquote__byline"><a class="link" href="https://www.theatlantic.com/technology/archive/2022/12/chatgpt-openai-artificial-intelligence-writing-ethics/672386/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">Ian Bogost in The Atlantic</a></figcaption></blockquote></div><p class="paragraph" style="text-align:left;">Although the model was trained on enough informational content to correctly define a lai, it wasn’t trained on enough lais to correctly produce one. This demonstrates another weakness of such models: they do not really “understand” the things they produce the way humans usually think of understanding. While the model correctly defined a lai, its “knowledge” of their structure did not translate to producing a poem that <i>actually has </i>that structure.</p><h3 class="heading" style="text-align:left;">Fatal Self Driving Crash</h3><p class="paragraph" style="text-align:left;">In 2018 a self-driving SUV operated by Uber <a class="link" href="https://www.npr.org/2019/11/07/777438412/feds-say-self-driving-uber-suv-did-not-recognize-jaywalking-pedestrian-in-fatal-?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">struck and killed Elaine Herzberg</a>. An investigation found that in the moments before the crash the system was struggling to properly classify Herzberg as a pedestrian that needed to be avoided. </p><p class="paragraph" style="text-align:left;">Herzberg was doing something that the AI had never encountered during training: Jaywalking with her bike. The system had seen pedestrians in and out of crosswalks. It had seen cyclists on the roadway AND pedestrians walking with a bike in a crosswalk. But never a pedestrian walking with a bike outside a crosswalk.</p><p class="paragraph" style="text-align:left;">As a result the system oscillated between classifying her as a pedestrian, bicycle, and vehicle which in turn caused the system to incorrectly predict her path, which ultimately led to the crash. </p><p class="paragraph" style="text-align:left;">This problem is a big one for self-driving cars because just about anything can happen on the roadways. It’s also why the most successful self-driving firms have started in places like Arizona, where the streets are wide and weather conditions are fairly constant and predictable. Those factors decrease the size of “representative driving conditions.” </p><h3 class="heading" style="text-align:left;">Big problems, Big Data</h3><p class="paragraph" style="text-align:left;">The extrapolation problem is another reason large ML models need so much data. If we expect our ML model to succeed, we need <i>truly representative </i>data sets that include <i>the entire domain </i>of whatever problem it is we’re hoping to solve. </p><p class="paragraph" style="text-align:left;">Ask yourself: what is “truly representative” of the problem of driving, or “writing at a human level of proficiency” and you begin to understand why some researchers think we’ll <a class="link" href="https://www.theatlantic.com/technology/archive/2023/01/artificial-intelligence-ai-chatgpt-dall-e-2-learning/672754/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">run out of high quality data before 2027</a>. </p></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">The News Quiz</h1><div class="image"><img alt="" class="image__image" style="" src="https://media.beehiiv.com/cdn-cgi/image/fit=scale-down,format=auto,onerror=redirect,quality=80/uploads/asset/file/b695fd80-b452-46c3-9aef-2a8aa50f41a0/img-fdOQwwY6VLqs9XsxCVdZlv2C.png"/><div class="image__source"><span class="image__source_text"><p>An AI with zero conception of the rules of Go generated this board, which would be quite absurd if it occurred in a real game.</p></span></div></div><p class="paragraph" style="text-align:left;">in 2016 AlphaGo <a class="link" href="https://www.theatlantic.com/technology/archive/2016/03/the-invisible-opponent/475611/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">defeated the Go world champion Lee Sedol</a> 4-1 in a 5-game show match. It was a significant achievement in ML because the game of Go is extremely complex. Fun fact: The number of legal Go board states is larger than the number of atoms in the universe. </p><p class="paragraph" style="text-align:left;">Since that achievement other Go-playing AI’s have extended and refined AlphaGo’s core concepts. In particular one called KataGo has become a standard bearer. Earlier this year a fairly high ranked amateur Go player named Kellin Pelrine <a class="link" href="https://www.vice.com/en/article/v7v5xb/a-human-amateur-beat-a-top-go-playing-ai-using-a-simple-trick?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">defeated KataGo 14 games to 1</a>. </p><p class="paragraph" style="text-align:left;"><b>Read these two articles about the “adversarial strategy”:</b></p><ul><li><p class="paragraph" style="text-align:left;"><a class="link" href="https://www.vice.com/en/article/v7v5xb/a-human-amateur-beat-a-top-go-playing-ai-using-a-simple-trick?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">This article about Perline’s strategy</a>.</p></li><li><p class="paragraph" style="text-align:left;">This article about an <a class="link" href="https://www.lesswrong.com/posts/Es6cinTyuTq3YAcoK/there-are-probably-no-superhuman-go-ais-strong-human-players?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">AI trained specifically to beat KataGo</a>, but that often loses to novice human Go players.</p></li></ul><p class="paragraph" style="text-align:left;"><b>Then, answer these questions:</b></p><ul><li><p class="paragraph" style="text-align:left;">Perline described his encircling strategy <a class="link" href="https://www.engadget.com/human-convincingly-beats-ai-at-go-with-help-from-a-bot-100903836.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">this way</a>: &quot;As a human, it would be quite easy to spot.&quot; Why then, didn’t the top ranked Go AIs spot it?<br></p></li><li><p class="paragraph" style="text-align:left;">In terms of “extrapolation” and “interpolation” what is this adversarial AI missing that allows novice human players to defeat it, even while it defeats a top ranked AI that in turn frequently defeats top ranked humans?</p></li></ul></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Themes in the News</h1><h3 class="heading" style="text-align:left;">No one really knows the future of AI</h3><p class="paragraph" style="text-align:left;">OpenAI CEO Sam Altman has been making the rounds, charming legislators, and musing about whether his company’s creations have a <a class="link" href="https://www.theatlantic.com/magazine/archive/2023/09/sam-altman-openai-chatgpt-gpt-4/674764/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">0.5% or a 50% chance of destroying humanity</a>.</p><p class="paragraph" style="text-align:left;">Are large language models just “<a class="link" href="https://nymag.com/intelligencer/article/ai-artificial-intelligence-chatbots-emily-m-bender.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">stochastic parrots</a>” as computational linguist Emily Bender et al <a class="link" href="https://dl.acm.org/doi/10.1145/3442188.3445922?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">have argued</a>? Or is Altman right that general intelligence might be one of the, “emergent properties from doing simple things on a massive scale.”</p><p class="paragraph" style="text-align:left;">If ML systems <a class="link" href="https://www.theatlantic.com/technology/archive/2023/01/artificial-intelligence-ai-chatgpt-dall-e-2-learning/672754/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">run out of high quality training data</a>, as some ML researchers have speculated will most likely happen <a class="link" href="https://arxiv.org/pdf/2211.04325.pdf?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">before 2027</a>, will they already have consumed enough data to become super-intelligent and continue learning autonomously?</p><p class="paragraph" style="text-align:left;">Is Anthropic AI right that we need to build exactly the types of systems that might destroy humanity to <a class="link" href="https://www.nytimes.com/2023/07/11/technology/anthropic-ai-claude-chatbot.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">prevent that outcome</a>? Or is the only way to avoid our destruction the <a class="link" href="https://time.com/6266923/ai-eliezer-yudkowsky-open-letter-not-enough/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">complete abandonment of general artificial intelligence R&D</a>?</p><p class="paragraph" style="text-align:left;">Right now there are a lot of unknowns and a lot of disagreement even among experts.</p><h3 class="heading" style="text-align:left;">Will anything replace Twitter or will it just be X?</h3><p class="paragraph" style="text-align:left;">Apparently <a class="link" href="https://www.theverge.com/2023/7/24/23804973/twitter-x-logo-brand-replacement?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">Twitter is now X</a>. </p><p class="paragraph" style="text-align:left;">Since Elon Musk bought Twitter the site has been turbulent. Between <a class="link" href="https://www.forbes.com/sites/siladityaray/2023/02/27/twitter-reportedly-lays-off-200-staffers-including-head-of-blue-subscription-service/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">massive layoffs</a>, <a class="link" href="https://www.nytimes.com/2023/02/28/technology/twitter-outages-elon-musk.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">increased</a> service <a class="link" href="https://www.theatlantic.com/technology/archive/2023/07/twitter-outage-elon-musk-user-restrictions/674609/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">outages</a>, and <a class="link" href="https://www.nytimes.com/2023/06/05/technology/twitter-ad-sales-musk.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">advertiser abandonment</a>, other tech firms smell blood in the water. Mastodon, Bluesky, Substack’s Notes, and now Meta’s Threads have all made attempts to capture Twitter refugees or otherwise capitalize on the chaos surrounding Musk’s takeover, though it’s not clear any of them will succeed.</p><p class="paragraph" style="text-align:left;">Mastadon’s decentralization gives it an inherent moderation problem, which is why it’s become a haven for <a class="link" href="https://www.washingtonpost.com/politics/2023/07/24/twitter-rival-mastodon-rife-with-child-abuse-material-study-finds/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">child sexual abuse materials</a>. Notes and Bluesky have yet to attract massive user bases, although Bluesky reportedly <a class="link" href="https://gizmodo.com/jack-dorsey-bluesky-1-million-users-threads-twitter-1850615479?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">crested 1 million users</a>. Threads is probably best positioned because of their ability to basically import Instagram’s users, but Meta <a class="link" href="https://www.theatlantic.com/newsletters/archive/2023/07/threads-meta-twitter-competitor-mark-zuckerberg/674655/?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">isn’t exactly a harbinger of warm-fuzzies</a>.</p><p class="paragraph" style="text-align:left;">This writer has been off Twitter since long before the Musk takeover, and I find myself increasingly sympathetic to the notion that maybe we just <a class="link" href="https://www.vox.com/recode/2022/12/6/23496363/twitter-mastodon-hive-musk-replacement?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">don’t need a new Twitter</a>.</p><h3 class="heading" style="text-align:left;">Teb’s Tidbits</h3><ul><li><p class="paragraph" style="text-align:left;">AI is being used to <a class="link" href="https://spectrum.ieee.org/ai-mining?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">find certain metal deposits</a> which important for their use in batteries and other green energy and electrification efforts.<br></p></li><li><p class="paragraph" style="text-align:left;">AI coding assistants can <a class="link" href="https://spectrum.ieee.org/ai-software?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">introduce major security issues</a>. Humans do this too, but if AI’s succeed at automating a lot of coding processes this will exacerbate the software insecurity situation.<br></p></li><li><p class="paragraph" style="text-align:left;">Researchers have discovered a way to <a class="link" href="https://www.theguardian.com/science/2023/jul/02/it-was-an-accident-the-scientists-who-have-turned-humid-air-into-renewable-power?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">generate electricity from humid air</a>.</p></li></ul></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h1 class="heading" style="text-align:left;">Answers To The News Quiz</h1><p class="paragraph" style="text-align:left;"><b>Perline described his encircling strategy </b><b><a class="link" href="https://www.engadget.com/human-convincingly-beats-ai-at-go-with-help-from-a-bot-100903836.html?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">this way</a></b><b>: &quot;As a human, it would be quite easy to spot.&quot; Why then, didn’t the top ranked Go AIs spot it?</b></p><p class="paragraph" style="text-align:left;">KataGo and other top ranked AIs primarily train in stages. First they “watch” expert level Go games between top ranked humans, then two similar versions of the model engage in “self play.” </p><p class="paragraph" style="text-align:left;">But because the strategy Perline employed is “quite easy to spot” for top ranked human players, <i>they never use it. </i>When the models switch to self play the bots don’t use the tactic either, because they’ve never seen it. </p><p class="paragraph" style="text-align:left;"><b>In terms of “extrapolation” and “interpolation” what is this adversarial AI missing that allows novice human players to defeat it, even while it defeats a top ranked AI that frequently defeats top ranked humans?</b></p><p class="paragraph" style="text-align:left;">The adversarial AI only really practiced against KataGo. Not only that, it “practiced” in a peculiar way that was explicitly designed to find a specific weakness or blindspot in KataGo’s play. The only kind of Go game that could really be considered “interpolation” for this machine is exactly the style of game that KataGo plays.</p><p class="paragraph" style="text-align:left;">So, even the basic strategies of a novice human are “extrapolation” and therefore cause the adversarial AI significant problems.</p></div><div class="section" style="background-color:transparent;margin:0.0px 0.0px 0.0px 0.0px;padding:0.0px 0.0px 0.0px 0.0px;"><h2 class="heading" style="text-align:left;">Remember…</h2><div class="blockquote"><blockquote class="blockquote__quote"><p class="paragraph" style="text-align:left;">The Lab Report is free and doesn’t even advertise. Our curricula is open source and published under a public domain license for <i>anyone </i>to use for <i>any purpose. </i>We’re also a very small team with no investors.</p><p class="paragraph" style="text-align:left;">Help us keep providing these free services by scheduling one of our <a class="link" href="https://www.tebs-lab.com/course-catalog?utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">world class trainings </a>or requesting a <a class="link" href="https://tebslab.as.me/schedule.php?appointmentType=49426706&utm_source=blog.tebs-lab.com&utm_medium=newsletter&utm_campaign=ml-failure-part-two-extrapolation-vs-interpolation" target="_blank" rel="noopener noreferrer nofollow">custom class</a> for your team.</p><figcaption class="blockquote__byline"></figcaption></blockquote></div></div></div><div class='beehiiv__footer'><br class='beehiiv__footer__break'><hr class='beehiiv__footer__line'><a target="_blank" class="beehiiv__footer_link" style="text-align: center;" href="https://www.beehiiv.com/?utm_campaign=c51db4d4-4680-4882-a40e-54e7019acd84&utm_medium=post_rss&utm_source=teb_s_lab">Powered by beehiiv</a></div></div>
  ]]></content:encoded>
</item>

  </channel>
</rss>
