<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Voice on jamesm.blog</title>
    <link>https://jamesm.blog/tags/voice/</link>
    <description>Recent content in Voice on jamesm.blog</description>
    <image>
      <title>jamesm.blog</title>
      <url>https://jamesm.blog/papermod-cover.png</url>
      <link>https://jamesm.blog/papermod-cover.png</link>
    </image>
    <generator>Hugo</generator>
    <language>en</language>
    <lastBuildDate>Tue, 21 Apr 2026 21:48:00 +0100</lastBuildDate>
    <atom:link href="https://jamesm.blog/tags/voice/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>How to Phone Your Home AI Agent Running on a Mac Studio</title>
      <link>https://jamesm.blog/ai/phone-your-home-ai-agent/</link>
      <pubDate>Tue, 21 Apr 2026 21:48:00 +0100</pubDate>
      <guid>https://jamesm.blog/ai/phone-your-home-ai-agent/</guid>
      <description>A practical walkthrough of the stack I use to literally phone my Mac Studio, speak to a local AI agent, and get it to run or check work while I&#39;m away from the desk.</description>
    </item>
    <item>
      <title>MacWhisper vs Wispr Flow vs Superwhisper: The 2026 Dictation Stack Compared</title>
      <link>https://jamesm.blog/ai/mac-dictation-tools-comparison/</link>
      <pubDate>Mon, 20 Apr 2026 19:02:00 +0100</pubDate>
      <guid>https://jamesm.blog/ai/mac-dictation-tools-comparison/</guid>
      <description>A practical breakdown of the three dictation apps every Mac user keeps hearing about - who they&#39;re for, what they cost, and where they stop being interchangeable.</description>
    </item>
    <item>
      <title>Grok&#39;s New Voice APIs: Speech Recognition and Synthesis at Enterprise Scale</title>
      <link>https://jamesm.blog/ai/grok-voice-apis/</link>
      <pubDate>Sun, 19 Apr 2026 21:07:00 +0000</pubDate>
      <guid>https://jamesm.blog/ai/grok-voice-apis/</guid>
      <description>xAI launches standalone Speech-to-Text and Text-to-Speech APIs, bringing production-grade voice capabilities to developers at competitive pricing.</description>
    </item>
    <item>
      <title>OpenAI Voice Engine</title>
      <link>https://jamesm.blog/ai/openai-voice-engine/</link>
      <pubDate>Fri, 29 Mar 2024 23:12:25 +0100</pubDate>
      <guid>https://jamesm.blog/ai/openai-voice-engine/</guid>
      <description>&lt;h2 id=&#34;tldr&#34;&gt;TL;DR&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;OpenAI Voice Engine&lt;/strong&gt; is a text-to-speech model that can clone a realistic voice from just a 15-second audio sample&lt;/li&gt;
&lt;li&gt;It produces emotive, natural-sounding speech despite using a small model and only a single short audio sample&lt;/li&gt;
&lt;li&gt;Access has remained in &lt;strong&gt;limited preview&lt;/strong&gt; since its 2024 announcement due to responsible AI concerns around voice cloning and impersonation&lt;/li&gt;
&lt;li&gt;Approved testers must obtain clear consent from voice providers and inform listeners that voices are AI-generated&lt;/li&gt;
&lt;li&gt;As of 2026, the technology is restricted to approved partners and researchers rather than general availability&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;about&#34;&gt;About&lt;/h2&gt;
&lt;p&gt;OpenAI&amp;rsquo;s Voice Engine is a text-to-speech tool that can create realistic voices from a single 15-second audio sample. Notably, it produces emotive, realistic speech with only a small model and that one sample. To ensure responsible use, testers must get clear consent from voice providers, avoid creating user-generated voices, and inform listeners that the voices are AI-generated.&lt;/p&gt;</description>
    </item>
  </channel>
</rss>
