MartinBanner · 03-26-2026, 11:23 PM

I finally have it working with ESPHome. Unfortunately, I can't use it, or any local device, as a voice assistant: my host is too old and lacks the AVX support needed to run Whisper. I do have it working as a smart speaker that plays media and TTS just fine (which wasn't easy!). The problem is that the volume is too low no matter what I do; I can hear it, but barely. From the marketing pictures, the speaker appears to be 8 ohm, 3 W. I have a few Gen 1 Amazon Dots whose speakers appear to be only about 1.2 W, and they are much, much louder than the AS. Are there any hints for maximizing the volume? I'm running the most current versions of ESPHome Builder and Home Assistant as of the date of this post.

I've commented out volume_max, since the default is reportedly 100%, but I've also tried many explicit values with no improvement.
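For reference, here is a minimal sketch of the volume-related options on the speaker media player, as I understand them. The values are illustrative, not the exact ones I tested, and volume_initial is an assumption taken from my reading of the ESPHome docs; it does not appear in my config below.

```yaml
# Illustrative fragment only; the ids refer to components in the full config below.
media_player:
  - platform: speaker
    id: external_media_player
    name: Media Player
    volume_increment: 0.05   # step used by volume up/down
    volume_min: 0.4          # minimum volume allowed
    volume_max: 1.0          # maximum volume allowed; should match the default when omitted
    volume_initial: 0.8      # ASSUMPTION: volume used on first boot, before any level is saved
    announcement_pipeline:
      speaker: announcement_resampling_speaker
```

As far as I can tell, these options only scale the signal in software, so they cannot raise the output above whatever the amplifier produces at full scale.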

Thanks for any help

Martin

Here's the YAML:

Code:
# The Whisper add-on cannot be started on the HP MicroServer,
# so Voice Assistant will not work.
# The AVX instruction set is required and is not available on the Turion II.
# Supported instruction sets only: MMX, 3DNow!, SSE, SSE2, SSE3, SSE4A,
# AMD64, AMD-V (AMD Virtualization), and EVP (Enhanced Virus Protection).

esphome:
  name: as
  friendly_name: AS
  platformio_options:
    board_build.flash_mode: dio
  on_boot:
    - light.turn_on:
        id: led_ww
        blue: 100%
        brightness: 60%
        effect: "Fast Pulse"  # matches the effect name defined under light:

esp32:
  board: esp32-s3-devkitc-1
  framework:
    type: esp-idf
    sdkconfig_options:
      CONFIG_ESP32S3_DEFAULT_CPU_FREQ_240: "y"
      CONFIG_ESP32S3_DATA_CACHE_64KB: "y"
      CONFIG_ESP32S3_DATA_CACHE_LINE_64B: "y"
      CONFIG_AUDIO_BOARD_CUSTOM: "y"

psram:
  mode: octal  # quad for N8R2 and octal for N16R8
  speed: 80MHz

# Enable logging
logger:
  hardware_uart: USB_SERIAL_JTAG

# Enable Home Assistant API
api:
  encryption:
    key: "TFpb+pBAvQIS1MVwaA7EoJ2DkpWE+79UvVro7yMyGdU="

ota:
  - platform: esphome
    password: "******************"

wifi:
  ssid: "wifi"
  password: "***************************"
  fast_connect: True
  manual_ip:
    static_ip: 192.168.55.127  # WiFi
    gateway: 192.168.55.1
    subnet: 255.255.255.0
    dns1: 192.168.55.1
  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Esp32-S3-Wake-Word"
    password: "*************"

captive_portal:

button:
  - platform: restart
    name: "Restart"
    id: but_rest

light:
  - platform: esp32_rmt_led_strip
    id: led_ww
    rgb_order: GRB
    pin: GPIO16
    num_leds: 1
    chipset: ws2812
    name: "on board light"
    effects:
      - pulse:
      - pulse:
          name: "Fast Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
          min_brightness: 0%
          max_brightness: 100%

i2s_audio:
  - id: i2s_output
    i2s_lrclk_pin: GPIO6  # LRC
    i2s_bclk_pin: GPIO7   # BCLK

speaker:
  - platform: i2s_audio
    id: i2s_audio_speaker
    dac_type: external
    sample_rate: 48000
    i2s_dout_pin:
      number: GPIO8
    bits_per_sample: 32bit
    i2s_audio_id: i2s_output
    timeout: never
    buffer_duration: 100ms
    channel: mono
    # sample_rate: 16000
    # bits_per_sample: 32bit

  # Virtual speakers to combine the announcement and media streams together into one output
  - platform: mixer
    id: mixing_speaker
    output_speaker: i2s_audio_speaker
    # num_channels: 2
    num_channels: 1
    source_speakers:
      - id: announcement_mixing_input
        timeout: never
      - id: media_mixing_input
        timeout: never

  # Virtual speakers to resample each pipeline's audio, if necessary, as the mixer speaker requires the same sample rate
  - platform: resampler
    id: announcement_resampling_speaker
    output_speaker: announcement_mixing_input
    sample_rate: 48000
    bits_per_sample: 16

  - platform: resampler
    id: media_resampling_speaker
    output_speaker: media_mixing_input
    sample_rate: 48000
    bits_per_sample: 16

media_player:
  - platform: speaker
    id: external_media_player
    name: Media Player
    internal: False
    volume_increment: 0.05
    volume_min: 0.4
    # volume_max: 0.85  # when amp gain connected to ground. Avoids cutting out.
    icon: mdi:speaker-wireless
    announcement_pipeline:
      speaker: announcement_resampling_speaker
      format: FLAC     # FLAC is the least processor intensive codec
      num_channels: 1  # Stereo audio is unnecessary for announcements
      sample_rate: 48000
    media_pipeline:
      speaker: media_resampling_speaker
      format: FLAC     # FLAC is the least processor intensive codec
      # num_channels: 2
      num_channels: 1
      sample_rate: 48000
    on_announcement:
      - mixer_speaker.apply_ducking:
          id: media_mixing_input
          decibel_reduction: 20
          duration: 0.0s
