Server Configuration


2024-11-7

Basic Configuration

port

Type: number
Default: 8080
Required: No
Description: Server port

Example

const serverConfig = {
  port: 8080, // Set server port
};

api_key

Type: string
Default: -
Required: No
Description: ESP-AI API key used for AI services

Example

const serverConfig = {
  api_key: "xxx", // Get from https://espai.fun -> create a super-agent -> bottom-left api_key
};

devLog

Type: number
Default: 1
Required: No
Description: Log output mode
- 0: no logs (production mode)
- 1: normal logs
- 2: verbose logs

Example

const serverConfig = {
  devLog: 1,  // Normal log level
};

iatDu

Type: string | boolean
Default: false
Required: No
Description: Prompt tone audio played before ASR starts. Local mp3 only. Use 16k or 24k audio (16k recommended)

true uses default prompt tone. A file path uses that local audio. false disables prompt tone.

Example

const path = require('path'); // CommonJS, so __dirname is available below

// This example assumes you put ./du.mp3 in project root
const serverConfig = {
  iatDu: path.join(__dirname, './du.mp3'), // Prompt tone source
};

llm_qa_number

Type: number
Default: 5
Required: No
Description: Number of LLM dialogue rounds to keep (Q+A = one round)

Example

const serverConfig = {
  llm_qa_number: 5, // Keep 5 rounds of LLM history
};
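
To illustrate what keeping a fixed number of rounds means, here is a minimal sketch that trims a message history to the last N rounds. The helper name `keepLastRounds` and the message shape are assumptions for illustration only; ESP-AI's internal trimming may differ, for example in how it treats system messages.

```javascript
// Hypothetical helper, not part of ESP-AI: keep system messages plus the
// last `rounds` question/answer pairs (one round = one user + one assistant message).
function keepLastRounds(messages, rounds) {
  const system = messages.filter((m) => m.role === "system");
  const dialogue = messages.filter((m) => m.role !== "system");
  // slice(-0) would return the whole array, so handle zero rounds explicitly.
  const kept = rounds > 0 ? dialogue.slice(-rounds * 2) : [];
  return [...system, ...kept];
}

const history = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Q1" },
  { role: "assistant", content: "A1" },
  { role: "user", content: "Q2" },
  { role: "assistant", content: "A2" },
  { role: "user", content: "Q3" },
  { role: "assistant", content: "A3" },
];

// With 2 rounds kept, Q1/A1 are dropped and the system message survives.
const trimmed = keepLastRounds(history, 2);
console.log(trimmed.map((m) => m.content));
```

A larger `llm_qa_number` gives the model more context but sends more tokens with every request.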

Client Config Generation

gen_client_config

Type: (params: Record<string, any>) => Promise<ConfigResponse>
Required: Yes
Description: Generate client-side runtime config, including ASR / TTS / LLM settings
Params:
- params.send_error_to_client: function to send error messages to client
- params.ws: WebSocket instance
- params.client_params: client-side parameters
Returns:
- success: config object
- failure: { success: false, message: string }

See built-in platform integrations: Built-in Server Platform Guide
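
The two return shapes listed above could be produced along these lines. This is a sketch under stated assumptions: the `deviceConfigs` store and the `device_id` field inside `client_params` are hypothetical, and `send_error_to_client` is assumed to accept `(code, message)` as it does in `auth`.

```javascript
// Hypothetical in-memory store; a real service would likely query a database.
const deviceConfigs = new Map([
  ["device-001", { iat_server: "xun_fei", llm_server: "xun_fei", tts_server: "xun_fei" }],
]);

const config = {
  gen_client_config: async ({ send_error_to_client, client_params }) => {
    // `device_id` inside client_params is an assumption for this sketch.
    const found = deviceConfigs.get(client_params.device_id);
    if (!found) {
      const message = "No configuration found for this device";
      // Assumed to take (code, message), mirroring the `auth` signature.
      send_error_to_client(404, message);
      return { success: false, message }; // failure shape
    }
    return { ...found }; // success: the config object itself
  },
};
```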

Example

const config = {
  gen_client_config: async (params) => {
    // Generate the client config
    return {
      api_key: "xxx", // The user's ESP-AI key, used when AI services are needed. Takes precedence over the global api_key when set here.
      iat_server: "xun_fei",
      iat_config: {
        appid: "xxx",
        apiSecret: "xxx",
        apiKey: "xxx",
        vad_eos: 1500,
        // Other options supported by iat_server...
      },
      llm_server: "xun_fei",
      llm_config: {
        appid: "xxx",
        apiSecret: "xxx",
        apiKey: "xxx",
        llm: "v4.0",
        // Other options supported by llm_server...
      },
      tts_server: "xun_fei",
      tts_config: {
        appid: "xxx",
        apiSecret: "xxx",
        apiKey: "xxx",
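        is_clone: false, // Whether this is a cloned voice
        // Other options supported by tts_server...
      },
      connected_reply: "已成功连接服务器", // "Connected to the server." Set to false to disable
      f_reply: "你好", // "Hello." Set to false to disable this prompt
      sleep_reply: "我先休息了哦，有需要再叫我", // "I'll rest now; call me if you need me." Set to false to disable
      llm_init_messages: [
        { role: 'system', content: '你是小明同学，是一个无所不能的智能助手。' }, // "You are Xiao Ming, a capable AI assistant."
      ],

      // Intent (command) configuration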
      intention: [
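        {
            // Keywords ("turn on the light" variants)
            key: ["帮我开灯", "开灯", "打开灯"],
            // Command sent to the client
            instruct: "device_open_001",
            message: "开啦！还有什么需要帮助的吗？", // "It's on! Anything else I can help with?"
            // Once an open-platform api_key is configured, string-type keywords are matched with NLP inference.
            api_key: "xxx",
        },
        {
            // Keywords ("turn off the light" variants)
            key: ["帮我关灯", "关灯", "关闭灯"],
            // Command sent to the client
            instruct: "device_close_001",
            message: "关啦！还有什么需要帮助的吗？", // "It's off! Anything else I can help with?"
            // Once an open-platform api_key is configured, string-type keywords are matched with NLP inference.
            api_key: "xxx",
        },
        // Remote switch; see the tutorial at https://espai.fun/example/switch/
        {
            // Keywords ("turn off the light" variants)
            key: ["帮我关灯", "关灯", "关闭灯"],
            // Command sent to the client
            instruct: "device_close_001",
            message: "关啦！还有什么需要帮助的吗？", // "It's off! Anything else I can help with?"
            // Target device ID
            target_device_id:"xxx",
            // Super-agent api_key
            api_key:"xxx",
        },
        {
            // Keywords ("you may go" variants)
            key: ["退下吧", "退下"],
            // Built-in sleep command
            instruct: "__sleep__",
            message: "我先退下了，有需要再叫我。", // "I'll step back; call me if you need me."
            // Once an open-platform api_key is configured, string-type keywords are matched with NLP inference.
            api_key: "xxx",
        },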
        {
            // Set the volume to a given level
            key: async (text = "") => {
                const pattern = /音量调到(\d+)\%/; // "set volume to N%"
                // Look for a match
                const match = text.match(pattern);
                if (match) {
                    return true;
                }
            },
            instruct: ({ text }) => {
                const pattern = /音量调到(\d+)\%/;
                const match = text.match(pattern);
                const volumeLevel = match[1];
                console.log("Volume set to:", volumeLevel);
                // Pass the value to your business service, which can forward it to the hardware to adjust the volume
                // ...
            },
            message: "好的" // "OK"
        },
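        {
            /**
             * Regex match
             * Example: "播放音乐最后的倔强" ("play music <song name>")
             * Returning the matched string signals a successful match
            */
            key: async (text = "", llm_historys) => {
                const regex = /^(播放音乐)(.*)$/;
                const match = text.match(regex);
                if (match) {
                    const songName = match[2];
                    console.log("Song name:", songName);
                    return songName;
                } else {
                    return false;
                }
            },
            // Command sent to the client
            instruct: "__play_music__",
            message: "好的！", // "OK!"
            /**
             * Returns the audio URL and playback position
             * Only mp3 and wav are currently supported
             * @param {String} name Song name
             * @return {number} seek Playback position, in seconds
             * @return {string} message TTS text spoken when no data is found
            */
            music_server: async (name, { user_config, signal, sendToClient }) => {
                return {
                    url:"https://xiaomingio.top/music.mp3",
                    seek: 0,
                    // The original example referenced an undefined variable here;
                    // this placeholder string is illustrative.
                    message: "Sorry, I could not find that song."
                };
            },
            /**
             * Callback invoked when the audio ends
             * @param {number} arg.break_second  Position where playback stopped, in seconds (seek + play_time)
             * @param {number} arg.play_time     Time actually spent playing, in seconds
             * @param {number} arg.seek          Playback start position, i.e. the seek value returned by music_server
             * @param {number} arg.start_time    Unix timestamp in milliseconds when playback started
             * @param {number} arg.end_time      Unix timestamp in milliseconds when playback ended
             * @param {string} arg.event         End reason: "user_break" (interrupted by the user) | "play_end" (finished playing) | "foo" (unknown event)
            */
            on_end: (arg) => {
                // Ask your business server to save the progress here ...
                console.log(arg);
            }
        },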
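        // Other intents...
      ],

      /**
       * Per-device setting for the "beep" audio played before speech recognition starts.
       * Defaults to false, which means no prompt tone.
       * Only a local mp3 path can be played, e.g. path.join(__dirname, `./du.mp3`) in Node.js.
       * false disables the prompt tone; true uses the default prompt tone.
       * Currently this can only be set to false, to turn the device prompt tone off.
       * Type: string | boolean
      */
      iatDu: false,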
    };
  },
};
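
The `seek` value returned by `music_server` and the `break_second` value passed to `on_end` can be combined to resume a song where the listener stopped. The sketch below is one possible arrangement, not ESP-AI's own behavior: the `createMusicIntention` factory and the in-memory `progress` map are hypothetical, and a real service would more likely persist progress in its business backend. Because the documented `on_end` argument carries no song name, the sketch remembers the current song in a closure.

```javascript
// Hypothetical factory: builds one music intent per client so playback
// progress is tracked in a closure. Storage is in memory for illustration only.
function createMusicIntention() {
  const progress = new Map(); // song name -> seconds already played
  let currentSong = null;

  return {
    key: async (text = "") => {
      const match = text.match(/^(播放音乐)(.*)$/); // "play music <name>"
      return match ? match[2] : false;
    },
    instruct: "__play_music__",
    message: "好的！",
    music_server: async (name) => {
      currentSong = name;
      return {
        url: "https://xiaomingio.top/music.mp3", // URL taken from the example above
        seek: progress.get(name) || 0, // resume where the user stopped
      };
    },
    on_end: (arg) => {
      if (currentSong === null) return;
      if (arg.event === "play_end") {
        progress.delete(currentSong); // finished: start from 0 next time
      } else {
        progress.set(currentSong, arg.break_second);
      }
    },
  };
}
```

The returned object can be placed in the `intention` array built inside `gen_client_config`, so each connection gets its own playback state.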

Authentication

auth

Type: (params: AuthParams) => Promise<AuthResponse>
Required: No
Description: Client authentication function. client_params comes from client-side params in server config. This runs on every dialogue/session, so performance is critical (caching recommended).

Params:

{
  type: "connect" | "start_session"; // auth scenario
  send_error_to_client: (code: number, message: string) => void;
  ws: WebSocket;
  client_params: {
    api_key: string;
    ext1: string; 
    ext2: string;
  }
}

Returns:
- Success: { success: true }
- Failure: { success: false, message: string }

Example

const config = {
  auth: async (params) => {
    if (params.client_params.api_key === "valid_api_key") {
      return { success: true }; // auth passed
    } else {
      params.send_error_to_client(401, "Invalid API key");
      return { success: false, message: "Invalid API key" }; // auth failed
    }
  },
};
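
Because `auth` runs so often, the caching recommended above might look like the following. This is a minimal sketch, assuming a hypothetical `validateApiKey` lookup and a 60-second TTL; neither is part of ESP-AI.

```javascript
let lookupCount = 0; // only here to make the effect of caching visible

// Hypothetical validator standing in for a database or HTTP lookup.
async function validateApiKey(apiKey) {
  lookupCount += 1;
  return apiKey === "valid_api_key";
}

const AUTH_CACHE_TTL_MS = 60 * 1000; // assumed TTL; tune for your deployment
const authCache = new Map(); // api_key -> { ok, expiresAt }

const config = {
  auth: async ({ client_params, send_error_to_client }) => {
    const apiKey = client_params.api_key;
    const now = Date.now();
    let entry = authCache.get(apiKey);
    if (!entry || entry.expiresAt <= now) {
      // Cache miss or expired entry: do the slow lookup once and remember it.
      entry = { ok: await validateApiKey(apiKey), expiresAt: now + AUTH_CACHE_TTL_MS };
      authCache.set(apiKey, entry);
    }
    if (entry.ok) return { success: true };
    send_error_to_client(401, "Invalid API key");
    return { success: false, message: "Invalid API key" };
  },
};
```

A short TTL keeps revoked keys from staying valid for long. Since failed lookups are cached as well, a production version would probably bound the size of the cache.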

Parameter Control

llm_params_set

Type: (params: Record<string, any>) => Record<string, any>
Required: No
Description: Customize LLM parameters (for example temperature)

Example

const config = {
  llm_params_set: (params) => {
    // Modify default LLM params. Must return params.
    return { ...params, temperature: 0.8 };
  },
};

tts_params_set

Type: (params: Record<string, any>) => Record<string, any>
Required: No
Description: Customize TTS parameters (speaker, volume, speed, etc.)

Example

const config = {
  tts_params_set: (params) => {
    // Modify default TTS params. Must return params.
    return { ...params, volume: 5, speed: 1.0 };
  },
};
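
Both setters must return the params object, and forgetting the `return` is an easy mistake. One way to guard against it is a small wrapper like the one sketched here; `withFallback` is a hypothetical helper, not an ESP-AI API.

```javascript
// Hypothetical wrapper, not part of ESP-AI: falls back to the original
// params when a setter forgets to return a value.
function withFallback(setter) {
  return (params) => {
    const result = setter(params);
    return result && typeof result === "object" ? result : params;
  };
}

const config = {
  // Returns a new object, so the wrapper passes it through unchanged.
  llm_params_set: withFallback((params) => ({ ...params, temperature: 0.8 })),
  // Mutates params but forgets to return; the wrapper returns params instead.
  tts_params_set: withFallback((params) => {
    params.volume = 5;
  }),
};
```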

Event Callbacks

onDeviceConnect

Type: (arg: DeviceConnectArg) => void
Required: No
Description: Called when a new device connects

Params:

{
  device_id: string;      // Device ID
  client_version: string; // Client version
  instance: Instance      // ESP-AI instance
  client_params           // Client params from provisioning page
}

Example

const config = {
  onDeviceConnect: (arg) => {
    console.log(`Device ${arg.device_id} connected, client version: ${arg.client_version}`);
  },
};

onDeviceDisConnect

Type: (arg: DeviceConnectArg) => void
Required: No
Description: Called when a device disconnects

Params:

{
  device_id: string;      // Device ID
  instance: Instance      // ESP-AI instance
  client_params           // Client params from provisioning page
}

Example

const config = {
  onDeviceDisConnect: (arg) => {
    console.log(`Device ${arg.device_id} disconnected`);
  },
};
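
The connect and disconnect callbacks can be paired to keep track of which devices are currently online. The sketch below uses an in-memory `Map`, which is an assumption for illustration; a multi-process deployment would need shared storage.

```javascript
// In-memory registry of connected devices (an assumption for this sketch).
const onlineDevices = new Map(); // device_id -> { client_version, connectedAt }

const config = {
  onDeviceConnect: ({ device_id, client_version }) => {
    onlineDevices.set(device_id, { client_version, connectedAt: Date.now() });
  },
  onDeviceDisConnect: ({ device_id }) => {
    onlineDevices.delete(device_id);
  },
};
```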

onSleep

Type: (arg: DeviceConnectArg) => void
Required: No
Description: Called when a device enters the sleep state

Params:

{
  device_id: string;      // Device ID
  instance: Instance      // ESP-AI instance
  client_params           // Client params from provisioning page
}

Example

const config = {
  onSleep: (arg) => {
    console.log(`Device ${arg.device_id} entered sleep state`);
  },
};

onIAT

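Type: (arg: IATArg) => void
Required: No
Description: Called before the speech recognition (ASR) request is made. This runs before any conversion, so no recognized text exists yet; only the microphone PCM audio is available.

Example

const config = {
  onIAT: (arg) => {
    console.log(`Device ${arg.device_id} started a speech recognition request`);
  },
};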

onIATcb

Type: (arg: IATCallbackArg) => void
Required: No
Description: Called during speech recognition

Params:

{
  device_id: string;    // Device ID
  text: string;         // Recognized text
  instance: Instance    // ESP-AI instance
  sendToClient: (diy_text?: string) => void; // Sends to the client. Sends text by default; pass diy_text to send custom content instead
}

/**
 * IAT callback: called during speech recognition
 * @param {string}    device_id     Device ID
 * @param {string}    text          Text transcribed from speech
 * @param {()=>void}  sendToClient  Call this to send the text straight to the client, which receives it through onEvent
 *
 * After calling sendToClient(), client code like the following receives it:
 * void on_command(const String& command_id, const String& data) {
 *      if (command_id == "on_iat_cb") {
 *          // some code...
 *      }
 * }
 * void setup() {
 *      ...
 *      esp_ai.onEvent(on_command);
 * }
*/

Example

const config = {
  onIATcb: (arg) => {
    console.log(`Device ${arg.device_id} recognized text: ${arg.text}`);
    arg.sendToClient();
  },
};
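
The optional `diy_text` argument lets the server send something other than the raw transcript. As a hedged illustration, the sketch below strips whitespace and trailing punctuation before forwarding; `normalizeTranscript` is a hypothetical helper, not part of ESP-AI.

```javascript
// Hypothetical normalizer: trims whitespace and trailing punctuation.
function normalizeTranscript(text) {
  return text.trim().replace(/[。.!！?？,，]+$/u, "");
}

const config = {
  onIATcb: ({ device_id, text, sendToClient }) => {
    const cleaned = normalizeTranscript(text);
    console.log(`Device ${device_id} recognized: ${cleaned}`);
    sendToClient(cleaned); // pass diy_text to override the default payload
  },
};
```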

onTTS

Type: (arg: TTSArg) => void
Required: No
Description: Called before TTS conversion

Params:

{
  device_id: string;     // Device ID
  tts_task_id: string;   // TTS task ID
  text: string;          // Text to be converted
  instance: Instance     // ESP-AI instance
  sendToClient: (diy_text?: string) => void; // Sends to the client. Sends text by default; pass diy_text to send custom content instead
}

onTTScb

Type: (arg: TTSCallbackArg) => void
Required: No
Description: Called when TTS conversion completes

Params:

{
  device_id: string;     // Device ID
  is_over: boolean;      // Whether conversion has finished
  audio: Buffer;         // Audio stream in mp3 format, wrapped as base64. Decode it to binary yourself.
  instance: Instance     // ESP-AI instance
  sendToClient: () => void; // Sends to the client
}

Example

const config = {
  onTTScb: (arg) => {
    console.log(arg.audio)

    // Calling this method sends the payload straight to the client, which receives it through onEvent.
    // arg.sendToClient();
    //
    // Client-side receiving code:
    // void on_command(const String& command_id, const String& data) {
    //      if (command_id == "on_tts_cb") {
    //          // some code...
    //      }
    // }
    // void setup() {
    //      ...
    //      esp_ai.onEvent(on_command);
    // }
  },
};
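
Since the audio is described as base64-wrapped mp3 data delivered over several callbacks, a handler might decode and collect the chunks until `is_over` is true. This is a sketch under assumptions: the two maps are hypothetical, and whether the final callback still carries audio is not stated in the docs, so the code handles both cases.

```javascript
const pendingAudio = new Map(); // device_id -> Buffer[] collected so far
const finishedAudio = new Map(); // device_id -> Buffer of the last complete reply

const config = {
  onTTScb: ({ device_id, audio, is_over }) => {
    const chunks = pendingAudio.get(device_id) || [];
    if (audio) {
      // Decode base64 strings; accept an already-decoded Buffer as well.
      chunks.push(Buffer.isBuffer(audio) ? audio : Buffer.from(audio, "base64"));
    }
    if (is_over) {
      finishedAudio.set(device_id, Buffer.concat(chunks));
      pendingAudio.delete(device_id);
    } else {
      pendingAudio.set(device_id, chunks);
    }
  },
};
```

Once a reply is complete, the combined buffer could be written to disk or forwarded to another service.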

onLLM

Type: (arg: LLMArg) => void
Required: No
Description: Called before the LLM service is invoked

onLLMcb

Type: (arg: LLMCallbackArg) => void
Required: No
Description: Called as LLM inference produces output, through to completion

Params:

{
  device_id: string;     // Device ID
  text: string;          // Text fragment produced by the LLM
  user_text: string;     // The user's question
  llm_text: string;      // Complete text produced by the LLM
  is_over: boolean;      // Whether inference has finished
  llm_historys: Record<string, any>[]; // Dialogue history
  instance: Instance     // ESP-AI instance
  sendToClient: (diy_text?: string) => void; // Sends to the client. Sends text by default; pass diy_text to send custom content instead
}

Example

const config = {
  onLLMcb: (arg) => {
    console.log(arg.text)

    // Calling this method sends the text straight to the client, which receives it through onEvent.
    // arg.sendToClient();
    //
    // Client-side receiving code:
    // void on_command(const String& command_id, const String& data) {
    //      if (command_id == "on_llm_cb") {
    //          // some code...
    //      }
    // }
    // void setup() {
    //      ...
    //      esp_ai.onEvent(on_command);
    // }
  },
};
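
Because the callback receives both partial fragments (`text`) and the full reply (`llm_text`), a handler that only cares about finished answers can wait for `is_over`. The sketch below records completed exchanges in an in-memory array, which is an assumption for illustration.

```javascript
const transcripts = []; // completed exchanges, kept in memory for illustration

const config = {
  onLLMcb: ({ device_id, user_text, llm_text, is_over }) => {
    if (!is_over) return; // ignore partial fragments
    transcripts.push({ device_id, question: user_text, answer: llm_text });
  },
};
```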

Plugin System

plugins

Type: Plugin[]
Required: No
Description: Plugin list config

Example

const config = {
  plugins: [
    // Import plugin. Follow plugin docs for required config fields.
    // Example plugin: https://github.com/wangzongming/esp-ai-plugin-tts-aliyun
    require("esp-ai-plugin-tts-aliyun")
  ],
};

Log Configuration

logs

Type: LogConfig
Required: No
Description: Custom log output config

Structure:

{
  info?: () => void;  // normal log
  error?: () => void; // error log
}

Example

const config = {
  logs: {
    info: () => {
      console.log("normal log message");
    },
    error: () => {
      console.error("error log message");
    },
  },
};

Server Stress Test Data

Test environment:

Provider: Tencent Cloud
CPU: 2 cores
Memory: 2 GB
Bandwidth: 4 MB
Disk: SSD 50 GB

Connection Test Results

Connections	Success	Failures	Instant resource usage	Post-connection usage
1000	1000	0	CPU:100%, MEM:1.5GB	CPU:4%, MEM:1.5GB
2000	2000	0	CPU:100%, MEM:1.5GB	CPU:4%, MEM:1.5GB
3000	3000	0	CPU:100%, MEM:1.5GB	CPU:4%, MEM:1.5GB
4000	3806	194(5%)	CPU:100%, MEM:1.5GB	CPU:4%, MEM:1.5GB
5000	4685	315(6.7%)	CPU:100%, MEM:1.5GB	CPU:4%, MEM:1.5GB
6000	4030	1970(32%)	CPU:100%, MEM:1.6GB	CPU:4%, MEM:1.5GB
10000	52	Service crashed	-	-

Audio Stream Test Results

Test conditions:

Audio size: 10 KB
Chunk size: 2048 bytes
Messages per connection: 6

Connections	Connected	Expected Msg	Sent Msg	Instant resource usage	Post-connection usage
100	922	600	600	CPU:100%, MEM:1.5GB	CPU:4%, MEM:1.5GB
500	500	3000	3000	CPU:100%, MEM:1.5GB	CPU:4%, MEM:1.5GB
1000	1000	6000	6000	CPU:100%, MEM:1.5GB	CPU:4%, MEM:1.5GB
2000	2000	12000	12000	CPU:100%, MEM:1.5GB	CPU:4%, MEM:1.5GB
3000	2982	18000	2982	CPU:100%, MEM:1.5GB	CPU:4%, MEM:1.5GB
