xinference.client.handlers.AudioModelHandle.speech
- AudioModelHandle.speech(input: str, voice: str = '', response_format: str = 'mp3', speed: float = 1.0, stream: bool = False, prompt_speech: bytes | None = None, prompt_latent: bytes | None = None, **kwargs)
Generates audio from the input text.
- Parameters:
input (str) -- The text to generate audio for. The maximum length is 4096 characters.
voice (str) -- The voice to use when generating the audio.
response_format (str) -- The format of the generated audio. Defaults to 'mp3'.
speed (float) -- The speed of the generated audio. Defaults to 1.0.
stream (bool) -- Whether to stream the generated audio. Defaults to False.
prompt_speech (bytes) -- The audio bytes to be provided to the model.
prompt_latent (bytes) -- The latent bytes to be provided to the model.
- Returns:
The generated audio binary.
- Return type:
bytes
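
The following is a minimal sketch of a non-streaming call, not a definitive usage pattern. It wraps speech() in a hypothetical helper, synthesize_to_file, which checks the documented 4096-character limit and writes the returned bytes to disk. FakeAudioHandle is a stand-in invented here so the sketch runs without a server; with a real deployment you would pass an actual AudioModelHandle instead, and the endpoint and model UID shown in the comment are assumptions.

```python
from pathlib import Path


def synthesize_to_file(model, text, path, voice="", response_format="mp3", speed=1.0):
    """Hypothetical helper: call model.speech() and write the result to `path`."""
    # The documentation states a maximum input length of 4096 characters.
    if len(text) > 4096:
        raise ValueError("input text exceeds the documented 4096-character limit")
    audio = model.speech(
        input=text,
        voice=voice,
        response_format=response_format,
        speed=speed,
    )
    out = Path(path)
    out.write_bytes(audio)  # non-streaming calls return the audio as bytes
    return out


class FakeAudioHandle:
    """Stand-in for AudioModelHandle (an assumption, for offline runs only)."""

    def speech(self, input, voice="", response_format="mp3", speed=1.0, **kwargs):
        return b"fake-audio:" + input.encode("utf-8")


# With a running server, the handle would come from the client instead,
# for example (endpoint and model UID are placeholders):
#   from xinference.client import Client
#   handle = Client("http://localhost:9997").get_model("<model_uid>")
handle = FakeAudioHandle()
```

A call such as synthesize_to_file(handle, "Hello", "hello.mp3") then writes the generated audio to hello.mp3. The sketch covers stream=False only; the return value when stream=True is not described above, so it is left out.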