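Voice Cloning

Overview

The voice cloning capability models the timbre of a specific speaker and converts text or speech input into a highly realistic rendition in that speaker's voice. Typical applications include virtual assistants and automated announcements.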

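API Description

Request method: POST (HTTP)

Request URL: https://service-mqk0mc83-1257411467.bj.apigw.tencentcs.com

Request header: Content-Type: application/json

Request flow: The API provides three operations: create a job, query a job, and list models. After creating a job, you can either query it yourself to learn the result, or supply a callback URL (callback) when creating the job, in which case the platform calls that URL automatically once the job finishes.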

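Create a Job

URL path: /release/job

Parameters

Parameter	Required	Type	Description
action	Yes	string	Common parameter; set to CreateJob here
secretId	Yes	string	Common parameter; the user's SecretId
secretKey	Yes	string	Common parameter; the user's secretKey
createJobRequest	Yes	object
- inputs	Yes	Array of Input	Array of Input structures
- outputs	Yes	Array of Output	Array of Output structures
- callback	No	string	Callback URL. Default: callbacks disabled
- customId	No	string	User-defined job ID, fewer than 64 characters
- timeout	No	int	Job timeout in seconds. A job that exceeds the timeout is set to ERROR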

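Input

Parameter	Required	Type	Description
sourceData	No	string	Text input, length limit 512. Provide exactly one of sourceData, url, or source
url	No	string	URL of the source audio file. Provide exactly one of sourceData, url, or source
source	No	object	Repository source settings. Provide exactly one of sourceData, url, or source
- contentId	Yes	string	Repository ID
- path	Yes	string	Source path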
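
The "exactly one of three" rule in the Input table is easy to get wrong on the client side, so it may help to check it before sending a request. The following is a minimal sketch, not part of the API: the function name validate_input is hypothetical, and it assumes the 512 limit on sourceData counts characters, which the table does not state explicitly.

```python
def validate_input(item: dict) -> str:
    """Return the name of the single source field set on an Input.

    Raises ValueError when the Input breaks the rules in the table above.
    """
    present = [key for key in ("sourceData", "url", "source") if item.get(key) is not None]
    if len(present) != 1:
        raise ValueError(f"exactly one of sourceData, url, source is required, got {present}")

    field = present[0]
    # Assumption: the 512 limit is measured in characters.
    if field == "sourceData" and len(item["sourceData"]) > 512:
        raise ValueError("sourceData exceeds the 512 length limit")
    if field == "source":
        # contentId and path are both marked as required inside source.
        missing = [key for key in ("contentId", "path") if not item["source"].get(key)]
        if missing:
            raise ValueError(f"source is missing required fields: {missing}")
    return field
```

For example, validate_input({"url": "https://example.com/a.wav"}) returns "url", while an Input that sets both url and sourceData raises ValueError.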

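Output

Parameter	Required	Type	Description
contentId	No	string	Repository ID. Default: empty. Required for generate-audio jobs
destination	No	string	Output directory. Default: '/' (the root directory)
inputSelectors	Yes	Array of int	Input sources for this output
smartContentDescriptor	Yes	SmartContentDescriptor	Description of the smart capability. Default: empty
- outputPrefix	No	string	Output file name prefix, fewer than 20 characters. Default: empty
- voiceCloning	Yes	object	Voice cloning
-- type	Yes	VoiceCloningType enum	Job type; see VoiceCloningType below
-- totalEpoch	No	int	Number of training epochs; may be set when type is 2 (train model). At most 50; default 30
-- modelName	Yes	string	Model name. When type is 2 (train model), the name of the model to create: digits, uppercase and lowercase letters, and underscores only; fewer than 64 characters; must be unique. When type is 1 (generate audio), the name of the model to use for voice cloning: either the default model "default" or the name of a trained model. You can store the model list yourself or query it through the List Models API at the end of this document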
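
The constraints on the voiceCloning object can be checked locally before a job is submitted. The sketch below is illustrative only: the helper name voice_cloning_descriptor and the constants are hypothetical, and the lower bound of 1 on totalEpoch is an assumption, since the table gives only the upper bound of 50.

```python
import re
from typing import Optional

GENERATE_AUDIO = 1  # VoiceCloningType: generate audio
TRAIN_MODEL = 2     # VoiceCloningType: train model

# Digits, letters, and underscores only; fewer than 64 characters.
MODEL_NAME_RE = re.compile(r"[0-9A-Za-z_]{1,63}")


def voice_cloning_descriptor(job_type: int, model_name: str,
                             total_epoch: Optional[int] = None) -> dict:
    """Build the voiceCloning object, enforcing the documented constraints."""
    if job_type not in (GENERATE_AUDIO, TRAIN_MODEL):
        raise ValueError("type must be 1 (generate audio) or 2 (train model)")
    if job_type == TRAIN_MODEL:
        if not MODEL_NAME_RE.fullmatch(model_name):
            raise ValueError("modelName must be 1-63 digits, letters, or underscores")
    elif not model_name:
        raise ValueError("modelName is required")

    descriptor = {"type": job_type, "modelName": model_name}
    if total_epoch is not None:
        if job_type != TRAIN_MODEL:
            raise ValueError("totalEpoch applies only when type is 2 (train model)")
        # Assumption: at least one epoch; the table documents only the maximum.
        if not 1 <= total_epoch <= 50:
            raise ValueError("totalEpoch must be at most 50")
        descriptor["totalEpoch"] = total_epoch
    return descriptor
```

When total_epoch is omitted, the field is left out of the request so that the service-side default of 30 applies.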

VoiceCloningType

Value	Meaning
1	Generate audio
2	Train model

Request examples:

Generate audio

{
  "action": "CreateJob",
  "secretId": "{secretId}",
  "secretKey": "{secretKey}",
  "createJobRequest": {
    "customId": "{customId}",
    "callback": "{callback}",
    "inputs": [
      {
        "url": "{url}"
      }
    ],
    "outputs": [
      {
        "contentId": "{contentId}",
        "destination": "/output",
        "inputSelectors": [0],
        "smartContentDescriptor": {
          "outputPrefix": "{outputPrefix}",
          "voiceCloning": {
            "type": 1,
            "modelName": "default"
          }
        }
      }
    ]
  }
}

Train model

{
  "action": "CreateJob",
  "secretId": "{secretId}",
  "secretKey": "{secretKey}",
  "createJobRequest": {
    "customId": "{customId}",
    "callback": "{callback}",
    "inputs": [
      {
        "url": "{url}"
      }
    ],
    "outputs": [
      {
        "inputSelectors": [0],
        "smartContentDescriptor": {
          "voiceCloning": {
            "type": 2,
            "modelName": "demo"
          }
        }
      }
    ]
  }
}
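
The request examples above can be assembled and sent programmatically. The following sketch uses only the Python standard library. The function names build_generate_audio_request and post_json are hypothetical, and the error handling a production client would need (HTTP errors, retries) is deliberately omitted.

```python
import json
import urllib.request

BASE_URL = "https://service-mqk0mc83-1257411467.bj.apigw.tencentcs.com"


def build_generate_audio_request(secret_id: str, secret_key: str, audio_url: str,
                                 content_id: str, model_name: str = "default",
                                 destination: str = "/output") -> dict:
    """Build a CreateJob body for a generate-audio job (type 1) with one url input."""
    return {
        "action": "CreateJob",
        "secretId": secret_id,
        "secretKey": secret_key,
        "createJobRequest": {
            "inputs": [{"url": audio_url}],
            "outputs": [{
                "contentId": content_id,  # required for generate-audio jobs
                "destination": destination,
                "inputSelectors": [0],
                "smartContentDescriptor": {
                    "voiceCloning": {"type": 1, "modelName": model_name},
                },
            }],
        },
    }


def post_json(path: str, body: dict, timeout: float = 30.0) -> dict:
    """POST a JSON body to BASE_URL + path and decode the JSON response."""
    request = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A job would then be created with post_json("/release/job", build_generate_audio_request(...)); the optional customId, callback, and timeout fields can be added to createJobRequest in the same way.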

Response examples:

Generate audio

{
  "requestId": "ac004192-110b-46e3-ade8-4e449df84d60",
  "createJobResponse": {
    "job": {
      "id": "13f342e4-6866-450e-b44e-3151431c578b",
      "state": 1,
      "customId": "{customId}",
      "callback": "{callback}",
      "inputs": [
        {
          "url": "{url}"
        }
      ],
      "outputs": [
        {
          "contentId": "{contentId}",
          "destination": "{destination}",
          "inputSelectors": [0],
          "smartContentDescriptor": {
            "outputPrefix": "{outputPrefix}",
            "voiceCloning": {
              "type": 1,
              "modelName": "default"
            }
          }
        }
      ],
      "timing": {
        "createdAt": "1603432763000",
        "startedAt": "0",
        "completedAt": "0"
      }
    }
  }
}

Train model

{
  "requestId": "ac004192-110b-46e3-ade8-4e449df84d60",
  "createJobResponse": {
    "job": {
      "id": "13f342e4-6866-450e-b44e-3151431c578b",
      "state": 1,
      "customId": "{customId}",
      "callback": "{callback}",
      "inputs": [
        {
          "url": "{url}"
        }
      ],
      "outputs": [
        {
          "inputSelectors": [0],
          "smartContentDescriptor": {
            "voiceCloning": {
              "type": 2,
              "modelName": "demo"
            }
          }
        }
      ],
      "timing": {
        "createdAt": "1603432763000",
        "startedAt": "0",
        "completedAt": "0"
      }
    }
  }
}

State

Value	Meaning
1	SUBMITTED
2	PROCESSING
3	COMPLETED
4	ERROR
5	CANCELED
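
Client code is usually easier to read when the numeric state values are given names. A minimal sketch follows; the names JobState and is_terminal are hypothetical. This document names only COMPLETED and ERROR as finished states, so treating CANCELED as terminal as well is an assumption.

```python
from enum import IntEnum


class JobState(IntEnum):
    """Numeric job states from the State table."""
    SUBMITTED = 1
    PROCESSING = 2
    COMPLETED = 3
    ERROR = 4
    CANCELED = 5


# Assumption: a canceled job will not change state again.
TERMINAL_STATES = frozenset({JobState.COMPLETED, JobState.ERROR, JobState.CANCELED})


def is_terminal(state: int) -> bool:
    """Return True when no further state change is expected for the job."""
    # JobState(state) raises ValueError for values outside the table.
    return JobState(state) in TERMINAL_STATES
```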

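Get Job Information

URL path: /release/job

Retrieval modes: active query and passive callback.

Active query comes in two forms, depending on which ID is used. One queries by the user-defined ID; because the platform cannot guarantee that this ID is unique, it returns an array of Jobs (see 1). The other queries by the id returned in the create-job response (see 2).
Passive callback requires the callback field to be set when the job is created. Once the job reaches a finished state (COMPLETED/ERROR), the platform sends the Job structure to the URL given in callback (see 3). The platform recommends passive callback for retrieving job results.

For voice cloning, when the queried job has succeeded (state=3), its Output carries a smartContentResult structure, whose voiceCloning structure (VoiceCloningResult) holds the output file name. For generate-audio jobs, you can assemble the COS path of the result file from the cos and destination information in the Output.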
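
As an illustration of the path assembly described above, the sketch below collects the result file paths of a successful generate-audio job. It rests on assumptions that this document does not confirm: that voiceName is a file name relative to destination, and that a missing destination means the root directory. It builds only the path inside the repository, not a full COS URL, and the function name output_paths is hypothetical.

```python
import posixpath
from typing import List


def output_paths(job: dict) -> List[str]:
    """Return destination/voiceName for each output of a COMPLETED job."""
    if job.get("state") != 3:  # 3 = COMPLETED
        return []
    paths = []
    for output in job.get("outputs", []):
        result = output.get("smartContentResult", {}).get("voiceCloning", {})
        voice_name = result.get("voiceName")
        if voice_name:
            # Assumption: the documented default '/' applies when destination is absent.
            destination = output.get("destination") or "/"
            paths.append(posixpath.join(destination, voice_name))
    return paths
```

Training jobs return modelName rather than voiceName, so this function yields an empty list for them.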

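VoiceCloningResult

Parameter	Type	Description
modelName	string	Model name; returned when the input type is 2 (train model)
voiceName	string	Result file of the generated audio; returned when the input type is 1 (generate audio)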

1. Active query by the customId passed in when the job was created

Request example:

{
  "action": "ListJobs",
  "secretId": "{secretId}",
  "secretKey": "{secretKey}",
  "listJobsRequest": {
    "customId": "{customId}"
  }
}

Response examples:

Generate audio

{
  "requestId": "c9845a99-34e3-4b0f-80f5-f0a2a0ee8896",
  "listJobsResponse": {
    "jobs": [
      {
        "id": "a95e9d74-6602-4405-a3fc-6408a76bcc98",
        "state": 3,
        "customId": "{customId}",
        "callback": "{callback}",
        "timing": {
          "createdAt": "1610513575000",
          "startedAt": "1610513575000",
          "completedAt": "1610513618000"
        },
        "inputs": [{ "url": "{url}" }],
        "outputs": [
          {
            "contentId": "{contentId}",
            "destination": "{destination}",
            "inputSelectors": [0],
            "smartContentDescriptor": {
              "outputPrefix": "{outputPrefix}",
              "voiceCloning": {
                "type": 1,
                "modelName": "default"
              }
            },
            "smartContentResult": {
              "voiceCloning": {
                "voiceName": "out.wav"
              }
            }
          }
        ]
      }
    ],
    "total": 1
  }
}

Train model

{
  "requestId": "c9845a99-34e3-4b0f-80f5-f0a2a0ee8896",
  "listJobsResponse": {
    "jobs": [
      {
        "id": "a95e9d74-6602-4405-a3fc-6408a76bcc98",
        "state": 3,
        "customId": "{customId}",
        "callback": "{callback}",
        "timing": {
          "createdAt": "1610513575000",
          "startedAt": "1610513575000",
          "completedAt": "1610513618000"
        },
        "inputs": [{ "url": "{url}" }],
        "outputs": [
          {
            "inputSelectors": [0],
            "smartContentDescriptor": {
              "outputPrefix": "{outputPrefix}",
              "voiceCloning": {
                "type": 2,
                "modelName": "demo"
              }
            },
            "smartContentResult": {
              "voiceCloning": {
                "modelName": "demo"
              }
            }
          }
        ]
      }
    ],
    "total": 1
  }
}

2. Active query by the id returned in the create-job response

Request example:

{
  "action": "GetJob",
  "secretId": "{secretId}",
  "secretKey": "{secretKey}",
  "getJobRequest": {
    "id": "{id}"
  }
}

Response examples:

Generate audio

{
  "requestId": "c9845a99-34e3-4b0f-80f5-f0a2a0ee8896",
  "getJobResponse": {
    "job": {
      "id": "a95e9d74-6602-4405-a3fc-6408a76bcc98",
      "state": 3,
      "customId": "{customId}",
      "callback": "{callback}",
      "timing": {
        "createdAt": "1610513575000",
        "startedAt": "1610513575000",
        "completedAt": "1610513618000"
      },
      "inputs": [{ "url": "{url}" }],
      "outputs": [
        {
          "contentId": "{contentId}",
          "destination": "{destination}",
          "inputSelectors": [0],
          "smartContentDescriptor": {
            "outputPrefix": "{outputPrefix}",
            "voiceCloning": {
              "type": 1,
              "modelName": "default"
            }
          },
          "smartContentResult": {
            "voiceCloning": {
              "voiceName": "out.wav"
            }
          }
        }
      ]
    }
  }
}

Train model

{
  "requestId": "c9845a99-34e3-4b0f-80f5-f0a2a0ee8896",
  "getJobResponse": {
    "job": {
      "id": "a95e9d74-6602-4405-a3fc-6408a76bcc98",
      "state": 3,
      "customId": "{customId}",
      "callback": "{callback}",
      "timing": {
        "createdAt": "1610513575000",
        "startedAt": "1610513575000",
        "completedAt": "1610513618000"
      },
      "inputs": [{ "url": "{url}" }],
      "outputs": [
        {
          "inputSelectors": [0],
          "smartContentDescriptor": {
            "outputPrefix": "{outputPrefix}",
            "voiceCloning": {
              "type": 2,
              "modelName": "demo"
            }
          },
          "smartContentResult": {
            "voiceCloning": {
              "modelName": "demo"
            }
          }
        }
      ]
    }
  }
}
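
When a callback URL is not available, the GetJob request above can be repeated until the job leaves the SUBMITTED and PROCESSING states. The sketch below shows one way to do that. The name wait_for_job, the polling interval, and the attempt limit are all hypothetical choices, and send stands for any caller-supplied function that POSTs a JSON body to a path and returns the decoded response.

```python
import time
from typing import Callable


def wait_for_job(job_id: str, secret_id: str, secret_key: str,
                 send: Callable[[str, dict], dict],
                 interval: float = 5.0, max_attempts: int = 60) -> dict:
    """Poll GetJob until the job reaches state 3, 4, or 5, then return the Job."""
    body = {
        "action": "GetJob",
        "secretId": secret_id,
        "secretKey": secret_key,
        "getJobRequest": {"id": job_id},
    }
    for attempt in range(max_attempts):
        job = send("/release/job", body)["getJobResponse"]["job"]
        # 3 = COMPLETED, 4 = ERROR, 5 = CANCELED
        if job["state"] in (3, 4, 5):
            return job
        if attempt + 1 < max_attempts:
            time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} queries")
```

Passing the transport in as a parameter keeps the loop independent of any particular HTTP library and makes it straightforward to exercise with a stub.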

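3. Passive callback

The platform sends the entire Job structure of a job that has reached a finished state (COMPLETED/ERROR) to the URL given in the callback field when the job was created. For the Job structure, see the active-query examples (under getJobResponse).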
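
On the receiving side, the callback body can be reduced to the few fields most handlers need. The sketch below assumes that the request body is the Job object itself, encoded as UTF-8 JSON; this document does not specify the exact envelope, so verify that against a real callback. The function name summarize_callback is hypothetical.

```python
import json


def summarize_callback(raw_body: bytes) -> dict:
    """Extract the id, outcome, and voice cloning results from a callback body."""
    # Assumption: the body is the Job structure itself, not wrapped in another object.
    job = json.loads(raw_body.decode("utf-8"))
    results = [
        output["smartContentResult"]["voiceCloning"]
        for output in job.get("outputs", [])
        if "voiceCloning" in output.get("smartContentResult", {})
    ]
    return {
        "id": job.get("id"),
        "customId": job.get("customId"),
        "succeeded": job.get("state") == 3,  # 3 = COMPLETED, 4 = ERROR
        "results": results,
    }
```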

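List Models

URL path: /release/music_model

Parameters

Parameter	Required	Type	Description
action	Yes	string	Common parameter; set to ListModels here
secretId	Yes	string	Common parameter; the user's SecretId
secretKey	Yes	string	Common parameter; the user's secretKey
listModelsRequest	Yes	object
- offset	Yes	int	Offset
- limit	Yes	int	Maximum number of records to fetch per request; at most 100

Request example: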

{
  "action": "ListModels",
  "secretId": "{secretId}",
  "secretKey": "{secretKey}",
  "listModelsRequest": {
    "offset": 0,
    "limit": 10
  }
}

Response example:

{
  "requestId": "c9845a99-34e3-4b0f-80f5-f0a2a0ee8896",
  "listModelsResponse": {
    "total": 1,
    "models": [
      {
        "name": "demo",
        "createdAt": "1610513575000"
      }
    ]
  }
}
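
Because limit is capped at 100, fetching a longer model list takes several requests. The sketch below pages through ListModels under two assumptions this document does not spell out: that offset counts records rather than pages, and that total is the overall number of models. The name list_all_models is hypothetical, and send stands for any caller-supplied function that POSTs a JSON body to a path and returns the decoded response.

```python
from typing import Callable, List


def list_all_models(secret_id: str, secret_key: str,
                    send: Callable[[str, dict], dict],
                    page_size: int = 100) -> List[dict]:
    """Fetch every model by advancing offset until total is reached."""
    if not 1 <= page_size <= 100:
        raise ValueError("limit must be between 1 and 100")
    models: List[dict] = []
    offset = 0
    while True:
        body = {
            "action": "ListModels",
            "secretId": secret_id,
            "secretKey": secret_key,
            "listModelsRequest": {"offset": offset, "limit": page_size},
        }
        response = send("/release/music_model", body)["listModelsResponse"]
        page = response.get("models", [])
        models.extend(page)
        offset += len(page)
        # Stop on an empty page as well, so a wrong total cannot loop forever.
        if not page or offset >= int(response.get("total", 0)):
            return models
```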