Music Tag

Feature Introduction

The Music Tag capability analyzes music along multiple dimensions and automatically generates classification tags, each with a probability value, for every dimension. Seven dimensions are currently supported:

  • Vocal
  • Mood
  • Language
  • Genre
  • Scene
  • Background Music
  • Intensity

API Description

Request Method:

POST (HTTP)

Request URL:

https://api.mediax.tencent.com/job

Request Header:

Content-Type: application/json

Request Flow:

The API provides two operations: 'Create Task' and 'Query Task'. After creating a task, you can either query it actively to obtain the result, or supply a callback address when creating the task, in which case the platform calls back that address automatically once the task completes.
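
The request method, URL, and header above can be combined into one small helper. The following is a minimal Python sketch rather than an official client: the names build_request and send are hypothetical, and it assumes every response body is JSON, as in the examples in this document.

```python
import json
import urllib.request

# Request URL as given in this document.
API_URL = "https://api.mediax.tencent.com/job"


def build_request(action, body_key, body, secret_id, secret_key):
    """Build (but do not send) a POST request for one API action.

    body_key is the action-specific field name, for example
    "createJobRequest" for CreateJob or "getJobRequest" for GetJob.
    """
    payload = {
        "action": action,
        "secretId": secret_id,
        "secretKey": secret_key,
        body_key: body,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.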

Other Requirements:

File format: common audio formats, such as mp3 and wav, are recommended.

Create Task

Parameter Description

Parameter | Required | Type | Description
action | Yes | string | Common parameter; set to CreateJob
secretId | Yes | string | Common parameter; the user's SecretId
secretKey | Yes | string | Common parameter; the user's SecretKey
createJobRequest | Yes | object |
- inputs | Yes | Array of Input | Array of Input structures
- outputs | Yes | Array of Output | Array of Output structures
- callback | No | string | Callback address; default: callback disabled
- customId | No | string | User-defined task ID, less than 64 characters
- timeout | No | int | Task timeout in seconds; the task is set to ERROR after the timeout

Input

Parameter | Required | Type | Description
url | No | string | Source URL; mutually exclusive with the source field
source | No | object | Repository source settings; mutually exclusive with the url field
- contentId | Yes | string | Repository ID
- path | Yes | string | Source path

Note: An Input must contain exactly one of the source and url fields.
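
The either/or rule for url and source is easy to check on the client before a request is sent. Below is a minimal Python sketch of such a check; the helper name make_input is hypothetical, and the error behavior is an assumption about what a client might want, not something the API prescribes.

```python
def make_input(url=None, content_id=None, path=None):
    """Return one Input structure holding either url or source, never both."""
    has_url = url is not None
    has_source = content_id is not None or path is not None
    if has_url == has_source:
        # Either both were given or neither was given.
        raise ValueError("provide exactly one of url or source")
    if has_url:
        return {"url": url}
    if content_id is None or path is None:
        # Both sub-fields of source are marked as required.
        raise ValueError("source requires both contentId and path")
    return {"source": {"contentId": content_id, "path": path}}
```

For example, make_input(url="https://example.com/song.mp3") yields a URL input, while make_input(content_id="repo", path="a/b.mp3") yields a repository input.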

Output

Parameter | Required | Type | Description
inputSelectors | Yes | Array of int | Input source for this output
smartContentDescriptor | Yes | object | Smart capability description, default: empty
- musicTagV2 | Yes | object |
-- tagType | Yes | Array of string | Tag dimensions to call; see the input range and description below

tagType input range: ["vocal", "language", "mood", "genre", "scene", "bgm", "intensity", "all"]

tagType input description:

  • all calls all types of tags
  • Dimension names can be combined to call the tags of any set of sub-dimensions
  • The all option and the sub-options cannot appear together

tagType values:

  • vocal: 2 types of vocal tags
  • language: 21 types of language tags
  • mood: 11 types of mood tags
  • genre: 37 types of genre tags
  • scene: 36 types of scene tags
  • bgm: 2 types of background music tags
  • intensity: 2 types of intensity tags
  • all: tags for all 7 dimensions
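
The tagType rules above (a fixed set of values, with all excluding every other option) can be validated before a task is created. This is a minimal Python sketch; the function name check_tag_types is hypothetical, and rejecting an empty list is an assumption rather than documented behavior.

```python
DIMENSIONS = ("vocal", "language", "mood", "genre", "scene", "bgm", "intensity")


def check_tag_types(tag_types):
    """Validate a tagType list against the rules in the parameter table."""
    tags = list(tag_types)
    if not tags:
        # Assumption: an empty list is treated as a client-side error.
        raise ValueError("tagType must not be empty")
    unknown = [t for t in tags if t != "all" and t not in DIMENSIONS]
    if unknown:
        raise ValueError(f"unsupported tagType values: {unknown}")
    if "all" in tags and len(tags) > 1:
        raise ValueError('"all" cannot be combined with other tagType values')
    return tags
```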

Request Example:

{
  "action": "CreateJob",
  "secretId": "{secretId}",
  "secretKey": "{secretKey}",
  "createJobRequest": {
    "customId": "{customId}",
    "callback": "{callback}",
    "inputs": [
      {
        "url": "{url}"
      }
    ],
    "outputs": [
      {
        "inputSelectors": [0],
        "smartContentDescriptor": {
          "musicTagV2": {
            "tagType": ["mood", "vocal", "language", "scene", "genre", "bgm"]
          }
        }
      }
    ]
  }
}
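
The request body above can also be assembled programmatically. The sketch below is one possible Python helper, assuming a single URL input and a single output; the name build_create_job is hypothetical, and optional fields are omitted when not supplied.

```python
def build_create_job(secret_id, secret_key, url, tag_types,
                     custom_id=None, callback=None, timeout=None):
    """Assemble a CreateJob body for one URL input and one musicTagV2 output."""
    if custom_id is not None and len(custom_id) >= 64:
        # The parameter table requires customId to be less than 64 characters.
        raise ValueError("customId must be less than 64 characters")
    request = {
        "inputs": [{"url": url}],
        "outputs": [
            {
                "inputSelectors": [0],  # refers to the first (only) input
                "smartContentDescriptor": {
                    "musicTagV2": {"tagType": list(tag_types)}
                },
            }
        ],
    }
    # Optional fields are added only when a value was supplied.
    optional = {"customId": custom_id, "callback": callback, "timeout": timeout}
    request.update({k: v for k, v in optional.items() if v is not None})
    return {
        "action": "CreateJob",
        "secretId": secret_id,
        "secretKey": secret_key,
        "createJobRequest": request,
    }
```

Serializing the returned dictionary with json.dumps gives a body of the same shape as the request example.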

Response Example:

{
  "requestId": "ac004192-110b-46e3-ade8-4e449df84d60",
  "createJobResponse": {
    "job": {
      "id": "13f342e4-6866-450e-b44e-3151431c578b",
      "state": 1, // See state description below
      "customId": "{customId}",
      "callback": "{callback}",
      "inputs": [
        {
          "url": "{url}"
        }
      ],
      "outputs": [
        {
          "inputSelectors": [0],
          "smartContentDescriptor": {
            "musicTagV2": {
              "tagType": ["mood", "vocal", "language", "scene", "genre", "bgm"]
            }
          }
        }
      ],
      "timing": {
        "createdAt": "1603432763000",
        "startedAt": "0",
        "completedAt": "0"
      }
    }
  }
}
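
The timing values in the response look like Unix timestamps in milliseconds encoded as strings, with "0" for a stage that has not been reached yet. That reading is an assumption drawn from the example values, not something this document states. Under that assumption, a minimal Python sketch for converting them (parse_timing is a hypothetical name):

```python
from datetime import datetime, timezone


def parse_timing(timing):
    """Convert millisecond-epoch strings to UTC datetimes; "0" becomes None."""
    parsed = {}
    for name, raw in timing.items():
        millis = int(raw)
        parsed[name] = (
            datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
            if millis
            else None
        )
    return parsed
```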

State

Value | Meaning
1 | SUBMITTED
2 | PROCESSING
3 | COMPLETED
4 | ERROR
5 | CANCELED
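
In client code, the numeric state values are easier to work with as a named enumeration. A minimal Python sketch follows; the names JobState and is_finished are hypothetical, and treating CANCELED as a finished state alongside COMPLETED and ERROR is an assumption.

```python
from enum import IntEnum


class JobState(IntEnum):
    SUBMITTED = 1
    PROCESSING = 2
    COMPLETED = 3
    ERROR = 4
    CANCELED = 5


# Assumption: CANCELED is also final; this document names only
# COMPLETED and ERROR as completion states.
FINISHED_STATES = frozenset(
    {JobState.COMPLETED, JobState.ERROR, JobState.CANCELED}
)


def is_finished(job):
    """Return True when the job's state will no longer change."""
    return JobState(job["state"]) in FINISHED_STATES
```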

Get Task Information

There are two ways to get information: active query and passive callback.

  • Active query offers two interfaces, depending on which ID is used. The first queries by the user-defined customId; because the platform cannot guarantee that this ID is unique, it returns an array of Jobs (see "Active Query, Based on User-Defined customId Passed When Creating Task"). The second queries by the id returned in the response to task creation (see "Active Query, Based on id in Response After Creating Task").
  • Passive callback requires the callback field to be filled in when the task is created. After the task enters a completion state (COMPLETED/ERROR), the platform sends the Job structure to the address specified by callback (see "Passive Callback"). The platform recommends passive callback for getting task results.

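In the music tag capability, when the queried task has succeeded (state=3), the task's Output carries a smartContentResult structure. As the response examples below show, the result holds one object per tag dimension (vocal, language, mood, genre, scene, and so on), each with a tag name (label) and a probability (prob). A dimension without a result may appear as an empty object, as intensity does in the examples.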
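
Reading the tags out of a finished Job amounts to walking its outputs. The Python sketch below follows the key names used in the response examples in this document, including the outer key spelled musigTagV2; it also accepts musicTagV2 for that outer key, which is an assumption. The function name extract_tags is hypothetical.

```python
def extract_tags(job):
    """Return one {dimension: (label, probability)} dict per output."""
    results = []
    for output in job.get("outputs", []):
        result = output.get("smartContentResult", {})
        # The examples spell the outer key "musigTagV2"; the alternative
        # spelling is accepted here as an assumption.
        wrapper = result.get("musigTagV2") or result.get("musicTagV2") or {}
        tags = wrapper.get("musicTagV2", {})
        found = {}
        for dimension, entry in tags.items():
            if entry and "label" in entry:
                # prob is a string such as "0.7710" in the examples.
                found[dimension] = (entry["label"], float(entry["prob"]))
        results.append(found)
    return results
```

Dimensions returned as empty objects are skipped, so the result contains only dimensions that carry a label.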

Active Query, Based on User-Defined customId Passed When Creating Task

Request Example:

{
  "action": "ListJobs",
  "secretId": "{secretId}",
  "secretKey": "{secretKey}",
  "listJobsRequest": {
    "customId": "{customId}"
  }
}

Note: The jobs field in the listJobs response is an array.

Response Example:

{
  "requestId": "c9845a99-34e3-4b0f-80f5-f0a2a0ee8896",
  "listJobsResponse": {
    "jobs": [
      {
        "id": "a95e9d74-6602-4405-a3fc-6408a76bcc98",
        "state": 3,
        "customId": "{customId}",
        "callback": "{callback}",
        "timing": {
          "createdAt": "1610513575000",
          "startedAt": "1610513575000",
          "completedAt": "1610513618000"
        },
        "inputs": [
          {
            "url": "{url}"
          }
        ],
        "outputs": [
          {
            "inputSelectors": [0],
            "smartContentDescriptor": {
              "musicTagV2": {
                "tagType": ["mood", "vocal", "language", "scene", "genre", "bgm"]
              }
            },
            "smartContentResult": {
              "musigTagV2": {
                "musicTagV2": {
                  "vocal": {
                    "label": "Vocal",
                    "prob": "1.0000"
                  },
                  "language": {
                    "label": "Other",
                    "prob": "0.0817"
                  },
                  "mood": {
                    "label": "Exciting",
                    "prob": "0.2953"
                  },
                  "genre": {
                    "label": "Zhuang",
                    "prob": "0.7710"
                  },
                  "scene": {
                    "label": "Prayer",
                    "prob": "0.8886"
                  },
                  "intensity": {}
                }
              }
            }
          }
        ]
      }
    ],
    "total": 1
  }
}
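
Because customId is not guaranteed to be unique, a ListJobs response may contain several jobs. One way to handle this is to sort them by creation time and take the newest. The Python sketch below assumes createdAt is a millisecond timestamp encoded as a string, as the example values suggest; the function name jobs_newest_first is hypothetical.

```python
def jobs_newest_first(list_response):
    """Return the jobs from a ListJobs response, newest createdAt first."""
    jobs = list_response.get("listJobsResponse", {}).get("jobs", [])
    return sorted(
        jobs,
        key=lambda job: int(job["timing"]["createdAt"]),
        reverse=True,
    )
```

The first element of the returned list is then the most recently created job for that customId, and an empty list means no job matched.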

Active Query, Based on id in Response After Creating Task

Request Example:

{
  "action": "GetJob",
  "secretId": "{secretId}",
  "secretKey": "{secretKey}",
  "getJobRequest": {
    "id": "{id}"
  }
}

Response Example:

{
  "requestId": "c9845a99-34e3-4b0f-80f5-f0a2a0ee8896",
  "getJobResponse": {
    "job": {
      "id": "a95e9d74-6602-4405-a3fc-6408a76bcc98",
      "state": 3,
      "customId": "{customId}",
      "callback": "{callback}",
      "timing": {
        "createdAt": "1610513575000",
        "startedAt": "1610513575000",
        "completedAt": "1610513618000"
      },
      "inputs": [
        {
          "url": "{url}"
        }
      ],
      "outputs": [
        {
          "inputSelectors": [0],
          "smartContentDescriptor": {
            "musicTagV2": {
              "tagType": ["mood", "vocal", "language", "scene", "genre", "bgm"]
            }
          },
          "smartContentResult": {
            "musigTagV2": {
              "musicTagV2": {
                "vocal": {
                  "label": "Vocal",
                  "prob": "1.0000"
                },
                "language": {
                  "label": "Other",
                  "prob": "0.0817"
                },
                "mood": {
                  "label": "Exciting",
                  "prob": "0.2953"
                },
                "genre": {
                  "label": "Zhuang",
                  "prob": "0.7710"
                },
                "scene": {
                  "label": "Prayer",
                  "prob": "0.8886"
                },
                "intensity": {}
              }
            }
          }
        }
      ]
    }
  }
}
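
Active query usually means polling GetJob until the task finishes. A minimal Python sketch of such a loop is shown below. The function that performs the GetJob call is passed in as fetch_job (a hypothetical callable returning the job object from getJobResponse), so the loop itself makes no assumptions about the HTTP client. The polling interval, the overall wait limit, and treating state 5 (CANCELED) as final are all assumptions.

```python
import time


def wait_for_job(fetch_job, job_id, interval=5.0, max_wait=600.0,
                 sleep=time.sleep):
    """Poll fetch_job(job_id) until the job reaches a final state."""
    deadline = time.monotonic() + max_wait
    while True:
        job = fetch_job(job_id)
        # 3 = COMPLETED, 4 = ERROR, 5 = CANCELED (assumed to be final).
        if job["state"] in (3, 4, 5):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish in {max_wait}s")
        sleep(interval)
```

The caller should still inspect the returned state, since a job that ended in ERROR is returned rather than raised.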

Passive Callback

When a task enters a completion state (COMPLETED/ERROR), its entire Job structure is sent to the address given in the callback field that the user specified when creating the task. For the Job structure, see the job object under getJobResponse in the active query example.
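
A callback receiver can be sketched with Python's standard library. This document does not specify the HTTP method or body format of the callback, so the sketch assumes an HTTP POST whose body is the Job structure as a JSON object; the names parse_callback, CallbackHandler, and serve are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_callback(raw_body):
    """Decode a callback body (assumed: JSON Job object) into (id, state)."""
    job = json.loads(raw_body)
    return job["id"], job["state"]


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            job_id, state = parse_callback(self.rfile.read(length))
        except (ValueError, KeyError, TypeError):
            # Malformed JSON or a body without the expected fields.
            self.send_response(400)
            self.end_headers()
            return
        print(f"job {job_id} finished with state {state}")
        self.send_response(200)
        self.end_headers()


def serve(port=8080):
    """Run the receiver until interrupted."""
    HTTPServer(("", port), CallbackHandler).serve_forever()
```

Calling serve() starts the receiver; the address passed in the callback field when creating the task must then be reachable by the platform.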

cURL Example

Create Task

curl --location --request POST 'https://api.mediax.tencent.com/test/job' \
--header 'Content-Type: application/json' \
--data-raw '{
    "action": "CreateJob",
    "secretId": "{secretId}",
    "secretKey": "{secretKey}",
    "createJobRequest": {
        "customId": "{customId}",
        "callback": "{callback}",
        "inputs": [{
                "url": "{url}"
            }],
        "outputs": [
            {
                "inputSelectors": [0],
                "smartContentDescriptor": {
                    "musicTagV2": {
                        "tagType": ["mood", "vocal", "language", "scene", "genre", "bgm"]
                    }
                }
            }]
    }
}'

Query Task

# Based on customId
curl --location --request POST 'https://api.mediax.tencent.com/test/job' \
--header 'Content-Type: application/json' \
--data-raw '{
    "action": "ListJobs",
    "secretId": "{secretId}",
    "secretKey": "{secretKey}",
    "listJobsRequest": {
        "customId": "{customId}"
    }
}'

# Based on id
curl --location --request POST 'https://api.mediax.tencent.com/test/job' \
--header 'Content-Type: application/json' \
--data-raw '{
    "action": "GetJob",
    "secretId": "{secretId}",
    "secretKey": "{secretKey}",
    "getJobRequest": {
        "id": "{id}"
    }
}'

Appendix: Currently Supported Tags

Music Tag Model List

Tag Model | Types | Tag Field
vocal | 2 | Vocal, Instrumental
language | 21 | Russian, Indonesian, Mandarin, German, Italian, Japanese, French, Thai, Javanese, Swedish, Cantonese, Instrumental, Burmese, English, Portuguese, Spanish, Vietnamese, Arabic, Korean, Malay, Other
mood | 11 | Sad, Exciting, Quiet, Catharsis, Romantic, Happy, Nostalgic, Horror, Magnificent, Empty, Funny
genre | 37 | 8bit Pixel Style, ACG, Bossa, Chinese Christian Songs, Chinese Zen, Ancient Style, Country, Buddhism, Children, Caribbean Style, Indian, Classical, Costume Original Soundtrack, Alternative, Epic, Hip Hop, Zhuang, Brazilian, Blues, Yi, Latin, Rock, Punk, Folk, Thai, Jazz, Electronic, Hardcore, Dance, R&B, Mongolian, Tibetan, Easy Listening, Metal, Northern Shaanxi Folk Songs, Reggae, K-Pop
scene | 25 | Eastern Zen, Chinese Style, ACG, Fitness, Pixel Sound Effects, Nursery Rhymes, Music Box, Coffee Shop, Festive, Nightclub, Grand Chorus, Pets, Love Song, Travel, Campus, Opera, Game, Yoga, Before Sleep, Prayer, Pure Nature, Aerial Photography, Figure Skating, Grassland, Running, Piano Pure Tone
bgm | 2 | With Background Music, No Background Music
intensity | 2 | Exciting, Soothing
Tencent Media Lab