Summarizing Generic Transcripts in Wordcab

In this guide, you'll learn how to upload and summarize a generic transcript using the Wordcab API.

Generic Transcript Format

The format for a "generic" transcript is simple: each line contains a speaker label, followed by a colon, and then what was said. Timestamps are optional. If you include them, write them in brackets in the format [00:01:23 --> 00:01:25]; they can go anywhere on the same line as the speaker label and utterance. For example, a conversation between John and Lauren would look like this:

[00:01:23 --> 00:01:25] John: Hey Lauren, how are you?
[00:01:26 --> 00:01:29] Lauren: I'm great, just testing out this new tool called Wordcab!
[00:01:30 --> 00:01:32] John: Awesome, what does it do?
[00:01:33 --> 00:01:36] Lauren: Pretty much the coolest thing in the universe.
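
Before uploading, it can help to check locally that every line follows this format. The sketch below is one way to do that. It is not Wordcab's own parser: parse_generic_line and find_malformed_lines are hypothetical helpers introduced here, and the code assumes timestamps use the HH:MM:SS form shown above.

```python
import re

# Optional timestamp such as [00:01:23 --> 00:01:25]
# (assumes the HH:MM:SS form used in the example transcript)
TIMESTAMP_RE = re.compile(r"\[(\d{2}:\d{2}:\d{2})\s*-->\s*(\d{2}:\d{2}:\d{2})\]")


def parse_generic_line(line):
    """Split one transcript line into (speaker, utterance, start, end)."""
    match = TIMESTAMP_RE.search(line)
    start, end = (match.group(1), match.group(2)) if match else (None, None)
    # Remove the timestamp first so its colons are not mistaken
    # for the colon that follows the speaker label.
    text = TIMESTAMP_RE.sub("", line).strip()
    speaker, sep, utterance = text.partition(":")
    if not sep or not speaker.strip() or not utterance.strip():
        raise ValueError(f"Expected 'Speaker: utterance', got: {line!r}")
    return speaker.strip(), utterance.strip(), start, end


def find_malformed_lines(text):
    """Return (line_number, line) pairs for non-blank lines that fail to parse."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # Blank lines are skipped in this sketch
        try:
            parse_generic_line(line)
        except ValueError:
            problems.append((number, line))
    return problems


sample = (
    "[00:01:23 --> 00:01:25] John: Hey Lauren, how are you?\n"
    "[00:01:26 --> 00:01:29] Lauren: I'm great, just testing out this new tool called Wordcab!\n"
)
print(find_malformed_lines(sample))
```

An empty list means every non-blank line has a speaker label, a colon, and an utterance. The check is deliberately strict about the timestamp shape, so loosen the regular expression if your timestamps differ.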

Your transcript should be saved as a .txt file. We can now move on to the summarization.

Using the API to Summarize Your Transcript

Uploading your generic .txt transcript with Python is straightforward. Be sure to include the source query parameter and set its value to "generic". For more details on the parameters, log in to your Wordcab account and check out your personalized Getting Started guide.

import requests

endpoint = "https://wordcab.com/api/upload/"  # <= Trailing slash required
apikey = "YOUR_API_KEY"

file_name = "your_transcript.txt"
file_path = f"/path/to/your/file/{file_name}"

display_name = "Transcript Example"
source = "generic"  # <= Marks the upload as a generic transcript
only_api = "false"  # <= "false" is the default only_api value

params = {
    "apikey": apikey,
    "display_name": display_name,
    "source": source,
    "only_api": only_api,
}

# Open the file in a with block so it is closed once the upload finishes
with open(file_path, "rb") as transcript_file:
    files = {"transcript": transcript_file}
    r = requests.post(endpoint, params=params, files=files)

r.raise_for_status()  # Raise an error if the server returned a 4xx or 5xx status
job_name = r.json()["job_name"]

print(job_name)

You'll receive a job_name that you can use to poll your summarization job with the GET request below. Once the job is complete, that same request returns a summary, the transcript, and other useful data.

import requests

# apikey and job_name come from the upload step above

endpoint = "https://wordcab.com/api/upload"  # <= Trailing slash optional
# OR endpoint = "https://wordcab.com/api/meetings"

params = {
    "apikey": apikey,
    "job_name": job_name,
    "summary_len": 3,  # <= Defaults to 3, accepts values from 1 to 5
}

r = requests.get(endpoint, params=params)
print(r.json())
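
Since the job may not be finished on the first request, you will usually want to repeat it until the summary is ready. The following is a minimal sketch of such a polling loop, not an official Wordcab client. poll_until and has_summary are hypothetical helpers, and the completion check assumes that a finished job's JSON contains a "summary" key; inspect a real response from your account and adjust that check as needed.

```python
import time


def poll_until(fetch, is_done, interval=5.0, max_attempts=60, sleep=time.sleep):
    """Call fetch() repeatedly until is_done(result) is true, then return the result."""
    for attempt in range(1, max_attempts + 1):
        result = fetch()
        if is_done(result):
            return result
        if attempt < max_attempts:
            sleep(interval)  # Wait before asking again
    raise TimeoutError(f"Job not finished after {max_attempts} attempts")


def has_summary(payload):
    # ASSUMPTION: a finished job's JSON contains a "summary" key.
    # This is a guess based on the guide's wording, not a documented field.
    return isinstance(payload, dict) and "summary" in payload


# Usage with the GET request shown above (endpoint and params as defined there):
#   result = poll_until(
#       lambda: requests.get(endpoint, params=params).json(),
#       has_summary,
#   )
#   print(result)
```

Passing the request in as a function keeps the loop independent of the HTTP library, and the attempt limit prevents a stuck job from blocking your script forever.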

Aleks Smechov

Founder at Wordcab