My Ideal Speech-to-Text Transcription Workflow
This is about the speech-to-text transcription workflow. Ideally, I would be able to transcribe a clip rather than a sequence, and the transcription would live on the clip, because I'm using transcription to get a script I can use to cut down a video, not to create captions for export.
After I create a transcription for a clip and apply it as subtitles to the clip, I would like one of two options. The first is to export the subtitles to a CSV file, like I can do with markers, which would include both the text and the timecodes for each subtitle. I would import this into a spreadsheet that I would then use to put together a paper cut for the interview. This is essentially how I work now: I export the markers I create while watching down a clip.
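As a rough illustration of the CSV I have in mind, here is a minimal Python sketch. It assumes the subtitles have already been saved as a standard SRT file, which is my assumption for the example and not something described above. The function names `srt_to_rows` and `rows_to_csv` are made up for this sketch. Note that SRT stores times as hours:minutes:seconds,milliseconds rather than frame-based timecode, so the values would still need converting to match a clip's timecode.

```python
import csv
import io
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,200".
TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_rows(srt_text):
    """Return a list of (start, end, text) tuples, one per subtitle cue."""
    rows = []
    # Cues in an SRT file are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        for i, line in enumerate(lines):
            m = TIMING.match(line)
            if m:
                start = f"{m.group(1)}.{m.group(2)}"
                end = f"{m.group(3)}.{m.group(4)}"
                # Everything after the timing line is the subtitle text.
                text = " ".join(lines[i + 1:])
                rows.append((start, end, text))
                break
    return rows


def rows_to_csv(rows):
    """Render the rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["start", "end", "text"])
    writer.writerows(rows)
    return out.getvalue()
```

Calling `rows_to_csv(srt_to_rows(text))` on the contents of an SRT file gives one spreadsheet row per subtitle, which is the same shape as the marker export I use for paper cuts today.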
Or, whether the transcription is tied to a sequence or to the clip, I would want to be able to take the subtitles it creates and convert them to markers on the clip or on the sequence. Then I would export the markers just like I do now with the markers I create manually, with each marker tied to a timecode.
Really great work on this so far. It's so close to being exactly what I need to really supercharge my workflow. Thanks!!
Daniel DeStefano commented
One more thing: I see that the transcription breaks up the interview into paragraphs, but I'm not seeing any rhyme or reason to the paragraph breaks. In my sequence, I have long periods of monologue, then long periods of silence, then the speaker speaks again. If the transcription used those long periods of silence to place paragraph breaks, that would be ideal. I'm not sure what determines the paragraph breaks right now, but basing them on where the big silences are would be perfect.
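To make the silence idea concrete, here is a minimal Python sketch of the grouping I'm picturing. It is not how the transcription currently decides on paragraphs, which I don't know. It assumes each transcribed segment comes with start and end times in seconds, and the function name `group_by_silence` and the 2-second `min_gap` default are placeholders I picked for the example.

```python
def group_by_silence(segments, min_gap=2.0):
    """Group (start_sec, end_sec, text) segments into paragraphs.

    A new paragraph starts whenever the silence between the end of one
    segment and the start of the next is at least min_gap seconds.
    Returns one (start_sec, end_sec, text) tuple per paragraph.
    """
    paragraphs = []
    current = []
    prev_end = None
    for start, end, text in segments:
        # A long enough gap since the previous segment closes the paragraph.
        if current and start - prev_end >= min_gap:
            paragraphs.append(current)
            current = []
        current.append((start, end, text))
        prev_end = end
    if current:
        paragraphs.append(current)
    # Collapse each group into a single timed paragraph.
    return [
        (group[0][0], group[-1][1], " ".join(t for _, _, t in group))
        for group in paragraphs
    ]
```

With a threshold like this, a long monologue stays together as one paragraph, and the next paragraph only begins once the speaker starts again after a big silence.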