When editing the generated transcript, the same annoyance as in captions editing applies: you can't tab through sections of text. Also frustrating is that double-clicking into a text section selects the entire section, not just the word that was clicked on. Edits take much longer when you have to find the offending spelling/word again (since things have now shifted) and then reposition the cursor before you can start fixing the error.
Why does the transcript need to show the timings on the side and break up the text by time? The exported transcript doesn't show those things. Editing the text while it is broken up like this is a bit of a hassle.
Geoffrey Knecht commented
I totally agree. Text needs to be easier to work with.