My day job: TVNZ captioning producer
While this site is largely a hub for my journalistic endeavours, I thought I’d take some time to write a little about what I currently do for a living. For eight hours a day, five (sometimes six) days a week, I’m a captioning producer in TVNZ’s Production Services department.
In a nutshell, I (along with a few others in our team of around 20) transcribe the audio of a number of television programmes broadcast on TV ONE, TV2, TV3, TVNZ7 and TVNZ Heartland.
The captions can be activated via Teletext or with the press of a button on the remote of some Freeview-enabled TV sets or set-top boxes. Funded entirely by New Zealand on Air, captions are provided mostly for hearing-impaired viewers.
That said, many others also find them rather useful, including those who speak English as a second language, those learning to read, those who wish to watch TV without disturbing others, or even simply those who watch TV in a noisy environment.
There’s a little more to it than simply transcribing the audio verbatim, though. Because humans speak (much) faster than they can read, we often have to edit the dialogue down so that the text doesn’t simply flash on and off the screen before the viewer has a chance to read it. Our goal is to retain as much meaning as possible from any given dialogue or other important audio while ensuring it remains on screen long enough for viewers to absorb that meaning. There’s a little bit of a science to it, but essentially each and every caption must adhere to both a particular words-per-minute ratio and a minimum on-screen duration. This is why (if you’ve ever watched programming with captions before) often the on-screen dialogue doesn’t match what is spoken word for word.
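For the technically curious, here is a minimal sketch of how those two rules could be checked for a single caption. It is only an illustration, not the software we use: the function names are made up, and the limits `MAX_WPM` and `MIN_DURATION_S` are placeholder values I have assumed for the example rather than our actual figures.

```python
# Placeholder limits, assumed purely for illustration (not real TVNZ figures).
MAX_WPM = 180.0        # highest reading rate allowed for one caption
MIN_DURATION_S = 1.5   # shortest time a caption may stay on screen, in seconds


def reading_rate_wpm(text, duration_s):
    """Words per minute a viewer must read to finish the caption in time."""
    return len(text.split()) / duration_s * 60.0


def check_caption(text, start_s, end_s,
                  max_wpm=MAX_WPM, min_duration_s=MIN_DURATION_S):
    """Return a list of problems; an empty list means both rules are met."""
    duration_s = end_s - start_s
    problems = []
    if duration_s < min_duration_s:
        problems.append("on screen too briefly")
    # Guard against a zero or negative duration before dividing.
    if duration_s > 0 and reading_rate_wpm(text, duration_s) > max_wpm:
        problems.append("reading rate too high")
    return problems


# A verbatim line squeezed into two seconds fails the rate check...
verbatim = "I really do not think that is a very good idea at all"
print(check_caption(verbatim, 0.0, 2.0))

# ...while an edited version that keeps the meaning passes.
print(check_caption("That's a bad idea", 0.0, 2.0))
```

When a caption fails the rate check, the options are to trim the wording or to keep it on screen longer, which is exactly the trade-off described above.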
On top of this, we colour-code our captions and position them on screen for each respective speaker for any content that we produce from scratch. The captioning team also goes to great lengths to ensure accuracy in terms of referencing, spelling and grammar; we’ll cross-reference all names, terms and anything else that we encounter as we produce the captions (using a library of resources), and the file is peer-edited before transmission.
You might occasionally see captions for some shows that don’t completely adhere to the aforementioned standards (often you’ll notice that the text is entirely in uppercase for such shows). Generally, these are caption files that have already been produced by overseas captioning teams. While we do review these files, we can’t spend too much time on them due to the sheer volume of content that we process. So aside from some basic checking to ensure that they reach a certain quality standard (such as localised spelling), we don’t alter these files drastically.
Finally, perhaps one of the more interesting and exciting aspects of the job is the live captioning component. We produce captions for some live-to-air programming (such as the Midday, 6pm and Tonight news bulletins on TV ONE, and also Close Up), including the odd special event such as breaking news, election coverage, budget announcements and more.
This is definitely the most fast-paced and hectic part of the job. Working in teams of two, those producing captions for any given live programming have access to the news story scripts as the journalists are putting them together via a special program. We start working on the captions for them once the stories have reached a certain completion level, and they’re eventually checked against the accompanying video footage as soon as the clips have been cut.
Bear in mind that this all takes place within (at most) 2.5 hours before the news bulletin goes to air, so some of this can get fairly tight! In fact, we’ll generally still be working on some captions while the news bulletin is being broadcast. Once the news bulletin begins, one of the captioners will manually send each individual caption live while their teammate keeps an eye on the line-up and ties up any loose ends.
But there’s only so much we can prepare for in any given live broadcast, and there’s one aspect that we virtually can’t prepare for whatsoever – live crosses and interviews! Because these are generally ad-libbed, we have to type this dialogue live, as it is spoken. Let’s just say that the ability to type very quickly is mandatory!
For extended live interviews that often feature multiple speakers (such as those on Close Up, which can often be around the four-minute mark and include up to four participants), both captioners will actually work together as a team to caption the dialogue live. This is done with a dual-QWERTY keyboard set-up; one captioner will begin typing the dialogue, and the other will pick up at a logical point while the first is still typing. Meanwhile, each captioner keeps an eye on what their partner is typing on screen (all the while typing their own text) in a bid to avoid double-ups. It sounds complicated, and it kinda is, but the pair does tend to slip into a kind of rhythm that just seems to work. Again, live captions are not verbatim transcripts; the producers will essentially make judgement calls on the spot to paraphrase what has just been said. During these live interviews, the occasional typo may slip through, too, but I think that’s forgivable given the circumstances!
So that’s a very basic overview of what I do for a day job (sometimes a night job on account of the various shifts we work). It’s a fantastic job with a great team, and it’s also very flexible, enabling me to fit in my freelance writing work around it. It’s actually fairly difficult to succinctly communicate exactly what’s involved, and so I hope this post aids your understanding somewhat!

Good article but the captioning techniques are archaic! Nothing is better, faster, more high-tech than a professional stenocaptioner! One person versus all your “typists.” We all know the average speaker speaks at a rate of 180 wpm….captions for HOH population should be verbatim, not edited! Ridiculous!
Hi there, Jandriene. Thanks for your comment.
It’s true that the average person speaks at a rate that’s around 180wpm. However, one of the reasons TVNZ Access Services opted for the “edited” approach is that people generally can’t read that fast. This is further complicated when there are multiple on-screen speakers (especially, say, when there’s heated dialogue in a soap opera!). If we were to caption that dialogue verbatim, most of the text would flash on and off the screen before most viewers were able to read it.
The guiding vision when composing these captions is to retain as much meaning as possible while keeping below a certain wpm rate. It’s an approach that’s very popular with viewers, based on survey feedback.
I should probably also point out that this post is close to two years old, and not only do I no longer work at TVNZ Access Services, but I now live in another country. I know that, shortly before I left that job, the Access Services team implemented voice-recognition re-speaking technology for captioning “live” segments on news bulletins.