The steps needed to keep all departments in sync.
Although it is common for productions to cut corners, the post-production pipeline should never be ignored. If it is, the consequences can easily be overlooked during finalization and only realized once the project has been mastered and viewed by audiences. Repercussions can ripple throughout the crew. That is why it is important to consider the following standards.
Save the best for last.
Picture lock is a stage in editing a film or television production. It is the stage prior to online editing when all changes to the film or television program cut have been made and approved. The cut is then sent to subsequent stages in the process, such as online editing and audio mixing. Any last-minute changes can force portions of subsequent work to be redone. No project should be passed to the audio department without a confirmed picture lock. Doing so can lead not only to missed deadlines but also to wasted hours of work. Below are some noteworthy tips to consider when locking to picture.
In many cases picture locks contain black screen inserts with basic white text that reads: 'VFX insert'. As the VFX team works through its pipeline of developing the scene, it is common for the audio department to receive multiple updates of the project media throughout the process, from concept art, animatics, rigging, baked animations, and composited scenes to the completed render.
Coloring & Compositing
One thing that is never done before picture lock is coloring and compositing, especially when there is a strict deadline. Even if a scene requires a special visual that the sound must highlight, it is up to the director to make sure these notes are given to the sound department prior to the commencement of audio post-production.
Whether it's 2D or 3D, the first video delivery to the sound department always has the most basic font and text. As with VFX, video updates are sent throughout the progress of post-production.
Post-prep edit timeline
Professional industry standards.
It is common for video editors either to overwork their sound section by adding expendable clips and deleting audio files that are essential to the sound department, or to neglect the audio section entirely and leave multiple redundant regions with out-of-sync clips. The best approach is a balance: knowing what to add, what to delete, and how to organize it all for the audio post-production pipeline so as to meet the required deadlines.
Cues, edits, and volume changes should be minimal. Effects should never be placed unless for reference (e.g., radio, exterior distance). It is important to note that the video editor should make only very minor adjustments to music, as it is merely a reference track for the sound department.
As with music, the importing of sound effects should be minimal, as should cues, edits, and volume changes. Editors can use .mp3 files or low-quality clips from other projects, but must understand that those formats are to be used strictly as a reference for the audio department. If audio files meet the quality requirements (typically 48 kHz, 24-bit, as per the original file source), then as a matter of practice editors should not spend any time perfecting the clips.
Each character/speaker should have their own track. It is important to note that if several mics are being used and only one sounds good, the other clips should not be deleted. Non-usable dialog clips should be placed on separate muted tracks. It is possible that the sound department will need them. Auditory distractions (clicks, pops, humming, etc.) should not render a clip useless, as the audio department can most likely fix them. Overall, the video editor should focus on the visual aspects more than the audio, yet be aware of the sound department's requirements.
Each track should be labeled with a specific title: Dialog (dia or the character's name), Sound Effects (sfx), and Music (msx). All other tracks can hold unused dialog, room tones, and optional references. The better the organization of clips on their specified tracks, the faster the process is for the audio department. Unorganized editing timelines can cost production several hours, if not days, of unnecessary work. A professional video editor practices organization not only for the video but also for the audio post-production pipeline.
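The labeling convention above lends itself to a quick check before a timeline is handed over. The following is only a minimal sketch: it assumes track labels begin with one of the prefixes named above (dia, sfx, msx) or with a character's name. The function names, the `other` bucket, and the sample track names are hypothetical and not part of any editing software.

```python
# Prefixes taken from the labeling convention described in the text.
CATEGORY_PREFIXES = {"dia": "dialog", "sfx": "sound effects", "msx": "music"}


def classify_track(name, character_names=()):
    """Return the category a track label falls under, or 'other'.

    `character_names` is a hypothetical list of speaker names, since dialog
    tracks may be labeled with a character's name instead of 'dia'.
    """
    label = name.strip().lower()
    for prefix, category in CATEGORY_PREFIXES.items():
        if label.startswith(prefix):
            return category
    if any(label.startswith(c.lower()) for c in character_names):
        return "dialog"
    return "other"


def unlabeled_tracks(track_names, character_names=()):
    """List tracks outside the convention so they can be reviewed by hand."""
    return [t for t in track_names
            if classify_track(t, character_names) == "other"]


tracks = ["DIA 1 Anna", "Ben boom", "SFX 1", "MSX temp", "Audio 7"]
print(unlabeled_tracks(tracks, character_names=["Anna", "Ben"]))  # ['Audio 7']
```

A track reported here is not necessarily wrong, since unused dialog, room tones, and references may live on other tracks; the list only points at labels worth a second look.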
Window Burn - Also known as BITC (Burnt-In TimeCode), this is a readable on-screen version of the timecode information for a piece of material, superimposed on the video image from 01:00:00:00 until at least 10 seconds past the last frame of picture.
Title Screen - Between the start of the reel and the bars and tone, post-production usually displays the project title, version number (refer to picture lock), reel number (if a feature film), frame rate, bit depth, and sample rate (hertz). This keeps all departments transparent about project version numbers and technical specifications.
Bars and Tone - Also known as ‘SMPTE color bars’ (Society of Motion Picture and Television Engineers) or ‘Universal Leader’, these are a copyrighted television test pattern used where the NTSC video standard is utilized, including countries in North America. The Society of Motion Picture and Television Engineers refers to the pattern as Engineering Guideline 1-1990. Its components are a known standard. The colors are used to ensure that the television receiver is properly demodulating the 3.58 MHz color subcarrier portion of the signal. The vectors for the -I and +Q blocks should fall exactly on the 'I' and 'Q' axes on the vectorscope if the chrominance signal is demodulated properly. A continuous 1000 Hz audio tone is also sent with the bars before program material, so that receiving stations and intermediary telecommunications providers may adjust their equipment and also assert ownership of the transmission line or medium. Likewise, producers of television programs typically record ‘bars and tone’ at the beginning of a videotape or other recording medium so that the playback equipment can be calibrated. Often, the name or call sign of the TV station, other information such as a real-time clock, or another signal source is graphically superimposed over the bars.
2-pop - A 2-pop is placed at timecode 00:59:58:00, exactly 2 seconds before FFOA (First Frame Of Action), which is at 01:00:00:00. Alternatively, in film post-production the leader starts at 01:00:00:00 (or 0+00 feet if using feet and frames, as is common in the United States), the 2-pop starts at 01:00:06:00 (or 9+00), and FFOA starts at 01:00:08:00 (or 12+00). Lining up the 2-pop with the video is a fast, reliable way to have the entire project in sync. Typically, a 2-pop is placed at the end of a visual countdown leading into the video. As the counter counts down and reaches the 2-second mark, a 1 kHz tone plays for a single frame. Only the first frame of the 2-second lead-in is displayed, followed by black video leading into FFOA. This is especially helpful for playback purposes. In addition to lasting only one frame, the standard (though not a requirement) is a -20 dB volume level for the 1 kHz tone. It’s a simple method to ensure sync between picture and sound in post-production for film, commercial, or television programs. A 2-pop is used anytime sound and picture are handled separately. It is used when sending the audio to the sound department and then back to the video department, where the audio needs to be synchronized again with the picture. It gets its name from the popping sound that such a short burst of tone makes, and from its placement at the 2-second mark before the start of a program.
Tail 2-Pop (with matching flash frames) - Tail pops are placed at the end of a project to detect whether there has been any sync drift. The tail pop is a 4-second-long video that has a pop at 2 seconds, and it is highly recommended for all projects. Similar to the 2-pop, it should be placed after the LFOA (Last Frame Of Action). The common phrase for both the 2-pop and the tail 2-pop is Head & Tail Sync Pop.
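The timecode arithmetic above can be worked through in a few lines. This is a minimal sketch under stated assumptions: non-drop-frame timecode, an integer frame rate (24 fps in the example), and 16 frames per foot for the feet-and-frames figures (the 35mm 4-perf value, which is what makes 6 seconds equal 9+00). All function names are illustrative.

```python
def timecode_to_frames(tc, fps):
    """Convert non-drop-frame 'HH:MM:SS:FF' into a total frame count."""
    hh, mm, ss, ff = (int(part) for part in tc.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * fps + ff


def frames_to_timecode(frames, fps):
    """Convert a total frame count back into 'HH:MM:SS:FF'."""
    ss, ff = divmod(frames, fps)
    mm, ss = divmod(ss, 60)
    hh, mm = divmod(mm, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def two_pop_timecode(ffoa, fps):
    """The 2-pop sits exactly two seconds before the first frame of action."""
    return frames_to_timecode(timecode_to_frames(ffoa, fps) - 2 * fps, fps)


def feet_and_frames(frames, frames_per_foot=16):
    """Format a frame count as feet+frames (16 frames per foot is assumed)."""
    feet, remainder = divmod(frames, frames_per_foot)
    return f"{feet}+{remainder:02d}"


print(two_pop_timecode("01:00:00:00", 24))  # 00:59:58:00
print(two_pop_timecode("01:00:08:00", 24))  # 01:00:06:00
print(feet_and_frames(6 * 24))              # 9+00
print(feet_and_frames(8 * 24))              # 12+00
```

Drop-frame timecode and fractional rates such as 23.976 fps need extra handling that this sketch deliberately leaves out.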
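How head and tail pops expose drift can be shown with simple arithmetic: if the distance between the two pops is not the same in picture and in sound, the difference is the drift. The sketch below assumes the pop positions have already been located by hand or by other means (frame numbers for picture, sample positions for audio). The function name and the example figures are hypothetical.

```python
from fractions import Fraction


def sync_drift_seconds(picture_head_frame, picture_tail_frame, fps,
                       audio_head_sample, audio_tail_sample, sample_rate):
    """Drift accumulated between head and tail pops, in seconds.

    Positive means the audio runs long relative to picture; zero means the
    two pops are the same distance apart in both.
    """
    picture_span = Fraction(picture_tail_frame - picture_head_frame) / Fraction(fps)
    audio_span = Fraction(audio_tail_sample - audio_head_sample, sample_rate)
    return audio_span - picture_span


# Pops 2400 frames apart at 24 fps (100 s) and 4,800,000 samples apart at 48 kHz.
print(float(sync_drift_seconds(0, 2400, 24, 0, 4_800_000, 48_000)))  # 0.0

# Same picture, but the audio pops are 4,804,800 samples apart.
print(float(sync_drift_seconds(0, 2400, 24, 0, 4_804_800, 48_000)))  # 0.1
```

Exact fractions are used so that rates such as 24000/1001 can be passed in without rounding error.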
AAF/OMF export settings
When picture lock is confirmed, the export must be performed carefully. Any missed information can cause complications for the audio department. Although AAF and OMF are similar, the need for either one depends on the software or workflow. Nevertheless, it is common practice to export both formats for project optimization.
AAF is the best choice for moving files from Adobe Premiere to Avid Pro Tools or Media Composer. An AAF contains links to audio and video files as well as editing decisions that are to be applied to the audio and video data. AAF files can be very large, but are the best option when exporting for Avid Pro Tools systems. See the video export settings below for the best video formats to use for the sound department.
OMF is an audio-only format. Unlike the older EDL format, which is limited and simply points to your media, an OMF includes all audio files. This guarantees that your audio, along with your edits, will successfully transfer.
Final Cut Pro XML
Based on FCP 7. This is the best choice for moving projects to or from Final Cut Pro 7 or X; though FCP X requires conversion using a utility. Like EDL, this only points to the media. This is also the best choice for many 3rd-party media management systems.
Video export settings
The wrong video export settings can result in added or even subtracted frames. As a consequence, audio can end up out of sync on final delivery, leading to complicated and strenuous tasks.
Avid DNxHD, which stands for "Digital Nonlinear Extensible High Definition", is a lossy high-definition video post-production codec engineered for multi-generation compositing with reduced storage and bandwidth requirements. It is an implementation of the SMPTE VC-3 standard. DNxHD is widely used by video editors, but it is also used by other manufacturers and is supported within the QuickTime wrapper. Some camera manufacturers can create DNxHD proxy files, allowing edit systems quick and easy access to footage (the Arri Alexa, for example, offers this functionality, as well as QT ProRes).
Avid DNxHR (Digital Nonlinear Extensible High Resolution) is a lossy UHDTV post-production codec engineered for multi-generation compositing with reduced storage and bandwidth requirements. The codec was specifically developed for resolutions above FHD/1080p, including 2K, 4K and 8K resolution. There are several quality options: LB (Low Bandwidth), SQ (Standard Quality), HQ (High Quality), and HQX (High Quality 10-bit).
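One way frames get added or removed is a frame-rate mismatch between the edit and the export. The arithmetic below is a sketch of that single cause, not a diagnosis of any particular tool. The 24000/1001 rate (commonly written as 23.976 fps) and the ten-minute example are assumptions chosen for illustration, and the function names are hypothetical.

```python
from fractions import Fraction

NTSC_FILM = Fraction(24000, 1001)  # the rate commonly written as 23.976 fps


def duration_seconds(frame_count, fps):
    """Running time of `frame_count` frames played at `fps`."""
    return Fraction(frame_count) / Fraction(fps)


def sync_error_seconds(frame_count, intended_fps, exported_fps):
    """How much longer (+) or shorter (-) the export runs than intended."""
    return (duration_seconds(frame_count, exported_fps)
            - duration_seconds(frame_count, intended_fps))


def frames_after_rate_conversion(frame_count, source_fps, target_fps):
    """Frame count when running time is kept but the rate changes."""
    return round(duration_seconds(frame_count, source_fps) * Fraction(target_fps))


frames = 24 * 60 * 10  # ten minutes of picture cut at 24 fps
print(float(sync_error_seconds(frames, 24, NTSC_FILM)))      # 0.6
print(frames_after_rate_conversion(frames, 24, 25) - frames)  # 600
```

In the first case the same frames simply play back 0.6 seconds longer than the audio that was mixed against them; in the second, 600 frames have to be invented to fill the same running time.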
The Material Exchange Format is a container format for professional digital video and audio media defined by a set of SMPTE standards. A typical example of its use is for delivering advertisements to TV stations and tapeless archiving of broadcast TV programs. It is also used as part of the Digital Cinema Package for delivering movies to commercial theaters.
Other codecs such as H.264 use long-GOP (Group Of Pictures) compression and have been known to cause timeline miscalculations, leading to slightly out-of-sync video and audio. H.264 is CPU intensive and has had issues where it competes with the audio processing resources. DNxHD/HR and Apple ProRes are preferred over any long-GOP codec, especially in editing workflows. This is simply due to the nature of intraframe codecs, in which every frame is encoded independently of its neighbors.
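The difference between long-GOP and intraframe codecs can be illustrated by counting how many frames must be decoded to show one frame after a seek. This is a deliberately simplified model, assuming each GOP opens with a keyframe and every later frame depends only on the one before it (no B-frames). The function name and the GOP length of 48 are hypothetical.

```python
def frames_to_decode(target_frame, gop_length):
    """Frames a decoder must process to display `target_frame` after a seek.

    gop_length=1 models an intraframe codec, where every frame is a keyframe.
    """
    if gop_length < 1:
        raise ValueError("gop_length must be at least 1")
    # Decoding restarts at the keyframe that opens the GOP holding the target.
    return target_frame % gop_length + 1


print(frames_to_decode(1000, gop_length=1))   # 1
print(frames_to_decode(1000, gop_length=48))  # 41
```

With an intraframe codec the cost of landing on any frame is constant, which is part of why scrubbing and frame-accurate spotting feel lighter on the system.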
Audio post-production process
Audio can make or break a project, so a basic understanding of audio post-production is a must for everyone involved, especially the visual team and lead director. There are many stages in the audio post-production process, and it’s easy to get confused about the best order in which to complete various tasks. The standard process for audio work has evolved over the many years of movie and video production, and has helped keep audio quality at its maximum throughout. When a larger team is involved, it’s especially important to follow the order below. Even if a project can be managed by one sound professional, sticking to the order of operations below keeps the workflow efficient.
Audio is received in the form of an OMF or AAF export from the video editing system. The organization of the tracks and the audio handle length are very important parts of the export. Sometimes quirks are encountered, depending on what programs are being used and what kind of export is given, so discussing the specs before delivery is important. It's never a bad idea to do a test export before sending the full project. It's very important that any bugs be worked out before the spotting session.
The spotting session is where the whole team comes together and talks through the entire project. During the session, editors take numerous notes. Discussions are held about production audio and the specifics of sound design. Having the dialog editor and the picture editor in the same room is ideal. For instance, there may be a question about B-roll or an alternative microphone channel that seems to be missing. The editor should know where all the bones are buried and can say whether it doesn't exist or is unusable. It is also a good time to dig into the details about where the focus of attention should be. For instance, the director may ask to remove a dog barking in the background that they have hated hearing since the moment the film was shot. The sound department might love that dog bark and not realize that there are to be no dog sounds in the film at all, even though it's all shot in a park.
Once the spotting session is complete, the sound department takes over. There is little need for any personnel outside the sound department to be involved with the basic development stages of audio. Assistance is needed when the dialog edit contains a foreign language or when creating subjective sounds. In some cases, a person fluent in that language is employed to make sure that nothing gets lost or altered during the dialog edit. Unwanted noise is removed, and the recordings are trimmed down to the necessary length. Raw recordings are organized and synced to the timeline. Often there will be sound design sessions with the director when there are areas where the sound design is unrealistic or complex.
ADR (Automated Dialogue Replacement)
In most cases, some of the original audio recorded on set will be corrupt, noisy, or simply missing. Other times, the quality is not up to par and the tone of the voices is poor. ADR is the process of recording new dialogue in a studio environment to sync with the video. The actor will lip sync to their original performance as closely as possible.
Sound design is where the creation of audio effects for the picture starts. The sound designer adds wild tracks and new field recordings to create background ambience. Any special sound effects are created at this point, too. Various techniques are used to create sounds, including field recording, heavy processing, and electronic synthesis.
Foley is similar to sound design in that it is a process of creating sounds to enhance the realism of the picture. The difference is that Foley refers to human-based sound effects. Foley artists will usually re-perform the scene live, replicating footsteps, rustling clothes and prop movements. These sounds are then edited to match the scene.
In this step diegetic music (sound occurring inside a scene) and non-diegetic music (sound not part of a scene, like soundtracks) are composed and organized. Where applicable, licensed music is also curated and organized.
Once the editorial work is completed, a premix is made. Premixing is basically doing a lot of the boring work that winds up making the audio start to come to life. The production sound is usually the prime focus of the premix, matching the sound of the audio across edits. Some tasks may include audio restoration and noise reduction. Different mics may be needed in the process. Overall, the loudness of the audio is adjusted. It is important to have the final music available for this stage of the audio process. Although it's not always possible, having the music on hand gives the sound department a better chance to know how to rough in the other elements such as sound effects and ambiances.
The final mix is where the project begins to be heard as it should be. No more distortion, smooth transitions, and those dogs are gone. For many directors, this is their favorite part of the filmmaking process. They can rely on someone else to do the heavy lifting, and they can experiment. The first pass is like painting the side of a house. The mixers start off at the left side of the timeline and work toward the right as time progresses. Going back and playing the work again after changes have been made is common practice among professional sound mixers. If it sounds good, they move on, but always start a little bit earlier so that there is a perspective of how the scene was left the previous moment. Once at the end of the first pass, there should be a full playback. This will be the first time that the director will experience the project from top to bottom without stopping. During the playback, the whole team (including the director) takes notes. After the playback, it's always good to hear how the project sounded (as a whole) to the director and their creative team. Then the digging starts and the sound team listens to everyone’s notes. Once those notes are complete, it's not unheard of to do one more playback before calling it a day, just in case a few more notes pop up.
After the mix, the video department receives the audio masters and stems. Masters are the full mix (surround sound and stereo). The stems are the different categories of sound (voice over, dialog, sound effects, and music) split out discretely. These audio assets come in handy when making cut-downs or trailers.
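The relationship between stems and the master can be sketched as a sum: adding the stems together sample by sample should rebuild something close to the full mix. The check below is a rough sketch that assumes every stem is the same length and that no extra processing was applied on the master bus, which is not guaranteed in practice. The function names and the tiny integer sample values are made up for illustration.

```python
def sum_stems(stems):
    """Mix equal-length stems by adding them sample by sample."""
    lengths = {len(samples) for samples in stems.values()}
    if len(lengths) != 1:
        raise ValueError("stems must all have the same length")
    return [sum(column) for column in zip(*stems.values())]


def max_difference(master, stems):
    """Largest per-sample gap between the master and the summed stems."""
    mixed = sum_stems(stems)
    if len(mixed) != len(master):
        raise ValueError("master and stems must have the same length")
    return max(abs(m - s) for m, s in zip(master, mixed))


stems = {
    "dialog": [100, -50, 0, 25],
    "sound effects": [10, 20, -30, 0],
    "music": [-5, 5, 5, -5],
}
master = [105, -25, -25, 20]
print(max_difference(master, stems))  # 0
```

A large difference does not prove an error, but it is a cheap signal that a stem is missing, offset, or from a different version of the mix.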
No two people make a veggie burger the same way, and the same holds true for those working in post-production audio. While there may be similar craft involved in what is done, the inventiveness and approach to creating the best sound for a project varies from person to person.
Hopefully, knowing what is involved when collaborating with a sound house or independent sound professional will allow directors to choose their audio soul mate and know enough so that they can truly collaborate on their audio's character and design.
Focused on creating the most immersive sounds.
VR/VGD audio process
It's not uncommon for video game developers to tackle all the sound in their projects themselves. In many cases, developers search for audio online, find what they like, and then stick it in the project and continue investing time in the mechanics and graphics. Though the game might offer the best gameplay experience ever with spectacular graphics, the audio can sound either thin or like many other video games heard before.
No two people make a veggie burger the same way, and the same holds true for those of us working in audio. While there may be a similar craft involved in what we do, the inventiveness and approach to creating the best sound for a 2D, 3D or virtual reality gaming experience varies from person to person. Knowing what the vision is, and the various ways to approach it, is important, but knowing what is involved when collaborating will allow you to choose your audio soul mate and know enough so that you can truly collaborate on your game's environment, character(s) and overall design.
Production requires many individual parts. Just as game designers, character designers, animators, and composers focus on certain specifics, sound design opens up several levels of concentrated tasks unto itself. Tasks range from building unique sounds in digital audio workstations with various synths and controllers to importing samples into the game engine and adjusting volume, modulations, convolution reverb, ambisonics, audio synesthesia, blueprint scripting, sound attenuation, and more. Depending on the depth of the game, tasks can be plenty. Not to mention, an understanding of coding is necessary for the process. Communicative collaboration is a given, but much time is also spent working independently.
As mentioned above, it is not uncommon to separate tasks within the sound department. In the event of Blake Sound exporting or importing sounds into a game, this is the list of technical requirements.
Unity can support .aif, .wav, .mp3, and .ogg. Unity can also accept tracker modules in the .xm, .mod, .it, and .s3m formats. The tracker module assets behave the same way as any other audio assets in Unity, although no waveform preview is available in the asset import inspector.
UE4 currently supports importing 16-bit and 24-bit PCM formatted .wav files at any sample rate and with up to 8 channels, although uncompressed little-endian 16-bit wave files at 44100 Hz or 22050 Hz are recommended. The audio files that are imported are automatically encoded to compression formats based on the platform and features used by the sound. Importing a sound file into the editor generates a Sound Wave Asset that can be dropped directly into a Level, or that can be used to create a Sound Cue, which can then be edited inside the Sound Cue Editor.
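A delivery folder can be screened against the formats listed above before anything is handed to the developer. This is a small sketch that only looks at file extensions. The extension sets are copied from the text rather than from Unity itself, and the function name and file names are hypothetical.

```python
from pathlib import PurePath

# Extension lists as given in the text above.
UNITY_AUDIO_EXTENSIONS = {".aif", ".wav", ".mp3", ".ogg"}
UNITY_TRACKER_EXTENSIONS = {".xm", ".mod", ".it", ".s3m"}


def unity_import_kind(filename):
    """Return 'audio', 'tracker', or None if the extension is not listed."""
    extension = PurePath(filename).suffix.lower()
    if extension in UNITY_AUDIO_EXTENSIONS:
        return "audio"
    if extension in UNITY_TRACKER_EXTENSIONS:
        return "tracker"
    return None


for name in ["footsteps_grass.wav", "theme.XM", "door_slam.flac"]:
    print(name, unity_import_kind(name))
```

An extension check says nothing about the contents of the file, so it works best as a first pass before listening.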
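The import limits above can be checked from a file's header before it reaches the engine. The sketch below uses Python's standard `wave` module, which reads PCM .wav files. The function names and the demo file are hypothetical, and the rules are only those stated in the text (16-bit or 24-bit, up to 8 channels, with 16-bit at 44100 Hz or 22050 Hz recommended).

```python
import os
import tempfile
import wave


def read_wav_format(path):
    """Return (bit depth, channel count, sample rate) from a PCM .wav header."""
    with wave.open(path, "rb") as wav:
        return wav.getsampwidth() * 8, wav.getnchannels(), wav.getframerate()


def ue4_import_issues(path):
    """List the ways a file falls outside the limits described in the text."""
    bits, channels, _rate = read_wav_format(path)
    issues = []
    if bits not in (16, 24):
        issues.append(f"{bits}-bit audio: expected 16-bit or 24-bit PCM")
    if channels > 8:
        issues.append(f"{channels} channels: expected 8 or fewer")
    return issues


def matches_ue4_recommendation(path):
    """True for 16-bit files at 44100 Hz or 22050 Hz."""
    bits, _channels, rate = read_wav_format(path)
    return bits == 16 and rate in (44100, 22050)


# Demo: write a short 8-bit mono file, which is outside the supported depths.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(1)
    out.setframerate(22050)
    out.writeframes(bytes(100))

print(ue4_import_issues(demo_path))  # ['8-bit audio: expected 16-bit or 24-bit PCM']
print(matches_ue4_recommendation(demo_path))  # False
```

Files that are not PCM make `wave.open` raise `wave.Error`, which is itself a useful signal that the file needs converting first.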