I often get asked how I "start" a project. This is a big question. Multiple things happen all at the same time, but one of the "filters" that I generally use to collect all of these items into a centralized place is the Audio Design Document (also called an audio style guide, sonic bible, etc.). This becomes a roadmap for me, the audio team, and the project team as a whole on how I guide the audio component of any project that I'm on. The following are *super* high-level points and questions that I generally try to hit any time I’m starting a new project. Keep in mind I’m writing this from an audio director’s point of view, so there are definitely some audio-nerd bits that I bring up. Also note that many items *may* not be answerable right at the beginning of any given project, but it’s good to write them down so you know what to prioritize and hit at a later date.

Keep all facets of audio in mind:
Remember that audio is like a painting. A host of elements (or “colors”) combine to create a larger picture. There’s rarely one “silver bullet” that makes an audio experience amazing. It’s a team effort with an attention to detail that puts it over the top. When you’re establishing your audio vision, make sure that you are taking the elements of Music, Sound-Design, Voice-Over, and Technology into account. Each element has its own subcomponents, and everything needs to work together to make for the most cohesive experience. If one element is neglected or “tacked-on”, it can throw your entire audio direction into disarray.

Here are a few things to keep in mind for each element. There is obviously lots of cross-pollination and the project as a whole may not have all of this figured out just yet, but it’s good to have this sort of stuff on your radar when you’re thinking about your audio vision:

Music
Style – Is it a big, bombastic orchestral score or is it a minimalist score that creates more of an ambience? This plays a huge role in how your game will sound. If it’s a massive rock piece, you’re going to want to have your sound design and voice over components a bit more sparse. Otherwise, you’re going to exhaust your listener.
Instrumentation – Are you using an 80-piece orchestra with a drum kit and distorted guitars or a solo piano with a string quartet? This starts to color your aural “texture” and define the personality your project will have, so it also plays a huge role in how your game will sound.
Recording/Engineering Philosophy – How is the music recorded? Is it recorded as several dozen individual tracks? Is it one performance of a big orchestra? Are there solo instruments? If you want a big dynamic score, you’ll want more fidelity in how you record something, so you have the freedom to craft the score how you want in a real-time scenario.
Implementation Philosophy – Will it require multiple audio streams that dynamically cross-fade based on game parameters? Is it continuously playing within your audio soundscape? Does it simply underscore key events? Etc…
Dynamic Range – Is it always “dialed to 11”? Do you have moments of silence?
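To make the cross-fade question above a bit more concrete, here is a minimal sketch of how two music streams could be blended from a single game parameter. The `intensity` parameter, the "calm" and "combat" stems, and the `crossfade_gains` function are all hypothetical names made up for illustration. Middleware such as Wwise or FMOD typically provides this kind of parameter-driven blending for you, so treat this as the idea rather than an implementation.

```python
import math


def crossfade_gains(intensity: float) -> tuple[float, float]:
    """Return (calm_gain, combat_gain) for an equal-power cross-fade.

    `intensity` is an assumed game parameter in the range 0.0 to 1.0
    (for example, how much action is happening on screen).
    """
    # Clamp so out-of-range game values never produce odd gains.
    t = min(1.0, max(0.0, intensity))
    # Equal-power curves: the squared gains always sum to 1, which keeps
    # perceived loudness roughly steady through the transition.
    calm_gain = math.cos(t * math.pi / 2)
    combat_gain = math.sin(t * math.pi / 2)
    return calm_gain, combat_gain


# Halfway through the transition both stems sit at about 0.707 (-3 dB).
calm, combat = crossfade_gains(0.5)
```

An equal-power curve is used here instead of a straight linear fade because a linear fade tends to dip in loudness at the midpoint. Whether that matters depends on how correlated your stems are, so it is worth testing by ear.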
Sound-Design
Ambience – How complex and layered is the “world sound” that is around the user? Are you on a sci-fi alien planet with moaning winds and twisting thunder crackling around in the skies above? Maybe it’s in an interrogation room where you only hear some ventilation.
Weapons/Items/Player Feedback – What sort of items does the player have and use? Are they the focal point of any given scene (like guns in an FPS) or are they simply a means to an end with a one-use sound?
Creatures/Monsters/Enemies/Obstacles – What sort of “encounters” will the player come across? Are they enemies? Are they big, drooling, roaring alien creatures that come bursting out of the stone floors and swarm up to cover the player in terrible, terrible insect spawn? Or is it something much less dramatic, expensive, and/or resource hungry?
Visual Effects – Need some fancy UI effects where buttons glow and then spark out into existence? What about in an FPS when you are wandering through a sewer system (every FPS has one of these) where there are steam blasts, dripping goo, rushing water, and the gas main explosion that catches everyone on fire? I hate when that happens.
Player(s) – Are you doing a traditional video game where you have a main character running, jumping, and gallivanting around the world? If so, you’re going to need sounds for movement, footsteps, exertion, pain, death, etc… If you don’t have a traditional “player”, then this might not be applicable in a direct sense.
Dynamic Range – Is your sound design going to be your star of the show? Are you going to have lots of peaks and valleys in the “action” that’s taking place on screen? Is it going to be pretty sparse and more of a music driven game?
Voice-Over
Story vs. Gameplay Voice-Over
Cast Size
Voice-Types required – This could range from age type, gender type, etc… Each voice has its own collection of frequencies and timbre qualities, so it’s good to know what sort of pool you’ll have.
Special Effect Voices (creatures, computers, etc…) – This is pretty self-explanatory, but some voices might need additional processing and sound design. This can affect your overall soundscape.
Technology
Understanding of platform needs and limitations
What does the project tell you? Is it a big war game with tons of characters on screen? If so, you’re going to need tons of audio channels to work with. If it’s more about music playback, you might just need a handful of channels at any given time, but more streaming capabilities.
Implementation paradigm – How are you going to get awesome audio content into your project and have it sound as you envision it?
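As a rough illustration of why channel counts matter, here is a small sketch of a fixed channel budget where a more important sound can take over the channel of the least important one. The `Voice` and `VoicePool` names, the priority numbers, and the stealing rule are assumptions for the example, not how any particular engine or middleware does it.

```python
from dataclasses import dataclass


@dataclass
class Voice:
    name: str
    priority: int  # higher means more important (assumed convention)


class VoicePool:
    """A fixed number of playback channels shared by every sound."""

    def __init__(self, max_channels: int) -> None:
        self.max_channels = max_channels
        self.active: list[Voice] = []

    def play(self, voice: Voice) -> bool:
        """Start a voice, stealing the lowest-priority channel if full."""
        if len(self.active) < self.max_channels:
            self.active.append(voice)
            return True
        lowest = min(self.active, key=lambda v: v.priority)
        if voice.priority > lowest.priority:
            # The new sound matters more, so it replaces the least
            # important sound that is currently playing.
            self.active.remove(lowest)
            self.active.append(voice)
            return True
        # Pool is full and nothing is less important: drop the request.
        return False


pool = VoicePool(max_channels=2)
pool.play(Voice("footstep", priority=1))
pool.play(Voice("ambience", priority=2))
pool.play(Voice("dialogue", priority=10))  # takes over the footstep channel
```

With a big war game you would want a much larger pool and more thought about which sounds are allowed to be dropped. With a music-driven project, a handful of channels like this might be all you need.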
Engage yourself with your team:
Audio is a pivotal element of any creative entertainment product, and it cannot be successful if it is acting within a vacuum. Communication is key. Having some artwork that the team has already established or some prototype builds of the project that you’re working on are among the best methods of inspiring the audio direction of any given title. It will allow you to ask more detailed questions about the overall direction of the title, which should be a driving factor in how an audio vision is established. Talk to your art directors, art leads, concept artists, 2D/3D artists, animators, creative directors, level designers, tech leads, etc… Become a sponge.

Establish reference material:
Just like visuals, having aural reference points to listen to goes a long way toward establishing goals for the audio direction. This can be clips of sound-design, music, or voice-over. Anything that helps get the creative juices flowing. For example, when I work on linear-style games, I start gathering any artwork that might help with establishing the through-arc for the game (like storyboards) and then start to gather audio assets that would go with those storyboards. It doesn't have to be super complex or anything like that, but it’s the starting point for figuring out what your sonic palette will feel like. Paint with an incredibly broad brush. What sort of audio flavors fit with the overall look and feel of the product as a whole? After that, it’s about iteration… maybe you start creating original content to replace anything that you’ve put in there to begin with. You start gathering lists of resources that would be beneficial to record. You start going out into the field to gather assets and start constructing an audio “library” that you can pull from. You start creating your project’s “voice” this way.
You also start getting an idea of what your budget is going to be through this practice.

Establish technology requirements and implementation paradigms:
As mentioned above, you’ll need to start getting a handle on what sort of technology your project will require. In addition to that, how will you get your audio into your project?

Here are some high-level things to keep in mind:

Technology-
Number of platforms that need to be supported
Multiplayer requirements (Over network? Split-screen? How many players?)
Memory and streaming limitations
Channel count limitations
DSP options
Environmental Effects (DSP)
Occlusion/Occluding options
Dynamic Mix options
Implementation-
Middleware? Proprietary tech?
What is the level of complexity that audio playback will have in your project? Are you going to need several sounds layering together that trigger several other sounds after they have completed playback? Or is it more of a simple user-feedback paradigm where you play quick one-shot sounds to underscore an event as it takes place?
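The difference between those two levels of complexity can be sketched in a few lines. This is only an illustration: `SoundEvent`, its `layers` and `then` fields, the sound names, and the durations are all hypothetical, and the `schedule` function just works out start times rather than playing any audio.

```python
from dataclasses import dataclass, field


@dataclass
class SoundEvent:
    name: str
    duration: float  # length of the sound in seconds
    # Sounds that start at the same moment as this one.
    layers: list["SoundEvent"] = field(default_factory=list)
    # Sounds that start once this one has finished playing.
    then: list["SoundEvent"] = field(default_factory=list)


def schedule(event: SoundEvent, start: float = 0.0) -> list[tuple[float, str]]:
    """Return sorted (start_time, name) pairs for an event and all it triggers."""
    timeline = [(start, event.name)]
    for layer in event.layers:
        timeline += schedule(layer, start)
    for follow_up in event.then:
        timeline += schedule(follow_up, start + event.duration)
    return sorted(timeline)


# A simple one-shot: a single UI click underscoring an event.
click = SoundEvent("ui_click", 0.1)

# A layered, chained event: boom and debris together, ear-ringing afterwards.
explosion = SoundEvent(
    "boom",
    1.5,
    layers=[SoundEvent("debris", 2.0)],
    then=[SoundEvent("ringing", 3.0)],
)

print(schedule(click))      # [(0.0, 'ui_click')]
print(schedule(explosion))  # [(0.0, 'boom'), (0.0, 'debris'), (1.5, 'ringing')]
```

If most of your events look like `click`, a very simple playback system may be enough. If they look like `explosion`, that is a hint that you will want middleware or proprietary tech that can describe layering and sequencing as data.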
Define what the audio vision “is” and what it “isn’t”:
After you start getting a pretty good idea of what project you’re actually making, you can start drawing lines in the sand about what it *should* sound like and what it *shouldn’t* sound like. You can start getting more meticulous about what qualities you’re looking for in any given sound, how it interacts with other sounds, and how it fits within “the big picture” of the project as a whole.

Always, ALWAYS serve the goals of the project and your customer first:
I keep writing about “the project as a whole”, but this cannot be overstated. The project and your customer are king. There may be some really cool technology or sound that no one has heard or seen before, but if it doesn't serve your customer or fit with what the rest of the project is trying to do, then it should be given a HARD look to see if it’s worth the time and investment to put in. Experimentation and investigation are great, but you want to make sure they’re not serving a personal goal or agenda. They need to serve your customer and make the project a cohesive experience.