Where are immersive audio and the whole pipeline going?

recording with Zylia

Open letter follow-up

Written by Claudio Vittori

 

A few days ago, I wrote an open letter as an immersive artist and pro-audio entrepreneur about where immersive audio and AI are headed. This is the follow-up I promised.

 

For the record: I’m not affiliated with any of the tools or platforms I mention in this reflection. I’m simply sharing what I’m noticing in the field and what I’m feeling as someone who works with immersive audio every week, on productions for Flower of Sound and on my own artist work, from original composition and simple binaural pieces to more complex systems designed for XR devices.

 

At Flower of Sound, I’ve taken on the role of creative director, which means I’m not only producing and composing but also shaping direction, workflows, and the way we collaborate. And this is exactly where the acceleration of immersive audio and AI becomes very real.

 

Testing our catalogue in Dolby Atmos TrueHD in a consumer environment

Immersive audio is not one thing

When people talk about immersive audio, they often imagine a single format and a single workflow. But the reality is already more complex.

 

There are multiple standards, ecosystems, targets, and distribution models. Some come from cinema, some from streaming, some from broadcast, some from XR and interactive media. For those who are not familiar with immersive audio, here is a simple example: for now, you need one format for headphones and another for a specific speaker setup, and that choice affects the whole production process.

 

And the more formats exist, the more the ecosystem grows. As attention and demand grow, so does software development, from plugins to AI. Which sounds good. But it also means speed.

Young ecosystem

Just a few years ago, immersive audio didn’t even have proper immersive compressors and limiters. When the first ones arrived, from a few software houses, both big and small, it genuinely felt like a celebration. Because immersive music could suddenly compete with stereo not only in space but in impact.

 

And now we’re seeing a new wave of tools that try to reduce the whole chain to something like: upload, prompt, export. That jump, from “we finally have immersive compressors” to “AI wants to do everything”, happened incredibly fast. Just to give you an idea, I now use more than eight plugins.

 

 

Immersive Audio workflow overview

Real workflow, from an artistic perspective (spatial sound design)

On a more artistic, sound-design-driven front, I use different tools depending on what I need to build.

 

Sometimes I work in standalone environments, such as Sound Particles, to design complex spatial scenes or dense multichannel structures. Inside Ableton, when I’m working in multichannel and don’t need ADM delivery, I often use Envelop, especially for site-specific installations or live-oriented projects where spatial behavior matters more than standardized formats.

 

Audio Brewers sits a bit differently in my workflow. I use it as a support tool for ambisonics in general: for microphone workflows, for upscaling, and for managing different ambisonic resolutions. It’s not tied to a single context or DAW, but it often helps bridge gaps between capture, processing, and delivery.

 

Over the years, I’ve also worked with various ambisonic and spatial toolsets such as SPARTA, Blue Ripple, and Flux SPAT, depending on the project and the phase. For dynamics, my path went from HoRNet compressors to tools like Elixir, and, more recently, to Gravitas within the Fiedler ecosystem. None of these excludes the others; they simply reflect how the toolbox evolves over time.

 

For larger-scale or venue-based projects, I’ve had direct experience with comprehensive spatial systems, including L-Acoustics workflows and d&b Soundscape. Different ecosystems, different constraints, same landscape.

 

All of this naturally extends into XR and real-time spatial audio. We’re exploring immersive pipelines for extended reality using tools like Atmoky and others, where sound has to respond to interaction, movement, and space in real time.

 

I’m not listing these tools to be exhaustive. I’m mentioning them to make one thing clear: until very recently, immersive audio meant navigating a wide and fragmented ecosystem, made of many specialized tools, each solving a specific problem. And this ecosystem is now evolving at an incredible speed.

The AI shift

Many people talk about AI as if the main story is: “AI will mix for you.” I think the bigger shift is more structural. AI is starting to unlock workflows in which spatial automation, format conversion, and even delivery can be consolidated into a few actions. Immersive audio stops being tied to time, skill, monitoring, routing, and long sessions. It becomes tied to scalability. It is not perfect yet, but the direction is clear. In the last few weeks, multiple AI tools for immersive audio have entered the market.

 

 

Part of Immersive Audio workflow with Ircam

Where is the craft?

The risk is not that immersive audio disappears. It will probably grow and become more accessible, which is a good thing, as immersive audio is the natural way to listen. The risk is that it becomes a pipeline controlled by platforms: you upload audio or generate it with prompts, the system reconstructs it, spatializes it, and exports it, and the platform becomes the gatekeeper.

 

When that happens, the whole process becomes invisible. And the human craft and knowledge disappear, as people lose the ability to hear why something does not sound right, or to correct it. It’s like the way we no longer know how people in the past built those enormous cathedrals. In the future, people may not even know how to compose a simple song if the AI goes offline.

 

And when a craft starts to disappear, the whole system and everybody in it become economically fragile.

 

 

Home theater set-up

A metaphor

There’s a game we used to play in Italy when I was a kid:

  • Ten people walk around nine chairs.
  • Music plays.
  • Then someone says: stop.
  • You have to sit down to stay in the game.
  • And one person is out.
  • Nine people walk around eight chairs.
  • The music plays again.
  • Until there are two people and one chair left.
  • For the final round…

 

In immersive audio, it feels like the game is changing.

It’s not nine chairs.

It suddenly became just one.

And there are still ten people running around.

And the music is speeding up.

 

In 2014, invited to compose for a Wave Field Synthesis (WFS) system, I encountered immersive audio for the first time.
A decisive threshold in my practice. Since then, I have never looked back.

 

The next generation

For the past few years, we’ve been collaborating with schools and educational institutions in Italy and in the Netherlands, and we’ve had young collaborators and interns working with us. And it’s hard to hide the consequences of this acceleration from them. They see it too. Some of them are extremely skilled, especially those coming from engineering, game audio, and XR.

 

Not long ago, if you had a talented engineer, you could give them tasks and be sure you were filling their days, their weeks, sometimes even months. Now, between one coffee and the next, you write a prompt, generate a big part of the code, send it, and the task, objectively, collapses. This is not a hypothetical future. It’s happening to me. It’s happening right now.

 

But there is another side to this, and it’s something I don’t hear discussed enough. As a creative director, I’m also responsible for guiding these young people. And sometimes the acceleration is so fast that I’m not ready. Because when tasks collapse into minutes, the time you have to teach, to shape, to give direction, to build a path, also collapses. The space between “do this” and “now let’s go deeper” is shrinking. And it forces you to constantly invent new steps, new challenges, new directions, at a speed that is not always easy to sustain. I’m not sure if this is good. Because when you miss or do not understand the in-between steps, do you fully understand the end result?

 

 

Convenience will win

There is something else I want to say, because it matters. If a tool appears tomorrow that lets me take my music into a DAW, press a button, write a prompt, and export a convincing immersive mix in multiple formats without manually building the whole spatial chain…

 

I have to be honest. There will be a point when I will use it as well, especially to replace the technical process (not the composing and playing of the instruments). Not because I don’t respect the craft. Not because I don’t care. But because convenience is the strongest force of our era. And that’s exactly the point. The real disruption is not only about artists. It’s about what happens to the entire chain around immersive audio. The engineers, the workflows, the software houses that built the missing pieces, the knowledge that took years to accumulate.

 

When the chain collapses into a single automated step, many roles don’t disappear overnight. AI is not perfect, especially at the beginning, so you probably need even more specialised knowledge. But the whole system loses its ground. And it becomes fragile.

So what do we do?

This is the question I keep asking myself. Maybe my view is too pessimistic. I hope so. But usually, when you feel something moving this fast, it’s safer to be a little more prepared than a little more comfortable. I’ve even caught myself telling younger people: Be careful what you choose to study right now.
Not because immersive audio has no future. It does. But because the ground is moving under our feet.

 

 

Immersive audio plugin: Sphere

We’re all in the same boat

I’m European. I live and work here. I can’t speak for other continents, but I strongly suspect we’re all in the same boat. And I’m grateful I was born here, between Lake Como and Milan. When things accelerate like this, I understand why so many of us start dreaming about a quieter place in the mountains.

 

But then the question comes back, and it’s very practical. If this acceleration continues, who do we ask for help? Who is responsible for making sure that the minds who studied, who trained for years, who built real expertise, are not simply left behind?

 

Is there a plan?

 

Or is it going to be a “save yourself if you can” situation? I’m not writing this to attack anyone. I’m writing it because I can feel this wave approaching, and I don’t think we are that far from it. And the speed is what makes it hard to ignore.

 

PS: In my research for this article, I found an official website of the European Union, Culture Compass. So maybe this is a start.
