Interview - Stephan Bernsee

Stephan M. Bernsee (born Stephan Sprenger) is a German music/audio DSP applications developer and founder of Prosoniq. Within 10 years, 500,000 copies of their most successful software, sonicWORX Basic, bundled with Sony and Creative Labs hardware, were downloaded and sold. Today their technologies are sublicensed to companies like TwelveTone Systems, Digidesign/Avid, Autodesk/Discreet, Sony, Steinberg, Merging and many others. At Musikmesse 2009 Prosoniq announced a successor to sonicWORX with the ability to extract, process or suppress individual notes and instruments in a song. In addition, Stephan developed the synthesis technology and sound synthesis engine for the Hartmann Neuron at Prosoniq.
Blogasys: Stephan, could you please be so kind and tell us a little bit about Prosoniq and the development of sonicWORX within the last two decades?
Stephan M. Bernsee:
Well, please allow me to make a quick remark on your intro: We never actually sold that many copies of sonicWORX. It is true that our free “sonicWORX Artist Basic” version was bundled with Sony CD-ROM drives and Creative Labs sound cards for some time, so while there were actually that many users who physically had a CD with the software on it, they were not necessarily active users. We had about 5,000 active users of sonicWORX back in the day, if I recall correctly. For an independent developer and software that was entirely Mac-only, this is still a pretty good number, I think.
As for your actual question - I think I need to go back a few years to when we originally released the first version of sonicWORX. This was in 1993/94, when the Mac was still in its infancy as a multimedia platform and the PC was little more than a typewriter.
So back then we had this set of new signal processing technologies that we had developed on a Silicon Graphics Indy workstation, which was *the* platform for scientific computing at that time. To give you a concrete example - I was working as a mastering engineer at a studio at the time and we had to provide optimum sound quality for vinyl cutting in very little time. We were mainly into producing all kinds of dance, house, techno and drum’n bass tracks. Back then there was a loudness war going on between different labels from different countries and we had to “trick” our own vinyl cutter, an elderly lady who had previously spent most of her life cutting classics and jazz tracks, into cutting our stuff at insanely loud levels so as to provide the same level of quality as the underground labels in the UK and the Netherlands. I think she had never even heard of the kind of music that we dealt with before we first came into her office, and I have my doubts that she really thought of it as music, so she was totally out of her league. So in order to do this we developed an adaptive saturation algorithm that was able to achieve a much higher loudness than our competitors. We did this by exploiting psychoacoustic properties - a concept that was still pretty new at that time. Our algorithm even took the cutting amplitude and speed limitations of the particular lathe that she used into account.
The problem we were facing was that there was no audio editing application that allowed us to use these algorithms as “plug ins”, as that concept was still new. On the Silicon Graphics there were no real audio applications because SGI never realized that they could have been the next Apple if only they were clever enough. So we had to develop our own application and that’s how sonicWORX came to be.
We did find out soon that there was a market for it and sold it quite successfully until the end of the 1990s, when audio and MIDI applications began to merge. At that time people started investing their money into all-in-one multitrack programs and plug ins and didn’t really want to use a dedicated mastering application anymore. They either did their mastering on dedicated hardware or with software plug ins, which was bad for sonicWORX.
Blogasys: …so you discontinued sonicWORX at that time, right?
Stephan M. Bernsee:
Yes, that’s right - mainly for two reasons. First, like I said, with the emergence of countless plug in hosts and numerous home studios people either did their mastering on multiband compression hardware or with plug ins, so continuing sonicWORX didn’t make much sense to us - at least not with the feature set that it had back then.
The second reason was that it would have been exceedingly difficult and costly to bring sonicWORX to OS X. For instance, on MacOS 9 there was no multi-tasking. If you grabbed an analyzer window by its title bar it simply stopped drawing. We didn’t like that in a professional application so my former colleague Frederic Schelling, a brilliant Mac developer, devised his own way of drawing at interrupt time directly into the screen buffer using a predefined color as a “blue screen” chroma key. Something that was really frowned upon by Apple - for good reason - but worked beautifully for us and our users. Also, thanks to some tricky programming sonicWORX was the only application that could load an audio file and process it while you played it back at the same time. This wasn’t possible to do with any other audio editors back then - for some this is still true today. We couldn’t simply port these features to MacOS X because they were so deeply rooted within MacOS 9 that porting them meant essentially doing a complete rewrite.
So with these two main factors, the change in the market and the prohibitive cost of porting, we felt it necessary to rethink the concept of sonicWORX and redesign it from the ground up before coming out with a new version in 2010. It just doesn’t make much sense today to develop yet another audio editor, as there are plenty of them out there already and some are actually pretty good at what they do.
We wanted to go past dealing with audio in the usual way and beyond simply cutting, pasting and applying effects to it. We wanted to make the things that you hear visible and at the same time accessible to editing. So we came up with a new editing concept that presents sound in a way that is easy to understand and intuitive to manipulate.

sonicWORX Isolate (find link at the end of the interview)
Blogasys: Can you please explain the new features and what was the reason for you to develop them?
Stephan M. Bernsee:
If you look at the popular media formats that you find today there is a huge need to re-master classic recordings that were done in stereo - or even mono - as a surround mix. There are some incredibly talented sound engineers who spend a lot of time meticulously editing classic mono recordings to convert them to stereo and surround sound format. Sometimes you even want to do a remix, or use a new score in combination with vocals from a classic recording. Not to mention all the new creative ideas that you can explore once you’re no longer limited to the original harmonic and rhythmic context.
Of course, extracting information from music is not really something completely new; there have been countless scientific and a few commercial applications that have attempted to do this, with varying success. If you look for keywords like “blind source separation” or “auditory scene analysis” you’ll actually find a lot of papers and some applications, even free ones such as ‘nmfdemix’ by Remy Muller. I’m not sure if you recall a product that we introduced at Musikmesse 1998 which was called “Pandora”. Pandora was software for automatic de-mixing that could automatically extract voice and lead instruments from a mix. It worked remarkably well but was a disaster to operate because you had something like 120 parameters that you could change and, needless to say, it was painfully slow when processing tracks as the CPUs of that time were not very powerful. So we ended up licensing the SGI version of Pandora to another company as part of a 10 year exclusive contract and subsequently only sold “Pandora RT”, which was a feature-reduced realtime “voice reduction” software that came out as a plug in for sonicWORX and could reduce and sometimes even remove voice from mono or stereo audio files. In 2008 that contract expired and CPUs had become so much faster by then, so this gave us the opportunity to design a totally new product and focus on the task of manipulating mixes at the instrument level.
With our new version of sonicWORX we wanted to create an application that allows a new way of interacting with audio, notably selecting and exporting individual features in a mix. There is currently no other product that can do this the way we do it - on a 1:1 high resolution representation of the real signal. You don’t want the computer to second guess you on what part of a mix you want to select, as this limits your degree of freedom. So we didn’t want to make this automatic. We believe that the human ear is still superior at identifying sounds so we really wanted to leave that part to the user and assist him in a very easy to use way rather than take important decisions away from him.
Blogasys: How does this concept compare to Melodyne DNA?
Stephan M. Bernsee:
Actually, they might seem similar at first but really they’re not. I’m not even sure if this comparison is really fair. You see, both programs have, at least in my opinion, entirely different applications and problems that they solve. I think of Melodyne as a musical tool aimed at correcting recordings when you don’t have access to the musicians to redo it, because they’re no longer physically available. That’s also what they advertise it for and it does an amazing job at that. So, Melodyne is great when you want to quickly and semi-automatically change notes within a relatively simple to reasonably complex mix and I wouldn’t want to miss it in our studio.
But at the same time, with Melodyne DNA you stay within the original musical context, so the side effects and artifacts of the changes will in most cases be masked by the rest of the mix. If you have attempted to use DNA for actually extracting, say, the vocals from a mix you will have noticed that you don’t really have enough access to the decisions that the program makes to do this effectively.
Now, our interest is in extracting part of the mix as opposed to pitch shifting or time stretching individual notes within the original context. What we wanted to create is a comprehensive, “what you see is what you get” kind of way for the user to interact with the raw material. Until now the user was only able to hear things in a mix, now he can actually see and edit them - as instruments, not just as a bunch of frequencies, and let the computer take care of all the details such as unmixing voice and drums. This level of access is what we aim to provide with sonicWORX.

splitting audio signals into their components - sonicWORX Pro
Blogasys: What do you think will be possible with sonicWORX in the future? What are your ideas for future developments?
Stephan M. Bernsee:
Well, obviously I can’t talk too much about the future, but feature labeling, sound extraction and sound manipulation are likely to be a repeating pattern in our products over the next few years. For instance, we have just released a free plug in called “VuvuX” which is based on our sound demixing technology and removes the Vuvuzela noise from World Cup 2010 audio streams without affecting the stadium atmosphere or the commentary. So we already have a few interesting projects going in 2010 that I think will make for a very nice product portfolio.
Prosoniq has always been a company focusing on new technologies and we see ourselves primarily as technology providers. For instance, we were the first company to develop audio morphing based on feature recognition with a neural network, we had numerous wavelet transform based sound design algorithms in the 1990s when the concept was still new, we were the first to use raytracing to achieve room modeling and 3D localized sound, our TimeFactory was the first application to do polyphonic time stretching and we also helped Emagic beat MotU in getting formant corrected pitch shifting ready in Logic before they did in Digital Performer. By the same token, we’d be happy to start a dialog with anyone interested in using our sonicWORX technology in their own products, and I’m sure we’ll see a lot of new applications in the near future.
Blogasys: What can we users expect in the coming years? What do you think will be possible with hard- and software in the near future?
Stephan M. Bernsee:
Certainly a lot. We’ve come a long way in the past decades but we still have more brainpower than CPU horsepower, so I guess it’s fair to say there is plenty of room for exciting discoveries in the realm of understanding and modelling the auditory perception process on a computer.
Blogasys: What kind of instrument are you interested in?
Stephan M. Bernsee:
I’ve been playing the keyboard since I was 14 or so and did my school final on sound synthesis at the age of 18. I never quite liked the idea of being limited to the sound of a single instrument so this question is difficult for me to answer. I do like the human voice as an instrument as it is incredibly versatile, unique as an instrument and at the same time very personal.
Blogasys: Do you still have time to make music? Which equipment do you use and what do you like about it?
Stephan M. Bernsee:
I used to earn my money doing remixes and producing tracks in the 90s but I decided to give it up when I reached the junction where doing both was just too much and I had to decide what kind of future I wanted to pursue. I must say I am pretty much happy to develop and deliver the tools to people who in turn make the music that I like, so I never felt this as a loss. It’s only recently that I’m starting to make music again. Looks like I have to learn a lot from scratch - a few years have passed since I’ve used Notator SL on an Atari 1040ST but fortunately we’re still using the same black and white keys these days.
Our new CD, the “Vaust Project”, a cooperation with Bernhard Bouché and Klaus Abel, will be out in late 2010. We still use a lot of our original equipment like a PPG 2.1, Oberheim Xpander, Prophet 5, Emu Morpheus, Yamaha VL-1, Neuron, a bunch of AKAI samplers, a JD-880, several Kurzweil modules, a Korg Prophecy, an Ensoniq ASR-10, a Yamaha CS-40 and a couple of other legacy synths and samplers. Not forgetting our own morph and OrangeVocoder plug ins of course.

Hartmann Neuron, a collaboration between Stephan M. Bernsee and industrial designer Axel Hartmann
Blogasys: You developed the DSP for Hartmann’s Neuron. Could you please tell us a little bit about this collaboration? How did it begin, what do you think about the Neuron, and what are its advantages and disadvantages in your opinion?
Stephan M. Bernsee:
Initially Axel Hartmann and I had the idea to put some of Prosoniq’s technology into a hardware device. Since all our software at Prosoniq revolves around pretty much the same basic neural analysis, modification and resynthesis technology this seemed like a logical choice. I also think that all people involved in developing software ultimately have the dream to see their ideas cast into hardware. I must admit that looking back this was pretty selfish of me, and I was most certainly blinded by the prospect of being involved in developing a real hardware synth, as we really didn’t have the resources nor did they have the required infrastructure to make this happen. So this wasn’t a particularly good environment for a project of this magnitude but I guess I refused to see it. In the end a lot of compromises were forced upon the Neuron and regrettably in most cases I wasn’t even involved in the decisions.
In the end, the Neuron ended up as a - in my personal opinion - badly marketed, poorly manufactured, hopelessly underpowered device that ran a crippled feature-reduced version of what our technology was really capable of doing, with a half-done user interface software on top of it that was missing essential features. I’d say that this project, even though it had a lot of potential, good ideas and an absolutely fabulous design, was really a big disappointment for me when I think back to how the collaboration turned out. Partly this was due to my not taking a more active role in the management, which I couldn’t at the time, but mostly because Hartmann didn’t have the additional year that would have been required to finish their user interface software and they could not afford faster hardware because their manufacturing process was too expensive.
But despite all this disappointment I think that once you manage to look past the limited sound set that it has built-in the Neuron is still a hugely creative instrument and it holds a lot of inspiration and certainly a wealth of totally unique sounds for those who have enough patience and dedication to dive in and explore its possibilities. It never ceases to amaze me to see what a variety of sounds it can produce from a simple modellized sample. So even if it came out as something other than I had envisioned I do believe that it still managed to realize some of the potential that I personally wanted to see.
Blogasys: Do you see any chance for a Neuron II?
Stephan M. Bernsee:
Yes definitely. But if we were to develop a successor we would definitely need more financial resources and one or two skilled embedded systems developers who can afford to work on the system full time and not just in their spare time. We would need a project manager who is responsible for the project and for everybody getting ready in time, and a centralized development team - all the things that Hartmann didn’t have back then.
Blogasys: What kind of music do you listen to and do you have any favourite musicians? If yes, what do you like about their music and about them?
Stephan M. Bernsee:
I don’t think that my taste in music is set in stone and I never had any favourite musicians either as I rarely like everything that a particular artist does. I grew up in the 70s and 80s and I never managed to understand why many people would want to keep listening to songs from their childhood for the rest of their lives. I certainly don’t. So I try to keep an open mind and I enjoy pretty much anything from drum’n bass to classics if it has something unusual or interesting in it or is simply well done musically or artistically, or moves me in a certain way. It does have to have a structure to it that I can understand, though; I’m not at all a fan of certain abstract contemporary “serious” music that is better read than listened to.
Above all, I love anything that has to do with sound design, and if anything you could call me a fan of Ben Burtt’s work - with the exception of Wall-E perhaps which I don’t like that much because I am allergic to sinusoidal resynthesis artifacts, but that is probably a very unique condition not shared by too many people on the planet.
Blogasys: Thank you very much for this interview, Stephan!
Stephan M. Bernsee: Thank you!
Stephan M. Bernsee Links:
- Prosoniq homepage
- DSP Dimension
- sonicWORX Isolate
- sonicWORX example, extracting Peter Gabriel’s voice and a trumpet
- sonicWORX and Neuron Forum
- Hartmann Neuron on facebook
You can listen to some audio demos and see some photos of the Neuron in the Hartmann Neuron section of this blog.