11 Dynamics
11.1 Introduction
We now enter upon an aspect of sound restoration which is almost entirely subjective, and therefore controversial - the restoration of original dynamic range.
People do not seem to realise that the range of sound volumes in a musical or dramatic performance was usually “squashed” when the performance was recorded. It is a great tribute to the skills of professional recording engineers that so many people have been fooled. But you should learn how engineers altered things in the past, so you know how to do your best if you are ever called upon “to restore the original sound.”
The restoration of full volume range is never appropriate for an “archive copy,” there have to be special circumstances for an “objective copy,” and it is rarely essential for a “service copy.” But I maintain that a present-day operator with experience in controlling certain types of music when recording it, might be best-placed to undo the effect on similar music (and this applies to other subject matter, of course). It will nearly always be a subjective issue, and I mention it only because if the approach is to reproduce the original sound as accurately as possible, a present-day operator could be better-equipped to do the job than any future operator, as he knows the craftsmanship of masking the side-effects.
The same principles can be used when we consider that past engineers always worked with a definite end-medium in mind, whether it was acoustic disc reproduction, A.M. radio, or compact disc. Not only would they have kept the signal within suitable limits for such applications, but they would also have taken into account the “scale distortion” I mentioned in section 2.12. So, on the assumption that future listeners will not be working with the same medium, we may have to allow for this on the service copy at least.
There are three levels of difficulty. At the first level, manual adjustments of volume were made while the recording took place. At the second level, electrical apparatus was used to provide automatic compression of the dynamic range, and the role of the engineer was to ensure that the apparatus didn’t affect the music adversely. At the third level, automatic apparatus was used which was “intelligent” enough not to need an engineer at all; it could select its parameters depending on the subject matter, and could generally reduce volume ranges more than the first two levels.
Which brings us to a definite reason why I must write this chapter. Suppose your archive has made an off-air recording, and the broadcaster uses your archive so he may repeat something. Then you must reverse the “second and third levels” above, or the effects will be doubled. This neither represents the wishes of the original producer, nor makes a broadcast you can be proud of (let alone the listeners). Therefore you have to reverse the effects!
Recovering the sounds is rather like using a reciprocal noise reduction system (Chapter 8) whose characteristics are unknown. In fact, it is sometimes possible to confuse recordings made using reciprocal noise reduction with recordings made using other types of dynamic restriction. Your first duty is to ensure the recording was not made with reciprocal noise-reduction! A useful side-effect of restoring the original dynamic range is that background-noise of the original analogue media may be reduced at the same time.
Hardly any of the necessary apparatus for restoring dynamic range has yet been developed, although relatively minor modifications of existing machinery could allow most of the possibilities. Restoring the original range means moving the background-noise of any intervening medium up and down as well. This may prove a distracting influence which fights your perception of the original sound. It may be best to work with a copy of the recording which has already been through a sophisticated noise reduction process (sections 0 to 4.21). Since many of these processes are only appropriate for a service-copy, you might as well “be hung for a sheep as for a lamb” and get better dynamics as well!
I hope this essay will inspire others to do some development work. I seem to have been working on my own in this field so far. No-one else seems to have recognised there is a problem; yet there has never been much secrecy about what was going on. The techniques were taught to hundreds of BBC staff every year, for example. Nevertheless, I will start with a statement of the basic difficulty.
11.2 The reasons for dynamic compression
Since the beginning of time, it has been necessary to control the dynamic range of sounds before they came to the recording medium. Such controlling may be “absolute,” that is the sensitivity of the machine may be fixed with no adjustment during the recording, or “relative,” in which the overall sensitivity plus the relative sensitivity from moment to moment, are variable. Such control has always been necessary, because practical analogue recording media are unable to cover the complete dynamic range perceived as sound by the human ear. (Note that, once again, I am having to imply human hearing; readers concerned with other applications must make the appropriate mental adjustments). To oversimplify somewhat, there may be 120 decibels between the faintest detectable sound and the onset of pain, whilst practical recording media with this dynamic range are still only at the experimental stage. Furthermore, we cannot yet reproduce such a range, even if we could record it. And when it becomes technically feasible, health and safety laws will stop us again!
Therefore recording engineers have always practised implicit or explicit dynamic controlling, which usually means some distortion of the quantity of the original sound on playback. The next few sections will therefore comprise brief histories of different aspects of the topic, followed by my current ideas about how the effects of such controlling might be undone, or rather minimised.
Undoing such compression means you need lots of amplification. Please remember what I said above - I cannot be held responsible if you damage your loudspeakers or inflict hearing loss upon yourself. I am quite serious. There is a very real danger here, which will be exacerbated because many of the clues needed to control such sound lie at the lower end of the dynamic range. So you will have to turn up your loudspeakers to hear these clues. I also advise very quiet listening-conditions; these will make listening easier without blowing your head off (or annoying others). And, although you may find it fights what you are trying to control, I strongly advise you to install a deliberate overloading device in your monitoring system (such as an amplifier with insufficient power), preferably arranged so it comes into effect at the maximum safe volume. That way you will be able to separate the extra anomalies, because they will not coincide with distortion of the monitoring system itself.
11.3 Acoustic recording
There were only two ways of controlling signal volume in the days of acoustic-recording. The absolute sensitivity of the machine could be adjusted by changing the diaphragm at the end of the horn. Professionals used easy-to-change diaphragms for that very reason. The power-bandwidth principle came into effect here, although it was not recognised at the time. A thick diaphragm, being relatively stiff, had a high fundamental resonant frequency resulting in good treble response, but had low sensitivity to most notes within the musical range. A thin diaphragm, being relatively compliant, gave inferior high-frequency response, but greater sensitivity to lower notes.
This shall be considered in greater detail in the next chapter, but in the meantime I shall discuss the “relative” adjustments. The only techniques for altering sound volumes during a “take” were to move the artists, or to use artists trained in the art of moderating their performances according to the characteristics of the machine. The balance between different parts was predetermined by a lengthy series of trial-recordings. There was no way in which the absolute or relative sensitivities of the apparatus could be adjusted once the recording had started (Ref. 1). Because of these factors, it is the author’s view that we have no business to vary the dynamics as they appear on acoustically-recorded discs or cylinders. They were determined by the artists themselves, or by the person with the job of “record producer,” and for us to get involved would subvert the original intentions of the performance.
11.4 Manually-controlled electrical recording
There is no doubt at all that from the earliest days of electrical recording, volume adjustment was possible by means of the “pot” (short for “potentiometer”), better known as the “volume-control.” This introduced a variable electrical resistance into the circuit somewhere, and the principle is still used today. I shall use the term “pot” for the artefact concerned, whether it actually comprises an electrical resistance with a wiper contact, or a sophisticated integrated circuit controlled through a computer, or a combination of any number of such devices. The essential point is that, in my terminology anyway, a “pot” will imply that there is someone with a hand on a control somewhere, so the signal volume can be controlled before it gets to the recording machine.
This paragraph outlines the history of the hardware. Pots started off as wire-wound variable resistors with fairly fine resolution from one turn of the coil to another (usually much less than a decibel), but suffering from a tendency to make noises as the volume was changed. Professionals from about 1930 to about 1970 used “stud faders,” in which the wiper moved over a set of contacts made from precious metals to resist corrosion. Precise fixed resistances were wired to these studs. They were designed to give a predetermined degree of attenuation from one contact to the next, and the physical layout of the studs ensured consistency, so the same results could be guaranteed if it was necessary to change to a different fader, or use duplicate equipment, or repeat a session. There was always a trade-off between a large number of contacts and their resolution.
Eventually most organisations settled on “2dB stops,” and engineers might refer to changing the sound by “four stops,” meaning eight decibels. Values between these 2dB steps could not be obtained except by sheer luck if a wiper bridged two contacts. Line-up tones could be set up on a meter this way, but the process couldn’t be trusted in real recording situations. With some types of sound (low frequencies with a heavy sinewave content), audible “sidebands” occurred as the wiper moved from one stud to another. It was perfectly obvious to those concerned (who usually reacted by cleaning the studs vigorously, an inappropriate action since the clicks were an inherent side-effect of the system); but it is worth noting this, since it shows when volume-controlling is unlikely to have taken place. The effect could be concealed by making the change occur on a wideband sound, during a pause, or at a “new sound” such as a bar of music with new instrumentation or dynamics.
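The arithmetic of stud faders can be sketched in a few lines of Python. This is only an illustration: the function names and the rounding rule are my own assumptions, and real faders differed in how many studs they offered.

```python
STEP_DB = 2.0  # the "2dB stops" most organisations settled on


def stops_to_db(stops: int) -> float:
    """Convert a number of fader stops to decibels (four stops -> 8 dB)."""
    return stops * STEP_DB


def db_to_voltage_ratio(db: float) -> float:
    """Convert a level change in decibels to a voltage ratio."""
    return 10.0 ** (db / 20.0)


def nearest_stud_db(desired_db: float) -> float:
    """Snap a desired gain to the nearest stud; values in between
    could not be obtained reliably on a stud fader."""
    return round(desired_db / STEP_DB) * STEP_DB
```

A present-day operator reversing a stud-fader move might therefore expect the original changes to fall on multiples of two decibels, although this is a guide rather than a rule.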
Meanwhile, semi-professionals and amateurs used “carbon track pots,” comprising a shaped arc of carbon compound with a wiper running along it. This gave infinite resolution, but was prone to inconsistency (especially noticeable in the early days of stereo), and electrical noise if the track wasn’t perfectly clean. From about 1970 professionals preferred “conductive plastic” faders, which could be made consistent and free from sidebands. Sometimes the problem of dirty contacts was ameliorated by using the pot to control a “VCA” (voltage controlled amplifier).
The purpose of the pot (or a combination of pots and/or switches and/or integrated circuits) was threefold. First, it set the “absolute level.” This ensured the voltages were in the right ballpark for the subsequent apparatus. Next, the operator manipulated the pot so that the loudest sounds did not have any undesirable side-effects, such as distortion, or repeating disc-grooves, or aural “shocks” which might alienate the audience. Thirdly, the operator manipulated the pot to ensure that the quietest sounds did not get lost in the background noise of the recording medium or the expected listening environment, or were too weak to have the subjective impact that the artists desired. It will be seen that there was a large subjective element to this. It wasn’t simply a matter of keeping electrical signals within preordained limits.
Nevertheless, the electrical limits were important, and pots were almost always operated in conjunction with a meter of some sort, sometimes several at once. The design of such meters is not really a subject for this manual, but you should note that there were several philosophies, and that certain authorities (such as the B.B.C) had standardised procedures for operating pots in conjunction with such meters. As archivists, we may find it appropriate to treat B.B.C recordings using B.B.C metering principles if the intended effect of the broadcasting authority is to be reproduced (Ref. 2). Unfortunately, I have not been able to find an equivalent document for any other such organisation, although aural evidence suggests that something similar must have existed at one or two record companies.
The important point is as follows. Nearly all sound operators used both ears and eyes together to set the signal before it was recorded. Ears were needed to listen out for technical and artistic side-effects and strike a compromise between them, and eyes to read a meter to inform the operator how much latitude there was before side-effects would become audible.
Such side-effects might not be apparent to the engineer immediately associated with the performance. Meters also helped users of the sound further “downstream.” They ensured that broadcasting transmitters at remote locations didn’t blow up for example, or that disc recording machines didn’t suffer repeating grooves, or that tape recordings could be spliced together without jumps in absolute volume.
The present-day operator will have no difficulty distinguishing between two quite different sound-controlling techniques. One is the “unrehearsed” method, in which the engineer was unaware what was going to happen next, and reacted after the sound had changed. For example, a sudden loud signal might create an overload, and the engineer would react after the event to haul the volume down.
The other is the “rehearsed” method. The action here depended upon the signal content. If the idea was to preserve the contrast between the quiet and the loud sounds, for example to preserve the effect of a conductor’s sforzando without overloading the recording, the engineer might nudge the volume down carefully in advance, perhaps over a period of some tens of seconds. But if the idea was simply to keep the sound at a consistent subjective volume, for example two speakers in a discussion, he would wait until the last possible moment before moving the pot; nevertheless, it would be before the new speaker started talking. Double-glazed windows were provided between studio and control-room so he could see when a new speaker opened his mouth or a new instrument took a solo; but an intelligent engineer could often do the same job without the benefit of sight by his personal involvement with the subject matter. So a present-day operator ought to be able to reverse this action when necessary.
The point is that the present-day restoration operator should be able to distinguish between the “unrehearsed” and “rehearsed” methods of control. In the former case, we will have violent and unintended changes of sound level which cannot be called “restoring the original sound,” and they should be compensated as far as possible on the service copy. In the latter case, the contemporary operator was working to produce a desired subjective effect. Whether we should attempt to undo this depends, in my view, on whether the performance was specifically done for the microphone or not. If it took place in a studio at the recording company’s expense, then we should preserve the dynamic range as the company evidently intended it; but if the microphone was eavesdropping at a public concert, then it could be argued that the dynamic range as perceived by the concert audience should be reinstated.
11.5 Procedures for reversing manual control
At present we have no way of setting the “absolute level” of a sound recording unless the recording happens to come with a standard sound calibration. (And see section 2.12). So we are confined to reversing the “relative” changes only.
There are two main indications operators can use for guidance. One is if there is a constant-volume component to the sound, such as the hum of air-conditioning equipment. By listening to how this changes with time, it is possible to use this clue to identify the points at which the original engineer moved his pot, and by how much. The background need not be strictly constant for this to be possible. Passing traffic, for example, comprises individual vehicles coming closer to the microphone and receding again. However, there is always a certain rate-of-change to this, which does not vary so long as the traffic is moving more-or-less constantly (which a present-day listener can identify by ear). Thus, when an anomalous rate-of-change is perceived, it is a sign that the original engineer may have moved a pot or done something that needs our attention.
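The first of these clues can be sketched as follows. I assume, purely for illustration, that the level of the steady background component has somehow been measured in decibels at regular intervals; doing that measurement in the presence of the wanted sound is the hard part and is not attempted here. The function names and the one-decibel tolerance are hypothetical.

```python
def suspected_pot_moves(background_db, max_natural_step_db=1.0):
    """Return (index, step_db) wherever the background level changes
    faster than its natural rate-of-change would explain."""
    moves = []
    for i in range(1, len(background_db)):
        step = background_db[i] - background_db[i - 1]
        if abs(step) > max_natural_step_db:
            moves.append((i, step))
    return moves


def compensating_gain_db(background_db, max_natural_step_db=1.0):
    """Gain (in dB) to apply at each measurement point so that the
    flagged steps are undone and the background becomes steady again."""
    moves = dict(suspected_pot_moves(background_db, max_natural_step_db))
    correction = 0.0
    gains = []
    for i in range(len(background_db)):
        correction -= moves.get(i, 0.0)  # cancel any step flagged here
        gains.append(correction)
    return gains
```

For a background measured at -50 dB which drops suddenly to about -54 dB and later returns, the sketch flags both moments and proposes roughly +4 dB of compensation in between. The ear must still decide whether each flagged step was really a pot movement.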
The other case is where the wanted sound itself has a change of quality accompanying a change of quantity. The human voice is a good example of this; voices raised in anger are different in quality from voices which are not raised, and even if the original engineer has done a perfect job of holding the recorded signal at a constant electrical value, the change in quality gives it away. The same applies to orchestral music. This writer finds, in particular, that both strings and brass playing fortissimo have more harmonics than when they play mezzo-forte. It is less obvious than the spoken-word case; but by skipping a disc pickup across a record and listening to quality and quantity together, anomalies often show up. The original engineer was able to conceal these anomalies by operating his pot slowly or very quickly, as we saw earlier. By comparing different passages in rapid succession, his basic philosophy will become clear, and it can then be reversed.
Obviously, the present-day operator must have a very good idea of the actual sound of a contemporary performance. He might have to be a frequent concert-goer, and a practising sound-controlling engineer, and be familiar with hearing uncompressed performances on his loudspeakers, if his work is to have any value. But provided these requirements are met, the present-day operator could be better-equipped to tackle this challenge than anyone else.
The principal trap is that dynamic controlling may have been performed by the artist(s) as well as the engineer(s). Maybe the orchestra played mezzo-forte rather than fortissimo to make life easier for the engineers. This frequently happened in recording studios, and in British broadcasting studios prior to the advent of “studio managers” in the early 1940s. A meter cannot be used for undoing this effect. We can never say “this mezzo-forte was 15 decibels below peak fortissimo on this modern recording of the work, so we will make this old performance match it on the meter.” If the operator does not take the harmonic structure of the sound into account, the resulting transfer may be a gross distortion of the original. I believe that if an orchestra restricted its own dynamic range for any reason, the service-copy must reflect that.
There are a few other clues which can sometimes be exploited to identify dynamic compression. It is well-known that, with the advent of the potentiometer, engineers frequently controlled the volume downwards as a 78rpm disc side proceeded from the outer edge to the inner, in order to circumvent “inner-side distortion” (section 4.15). This almost always happened whenever a disc was copied, since potentially-distorting passages could be rehearsed. Most reasonable recording-venues have reverberation which decays evenly with time; indeed, the standard definition of “reverberation time” is how long it takes the reverberation to decay by sixty decibels. A listener may be able to spot when a change of volume has occurred, because there will be a corresponding apparent change in reverberation-time for a short period. Indeed, if he knows the sound of the original venue, he will be able to recognise this effect even when it occurs throughout the whole of a particular recording, perhaps because an automatic compressor was used. Which brings us to the next topic.
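The reverberation clue can be put into figures. The sketch below assumes an idealised venue whose reverberation decays at a steady rate in decibels per second, and a pot being moved at a steady rate during the decay; both are simplifying assumptions of mine, and the function name is hypothetical.

```python
def apparent_rt60(rt60_s: float, gain_change_db_per_s: float) -> float:
    """Reverberation time a listener would infer when the gain is
    changing during the decay.

    rt60_s: true time for the reverberation to decay by 60 dB.
    gain_change_db_per_s: negative when the engineer is fading down.
    """
    natural_rate = 60.0 / rt60_s                   # dB per second
    observed_rate = natural_rate - gain_change_db_per_s
    if observed_rate <= 0.0:
        raise ValueError("gain is rising as fast as the reverberation decays")
    return 60.0 / observed_rate
```

With a true reverberation time of 2 seconds, fading down at 10 dB per second makes the decay sound like 1.5 seconds, and fading up at the same rate stretches it to 3 seconds.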
11.6 Automatic volume controlling
In the early 1930s automatic devices became available for restricting the dynamic range of sounds before they were recorded or broadcast. Today these devices are called “limiters” and “compressors.” Different writers have given different meanings to these two words, so I shall follow the dominant fashion, and define them my own way!
I shall use the word “limiter” to mean a device which existed as an overload-protector, and a “compressor” as a device which had some other effect (usually artistic). I will also use the word “expander” for an automatic device which can increase dynamic range. The archivist can use expanders to undo some of the effects of compressors and limiters, but a full range of facilities does not yet exist on any one commercially-available device.
Limiters were invented first, and provide most of the work for us today. They started in the film industry, to help control spoken dialogue, and protect the fragile light-valves then used for optical film recording (section 13.14). With the relatively limited signal-to-noise ratio of the medium and suitable manipulation of other parameters, it was found that four to eight decibels could be taken off the peaks without the effect being apparent to anyone unfamiliar with the original. Thus the limiter quickly became another tool for creating convincing soundtracks from unconvincing original sounds. For films at least, an “unlimiter” should not generally be used today; but, if (say) a recording of film music were to be reissued on compact disc, it could be argued that an unlimiter is justified because the sound is being copied to a different medium.
I have found no evidence for limiters in pure sound work before 1936, when the R.C.A Victor record company used one to help with Toscanini’s uncompromising dynamics on commercial orchestral recordings. By 1939 they were commonplace in the US recording industry, and the B.B.C adopted them from 1942 to increase the effective power of their A.M. transmitters under wartime conditions. As with the film industry, four to eight decibels of gain-reduction was found useful without too many side-effects being apparent to listeners unfamiliar with the originals. But it must be remembered this judgement was made in the presence of background noises higher than we would consider normal today. If we succeed in reducing the background noise of surviving recordings, it may require us to do something with the dynamics as well.
An engineer generally listened to the effects of a limiter for the following reason. When two sounds occurred at once, one might affect the other, with a result called “pumping.” We shall come to some examples, and how the matter was resolved, later; but the best way of overcoming this difficulty in early days was to have two limiters, one for each sound, and mix the sounds after they had passed through the limiters.
By 1934 Hollywood had separate limiters on the dialogue, the music, and the effects. The final mix would have another limiter, but essentially this would function as an overload-protector; the component tracks would have been controlled individually. In sound-only media, again the RCA Victor company seems to have been at the forefront, with several limiters being available on each of the microphones as early as 1943.
The B.B.C also incorporated limiters in most of their new disc recording equipment from this time onwards. Here we have a real ethical dilemma, because there was always one limiter in circuit for each pair of recording machines (called “a channel”, designed to permit continuous recording). It was generally pre-set to take 4-8 dB off the peaks, and was not an “operational” control. Surviving nitrate discs may have a much better background-noise than that perceived by contemporary A.M listeners, which can be an argument in favour of undoing the effect. On the other hand, A.M listeners might have heard it after passing through a second limiter at the transmitter, this being an argument in favour of doubling the amount of expansion today! You can see the ethical difficulties; clearly, whether we should undo these effects will depend sharply upon the subject matter, whether it was also transmitted on F.M. or only on wartime A.M., the purpose for which the disc was made (e.g. was it an “insert” into a programme, or made for the Transcription service or given to the artist as part of his fee), and other considerations.
The European record industry was strangely slow to adopt limiters, and even slower to adopt several at once. The earliest example I know is EMI Columbia’s recording of Act 3 of Die Walküre at the 1951 Bayreuth Festival, where the device was used specifically to compensate for the fact that the singers were moving about on stage. During the same season Columbia covered Die Meistersinger. In this opera, the commercial 78s show a limiter was used during the disc-cutting process, but not on the LP (or the recent C.D). So by this time, limiters were already being used in two different ways.
With the advent of rock music from 1954, limiters became almost mandatory in popular music. Rock music was meant to be LOUD. Limiters not only enabled 100% loudness, they also enabled the unpredictable antics of vocalists to be balanced against the comparatively steady backing. Whenever mixing is taking place after one or more limiters, the limiter is an essential part of the balance, and we should not try to undo it.
When F.M radio started in Europe in the mid-1950s, limiters functioned simply as overload-protectors. F.M was supposed to be a hi-fi medium. Unnecessary distortions were abhorred, and limiters at F.M transmitters were hardly ever triggered until the advent of commercial local radio in Britain in the early 1970s.
Limiters were sometimes incorporated in domestic and portable recording equipment from about 1960. This was usually to act as an automatic way of setting the recording-level, thereby making recording easier for non-professionals. Such limiters had widely varying characteristics depending upon the imagined customer-base. A cassette-recorder for the home might react relatively slowly to overloads, but would eventually settle down to an average setting which would still allow loud peaks to sound loud. However, dictation-machines for offices would even out consecutive syllables of speech when delivered on-mic, while not amplifying office background noise, or being upset by handling-noise or mike-popping. We should therefore study contemporary practices if we aim to undo the effects; fortunately, this type of equipment was rarely used for archivally-significant work.
However, limiters became a real menace in domestic and semi-pro video equipment. The distractions of the picture-capture process meant automatic sound-control became almost essential. This wasn’t a problem in the film industry, because professional Nagra machinery became the industry standard. It incorporated a very “benign” design of limiter, which tended to be used carefully by professionals with rehearsals, and its results were always subject to further control in the film dubbing theatre. But amateur and professional video engineers tended to ignore the audio aspects of their kit, often using it without rehearsals in the presence of noisy backgrounds to record syllabic speech with fast volume changes. In addition, the noises made by the camera and its crew, and the difficulties of the sound-man (if any) hearing what was happening under these conditions, mean most video sound is a travesty of the original.
A final twist, which happened even with films, is that “radio mikes” became available from about 1960 to mitigate the problems of microphone cables. Usually, such mikes were used “in shot” (e.g. by an interviewer or a pop vocalist); but sometimes an experienced operator can tell a radio-mike has been used even when it can’t be seen, because it would be impossible to cover a scene any other way. (For example, long tracking shots across soft ground, or actuality battle-scenes). Radio mikes always incorporate limiters so they conform to international radio regulations when the sound gets too loud.
In the 1970s the battle for increased audiences meant more sophisticated signal-processors were used to increase the subjective loudness of radio stations. The manufacturers had proprietary trade-secrets, but generally the devices used more complex attack-time and decay-time circuits. Also the frequency range might be divided up into several separate compressed “slices,” so that a loud note in one slice did not affect the sound in another, and the overall volume remained high without perceptible side-effects. It is my unhappy duty to tell you that I do not yet know a way of reversing the effects of these devices. In principle, however, it should be possible to “reverse-engineer” the effects of most of them.
We might obtain the required parameters by comparison with (say) untreated commercial records which have been broadcast by the same station, or by acquiring an example of the actual device and neutralising its effects with op-amps, or by something which once actually happened to me. After I had done a live radio broadcast, the studio engineer proudly gave me a tape of what I’d said (knowing I was doing my own off-air recording at home), with the words “This hasn’t been compressed”. But for the remainder of this chapter, I shall confine myself to the second level of difficulty I mentioned above, automatic devices with no “intelligence.”
11.7 Principles of limiters and compressors
It is necessary to understand the principle of operation of a limiter before we can reverse-engineer it. I shall start with a “normal limiter.” This classic design is also known as the “feed-back” limiter, and was the only practical design before the mid-1960s. After that, devices with different architectures became possible, thanks to the development of solid-state VCAs (voltage-controlled amplifiers) with stable and predictable characteristics which didn’t need the feed-back layout.
The circuitry had the architectural shape depicted in Fig. 10.1. Uncontrolled electrical sounds arrived from the left, and volume-setting would usually be achieved by a pot P. This might or might not be an operational control, depending on the reasons for the presence of the limiter. There would now follow a piece of circuitry which gave a variable degree of amplification or attenuation. It has been shown here with the letters “VCA”; although this is a modern term, the principle is much older. The degree of amplification could be varied by changing a “sidechain” voltage; here, this voltage is depicted as coming up from below.
What is important is how this control voltage varied. This is what we must emulate to restore the original dynamics today. In a “feed-back” limiter, a sample of the output voltage was taken. In Fig. 10.1, it is being amplified in a stage shown symbolically as “A”; however, most limiters actually combined the amplification and VCA functions in the same piece of hardware. These have been shown separately, because the amount of amplification affected the compression-ratio of the limiter. The signal was then rectified (converted to a direct current) and applied to an analogue .AND. gate. Again, this is a modern term, although the principle is much older. An .AND. gate only passes signal when voltages are present at two inputs. In this case, a fixed voltage is present at the other input of the gate, here shown coming from a battery. (In practice, most limiters had a special power supply for this purpose, which was made variable in some designs). Only when the voltage at the first input exceeded that of the second did any voltage appear at the output of the gate. The output then travelled back to the VCA as the control voltage.
Thus, the limiter behaved as a normal amplifier until the signal reached a particular level of intensity. After this, the control voltage reached the VCA, and the amplification was reduced.
The degree of reduction (or the “compression-ratio”), was dependent upon the amplification at A. Since the VCA-stage often incorporated its own amplification, this is not a definable quantity in many practical limiters. With lots of amplification, no matter how much louder the signal may have become at P, the output would increase barely at all. With less gain, the limiter would turn into a “compressor” - a device that reduced peaks in proportion to their size, rather than bringing them all down to the same fixed level.
Fig. 10.2 depicts some steady-state characteristics of devices with “feed-back” architecture. Curve (A) is the “ideal limiter” characteristic, never actually achieved in practice. Curve (B) shows what was often accepted as a satisfactory limiter with “ten-to-one” compression, and curve (C) shows the actual performance of a 1945 Presto recording limiter for discs. Curve (D) shows how reducing the amplification at A might give a reduced compression-ratio - in this case, “two-to-one.” This would be for subjective reasons, rather than for preventing overloads.
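The steady-state curves can be approximated numerically. The following is a minimal sketch in Python, assuming an idealised straight-line characteristic above the threshold (real devices such as curve (C) were not this tidy). The function name `static_output_db` and its parameters are hypothetical, chosen for illustration only.

```python
def static_output_db(input_db, threshold_db=0.0, ratio=10.0):
    """Idealised steady-state output level (dB) of a limiter or compressor.

    Assumption: below the threshold the device is a plain amplifier;
    above it, every `ratio` dB of extra input gives 1 dB of extra output.
    """
    if input_db <= threshold_db:
        return input_db
    return threshold_db + (input_db - threshold_db) / ratio


if __name__ == "__main__":
    # Compare a "ten-to-one" limiter with a "two-to-one" compressor.
    for level in (-10.0, 0.0, 10.0, 20.0):
        print(level,
              static_output_db(level, ratio=10.0),
              static_output_db(level, ratio=2.0))
```

With a ratio of ten the output barely rises above the threshold, while a ratio of two reduces peaks in proportion to their size, which is the distinction drawn between a limiter and a compressor.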
The control voltage was made to change with time by a number of components, the three most important being shown symbolically in Fig. 10.1 by R1, R2, and C1. These represent two electrical resistances and a capacitor in the side-chain. Usually, if there was an overload, the aim was to get the signal volume down quickly before any side-effects could happen (distortion, repeating grooves, etc). So the limiter had to act quickly. We shall consider this factor in section 11.9. After the peak had passed, it was necessary to restore the amplification so quieter sounds would not be lost. C1 and R2 served the function of restoring the status quo; any charge built up on capacitor C1 would leak away through R2, and the amplification of the VCA would slowly revert to normal. This function will be considered in section 11.10.
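The charging and leaking of C1 can be imitated digitally. What follows is a rough sketch, assuming simple exponential behaviour with an attack time-constant standing in for R1 with C1 and a decay time-constant standing in for R2 with C1. The function `sidechain_envelope` and its arguments are hypothetical names, and real limiters often departed from this ideal.

```python
import math


def sidechain_envelope(rectified, sample_rate, attack_s, decay_s):
    """Imitate the voltage on C1 for a rectified side-chain signal.

    Assumptions: C1 charges exponentially towards the input (via R1)
    whenever the input exceeds the stored charge, and otherwise leaks
    exponentially towards zero (via R2).
    """
    attack_coeff = math.exp(-1.0 / (attack_s * sample_rate))
    decay_coeff = math.exp(-1.0 / (decay_s * sample_rate))
    charge = 0.0
    envelope = []
    for x in rectified:
        if x > charge:
            # Charging through R1: move quickly towards the new peak.
            charge = attack_coeff * charge + (1.0 - attack_coeff) * x
        else:
            # Leaking through R2: the status quo is slowly restored.
            charge = decay_coeff * charge
        envelope.append(charge)
    return envelope


if __name__ == "__main__":
    # A 0.1 second burst followed by 0.1 second of silence, at 1000 samples/s.
    burst = [1.0] * 100 + [0.0] * 100
    env = sidechain_envelope(burst, 1000, attack_s=0.001, decay_s=0.1)
    print(env[99], env[199])
```

In this sketch the stored value falls to about 37 per cent of its peak one decay time-constant after the burst ends; this is only one of the several ways in which engineers defined the decay-time.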
Three other parameters are relevant. One is to identify the action of the .AND. gate. If the limiter was being used as an overload protector, this setting would obviously be left alone for the duration of a recording or broadcast, or else the level of protection would vary. Practical limiters might have a control called “Threshold” or “Set Breakaway” to adjust this; but the gain of the subsequent recording equipment also had the same effect, and cannot be quantified today. We shall need to identify the whereabouts of the threshold by watching the compressed signal on a peak-reading meter.
The second is the compression-ratio. Curve 10.2B shows ten-to-one compression. At first, we might try using one-to-ten expansion to restore the original range. Unfortunately, this is usually impossible to achieve for two reasons. First, our threshold-emulator has to be set extremely accurately, and this must remain true at all frequencies. Half a decibel of error will result in five decibels of incorrect expansion, and an audible disc click may be amplified into something like a rifle-shot. Secondly, the actual expansion-ratio was never consistent, as we can see from curve 10.2C; this too can result in wild deviations with small causes. An expansion-ratio of about three-to-one is about the best we can reverse-engineer with present technology. But all is not lost; we shall see how to circumvent these problems in section 11.11.
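The arithmetic behind the half-decibel example can be checked with a few lines of Python. This is only a sketch, assuming an idealised expander with a straight-line characteristic above its threshold; `expand_db` is a hypothetical helper, not a description of any real device.

```python
def expand_db(input_db, threshold_db=0.0, expansion=10.0):
    """Idealised steady-state output level (dB) of a one-to-N expander.

    Assumption: signals below the threshold pass unchanged; above it,
    each dB of input becomes `expansion` dB of output.
    """
    if input_db <= threshold_db:
        return input_db
    return threshold_db + (input_db - threshold_db) * expansion


if __name__ == "__main__":
    # Two peaks which differ by only half a decibel after limiting.
    for expansion in (10.0, 3.0):
        difference = expand_db(1.5, 0.0, expansion) - expand_db(1.0, 0.0, expansion)
        print(expansion, difference)
```

A half-decibel discrepancy above the threshold becomes five decibels with one-to-ten expansion, but only one and a half decibels at one-to-three, which suggests why the gentler ratio is more manageable.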
The third parameter is “side-chain pre-emphasis.” There might be an audio equaliser incorporated in the amplifier A. The most common reason for this was to increase the amount of treble going to the rectifier. In an F.M. transmitter, for example, the wanted sound is transmitted with high-frequency pre-emphasis, as we saw in section 7.3. If the high frequencies are allowed at their full volume they will overload (or break international regulations). The equivalent operation could, in theory, be done by putting the pre-emphasis circuit before the VCA; but then it is difficult to judge the subjective effect.
Oddly enough, side-chain pre-emphasis seems rarely to have been used on mono disc records, despite the presence of pre-emphasis. But when motional feedback cutterheads became available in 1949, there was a sudden demand for the facility. Motional-feedback cutterheads have their resonant frequencies in the middle of the audio range, so kilowatts of power might sometimes be needed to force the mechanism to trace an undistorted high-frequency wave. And when you burnt out a motional-feedback cutterhead with its special coils, the damage was very expensive. There was always a sidechain-preemphasis limiter on stereo cutterheads, although every effort was made not to trigger it. Sidechain-preemphasis was normal on optical film recordings, for much the same reason.
Now for some words on the subjective use of the facility. Normally, vowel sounds yield higher voltages than consonants (particularly sibilants), so using a “flat” limiter would compress the vowels and leave the sibilants proportionally stronger. Side-chain pre-emphasis diminished this effect. Optical film sound was very prone to high-frequency blasting, and the matter was not helped by cinema soundtracks being reproduced more loudly than natural. As far as I know, all limiters for protecting optical light-valves had side-chain preemphasis, with different settings for 35mm and 16mm film. This circuitry comprised a resonant peak rather than a normal 6dB/octave equaliser. This had the advantage that short transients, such as gunshots, did not cause the limiter to respond immediately, and thus the transient quality was preserved. Many early microphones and cutterheads had noticeable resonances in the range 5kHz to 10kHz. A naturally-pronounced sibilant could be turned into an irritating whistle by such equipment. Sidechain preemphasis could diminish the irritation without affecting the other consonants, so its effect was much appreciated by speakers with false teeth!
11.8 Identifying limited recordings
The first task of the present-day operator is to establish whether a limiter was used, so we know when to reverse-engineer it. The best clue is to reproduce the recording with the correct equalisation, and watch it on an analogue peak-programme meter (PPM). This is the standard BBC instrument I mentioned above in section 11.4. It has a mechanical pointer with a relatively short attack time and a long recovery time, and it is possible to see exactly how the peaks on the recording behave to infinite resolution (unlike modern “digital” or “bar-graph” displays). If the peaks resolutely hit the same point on the dial and never go beyond it, a limiter has almost certainly been used. Un-limited speech and music usually has at least one “rogue peak” a couple of decibels louder than any other.
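For a digitised transfer, the same observation might be roughed out numerically. The sketch below is an assumption-laden heuristic and no substitute for watching a PPM: it takes a list of peak readings in decibels (however they were measured) and asks what fraction sit within a small tolerance of the highest one. The names `fraction_near_ceiling` and `looks_limited`, and the default figures of 0.5 dB and 30 per cent, are arbitrary illustrative choices.

```python
def fraction_near_ceiling(peaks_db, tolerance_db=0.5):
    """Fraction of peak readings lying within tolerance_db of the highest."""
    ceiling = max(peaks_db)
    near = [p for p in peaks_db if ceiling - p <= tolerance_db]
    return len(near) / len(peaks_db)


def looks_limited(peaks_db, tolerance_db=0.5, min_fraction=0.3):
    """Heuristic guess (an assumption, not a proof) that a limiter was used.

    Many peaks crowding the same ceiling suggest limiting; a single
    "rogue peak" a couple of decibels above the rest suggests none.
    """
    return fraction_near_ceiling(peaks_db, tolerance_db) >= min_fraction


if __name__ == "__main__":
    print(looks_limited([-1.0, -0.2, 0.0, -0.1, -0.3, -6.0]))
    print(looks_limited([-8.0, -6.0, -7.0, -5.0, -2.0, 0.0]))
```

Such a test says nothing about whether several microphones were limited individually before mixing; ears and experience remain the final judge.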
Another sign is if the sounds between the peaks are heard to “pump”, as we defined in section 11.6. This is very conspicuous when there is a constant element in the wanted sound - for example, air-conditioning noise or background traffic. Pumping is also noticeable when there is a number of peaky things sounding at once. A choir with eight treble voices is a good example. Treble voices are especially prone to “peakyness” with their vibratos, and several such voices often modulate each other in an easily-recognised manner. Applause can have a similar result, especially if there are one or two dominant individuals. (And again I must remind you not to confuse this effect with that of a reciprocal noise reduction system).
A third clue is whether the subject matter was unrehearsed, so the operator was forced to use a limiter with a faster-than-human reaction-time to prevent the troubles we mentioned before. The recordings of the British Parliament, with rowdy and unpredictable interjections during the debates, are prime candidates; we can tell a limiter has been used even though no side-effects are audible, because it would be quite impossible to cope any other way!
You can sometimes detect a limiter by its effect on starting-transients; this gives a characteristic “sound” which you will have to learn, but I shall be enlarging upon this point later.
The best way for an original recording engineer to avoid pumping was to combine several separate microphones which were limited (or compressed) individually. Thus compression was an integral part of the mix, and should not be undone. The rigorous peaking of the PPM does not tell you whether this has happened; you must use your ears and your experience.
Perhaps the best way of learning the effects of a limiter is to use one - or rather, misuse one - yourself; but do it on material which hasn’t been controlled before! Most broadcasts and many LPs have already been limited by operators skilled in minimising the side-effects, and such material will not give typical results. As I keep saying, any skilled maker of such recordings is supremely well-equipped to unmake them.
11.9 Attack times
Having proven that we do need to neutralise a limiter, we should first think about setting our equipment to emulate the action of R1. This is how we do it.
If R1 was a very low resistance, there would be nothing to prevent the limiter operating quickly, which was the desired aim. But practical circuits always had some resistance, so R1 always had a finite value, even when the circuit designer didn’t actually provide a physical resistance R1 in the wiring. Thus all “feed-back” limiters had a finite “attack-time.” It was usually in the range from ten milliseconds to 0.1 milliseconds. Anything longer than ten milliseconds might give audible harmonic distortion or catastrophic failures of powerful amplifying equipment, and two or three milliseconds was the most which could be tolerated for preventing repeating grooves in the constant-amplitude sections of disc-recording characteristics. Even-shorter attack-times were needed to prevent the clash of ribbons in optical light-valves.
In the 1940s and 1950s, it was noticed that starting-transients (particularly the starts of piano notes) sounded wrong. At first, this was attributed to “overshoot.” It was believed that the start of the note was passing through the limiter without being attenuated, so there was a short burst of overloading in the following apparatus until the limiter attacked. So limiters were made with even shorter attack-times, but matters got no better.
Eventually it was realised that the sudden change of volume was generating sidebands, like stud faders, and the solution was deliberately to slow the limiter. The BBC fixed on ten milliseconds for its transmitters as being the optimum for balancing the side-effects of distortion and sidebands; but this was too long to prevent groove-jumping on discs, and the recording industry generally used shorter attack-times. I shall not go any further into the philosophy of all this. In conclusion, if you aim to “undo” a limiter, you may need to emulate the attack-time today in certain circumstances. And the best way of doing this is - once again - to emulate R1 using an experienced ear.
Until now, I have not said anything about the box marked “VCA.” In the days of valves, this comprised a “variable-mu” valve, a device whose amplification was directly controlled by a grid in the electron path. Unfortunately, a change-of-voltage coming from R1 would be amplified and emitted as a “thump.” To neutralise this, professional circuits used a pair of variable-mu valves in push-pull. The valves had to be specially selected and the voltages balanced so the thump was perfectly cancelled. Unfortunately, this did not always work. In domestic applications, the only solution was to increase the attack-time or cut the bass at the output, so the thump was effectively infrasonic. The Fletcher-Munson effect often masks quieter thumps; but they can be shown up by watching the waveform on an oscilloscope, when the zero-axis appears to jump up and down.
Although I have done a lot of work in “unlimiting” recordings, I have not yet concocted a circuit to generate an equal-but-opposite thump. When we get this right, it will provide another clue to setting the attack-time and the degree of expansion, because we shan’t get consistent results otherwise; and when we do get consistent results, it will be a powerful proof we are doing the job objectively. But this is in the future.
The cases of very long attack-times are fortunately much rarer, and easy to detect. Sometimes the machine took as much as a second to react, so that it sounds like a clumsy human-being behaving in the manner we saw earlier. However, it will do this consistently. It will not have the common-sense to realise another loud peak may follow the first, and this will tell you an automatic device is responsible!
At this point, I must mention that in 1965 came the “delay-line limiter”, a device which had a new architecture differing from the classical layout. This stored the sound for a short time in a delay-line, taking the side-chain signal from before the delay-line, so the VCA could reduce the gain before the peak got to it. This architecture is known as a “feed-forward” limiter, and can have other advantages (such as infinite compression-ratio). But for the moment, please remember that from 1965 onwards we may find recordings with negative “attack-times.”
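The feed-forward principle can be illustrated in a few lines. This is a deliberately crude sketch, assuming an instantaneous gain change with no smoothing and an infinite compression-ratio; the function `lookahead_limit` and its parameters are hypothetical, and no real delay-line limiter is claimed to have worked exactly this way.

```python
def lookahead_limit(samples, ceiling, delay):
    """Crude feed-forward limiter: the side-chain sees each peak `delay`
    samples before that peak emerges from the delay-line.
    """
    out = []
    for n in range(len(samples)):
        # The sample now leaving the delay-line (silence until it fills).
        delayed = samples[n - delay] if n >= delay else 0.0
        # The side-chain inspects everything currently inside the delay-line.
        window = samples[max(0, n - delay): n + 1]
        peak = max(abs(s) for s in window)
        gain = ceiling / peak if peak > ceiling else 1.0
        out.append(delayed * gain)
    return out


if __name__ == "__main__":
    signal = [0.1] * 5 + [1.0] + [0.1] * 5
    print(lookahead_limit(signal, ceiling=0.5, delay=2))
```

In the example the gain is already reduced for the two quiet samples which precede the peak at the output, which is what is meant by a negative “attack-time.”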
Fortunately, the setting of the attack-time is not often a critical matter today. Most sounds are of relatively long duration, longer than about ten milliseconds. So most limited peaks will be followed by a “plateau” of sound of more-or-less constant volume, during which time most expanders will respond satisfactorily. Only extremely short loud sounds (such as hand-claps and the starts of piano-notes) will cause trouble.
11.10 Decay-times
To reverse the “decay-time,” we must emulate the setting of the R2/C1 combination. This is usually more important than emulating R1. The decay-time defined how long it took for normal conditions to be re-established after a peak. Engineers had several ways of defining it, which I won’t bother you with. But it had to be considerably longer than about a tenth of a second, or appreciable changes would happen during each cycle of a loud sound, causing harmonic distortion. The maximum encountered is usually around ten seconds or so; I cannot be more precise without going into how to define this characteristic. Fortunately, we don’t need to bother with this complication. It never seems to be documented, so we have to set it by ear anyway.
In practice, the decay-time of the limiter or compressor was usually made adjustable to suit the subject matter being recorded. A fifth of a second corresponded to syllabic speech, so the limiter could compress one loud syllable without affecting others. At the other extreme, ten seconds was slow enough not to affect the reverberation behind music.
A further complication occurred when a limiter was installed as an overload-protection device which might have to cope with all types of subject matter. A broadcast transmitter is an obvious example; transmitter engineers certainly couldn’t re-set the switch every time speech changed to music. Furthermore, even at the long setting, there was always the risk of an isolated peak. One loud drum-beat would then be followed by many seconds of music at reduced power. Although this could be solved by two limiters in series set to different recovery-times, the solution adopted was a minor modification. All BBC transmitters and the Type D disc-cutters had this facility (although not the Presto disc-cutting equipment); and when the idea was fitted to commercial limiters from about 1955 onwards, it was described on the switch as “AUTO” or some such word. I call this a “dual recovery-time.”
To set the recovery-time of our expander we must usually listen to the aural clues I mentioned in section 11.5. We shall probably discover why there were so many definitions of “recovery time”! The VCA frequently did not have a linear response to the control voltage. Although the voltage will have decayed exponentially at R2, the VCA-section may have resulted in a different recovery-pattern. We will probably find a satisfactory setting for reversing small peaks, but be unable to hold any constant background-noise steady between louder or wider-separated peaks. At present, I do not know an expander which enables us to correct for this, although it is a relatively trivial matter to build one. Something with an adjustable non-linear transfer-characteristic is needed in the side-chain.
A further development was the “triple recovery time.” (Ref. 3). This was developed especially for the all-purpose transmitter protection I mentioned earlier. The third time constant was made dependent upon the output signal-level; when this fell below a certain level for a certain amount of time, the machine assumed a new speaker was about to start talking, and reset itself quickly. In the writer’s experience this worked very well. For the record, the new circuit was installed at the first ten BBC Local Radio stations in 1969.
11.11 The compression-ratio and how to kludge it
The next thing we shall discover is that we cannot know, much less reverse-engineer, the exact compression-ratio of the particular limiter. It is rather like balancing a razor-blade on its edge. Even if we knew the threshold, the attack-time, the decay-time, and the actual characteristic, any slight mismatch will have a catastrophic effect - usually the gain of the expander will increase dramatically and blow up your loudspeaker.
In any case, you can see from curve 10.2(C) that there may be ambiguities. If we were trying to simulate this particular limiter and we had an output 4dB above the threshold, we would not know if this corresponded to an input of +5dB or +20dB. Usually we have to live with the fact that a limiter destroys the very information needed to expand the signal again. In this respect, it differs from a reciprocal noise reduction system, where the compressed sound is deliberately designed to be expandable.
How then do we undo a limiter at all? Here is where we make use of the other clues I listed earlier - any steady background-noise, the harmonics of musical instruments, etc., - routing the signal through an expander circuit. This does nothing until a peak signal comes along. The circuit detects such a peak because the threshold at the .AND. gate is set by a control R4. However, the circuit then expands the signal, not according to any pre-set expansion-ratio, but according to the setting of another pot R5. This is where the most important part of the subjective judgement comes in.
The operator will need a great deal of rehearsal to set this pot, and he will need to keep his hand on it all the time as the work progresses. He will also need a marked-up script (for speech), or a score (for music). By listening to the aural clues, he will continuously manipulate this pot according to the loudness of the original sounds, and the circuit will take over responsibility for pulling the sound up faster and more accurately than the operator can do it manually (where 0.1 to 10 milliseconds was the norm). It will also do the chore of restoring the original gain at the correct recovery-time after the passage of each peak, while the operator sets R4 ready for the next peak according to his rehearsed script.
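The division of labour between operator and circuit might be imitated in software along the following lines. This is a sketch under stated assumptions, not a model of the actual circuit: the threshold argument stands in for R4, a per-sample list of gains in decibels stands in for the operator's hand on R5, and the attack and decay behaviour is assumed to be simple exponential. All names are hypothetical.

```python
import math


def kludge_expander(samples, sample_rate, r4_threshold, r5_gain_db,
                    attack_s=0.001, decay_s=0.5):
    """Expand peaks by an operator-chosen amount rather than a fixed ratio.

    r5_gain_db holds one value per sample: the gain (dB) which the
    operator has decided the current peak should receive.
    """
    attack = math.exp(-1.0 / (attack_s * sample_rate))
    decay = math.exp(-1.0 / (decay_s * sample_rate))
    control = 0.0  # 0 = no expansion, 1 = the full amount set on R5
    out = []
    for x, gain_db in zip(samples, r5_gain_db):
        if abs(x) >= r4_threshold:
            # A peak has crossed the threshold: pull the gain up quickly.
            control = attack * control + (1.0 - attack)
        else:
            # The peak has passed: recover at the chosen decay-time.
            control = decay * control
        out.append(x * 10.0 ** (control * gain_db / 20.0))
    return out


if __name__ == "__main__":
    peaks = kludge_expander([0.9] * 200, 1000, 0.5, [6.0] * 200)
    print(peaks[-1])
```

Sounds which never reach the threshold pass through unchanged, so the operator's judgement affects only the peaks.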
Another useful facility is to insert a filter at Q. Almost as soon as the sound film was invented, cinema-goers noticed that a “flat frequency response” was not successful. There was a great deal of discussion whether this should be the responsibility of the studio or of the cinema, but the psychoacoustic factors were not realised at first. A definitive answer did not arise until 1939, when “dialog equalization” was researched, to compensate for cinema audiences hearing dialogue from the loudspeakers at volumes much louder than natural (Ref. 4).
The result was to cut the bass somewhat. For television and video that has been through a limiter but not a “dialog equalizer”, the original sound may be re-created more faithfully by preceding the expander with a treble-lift circuit and following it with a treble-cut circuit, with reciprocal responses so the overall frequency response is maintained. The researches of Ref. 4 have proved very helpful in developing such circuits. The result is that the signal is expanded more as voices are raised in pitch. Since louder voices are higher in pitch, background sounds can be made to sound consistent for longer periods of time.
These ideas form, frankly, two kludges to get around anomalous and inconsistent compression-ratios; but they often work. The only time they cannot work is when the recording is compressed so deeply that some of the background noises reach peak volume. If the background reaches the setting of R4, the circuit will overemphasise it to the peak setting R5. This is particularly troublesome when the recovery-time R2 is short. (When it’s long, it’s possible to switch the expander out-of-circuit manually before the background reaches peak volume).
One potential solution is to introduce a filter into the side-chain at X which discriminates between the wanted and unwanted sounds. If the unwanted sounds are low-frequency traffic-noises for example, it is sometimes possible to insert a high-pass filter here which allows speech vowel sounds through, but cuts the traffic. This does not eliminate the traffic frequencies from the corrected recording - the original sounds are preserved - but it prevents them triggering the expander anomalously.
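The idea can be sketched as follows, assuming a simple first-order high-pass filter placed in the side-chain only. The functions `one_pole_highpass` and `triggers`, the 300 Hz cutoff, and the test tones are all illustrative assumptions; a practical filter would be chosen by ear for the material in hand.

```python
import math


def one_pole_highpass(samples, sample_rate, cutoff_hz):
    """First-order (6dB/octave) high-pass filter, for the side-chain only."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out


def triggers(samples, sample_rate, threshold, cutoff_hz):
    """Mark the samples at which the filtered side-chain reaches threshold.

    The audio path itself is untouched; only the decision is filtered.
    """
    sidechain = one_pole_highpass(samples, sample_rate, cutoff_hz)
    return [abs(s) >= threshold for s in sidechain]


if __name__ == "__main__":
    rate = 8000
    traffic = [0.9 * math.sin(2 * math.pi * 40 * n / rate) for n in range(rate)]
    voice = [0.9 * math.sin(2 * math.pi * 1000 * n / rate) for n in range(rate)]
    print(any(triggers(traffic, rate, 0.5, 300.0)))
    print(any(triggers(voice, rate, 0.5, 300.0)))
```

Here a loud 40 Hz rumble fails to trigger the expander, while a 1000 Hz tone of the same amplitude still does; the rumble nevertheless remains in the corrected recording.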
Another idea (which I have not tried) is to insert a highly selective filter at X. If there is a reasonably constant background, such as the hum of air-conditioning, this can be selectively picked out, and used to control both the time and the amplitude of the expansion process. But, as I say, I haven’t tried this and I am sceptical regarding its success. Automatic selection is difficult when the hum is in the background. I suspect the human ear is better able to pick out such features when there are loud foreground sounds. However, expanders with such side-chain filters are available commercially.
Thus, to cope with the problem of backgrounds and foregrounds being of similar volume, I prefer a digital audio editor in which it is possible to pre-label passages where expansion should take place (and by how much), so it will ignore passages where the background must not be expanded. Research into such a program is taking place as I write.
REFERENCES
- 1: Peter Ford, in his article “History of Sound Recording” (Recorded Sound Vol. 1 No. 7 (Summer 1962), p. 228), refers to the surviving acoustic lathe at EMI. He says: “A cord and coiled-spring tensioning device . . . provided a means of tuning the resonances of the diaphragm assembly to fine limits.” This was a misunderstanding. About twenty years after that article was published, the cord perished, and the assembly fell apart. Thanks to the co-operation of Mrs. Ruth Edge of the EMI museum, I was able to examine the parts, and it was possible to see that the cord-and-spring assembly was designed to permit the quick exchange of cutting styli. It could not be used to tune an individual diaphragm, much less alter its properties during a performance.
- 2: British Broadcasting Corporation: C. P. Ops. Instruction No. 1 - “Control and Modulation Range Instructions.” This version came into effect on 1st January 1957, and was reprinted, with additional information, in “Programme Operations Handbook (Sound Broadcasting)”, December 1956, pp. 165-170.
- 3: British Broadcasting Corporation: D. E. L. Shorter, W. I. Manson, and D. W. Stebbings, Research Department Report No. EL-5, “The dynamic characteristics of limiters for sound programme circuits.” (1967). This was reprinted in a slightly shortened form as BBC Engineering Monograph No. 70 (October 1967).
- 4: D. P. Loye and K. F. Morgan (Electrical Research Products Inc.), “Sound Picture Recording and Reproducing Characteristics” (paper), Journal of the Society of Motion Picture Engineers, June 1939, page 631.