Trial, Controlled

When and how did we invent the RCT? (part I)

Sep 30, 2025

From the lofty intellectual heights of 2025, it’s easy to explain the value of a randomized controlled trial. We want to know what will happen if we do something, compared to if we don’t. Unfortunately, in any given situation, we can only either do the thing or not do the thing – it’s not possible to both do it and not do it, and then compare the outcomes. To get as close as we can to that impossible ideal, we collect lots of similar situations (e.g., patients), and do the thing for some of them but not others (e.g., give a new medication to some patients and standard care to the others). We decide which of the situations get the thing using randomness, in order to avoid the biases of any other method. Et voilà! A setup that lets us compare a world where we do something to a world where we don’t – or at least, as close as we can get to that without time travel.
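For readers who like to see the logic run, here is a minimal sketch of that setup as a simulation. Everything in it is an assumption made for illustration, not taken from any real trial: the name `run_trial`, the patient count, the outcome distribution, and the "true" effect size are all hypothetical. Each simulated patient gets a baseline outcome, a random half receive a treatment that adds a fixed amount, and the difference in group averages serves as the estimate of the effect.

```python
import random
from statistics import mean


def run_trial(n_patients=1000, treatment_effect=2.0, seed=0):
    """Simulate one randomized controlled trial on hypothetical patients.

    Returns the difference between the mean outcome of the treated group
    and the mean outcome of the control group.
    """
    rng = random.Random(seed)

    # Each patient's outcome if left untreated (made-up distribution).
    baselines = [rng.gauss(10.0, 3.0) for _ in range(n_patients)]

    # Let randomness, not the doctor, decide who gets the treatment.
    indices = list(range(n_patients))
    rng.shuffle(indices)
    treated = set(indices[: n_patients // 2])

    # Treated patients get baseline plus the effect; controls get baseline only.
    treated_outcomes = [baselines[i] + treatment_effect for i in treated]
    control_outcomes = [b for i, b in enumerate(baselines) if i not in treated]

    return mean(treated_outcomes) - mean(control_outcomes)


estimate = run_trial()
print(f"Estimated effect: {estimate:.2f} (true effect in this simulation: 2.00)")
```

Because assignment is random, the two groups start out alike on average, so the gap between them should land near the effect that was built in. Swapping the shuffle for some other rule, such as treating whoever looks healthiest, would be one way to watch the estimate drift.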

But now suppose you’ve never thought of it that way, and neither has anyone else. And it’s not your job to run experiments to figure things out, and in fact that’s not the job of anyone you’ve ever heard of. Maybe you’re a doctor. Your patients come to you with various maladies, and you give them the best treatments you know of. Sometimes they recover, sometimes they don’t.

Would you invent the randomized controlled trial?

Uncontrolled

To improve your practice as a hypothetical historical doctor, you might seek the advice of more experienced doctors, study existing medical texts, or contemplate theories of anatomy and pathology. Tradition, authority, and pure reason can get you a long way. If you’re unsatisfied with these methods, you might decide that empirical observation and experiment are the best way to figure something out. But even then, you won’t necessarily arrive at the idea of a controlled trial.

Galen, the Greek physician who lived around 200 AD, is considered a key figure in experimental medicine. He studied anatomy by dissecting and vivisecting animals, and his experiments took the form of applying various surgical techniques to animals and seeing what happened. For example, consider his research on arteries. When you dissect a dead animal, you find that the arteries are mostly empty (because the blood has drained out), so people believed that the arteries were a system of tubes for conveying inhaled air to the rest of the body. They realized that when you cut the artery of a living animal, it bleeds, but they theorized that this was because the cut lets the air out of the artery, and then blood rushes in. Galen’s experiment was to expose a length of artery in a living animal, tie ligatures in two places, and then open the artery in between. There was, of course, already blood in this isolated section of artery.

Galen dissecting a pig. Illustration from an edition of Galen’s works published in Venice in 1565.

This kind of experiment wouldn’t really benefit from a control. The question under investigation wasn’t fundamentally about the effect of the surgery, so a second animal standing by, not being operated upon, wouldn’t have been helpful. Galen’s surgical intervention was just his method of peering into the mysterious black box of the living body, and cleverly distinguishing between the predictions of two different theories.

About fourteen centuries later came what we now call the Scientific Revolution in Europe. There was renewed interest in empirical observation and experimentation, there were innovations in precise measurement and mathematical modeling, and significant advances in our understanding of astronomy, optics, physics, chemistry, etc. These developments are foundational for modern science, but they don’t really prefigure the modern controlled trial. Early scientists aimed to observe nature, uncover its hidden regularities, and develop mathematical laws to describe and predict its mechanisms. An experiment was just a little corner of nature set up for close observation. A bronze ball on a wooden ramp, for example, allows you to take a hundred repeated measurements and conclude that “the spaces traversed were to each other as the squares of the times.”

Even when studying the efficacy of a drug, which we now consider the perfect occasion for an RCT, it’s possible to think very hard about what experiments to conduct and never come up with the idea of a controlled trial. In The Canon of Medicine (1025 AD), Persian physician and philosopher Avicenna laid out “the rules that must be observed in finding out the potency of medicines through experimentation.” The rules are all described in terms of Avicenna’s theory of the four temperaments: hot, cold, moist, and dry. For example, we are instructed to test every drug against two “contrary” conditions, such as a cold disease and a hot disease; if the drug works against the cold disease but not the hot disease, this shows that the drug has a hot effect. Other rules include: test drugs in patients with no comorbidities (because if the patient is suffering from two contrary conditions, we won’t know which one the drug is acting on), and test drugs in humans (because “the medicine might be hot compared to the human body and be cold compared to the lion’s body”). Although the justifications seem bizarre to us now, it’s a thoughtful list, and yet the concept of controls makes no appearance.

Gaining Control

The first proper controlled trial that I know of was conducted by the Persian physician al-Razi, and described in his Comprehensive Book of Medicine around 900 AD. He wrote: “So when you see these symptoms, then proceed with bloodletting. For I once saved one group by it, while I intentionally neglected another group. By doing that, I wished to reach a conclusion.”

You couldn’t ask for a clearer description of a controlled trial: treating one group and intentionally neglecting another in order to reach a conclusion. It is a bit awkward that the first controlled trial proved the therapeutic efficacy of, well, bloodletting. Perhaps that’s one reason that the story of James Lind’s test of cures for scurvy is a more popular touchstone in the history of controlled trials. According to Lind, in 1747 he took a group of twelve sailors suffering from scurvy, whose conditions were “as similar as I could have them,”[1] and assigned them in pairs to six different treatments, including cider, sea-water, and citrus. The two sailors eating citrus fruits recovered best, which fits perfectly into our modern understanding that scurvy is a deficiency of vitamin C.

Another interesting early reference to a controlled test appears in Bencao Tujing, a pharmacopeia compiled and edited by the Chinese naturalist and engineer Su Song around 1060 AD. “It was said that in order to evaluate the effect of genuine Shangdang ginseng, two persons were asked to run together. One was given the ginseng while the other ran without. After running for approximately three to five li, the one without the ginseng developed severe shortness of breath, while the one who took the ginseng breathed evenly and smoothly.”

I love this example, because I think it beautifully demonstrates both the intuitiveness and the unintuitiveness of controlled trials. Clearly, the reader is supposed to consider this anecdote compelling evidence for the effectiveness of genuine Shangdang ginseng. There’s no explanation of why this constitutes good evidence; it’s just obvious. At the same time, the story is presented purely as a secondhand, one-off event. There’s no suggestion that anything else ever was or should be tested this way. It’s just a fact about ginseng.

One more early example of controlled trials: the work of Ambroise Paré, a French surgeon in the 1500s. He recalls in his memoirs that as an inexperienced battlefield medic, he cauterized gunshot wounds with boiling-hot “oil of elder,” as instructed by a medical text and by his fellow surgeons. But then he ran out of oil of elder: “At last my oil lacked and I was constrained to apply in its place a digestive made of the yolks of eggs, oil of roses and turpentine. That night I could not sleep at my ease, fearing by lack of cauterization that I should find the wounded on whom I had failed to put the said oil dead or empoisoned, which made me rise very early to visit them, where beyond my hope, I found those upon whom I had put the digestive medicament feeling little pain, and their wounds without inflammation or swelling having rested fairly well throughout the night; the others to whom I had applied the said boiling oil, I found feverish, with great pain and swelling about their wounds. Then I resolved with myself never more to burn thus cruelly poor men wounded with gunshot. [...] See how I learned to treat wounds made by gunshot, not from books.”

“VVhat chance may do in finding out of remedies.” From an older translation, published in London in 1649.

Paré didn’t set out to do a controlled trial, but apparently the results of running out of oil were so clear, dramatic, and unexpected that he took the findings to heart. He may also have picked up the idea of a controlled trial more generally, because elsewhere he describes a deliberate within-subject trial of treating burns with onion paste: “I applied onions to one half of his face and the usual remedies to the other. At the second dressing I found the side where I had applied the onions to have no blisters nor scarring and the other side to be all blistered; and so I planned to write about the effects of these onions.”

A common thread in the work of al-Razi, Paré, and Lind is that they found themselves in situations where many people were simultaneously suffering from the same ailment: medieval Islamic hospitals, a battlefield of soldiers with gunshot wounds, a shipful of sailors with scurvy. Other contexts for practicing medicine, like working in a small town, or serving as a personal physician in a powerful household, wouldn’t provide this opportunity. Of course this doesn’t entirely explain the historical paucity of controlled trials, because the world is full of other kinds of opportunities to conduct them. Paré’s onion paste trial and the ginseng running race are examples that could have been much more common, and you can easily imagine controlled tests of anything from the best bait for catching fish to the best well for throwing curse tablets into.

Then again, maybe some such tests were carried out, and lost to history. Another common thread in the work of al-Razi, Paré, and Lind is simply that they wrote down their experiences and their texts have survived to the present day.

We don’t have any copies of the Comprehensive Book of Medicine from al-Razi’s lifetime. On the left, a page from the oldest surviving (partial) copy, made by an unnamed scribe in 1094; on the right, a page from the Latin translation commissioned by the King of Sicily in 1279.

Stay tuned for more installments from the history of the RCT, covering Peruvian bark, imperial decrees, cadaverous particles, and, eventually, randomization.

[1] Lind gets a lot of credit for this line, which is reasonable, because it’s one of history’s first explicit mentions of the principle that different treatment groups should be as similar as possible in every way other than the treatment. But I feel I must point out that a few sentences later he describes one of his treatment groups as “two of the worst patients, with the tendons in the ham rigid, (a symptom none of the rest had).” So I’m not sure he fully had the right idea.

The Minus Roots
