In this podcast, Dr. David Burns describes the “Testing” part of the T.E.A.M. model. Topics include:
- The shocking results of a study of therapist accuracy at Stanford
- Why therapists who don’t test usually get it wrong
- How session-by-session testing can revolutionize your practice
(Repost for submission to iTunes)
Another great talk on a topic not always fully appreciated by therapists.
A question: do most therapists really get such low empathy scores when they first start using the empathy questionnaire? Such scores could be quite deflating, and could trigger discouraging negative cognitions in a therapist, especially one who has been practicing for years believing that he or she has good empathy skills! That being said, if he or she gets over the initial shock and persists with the practice, one can hope that empathy scores will improve over time…
Thanks so much, Chris! Yes, most therapists will get failing grades when they first start using the empathy questionnaire, and it can be very upsetting, especially if your ego is involved. This is a case where therapists need to work on their own negative thoughts on their own Daily Mood Log, in addition to working to improve their empathy skills. If the therapist persists with the tools and gets some training, then you can get superb scores from the vast majority of patients within a few weeks. One way to do that is to join one of the weekly online TEAM-CBT training groups. A list of times and contact persons can be found on my website. But it is shocking and deflating for therapists to learn that they cannot acknowledge patients’ feelings in an accurate or skillful way. But this is reality, this is how it actually is! And if the therapist has courage and a huge determination to improve, then great things can be done! Please keep listening and posting your terrific comments and questions, Chris! All the best, david
Dr. Burns, I am curious about the reliability and validity percentages for the Beck Depression Inventory. Articles state it is in the 80s (percent), but you said 65% and below.
Thanks, sadly I do not get many statistical questions, but that’s an area I really enjoy. The “reliability” of a scale is defined as the correlation of the total scale score with the “factor” or concept that drives the scale. In other words, if you are looking at the reliability of any depression scale, like the Beck Depression Inventory (BDI), you would look at the correlation between the total score and the concept of “pure depression,” which has no errors of measurement, the Platonic “pure concept,” so to speak. You can actually make this calculation using something called “structural equation modeling,” a fairly advanced statistical method.
The correlation between the scores on the BDI and the Depression Factor is generally reported around .78, which is close to .8. The shared variance, or “accuracy,” is the square of the correlation, so roughly 61% to 64% at best. This means that roughly one third of the variance in test scores does not reflect depression, but other variables, along with errors of measurement. To me, that’s not very good, and that’s why I created the three Burns depression tests, plus many other tests for anxiety, anger, relationship problems, therapeutic empathy, and many more variables of interest to therapists.
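The shared-variance arithmetic here is simple enough to check by hand. A tiny Python sketch: the .78 figure is the one quoted above for the BDI, while .95 is just an illustrative value for a higher-reliability scale, not a number from any specific study.

```python
# Shared variance ("accuracy") is the square of the correlation
# between a scale's total score and the factor it is meant to measure.
def shared_variance(r):
    """Fraction of variance the scale shares with the underlying factor."""
    return r ** 2

print(f"r = .78 -> {shared_variance(0.78):.0%} shared variance")  # about 61%
print(f"r = .95 -> {shared_variance(0.95):.0%} shared variance")  # about 90%
```

A correlation of .78 thus leaves roughly 39% of the variance to other variables and measurement error, which is the "roughly one third" mentioned above.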
The reliabilities of the scales I have created are typically between .90 and .98, much higher than the Beck test’s. In addition, I try to create scales that are short and user friendly. These scales have revolutionized the practice of psychotherapy and are a cornerstone of what we call TEAM-CBT. Hope this helps. Thanks!
David
Thank you for your reply, Dr. Burns!
Once I am in my internship, I will certainly give your scales a try! They are available on your website, correct?
Aksana.
They are not free, but if you purchase a Therapist’s Toolkit you are licensed to use them free of charge for your entire career. You can find order forms for the Toolkit on my website. All the best, David
Thanks so much for your great work.
I have just one question: could you please give the reference for the article you mention where you found a very low correlation between patient and clinician ratings?
Thanks, Remy. I did not publish it; I was just curious, as I had the data for analysis. I did publish the original study on Willingness and recovery from depression. This is the reference, along with a similar study on poor congruence between patient and therapist perceptions of the therapeutic alliance, which has, I think, the exact findings I observed, and similar methodology. David
Burns, D., Westra, H., Trockel, M., & Fisher, A. (2012). Motivation and changes in depression. Cognitive Therapy and Research. DOI: 10.1007/s10608-012-9458-3. Published online 22 April 2012.
Hatcher, R. L., Barends, A., Hansell, J., & Gutfreund, M. J. (1995). Patients’ and therapists’ shared and unique views of the therapeutic alliance: An investigation using confirmatory factor analysis in a nested design. Journal of Consulting and Clinical Psychology, 63(4), 636–643.
Hi David, I’m going back to listen to some podcasts. I love your books, as you know, and I also loved Change Your Brain, Change Your Life by Dr. Daniel Amen. Are you familiar with his work? He says he treats people by doing PET (?) scans and seeing where the brain is under/over active or damaged. It’s confusing that so many mental health professionals have such different views of how to treat mental health.
Hi David,
I don’t know statistics and I don’t really understand how you got the accuracy ratings for your tests. However, your claims of about 95% accuracy just don’t pass my sniff test. You need to know the correct answer before you can say how well any test result compares to it. We can never know exactly how depressed or how anxious a person is. So how can you convince a skeptic like me of the accuracy of your test or of any such test?
Skepticism is always welcomed, as that’s the whole basis of my thinking. I’m not sure if you want just to protest, or would want a scholarly and maybe cool statement/dialogue on validity and reliability, and their philosophical and mathematical bases. To me, this is an incredibly fascinating and practically mind-blowing topic, and I really appreciate your question, which can be answered in a “sniff”-resistant way. I have no interest, actually, in convincing you of anything, to be honest, but am always thrilled to teach and dialogue. I just got out of the hospital, so I may be out of commission for a while, but can cycle back if you are sincerely interested.
I use structural equation modeling (SEM) to do the reliability calculations, and second-order confirmatory factor analysis in SEM for the validity calculations, using direct FIML (Full Information Maximum Likelihood) estimates. This allows for consistent parameter estimates, even in the presence of data that is not missing completely at random. In addition to the mathematical considerations, there is the philosophical aspect (what, really, “is” depression, for example?) and the practical considerations (is it possible to know how patients actually feel, or feel about the therapist, without these measures? And how about therapists who cannot bear to hear the “bad news” that these scales nearly always deliver, due to having a fragile ego?). d
There was surely something in the tone of my comment that led you to ask if I really want an explanation or if I just want to make a protest. All I want is an answer I can understand. I’d have to dig deeply into methods that go far beyond my two semesters of undergraduate stats to understand your explanation. Is it possible to give a convincing explanation in layman’s terms? Is “the presence of missing data that is not completely at random” an important clue to understanding?
Sure, thanks, here’s a short answer, the statistical component. I am currently using a five-item depression test with response options that go from “not at all” to “completely” for the five key symptoms of depression. Reliability, a concept developed by the statistician Cronbach (as in Cronbach’s coefficient alpha), is the question of whether the test measures something in a reliable, consistent manner. You can calculate it automatically with any of the major statistics programs, like SAS or SPSS. You can also calculate it more precisely, if desired, using structural equation modeling techniques. I have developed a large number of brief scales to measure all kinds of problems, like depression, social anxiety, general anxiety, anger, happiness, physical pain, therapeutic empathy, and many more.
The reliability of all the scales I have developed (the alpha value) ranges from around .90 to .99, using pretty large populations with great range on all of the scales.
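For readers curious how coefficient alpha is actually computed, here is a minimal pure-Python sketch of the standard formula. The five-item response data is invented purely for illustration; it is not from any real scale or patient sample.

```python
from statistics import variance  # sample variance (ddof = 1)

def cronbach_alpha(items):
    """Cronbach's coefficient alpha for a list of respondent rows,
    each row holding one score per item."""
    k = len(items[0])                                  # number of items
    columns = list(zip(*items))                        # one tuple per item
    item_var_sum = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in items])  # variance of total scores
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Invented data: 6 respondents answering 5 items on a 0-4 scale.
responses = [
    [0, 1, 0, 1, 0],
    [1, 1, 2, 1, 1],
    [2, 2, 2, 3, 2],
    [3, 2, 3, 3, 3],
    [4, 3, 4, 4, 4],
    [4, 4, 4, 4, 4],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")  # alpha = 0.98
```

When the items move together tightly, as in this toy data, alpha lands near the top of the 0–1 range; uncorrelated items would drive it toward zero.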
The second issue is validity. A scale could be incredibly reliable, but it could be measuring the wrong thing in an incredibly reliable manner! So, for example, if your depression test is actually measuring anger, or ice cream preferences, and so forth, it won’t have much value, if any, in a clinical or research setting. Validity is not “real,” but stipulated, and can be defined in a wide variety of ways. The definition I use is the variance shared by a number of scales that are widely accepted as measuring depression (or whatever construct you’re trying to validate). So, for example, people use scales for measuring depression that are better than nothing, but not stellar, like the Beck Depression Inventory, the Hamilton, the Zung, and so forth. In addition, there are interview scales, like the SCIDs (the Hamilton is also an interview scale). These other scales have much lower reliabilities, on average, more in the range of .75 to .80.
So I’ve validated my own depression scale(s) against most of these other measures, and the correlations tend to be quite high, like .9 or so. When you purge all of these scales of errors of measurement, using SEM techniques, the correlations tend to approach 1.0. I publish these types of findings in the top clinical psychology journals, like Journal of Consulting and Clinical Psychology. Their standards are far more conservative than most medical journals, although I am a physician and not a psychologist.
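The SEM machinery used in those studies is beyond a quick sketch, but the classical correction for attenuation (Spearman's formula) conveys the idea of "purging errors of measurement": divide the observed correlation by the geometric mean of the two scales' reliabilities. The numbers below are illustrative, not taken from the published analyses.

```python
from math import sqrt

def disattenuated_r(r_xy, r_xx, r_yy):
    """Spearman's correction for attenuation: the estimated correlation
    between two constructs once measurement error in both scales is removed,
    given the observed correlation r_xy and reliabilities r_xx and r_yy."""
    return r_xy / sqrt(r_xx * r_yy)

# Illustrative: an observed correlation of .85 between two depression
# scales with reliabilities of .95 and .80.
print(f"{disattenuated_r(0.85, 0.95, 0.80):.2f}")  # 0.98
```

The correction shows how a high observed correlation between two imperfectly reliable scales can imply a near-perfect correlation between the underlying constructs.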
There are a multitude of other factors. Will patients be honest? Yes, totally honest, as long as they don’t have a hidden agenda. This is not an issue of the properties of the test, but the skill and shrewdness of the shrink. For example, if a patient is trying to get disability income, she or he may fake answers. I am always referring to a voluntary therapeutic relationship.
There is also the issue of therapist resistance. Measuring at the start and end of every therapy session takes no time, as patients complete the scales prior to and after each session, but the data makes therapists accountable, and most will discover what might feel like “bad” news: that they are not effective in changing depression (or whatever the target variables are), that they get failing grades in most sessions on the therapeutic empathy and helpfulness scales, and more. This information can be transformative and amazingly useful, much like an EKG or X-ray machine, but it requires therapist humility and the willingness to learn and change course.
Let me know if that information is useful. The missing-data point simply means that we are often collecting data for these analyses over time, in clinical settings where dropout is inevitable and common, so the data that’s missing may be biased. For example, more depressed patients may be more likely to drop out. Any statistical analysis requires assumptions about the dropouts, but only SEM can provide consistent parameter estimates when patient dropout is correlated with some variable you are measuring, like empathy or depression, and so forth. That point is relatively unimportant for reliability estimates, since they can be done in cross-sectional studies where missing data is minimal.
I have done thousands of hours using SEM to learn more about how psychotherapy works, and that is what led to the emergence of TEAM-CBT, along, of course, with my clinical experiences.
You are skeptical. I’ve been uber skeptical, too, and did not believe much of what I was taught in my psychiatric residency training. So I salute your skepticism, and hope my (probably lame) “tutorial” has been fun or interesting. I’m a slow learner, and it took me quite a bit of time to grasp this fascinating topic, which involves mathematics, philosophy, the principles of scale development and testing, and clinical pragmatics, along with writing skills in terms of how you ask questions with any scale. In my opinion, most psychological scales are on the poor side.
David
Hello Dr. Burns, This is the second podcast I’ve listened to, and they seem to be geared more toward therapists than patients. I’m not a therapist. Should I keep with it, or should I subscribe to a different podcast? Thanks for your help.
Stick with it, they will evolve! d
I am one of your fans, since I read your book Feeling Good in 2008, and I just bought your other book, Feeling Great. I’ve suffered from survivor depression for 24 years.
Thanks, appreciate your note! Warmly, david
Everything you say is interesting. I have all your books and use ideas from the books in my classes with psychiatric patients. I have not used testing. My patients often have difficulty understanding what they read, have suffered brain trauma, or have learning difficulties in addition to psychosis.
Thanks, I use testing in classes, too! Makes a huge difference, because you can see immediately how effective you were, or weren’t, and how individuals responded to your class. Outstanding classes are impossible without testing, IMO! d
I have started listening to your podcasts. I am getting a better perspective on how therapists should be working and how research evolves. These were questions and interests I had. As I said before, I had two abusive therapists and it really harmed me, but I still have trust in the good ones, though they are very hard to find. Most therapists aren’t good, if the truth be known.
Sadly, I agree with you! I might be biased, but think the same way! All the best, david