Interview With ButterfliesAgain Test Creator, Dr. Block - Online Personals Watch: News on the Online Dating Industry and Business


James Houran

Mark, I understand that you conduct these interviews for non-specialist audiences, but I was very disappointed with this one.

As an established and published expert on compatibility testing, I would've asked different questions. After all, clinicians do not always make good researchers, and neither clinicians nor researchers necessarily know how to make effective assessments. Having a good theory or applying it in the clinic is one thing, but being able to translate that knowledge into a good test or assessment is quite another beast.

Tests and measurements is a highly specialized field. If the measurement isn't right, then the output isn't right. Fernando hinted at this above. Sadly, virtually every compatibility test I've examined doesn't seem to conform to professional testing standards. In other words, the fundamental validity of current offerings is questionable.

I'd be eager to have Dr. Block explain the procedure used to construct his questionnaire. I'd also appreciate hearing him elaborate on his statement, "The countless intimate hours I've spent over those years have been incomparably valuable in understanding the very heart and meaning of lasting love." Specifically, what does he envision or conclude about love and attachment that modifies or is new compared to existing models?

I've other questions (more technical in nature) if Dr. Block would be willing to answer them as well.


James Houran, Ph.D.

Dr. Joel Block

James, do you have an agenda that you should disclose here? First you give a pass on a dating book that is basically a manual for doing a pre-nup and continually, and I must assume purposely, miss my point that in life it is not the load you carry, but how you carry it. What's up with that?

Come on, the book is a Cosmo article and you act as if the author was certified by Dr. Freud—who by the way may have made comments about relationships, but it was far from his best work.

You seem to have a negative or at least limited view of clinicians. I believe the best ideas come from being with couples in treatment.

Your take reminds me of the Washington politicians who don’t have a clue but make policy for people of very modest means without ever having seen what it is like to live in their shoes.

As for the test construction, have you visited and read about it? If not, I suggest you do so.

As for what is new about love attachment that comes from the “live lab”, perhaps you should do your homework there as well.

First, nothing is really new. It is how things are presented, in the service of helping people make love last that is critical.

A small sampling: My book, Broken Promises, Mended Hearts: Maintaining Trust in Love Relationships is the first and probably the only one to look at the small issues that impact trust, the issues that go under the radar.

Naked Intimacy is a bold approach to real intimacy—not new—but a step-by-step approach that has received lots of praise from professionals and couples.

My forthcoming dating book, (December) The Real Reasons Men Commit will introduce new concepts for women to consider, particularly pathological empathy.

I wrote one of the early books (1978) on infidelity which still reads well. My book, To Marry Again (1979) was also one of the first to look at the remarriage process.

Yada, yada. So, what are you bragging about these days? How’d you get so smart? What contributions have you made?

Your credibility needs some boosting. Online Dating Magazine isn’t nearly enough.

To be continued? If so, late next week. Life awaits.

Joel ([email protected])

James Houran


I was expecting direct answers to my legitimate questions, not misdirection and more condescension. You seem scared to discuss the issues head on. Even more surprising, you seem ignorant of recent research that suggests that the load a couple carries helps define how it is carried. Quantitative and qualitative issues in variables that define relationship quality are more complex and dynamic than you seem to be aware.

And you keep harping on this interview with Adreyenn! Let it go, Joel. If all you took away from that interview is that couples should have a pre-nup, then I'd argue that you are trapped in minutiae and can't see the bigger picture and philosophy that her comments spoke to. You can quibble about the details, but the general principle of "disclose and discuss" in all aspects of relationships is a valid one.

By the way, her interview was more detailed and informative than yours. I should also say that you are the one who seems to have a limited or negative view of researchers. Evidence-based, data-derived mathematical models consistently outperform clinical impressions. I'd prefer the insights from a researcher's data much more than ideas that come from working with self-selected couples in a clinical context.

It's clear you have no idea who I am, or you wouldn't suggest my credibility needs boosting. Joel, I'm one of the very few industry insiders who has lived it -- in the lab, in the field and in market research. I was also a clinician, until I chose applied research several years ago. The fact that you don't seem to know who I am reveals your "newness" to the field of compatibility testing and research.

At the end of this post, I'll list some sample works. In addition to those works, my team has built many popular compatibility tests used by some of the biggest sites and we have advised many online dating sites on the psychometric challenges with such testing (including Chemistry and PerfectMatch).

Of course, I have an agenda. I'm known for setting standards in compatibility testing methods, so any new services are of prime interest to me. I have examined your website. I like some of the features of your site -- congratulations on the presentation. My criticism is that you never specify evidence for why people should believe your assessment is reliable and valid. You neither address it on the site nor here on OPW.

One can make all the marketing claims in the world about a test being the best, easiest, yada, yada... but without evidence for those claims, it's all BS. Clinicians and researchers -- psychological professionals -- should not perpetuate BS. I also noticed that when I asked you to reveal your "newfound knowledge" (your words) about love and compatibility from your clinical work, you had to admit that there wasn't any to share. Now what kind of impression about your credibility does that leave?

As one psychologist to another, let's cut through marketing spin and power struggle and talk shop. I'm sincerely sorry we got off on the wrong foot. You have my apology for my part in that. My agenda is to start an intelligent dialogue about the psychology of compatibility and how the best models are best adapted to online applications.

Would you please answer 4-5 serious questions about your compatibility test here on OPW? I can also guarantee that your answers will appear on Online Dating Magazine as well, so there'll be even more PR for you.


James Houran, Ph.D.

Sample Works

Houran, J., Thalbourne, M.A., & Hartmann, E. (2003). Comparison of two alternative measures of the boundary construct. Psychological Reports, 96, 311-323.

Houran, J., & Lange, R. (2004). Expectations of finding a ‘soul mate’ with online dating. North American Journal of Psychology, 6, 297-308.

Houran, J., Lange, R., Rentfrow, P. J., & Bruckner, K. H. (2004). Do online matchmaking tests work? An assessment of preliminary evidence for a publicized ‘predictive model of marital success.’ North American Journal of Psychology, 6, 507-526.

Houran, J., Lange, R., Wilson, G., & Cousins, J. (2005). Redefining compatibility: Gender differences in the building blocks of relationship satisfaction. 17th Annual Convention of the American Psychological Society, Los Angeles, CA, May 28.

Lange, R., Jerabek, I., & Houran, J. (2004). Building blocks for satisfaction in long-term romantic relationships: Evidence for the complementarity hypothesis of romantic compatibility. Annual meeting of the AERA (Adult Development Symposium Society for Research in Adult Development). San Diego, California, April 11 – 12, 2004.

Lange, R., Jerabek, I., & Houran, J. (2005). Psychometric description of the True Compatibility Test™: A proprietary online matchmaking system. Dynamical Psychology [online journal].

Lange, R., Thalbourne, M. A., Houran, J., & Lester, D. (2002). Depressive response sets due to gender and culture-based differential item functioning. Personality and Individual Differences, 33, 937-954.

Thalbourne, M. A., & Houran, J. (2005). Patterns of self-reported happiness and substance use in the context of transliminality. Personality and Individual Differences, 38, 327-336.

Dr. Joel Block

Hey Doc--I'm back for this final round:

Our differences, which I believe are not as big as they seem on this forum, are analogous to the debate between those psychologists who favor EBT—Empirically Based Treatment--and those who do not. The empiricists are guided by research and want to standardize treatment for various diagnoses.

Those in opposition wonder whether the contents of our consciousness can be broken down in a meaningful way through research. The critique contends that psychotherapy research is devoid of intuition and emotions. I can see both sides. I don’t think you and I need to replicate this false dichotomy.

Personally, although I do not have the temperament to be a full-time researcher, I respect the path your career has taken. Unfortunately, a lot of good work goes unnoticed. Much of the research in our field is not read, and of those few who do read it, fewer are guided by it. I am not “anti-research”; in fact, in one of my recent books I have a closing chapter for couples that details “Best Practices”. It was influenced by Dr. John Gottman’s work.

I realize that research in the area of test construction is different. In fact, my doctoral dissertation oh so many years ago was based on using a compatibility method (based on FIRO-B) to compose therapy groups. Without telling you quite how long ago, James, trust me, you were probably crapping your pants back then.

Hint, not only was I defending my compatibility premise in front of a very tough doctoral committee, I was also dealing with one of those draft letters, “Greetings, we’d like you to go to Viet Nam and get your butt shot up…”

Which reminds me: some of the anti-EBT guys argued that research strategies in therapy, especially couple therapy, were analogous to strategies in war, only good until the first shot is fired. Some truth there, but I am not one to volunteer to test the hypothesis.

Now, Doc, you accuse me of evading your questions. Actually, this is posted prominently on the site:

Test Reliability & Validity
4 Key Relationship Factors

In fact, I pasted that in from the site. You will find a full explanation. I’ve addressed the issues that are most important, in detail!—and I also discuss the limitations of compatibility instruments later on the site—after the demo test.

James, I checked the registry of all those who have gone on the site fully and unless you use a pseudonym, you were not on it. Are you sure you went on the site? Yes, I know, a colleague of yours was on the site, Jon Cousins, but he did not agree with the standard NDA and wrote me to that effect. I wrote back immediately and invited him to speak with me—he never called me. Because he would not accept the standard agreement, he was not privy to the demo test—but he could have read the info on test development.


A couple of last points, James. You derided me for “admitting” that nothing is new—or at least that I have not invented something new. Well, I believe the simplicity of my test, ButterfliesAgain, applied to I-dating, is new.

But we were talking more generally. Here’s a really brief take: Dr. Murray Bowen in the ‘50’s was a pioneer in family and couple therapy. Years later we have Dr. David Schnarch tweaking the same approach and making headlines. Dr. Hendrix adds his tweak to an Object Relations approach that was there when he was a child and he makes headlines.

Dr. Albert Ellis, considered to be the most influential living psychologist of the 20th century, dates his cognitive revolution to the influence of Epictetus, 1st century A.D. Dr. Aaron Beck picked up on Al’s work, called it CBT (Al called his REBT) and he also made headlines.

All the big cognitive guys (especially Al, who I had a personal and professional relationship with), agree with my take on “it’s the belief held about differences—whatever they are,” being a major factor in relationships.

My early published research (on delinquency, weight management, smoking cessation) was influenced by a two-year postdoc with Dr. Ellis. Along with Dr. Beck, I was one of the keynote speakers at Al’s memorial last summer.

So, no shame in acknowledging that nothing’s really new—whether it is my contributions or some of the headliners. Or yours, for that matter.

James, once again, I have differences with you, but I do respect your views. The thing is, I cannot keep up this dialogue. I live like there is no afterlife. Time is precious—and this is way too consuming. I am not going to reply to your forthcoming excellent reply.

But here’s an offer:

You are on Long Island. So am I! I am in Huntington. Email me ([email protected]) and let’s throw back some pints. There is a bar in Huntington with nearly 50 beers on tap. The crowd contains a couple of schizophrenics (I think), a few professional drunks, and a host of everyday people. My type of crowd.

I’ll buy. If you are coupled, bring your partner. I always bring mine (my 8th wife? No, no the original) and we can continue face-to-face.

Best to you, Joel

James Houran

Hi Joel,

I'm based in Dallas, but let's definitely get together next time I'm in NY or you're out here.

I checked the FAQs section for the psychometric info, but saw nothing.

Where on the site is that info exactly?


James Houran, Ph.D.

Dr. Joel Block

James, try this link. If it doesn't work, simply register, and on the "take the test" page of the site, on the right side, right under the register button, is the link for the data...

Also, may I suggest, we are probably boring people to tears, if you choose to continue, let's take this off line and into email: [email protected]

Best, Joel

James Houran

Hi Joel, List

Believe me, this stuff intrigues readers. I know because I've contributed here for years, and my readers at Online Dating Magazine routinely ask me to review compatibility services.

In fact, this is the kind of information that industry insiders and consumers rarely see discussed as there are few experts to debate the issues.

Thanks for pointing the way to the psychometric information - this is stuff that might be better placed on the front page (or a clear link to it). Just my two cents.

In the spirit of continuing the OPW interview and getting you more PR, I'm drafting a few specialized questions. These will be posted here soon. Thanks in advance for your participation. It's a service to the online dating industry for test providers to talk about their offerings. All too often, such providers (for example, eHarmony, DNA dating) avoid hard questions about their product claims. Sad, huh?

P.S. Heads up... two statements on your site are incorrect:

“What's more, ButterfliesAgain is the only test that produces matches that compare favorably with happy married couples.”

“Other compatibility tests have not passed that kind of scrutiny when comparing their matches to actual couples who were happily married.”

You should revise these statements immediately, since several compatibility tests meet this criterion (and have for years). In fact, eHarmony was the first to use psychographic variables to predict DAS scores. They even have a patent on it. Further, Glenn Wilson ("father of modern compatibility testing") and Jon Cousins were arguably the first to develop and publish a compatibility test validated in this way in a peer-reviewed journal.


James Houran, Ph.D.

James Houran


OPW/ now poses a few technical questions about the ButterfliesAgain compatibility questionnaire. Also included is key commentary to place these questions in context. The aim is to assess more fully the scientific merit of the questionnaire and matching process, which is advertised as being better than existing approaches. After all, several compatibility testing methods have been published in peer-reviewed, scientific journals and have been validated against committed, romantic couples (including marrieds).

I thank Dr. Block in advance for his cooperation.

1. Is your matching based on the typically used “similarity principle”? If so, please explain the mathematics used to calculate/estimate similarity. If some other principle is used - or perhaps used in combination with similarity - please explain the theoretical rationale of your matching approach.

2. You use four “factors” in your questionnaire: Social, Dominance, Submissive and Intimacy. Are these taken directly from the proprietary Fundamental Interpersonal Relations Orientation Behavior test (FIRO B)? If not, please describe in detail the psychometric properties of your version of the FIRO B – with particular reference to analytics that show the independence of the factors, their reliabilities, as well as that the individual questions have no significant response biases related to Age, Gender and Marital Status.

3. Like eHarmony, you match people based on their scores on Spanier’s (1976; Spanier & Cole, 1976) Dyadic Adjustment Scale (DAS). In fact, eHarmony has a patent on their process.

[NOTE: For those who are unfamiliar with the DAS, it conceptualizes marital quality not only as a subjective evaluation but also a process in a dyad. The DAS has a total score, as well as four subscales that measure Dyadic Cohesion (“Do you and your mate engage in outside interests together?”), Dyadic Consensus (the extent of agreement/ disagreement between the couple on various issues), Dyadic Affection (“Do you kiss your mate?”), and Dyadic Satisfaction (“How often do you discuss or have you considered divorce, separation or terminating your relationship?”)].

Spanier constructed the DAS via classical test theory, which is an outdated approach that unfortunately does not guarantee that this instrument meets the requirements of modern test approaches - such as Rasch scaling (see Bond & Fox, 2001). These are the same gold-standard statistics used in well-known assessments such as the GRE, MCAT and LSAT. Such methods yield interval-level measures free of response biases related to extraneous variables, such as a respondent’s age and gender. It is essential to control for such biases because statistical theory (Stout, 1987) and computer simulations alike (Lange, Irwin, & Houran, 2000) indicate that response biases can lead to spurious factor structures, significant distortions in scores, and consequently erroneous research findings. In this light it is not surprising to observe that the DAS’ factor structure, and hence its construct validity, continues to be debated in the literature (e.g., Hunsley, Pinsent, Lefebvre, James-Tanner, & Vito, 1995; Sharpley & Cross, 1982; Spanier & Thompson, 1982).

With this in mind, how did you reliably and validly measure relationship quality taking into account potential response biases at the item level that concern Age, Gender and Marital Status?

4. Posting testimonials from colleagues about your questionnaire follows from the marketing approach pioneered by (which also has numerous testimonials from academicians). However, your testimonials are not from tests and measurements experts -- people qualified to assess the reliability and validity of instruments. Moreover, the validation process described on your website is too vague to assess on its scientific merits. Therefore, please describe in detail how many married couples participated in the research used to establish your correlation with the DAS. Also, please explain how and why you cut your sample into thirds instead of simply assembling a large sample with adequate variance and conducting a (non-parametric) correlation.

5. Where did you recruit the participants for your research – are these people known to you from your clinical practice, were they “blind” to your research hypotheses, were they drawn from the general population, etc?
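
[For readers unfamiliar with the "(non-parametric) correlation" mentioned in question 4, here is a minimal sketch of a Spearman rank correlation computed over a whole sample rather than over thirds of it. It is an illustration only, not the procedure used by ButterfliesAgain or anyone else in this thread. The function names and the scores are hypothetical.]

```python
import math

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical scores for eight couples (invented numbers, not real data).
match_scores = [55, 62, 48, 71, 66, 59, 80, 52]
das_scores = [101, 110, 95, 118, 108, 112, 125, 99]
print(round(spearman(match_scores, das_scores), 3))
```

[Because it uses only rank order, this statistic does not assume interval-level scores, and every couple in the sample contributes to the estimate.]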


James Houran, Ph.D.

Dr. Joel Block

James, you note ButterfliesAgain isn’t the only test that compared favorably with results on the DAS. The wording you are referring to (in your next to last post) was stated poorly. You are correct!

What was meant was that I believe that ButterfliesAgain is unique in its brevity, its face validity—the vignettes strike a familiar chord—and that I offer my extensive experience along with the test. This is what I should have said in speaking of the uniqueness of the instrument--it is a fresh product when all is considered. As for extensive experience here's what I mean...

Providing quality matches is one thing, but we both know most relationships fail. Along with my test, I-dating sites get the option of having a highly credentialed, accomplished specialist in love and sex (with a forthcoming dating book!) to do virtual seminars, a specialized newsletter targeted to making love last, answering email queries from subscribers that can be archived, and more. That is what I mean by "when all is considered." It is not just the test, it is a relationship expert as well.

I view a stellar I-dating site as offering two services: One is an effective match. The other part, which very few do, at least with a strongly credentialed expert, is providing assistance in maintaining the love relationship. My offer, if not unique, is rare in the I-dating business.

As for E-Harmony, or other dating sites that may use the DAS in their validity studies, I may be mistaken, but I have not seen them post this data on their site in straightforward language as I have. And, as you must have noticed this time around, since you formally registered and--I assume, viewed the entire site--unlike my competition, as far as I know, I discuss the limits of compatibility (Closing Thoughts) and “The Passion Factor” in some detail and quite openly.

My webmaster is tending to his gravely ill mother (suffering from an unremitting series of strokes I am told) so I will not be able to amend the site “immediately” as you suggest. I will have to wait until he is available. Now is not the time for me to make a request of him.

As for answering more questions for your readers--right now, I am sacrificing in too many areas just to keep up. So…

Here is a “quickie” and it may be all I am able to provide. I am not about to do another dissertation defense, a few decades later. You can call that dodging if you choose, I call it being careful about a most precious commodity—my time.

Answering questions that very few, other than hard-core researchers, are interested in is your arena, not mine. I have posted enough info for any I-dating site owner to understand and feel confident in ButterfliesAgain. Who else is posting what I have that is completely and easily accessible to a lay market?

What’s more, I offer a free trial to see how my test works. In my view that is better than a statistical debate. I know, you disagree. But you are not a dating site owner and that is who I developed ButterfliesAgain for. For once, I am considering the marketing aspect of a product I developed. For example, e-Harmony's test may be terrific but it is tedious, doesn't have face validity and it turns off many subscribers.

Lastly, and this is sensitive. My associate, (and friend) who is more sophisticated in advanced test construction than I am, was sent back to Afghanistan and suffered a head wound during this recent tour. I cannot consult with him on some of the more advanced issues. Honestly, though, even if I could, I don’t believe a full “dissection” is necessary.

Here, using your numbers, is my quick reply, admittedly not as detailed as you would want:

1. Some factors are similar (social and intimacy) and some are complementary (submissive and dominance). My guess is that most people who have been in lasting relationships will identify with the breakdown. In fact, my colleague interviewed numerous couples, using an SCI (structured clinical interview), in an effort to see if these factors would be confirmed. The results were very encouraging. The discussion of similarity/complementarity is posted on ButterfliesAgain.

2. As I say on the site as well, the factors are DERIVED from FIRO-B—not taken directly! Please, the factors used for FIRO-B are ICA—Inclusion, Control and Affection. I challenge anyone to task me for “taking” proprietary info! I used a derivative of a well researched test, the basis of my dissertation research many years ago rather than “invent” factors. My questions and the questions on the FIRO-B are distanced enough that once again, I challenge a copyright infringement. Ridiculous! Get a copy of FIRO-B and see for yourself!

3. DAS outdated? So is last year’s car but it’s still functional. So are many pharmaceuticals that have been around for years but still work. For many clinicians AND researchers, DAS is still a very useful instrument. The “outdated” view that you hold is controversial. I’ve talked to several researchers who still use it. You yourself chided me for saying I was the only one who used this instrument and pointed out that others had used it as well. If it was good enough for them, it is certainly at least adequate for my use. My sample was drawn randomly (double blind) by my associate based on an (almost equal) distribution as to age, gender and years married.

4. The number of married couples is—and this is pasted in from the website, the one you read—Research participants were 54 married couples who were given the Dyadic Adjustment Scale (DAS) and the ButterfliesAgain™ compatibility profile. Duration of marriage ranged from 4 to 22 years and age of the participants ranged from 28 to 57.

5. Answered above (3 and 4). I hired an associate and my colleague trained him in recruiting a random sample as per above.

James, I will end with a little story (I love stories!) A lady goes into a butcher shop and is examining a fresh chicken with over-the-top detail.

She is looking into every crevice and crease, only stopping short of taking out a magnifying glass. Finally, the butcher asks her, “Could you pass that kind of examination yourself?”

The moral of this little yarn, at least the one I would like you to get, is that I have been open, my data is open, the test works and so do I.

How ‘bout ending on a positive note? It would be nice of you to find some positive points to make so that your readers have that part of your wisdom to consider as well. There must be something in the many exchanges we've had that you can go positive on!

Warm regards, Joel

(And this time, I will try even harder to fight the temptation to keep this up!)

Fernando Ardenghi

Great debate Dr. Houran!

You are definitely the Researcher the Online Dating Industry needs.

Kindest Regards,

Fernando Ardenghi.
Buenos Aires.
[email protected]

James Houran

Dear Dr. Block,

Thank you for your time. As a clinician working with couples on issues of accommodation and hope, it greatly surprises me that you perceive my questions as "negative" commentary about your website and questionnaire. You invited me to debate you, remember? You said that it inspired you, remember? Well, I do not wish to disappoint. To me, this is all a positive process aimed at informing both consumers and online dating site owners about the merits of your system.

Your system will affect people's lives. It's not a mere academic exercise at some university where the quality of research has no real damaging outcomes per se. Instead, like other services, you portray your service (and perhaps rightly so) as more akin to a "health and human service." This is a more responsible approach to take on the compatibility and matching issue -- and I happen to agree with it. This is why the methods used to develop systems should be top quality.

As I stated in a peer-reviewed journal article assessing the psychometric merits of eHarmony's questionnaire (it has few merits): the prospect that millions of singles are making life-changing decisions based on compatibility tests that are not scientifically sound is a sobering one. Indeed, medical patients would not take a drug that has not been approved by the FDA (unless they are desperate) and likewise people looking for relationships should not so willingly trust online psychological tests and matching systems that have not been independently proven to meet professional testing standards.

Unfortunately, the information on your website and your responses above strongly suggest that your system does not meet professional testing standards and should be avoided at present by both consumers and online dating site owners. Your differentiators are also illusory -- there are several matching systems already in the market (and have been for years) that exceed the psychometric quality of eHarmony's approach and yet are brief and have strong face validity to consumers. Moreover, Dr. Neil Clark Warren at eHarmony was also a clinician, and their suite of products was expanded some time ago to include "clinical" support/educational materials to complement their matching process. They also talk about the quirks of relationship maintenance (not well, but they do).

You are correct in that most websites don't have well credentialed experts as the foundation of their services. In fact, it's a marketing epidemic that every site seems to have an in-house psychologist. The trouble is, what kind of expert is really needed? If the core of the service is testing, then it had better be at the very least an established tests and measurement expert. Clinicians and social science researchers per se do not qualify as such. And it shows.

One can have all the knowledge of the research literature or deep clinical experience and insight, but it is all useless without proper application to a testing process. That is much, much easier said than done. And that is why the academic community rightfully is upset at online dating websites marketing to consumers by abusing the term "scientific."

You have done some things well, which I have noted before publicly. But your main claim is unproven -- namely, that your matching process works and is better than any other approaches. In fact, your responses to my pointed questions reveal that you have considerable research to conduct before offering this service to consumers, or at least offering it as anything more than something for entertainment purposes. Let's review:

1. Good to see that you acknowledge the role of complementarity. However, you did not specify the mathematical method used to determine when two people are alike or unalike. Also, you did not use statistics to establish the factor structure, reliability and validity of your questionnaire. Clinical interviews are no substitute for a rigorous evidence-based approach. So really we do not know what factors your questionnaire taps and how well it measures those factors in the first place.

2. Good to see that you are not using any proprietary material. I did not think you were, but your information did not make that clear to people. That said, while the FIRO-B is a well established instrument, your version of it is not. You did not provide any specific psychometric information on it. I suspect there is none, or that its psychometric quality is poor. An amateur copy of a well-established instrument is no substitute for the real thing.

3. Good to see that you asked about years of marriage in your validation. The problem is that no tests and measurement expert who constructs instruments for use in the real world and has defended the psychometric properties of approaches in court would rely on the DAS as a reliable and valid measure of relationship quality. Many people do still use the DAS, but then again many people still carry rabbit's feet in their pockets for good luck. The DAS was constructed with testing methods that have been outdated since 1960. Today, modern test theory approaches dominate applied testing. That the DAS is outdated is not controversial; it is factual. What is controversial is the meaningfulness of research using the DAS. Even the authors of the DAS itself agree. Maybe the DAS is good enough for clinicians for gestalt work, but the instrument is not psychometrically sound enough for use by tests and measurements experts in applied research aimed at building evidence-based models of relationship quality.

4. Good to see that you thought about demographic variables in your validation sample. The trouble is that 54 couples is hardly enough to warrant a peer-reviewed publication. In fact, you stated that you had (almost equal) distribution as to age, gender and years married. I do not see how you achieve adequate statistical variance with such a low sample. And it means that the moderately high correlation between your matching process and DAS scores was based on a sample of 18 couples! (54 couples divided into thirds). Professionally speaking, I would never recommend any clinical or medical treatment based on research outcomes using only 36 people.
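To make the sample-size concern in point 4 concrete, here is a minimal sketch of how wide the uncertainty around a correlation is at n = 18 couples. It uses the standard Fisher z approximation. The value r = 0.5 is purely a stand-in for "moderately high"; no actual coefficient was reported in this thread, and the function name is made up for illustration.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson correlation r
    observed in a sample of n independent units (here, couples)."""
    if n <= 3:
        raise ValueError("n must exceed 3 for the Fisher approximation")
    z = math.atanh(r)              # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)    # standard error on the z scale
    lo, hi = z - z_crit * se, z + z_crit * se
    return math.tanh(lo), math.tanh(hi)  # back-transform to the r scale

# r = 0.5 is a hypothetical value, not a figure from the validation study.
for n in (18, 54, 500):
    lo, hi = fisher_ci(0.5, n)
    print(f"n={n:4d}  95% CI = [{lo:.2f}, {hi:.2f}]")
```

Under these assumptions the interval at n = 18 runs from close to zero to well above 0.7, so such a sample cannot distinguish a negligible relationship from a strong one.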

Finally, nowhere do you address the issue of inherent response biases related to Age, Gender or Marital Status. Having some diversity in your sample is good, but your sample was too restricted to be useful and you did nothing with the diversity you did have from a statistical perspective. Biases can occur at the item or aggregate level, yet you provide no psychometric data to support the use of your matching process in real world situations.

I have actually researched and published and presented data on compatibility matching methods and measurement of relationship quality (anyone who knows my team knows we do practice psychometrically what we preach). I have conducted research with competitors and colleagues alike (I am interested in learning, not selling). I have co-designed matching systems and tracked outcomes using leading edge modern test theory approaches (I know what it takes to make successful systems). I am also an industry insider who conducts market research on what daters want (I know a great deal about face validity et al). The bottom line is that daters want a system that works -- even if they have to answer a bunch of questions to get it.

It is my job to vet compatibility science claims. Based on your own information, any informed and impartial observer would have to conclude that your system/service offers nothing inherently unique to the market and that there is no proper scientific evidence it works, much less as described. This is not simply an opinion that one can respond, "We must agree to disagree." Either there is proper scientific/psychometric evidence for a matching process, or there is not. You know when you meet the threshold when an expert can attest you meet professional testing standards (yes, there is an objective manual).

That said, your clinical work does have relevance in the type and manner of feedback given to customers. My recommendation is to collect a solid validation sample, conduct more research on your proposed method and have the research conducted by tests and measurement experts. Once the science matches your claims, then introduce it to the industry. At that point you really will have something meaningful to offer -- a compatibility test that is clearly scientific, as well as being brief and engaging. Compatibility tests need not be battles of style over substance.


James Houran, Ph.D.

Dr. Joel Block

James, I’ve been thinking and I realize I have a driving point that I haven’t made clear. It’s not all about the test. The test is a part of the equation, but there is more and I offer more.

Many people have consulted (unhappily) with a physician who was more into medicine and procedures than people. I think the same can be said of getting microscopic on a test. ALL tests can be effectively challenged, including any to which you have contributed.

EVERY research article has shortcomings, including yours. In the real world of research, the closer you come to perfect, the greater the likelihood that the variables being studied are less important.

To claim that the DAS, an instrument that credible researchers have used, can be questioned is true. Everything can be questioned on some level. It’s not perfect, but it is sufficient. To think otherwise--not just about the DAS, but to be overly critical--is too much focusing on a tree and too little on the forest, the bigger picture.

There isn’t a test on the planet that is going to guarantee or fully predict a long-term satisfactory marriage.

ButterfliesAgain is a credible instrument AND combined with the relationship expertise I bring to the table, I believe it is a contender for increasing the probability of lasting love. I make this point on ButterfliesAgain for all to read.

This, from the ButterfliesAgain website:

“Is it all about compatibility? No, it is not. Compatibility assessment does not assure, nor is it meant to assure couples that they will live happily ever after. Even well-made matches take work. Compatible matches do offer a distinct advantage; there is less work to do…

“Long term thinking beyond butterflies in each other's presence should be able to assist you in answering some of these questions:

•Will I be happy in the future with my spouse the way he/she is now?

•Do we have common preferences about starting a family?

•Do we have a shared vision regarding our life together?

•Do we want to raise our children the same way? With the same values and beliefs?

•Are we able to laugh together?

•Do we respect each other as individuals?

A healthy and harmonious relationship should consist of love and respect. When taking a closer look it is inevitable that there is more to a relationship than love...

Combine the guidance your compatibility match provides with your common sense and intuition as you interact with prospects. Those factors, in addition to keeping in mind that a relationship is a live and vibrant entity that evolves over time and requires care, and you've put together a winning combination.”

You make it sound as if the only route to relationship paradise is my using your team to create a compatibility test. Rather than repeat yourself in your next message, consider this: post all the criticisms of your work. You know, create a little balance. Come on, James, I feel like I have been more vetted than Sarah Palin!

And, my friend, that’s all he wrote…Joel

James Houran

Hi Joel,

Good to hear from you. I definitely agree that a compatibility test is not the "be all, end all." A legitimate measurement can be a valuable addition to the process used to match people and teach them how to nurture a lasting relationship. We are on the same page with this. And kudos to you for trying to bridge the clinical with the empirical. I have that same background, and offering consumers a holistic process that is personalized certainly attempts to take "personal matchmakers" (a booming business in some markets) to a new level.

However, your messaging does not convey this as effectively as it could. Your offer is for online dating site owners to integrate your questionnaire and matching process on their site. I assume you're not offering clinical sessions/relationship coaching for free as well. Now some cutting edge approaches have meshed assessment with clinical feedback that is personalized (relationship coaching). The tests at are the prime example. And I believe the meshing of approaches (compatibility testing with customized relationship coaching included in the feedback) is the future of these testing services.

Of course, any research or test can be criticized. But that is not license to dismiss existing standards of quality. Proper measurement is the foundation of any empirical process. It is also the foundation of any evidence-based model. Equating "being lenient with standards in measurement" with "not seeing the big picture" is absurd. Proper scientific measurements (observations) allow us to make models of the big picture. Indeed they rightly define it. One is wholly dependent on the other. And to suggest that the better we get at measuring variables the less important they become is shocking -- I'm glad designers of medicines, aircraft, medical equipment, etc don't take that view! If the variables in your questionnaire don't matter (much less the precision in their measurement), then why bother with them at all? It's an illogical position.

My own research with people like Rense Lange, Ilona Jerabek, Glenn Wilson, Jason Rentfrow, Sam Gosling and Jon Cousins has received praise, as well as criticism. The criticisms are not technical because our methods are entirely based in modern test theory, so already our standards are higher than 99% of the field.

The criticisms far and away concern online sampling. Specifically, some researchers argue that online samples can be unreliable. In other words, they give tremendous sample sizes but the people themselves are not proctored so no one is sure they are really whom they claim to be. Of course, the same arguments apply to in-person data collection. We strike a balance whenever possible. But more to the point, Internet samples are mandatory if the assessment itself is intended for online use. Simply putting a "pencil and paper" test/questionnaire on the web is not adequate, as research has shown time and again. Internet testing is a different beast, and instruments used must be validated in online environments.

I think you are sincere in your intentions, and you certainly seem more grounded than the folks at eHarmony. I do have the big picture close in mind throughout all of my comments here and before. The big picture is that consumers are in no position to judge the validity of "scientific" claims made by dating sites or services. That is why attention to standards matters now more than ever.

If we as clinical and research professionals aren't cognizant of those standards and disciplined enough to settle for nothing less, then how are we making people's lives better? We are "cutting corners" out of ignorance or convenience, but either way it hurts the consumers because they are using a product that is not what it purports to be, and we are dragging down the scientific field in the process. Both scenarios are pretty big picture.

I like what you are trying to do, but your website and your OPW interview all put great emphasis on your questionnaire and matching process. That is why I have focused my comments to your main selling point. I don't doubt you're an award winning clinician, but that expertise isn't a substitute for expertise in creating reliable and valid testing applications. In fact, the information you've shared demonstrates that your questionnaire/matching method doesn't meet standards for real world use.

I didn't offer my team for hire to help evaluate your system and improve it. You should consult modern test theory experts if you're serious about building a scientific system that works in the real world. However, if you collect good data on your methods, my team will PRO BONO evaluate its psychometric properties and improve the reliability and validity. We would only ask that any revenue generated by or associated with the improved version be donated to a credentialed institution in order to help fund students studying human relationships.

I'd appreciate interviewing you for Online Dating Magazine about your whole process. I think consumers would appreciate hearing about how a professional clinician approaches relationship coaching. It'd contrast nicely with the wealth of attention thrown recently on date coaches who don't necessarily have any training or authority.


James Houran, Ph.D.

Dr. Joel Block

James, and all--

I've had enough sparring. Something BIG is coming. It's coming soon!

Joel (aka...?) Stay Tuned!

Dr. Joel Block

Dear James,


Rocky 7!

Have you noticed how often pharmaceuticals that have been thoroughly researched and passed FDA scrutiny are found either ineffective (e.g. Vytorin) or worse, harmful (e.g. Celebrex) in real life?

In so many areas of life a surprising percentage of research that has been peer reviewed is a disappointment in real life. Even in an area far from the conventional, Wall Street, many people have lost money with very sophisticated, well researched mathematical models. It seems that in all areas research is best proven in real life.

In that regard, a review of our discussion casts me as Rocky in comparison to you, the Apollo Creed of test construction. You posit strong credentials as one of the champions of effective compatibility testing. And maybe rightly so.

And then there’s me. Rocky, the underdog. A well-seasoned clinician who has trained countless other psychologists in couple and sex therapy at a leading teaching hospital. I come along with a test that you maintain doesn’t meet scientific standards.

You challenge me to expose my stats, as most other test developers have refused to do -- most prominently E-Harmony and Chemistry, according to John Tierney’s science column in The New York Times, “Hitting It Off, Thanks to Algorithms of Love,” as reported by Daniel Ehrenhaft on his blog. Once I do what many others have not, you go for the knockout punch.

Given that so much research looks good on paper and fails in real life, here’s my challenge:

Develop a compatibility test that has the engaging style mine has, that has a brief completion time, as mine has, and do it in the next 6 months. Not something you've simply worked on, but a new instrument created by you to compete with mine.

My test and yours will be placed on mutually agreeable I-dating sites. Let’s see how we do in real life—the arena where theory turns functional—or not! I will lend my clinical services to support the matches and you will lend yours.

We’re talking Rocky 7!

The results will be measured in stability of enrollment and satisfaction with the quality of matches (as obtained by follow-up research with the participants).

Yes, I know, you think introducing clinical services will confound the research, but I believe that matches require support—that’s part of the reason so many I-dating subscribers are discouraged and that’s the package I offer.

Are you ready to lace up for Rocky 7? Or are you going to cite some statistics and dance around?

James, you can run, but you can’t hide!

Your fans may be taken with your high-powered research talk, but the walk is where it’s at! That is the ultimate research—outside the lab and into real life!

Joel (aka "Rocky")

James Houran


Challenge Accepted
YOU’RE ON! My team would be delighted to demonstrate the art and science of compatibility testing. Let's first take care of housekeeping. Dr. Block, you must have heard about the distinction between reliability and validity, and that reliability comes first, because it’s mathematically impossible for an unreliable test to be valid (check any book on test theory). So, we’ll HAVE TO start with the research, mathematics and statistics. Serious testing companies, like the ones making the ACT, SAT, GRE, etc do this, and so should we - not only is this logical, it’s also the prescribed way in the APA's manual for test design. Your website claims to be offering a scientific test, so, to do your "real world test" challenge justice, we’ll adopt the standard approach that’s advocated by all 101 level test and measures courses (and beyond). First establish reliability, then validity, then outcomes. Again, your own website makes bold claims about its “science,” so continued attempts to downplay or dismiss this step would be hypocritical on your part at the very least.
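The "reliability comes first" point can be sketched numerically. In classical test theory the observed correlation between a test and a criterion cannot exceed the square root of the product of their reliabilities. The snippet below is only an illustration of that bound; the function name and the reliability values are made up, not figures from any test discussed here.

```python
import math

def max_validity(reliability_test, reliability_criterion=1.0):
    """Upper bound on the observed test-criterion correlation under
    classical test theory: r_xy <= sqrt(r_xx * r_yy)."""
    for rel in (reliability_test, reliability_criterion):
        if not 0.0 <= rel <= 1.0:
            raise ValueError("reliabilities must lie in [0, 1]")
    return math.sqrt(reliability_test * reliability_criterion)

# Hypothetical reliabilities, assuming a perfectly reliable criterion.
for rel in (0.9, 0.7, 0.5, 0.2):
    print(f"reliability={rel:.1f} -> validity ceiling={max_validity(rel):.2f}")
```

A test with reliability near zero has a validity ceiling near zero, which is why validity evidence means little until reliability has been established.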

Of course, validation of the system should be done before releasing to the public. Even here math and stats cannot be avoided: You now talk about a "real-world test" or challenge based on outcome metrics related to customer experience, satisfaction, usability and ROI to the online dating service. That's all well and good, but doesn’t this imply that mathematics and research are needed after all? The answer is "yes, they are." Since we all agree on this, let's start the challenge with research and mathematics. To be good citizens, and to protect ourselves and the public, before we launch the head-to-head comparison, please submit a test manual showing your "test" is even safe for the public to use. Your manual should follow standard protocol. As an example, check out this link:

My team has developed several proprietary matching systems on some of the biggest dating sites. That said, proper test construction is not a simple "just add water" process, so your “six month” time line might need to be extended. Also, the application will be built with an eye towards being "engaging and shorter" test than your questionnaire. Of course, you have yet to show that online daters require matching systems to be short and "engaging" (whatever that means). For example, eHarmony is phenomenally successful, yet their questionnaire is neither short nor probably engaging according to your definition. So, from a business standpoint these features may be meaningless. Also, researchers have long known that even "bad tests" can be perceived as effective due to placebo-type effects. Being a clinician, you already know this, but just in case I recommend this article: Thus, it’s again imperative to start at the beginning of any scientific instrument – the reliability and internal validity.

With this all in mind, my team is confident we can build an original customized application that’s a shorter and more entertaining test than your questionnaire. Shorter tests that don't sacrifice psychometric quality are arguably best constructed using a Computerized Adaptive Testing (CAT) approach. Fortunately, my team includes pioneers in this as well, see:
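For readers unfamiliar with CAT, the core idea can be sketched in a few lines: at each step, ask the item that is most informative at the respondent's current estimated trait level. The sketch below assumes a Rasch (one-parameter) model and a made-up item bank. It is not any team's actual system, and it omits the re-estimation of the trait level after each answer that a real CAT performs.

```python
import math

def rasch_prob(theta, b):
    """Probability of endorsing an item of difficulty b at trait level theta."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Fisher information of a Rasch item at theta: p * (1 - p)."""
    p = rasch_prob(theta, b)
    return p * (1.0 - p)

def next_item(theta, difficulties, asked):
    """Index of the unasked item that is most informative at theta,
    or None when every item has already been asked."""
    candidates = [i for i in range(len(difficulties)) if i not in asked]
    if not candidates:
        return None
    return max(candidates, key=lambda i: item_information(theta, difficulties[i]))

# Hypothetical item bank (difficulties in logits) and trait estimate.
bank = [-2.0, -1.0, 0.0, 1.0, 2.0]
print(next_item(0.3, bank, asked=set()))
```

Because each question is chosen to be informative for that particular respondent, fewer questions are needed to reach a given precision, which is how a CAT can be short without giving up psychometric quality.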

My team is well established in this industry and in the academic community because of our customized applications that drive revenue. Indeed, we have presented at iDate and published peer-reviewed scientific articles about the evidence (and lack of) for compatibility tests. It was even our team that first initiated a challenge to compatibility test makers to show the public any and all evidence for their specific methodologies. Some started to do this, and in keeping with the initial standards by Glenn Wilson and Jon Cousins, my team was arguably the first to bring testing standards to this aspect of cyberspace. Even Dr. Mark Thompson at the pioneering online testing company weAttract acknowledges this.

However, we don’t work for free. The estimated cost to combine all of the features described above in a new application, to meet APA testing standards, and to be ready within approximately six months is USD $350,000.00. You already have my private email address. Please write me so I may provide you with the mailing address to send a check. We will require full payment before the project will begin. We can also accept a wire transfer. The nice part is that you would own the application we build and be free to sell or license it as you see fit (we can always build another). We would insist, though, that the results be published in a peer-reviewed journal so the field can learn from our head-to-head comparison.

Personal Note to Dr. Joel Block
You welcomed debate and questions from me on It was clear from your comments that you had no idea who I was or that my recognized expertise was compatibility testing. It was also clear you had little to no grasp of the literature and history behind compatibility testing itself. I even had to correct you on several naïve claims on your website. Then when I started asking hard questions about the “scientific” matching system you were selling/advertising on that blog you became less cooperative. It was framed in niceties like, “let’s take this offline; this stuff is probably boring people.” But, I didn’t go away and continued to ask hard questions for which you obviously had no credible answers or data. Now, you’re trying a different method of misdirection by insulting the general public by asserting the public can’t or won’t appreciate all this science and research “stuff” and therefore we should ignore it and simply see “what people like.” People like candy but it’s not nutritional, whereas medicines can be life saving though not have the best “user experience.”

You pose as someone who cares about the outcomes for people, but you seem to be selling snake oil. Your dismissive, cavalier attitude about science is consistent neither with the messaging you have on your site nor with your clinical education and credentials. As an academic, I find it appalling and bordering on unethical. I’m even tempted to contact all professional organizations to which you belong and notify them of the situation. Your website states your test is scientific and effective, although you repeatedly avoid providing any data of professional quality that establishes you even have a scientific test in the first place. I take strong offense to your advertised claims, and they should be modified immediately. Other researchers who have seen our exchanges side with me on this. You might have a “fun questionnaire,” but such an application shouldn’t be equated with an effective scientific instrument in which people should place their trust.

Everyone may not appreciate the science behind compatibility testing, but is that any reason not to adhere to professional standards? I’m incredibly relieved that experts who make and improve aircraft, household appliances, vehicles, seat belts and medical equipment don’t adhere to your point of view. I challenged you a while back to provide evidence for your scientific claims. Jon Cousins – a published compatibility researcher who deals with real world applications – also asked for similar data. These data apparently don’t exist, since you’ve never answered our original challenge to you. Amateur hour is over – you’ve been called out. Let’s start the challenge immediately and follow professional standards for evaluation. Frankly, I don’t expect you to actually participate. What I do expect is more stone-walling. The academic community is waiting – are you going to put up or shut up? My team started this challenge for dating sites to show clear and convincing evidence for their purported “scientific matching tests”, and I’m committed to see it through.


Wow Guys...this has been a really fascinating read. Of course, I don't understand most of what you're talking about, since I'm only a programmer, but it's obvious that you each have strong opinions on effective compatibility testing.

Can I just deviate from the debate for a second and ask a question that has been nagging me since I worked with Dr. Houran at

While coding the logic in the compatibility test offered by, I noticed that the user interface would present a thermometer type bar to the user, who would be asked to click on the bar to indicate their level of agreement with a given statement or question. While the interface appeared to be very specific about the level of agreement (or disagreement), the program actually captured a simple value of 1,2 or 3 depending on where the user clicked the bar. Now the interface might look snazzy to the user, but why not just give them three options and let them choose?

When I've taken tests of this type, I found myself very carefully thinking about how much I actually agreed with the statement that was presented so that I could click in just the right place on the bar. So imagine my chagrin when I realized that I could've clicked anywhere in each third of the bar and the result would have been the same. Surely I'm not alone. Of the millions of people that have taken these tests, just imagine if each of them took an extra 5 minutes completing the test due to this type of interface. The combined loss of productivity and personal time is staggering! Wow...I didn't know I felt so strongly about this. LOL
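The behavior described above, a continuous-looking bar that records only 1, 2 or 3, could be sketched roughly as follows. This is a hypothetical reconstruction for illustration, not the site's actual code. The function name, the pixel coordinates and the equal-width thirds are all assumptions.

```python
def bar_to_category(click_x, bar_width, n_categories=3):
    """Map a click position on a bar (0 <= click_x <= bar_width)
    to an integer category 1..n_categories of equal width."""
    if bar_width <= 0:
        raise ValueError("bar_width must be positive")
    if not 0 <= click_x <= bar_width:
        raise ValueError("click outside the bar")
    index = int(click_x * n_categories / bar_width)
    # min() keeps a click on the far right edge inside the last category.
    return min(index + 1, n_categories)

# Hypothetical 300-pixel bar: clicks at 10 px and 95 px land in the same third.
print(bar_to_category(10, 300), bar_to_category(95, 300), bar_to_category(299, 300))
```

Any two clicks within the same third produce the same stored value, so the extra precision the user puts into placing the click is discarded.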

Anyway, thanks for the entertaining read. Keep up the good work.

Kind Regards,

Lee Phillips
Web Developer Extraordinaire ;)

James Houran

Dear Lee,

It's not my place to address your question specifically for obvious reasons.

That said, speaking to the broad topic of user experience, if an audience likes the feel of a visual analogue scale (what you described above) over a Likert-type format (what you proposed above) then one could easily argue that an additional five minutes is hardly an inconvenience to anyone under the circumstances. Of course, that assumes that the protocol adds more time to begin with, and I've seen no such data. In fact, some people claim just the opposite -- that visual analogue scales for online tests are more intuitive than rigid categories and thus users respond faster and more reliably. If we're talking about five minutes, then I see the issue as moot to most people.

On the other hand, many tests my team have built do use a strict Likert scale format. My team has constructed many compatibility tools, and clearly one of the goals is for different applications to be different in important ways so users can have experiences that are unlike the competition.

Finally, when it comes to creating reliable and valid tests of any kind, there are clear guidelines. Opinion doesn't really enter into it. It's expected that test designers follow professional standards. That's what I advocated above and that's the foundation for all of my professional criticisms of so-called "scientific matching tests" that seem to have "nothing meaningful under the hood."

Online toys and games are fine, but makers should be open and honest about this fact and not try to sell fortune cookie readings or placebo-type effects as scientific conclusions.


James Houran, Ph.D.


Dr. Houran,

Thanks for the response. I'm extremely curious to know if any studies have been done that compare the two types of user interfaces that we're talking about. My gut instinct tells me that, all other things being equal, the average amount of time spent by users on a survey/test using visual analogue scales would be significantly more than that spent by users of a survey/test with a Likert-type interface; especially if we're talking about questions with 3 or fewer options.

While five minutes doesn't sound like a lot, it can FEEL like a lot to the user that is taking the survey. Another statistic that would be interesting is the number of abandoned sessions of a survey/test presented in an analogue scale format vs. the same survey/test presented in a Likert-type format.

I would imagine that a site owner would be willing to sacrifice the snazzy interface if it were proven that it had a higher rate of abandonment. Of course, this is all conjecture on my part.
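If a site did log started and abandoned sessions for each interface, the comparison could be sketched as a simple two-proportion z-test. The counts below are invented purely for illustration, since no such data appear in this thread, and the function name is made up.

```python
import math

def abandonment_z(abandoned_a, started_a, abandoned_b, started_b):
    """Two-proportion z statistic comparing the abandonment rates of
    interface A and interface B, using a pooled standard error."""
    p_a = abandoned_a / started_a
    p_b = abandoned_b / started_b
    pooled = (abandoned_a + abandoned_b) / (started_a + started_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / started_a + 1 / started_b))
    if se == 0:
        raise ValueError("no variation in outcomes; z is undefined")
    return p_a, p_b, (p_a - p_b) / se

# Made-up counts: 1000 sessions per interface (A = analogue bar, B = Likert).
p_vas, p_likert, z = abandonment_z(120, 1000, 90, 1000)
print(f"VAS={p_vas:.1%}  Likert={p_likert:.1%}  z={z:.2f}")
```

A z value beyond roughly plus or minus 1.96 would suggest the difference is unlikely to be chance alone, assuming users were randomly assigned to the two interfaces.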

Do you know of any studies that have been done that might satisfy my curiosity?


Lee Phillips
Web Developer Extraordinaire ;)

James Houran

Hi Lee,

I hope you're well. I don't know of any published studies that have compared the two formats with respect to online testing. And online testing is a different beast than pencil-and-paper formats.

However, my gut and experience tell me the time difference would be a wash UNLESS the test/assessment is very long. Yet, even "shorter" Likert scale options would still feel like an inconvenience simply due to normal test fatigue. So, test completion would very likely have more to do with content (are people motivated to keep answering questions) and test length (number of questions) than it does with time added or taken away due to use of a visual analogue scale versus a Likert scale.


James Houran, Ph.D.

Fernando Ardenghi

For some women the (True Compatibility Test) TCT's visual analogue Likert scale resembles the shape of a ....
Feminine Healthcare Sanitary Towel

See these screenshots.

Kindest Regards,

Fernando Ardenghi.
Buenos Aires.


Oh....yeah... Check it out.... Can you match us?

The comments to this entry are closed.

