On The Supposed Newly Discovered “Gay Genes.” Or, The Importance Of Model Skill


Great news! The press and activists are touting a study which claims to have discovered the genes that “make” somebody gay. So get your enwombed baby’s DNA scanned and make that appointment with Planned Parenthood early. There’s sure to be a line out the door as parents rush to eliminate these fabulous “clumps of cells.”

Hey. Why not? Abortion is the law of the land, right? And the law of the land decides right and wrong, yes? And abortion, we are told by heretics, isn’t killing, right? (Right, Sylvain? JH? Hello?) So why not save your potential child a lot of trouble and put it out of its misery as early as possible. Right?

Never mind. I was only kidding. You’re saved from confronting this gut-wrenching progressive dilemma—sexual libertinism vs. unchecked bloodlust—because the study is almost certainly nonsense. Why? Wee p-values were the evidence for the claim.

Wee p-values are the smallest sins here. Worst is the claimed accuracy of a model and the absence of skill. What’s skill? Read on.

Here was one of the initial headlines: “The DNA test ‘that reveals if you’re gay’: Genetic code clue is 70% accurate, claim scientists.” This is silly on its face, because if a man is same-sex attracted he doesn’t stand in need of a chemical test to tell him so. But let it pass, because the idea was to discover biological drivers, which is to say causes, of homosexual desire (and not homosexuality: there is no such thing).

The headline was based on an abstract—a mini-paper of about one page, common in medicine—and a press release, unfortunately common in science. Anyway, some fellow named Tuck Ngun at UCLA did the study. According to the paper:

The study involved 37 pairs of twins in which one brother was homosexual and the other heterosexual, and 10 pairs in which both were homosexual.

Using a computer program called Fuzzy Forest they found that nine small regions of the genetic code played the key role on deciding whether someone is heterosexual or homosexual.

The research looked at a process called ‘methylation’ of the DNA — which has been compared to a switch on the DNA — making it have a stronger or weaker effect.

This process can be triggered by hormonal effects on the growing foetus in the womb.

Fuzzy forest? An algorithm that classifies some thing, here same-sex attraction or not, based on input variables, here the genetic markers. These markers weren’t DNA per se, but epigenetic markers, namely the methylation sites. Wikipedia, for once, does not let us down: it’s simpler to read their article than to have me explain it. It’s not necessary to understand epigenetics to follow the statistics.

And did you notice? Thirty-seven pairs of identical twin brothers, in which one brother did not suffer same-sex attraction and one did. Know what that proves? It proves same-sex attraction can’t be entirely genetically caused. If it were, then you’d find all pairs of identical twins with the same attractions, given we accept everybody tells researchers the truth. This study thus proves that attraction is at least partly environmentally caused (this includes epigenetic changes, which we’d expect should be mostly the same for both twins in the womb). Incidentally, one environmental cause is choice. Skip it.

Then came the day Ngun gave his paper. His audience, we are told by The Atlantic, was not satisfied.

[Ngun] analysed 140,000 regions in the genomes of the twins and looked for methylation marks…He whittled these down to around 6,000 regions of interest, and then built a computer model…

The best model used just five of the methylation marks, and correctly classified the twins 67 percent of the time. “To our knowledge, this is the first example of a biomarker-based predictive model for sexual orientation,” Ngun wrote in his abstract.

Ngun separated the data into training and validation sets, so the model was built on something like 20 or so pairs, and tested on the other 20 or so. He built “several models” and selected the one which gave the best classification accuracy, the one with just five methylation marks. Did Ngun correct for multiple testing? No, sir, he did not.
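Why does skipping the multiple-testing correction matter? Here is a minimal simulation sketch, and it is emphatically not Ngun’s procedure. It assumes a validation set of 40 men (20 or so pairs) and 20 candidate models, both numbers invented for illustration since “several” is all we were told. Every candidate is a pure coin flip with no predictive ability whatsoever. The sketch asks what validation accuracy you should expect from the best of them.

```python
import random


def best_of_random_models(n_cases, n_models, rng):
    """Validation accuracy of the best of n_models coin-flip classifiers."""
    accuracies = []
    for _ in range(n_models):
        # Each coin-flip model is right on any given case with probability 0.5.
        hits = sum(rng.random() < 0.5 for _ in range(n_cases))
        accuracies.append(hits / n_cases)
    return max(accuracies)


def mean_best_accuracy(n_cases=40, n_models=20, n_trials=2000, seed=0):
    """Average, over many simulated studies, of the best model's accuracy."""
    rng = random.Random(seed)
    total = sum(
        best_of_random_models(n_cases, n_models, rng) for _ in range(n_trials)
    )
    return total / n_trials


# n_cases and n_models are assumptions, not figures from the study.
single = mean_best_accuracy(n_models=1)
best = mean_best_accuracy(n_models=20)
print(f"one coin-flip model:      {single:.2f}")
print(f"best of 20 coin-flippers: {best:.2f}")
```

The single model hovers around chance, as it must. The best of many useless models looks noticeably better than chance on a validation set this small, which is why an uncorrected “best of several” accuracy proves very little.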

That means Ngun is claiming sexual desire is controlled largely, but not entirely, by these five methylation marks. Sound plausible to you? No. It didn’t to other researchers The Atlantic contacted either.

Guy named John Greally from Albert Einstein gets it. He said Ngun “could not resist trying to interpret [his] findings mechanistically”. Of course, nearly everybody who discovers a wee p-value interprets their findings mechanistically, which is to say, they believe statistical models have discovered causes.

Greally wrote (and The Atlantic also quoted): “It’s not personal about [Ngun] or his colleagues, but we can no longer allow poor epigenetics studies to be given credibility if this field is to survive. By ‘poor,’ I mean uninterpretable.”

To which we say Amen.


Here’s the real criticism. 37 of 47 pairs were SSA. Suppose you invent the naive model “Say everybody is SSA”. What is that model’s accuracy here? Well, you’d be right 37 * 2 = 74 times and wrong 10 * 2 = 20 times, for 74/94 = 0.79, or 79% accurate. Damn good! And it beats Ngun’s fancy schmancy Fuzzy Forests by a long shot. That sophisticated model only got 67% accuracy. This assumes his training-validation split was equal across the 37 and 10 pairs, naturally, but you get the idea.

If accuracy is your goal, there is no reason in the world to use Ngun’s model and every reason to use the naive model. The naive model does a much better job. It lacks the just-so story of methylations causing SSA, but what of it?

This lack of skill is what I’m always going on about in climate models. Persistence beats the sophisticated models, so why not choose persistence? Models without skill compared to a natural reference model should not be used.
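For those who like to see the bookkeeping, here is a minimal sketch of a skill calculation. The formula, (model − reference) / (1 − reference), is one common convention and is my assumption here; the argument above only compares raw accuracies. Everything hinges on how the naive model’s hits are counted, so the sketch does it two ways: with the 74/94 figure used above, and with a per-individual count taken from the quoted study description (37 pairs with one homosexual brother, 10 pairs with two), which gives 57/94, as commenter JH points out below.

```python
def accuracy(correct, total):
    """Fraction of cases classified correctly."""
    return correct / total


def skill(model_acc, reference_acc):
    """Skill versus a reference: 1 is perfect, 0 ties the reference, negative is worse."""
    return (model_acc - reference_acc) / (1.0 - reference_acc)


MODEL_ACC = 0.67  # accuracy reported for the five-mark model

# Naive "say everybody is SSA" reference, counted two ways.
naive_post = accuracy(74, 94)  # the figure used in this post

# Per-individual count from the quoted study description:
# 37 discordant pairs -> 37 SSA + 37 not; 10 concordant pairs -> 20 SSA.
ssa = 37 + 2 * 10
total = 2 * (37 + 10)
naive_individuals = accuracy(ssa, total)

for label, ref in [
    ("post's count", naive_post),
    ("per-individual count", naive_individuals),
]:
    print(f"{label}: naive accuracy {ref:.2f}, model skill {skill(MODEL_ACC, ref):+.2f}")
```

The sign of the skill flips depending on which count you accept, so the reference model has to be pinned down before anybody brags about accuracy.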

Update: Greally discovered Ngun’s answers to some of his critics. From that, we can see wee p-values were the source of decision. But the most interesting attempt at rebuttal was to Greally’s criticism about interpreting results mechanistically. Here’s what Ngun said:

Let’s be real here: no one is going to pay attention unless you talk about implicated genes. It’s all about interpretability. We ultimately want to understand what’s going on in terms of the biology so of course we’re going to talk about any genes that seem related and are interesting.

“Being real” means, to him, juicing the press release with potentially misleading information, because, he implies, who’d be interested in the truth?

Ngun later goes on to answer The Atlantic (Greally didn’t link to this). Ngun said, “[My] approach is used widely in statistical/predictive modeling field. It is not an insidious issue or data manipulation…” This is true. And that’s the problem. The method, as regular readers know, stinks and is guaranteed to produce large quantities of false positives.

More confirmation of wee p-values: “The single test we did was to ask whether the final model we had built was performing better than random guessing. It seemed to be because its p-value was below the nearly universal statistical threshold of 0.05.”


  1. James

    I enjoy how blatant it is that they just had some blood samples and threw them at software until a politically correct answer popped out.

    Also, in classification, “your model does worse than the sample proportion” is probably the worst thing that could happen. How did this even pass peer review? (he asked rhetorically)

  2. John B()


    Please defend your use of the Daily Mail as a source.

    IMHO, they are the National Enquirer of the UK and when it comes to reporting on conservative issues, every bit as bad as reporting about the “environment” by mainstream media. Find another source or report on the actual paper.

  3. John B()


    You reported on the one page paper

  4. Thanks for taking the trouble to write this up. I thought they had actually found something interesting, but you’ve made a good case that it’s just more p-hacking.

  5. John B()

    And Nature has posted an Update almost retracting the story (LA and Mail will leave the article as is but even the LAT had a caveat at the end discussing replication and mentioning the too-small sample size)

    Update, 12 October 2015: Since this story was first published, several researchers have criticized the study’s methods. Some statisticians, including Andrew Gelman, at Columbia University in New York, have said that the study incorrectly presented its results as statistically significant. Study co-author Tuck Ngun, of the University of California, Los Angeles, has disputed this and other statistical criticisms, although he has acknowledged another criticism that his study was underpowered. He has not yet made the full details of his analysis available, but has said he and his collaborators will issue a statement.


  6. Perhaps the twins were fraternal twins. And even if they were identical twins, identical twins are not genetically identical. They vary in copy numbers of genes.

    That said, I wouldn’t be surprised if you could find a genetic test for anything you like, no matter how obviously environmentally caused. Take a small sample of people and divide them into those who have ever been bitten by a snake and those who have not. Out of the three billion base pairs in each person’s genome, surely there’s something that most of the snake bite victims have in common.

  7. Excellent opening paragraph! This has long been recognized as a very real possibility. However, in the current political climate, there may in the future be a test required before killing that lump of cells to be sure that lump of cells is not a potential Democrat voter and supporter of diversity. If it is, removal will simply not be allowed. Pity the woman raped and then carrying a potential liberal voter, folks.

    Is that not the most perfect name for a computer program used in creating government science results?! I love it!

    It is somewhat encouraging that at least some sources do include criticism of the work. In the past, I don’t remember seeing such counters.

  8. Briggs

    John Cook,

    That excellent point can also be made for every hypothesis ever conducted (on humans). Whatever the supposed cause is, there are almost surely other markers that are also as different in the groups. I call this the Banana Test. Whatever test you have, it could be that everybody in one group ate one more banana in their lives than folks in the second group. Therefore, the hypothesis testing logic applies equally to eating bananas as it does to presupposed cause.

  9. JohnK

    Matt –

    Congratulations on yet another splendid lesson. Increasingly it would seem that you’re now able to teach any eager student volumes within just one example, in just a few short words. Even your offhand yet penetrating comment here to John Cook on the Banana Test – in a few words, both praising a student’s insight, and leading him further – reveal you as a master teacher.

    I remain both sorry and angry that the world now in practice forbids you to teach what you have to give – what you eagerly wish to give – to dozens and dozens of undergraduate and graduate students, every year. And one might add, to teach other professors; even, dare we hope, to teach study administrators and bureaucrats, as well. It’s not right that you are in practice prevented from doing so.

    You and your family remain in my prayers. My own prayers are puny, my widow’s mite, but you have them.

  10. As someone who never studied stats and hasn’t done math for decades — when I loved abstract math, particularly algebra and geometric proofs, and hated all sorts of applied math — what is the best way for me to learn about the “p value” you are always talking about? I really want to grasp this.
    So the idea behind epigenetics is that some genes can express different traits because of environmental effects — producing a particular condition, etc., that they would not if the environmental effect were not there. Right? The Wikipedia article mentioned developments in fur, etc., in mice given certain foods. The food triggers the genes to do something they would not normally do, but the changes do not mean that the genes themselves have changed/mutated. Right?

  11. Ray

    If you believe your behavior is really a physical identity then you can always claim you can’t change the behavior and people that call the behavior immoral are persecuting you. You’re a victim of persecution. When you claim victim status you are morally superior to the people victimizing you.

  12. Nina

    The lab is funded by the NIH, our tax dollars. Fraudulent science.

  13. Sylvain


    For once you did not distort someone’s view

    “And abortion, we are told … isn’t killing …? (Right, Sylvain?) ”

    With YOS deforming Aristotle’s concept:

    “Aristotle also described the concept of “delayed ensoulment” and “animation.” Explains Merritt and Merritt,

    “Building upon the classical Greek appreciation and elevation of form in nature and life, he taught that a fetus was at first an unformed ‘vegetable soul.’ This evolved into an ‘animal soul’ later in gestation. It finally became ‘animated’ with a formed human soul” (When Does Human Life Begin?, p. 14).”


    “The most familiar but also the crudest form of trichotomy is that which takes the body for the material part of man’s nature, the soul as the principle of animal life, and the spirit as the God-related rational and immortal element in man. The trichotomic conception of man found considerable favor with the Greek or Alexandrian Church Fathers of the early Christian centuries.”


    And here is a good summary of different points of view about the beginning of life:


  14. Ray

    Thanks for the link. Anyone who has studied languages knows that gender is a property of declined languages, not humans. Human so-called gender is just a social construction, i.e. imaginary. The UCLA center for gender based biology is just quackery.

  15. I’m surprised that no studies were done on homosexual behavior in animals given that “culture” could be eliminated (or at least diminished) as a contributing factor. And abnormal sexual behavior does exist in the lower species. My grandmother had a spayed boston terrier bitch who would hump my leg every chance she (the boston terrier bitch) could get.

  16. JH

    “And abortion, we are told … isn’t killing …? (Right, Sylvain?) ”

    Sylvain, thank you for not including me in the parenthesis.

    And abortion, we are told by heretics, isn’t killing, right? (Right, Sylvain? JH? Hello?)

    Mr. Briggs, this is definitely not how one says “I miss you” to a friend. So why do you mention me here? Does this imply that I believe or have argued that abortion is not killing? If yes, I expect you to answer this question with evidence. If not, please remove my online alias from the above statement.

  17. JH

    The study involved 37 pairs of twins in which one brother was homosexual and the other heterosexual, and 10 pairs in which both were homosexual.

    Here’s the real criticism. 37 of 47 pairs were SSA. Suppose you invent the naive model “Say everybody is SSA”. What is that model’s accuracy here? Well, you’d be right 37 * 2 = 74 times and wrong 10 * 2 = 20 times, for 74/94 = 0.79, or 79% accurate. Damn good!

    Assuming SSA=same sex attraction, the naive model would be correct with a relative frequency of 57/94. Anyway, fitting such a naive model to data resulting from a purposive sampling method is nonsense.

    Not going to spend more time on this post since I already read about this tabloid hype in Gelman’s blog here and here this weekend. And I agree with Gelman’s assessment that
    “Based on the abstract and what Ngun wrote on his webpage, … we don’t really have enough information to judge. In general it seems like you’re asking for trouble when you start publicizing technical claims without supplying the accompanying evidence. Everything seems to depend on trust—or perhaps the fear of getting scooped by another news outlet. If you’re Nature and the American Society of Human Genetics emits a press release, and you know it’s gonna get covered by the Daily Mail etc., there’s some pressure to run the story.”

  18. Well, it certainly seems plausible, given the complexity of our reproductive systems, all involved really considered, genetic irregularities would sometimes occur that would cause or make someone more open to homosexuality.

    As well, it should come as little surprise if it turns out genetics have little to do with it. Given the complexity of our brains and human society, it seems quite plausible homosexuality is a social/psychological phenomenon.

    I never liked the genetic argument as a moral/political argument. It should make no difference. We know that sexuality is deeply wired into our minds. There is no simple therapy or lifestyle or religion or law that can change a person’s sexuality. It is about the most personal thing to any person. If you believe in human liberty, you cannot hold to regulating common sexual behavior among consenting adults.


  19. It’s turtles all the way down. That a certain gene complex “causes” “something” is the ultimate in gibberish answers. What causes gene complex ABC to express behaviour XYZ is the only sort of question (and answer) that can be meaningful. Even if you were able to demonstrate a solid correlation you still don’t know what that means. Dressed up phrenology designed to impress the stupid.

    (I’ll ignore the fact that Dr Briggs predicates his philosophical claims on dark age philosophical precepts, which is yet another level of stupid entirely…)
