Thursday, 14 March 2019

DNA Testing: what can it tell you?

Personal DNA kits have become very popular over recent years.

Many are sold to those looking for long-lost relatives, or to people just curious about their 'roots'.

But what can they tell you about your origins and relationships to other people?

In a moment of enthusiasm, following an excellent series of BBC programmes by Prof Alice Roberts, I decided to order a DNA testing kit from the British company Living DNA.

It's not that I am looking for any long-lost relatives, it's just that I am curious.

the sample

Providing a sample simply involves moving a swab on a stick around the inside of your cheek. You then send away the sample in special packaging and wait for an email telling you that the results are in. Having created an account, you can then view the information via your browser online.

a bit of DNA theory

I think it is important to understand a bit of theory, otherwise you may be disappointed with the end results.
This may be a rather over-simplified description, but hopefully it will provide you with some kind of background, and may prompt you to dig a little deeper.
DNA analysis provides statistical data based upon our current understanding of how DNA changes over time, and upon field data indicating the variations between populations in (or originating from) different geographical locations. DNA analysis is evolving, and is likely to provide more information, of better quality, in the future.

Motherline Ancestry: Mitochondrial DNA

This kind of DNA (also referred to as mtDNA) is only passed from mother to child. It changes (or mutates) quite slowly, typically over many generations. Science has attempted to document these changes and reference them by group identities (i.e. "haplogroups"). So for example, people within the same mtDNA haplogroup have a common female ancestor.

Going back much further in time, we must reach a point where we all share a common female ancestor. This unknown female is unfortunately referred to as mitochondrial Eve (mt-Eve).

This leads to biblical comparisons, but mt-Eve is NOT the first woman. mt-Eve is only the most recent woman from whom all living people are descended through an unbroken line of mothers. Other women of her time may also have living descendants, but somewhere along each of their lines the chain of daughters was broken, so none of their mtDNA survives in the human population today. The theory is that mt-Eve came from a location in eastern Africa. She is the 'root' with a haplogroup identity of: L

Each change in mtDNA is represented by letters & numbers (e.g. haplogroups L0, L1, L2, H, H1 & so on).
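To make the naming scheme a little more concrete, here is a minimal Python sketch that treats haplogroups as a family tree. The parent map is a heavily simplified, partial subset that I have assumed for illustration (many intermediate branches are left out), and the function names are my own invention rather than anything from Living DNA.

```python
# Simplified, partial parent map of mtDNA haplogroups (child -> parent).
# Many intermediate branches are omitted; this is an illustration only.
PARENT = {
    "L0": "L",
    "L1": "L",
    "L2": "L",
    "L3": "L",
    "M": "L3",
    "N": "L3",
    "R": "N",
    "R0": "R",
    "HV": "R0",
    "H": "HV",
    "H1": "H",
}


def lineage(haplogroup):
    """Return the path from the root 'L' down to the given haplogroup."""
    path = [haplogroup]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return list(reversed(path))


def shared_branch(a, b):
    """Return the most recent haplogroup that both lineages pass through."""
    common = None
    for x, y in zip(lineage(a), lineage(b)):
        if x != y:
            break
        common = x
    return common


print(" > ".join(lineage("H1")))
print(shared_branch("H1", "M"))
```

Walking up the tree from H1 passes through H on the way back to the root L, which is roughly what a motherline report is describing when it lists the steps along your ancestral line.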

As we record and document more & more mtDNA from around the world, we can build up a picture showing the likely origins and migrations of people of particular mtDNA haplogroups.

Here is a recent attempt to show mtDNA groups with the migration of people at different times in human history;

My motherline signature belongs to one of the most common haplogroups in Europe, which is believed to have originated 25,000 - 30,000 years ago: haplogroup H

Although haplogroup H is one of the most common, we don't appear to know where it originated. It could be the north-eastern Mediterranean, while others have suggested south-east Asia or the Middle East. This really is an illustration of how new this science is, and how much more there is to discover. So if you were expecting more precise detail, you may be disappointed.

Fatherline Ancestry: Y-DNA

Y-DNA is only passed from father to son, and therefore, this data only applies to us men.

The 'root' haplogroup is: Y

The world map for male haplogroups looks different to the motherline equivalent;

Once again, the origin 'Adam' is NOT the first man. He is the most recent common ancestor of all living men through an unbroken line of fathers, and he almost certainly never met mt-Eve.

My fatherline haplogroup is: R-U106

According to Living DNA: "R-U106 is sometimes referred to as the Germanic branch of the R1b fatherline, and this haplogroup is found in large concentrations in both Northwest Germany and the Netherlands".

Here is the suggested migration route for R1b;

Autosomal DNA

This part of your DNA is passed down from ancestors on both sides of your family, in a combination that is unique to you. It may reveal your genetic history over roughly the last 10 generations.

Your autosomal DNA comes from your parents (50/50 split), but the amount from each earlier generation is more variable.

Again quoting from Living DNA;
"If your grandmother was 100% Eastern European, and the rest came from Asia, then your genetic profile could show anywhere from 0 to 34% Eastern European".

This looks like quite a limitation.
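To get a feel for why the share from a grandparent varies, here is a toy Python simulation. It is only a sketch under a big assumption of my own: that the DNA your mother passes on is made of 60 equal, independent blocks, each copied from her mother or her father at random. Real inheritance is messier, and this does not attempt to reproduce Living DNA's figures.

```python
import random


def grandparent_share(n_blocks=60, rng=random):
    """Toy model: fraction of a child's autosomal DNA from one grandmother.

    Assumption: the DNA a mother passes on is made of n_blocks equal,
    independent blocks, each copied from her own mother or father with
    equal probability. Real chromosomes and crossovers are messier.
    """
    from_grandmother = sum(rng.random() < 0.5 for _ in range(n_blocks))
    # The mother contributes half of the child's autosomal DNA.
    return 0.5 * from_grandmother / n_blocks


def simulate(trials=10_000, seed=1):
    """Return the lowest, average and highest share seen over many trials."""
    rng = random.Random(seed)
    shares = [grandparent_share(rng=rng) for _ in range(trials)]
    return min(shares), sum(shares) / trials, max(shares)


low, mean, high = simulate()
print(f"lowest {low:.1%}, average {mean:.1%}, highest {high:.1%}")
```

The average should come out close to a quarter, but individual results are scattered around it. On top of that, a test can only attribute DNA to a region when it matches the reference samples, which adds another layer of uncertainty.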

Various charts are presented which attempt to estimate origins over different time periods.

My initial illustration looks like this, and covers the last 10 generations;

...where each coloured area is assigned a percentage.

Living DNA: accuracy of results

In their FAQs, Living DNA fudge the question of accuracy by using phrases like:-
  • In a nutshell, your results are as accurate as the current science can make them.
  • Each of the 650,000+ locations we look at on your DNA represented at least 15 times, allowing us to present your results with high precision and accuracy
  • Living DNA uses the world's most advanced DNA testing technology...
But the crucial phrases they use include:-
  • It is important to note that all family ancestry results are estimates, based on a comparison of your DNA to our range of reference samples
  • Our algorithm will attempt to match you directly to a population/region, where we have it in our reference database
  • ...we know that deciding where your ancestors lived hundreds of years ago by looking only at your DNA is a tough problem to tackle...


The basic data mentioned above is presented in a number of different ways, seemingly to give the illusion that there is more than there actually is. For example, I couldn't see the point of the "What makes you" spotty-man illustration, when the doughnut chart provided a clearer view of the same data.

The historical information (which relates to periods of time roughly associated with certain haplogroup steps along your ancestral line) is interesting, but would be better if there was more content.

I didn't understand why the "Through History" screens auto-run through each step. It only makes sense to me if you step through manually, giving yourself time to read the history at each stage, but maybe I'm missing the point. There are also a few typos and broken links that they need to rectify.

It seems to me that much of the data is essentially layers of uncertainty, stacked one upon another. Remember the earlier example concerning your Eastern European granny? If an ancestor as recent as a grandparent can show up as little as 0% in your autosomal results, your personal results could turn out to be very misleading.

Future Research

There is no doubt that this relatively immature area of science will develop rapidly, providing some very useful information to mankind, especially in the area of medical research. At the moment, I think that my data is of more use as part of some vast DNA dataset, than it is as a single personal record to me.

It is often the case that useful information can be deduced from a large database, while the individual records are almost useless. So I have agreed to allow my data to be used anonymously for some (as yet) unspecified research.

Looking for relatives

Living DNA are working on a feature called "Family Networks" which will attempt to identify relatives. While the idea of this feature fills me with dread, I do understand that this could be the single most useful option for those trying to build family trees or looking for answers to family mysteries. I hope to do a follow-up post when I have more information.

Some DNA DIY

I'd like to be able to report that I've lashed up a Raspberry Pi with a sink-plunger, plus the cardboard tubes from two toilet rolls, and made my own low cost spit analyser. Unfortunately not.

But I have downloaded my raw DNA data, edited it in LibreOffice Calc and started my own analysis.

For example, the SNP rs6152 has a strong association with baldness, and by referencing the data at SNPedia, I see that:-
  • Genotype: AA = "won't go bald"
  • Genotype: AG = "increased risk of baldness"
  • Genotype: GG = "Able to go bald"

By searching my data I see that my rs6152 Genotype is: GG

Oh well, I still have a full head of fluff!
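A lookup like this could also be scripted rather than done in a spreadsheet. Here is a minimal Python sketch, assuming the raw file is tab-separated text with the columns rsid, chromosome, position and genotype, plus comment lines starting with '#'. That layout is my assumption, so check the header of your own file first. The function name and the sample rows are made up for illustration.

```python
import csv


def find_genotype(lines, rsid):
    """Return the genotype string for rsid, or None if it is not present.

    Assumed layout (not confirmed for every provider): tab-separated
    columns rsid, chromosome, position, genotype, with '#' comment lines.
    """
    data_lines = (
        line for line in lines
        if line.strip() and not line.startswith("#")
    )
    for row in csv.reader(data_lines, delimiter="\t"):
        if len(row) >= 4 and row[0] == rsid:
            return row[3].strip()
    return None


# Made-up sample rows for illustration; the positions are placeholders.
SAMPLE = [
    "# rsid\tchromosome\tposition\tgenotype\n",
    "rs0000001\t1\t1000\tAA\n",
    "rs6152\tX\t2000\tGG\n",
]

print(find_genotype(SAMPLE, "rs6152"))
```

With a real download you would pass an open file object instead of the SAMPLE list, since the function accepts any sequence of text lines.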

Anyway, this is very interesting! I'll cover this in more detail in a subsequent post.

Further reading:-


  1. We're the product...

    Have a look at the diagnostics side of things. Progressive reductions in sequencing costs have made clinical diagnostic projects such as 100,000 genomes a reality:

    1. Thanks mate. The 100kGenome project link was of particular interest.

      My LivingDNA RAW file is only about 17MB of text. I understand the 23andMe RAWs are typically about 25MB, and the full Genome would be 200GB!

      So much info out there to read, and so little time to do it.
      We are only at the beginning of this scientific adventure. Who knows what the future will bring.

  2. Maybe a genome should become a standard comparison of data volumes... a bit like 'X times the size of Wales', so... I reckon my great tit box will generate approx 1/2 a genome's worth of video data unless I turn the resolution down....