Reconstruct trees for MyHeritage matches using AutoKinship

The newest feature on the Genetic Affairs website is AutoKinship. The most amazing thing about AutoKinship is that it generates a tree using only your DNA matches and the DNA those matches share with each other. It doesn’t require you or any of your matches to have a tree. On the Genetic Affairs website there are 2 ways to run AutoKinship: an automated analysis for 23andMe, or a manual analysis for MyHeritage or GEDmatch matches. Recently Roberta Estes wrote a blog post describing the use of AutoKinship for 23andMe. This post describes using the manual AutoKinship at Genetic Affairs with MyHeritage DNA matches.

Figure 1. The two AutoKinship approaches on Genetic Affairs.

The starting point of this analysis is the MyHeritage AutoCluster clustering. In the MyHeritage clusters my second cousin, Trish [1], is found in cluster 6, but she has several grey cells to cluster 1. Cluster 7 has several additional known cousins.

Figure 2. MyHeritage AutoCluster.

Cluster 1

A close up of cluster 1 is shown in figure 3.

Figure 3. Cluster 1 and grey cells to Trish.

From the information provided on MyHeritage for the matches in cluster 1, two of them live in Australia, two in England, a couple in the US and the rest in Ireland. Trish’s and my great grandparents, Thomas Byrnes and Bridget Fenton, both came from Ireland, so I’m interested in finding the connection there.

To set up for AutoKinship I took the HTML from this MyHeritage cluster and converted it to an Excel file using the “Transform AutoCluster HTML to Excel” option under the “Analysis” menu at Genetic Affairs.

Figure 4. Convert HTML file to Excel.

The Excel file has several tabs in it. The first is a list of my matches. The second tab is a list of the shared matches. The next tab has the matches in cluster 1, followed by a tab with the shared matches for cluster 1. This pattern continues for all of the clusters: a tab listing the cluster’s matches, showing the cM that each match shares with me and whatever notes I’ve written about the match, followed by a tab that lists those matches with the names of all their shared matches in the cluster. Matches from grey cells are not included in the cluster matches but do show up in the shared matches list for that cluster. Figure 5 shows the match list for cluster 1.

Figure 5. Match list in Excel file for Cluster 1.

Part of the shared match list for cluster 1 is in figure 6.

Figure 6. Part of the shared match list, ‘icw_1’ for cluster 1 as found in the Excel file.

Using the shared match list, I then went to MyHeritage and got the number of cM that these matches share with each other.

Figure 7. MyHeritage shared matches for Mary who is in cluster 1.

The shared match list in MyHeritage shows how much DNA Mary shares with me, but it also shows the amount that she shares with each of our shared matches. These data are needed for AutoKinship. I’ve circled two of these amounts in red in figure 7. These shared centiMorgans are copied and pasted into column C of the ‘icw_1’ list.

Figure 8. After adding the shared cM to the ‘icw_1’ Excel tab.

Because Trish was a shared match to several of these matches, she already shows up in the shared match list but not in the match list (see figure 5), so I needed to add her to the match list. On MyHeritage Trish is listed as Patricia Ann Harris, her formal name. AutoKinship needs the names exactly as they appear at MyHeritage so that it can find the correct person in the cluster.

Figure 9. Match list for cluster 1 with Trish added.
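The exact-name requirement can be checked before running AutoKinship. The following is only a sketch of such a check, not something Genetic Affairs provides: the function name, the row layout (two names plus a cM value) and every cM number are made up for illustration.

```python
def names_missing_from_match_list(match_names, shared_rows):
    """Return names used in the shared-match rows that are not in the match list.

    The comparison is exact (case and spacing included), mirroring the
    requirement that names appear exactly as they do on MyHeritage.
    """
    known = set(match_names)
    missing = []
    for name_a, name_b, _shared_cm in shared_rows:
        for name in (name_a, name_b):
            if name not in known and name not in missing:
                missing.append(name)
    return missing


# Hypothetical data: the cM values are placeholders, not real results.
match_names = ["Mary", "Bob", "Joe Smith"]
shared_rows = [
    ("Mary", "Bob", 100.0),
    ("Mary", "Patricia Ann Harris", 50.0),
    ("Joe Smith", "Trish", 40.0),  # nickname instead of the MyHeritage name
]

missing = names_missing_from_match_list(match_names, shared_rows)
print(missing)
```

In this made-up example both "Patricia Ann Harris" and "Trish" are reported. The first needs to be added to the match list, and the second needs to be changed to the name used on MyHeritage.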

I also need to add Trish and myself to the shared match list.  I copied the table from the match list, shown in figure 9, and added that to the shared match list.

Figure 10. The ‘icw_1’ list after adding myself to the shared matches list.
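For anyone who prefers to script this copy step rather than do it by hand in Excel, here is a rough sketch. It assumes the shared match list holds rows of (name, name, shared cM) and that the match list holds (name, cM shared with me); the function name, the placeholder name "My Name" and the numbers are hypothetical.

```python
def add_tested_person_rows(tested_person, match_rows, shared_rows):
    """Return a new shared-match list that also pairs the tested person
    with every match, using the cM each match shares with the tested person.

    The input lists are left unchanged.
    """
    combined = list(shared_rows)
    for match_name, cm_with_tested in match_rows:
        combined.append((tested_person, match_name, cm_with_tested))
    return combined


# Hypothetical rows: names follow the post, cM values are placeholders.
match_rows = [("Mary", 60.0), ("Patricia Ann Harris", 300.0)]
shared_rows = [("Mary", "Patricia Ann Harris", 50.0)]

combined = add_tested_person_rows("My Name", match_rows, shared_rows)
for row in combined:
    print(row)
```

The name passed as the tested person should be the same one entered on the AutoKinship setup screen.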

I could run AutoKinship with the information that I have now, but I can also add a known genealogical tree (in the WATO format) for the known relationship between Trish and me. Our common ancestors are Thomas Byrnes and Bridget Fenton, our great grandparents. Using the WATO tree ensures that Trish and I are placed correctly relative to each other. The amount of DNA that we share is on the high side for second cousins, so the relationship could be predicted as first cousins once removed. Since we know the relationship, it’s better to set it with a WATO tree.

Figure 11. WATO tree showing Trish and my family relationship.

With the WATO tree I need to use the same exact names for Trish and myself that MyHeritage uses, or they will not be recognized as the same people.  Also on the WATO tree I need to add the shared cM that MyHeritage has for Trish and use 0 cM for myself.

To use the WATO tree with AutoKinship I downloaded the WATO tree. Do not use the ‘save image’ option, as that will create only an image of the tree and not the data that is actually in the tree.

Figure 12. Download the WATO tree to use with AutoKinship.

Now everything is ready to run the manual AutoKinship.

Figure 13. The entry screen for manual AutoKinship.

For name of tested person, I entered my name exactly as it is on MyHeritage. The default is 10 trees. You can select more if you want; they are listed from highest probability to lowest, so the first few are the most likely. Maximum difference in generation refers to the difference between the tested person and their matches. The default is 2 generations, which would include people in my generation or my parents’ or my children’s generations. Since I don’t know how all the matches are related to me, this is likely a good value. If I were to set it to 3 generations, that would indicate that some of the matches could be in my grandparents’ or grandchildren’s generation. The ages the matches have indicated on MyHeritage suggest that 3 generations is not needed here. ‘Set generation of tested person’ lets you set the generation level for yourself if you’ve set the generation of some of your matches. This is especially helpful if you know how some of the matches in the list fit in your tree and are in a different generation. This is data from MyHeritage, so I want to use the MyHeritage probabilities. Finally, I loaded the WATO tree for Trish and me.

Figure 14. Full screen setup for manual AutoKinship.

There are two ways that the data can be entered. However, the bulk import is so much easier! Just copy and paste the match data from the Excel file. In this case it’s cluster 1, so the data for the Bulk import DNA matches screen comes from the tab labeled ‘1’.

Figure 15. Bulk import DNA matches with the data from cluster 1 filled in.

Then copy and paste the shared match list from the ‘icw_1’ tab into the Import shared matches data screen.

Figure 16. Data pasted from ‘icw_1’ into the shared matches screen.

Next I clicked on “Perform AutoKinship Analysis”. A zip file was sent to my email, which I then downloaded, saved to my computer and unzipped. The autokinship.html file is the landing page and shows the tree with the highest probability. The autokinship.xlsx file lists the matches in one tab and all the shared matches in another tab. I’d used 10 as the max number of trees. Tree1.html is identical to the landing page tree; the other 9 are trees with lower probability. WATO versions of the 10 trees are also provided.

Figure 17. List of files in the AutoKinship directory.

Figure 18 shows the first tree using WATO for Trish and me and setting only myself as generation 0.

Figure 18. Landing AutoKinship for cluster 1 with only me set to generation 0.

The WATO tree puts Trish and me into our correct second cousin relationship. However, Sarah, Sue and Joe Smith being at our grandparents’ level seems unlikely, since they list their age range on their MyHeritage pages, and they are in the same range as Trish and me.

Next I ran AutoKinship with Joe Smith set to generation 0, as well as setting myself to generation 0 using the ‘set generation level of tested person’ option shown in figure 13. To set generation 0 for Joe Smith I added 0 in column C next to Joe’s name in the match Excel file (see figure 19) and used that match file in AutoKinship.

Figure 19. Match table with Joe Smith set to generation 0.
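Copying rows out of Excel produces tab-separated text, so the same match table could also be prepared in a script. This is a sketch under assumptions: it takes column A to be the name, column B the cM shared with me and column C the optional generation, and the function name and cM values are invented for illustration.

```python
def match_rows_to_text(rows):
    """Turn (name, cM, generation) rows into tab-separated lines.

    A generation of None leaves the third column empty, which stands for
    'generation not set' in this sketch.
    """
    lines = []
    for name, cm, generation in rows:
        generation_text = "" if generation is None else str(generation)
        lines.append(f"{name}\t{cm}\t{generation_text}")
    return "\n".join(lines)


# Hypothetical rows: only Joe Smith has a generation set, as in figure 19.
rows = [
    ("Joe Smith", 55.0, 0),
    ("Mary", 40.0, None),
]
bulk_text = match_rows_to_text(rows)
print(bulk_text)
```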

The landing page AutoKinship tree for the analysis that has Joe listed as generation 0 is shown in figure 20. There is a notation of gen 0 by both Joe’s and my names in the AutoKinship tree.

Figure 20. AutoKinship landing page tree with both Joe and me set to generation 0.

In the AutoKinship tree, clicking on a person’s name brings up a box that summarizes all of their matches and the amount of DNA shared, both as centiMorgans and as a percentage. This is shown for Joe Smith in figure 21.

Figure 21. This display shows how Joe Smith matches each person in the AutoKinship tree.

One interesting thing that jumps out at me is that the relationship between Joe Smith, Sue and Sarah is the same in both AutoKinship trees. On MyHeritage Sue has a tree with only 1 person, so that doesn’t provide any information. However, her son Frank has a small tree that indicates his mother’s maiden name was Smith. Sarah also has a small tree and indicates her mother’s maiden name was Smith. From their shared DNA it appears that those connections are through Joe’s grandfather and great grandfather on the Smith side of his family.

Bridget Fenton’s mother, our second great grandmother, was Johanna O’Brien. Bridget was born in 1853 in Limerick and was Johanna’s only child born in Ireland. All her other children were born in the United States and lived there their entire lives. We don’t know who Johanna’s parents or siblings were. It appears that at least one generation is missing here, since Johanna cannot be person #1 in the tree.

Following Joe Smith’s family back, starting with the tree he had and looking up Irish civil birth records and Catholic baptismal records, I discovered that his great grandfather, born 1851, married an O’Brien who was born in 1850. They would be in the same generation as Bridget, born 1853. Both this O’Brien and Bridget Fenton were baptized at the same Catholic parish in Limerick. Unfortunately, the records haven’t survived far enough back to give a baptismal record for either my second great grandmother, Johanna, or Joe’s second great grandfather. My hypothesis is that they were siblings (see figure 22). There is another occurrence of O’Brien in Joe’s tree on his mother’s side. It’s quite possible that Trish and I match on his mother’s side as well, and we just haven’t found that connection yet.

Figure 22. Tree with Bridget’s mother, Johanna O’Brien added.

Among the others in cluster 1, I’ve messaged with Mary. She and her brother, Bob, are second cousins to Joe Smith but not on the Smith line. Mary’s grandmother is sister to Joe’s grandfather. I’ve not found enough records to generate a hypothesis for her connection to Trish and me, since it is different than our connection to Joe. Meagan and Barbara are cousins to each other. They have a private tree and did not reply to messages, so I have no idea how they connect. Ann has a small tree, but all of the people in her tree live in the US. I have not messaged her at this time.

Cluster 7

Cluster 7, shown in figure 23, has several known cousins in it. 

Figure 23. Cluster 7 from the MyHeritage clusters.

Carl is a known second cousin on my Dad’s father’s side, and Carol and Andy are known second cousins twice removed. Our Barry family came from Kilkenny, Ireland.  Edward Barry married Pauline Fröhlich from Baden after both families had immigrated to Evans, Erie, New York.   Figure 24 shows the WATO tree for this side of my family.

Figure 24. WATO tree for known cousins in cluster 7.

With both a second cousin and second cousins twice removed, the generation indicator in the Genetic Affairs AutoKinship setup becomes very important. If Carl and I are set to generation 0, then Carol and Andy would be -2, since twice removed is 2 generations past Carl and me. Figure 25 shows the cluster 7 match tab. It is worth noting that Carol’s and Andy’s setting is based on where they are in relation to our common ancestor and not on when they were born. Their births are both within a few years of my daughter’s birth.

Figure 25. Cluster 7 match table showing relationship of cousins.
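The generation values in the match table follow from counting generations down from the common ancestors. The sketch below shows that arithmetic; the function names are hypothetical, and the sign convention (later generations are negative) is taken from the -2 used above for Carol and Andy.

```python
def depth_below_common_ancestors(cousin_degree, times_removed=0):
    """Generations between a person and the shared ancestor couple.

    A second cousin is 3 generations below the shared great grandparents.
    times_removed counts additional generations further down the tree;
    this sketch does not handle cousins removed in the upward direction.
    """
    return cousin_degree + 1 + times_removed


def generation_level(tested_depth, relative_depth):
    """Generation relative to the tested person (who is generation 0).

    Negative values are later generations, positive values earlier ones.
    """
    return tested_depth - relative_depth


# I am a great grandchild of the common ancestors, like any second cousin.
my_depth = depth_below_common_ancestors(2)

carl_level = generation_level(my_depth, depth_below_common_ancestors(2))
carol_level = generation_level(my_depth, depth_below_common_ancestors(2, 2))

print("Carl:", carl_level)    # second cousin
print("Carol:", carol_level)  # second cousin twice removed
```

Because the calculation depends only on position in the tree, birth years never enter into it, which is why Carol and Andy are -2 even though they are close in age to my daughter.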

The AutoKinship tree for cluster 7 is shown in figure 26.

Figure 26. AutoKinship landing tree for cluster 7.

Looking at the information the matches provided on their MyHeritage pages, Laura has ancestors in Nova Scotia and Prussia. Richard has ancestors from England, several counties in Ireland and Newfoundland, and Mae has ancestors from England and Ireland, specifically from County Kilkenny in Ireland. Based on immigration information for the Barry family, especially not finding any passenger list to the United States, a cousin’s family that immigrated through Canada, and several DNA matches that live in Ontario, the Newfoundland and Nova Scotia connections are not surprising. Our hypothesis is that the family immigrated to Erie, NY, a short distance from Buffalo, an international entry location, after arriving by ship from Ireland to Canada. Since Ireland to Canada would have been within the British Commonwealth, there would have been no passenger lists for the journey.

Conclusion

First there was a DNA match. Then shared matches gave a hint to the family connection. A triangulated match provided a second hint. Next AutoClusters grouped these shared matches together to hint at the relationship between them. And now AutoKinship provides the biggest hint by suggesting how the family tree is connected!

  1. Trish has given me permission to use her real name. No other living people are identified by their real names in this blog.