The AutoKinship tool was introduced on GEDmatch about a year ago. Developed by Evert-Jan Blom, AutoKinship reconstructs trees based on the DNA shared between shared matches. Genetic Affairs offers AutoKinship for 23andMe data, as well as manual AutoKinship. Manual AutoKinship can be performed for any site that allows you to view the amount of cM your matches share with each other. FamilyTreeDNA and Ancestry are the only companies that do not share this information.
When AutoKinship was first introduced for GEDmatch, the clusters were only made of matches that triangulated on segments of DNA. Recently the clustering was updated to include In Common With (ICW) matches that do not have a triangulated segment as well. Although I usually prefer to work with matches that have segment triangulation, clustering approaches work best when employing all ICW matches.
Figure 1 shows a cluster run of 100 kits from February 2022. It produced 17 clusters including 95 matches. Since these clusters only share triangulated segments, there are not many grey cells. I’ve labeled my two paternal second cousins (2C). Trish¹ is on my dad’s mother’s side, and my cousin I’ll call Frank is on my dad’s father’s side. Trish and I share 19 segments of DNA, and Frank and I share 9 segments.
Rerunning my GEDmatch kit with the updated AutoKinship using 100 kits gave 26 clusters and lots of grey cells connecting matches (see figure 2). Again, I’ve labeled my second cousins Trish and Frank.
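For readers who like to see the idea in code, here is a minimal Python sketch of what a grey cell represents: a pair of shared matches whose members sit in different clusters. This is not how AutoKinship builds its chart. The cluster numbers, match letters, and pairs below are all made up for illustration.

```python
def grey_cells(cluster_of, shared_pairs):
    """Return shared-match pairs whose two members are in different clusters."""
    return sorted(
        tuple(sorted(pair))
        for pair in shared_pairs
        if cluster_of[pair[0]] != cluster_of[pair[1]]
    )

# Hypothetical cluster assignments for five matches.
cluster_of = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2}

# Hypothetical pairs that share a triangulated segment, and extra pairs
# that are only "in common with" (ICW) each other.
triangulated = [("A", "B"), ("B", "C"), ("D", "E")]
icw_only = [("C", "D"), ("E", "A")]

print(grey_cells(cluster_of, triangulated))
print(grey_cells(cluster_of, triangulated + icw_only))
```

With only the triangulated pairs there are no links between the two clusters. Adding the ICW-only pairs introduces two, which is the kind of extra connection the updated chart shows as grey cells.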
The Many Files in AutoKinship
To better understand the features of AutoKinship on GEDmatch (available for tier 1 users), we are going to look at what results are included in the AutoKinship run. After unzipping the file, when I first open the AutoKinship folder I find nine folders, two HTML files, and an Excel file (see figure 3). This particular run was for 500 GEDmatch kits that match me.
I like to look at the AutoCluster for my results first. This is the autokinship.html file. If it’s too large to be viewed, the autokinship_no_chart.html file has all the information except for the visual of the clusters, and the Excel file will show the clusters such that the match names can be easily read. My AutoCluster has 482 DNA matches and 87 clusters, so I’ll be using the Excel file to read the names of matches in each of the clusters.
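If you want a quick inventory of an unzipped results folder without clicking through it, a small Python sketch like the one below will do. The function name `summarize_folder` and the folder path in the usage comment are my own placeholders, not anything provided by GEDmatch or Genetic Affairs.

```python
from pathlib import Path

def summarize_folder(folder):
    """List the subfolders and group the files by extension in one folder."""
    folder = Path(folder)
    entries = sorted(folder.iterdir())
    subfolders = [p.name for p in entries if p.is_dir()]
    files_by_ext = {}
    for p in entries:
        if p.is_file():
            # Group file names under their lowercase extension, e.g. ".html".
            files_by_ext.setdefault(p.suffix.lower(), []).append(p.name)
    return subfolders, files_by_ext

# Example usage (the path is hypothetical; point it at your own unzipped folder):
# folders, files = summarize_folder("autokinship_results")
# print(len(folders), "folders:", folders)
# for ext, names in files.items():
#     print(ext, names)
```

For a run like the one described above, this should report nine folders plus the HTML and Excel files.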
Going down the screen below the large clusters in the HTML file is an explanation of each of the items performed in the analysis, as shown in figure 5.
Next is a list of the results from the analysis. A partial list is shown in figure 6. This table shows all of the separate analyses that were performed as part of the AutoKinship analysis. These include the regular AutoCluster analysis, AutoTree (identification of common ancestors), AutoSegment (identification of groups of triangulated segments) and the AutoKinship analyses.
Below the list of results is a listing of all the matches in each cluster. Figure 7 shows the match list for cluster 1. The match name and kit number are given along with the centimorgans shared, the number of shared matches that each match has, whether the match has a gedcom tree on GEDmatch, and the match’s email address. This AutoCluster information includes a listing of matches for all of the clusters.
Going back to the results of the AutoKinship analysis, shown in figure 6, I’m going to explain the various items based on cluster 33, since it has an entry in each column. On the far left is the cluster number. Next is the number of matches in that particular cluster. AutoTree will display a tree that is based on common ancestors identified in gedcoms that the matches in this cluster (and the gedcom linked to the tested person, if available) had posted on GEDmatch. Clicking on the tree icon displays that tree, shown in figure 8, in another tab.
The icon that looks like a book in column 4 displays the common ancestors found in that cluster. This is shown in figure 9. In this case I don’t have any ancestors in Arizona, so it’s only listing some recent common ancestors of people in the cluster.
The next column is location and shows where there are common locations for people in the cluster and the tester. Typically there are several lists of places and matches, but I’ve only shown the first one in the figure. This is when I got super excited. This one has County Limerick, Ireland. Matches in this cluster and I both have ancestors who lived in County Limerick! As shown in figure 10, Jeremiah Fenton, my fourth great grandfather, his son, William, and William’s granddaughter, Bridget Mary Fenton, my great grandmother, all lived in County Limerick.
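The idea behind a common-location list can be sketched as a simple comparison of place names. The Python below is only an illustration of that idea and assumes nothing about how AutoTree actually matches locations. The function name and the sample place lists are hypothetical.

```python
def common_locations(my_places, match_places):
    """Return places found in both lists, ignoring case and extra spaces."""
    def normalize(place):
        return " ".join(place.lower().split())

    # Map each normalized name back to the spelling used in my own tree.
    mine = {normalize(p): p for p in my_places}
    shared = {mine[normalize(p)] for p in match_places if normalize(p) in mine}
    return sorted(shared)

# Hypothetical place lists for the tester and for one match.
my_places = ["County Limerick, Ireland", "Evans, Erie County, New York"]
match_places = ["county limerick,  ireland", "Arizona"]

print(common_locations(my_places, match_places))
```

Real place names are messier than this (spelling variants, parish versus county), so a plain text comparison will miss some genuine overlaps.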
The paternal side of my tree is shown in figure 11. My 2C Trish shares great grandparents, Thomas Byrnes and Bridget Fenton, with me.
Since I know a great deal about my Fenton family, I had to go and look at the two trees listed here for B and J. These would be the gedcoms that they had uploaded to GEDmatch. Michael Carroll and Katherine Callaghan had a child Thomas born about 1830. I looked for a baptismal record for him and found his and five of his siblings’ baptismal records at Dromin & Athlacca Catholic parish. Checking John Grenham’s site, I found that the civil parishes for these churches were Athlacca, Dromin and Uregare. Dromin and Uregare were familiar names, as I know some of my Fentons had lived there. A quick check for Carrolls in Griffith’s Valuation, taken in 1851 in this part of Limerick, found John Carroll, Thomas’ brother, living in Cloonygarra, Dromin. My second great grandfather John Fenton was in Maidenstown, Dromin in Griffith’s Valuation. Figure 12 has a map showing this area of the civil parish of Dromin. These townlands are very near each other.
Getting back to the AutoKinship diagram in figure 6, the icon that looks like an anchor opens a new tab with the AutoKinship tree predictions. These are based only on the shared DNA of the matches and not on any gedcoms they might have added to their GEDmatch profile. The first one that is shown has the highest probability, but there are nine other probability trees. In this particular cluster the top six of mine all have the same probability. Figure 13 has my AutoKinship tree 1.
Below the AutoKinship tree list is a matrix of how the matches relate to each other, shown in figure 14.
Both in the AutoKinship tree and the matrix you can see the parent-child relationship for J and B, as well as the sibling relationship for E and U. (You can click on the siblings and see whether there is fully identical region (FIR) data to back up the sibling claim!) The AutoKinship probability tree suggests that the matches are 4C or 3C1R to me. All of the matches share about 14 cM with me. My known Fenton cousins that share common fourth great grandparents with me share 15.5 cM.
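As a rough sanity check on numbers like these, the textbook average amount of shared DNA for full cousins can be worked out by halving once for every generational step between the two people. The Python sketch below does only that arithmetic. It is not the probability model AutoKinship uses, and the 6800 cM total is an assumed round figure, since testing companies report slightly different totals.

```python
TOTAL_CM = 6800  # assumed genome total in cM; companies use slightly different figures

def expected_cm(cousin_degree, removed=0, total_cm=TOTAL_CM):
    """Average cM expected for full cousins of a given degree and removal."""
    # Each cousin degree adds two generational steps (one on each line),
    # and each "removed" adds one more. Sharing an ancestral couple
    # rather than a single ancestor doubles the total.
    steps = 2 * (cousin_degree + 1) + removed
    return 2 * total_cm * 0.5 ** steps

for label, degree, removed in [("3C", 3, 0), ("3C1R", 3, 1), ("4C", 4, 0), ("5C", 5, 0)]:
    print(f"{label}: {expected_cm(degree, removed):.1f} cM")
```

These are averages over all cousins of that degree, including the many distant cousins who share no detectable DNA at all, so the cousins who do show up on a match list usually sit above the average. That is why a single cM figure can only narrow the relationship to a range.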
To the right of the AutoKinship tree in figure 6 is the AutoKinship tree that includes the AutoTree that is based on the gedcom that the matches loaded to GEDmatch. Figure 15 shows this for the first probability AutoKinship prediction.
The last icon in figure 6 brings up the AutoSegment data in a new tab. The top of the window shows the chromosome(s) where the matches are located. Further down the page is the list with the segment data. These data are shown in figure 16.
Seeing the DNA segments on chromosome 4 here made me go and look at my DNA Painter profile on chromosome 4.
The Fenton 5C are descendants of my William Fenton’s brother Timothy. Our most recent common ancestor couple (MRCA) would be my fourth great grandparents Jeremiah and Norah Fenton. My Fenton line out to Jeremiah is shown in figure 18. Prior to running this GEDmatch cluster I had painted some of the matches who are showing up in this cluster.
Matches from AutoKinship
To go back to the original AutoKinship folder, shown in figure 3, each of the folders contains the data for that particular feature that we saw in the results of the AutoKinship in figure 6. The ‘gedcom’ folder has the AutoTree gedcom for each cluster that had an AutoTree. The ‘gephi’ folder has the data needed for the Gephi software. The ‘matches’ folder contains a cluster of matches for each person that appeared in my match set. For example, this file in the matches folder is for my 2C, Trish.
Her cluster is shown in figure 19. Trish is in the orange cluster 1, and the long line of grey cells shows how all the matches in this cluster are connected to her. In the matches folder there is an HTML file that contains a clustering report of all the ICW matches for each person that is listed as a match to me in the original analysis. This makes for an easy way to find all the shared matches and clustering patterns for each person that matches Trish and me.
I’ve added Mark’s location on Trish’s cluster. Mark is an interesting match to me. We share two segments. One of them is on chromosome 12 and triangulates with Trish and me, and the other is on chromosome 20 and triangulates with Frank and me. Normally when I find a match who shares more than one segment, my first assumption is that both of them connect to the same MRCA. That is certainly the simplest situation. But Mark doesn’t follow that simple assumption. Mark’s father also matches Frank on chromosome 20, so that line has to be on Mark’s father’s side of his family. It turns out Mark’s paternal grandmother was a Byrne, and the segment on chromosome 12 that matches Trish is from his father’s mother’s side. The match file for Mark is in figure 20.
Mark’s family immigrated from Ireland to Canada. There are several triangulated DNA matches with Frank and me who live in Canada. The Aide side of the family immigrated through Buffalo, NY on their way to Wisconsin. Our Barrys settled in Evans, Erie County, just south of Buffalo. No passenger list has been found for the Barrys. Thomas Barry was listed in the 1845 House Books, one of the precursor surveys to Griffith’s Valuation. But he is not listed in the 1848 House Book, which gives a hint as to when the family immigrated. They were listed in Evans in the 1855 New York State census and indicated they had lived there for five years. The hypothesis is that the Barrys immigrated from Ireland to Canada and then to Evans, New York. Since Canada and Ireland were both under British rule, there would be no passenger lists for travel between those two countries. Passage to Canada from Ireland was a lot less expensive than transport to the United States. At that time there was also no paperwork required to cross the border between Canada and the United States, so there were no records.
Exploring ICW Connections
Since the updated AutoKinship on GEDmatch gives information about ICW matches, there are more connections to be discovered. Looking at Frank and the Barry side of my family, our MRCA are Edward Barry and Pauline Fröhlich. Edward was born in Kilkenny, Ireland and Pauline in Baden, Germany. Separating which of our great grandparents a DNA match is related to can often be done based on where the matches’ families lived.
Looking at the 100 kit AutoKinship clusters from figure 2, Frank is in cluster 22. He has two copies of his DNA on GEDmatch. He is the third and fourth member of green cluster 22 and has grey cells to four matches in cluster 25, see figure 21.
Clusters 25 and 26 are particularly interesting in that several of the matches live in County Kilkenny. The MRCA from Kilkenny that Frank and I share was Edward Barry. His parents were Thomas Barry and Mary Aide. Frank and I share a large DNA segment on chromosome 20 (see figure 22). Matt, Dot, and Dan all triangulate with Frank and me there. Dot descends from the Aide side of the family. Our MRCA was likely Mary Kilfoil, but we don’t know if she was Mary Aide’s mother or grandmother. Since a segment of DNA can only come from one ancestor, this large segment on chromosome 20 must be from the Aide side of the family.
Matt and Mary live in Kilkenny. Tom is a descendant of a Barry family that lives in Sugarstown, Kilfane, Kilkenny, which is less than 10 miles from Moanroe Commons where Thomas Barry and Mary Aide lived. Dan’s family was from Counties Wexford and Carlow, which are next to Kilkenny.
Since several of these ICW matches live or have family living in Kilkenny, I decided to look for marriages between Barry or Aide and any of the matches’ surnames. I found a marriage to an Aide in 1806 and followed the children’s baptisms and marriages out for a couple of generations. But then there were no more marriage or baptismal records, and it was too early for anything to be in civil registration. So now I have a small rabbit-hole tree that probably won’t get me any further in figuring out the connection.
AutoKinship provides many different tools for exploring shared matches with your DNA matches. Having all ICW matches available, not just those with segment triangulation, is a real improvement to GEDmatch AutoKinship.
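Triangulation means that three people all match each other on an overlapping stretch of the same chromosome. A minimal Python sketch of that check is below. The function name is my own, and the segment positions are invented for illustration, not taken from my actual GEDmatch data.

```python
def triangulated_region(seg_ab, seg_ac, seg_bc):
    """Return the (chromosome, start, end) stretch common to three pairwise
    segments, or None if they do not all overlap on the same chromosome."""
    chromosomes = {seg_ab[0], seg_ac[0], seg_bc[0]}
    if len(chromosomes) != 1:
        return None
    # The shared stretch runs from the latest start to the earliest end.
    start = max(seg_ab[1], seg_ac[1], seg_bc[1])
    end = min(seg_ab[2], seg_ac[2], seg_bc[2])
    if start >= end:
        return None
    return (seg_ab[0], start, end)

# Hypothetical segments as (chromosome, start, end), positions in megabases.
me_frank = (20, 10.0, 45.0)
me_dot = (20, 18.0, 40.0)
frank_dot = (20, 15.0, 38.0)

print(triangulated_region(me_frank, me_dot, frank_dot))
```

The check needs all three pairwise comparisons. Two people who each match me at the same spot but do not match each other there may be on opposite sides of my family.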
- Trish has given permission for me to use her real name. All other living persons’ names are fictitious.