Once a genealogical cluster has been determined and NPE submissions are added to the cluster, the next step is to analyze the submissions of the cluster in order to estimate the haplotype (DNA values) of the common ancestor of the cluster. All the submissions are the DNA of the donors – not our ancestors. The DNA of our ancestors can only be estimated based on the DNA of their descendants. It is very important to only include submissions that could be closely related in this cluster. If improper submissions are included, then any estimates of the DNA values of the common ancestor could be incorrect. Some NPE (adopted) lines should be included in the cluster for analysis but only NPE lines that have a high probability of being related to the surname of interest should be included.
The common ancestor of any genealogical cluster is called the Most Recent Common Ancestor (MRCA). The DNA marker values of any donor are called the haplotype of the donor. The DNA values of the MRCA are called the MRCA haplotype. Most genealogists need assistance in defining the MRCA haplotype of previously unconnected lines in the 200 to 300 year time frame where traditional research usually hits a brick wall. There are obvious exceptions: some family histories are fortunate to have much earlier traditional research. If your cluster has located solid source documentation that has an older proven ancestor, you would need to adjust the time frame to match the traditional research.
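As a rough illustration of estimating an MRCA haplotype from donor haplotypes, the sketch below takes the most common (modal) value at each marker. The modal approach is an assumption made here for illustration, not necessarily the method used for the Casey clusters, and the donor names and marker values are made up.

```python
from collections import Counter


def estimate_mrca_haplotype(submissions):
    """Return the modal value per marker; None where the top values tie."""
    markers = set()
    for haplotype in submissions.values():
        markers.update(haplotype)

    mrca = {}
    for marker in sorted(markers):
        # Only donors who actually tested this marker contribute a value.
        values = [h[marker] for h in submissions.values() if marker in h]
        counts = Counter(values).most_common()
        if len(counts) > 1 and counts[0][1] == counts[1][1]:
            mrca[marker] = None  # tie: more submissions are needed
        else:
            mrca[marker] = counts[0][0]
    return mrca


# Hypothetical donors and values, for illustration only.
cluster = {
    "donor_a": {"385b": 14, "460": 12, "607": 15},
    "donor_b": {"385b": 14, "460": 13, "607": 15},
    "donor_c": {"385b": 15, "460": 12, "607": 15},
}
print(estimate_mrca_haplotype(cluster))
```

A tie is reported as None rather than guessed, which mirrors the point that the estimate is only as good as the submissions included in the cluster.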
There is a common misconception that any submission that has similar DNA must be closely related. Even with 67 markers, it is a common scenario that a grouping of submissions with similar DNA may not be related as suspected. If any grouping of submissions has too few unique DNA values, these submissions may not be related as the MRCA calculators and other publications suggest. The more markers tested, the higher the probability that more unique DNA values will be discovered. Common DNA fingerprints can be shared by individuals that are not closely related. It is important that your genealogical cluster only includes submissions that really belong to the cluster. If any submissions do not belong to the cluster, then the determination of the MRCA haplotype will be less accurate.
This scenario has been called "overlapping haplotypes" by Mark Jobling. It is estimated that around 25% of today's groupings will have overlapping haplotypes at 37 markers. The surname itself may be enough to separate overlapping haplotypes of other surnames. This makes adding NPE submissions very risky for these groupings. There are a few groupings where there are obvious signs of overlapping haplotypes even with the surname used as a filter. Some groupings must initially include multiple genealogical clusters as there are just too many DNA variations to belong to one true genealogical cluster. Estimating the MRCA haplotype for any grouping that includes multiple unrelated genealogical clusters is very problematic. Estimating the MRCA haplotype of the Casey R1b1a2 grouping is not really possible until this grouping can be broken up into groupings that are genealogically significant.
There are two methods to break up these groupings into genealogical clusters. First, by upgrading to 67 or 111 markers, many submissions will reveal more genetic differences that can be used to break up the grouping into multiple genealogical clusters. Second, submissions cannot be closely related if they do not share the same deep ancestry (haplogroup). Testing Y-SNP markers (deep clade or special order Y-SNPs) will often reveal different deep ancestry for submissions that originally appear to be closely related by only Y-STR testing. If submissions do not share a common deep ancestry, it is virtually impossible that they share recent connections.
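The haplogroup check can be sketched in a few lines. This sketch assumes longhand haplogroup labels of the kind used in this analysis (such as R1b1a2), where a deeper branch extends the label of its parent, so two labels are treated as compatible when one is a prefix of the other. The donor names and the third haplogroup are hypothetical.

```python
from itertools import combinations


def haplogroups_compatible(a, b):
    """True when one longhand label is a prefix of the other (same branch)."""
    return a.startswith(b) or b.startswith(a)


def incompatible_pairs(haplogroups):
    """List donor pairs whose haplogroups rule out a recent connection."""
    return [
        (x, y)
        for x, y in combinations(sorted(haplogroups), 2)
        if not haplogroups_compatible(haplogroups[x], haplogroups[y])
    ]


# Hypothetical donors, for illustration only.
tested = {
    "donor_a": "R1b1a2",         # coarse label, e.g. estimated from Y-STRs
    "donor_b": "R1b1a2a1a1b5a",  # deeper label from Y-SNP testing
    "donor_c": "I2a",            # a different deep ancestry
}
print(incompatible_pairs(tested))
```

Note that a compatible pair is not proof of a shared branch: a coarse label such as R1b1a2 simply does not rule the connection out, which is why deeper Y-SNP testing is still needed.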
The Casey R1b1a2 grouping of submissions probably includes overlapping haplotypes of at least two Casey genealogical clusters. It is hoped that testing Y-SNPs (deep ancestry) for this grouping will divide this grouping into at least two genealogical clusters. Most submissions in this grouping only have their haplogroup estimated by Y-STR markers which is not as accurate or detailed as true Y-SNP testing. Other submissions tested their deep ancestry several years ago before many new branches were discovered. Only one submission in the Casey R1b1a2 grouping has a recent deep ancestry test and it reveals 13 levels of branches (this is much more than the estimates of 6 levels of branches). Deep ancestry may be determined by only one SNP marker, SRY2627 (which reveals haplogroup R1b1a2a1a1b5a).
There are two clusters that are defined by unique DNA haplogroups that do not have deep Irish ancestry (2,000 to 20,000 years ago). These unique Casey clusters obtained their non-Irish deep ancestry in one of two ways. The most likely scenario is that these lines are NPE births of a non-Casey male. However, some very Irish people will be surprised that they do not have Irish deep ancestry. These people could have lived in Ireland for over 1,000 years but still may not have deep Irish ancestry. It only takes one male immigrant with non-Irish deep ancestry in the last 10,000 years to create a non-Irish deep ancestry. In either scenario, these unique deep ancestries provide a distinctive DNA fingerprint for these two clusters.
For these non-Irish clusters, it may be difficult to determine the MRCA haplotype since there will probably be only a few submissions assigned to these clusters. The ability to determine an MRCA haplotype will depend on when the NPE event happened. If the NPE event was very recent (100 to 150 years ago), there will not be many submissions assigned to these clusters. If the NPE event occurred long ago, then the cluster could become a much larger cluster at some future date.
The South Carolina Casey cluster and Munster, Ireland Casey cluster are two different genealogical clusters in the 200 to 300 year time frame. However, it appears that both clusters could share a common male ancestor in the 600 year time frame. This means that the MRCA of each cluster is related in some fashion. The MRCA of each cluster could descend from a common MRCA of both clusters. Another alternative is that one cluster is a branch of the other cluster. The analysis of the L226 deep ancestry implies that the South Carolina cluster and the Munster, Ireland cluster descend from a common male ancestor after L226 (1,300 years ago). Of the 14 surname clusters currently known in the L226 cluster, only the two Casey clusters share the common mutation 534 (15 to 14). Additionally, the MRCA of the Munster, Ireland cluster is only 2 or 3 mutations from the L226 MRCA while the MRCA of the South Carolina cluster is an amazing 8 mutations from the L226 MRCA. Knowing the relationship between these two clusters is very important for determining MRCA haplotypes of each genealogical cluster.
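Counting mutations from a reference haplotype, and finding the mutations two clusters share, can be sketched as follows. This is a minimal illustration that counts each differing marker once (some analysts instead count the size of each step), and every marker value below is made up rather than taken from the L226 or Casey data.

```python
def mutated_markers(haplotype, reference):
    """Return {marker: (reference_value, observed_value)} where they differ."""
    return {
        marker: (reference[marker], value)
        for marker, value in haplotype.items()
        if marker in reference and value != reference[marker]
    }


def shared_mutations(first, second, reference):
    """Mutations away from the reference that both haplotypes carry."""
    a = mutated_markers(first, reference)
    b = mutated_markers(second, reference)
    return {m: change for m, change in a.items() if b.get(m) == change}


# Hypothetical haplotypes, for illustration only.
reference_mrca = {"534": 15, "460": 11, "437": 15, "607": 15}
cluster_one_mrca = {"534": 14, "460": 11, "437": 15, "607": 15}
cluster_two_mrca = {"534": 14, "460": 12, "437": 16, "607": 15}

print(len(mutated_markers(cluster_one_mrca, reference_mrca)))
print(len(mutated_markers(cluster_two_mrca, reference_mrca)))
print(shared_mutations(cluster_one_mrca, cluster_two_mrca, reference_mrca))
```

A mutation shared by two clusters and by no other cluster under the same reference is the kind of evidence that suggests a common ancestor more recent than the reference itself.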
FILTERING OUT MUTATIONS
Another common problem in a lot of DNA analysis is that few DNA genealogists attempt to separate "genealogically significant" DNA mutations from "recent" DNA mutations. It is extremely important for genealogists to separate the DNA mutations for generations close to the donor (not genealogically significant) from the mutations that are close to their oldest proven ancestors (which are genealogically significant). Mutations occur randomly anywhere in the all-male line from the oldest proven ancestor (or his male ancestry) all the way down to the donor himself. DNA mutations that occur within the three or four generations closest to the donor only assist in defining branches of well-known events (i.e., separating your grandfather's line from his brother's line). If these connections are already well proven, these "recent" DNA mutations do not help solve the task of connecting various lines that have no proven connections. In fact, these "recent" DNA mutations can hinder the analysis, as many genealogists treat them as "genealogically significant" DNA mutations. The analysis of any cluster requires filtering out "recent" mutations that are not genealogically significant.
For example, the DNA submission of the John/Levi/Francis Casey line definitely has at least one "recent" DNA mutation. The mutation 385b (14 to 15) only separates the descendants of Francis Casey from his brother William Casey. Since this event is already proven and does not provide any new insight to the ancestry of John Casey (MO), it would be considered a "recent" DNA mutation. The verdict is not as certain for Francis' mutation 460 (12 to 11). The highest probability scenario is that this is yet another "recent" DNA mutation. However, this DNA marker is extremely significant to the South Carolina cluster. The DNA marker has two very important characteristics that make this DNA marker the most important DNA marker for the South Carolina cluster at this point in time.
First, the 460 marker separates the South Carolina Casey cluster into two large subgroups of lines within the South Carolina cluster (about half the lines have 460 = 12 and half have 460 = 13). Second, the John/Levi/William line has 460 = 11, which is a third variation of this significant marker, and this variation matches the marker value of the submissions in the Munster, Ireland cluster. Since both of these clusters could belong to a common genetic cluster, there is a small possibility that the John Casey (MO) line could replace the John Casey (SC) line as the line that most resembles the DNA marker set of the progenitor of the South Carolina cluster. Additional submissions from the John Casey (MO) line have some chance to radically change the Casey DNA descendancy chart (the odds are higher that they will not change these charts, but this remains a worthwhile opportunity to really shake things up). It is more probable (but not certain) that 460 = 11 is another "recent" backwards mutation to the original value of 460 = 11 (the most likely marker value for the common male ancestor born 300 to 400 years ago).
Another example of a "recent" DNA mutation is that of the Henson/Jackson Casey line. The mutation 607 (15 to 16) appears to only distinguish Jackson Casey from his brother Arvle Casey. However, the DNA mutation 460 for these two brothers presents a scenario that requires more DNA submissions to solve. There are two reasonable scenarios: 1) Arvle Casey has the mutation 460 (12 to 13); or 2) Jackson Casey has the mutation 460 (13 to 12). For the Henson Casey line, this particular mutation has a dramatic impact on how the Henson Casey line fits into all other lines. One scenario implies Henson Casey is closely related to Pleasant Casey and the Hanvey line. The other scenario implies that Henson Casey is closely related to the John Casey (SC) line, the Moses Casey (SC) line, etc. A third submission from another son of Henson Casey is required to settle which major grouping this line belongs to.
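The reason a third son settles the question can be shown with a small sketch. It assumes the father's value is whichever value the majority of his tested sons carry, which is a simplification adopted here for illustration. The values for Jackson and Arvle follow the two scenarios above (12 and 13 respectively); the third son's value is hypothetical.

```python
from collections import Counter


def infer_father_value(son_values):
    """Return the majority marker value among sons, or None if undecided."""
    counts = Counter(son_values).most_common()
    if not counts:
        return None  # no sons tested
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie: cannot tell which son carries the mutation
    return counts[0][0]


# Marker 460: Jackson = 12, Arvle = 13 (from the two scenarios above).
print(infer_father_value([12, 13]))

# Adding a hypothetical third son with 460 = 13 breaks the tie.
print(infer_father_value([12, 13, 13]))
```

With only two sons the function returns None, and with the hypothetical third son it returns 13, which would point to Jackson Casey carrying the mutation 460 (13 to 12).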
Another example is a DNA mutation that is currently assumed to be "genealogically significant" but could actually be a "recent" mutation. The mutation 437 (15 to 16) of the James Casey (SC) line has been assumed to be genealogically significant, which may not be the case. Another descendant of James Casey (SC) (via a different son or grandson of James Casey) may not have the DNA mutation 437 (15 to 16). The current donor of this line (or his father or grandfather) may be the source of the mutation. This would push this mutation further down on the DNA descendancy chart and make this line more closely related to the John Casey (SC) line. Only additional submissions of this line will allow this mutation to be classified as "genealogically significant" or just another "recent" mutation. All lines that have a mutation from the MRCA haplotype require two submissions (of different sons or grandsons of the oldest proven ancestor) to validate that the mutation is genealogically significant (that is, that it helps define a genetic branch close to the oldest proven ancestor).
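The two-submission rule can be expressed as a simple check. In this sketch each submission records the branch (son or grandson of the oldest proven ancestor) through which the donor descends. The record layout, branch names, and the three result labels are assumptions made for illustration.

```python
def classify_mutation(marker, mutated_value, submissions):
    """Classify a mutation using submissions from one oldest proven ancestor."""
    branches_with = set()
    branches_without = set()
    for sub in submissions:
        value = sub["haplotype"].get(marker)
        if value is None:
            continue  # this donor did not test the marker
        if value == mutated_value:
            branches_with.add(sub["branch"])
        else:
            branches_without.add(sub["branch"])

    if not branches_with:
        return "not observed"
    if len(branches_with) >= 2:
        # Two different branches carry it, so it sits at or above the ancestor.
        return "genealogically significant"
    if branches_without:
        # Another tested descendant lacks it, so it arose further down the line.
        return "recent"
    return "unconfirmed"


# Hypothetical submissions for one line, using mutation 437 (15 to 16).
one_donor = [{"branch": "son_1", "haplotype": {"437": 16}}]
print(classify_mutation("437", 16, one_donor))

second_donor_matches = one_donor + [{"branch": "son_2", "haplotype": {"437": 16}}]
print(classify_mutation("437", 16, second_donor_matches))

second_donor_differs = one_donor + [{"branch": "son_2", "haplotype": {"437": 15}}]
print(classify_mutation("437", 16, second_donor_differs))
```

A single submission always comes back as "unconfirmed", which is the situation described for the James Casey (SC) line until a second descendant is tested.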