After the first phase of determining the genealogical cluster, the next phase is to add high-probability NPE submissions to the cluster. There is a very common misconception that very similar Y-STR DNA must mean that all submissions are closely related. This rule of thumb is only somewhat accurate, and only when all submissions also share the same surname. For very common surnames that have 50 or more genetic origins, even similar DNA and a common surname do not guarantee that submissions are related, since so many genetic origins can create overlapping genealogical clusters. There are two common sources of similar DNA with different surnames: 1) NPE-related births (great for connecting lines with different surnames); 2) overlapping haplotypes (common DNA marker values that can produce false hits). The Casey R1b1a2 grouping shows significant signs of overlapping haplotypes. The Casey South Carolina cluster has several rare DNA marker values and shows little evidence of overlapping haplotypes. The Munster, Ireland cluster shows some signs of overlapping haplotypes, but some NPE connections are possible as well.
There are some good methods to determine the probability of NPE events (different surnames with similar DNA) vs. just common DNA values (overlapping haplotypes). First, look for a unique fingerprint that is revealed by shared rare marker values. This is the primary and most reliable method for separating good NPE candidates from overlapping haplotypes. A shared, unique DNA fingerprint provides more proof of a connection than similar DNA values alone. Another reliable method is to search a public database (Y-Search) where submissions are not limited by surname. If the search results include many submissions spread across numerous surnames, this is evidence of common DNA values. If over 90 % of the submissions within a few mutations have the same surname, then NPE events are much more likely.
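The fingerprint check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the 5 % rarity cutoff, and every marker value and frequency in the example are made up for demonstration and are not taken from any real submission or published frequency table.

```python
from typing import Dict, List, Tuple

Haplotype = Dict[str, int]                 # marker name -> repeat count
ValueFreq = Dict[Tuple[str, int], float]   # (marker, value) -> share of a reference population

def shared_rare_markers(a: Haplotype, b: Haplotype,
                        value_freq: ValueFreq,
                        rare_below: float = 0.05) -> List[str]:
    """Return markers where both submissions carry the same value AND that
    value is rare in the reference population (assumed cutoff: 5 %)."""
    shared = []
    for marker, value in a.items():
        if b.get(marker) != value:
            continue                       # values differ, or marker not tested in b
        freq = value_freq.get((marker, value))
        if freq is not None and freq < rare_below:
            shared.append(marker)          # shared and rare: part of the fingerprint
    return sorted(shared)

# Illustrative values only (hypothetical submissions and frequencies).
sub_a = {"DYS390": 24, "DYS391": 10, "DYS439": 13}
sub_b = {"DYS390": 24, "DYS391": 10, "DYS439": 12}
freqs = {("DYS390", 24): 0.45, ("DYS391", 10): 0.03, ("DYS439", 13): 0.02}

print(shared_rare_markers(sub_a, sub_b, freqs))
```

In this made-up example both submissions match on DYS390 and DYS391, but only the DYS391 value is rare, so only that marker counts toward a fingerprint. A match on a common value says very little by itself.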
There are some common-sense rules of thumb that can be used. I would search Y-Search at 32 markers with a maximum of 6 mutations. Choosing 32 markers will catch the markers common to the FTDNA 37-marker tests and the Ancestry.com 43-marker tests. If no surname accounts for more than about 10 % of the submissions, this implies overlapping haplotypes (common DNA marker values). If 90 % of all submissions share a single surname, the other 10 % make much better NPE candidates. If 90 % are split between two surnames, this implies there could be an early NPE connection between the two surnames. Submissions that come back with an extremely wide variety of surnames cannot all be closely related, since NPE rates cannot vary that dramatically between groupings of submissions.
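As a rough sketch, the surname rules of thumb above could be applied to a list of surnames returned by a search like this. The function name and the labels it returns are hypothetical, the surnames other than Casey are placeholders, and reading the 10 % rule as "no single surname exceeds 10 % of the results" is an assumption.

```python
from collections import Counter
from typing import List

def classify_surnames(surnames: List[str],
                      dominant: float = 0.90,
                      scattered: float = 0.10) -> str:
    """Apply the 90 % / 10 % rules of thumb to the surnames in a search result."""
    counts = Counter(s.strip().lower() for s in surnames)
    total = sum(counts.values())
    if total == 0:
        return "no_results"
    ranked = counts.most_common()              # most frequent surname first
    top = ranked[0][1] / total
    second = ranked[1][1] / total if len(ranked) > 1 else 0.0
    if top >= dominant:
        return "single_surname"                # the remaining surnames are NPE candidates
    if top + second >= dominant:
        return "two_surnames"                  # possible early NPE between the two
    if top <= scattered:
        return "overlapping"                   # common marker values, many unrelated lines
    return "inconclusive"

# Hypothetical search result: 19 Casey submissions and one other surname.
print(classify_surnames(["Casey"] * 19 + ["Smith"]))   # single_surname
```

The thresholds are parameters rather than constants because they are rules of thumb; results that fall between them are reported as inconclusive instead of being forced into a category.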
The analogy of a river can help you visualize the existence of overlapping haplotypes. If the river is 100 yards wide, most of the water probably flows down the middle 50 yards, where the river is deepest. Common DNA values are those in the middle of the heavy flow of water. You want to be on the edge of the river, where rare marker values reside. DNA markers constantly add and delete strings of rungs, but their values tend to migrate back to the most probable values in the middle of each marker's range.
Once the uniqueness of the DNA has been determined, you should then look for traditional information that supports the DNA evidence. For the NPE event to have happened in the last 200 or 300 years, the two lines (with two different surnames) must have lived in the same geographic area during a common time frame. Once a common geographic connection and a common time frame are found, you then need to look for supporting traditional documentation that connects the two lines (some examples: living within 10 households of each other in a census record, intermarriages between the two lines, probate records that mention both lines, etc.).
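The place-and-time check above is simple enough to sketch in code. The record layout, the function name, and the county names and years in the example are all hypothetical; real residence data would need more careful handling of place-name variants and boundary changes.

```python
from typing import List, NamedTuple, Tuple

class Residence(NamedTuple):
    place: str   # e.g. a county or parish name
    start: int   # first year the line is documented there
    end: int     # last year the line is documented there

def common_residences(line_a: List[Residence],
                      line_b: List[Residence]) -> List[Tuple[str, int, int]]:
    """Return (place, first_year, last_year) for every period in which
    both lines are documented in the same place."""
    overlaps = []
    for ra in line_a:
        for rb in line_b:
            if ra.place.strip().lower() != rb.place.strip().lower():
                continue                     # different places
            start = max(ra.start, rb.start)
            end = min(ra.end, rb.end)
            if start <= end:                 # the two date ranges intersect
                overlaps.append((ra.place, start, end))
    return overlaps

# Hypothetical residence histories for two lines with different surnames.
line_one = [Residence("County A", 1790, 1820), Residence("County B", 1821, 1850)]
line_two = [Residence("county a", 1810, 1840)]

print(common_residences(line_one, line_two))   # [('County A', 1810, 1820)]
```

An overlap found this way only shows that an NPE was possible in that place and period. The census, marriage, and probate records are still what tie the two lines together.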
Finding NPE connections is an iterative process. Once traditional research shows that both lines could be connected, you should go back and look at the DNA evidence for common mutations. If both lines also share rare marker values, this further strengthens the case for a connection between the lines. Another piece of genetic evidence is the mutation rate of the markers where the common mutations are found. If the mutation rate is very fast, a shared value is much weaker evidence of a connection, since fast-mutating markers change back and forth at a rapid rate and can reach the same value independently. The slower the mutation rate of the markers carrying the shared rare values, the stronger the implied connection.
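One way to apply the mutation-rate point is to sort the shared markers into slow and fast groups before weighing them. The sketch below assumes a table of per-marker mutation rates is available; the function name, the cutoff, and the rates in the example are placeholders and are not published values.

```python
from typing import Dict, List

def split_by_mutation_rate(shared_markers: List[str],
                           rates: Dict[str, float],
                           fast_at_or_above: float = 0.005) -> Dict[str, List[str]]:
    """Group shared markers by how much weight a match deserves.
    'slow' markers carry the most weight, 'fast' markers the least."""
    groups: Dict[str, List[str]] = {"slow": [], "fast": [], "unknown": []}
    for marker in shared_markers:
        rate = rates.get(marker)
        if rate is None:
            groups["unknown"].append(marker)   # no rate on file: judge it by hand
        elif rate >= fast_at_or_above:
            groups["fast"].append(marker)      # matches here may be coincidence
        else:
            groups["slow"].append(marker)      # matches here are stronger evidence
    return groups

# Placeholder rates (mutations per generation); NOT published figures.
example_rates = {"MARKER_SLOW": 0.0005, "MARKER_FAST": 0.02}
shared = ["MARKER_SLOW", "MARKER_FAST", "MARKER_UNLISTED"]

print(split_by_mutation_rate(shared, example_rates))
```

Keeping the unknown markers separate avoids silently treating a marker with no rate on file as either strong or weak evidence.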
Establishing genetic connections is really only an extension of traditional genealogical research, and the case is built in much the same way. You must evaluate both traditional source documentation and genetic source documentation to build a case for NPE connections. Building a case for connections via traditional research is often based on a "preponderance of evidence," where many documents combined present enough evidence to support the connection between individuals. To determine possible NPE candidates, you start with DNA evidence (similar DNA) that implies a relationship. You must then evaluate the DNA evidence in further detail (dominant surnames in submissions, rare DNA values, shared rare DNA values, and the mutation rate of the DNA markers that have common mutations). If multiple sources of DNA information continue to support the NPE connection, you then look for traditional documentation that makes the connection plausible. If your list of evidence is substantial, you may be able to declare a connection on a "preponderance of evidence" drawn from both DNA evidence and traditional documentation.