Analysis of DNA for Brooks lines
Last updated on December 28, 2006
By Robert Brooks Casey - descendant of
Robert Brooks - d. 1805 Mecklenburg County, VA
The analysis of DNA markers provides a new opportunity for genealogists
to unravel their family history. This new tool is now producing
results that can take some of the guesswork out of adding
another ancestor to our family trees. Historically, our traditional
research is heavily influenced by geography, family naming patterns,
migration patterns, etc. This approach most often leads us to discover
new ancestors, but it can also lead us in the wrong direction.
Your particular oldest proven ancestor may have broken away from his family
connections and traditions. Your oldest proven ancestor may not have
named his children after his older generation of relatives or may
have moved to new areas where no siblings or cousins lived, etc.
Analysis of DNA markers allows us to identify which Brooks lines
look encouraging as potential relatives and helps us avoid research
on unrelated lines that ended up in the same county by chance.
Brooks is a common surname; therefore, Brooks researchers
should expect to regularly encounter unrelated Brooks families in the
same counties during the same time period.
Unfortunately, DNA analysis is best suited to tracing your "all male" line of ancestors; basic biology limits this kind of DNA research to that line. A male researcher can therefore submit his own DNA only for his own surname. For the other surnames among your ancestors, you have to get a cousin born with the surname of interest to submit a DNA sample, and you can "sponsor" the submission (or assist with its expense, based on your mutual interest in your common ancestor). There is also a DNA test for women that traces their "all female" lines. Under European surname practices, women take the surname of their husband at marriage, which results in a different surname in every generation of an all female line. This constant change of surname makes such a line more difficult to trace than one surname over many generations. Additionally, the female DNA is much slower to mutate, which further reduces its usefulness for genealogical research. Male DNA markers are like probate records and census records: our best primary sources, yielding maximum information. Female DNA tests are equivalent to tax records and property records: they produce fewer results but can be very useful when census records and probate records fail. We should search our male DNA lines first and later supplement them with research on female DNA lines, when further analysis of the male DNA lines no longer yields enough information to break through to another generation.
Currently there are 42 sets of DNA markers to analyze for the Brooks surname: 12 submissions with 12 markers, 15 with 25 markers, 12 with 37 markers and 3 with 67 markers. There are six pairs of submissions that match within this project; however, several of these pairs do not have the same number of markers, or have too few markers to determine accurately whether the submissions are really closely related or the matches are just false hits. Without traditional genealogical documentation, it is impossible to determine whether these are significant matches where connections have not been made to date or only matches that are already proven by traditional research. There are also six other submissions with only one marker deviation, one with two markers that are different, and two with three markers that are different. Again, some do not have a common number of markers, and without matching traditional documentation, some of these matches may be between individuals already known to be closely related. It is unknown whether any of my cousins have submitted to this project (my line has around 7,500 descendants documented to date). It is obvious from the submissions to date that Brooks is a relatively common surname and that many of these Brooks lines have no common ancestor after the time when most Western Europeans started using surnames. The Brooks surname is based on geographic origins (living near a brook), and many unrelated lines adopted it when our ancestors first started using surnames (primarily from 1100 to 1500). There are twelve submissions that match or have only one marker difference. These submissions, if not already known to be related, would be closely related and warrant an exchange of information between the submitters. The goal of any DNA project is to discover clusters of very similar DNA marker sets, which imply close relationships.
Submitters need not only to submit their DNA but also to assist in locating other "all male" cousins (or suspected cousins) and get these people to submit their DNA. The objective is to form several clusters of related lines and concentrate traditional research on these newly discovered clusters of closely related lines.
Any DNA research project can benefit from a coordinated compilation of the DNA documentation. This is also true of publishing a family history or being involved with a surname organization. As with traditional genealogical research, some researchers tend to organize and document findings, others travel to courthouses and gather those important source documents, and some are really good at contacting many cousins and gathering more recent descendants. This analysis is an attempt to coordinate such documentation. It is hard for many individuals to know which of the DNA samples already submitted could benefit from an expanded number of markers. After analyzing all the submissions to date, here is a summary of the most closely related lines:
14383 (37) and 33138 (37) - match (both 37 markers)
32321 (37) and 42464 (25) - match (37 / 25 markers)
42798 (25) and 61478 (25) - match (both 25 markers)
48392 (25) and 22789 (25) - match (both 25 markers)
28312 (25) and 24831 (12) - match (25 / 12 markers)
45365 (12) and 47968 (12) - match (both 12 markers)
14383 (37) and 43358 (25) - one mutation (37 / 25 markers)
45522 (37) and 42798 (25) - one mutation (37 / 25 markers)
73766 (25) and 48392 (25) - one mutation (both 25 markers)
42798 (25) and 73159 (25) - one mutation (both 25 markers)
42857 (37) and 51227 (37) - two mutations (both 37 markers)
These closely related submissions do not necessarily mean that DNA analysis has produced significant new information. There are three scenarios: 1) These lines are already known to be related by traditional research (we just do not have this information available at the DNA web site); 2) These lines have a potential for being closely related but do not have enough markers to show how closely they are related; 3) These lines are closely related and traditional research has not yet connected them with proper documentation (these submitters should contact each other and share traditional research).
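The pairwise comparison behind a summary like the one above can be sketched in a few lines of Python. This is only an illustration, not the method used to prepare this analysis: the function name and the marker values are hypothetical, and it assumes both submissions list their markers in the same order, so that a 37 marker set can be compared with a 25 marker set over the first 25 markers only. It counts how many markers differ, which is how differences are described here.

```python
def marker_differences(a, b):
    """Count markers that differ over the markers both submissions share.

    a and b are lists of marker values in the same marker order; the
    shorter list decides how many markers can be compared.
    """
    shared = min(len(a), len(b))
    diffs = sum(1 for x, y in zip(a[:shared], b[:shared]) if x != y)
    return diffs, shared


# Hypothetical marker values, not taken from any real submission.
kit_a = [13, 24, 14, 11, 11, 14, 12, 12, 12, 13, 13, 29, 17, 9, 10]
kit_b = [13, 24, 14, 10, 11, 14, 12, 12, 12, 13, 13, 29]

diffs, shared = marker_differences(kit_a, kit_b)
print(f"{diffs} difference(s) over {shared} shared markers")
```

Reporting the number of shared markers next to the difference count keeps a 12 marker match from being read as if it were as strong as a 37 marker match.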
After analyzing all the 37 marker submissions, there are only two submissions that match. These are the only closely related lines found with 37 markers. The next closest pairs have two mutations (1 pair), 3 mutations (1 pair) and 5 mutations (1 pair). Therefore, only one pair appears to be closely related from a DNA point of view. There are many other pairs that have the potential to be closely related but need additional markers analyzed to determine if there really is a close relationship. Below are recommended upgrades where there are currently not enough markers to draw significant conclusions. First, only two submissions could benefit from being upgraded from 37 markers to 67 markers (assumes that this relationship is not already known from traditional research):
14383 (37 to 67 marker upgrade) - exact match with 33138
33138 (37 to 67 marker upgrade) - exact match with 14383
There are eight submissions that could benefit from upgrading from 25 markers to 37 markers (assumes that relationship is not already known from traditional research):
42798 (25 to 37 marker upgrade) - exact match with 61478
and one marker difference from 73159 & 45522
48392 (25 to 37 marker upgrade) - exact match with 22789
and one marker difference from 73766
22789 (25 to 37 marker upgrade) - exact match with 48392
and one marker difference from 73766
42464 (25 to 37 marker upgrade) - exact match with 32321
43358 (25 to 37 marker upgrade) - exact match with 14383
61478 (25 to 37 marker upgrade) - exact match with 42798
73766 (25 to 37 marker upgrade) - one marker from 48392
73159 (25 to 37 marker upgrade) - one marker from 42798
There are four submissions that could benefit from upgrading from 12 markers to 25 or 37 markers (assumes that relationship is not already known from traditional research):
24831 (12 to 25 marker upgrade) - exact match with 28312
45365 (12 to 25 marker upgrade) - exact match with 47968
47968 (12 to 25 marker upgrade) - exact match with 45365
69789 (12 to 25 marker upgrade) - one marker from 22789
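The pattern in the upgrade lists above can be sketched as a simple rule. This is my reading of the lists and an assumption, not a stated procedure: when two closely matching submissions have different marker counts, the smaller one is brought up to the larger; when they have the same count, both move to the next test size. The kit labels and the function names below are hypothetical.

```python
TIERS = [12, 25, 37, 67]  # test sizes mentioned in this analysis


def next_tier(markers):
    """Return the next larger test size, or None if already at the largest."""
    for tier in TIERS:
        if tier > markers:
            return tier
    return None


def suggest_upgrades(close_pairs, marker_counts):
    """Map each kit that could usefully upgrade to a suggested marker count."""
    suggestions = {}
    for kit_a, kit_b in close_pairs:
        count_a, count_b = marker_counts[kit_a], marker_counts[kit_b]
        if count_a == count_b:
            # Same size: both need more markers to sharpen the comparison.
            target = next_tier(count_a)
            targets = {kit_a: target, kit_b: target}
        elif count_a < count_b:
            targets = {kit_a: count_b}
        else:
            targets = {kit_b: count_a}
        for kit, target in targets.items():
            if target is not None:
                # Keep the largest suggestion if a kit appears in several pairs.
                suggestions[kit] = max(target, suggestions.get(kit, 0))
    return suggestions


# Hypothetical kit labels and marker counts.
marker_counts = {"kit1": 37, "kit2": 37, "kit3": 25, "kit4": 12}
close_pairs = [("kit1", "kit2"), ("kit1", "kit3"), ("kit3", "kit4")]
print(suggest_upgrades(close_pairs, marker_counts))
```

As the lists themselves note, a rule like this only makes sense where the relationship is not already known from traditional research.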
Our goal is to get several clusters of Brooks lines that help establish recent common ancestry between various Brooks submissions. There are two basic methods to get these clusters started: 1) Actively recruiting submitters who are descendants of sons of the same ancestor or are believed to be closely related; 2) Random submissions, which will eventually begin to cluster as the sample size grows. Once the number of submissions greatly expands, a major benefit will start to emerge. It will become obvious that several diverse Brooks lines are more closely related than traditional research has shown to date. DNA documentation can help genealogists better select which "possibly" related lines to research based solely on DNA evidence. Researching these newly discovered potential relationships through traditional genealogical methods may result in locating supporting documentation and may be the key to getting past that brick wall. Currently, no major cluster of submissions has been discovered, but there are several potential clusters that could emerge with upgrades to submissions (shown above) or with a few new submissions that will eventually start overlapping with existing submissions.
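One way to picture how clusters emerge from pairwise matches is to treat every close pair as a link and group the submissions that are joined by any chain of links. The sketch below does this with a small union-find structure. It is an illustration under my own assumptions (hypothetical kit labels, and "close" decided beforehand by whatever difference threshold is chosen), not the cladogram method used in this analysis.

```python
def cluster_kits(kits, close_pairs):
    """Group kits so that kits linked by any chain of close pairs share a cluster."""
    parent = {kit: kit for kit in kits}

    def find(kit):
        # Follow parent links to the cluster representative (with path halving).
        while parent[kit] != kit:
            parent[kit] = parent[parent[kit]]
            kit = parent[kit]
        return kit

    for a, b in close_pairs:
        parent[find(a)] = find(b)

    clusters = {}
    for kit in kits:
        clusters.setdefault(find(kit), []).append(kit)
    # Largest clusters first; ties broken alphabetically.
    return sorted((sorted(members) for members in clusters.values()),
                  key=lambda members: (-len(members), members))


# Hypothetical kit labels and close pairs.
kits = ["k1", "k2", "k3", "k4", "k5"]
close_pairs = [("k1", "k2"), ("k2", "k3")]
print(cluster_kits(kits, close_pairs))
```

Note that chaining can join two submissions that are not themselves close to each other, so any cluster found this way is only a starting point for traditional research.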
As time passes, many submitters may lose interest in paying the premium to have their sample analyzed for additional markers. Eventually, these samples will become unviable to analyze. The person supporting the analysis could also die or become incapacitated, with the children potentially showing no interest in this project. For the vast majority of cases, the exposure to losing valuable DNA documentation will probably not be of great concern, as most lines have many living male descendants of any particular son of an oldest proven ancestor. If there are numerous living male descendants, then there will remain many others to assist in the future. However, if you are the only surviving male of your line, it is very important that you submit as many markers as are currently available (currently 67 markers from the company collecting samples for this project). My great grandfather, William Martin Shelton (born 1847), had seven daughters and only one son. This son had only one son of his own, who died as a teenager in 1928. Therefore, there are no male descendants of this Shelton line who can be tested for the Shelton DNA project, even though there are around 400 living descendants (all descending from daughters born with the Shelton name at some point).
So who should we encourage to submit additional samples that would benefit this project? There are three broad categories of submissions that should be sought in the near term. Once other submissions are analyzed, there will surely be new items of interest. First, for all the current submissions, we should encourage male descendants of other sons of our oldest proven ancestors to submit DNA samples. This helps determine where the uniqueness of each marker set begins. It also provides more evidence connecting these sons to their oldest proven ancestor. Second, everyone has their favorite candidates for possible connection to their lines. Your hunch (supported by traditional genealogical research) can be either dismissed or further strengthened by DNA evidence. We must have more submissions from possible candidates to make any progress on which lines are worthy of additional research. Third, we need wider participation from all Brooks lines to uncover those big surprises and possible connections that we all have missed to date. As the current submissions confirm, Brooks is a relatively common surname with dramatically different DNA backgrounds.
We need more traditional documentation about this project's Brooks lines in order to analyze the current submissions and to be able to recommend specific submissions. Without this documentation, anyone analyzing the current submissions has to make many assumptions, which makes the current analysis less meaningful. Current and future submitters should also submit a pedigree chart of their all male line. This chart should start with the most recent "all male" ancestor born prior to 1900 (more recent information is not really necessary for DNA analysis). The chart should go back to your earliest proven ancestor and can include one or two generations of speculation (please label as speculation). Web links to any web sites that include the line would also be very useful. Please submit this information to this web site (or the administrator of the Brooks DNA web site) as soon as possible.
We also need to identify which lines have few living all male descendants and determine any exposures for Brooks lines that may die out in the near future. We need to identify known living "all male" descendants of these exposed lines and actively recruit DNA submissions from descendants of the sons of these oldest proven ancestors. For now, we just need to present our recommendations for submissions so that these key individuals can contribute, and we need to help interested individuals avoid unnecessary submissions for lines that already have DNA samples. It is much better to assist a distant cousin with the financing of his DNA submission than to have another close relative submit DNA that is not very useful to this DNA project. Again, without additional traditional documentation concerning these lines, recommendations are very difficult to make.
The analysis of DNA submissions can be very challenging without the proper tools (how did we ever compile family histories before personal computers?). It is very difficult to manually analyze any chart that includes many DNA submissions. Cladograms are graphical representations of the marker mutations between individuals. These charts make it possible to see quickly how closely various submissions are related. The cladogram charts were created using a free phylogenetic network software program offered by Fluxus Engineering:
More information about free cladogram software
This program determines the simplest configuration, the one with the least number of interconnections or mutations. For excellent examples of what cladograms can do in the future, refer to the Mumma DNA web site. The Mumma line is related to my wife's Garver line and was one of the first genealogical DNA projects. The Mumma DNA project is pretty far along in its collection of DNA submissions and has gained significant genealogical information through the use of DNA. This surname is relatively uncommon, which makes the project much more useful with many fewer samples. Their web site has a great presentation of their tables and shows the usefulness of cladograms:
Mumma DNA Web Site
The first cladogram for Brooks submissions includes all unique submissions with 37 markers. Note that all 67 marker submissions were also included in this analysis but only the first 37 markers were used. It reveals two items of interest: 1) There are a few submissions that are somewhat related but not closely related; 2) Brooks is a common surname and many Brooks lines are probably not related at all within the period since our ancestors started using last names.
37 marker Cladogram (PDF)
The next two cladogram charts include all unique submissions with 25 markers. Because the cladogram program only supports around twenty submissions in each graph, this analysis was split into two groups. These charts were split at the common mutation point in the 37 marker cladogram that resulted in an equal number of submissions in each part of the cladogram (this point was given the label 99999). These 25 marker cladograms also include all 37 marker submissions with the last 12 markers omitted (and all 67 marker submissions with the last 42 markers omitted). As it turns out, many more of the 25 marker only submissions ended up among the more remotely related Brooks lines (any future analysis will split submissions more evenly by including fewer in the more remotely related Brooks lines).
25 marker Cladogram - Part 1 (PDF)
25 marker Cladogram - Part 2 (PDF)
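Preparing the input for a chart like these, where longer submissions are cut down to a common length and only unique marker sets are kept, can be sketched as follows. The function name, kit labels and marker values are hypothetical, and the example truncates to 3 markers only to keep it short; the same call with 25 would correspond to the 25 marker charts.

```python
def unique_truncated(submissions, n_markers):
    """Truncate each marker list to n_markers and group identical results.

    Submissions with fewer than n_markers are skipped. Returns a dict that
    maps each distinct truncated marker set (as a tuple) to the kits having it.
    """
    groups = {}
    for kit, markers in submissions.items():
        if len(markers) < n_markers:
            continue  # too few markers to take part in this comparison
        groups.setdefault(tuple(markers[:n_markers]), []).append(kit)
    return groups


# Hypothetical kit labels and marker values.
submissions = {
    "k1": [13, 24, 14, 11],
    "k2": [13, 24, 14, 12],
    "k3": [13, 25, 14],
    "k4": [13, 24],
}
for marker_set, kit_list in unique_truncated(submissions, 3).items():
    print(marker_set, kit_list)
```

Here k1 and k2 collapse into one entry because they differ only in a marker beyond the cut, which is exactly why a match at fewer markers says less than a match at more markers.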
Currently, there are no major clusters (or groupings) of lines, which indicates that most of the submissions were fairly random in nature. However, by simple comparison of the two 25 marker cladograms, it is obvious that some lines are probably at least distantly related, while several Brooks lines have so many mutations that any common ancestor would have to have lived before our English ancestors started using surnames. The 25 marker submissions also show a potential for several very small clusters. However, there are two things that prevent these very small clusters from being genealogically significant. First, the close genealogical relationship may already have been proven with traditional research (this is where we really need traditional documentation to assist in the analysis of the DNA submissions). Second, the 25 marker matches (or single mutation differences) are not as conclusive as 37 marker matches. Where 25 marker submissions are closely related, these submitters should upgrade to 37 markers before any solid conclusions can be made (again, only if there is no known relationship between the two lines).
Raw DNA data without traditional genealogical research is not very useful. It is critical to have both the DNA marker sets and the known information about the ancestry of these DNA submissions. The Pace DNA web site (Pace is one of my ancestral lines) has an excellent web page dedicated to providing the significant genealogical information known about all of its DNA submitters. This information is conveniently made available to anyone interested and saves many people the redundant effort of gathering what is known about these submissions for their own personal analysis.
Pace DNA web site's ancestry listings for submittors
Please send your comments by email, letter or phone:
E-mail (new) ___________
______________________ E-mail Address changed to image to reduce my spam email
Snail mail______________ Robert B. Casey, 4705 Eby Lane, Austin, TX 78731-4507
Phone (home)__________ (512) 371-0579 (nights and weekends only)