Analysis of DNA for Casey lines
Last updated on December 28, 2006
By Robert Brooks Casey - descendant of
Robert Brooks - d. 1805 Mecklenburg County, VA
The analysis of DNA markers provides a new opportunity for genealogists
to unravel their family history. This new tool is now producing
results that can take some of the guesswork out of adding
another ancestor to our family trees. Historically, our traditional
research has been heavily influenced by geography, family naming patterns,
migration patterns, etc. This approach most often leads us to discover
new ancestors, but it can also lead us in the wrong direction.
Your particular oldest proven ancestor may have broken away from his family
connections and traditions. Your oldest proven ancestor may not have
named his children after his older generation of relatives or may
have moved to new areas where no siblings or cousins lived, etc.
Analysis of DNA markers allows us to identify which Brooks lines
look encouraging as potential relatives and helps us avoid research
on unrelated lines that ended up in the same county by chance.
Brooks is a common surname; therefore, Brooks researchers
should expect to regularly encounter unrelated Brooks families in the
same counties during the same time period.
Unfortunately, DNA testing works best for tracing your "all male"
line of ancestors, because basic biology limits this kind of DNA
research to that line. A male researcher can therefore submit his own
DNA only for his own surname. For other surnames of your ancestors,
you have to get one of your cousins born with the surname of interest
to submit their DNA sample, and you can "sponsor" their submission (or
assist them with the expense of the DNA submission based on your mutual
interest in your common ancestor). There is also a DNA test for women
that traces their "all female" line. With European surname
practices for marriage, women of European descent take the surname
of their husband, which results in a different surname for every generation.
This constant change of surname makes the female line more difficult to
trace than one surname over many generations. Additionally, the female
DNA is much slower to mutate, which further reduces its usefulness for
genealogical research. Male DNA markers are like probate records and
census records: our best primary sources, yielding maximum information.
Female DNA tests are equivalent to tax records and property records:
they produce fewer results but can be very useful when census records and
probate records fail to produce results. We should search our male DNA
lines first and then later supplement them with research of female DNA
lines when further analysis of male DNA lines no longer yields enough
information to break through to another generation.
Currently there are 42 sets of DNA markers to analyze for the
Brooks surname. There are 12 submissions with 12 markers, 15 with
25 markers, 12 with 37 markers and 3 with 67 markers. There are
six pairs of submissions that match within this project;
however, several do not have the same number of markers or have too
few markers to determine accurately whether these matches are really
closely related or just false hits. Without traditional genealogical
documentation, it is impossible to determine if these are
significant matches where connections have not been made to date
or are only matches that are already proven by traditional research.
There are also six other submissions with only one marker deviation, one
with two markers that are different, and two submissions with three
markers that are different. Again, some do not have a common number
of markers and, without matching traditional documentation, some matches
may already be known to be closely related individuals. It is unknown
whether any of my cousins have submitted to this project (my line has
around 7,500 descendants documented to date). It is obvious from
the submissions to date that Brooks is a relatively common surname
and that many of these Brooks lines have no common ancestor more recent
than the date when most Western Europeans started using surnames.
The Brooks surname is based on geographic origins (living near
a brook), and many unrelated lines started using this
surname when our ancestors first adopted surnames (primarily
from 1100 to 1500). There are twelve submissions that match or have
only one marker difference. These submissions, if not already known
to be related, would be closely related and warrant an exchange of
information between these submitters. The goal of any DNA project
is to discover clusters of very similar DNA marker sets,
which imply close relationships. Submitters need not only to submit
their DNA but also to assist in locating other "all male"
cousins (or suspected cousins) and get these people to submit
their DNA. The objective is to form several clusters of related lines
and concentrate traditional research on these newly discovered clusters
of closely related lines.
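The comparison described above can be sketched in a few lines of Python.
This is only a minimal illustration under stated assumptions: the kit
labels and marker values are invented (they are not project data), the
function names are hypothetical, markers are assumed to be listed in the
same order for every submission, and a shorter submission is assumed to
contain the first markers of a longer one. Two submissions are compared
only over the markers they both have, and the count is simply the number
of markers that differ.

```python
from itertools import combinations

def marker_differences(a, b):
    """Count differing markers over the markers both submissions share."""
    shared = min(len(a), len(b))
    diffs = sum(1 for x, y in zip(a[:shared], b[:shared]) if x != y)
    return diffs, shared

def close_pairs(submissions, max_diff=1):
    """List (kit_a, kit_b, differences, shared markers) for close pairs."""
    result = []
    for (kit_a, a), (kit_b, b) in combinations(sorted(submissions.items()), 2):
        diffs, shared = marker_differences(a, b)
        if diffs <= max_diff:
            result.append((kit_a, kit_b, diffs, shared))
    return result

# Invented marker values for illustration only (not real submissions).
submissions = {
    "A": [13, 24, 14, 11, 11, 14, 12, 12, 12, 13, 13, 29],
    "B": [13, 24, 14, 11, 11, 14, 12, 12, 12, 13, 13, 29, 17, 9, 10],
    "C": [13, 24, 14, 10, 11, 14, 12, 12, 12, 13, 13, 29],
    "D": [14, 23, 15, 10, 15, 16, 11, 13, 11, 14, 12, 31],
}

for kit_a, kit_b, diffs, shared in close_pairs(submissions):
    print(f"{kit_a} and {kit_b}: {diffs} difference(s) over {shared} markers")
```

Reporting the number of shared markers next to each result matters,
because a match over 12 markers says much less than a match over 37.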
Any DNA research project can benefit from a coordinated compilation
of the DNA documentation. This is also true with publishing a family
history or being involved with a surname organization. As with
traditional genealogical research, some researchers tend to organize
and document findings, others travel to courthouses and gather those
important source documents and some are really good about contacting
many cousins and gathering more recent descendants. This analysis
is an attempt to coordinate such documentation. It is hard for
many individuals to know which submissions could benefit from
expanding the number of markers of DNA samples that have already
been submitted. After analyzing all the submissions to date, here
is a summary of the most closely related lines:
14383 (37) and 33138 (37) - match (both 37 markers)
32321 (37) and 42464 (25) - match (37 / 25 markers)
42798 (25) and 61478 (25) - match (both 25 markers)
48392 (25) and 22789 (25) - match (both 25 markers)
28312 (25) and 24831 (12) - match (25 / 12 markers)
45365 (12) and 47968 (12) - match (both 12 markers)
14383 (37) and 43358 (25) - one mutation (37 / 25 markers)
45522 (37) and 42798 (25) - one mutation (37 / 25 markers)
73766 (25) and 48392 (25) - one mutation (both 25 markers)
42798 (25) and 73159 (25) - one mutation (both 25 markers)
42857 (37) and 51227 (37) - two mutations (both 37 markers)
These closely related submissions do not necessarily represent
significant new information from DNA analysis. There are three
scenarios: 1) These lines are already known to be related by
traditional research (we just do not have this information
available at the DNA web site); 2) These lines have a potential
for being closely related but do not have enough markers to
know how closely these lines are related; 3) These lines are
closely related and traditional research has not connected these
lines with proper documentation (these submitters should contact
each other and share traditional research).
Among all the 37 marker submissions, there are only
two submissions that match. These are the only closely related
lines found with 37 markers. The next closest pairs have
two mutations (1 pair), 3 mutations (1 pair) and 5 mutations
(1 pair). Therefore, only one pair appears to be closely related
from a DNA point of view. There are many other pairs that have
the potential to be closely related but need additional markers
analyzed to determine if there really is a close relationship.
Below are recommended upgrades where there are currently not
enough markers to draw significant conclusions. First, only
two submissions could benefit from being upgraded from 37
markers to 67 markers (assumes that this relationship is not
already known from traditional research):
14383 (37 to 67 marker upgrade) - exact match with 33138
33138 (37 to 67 marker upgrade) - exact match with 14383
There are eight submissions that could benefit from upgrading
from 25 markers to 37 markers (assumes that relationship is not
already known from traditional research):
42798 (25 to 37 marker upgrade) - exact match with 61478
and one marker difference from 73159 & 45522
48392 (25 to 37 marker upgrade) - exact match with 22789
and one marker difference from 73766
22789 (25 to 37 marker upgrade) - exact match with 48392
and one marker difference from 73766
42464 (25 to 37 marker upgrade) - exact match with 32321
43358 (25 to 37 marker upgrade) - exact match with 14383
61478 (25 to 37 marker upgrade) - exact match with 42798
73766 (25 to 37 marker upgrade) - one marker from 48392
73159 (25 to 37 marker upgrade) - one marker from 42798
There are four submissions that could benefit from upgrading
from 12 markers to 25 or 37 markers (assumes that relationship
is not already known from traditional research):
24831 (12 to 25 marker upgrade) - exact match with 28312
45365 (12 to 25 marker upgrade) - exact match with 47968
47968 (12 to 25 marker upgrade) - exact match with 45365
69789 (12 to 25 marker upgrade) - one marker from 22789
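One rule that appears consistent with the upgrade lists above can be
sketched in Python. This is an assumption about the reasoning, not a
documented procedure: for every pair that matches or differs by one
marker, the submission with fewer markers is raised to its partner's
level, and when both have the same number of markers, both move up to
the next panel. The pairs are copied from the summary list of closely
related lines earlier in this analysis; the function names are
hypothetical.

```python
TIERS = [12, 25, 37, 67]  # marker panels mentioned in this analysis

# (kit, markers, kit, markers, differing markers) from the summary list
CLOSE_PAIRS = [
    ("14383", 37, "33138", 37, 0),
    ("32321", 37, "42464", 25, 0),
    ("42798", 25, "61478", 25, 0),
    ("48392", 25, "22789", 25, 0),
    ("28312", 25, "24831", 12, 0),
    ("45365", 12, "47968", 12, 0),
    ("14383", 37, "43358", 25, 1),
    ("45522", 37, "42798", 25, 1),
    ("73766", 25, "48392", 25, 1),
    ("42798", 25, "73159", 25, 1),
    ("42857", 37, "51227", 37, 2),
]

def next_tier(markers):
    """Return the next larger panel, or None if already at the largest."""
    i = TIERS.index(markers)
    return TIERS[i + 1] if i + 1 < len(TIERS) else None

def recommend_upgrades(pairs, max_diff=1):
    """Map kit -> (current markers, suggested markers)."""
    upgrades = {}
    for kit_a, m_a, kit_b, m_b, diffs in pairs:
        if diffs > max_diff:
            continue  # too many differences to justify an upgrade
        if m_a == m_b:
            target = next_tier(m_a)
            candidates = [(kit_a, m_a), (kit_b, m_b)]
        else:
            target = max(m_a, m_b)
            candidates = [(kit_a, m_a) if m_a < m_b else (kit_b, m_b)]
        if target is None:
            continue
        for kit, current in candidates:
            # Keep the largest suggestion when a kit appears in several pairs.
            if kit not in upgrades or upgrades[kit][1] < target:
                upgrades[kit] = (current, target)
    return upgrades

upgrades = recommend_upgrades(CLOSE_PAIRS)
for kit, (current, target) in sorted(upgrades.items()):
    print(f"{kit} ({current} to {target} marker upgrade)")
```

Submission 69789 does not appear in the output because its pair is not
part of the summary list used as input here.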
Our goal is to get several clusters of Brooks lines
that help establish recent common ancestry between
various Brooks submissions. There are two basic methods
to get these clusters started: 1) Actively recruiting
submitters who descend from sons of the same ancestor or are believed
to be closely related; 2) Random submissions, which will eventually
begin to cluster as the sample size grows. Once the number
of submissions greatly expands, a major benefit
will start to emerge. It will become obvious that several
diverse Brooks lines are more
closely related than traditional research has
shown to date. DNA documentation can help
genealogists better select which "possibly"
related lines to research based solely on DNA
evidence. Researching these newly discovered potential
relationships through traditional genealogical
methods may result in locating supporting
documentation and may be the key to getting past
that brick wall. Currently, no major cluster of submissions
has been discovered but there are several potential clusters
that could emerge with upgrades to submissions (shown
above) or a few new submissions which will eventually start
overlapping with existing submissions.
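The idea of potential clusters can be illustrated with a short Python
sketch. It is a simplified stand-in, not the method used for this
analysis: submissions are grouped whenever they are linked, directly or
through another submission, by a pair that matches or differs by one
marker. The pairs are taken from the summary list of closely related
lines earlier in this analysis, and the function name is hypothetical.

```python
from collections import defaultdict

# Pairs that match or differ by one marker (from the summary list).
PAIRS = [
    ("14383", "33138"), ("32321", "42464"), ("42798", "61478"),
    ("48392", "22789"), ("28312", "24831"), ("45365", "47968"),
    ("14383", "43358"), ("45522", "42798"), ("73766", "48392"),
    ("42798", "73159"),
]

def potential_clusters(pairs):
    """Group kits that are connected through any chain of close pairs."""
    parent = {}

    def find(kit):
        parent.setdefault(kit, kit)
        while parent[kit] != kit:
            parent[kit] = parent[parent[kit]]  # shorten the chain as we go
            kit = parent[kit]
        return kit

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = defaultdict(list)
    for kit in parent:
        groups[find(kit)].append(kit)
    # Largest groups first, then in kit-number order.
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))

for cluster in potential_clusters(PAIRS):
    print(", ".join(cluster))
```

Groups formed this way are only candidates. Two members of a group may
differ from each other by two markers, and many of the links rest on 12
or 25 markers, so upgrades and traditional documentation are still
needed before treating any group as a real cluster.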
As time passes, many submitters may lose
interest in paying the premium to have their sample
analyzed for additional markers. Eventually, these
samples will become unviable to analyze. The person supporting
the analysis could also die or become incapacitated, with
the children potentially showing no interest in this project.
In the vast majority of cases, the risk of losing
valuable DNA documentation will probably not be of
great concern, as most lines have many living male descendants
of any particular son of an oldest proven ancestor. If
there are numerous living male descendants, then there
will remain many others to assist in the future. However,
if you are the only surviving male of your line, it is
very important that you submit as many markers as
are currently available (currently 67 markers from the
company collecting samples for this project). My great
grandfather, William Martin Shelton (born 1847), had seven
daughters and only one son. This son produced only one grandson
who died as a teenager in 1928. Therefore, there
are no male descendants of this Shelton line that can be
tested for the Shelton DNA project even though there are
around 400 living descendants (all descending from
daughters born with the Shelton name at some point).
So who should we encourage to submit additional
samples that would benefit this project? There are
three broad categories of submissions that should
be sought in the near term. Once other submissions
are analyzed, there will surely be new items of
interest. First, for all the current submissions,
we should encourage male descendants of other sons
of our oldest proven ancestors to submit DNA samples.
This helps determine where the uniqueness of
each marker set begins. It also provides more
evidence connecting these sons to their oldest
proven ancestor. Second, everyone has their
favorite candidates for possible connection
to their lines. Your hunch (supported by traditional
genealogical research) can be either dismissed
by DNA evidence or further strengthened by DNA
evidence. We must have more submissions from possible
candidates to make any progress on which lines
are worthy of additional research. Third,
we need wider participation of all Brooks lines
to determine those big surprises and possible
connections that we all have missed to date.
As the current submissions confirm, the Brooks
surname is a relatively common surname with
dramatically different DNA backgrounds.
We need more traditional documentation about this project's
Brooks lines in order to analyze the current submissions
and to be able to recommend specific submissions. Without
this documentation, anyone analyzing the current
submissions has to make many assumptions, which makes
the current analysis less meaningful. Current and future
submitters should also submit a pedigree chart of
their all male line. This chart should start with the
most recent "all male" ancestor who was born prior to 1900
(more recent information is not really necessary for
DNA analysis). The chart should go back to your
earliest proven ancestor and can include one or two
generations of speculation (please label as speculation).
Also, any web links to any web sites that include the
line would also be very useful. Please submit this
information to this web site (or the administrator of
the Brooks DNA web site) as soon as possible.
We also need to identify which lines have few living
all male descendants and determine any exposures for
any Brooks lines that may die out in the near future.
We need to identify known living "all male" descendants of
these exposed lines and actively recruit DNA submissions for
the sons of these oldest proven ancestors. For now, we
just need to present our recommendations so that
these key individuals can contribute submissions,
and we need to assist interested individuals in avoiding unnecessary
submissions for the lines that already have DNA samples. It is
much better to assist a distant cousin with the financing of
his DNA submission than having another close relative submit
DNA that is not very useful to this DNA project. Again, without
additional traditional documentation concerning these lines,
recommendations are very difficult to make.
The analysis of DNA submissions can be very challenging without
the proper tools (how did we ever compile family histories
before personal computers?). It is very difficult to manually
analyze any chart that includes many DNA submissions. Cladograms
are graphical representations of the marker mutations between
individuals. These charts quickly reveal the closeness
of relationships between various submissions. The cladogram
charts were created using a free phylogenetic network software
program offered by Fluxus Engineering:
More information about free cladogram software
This program determines the simplest configuration which
has the least number of interconnections or mutations.
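To give a feel for what "least number of mutations" means, here is a
much simpler stand-in written in Python. It is not the algorithm used by
the Fluxus program, whose internals are not described here. It builds a
minimum spanning tree: the set of links that connects every submission
while keeping the total number of marker differences as small as
possible. The labels and marker values are invented for illustration,
the function names are hypothetical, and every submission is assumed to
have the same number of markers.

```python
from itertools import combinations

def differences(a, b):
    """Number of markers whose values differ (equal-length lists assumed)."""
    return sum(1 for x, y in zip(a, b) if x != y)

def minimum_spanning_tree(haplotypes):
    """Kruskal's algorithm: link all kits with the fewest total differences."""
    edges = sorted(
        (differences(haplotypes[a], haplotypes[b]), a, b)
        for a, b in combinations(sorted(haplotypes), 2)
    )
    group = {name: name for name in haplotypes}

    def find(name):
        while group[name] != name:
            name = group[name]
        return name

    tree = []
    for cost, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # skip links that would only form a loop
            group[root_b] = root_a
            tree.append((a, b, cost))
    return tree

# Invented marker values for illustration only (not real submissions).
haplotypes = {
    "A": [13, 24, 14, 11, 11, 14],
    "B": [13, 24, 14, 11, 11, 15],
    "C": [13, 24, 14, 10, 11, 15],
    "D": [14, 23, 14, 10, 11, 15],
}

for a, b, cost in minimum_spanning_tree(haplotypes):
    print(f"{a} - {b}: {cost} difference(s)")
```

A real cladogram program can also add inferred intermediate marker sets
that no living person has submitted, which this sketch does not attempt.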
For excellent examples of what cladograms can do in
the future, refer to the Mumma DNA web site. The Mumma
line is related to my wife's Garver line and was one of
the first genealogical DNA projects. The Mumma DNA
project is well along in its collection of DNA
submissions and has gained significant genealogical
information through the use of DNA. This surname is
relatively uncommon, which makes the project much more
useful with many fewer samples. Their web site has a
great presentation of their tables and shows the usefulness
of cladograms:
Mumma DNA Web Site
The first cladogram for Brooks submissions includes all unique
submissions with 37 markers. Note that all 67 marker submissions
were also included in this analysis but only the first 37 markers
were used. It reveals two items of interest: 1) There are a few
submissions that are somewhat related but not closely related;
2) Brooks is a common surname and many Brooks lines
are probably not related within the period since our ancestors
began using last names.
37 marker Cladogram (PDF)
The next two cladogram charts include all unique submissions with
25 markers. Because the cladogram program only supports around twenty
submissions in each graph, this analysis was split into two groups.
These charts were split based on the common mutation point in the 37
marker cladogram that resulted in an equal number of submissions in
each part of the cladogram (this was given the label of 99999).
These 25 marker cladograms also include all 37 marker
submissions with the last 12 markers omitted (and all 67 marker
submissions with the last 42 markers omitted). As it turns out,
many more 25 marker only submissions ended up in the more remotely
related Brooks lines (any future analysis will split submissions
more evenly by including fewer in the more remotely related Brooks
lines).
25 marker Cladogram - Part 1 (PDF)
25 marker Cladogram - Part 2 (PDF)
Currently, there are no major clusters (or groupings) of lines, which
indicates that most of the submissions were fairly random in nature.
However, by simple comparison of the two 25 marker cladograms, it is
obvious that some lines are probably at least distantly related while
several Brooks lines have so many mutations that any common ancestor
would have to occur before our English ancestors started using any
surnames. The 25 marker submissions also show a potential for several
very small clusters. However, there are two things that prevent these
very small clusters from being genealogically significant. First, the close
genealogical relationship may already have been proven with traditional
research (this is where we really need traditional documentation to
assist in the analysis of the DNA submissions). Second, the 25 marker
matches (or single mutation differences) are not as conclusive as
37 marker matches. Where 25 marker submissions are closely related,
these submitters should upgrade to 37 markers before any solid
conclusions can be made (again, only if there is no known
relationship between these two lines).
Raw DNA data without traditional genealogical research is not very
useful. It is critical to have both the DNA marker sets and known
information about the ancestry of these DNA submissions. The Pace
DNA web site (Pace is one of my ancestral lines) has an excellent web page
dedicated to providing significant genealogical information
known about all of their DNA submitters. This information is
conveniently made available to anyone interested and saves
many people the redundant effort of gathering what they know about
these submissions for their own personal analysis.
Pace DNA web site's ancestry listings for submittors
Please send your comments by email, letter or phone:
E-mail (new)___________ (address shown as an image to reduce my spam email)
Snail mail______________ Robert B. Casey, 4705 Eby Lane, Austin, TX 78731-4507
Phone (home)__________ (512) 371-0579 (nights and weekends only)