DNA Survey Submission Form

March 13, 2022

Greetings Citizen Scientists,

By taking part in this survey, you are helping to create a global database of DNA relationship data that will be used to determine the distribution of shared segments for dozens of relationship types. Click the link below to view the current survey in real time.

DNA Survey

Gigatrees can be configured to generate a tab-delimited text file (dnasurvey-[timestamp].txt) that contains all your DNA match data (no personally identifiable information is included). You can then paste the contents of that file into this form and submit it. The data you provide will be immediately aggregated with other users' data, processed, and published in real time in the survey linked above.

Genetic genealogy has advanced by leaps over the last few years, and a large number of free tools have been created to help users make accurate genetic predictions. One is Jonny Perl's Shared cM Tool, which is based on user data manually entered into the Shared cM Project, created and maintained by Blaine Bettinger. That data is made available in occasional releases as distribution curves for each relationship type. Another tool is the Orogen Relationship Predictors, created by Brit Nicholson and based on peer-reviewed data. My understanding is that his source data is computer generated. You should really read through his blog to understand how his data is relevant and where it may have advantages for amassing large amounts of data. There is also the distribution probability matrix made available by Dr. Leah Larkin in her article, "The Limits of Predicting Relationships Using DNA", DNAGeek, Dec. 19, 2016 (accessed Oct. 28, 2018), which is based on an early version of an Ancestry.com DNA whitepaper.

Gigatrees differs from all of these because relationships are verified by examining users' family trees directly before being included in the DNA survey report. The data is tracked using an autogenerated unique user id so that when data is resubmitted, duplicates can be prevented and data, including relationships, can be updated. Together these features should reduce inaccuracies that might be caused by manually entered user relationships and by duplicate entries that could skew the average and median values. Another difference is that the data is always up to date, so there is no waiting for a new release. Finally, the raw data, in the form of autogenerated DNA survey reports, can potentially be read and aggregated by other applications to create similar or enhanced tools.
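
For anyone curious how resubmissions can replace earlier data without skewing the averages, here is a rough sketch in Python. It is not the actual Gigatrees code, and the column names (user_id, match_id, relationship, shared_cm) are hypothetical; the real survey file may use a different layout. It only illustrates the idea of keying each row on the user id so that a later submission overwrites an earlier one before the mean and median are computed.

```python
import csv
import io
from collections import defaultdict
from statistics import mean, median


def aggregate(submissions):
    """Aggregate tab-delimited survey texts, given oldest first.

    Assumed (hypothetical) columns: user_id, match_id, relationship, shared_cm.
    """
    latest = {}
    for text in submissions:
        reader = csv.DictReader(io.StringIO(text), delimiter="\t")
        for row in reader:
            # A resubmitted row has the same key, so it replaces the old
            # value (and relationship) instead of being counted twice.
            key = (row["user_id"], row["match_id"])
            latest[key] = (row["relationship"], float(row["shared_cm"]))

    # Group the deduplicated values by relationship type.
    by_relationship = defaultdict(list)
    for relationship, shared_cm in latest.values():
        by_relationship[relationship].append(shared_cm)

    return {
        relationship: {
            "count": len(values),
            "mean": mean(values),
            "median": median(values),
        }
        for relationship, values in by_relationship.items()
    }
```

Passing the same user's file twice leaves the counts unchanged, which is the behavior described above. Other applications that read the survey reports could do something similar with whatever columns the reports actually contain.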

Happy Reporting

Tim Forsythe (September 5, 2022)

Last Modified: March 13, 2022