WGS Extract WWW home

WGS Extract

a desktop tool for verifying, analyzing and manipulating your DTC 30x WGS test results

Current release is Beta v2b (18 Feb 2020):

See the Update at the end of this page for a patched download that can be installed in stages.

Still waiting for your WGS test results? Want to get started today? See the International Genome Sample Resource (1K Genome archive) for BAM files you can download to learn the tool while you wait.

Note: On MacOS 11 with M1 systems, the MacPorts tools have trouble executing. We are waiting for Apple and MacPorts to find a solution. Also, since we switched to Google Drive as our delivery cloud, it seems Safari may be required to download on MacOS. See our special MacOSX Release Patch for further notes.

IMPORTANT: A MacOSX Release Patch is available for Version Beta v2b. It fixes the program's install and start scripts and adds an Uninstall as well. (v1 on 25 April 2020, v3 on 20 Jun 2020, v5 on 31 Oct 2020)

Note: A Français Language Patch is available for Version Beta v2b. It adds Français language support to the existing English and Deutsch language support. (1 May 2020)

Note: If working with a Nebula Genomics CRAM file, please check the CRAM to BAM conversion document. The next release will handle CRAMs and BAMs interchangeably.
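For orientation only, here is a minimal sketch of what a CRAM to BAM conversion typically looks like when done with samtools directly. It is not taken from the conversion document mentioned above, which remains the reference for this tool. The file names `sample.cram` and `hs38.fa` and the helper name `cram_to_bam_commands` are hypothetical, and the sketch assumes you have the same reference genome FASTA that the CRAM was originally aligned to. The function only builds the command lines; it does not run them.

```python
import shlex
from pathlib import Path


def cram_to_bam_commands(cram_path, reference_fasta, bam_path=None, threads=2):
    """Build samtools command lines for a CRAM -> BAM conversion.

    Hypothetical helper for illustration: it returns the commands as
    argument lists and leaves running them to the caller.
    """
    cram = Path(cram_path)
    # Default output name: same as the CRAM, with a .bam extension.
    bam = Path(bam_path) if bam_path else cram.with_suffix(".bam")
    convert = [
        "samtools", "view",
        "-b",                        # write BAM output
        "-T", str(reference_fasta),  # reference FASTA used to decode the CRAM
        "-@", str(threads),          # additional worker threads
        "-o", str(bam),              # output file
        str(cram),
    ]
    # A BAM index is usually needed by downstream tools.
    index = ["samtools", "index", str(bam)]
    return [convert, index]


if __name__ == "__main__":
    # File names below are placeholders, not files shipped with the tool.
    for cmd in cram_to_bam_commands("sample.cram", "hs38.fa"):
        print(shlex.join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` on a system where samtools is installed. Keeping command construction separate from execution makes it easy to inspect the commands before committing to a conversion that can take a long time on a 30x WGS file.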

This tool is geared toward the needs of genetic genealogy but may be helpful for those looking into health-related uses of WGS tests. The sub-$500, Direct-to-Consumer (DTC), 30x Whole Genome Sequence (WGS) tests are delivered with basic data files and often some health-related reports. This tool serves to bridge the gap between the WGS files delivered and the present-day genetic genealogy community tools. Many health sites accept the microarray and VCF files generated here as well.

This tool is designed to provide simple, push-button manipulation of WGS files from any source. It hides the scripting of complex bioinformatic tools in the background and automatically determines the needed parameters and variances for the data supplied to it. For more control over your pipeline, either learn to use the underlying tools directly in a command shell or seek a Galaxy server (such as UseGalaxy).

The tool has the potential to be a simple install in a BioConda environment or even a simple Python package. But as a majority of the users are on Microsoft Windows 10 systems, and the underlying bioinformatic tools are not available there, we currently deliver the tool with a more complex, “install everything” approach. This may change going forward after we find a Win10 package manager to supply the bioinformatic tool ports we make here. We do fully test and use the Linux and Apple MacOS versions. This release is the only source of the bioinformatic tools on a Win10 system (that we are aware of).

We use the Facebook group Dante Labs and Nebula Genomics Customers for discussions on how to make use of your sub-$500, DTC 30x WGS test results. Bugs, use cases and announcements about this tool are posted there. In that Facebook group’s Files section, you will find a number of useful companion documents and tool references. In particular, start with Bioinformatics for Newbies.

User issues, if not brought up in the Facebook group, should be raised in the user issues section of this site. This issues section is the preferred location so code bugs, use limitations and suggested improvements can be tracked within the development project.

The tool acronym is WGSE and is pronounced as “wig-see”. We encourage this use in English language conversation.

Further documentation beyond the manual link above is available in our WGS Extract Developer’s Documentation Repository.

Developers should visit the main GitHub WGS Extract Developers Code Repository. Development issues, code bugs and limitations should be raised in the development issues section so they are tracked until resolved in a release.

The original, first year, historical release is documented here.

This page is located at https://WGSExtract.github.io/ and serves as the new WWW home for the tool. As the need develops, we will create our own Facebook Group for users to raise issues outside of the User Issues Section already mentioned.

UPDATE (one year later, Feb 2021): The Beta v2b release was simple and quick. Of the 4.5 GB zip archive, 4.2 GB is Human Genome Reference Models that are only sometimes needed. It is this large size that gives most users trouble when installing. Therefore, we have created three separate .zip archives to aid the installation.

WGS Extract release of Beta v2b (18 Feb 2020), broken up into separate ZIP archives (including all patches already applied):

We are working to make all Reference Genome files download on demand in future releases. You can run some features of the tool without the Reference Genomes available, so consider downloading and installing the Main Program Release first and adding the Reference Genomes later. We also plan to move the Win10 tool ports, which are only available here, into the installation script so that they are downloaded at install time for Win10 users only. In fact, if we also make our own binary reference files of human genome models download as needed, the program drops to under 10 MB downloaded and installed! Clearly a much better solution going forward.