Ingest wip to be added to other var db code #6582

Merged
merged 10 commits into broadinstitute:ah_var_store on May 8, 2020

@RoriCremer RoriCremer commented May 4, 2020

PR to merge my forked branch into Andrea's gatk branch!

rori and others added 9 commits May 1, 2020 15:09
add pet creation to walker

BRUH - cleanup the cword code

update vet schema

add headers enum and method to pet creation

add xsv creation to loop

update vet

fix extra delim bug and show non-ref placeholder

dont rebase me bruh
Also fixed `[.....]` strings in the vet and converted a lot of strings to constants plus a couple of TODOs

Fix GT String to output index of alleles instead of allele characters

fix enum bug

add GQ band to drop as an optional param

include end base

code used to push PET and VET to BQ

i dont know, it finally fucking works....seriously. also schema jsons

testing wip

cleanup

update schema and pray

update schema cuz im bad

added ability to pass in what GQ state to interpret as "missing" as well as skeleton for creating the metadata tsv (sample list, what interval list was used, the interval list's md5, and what GQ is signified by "missing"). Also put some util files in a folder under BlahVariantWalkerUtils for now (should get rid of them eventually but nice to have for now)

spacing + adding optional to GQ arguments (whoops)

rename intervalListPath to  intervalListBlob

pass thru sample name

swap out errors thrown

add stringified  interval list to metadata table

fixup on interval blob rename

use L for interval list

This doesn't have the level of validation that we need, but works for now
Ideally the interval list is required to be a picard interval list file

wip testing

to be dropped? interval wip

add testing materials

wip which type should this be passed in as

md5 interval list concatenated string

wip split xsvs by contig

add vet stripping initial changes

review vet trim with Laura

fix overzealous missing bug

laura seal of approval on schema

add back call_DP until we are sure evoq doesn't need it

write chr dirs on the fly

create dirs for chrs w pet and vet dirs inside

add sample id to metadata

semantics for sample id

move position and sample into main method for easier access

add sample id mapping and use it

add enum for chr index

use longs for the chr addition

add array schema and enum

add bool for array

update array options with sample id and chr adjustment

wip to remove chr separation

new sample directory structure

let the rename begin

fixup on walker for arrays

more renames

better comments

fixup on ingest walker
@RoriCremer RoriCremer requested review from ahaessly and kcibul May 4, 2020 18:29
@RoriCremer RoriCremer changed the title Pr to merge my forked branch into Andrea's gatk branch! Ingest wip to be added to other var db code May 4, 2020
@RoriCremer RoriCremer merged commit 7b3b41e into broadinstitute:ah_var_store May 8, 2020
ahaessly pushed a commit that referenced this pull request Aug 13, 2020
* initial vet creation class

* initial variant walker for jg

* initial pet creation class

add pet creation to walker

update vet schema

add headers enum and method to pet creation

add xsv creation to loop

update vet

fix extra delim bug and show non-ref placeholder

Also fixed `[.....]` strings in the vet and converted a lot of strings to constants plus a couple of TODOs

Fix GT String to output index of alleles instead of allele characters

fix enum bug

add GQ band to drop as an optional param

include end base

code used to push PET and VET to BQ

added ability to pass in what GQ state to interpret as "missing" as well as skeleton for creating the metadata tsv (sample list, what interval list was used, the interval list's md5, and what GQ is signified by "missing"). Also put some util files in a folder under BlahVariantWalkerUtils for now (should get rid of them eventually but nice to have for now)

spacing + adding optional to GQ arguments (whoops)

rename intervalListPath to  intervalListBlob

pass thru sample name

swap out errors thrown

use L for interval list

This doesn't have the level of validation that we need, but works for now
Ideally the interval list is required to be a picard interval list file

md5 interval list concatenated string

wip split xsvs by contig

add vet stripping initial changes

review vet trim with Laura

fix overzealous missing bug

laura seal of approval on schema

add back call_DP until we are sure evoq doesn't need it

write chr dirs on the fly

create dirs for chrs w pet and vet dirs inside

add sample id to metadata

semantics for sample id

move position and sample into main method for easier access

add sample id mapping and use it

add enum for chr index

use longs for the chr addition

add array schema and enum

add bool for array

update array options with sample id and chr adjustment

wip to remove chr separation

new sample directory structure

let the rename begin

fixup on walker for arrays

more renames

better comments

fixup on ingest walker

* unrelated conflict--maybe merge mistake?

* cleanup

* swap method for indexed alleles

* add genome specific VET creation

* move mode enum into common code

* wip on adding enums as params
ahaessly pushed a commit that referenced this pull request Sep 11, 2020
meganshand pushed a commit that referenced this pull request Oct 6, 2020
kcibul pushed a commit that referenced this pull request Jan 29, 2021
kcibul pushed a commit that referenced this pull request Jan 29, 2021
kcibul pushed a commit that referenced this pull request Feb 1, 2021
kcibul pushed a commit that referenced this pull request Feb 1, 2021
Marianie-Simeon pushed a commit that referenced this pull request Feb 16, 2021
Marianie-Simeon pushed a commit that referenced this pull request Feb 16, 2021
kcibul pushed a commit that referenced this pull request Mar 9, 2021
kcibul pushed a commit that referenced this pull request Mar 9, 2021
mmorgantaylor pushed a commit that referenced this pull request Apr 6, 2021
mmorgantaylor pushed a commit that referenced this pull request Apr 6, 2021
* initial vet creation class

* initial variant walker for jg

* initial pet creation class

add pet creation to walker

update vet schema

add headers enum and method to pet creation

add xsv creation to loop

update vet

fix extra delim bug and show non-ref placeholder

Also fixed `[.....]` strings in the vet and converted a lot of strings to constants plus a couple of TODOs

Fix GT String to output index of alleles instead of allele characters

fix enum bug

add GQ band to drop as an optional param

include end base

code used to push PET and VET to BQ

added ability to pass in what GQ state to interpret as "missing" as well as skeleton for creating the metdata tsv (sample list, what interval list was use, the interval list's md5, and what GQ is signified by "missing".  Also put some util files in a folder under BlahVariantWalkerUtils for now (should get rid of them eventually but nice to have for now)

spacing + adding optional to GQ arguments (whoops)

rename intervalListPath to  intervalListBlob

pass thru sample name

swap out errors thrown

use L for interval list

This doesn't have the level of validation that we need, but works for now
Ideally the interval list is required to be a picard interval list file

md5 interval list concatenated string

wip split xsvs by contig

add vet stripping initial changes

review vet trim with Laura

fix overzealous missing bug

laura seal of approval on schema

add back call_DP until we are sure evoq doesn't need it

write chr dirs on the fly

create dirs for chrs w pet and vet dirs inside

add sample id to metadata

semantics for sample id

move position and sample into main method for easier access

add sample id mapping and use it

add enum for chr index

use longs for the chr addition

add array schema and enum

add bool for array

update array options with sample id and chr adjustment

wip to remove chr separation

new sample directory structure

let the rename begin

fixup on walker for arrays

more renames

better comments

fixup on ingest walker

* unrelated conflict--maybe merge mistake?

* cleanup

* swap method for indexed alleles

* add genome specific VET creation

* move mode enum into common code

* wip on adding enums as params
lbergelson mentioned this pull request Mar 17, 2023