I have a large VCF file from which I want to extract certain INFO fields, with each value matched to its variant location. I thought I had this working, but for some variants the row is labelled with the variant ID instead of the variant location.

My code looks like this:

library(VariantAnnotation)

# see what fields are in this vcf file
scanVcfHeader("file.vcf")

# define parameters on how to filter the vcf file
AN.adj.param <- ScanVcfParam(info="AN_Adj")

# load ALL allele counts (AN) from vcf file 
raw.AN.adj. <- readVcf("file.vcf", "hg19", param=AN.adj.param)

# extract ALL allele counts (AN) and the corresponding chr location with allele tags from the vcf file (returned as an S4 DataFrame)
sclass.AN.adj <- info(raw.AN.adj.)

The result looks like this:

               AN_Adj
1:13475_A/T    91
1:14321_G/A    73
rs12345        87
1:15372_A/G    60
1:16174_G/A    41
1:16174_T/C    62
1:16576_G/A    87
rs987654       56

I would like the result to look like this:

               AN_Adj
1:13475_A/T    91
1:14321_G/A    73
1:14873_C/T    87
1:15372_A/G    60
1:16174_G/A    41
1:16174_T/C    62
1:16576_G/A    87
1:18654_A/T    56

Any ideas on what is going on here and how to fix it?

I would also be happy with a way to build the variant location from the CHROM and POS fields. From what I have read, though, data from these fields cannot be requested, because they are essential fields used to create the GRanges of variant locations.
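A rough sketch of the relabelling I have in mind is below. As far as I understand, readVcf() names each row after the ID column when it is set and falls back to CHROM:POS_REF/ALT otherwise, which would match the output above, so rebuilding the labels from the variant ranges might be enough. `make_variant_labels` is a hypothetical helper of my own, not part of VariantAnnotation, and the toy values are taken from the desired output above. The commented lines at the end assume that rowRanges(), ref() and alt() behave on my object as documented; I have not checked how they perform on a file this large.

```r
# Hypothetical helper (not part of VariantAnnotation): build
# "CHROM:POS_REF/ALT" labels from plain vectors of equal length.
make_variant_labels <- function(chrom, pos, ref, alt) {
  # as.integer() avoids scientific notation for positions stored as doubles
  paste0(chrom, ":", as.integer(pos), "_", ref, "/", alt)
}

# Toy stand-in for the info() result, using the two rows that were
# labelled by ID in the output above.
toy <- data.frame(AN_Adj = c(87, 56), row.names = c("rs12345", "rs987654"))
rownames(toy) <- make_variant_labels(
  chrom = c("1", "1"),
  pos   = c(14873, 18654),
  ref   = c("C", "A"),
  alt   = c("T", "T")
)
print(toy)

# With the objects from my code, the inputs could come from the variant
# ranges (assumes library(VariantAnnotation) is attached):
# rr      <- rowRanges(raw.AN.adj.)
# alt.chr <- vapply(alt(raw.AN.adj.),
#                   function(a) paste(as.character(a), collapse = ","),
#                   character(1))
# rownames(sclass.AN.adj) <- make_variant_labels(
#   as.character(seqnames(rr)), start(rr),
#   as.character(ref(raw.AN.adj.)), alt.chr)
```

Multi-allelic sites would get their ALT alleles joined with commas in this sketch, which differs from the one-row-per-allele labels in my desired output, so that part would still need adjusting.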
