VCU - Variants correction by UMI

Columns

Origin VCF - the read pattern of the variant, in the notation explained in the legend after the table.

Type - level of confidence (A is the highest).

Evidence against PCR duplication

(1) # of UMI groups; a group is the set of reads located at the same start position. 1: weak evidence. >1: strong evidence.

(2) # of reads with the same variant and the same UMI (sum over all starts).

Evidence for PCR duplication

(3) # of incompatible cuttings.

(4) Mean, over all incompatible start groups, of the ratio between reads with the mutation and reads without it (or with other mutations).

Evidence against DNA mutation

(5) # of different UMIs observed without the mutation.

(6) # of reads without the variant and with a different UMI (sum over all UMIs and starts).

Evidence for DNA mutation

(7) # of different UMIs with a mutation (the same one or another).

(8) Mean of the per-UMI means of the ratio between reads with a mutation (including other mutations) and reads without a mutation, over all incompatible UMIs. This value gives no information about the number of cuttings within each incompatible UMI.

Summary table

Origin VCF                       Type  (1)  (2)  (3)  (4)            (5)  (6)  (7)  (8)
Variants fully compatible
n/0:m/0;0/c;0/e                  A     2    n+m  0    0              2    c+e  0    0
n/0;0/c;0/e                      B     1    n    0    0              2    c+e  0    0
n/0:m/0                          C     2    n+m  0    0              0    0    0    0
n/0                              D     1    n    0    0              0    0    0    0
Variants incompatible
n/0:m/0:x/a:z/b;0/c;0/e          EA    2    n+m  2    mean(a/x,b/z)  2    c+e  0    0
n/0:m/0;0/c;0/e;a/z:d/w;b/x:f/y  EB    2    n+m  0    0              2    c+e  2    mean(mean(a/z,d/w),mean(b/x,f/y))
n/0:m/0:x/a:z/b;0/c;d/y;e/w      EC    2    n+m  2    mean(a/x,b/z)  1    c    1    mean(d/y,e/w)
n/0:x/f:w/d;0/c                  FA    1    n    2    mean(f/x,d/w)  1    c    0    0
n/0;0/c;0/e;a/z;b/x              FB    1    n    0    0              2    c+e  2    mean(a/z,b/x)
n/0:x/f:w/d;0/c;d/y;e/w          FC    1    n    2    mean(f/x,d/w)  1    c    1    mean(d/y,e/w)
n/0:m/0:x/a:z/b                  GA    2    n+m  2    mean(a/x,b/z)  0    0    0    0
n/0:m/0;a/z;b/x                  GB    2    n+m  0    0              0    0    2    mean(a/z,b/x)
n/0:m/0:x/a:z/b;d/y;e/w          GC    2    n+m  2    mean(a/x,b/z)  0    0    1    mean(d/y,e/w)
n/0:x/a:z/b                      HA    1    n    2    mean(a/x,b/z)  0    0    0    0
n/0;a/z;b/x                      HB    1    n    0    0              0    0    2    mean(a/z,b/x)
n/0:x/a:z/b;d/y;e/w              HC    1    n    2    mean(a/x,b/z)  0    0    1    mean(d/y,e/w)
n/a                              I     0    0    1    a/n            0    0    0    0

Description and meaning by type

Variants fully compatible

A (illustrated in A.png) - Three different UMIs. The first UMI has more than one group (2 groups) of start mappings, with n and m reads with the mutation and 0 reads without it; the other UMIs have 0 reads with the mutation and c and e reads matching the reference. Meaning: strong evidence against PCR duplication + evidence against DNA mutation.

B - Three different UMIs. The first UMI has n reads with the mutation and 0 reads without it; the other UMIs have 0 reads with the mutation and c and e reads matching the reference. Meaning: evidence against PCR duplication + evidence against DNA mutation.

C - More than one group (2 groups) of start mappings of a single UMI, with n and m reads with the mutation and 0 reads without it. Meaning: strong evidence against PCR duplication.

D - A single UMI with n reads with the mutation and 0 reads without it, all starting at the same position. Meaning: evidence against PCR duplication (weak, because the mutation can appear at an early stage of the PCR).

Variants incompatible

EA - Like type A, plus 2 additional groups of other start mappings that are not 100% mutated. Meaning: PCR duplication.

EB - Like type A, plus 2 additional UMIs that contain the mutation. Meaning: DNA-level mutation.

EC - Like type A, plus 1 additional UMI that contains the mutation and 2 additional start mappings that are not 100% mutated. Meaning: PCR duplication + DNA-level mutation.

FA - Like type B, plus 2 additional groups of other start mappings that are not 100% mutated. Meaning: PCR duplication.

FB - Like type B, plus 2 additional UMIs that contain the mutation. Meaning: DNA-level mutation.

FC - Like type B, plus 1 additional UMI that contains the mutation and 2 additional start mappings that are not 100% mutated. Meaning: PCR duplication + DNA-level mutation.

GA - Like type C, plus 2 additional groups of other start mappings that are not 100% mutated. Meaning: PCR duplication.

GB - Like type C, plus 2 additional UMIs that contain the mutation. Meaning: DNA-level mutation.

GC - Like type C, plus 1 additional UMI that contains the mutation and 2 additional start mappings that are not 100% mutated. Meaning: PCR duplication + DNA-level mutation.

HA - Like type D, plus 2 additional groups of other start mappings that are not 100% mutated. Meaning: PCR duplication.

HB - Like type D, plus 2 additional UMIs that contain the mutation. Meaning: DNA-level mutation.

HC - Like type D, plus 1 additional UMI that contains the mutation and 2 additional start mappings that are not 100% mutated. Meaning: PCR duplication + DNA-level mutation.

I - No cutting group has the mutation in all of its reads. Meaning: strong PCR duplication.
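
As a worked illustration of the two mean columns in the table, here is a minimal Python sketch. It follows the table entries as written: for incompatible cuttings x/a and z/b the value is mean(a/x, b/z), and for additional UMIs a/z:d/w and b/x:f/y it is mean(mean(a/z, d/w), mean(b/x, f/y)). The function names and the (mutated, non_mutated) tuple layout are hypothetical. The sketch also assumes that the 0.5 correction is applied to the denominator, and only when the denominator is zero; the text does not say exactly where the 0.5 is added.

```python
from statistics import mean


def safe_ratio(numerator, denominator):
    # Assumption: 0.5 replaces a zero denominator; other values are used as-is.
    return numerator / (denominator if denominator != 0 else 0.5)


def pcr_ratio(incompatible_cuttings):
    """Mean over incompatible cuttings of one UMI, as in mean(a/x, b/z).

    Each cutting is a (mutated, non_mutated) pair, i.e. x/a is (x, a).
    Returns 0 when there are no incompatible cuttings, as in the table.
    """
    if not incompatible_cuttings:
        return 0
    return mean(safe_ratio(non_mut, mut) for mut, non_mut in incompatible_cuttings)


def dna_ratio(mutated_umis):
    """Mean of per-UMI means, as in mean(mean(a/z, d/w), mean(b/x, f/y)).

    Each UMI is a list of (mutated, non_mutated) pairs, i.e. a/z is (a, z).
    Returns 0 when no other UMI carries a mutation, as in the table.
    """
    if not mutated_umis:
        return 0
    return mean(
        mean(safe_ratio(mut, non_mut) for mut, non_mut in umi)
        for umi in mutated_umis
    )


# Cuttings 1/4 and 0/2 give mean(4/1, 2/0.5) = 4.0
print(pcr_ratio([(1, 4), (0, 2)]))
# UMIs 2/1:1/0 and 3/2:1/1 give mean(mean(2, 2), mean(1.5, 1)) = 1.625
print(dna_ratio([[(2, 1), (1, 0)], [(3, 2), (1, 1)]]))
```

Taking the mean of per-UMI means, rather than one mean over all cuttings, gives every UMI the same weight regardless of how many cuttings it has.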


Legends:

a, b, c, d, e, n, m, v, t > 0

z, x, y, w >= 0

When a ratio would require dividing by zero, we add 0.5.

;  separates different UMIs

:  separates different cuttings (reads that start at different genomic locations within a gene)

|  separates different variants

For a given UMI and position, each entry is written as: reads with the mutation / reads without the mutation (or with another mutation)
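
The notation above can be read mechanically. The following Python sketch is a hypothetical helper, not part of VCU itself: it assumes a concrete pattern carries integer counts in place of the letters (for example 5/0:3/0;0/4;0/2) and turns it into nested lists of variants, UMIs and cuttings.

```python
def parse_pattern(text):
    """Parse a pattern into variants -> UMIs -> cuttings.

    Each cutting becomes a (mutated, non_mutated) tuple of integers.
    Assumption: counts are plain integers and every cutting has one "/".
    """
    variants = []
    for variant in text.strip().split("|"):      # | separates variants
        umis = []
        for umi in variant.split(";"):           # ; separates UMIs
            cuttings = []
            for cutting in umi.split(":"):       # : separates cuttings
                mutated, non_mutated = cutting.split("/")
                cuttings.append((int(mutated), int(non_mutated)))
            umis.append(cuttings)
        variants.append(umis)
    return variants


# A type-A-like pattern: one UMI with two fully mutated cuttings,
# and two other UMIs that only show the reference.
print(parse_pattern("5/0:3/0;0/4;0/2"))
# [[[(5, 0), (3, 0)], [(0, 4)], [(0, 2)]]]
```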

Definitions:

Proof from cutting - all sequences of one cutting of a UMI contain the variant.

Support from other UMIs - there are other UMIs that do not contain the variant.

Types of validation:

Fully compatible variants:

A - proof from more than one cutting, with support from other UMIs.

B - proof from only one cutting, with support from other UMIs.

C - proof from more than one cutting.

D - proof from only one cutting.

Incompatible variants:

E - proof from more than one cutting, with support from other UMIs, but with incompatibilities.

F - proof from only one cutting, with support from other UMIs, but with incompatibilities.

G - proof from more than one cutting, but with incompatibilities.

H - proof from only one cutting, but with incompatibilities.
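
The type assignment can be sketched in Python as below. This is an illustration under stated assumptions rather than the actual VCU implementation: the first UMI in the list is taken to be the one that carries the variant, each UMI is a list of (mutated, non_mutated) cuttings, and the second letter of the incompatible types (A for incompatible cuttings, B for other UMIs with the mutation, C for both) is taken from the summary table. The function name classify is hypothetical.

```python
def classify(umis):
    """Return the type code (A-D, EA-HC or I) for one variant.

    Assumption: umis[0] is the variant-carrying UMI; the rest are other UMIs.
    """
    primary, others = umis[0], umis[1:]

    # Proof from cutting: every read of the cutting contains the variant.
    proofs = [c for c in primary if c[0] > 0 and c[1] == 0]
    # Incompatible cuttings: cuttings of the same UMI that are not 100% mutated.
    incompatible = [c for c in primary if c[1] > 0]
    # Support from other UMIs: UMIs without any read carrying the variant.
    support = [u for u in others if all(mut == 0 for mut, _ in u)]
    # Other UMIs that do contain a mutation.
    mutated_others = [u for u in others if any(mut > 0 for mut, _ in u)]

    if not proofs:
        return "I"  # no cutting group with the mutation in all reads

    if len(proofs) > 1:
        base = "A" if support else "C"
    else:
        base = "B" if support else "D"

    if not incompatible and not mutated_others:
        return base

    letter = {"A": "E", "B": "F", "C": "G", "D": "H"}[base]
    if incompatible and mutated_others:
        return letter + "C"
    return letter + ("A" if incompatible else "B")


print(classify([[(5, 0), (3, 0)], [(0, 4)], [(0, 2)]]))          # A
print(classify([[(5, 0), (1, 4), (0, 2)], [(2, 1)], [(1, 0)]]))  # HC
print(classify([[(5, 2)]]))                                      # I
```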