VCU - Variants correction by UMI
Origin VCF | Type (level of confidence; A = high) | # of UMI groups (a group is a set of reads located at the same start position). Evidence against PCR duplication: 1 = weak evidence, >1 = strong evidence | # of reads with the same variant and the same UMI (sum over all starts) | # of incompatible cuttings. Evidence for PCR duplication | Mean, over all incompatible start groups, of the ratio of reads with the mutation to incompatible reads without the mutation (or with other mutations) | # of different UMIs observed without the mutation. Evidence against DNA mutation | # of reads without the variant and with a different UMI (sum over all UMIs and starts) | # of different UMIs with a mutation (the same or another). Evidence for DNA mutation | Mean, over all incompatible UMIs, of the per-UMI mean ratio of reads with a mutation (including other mutations) to reads without the mutation. This value gives no information about the number of cuttings within each incompatible UMI. | Description | Meaning |
---|---|---|---|---|---|---|---|---|---|---|---|
Variants fully compatible | |||||||||||
n/0:m/0;0/c;0/e | A | 2 | n+m | 0 | 0 | 2 | c+e | 0 | 0 | Three different UMIs. The first UMI has more than one group of start positions (2 groups), with n and m reads carrying the mutation and 0 reads without it; the other UMIs have 0 reads with the mutation and c and e reads matching the reference | Strong evidence against PCR duplication + evidence against DNA mutation |
n/0;0/c;0/e | B | 1 | n | 0 | 0 | 2 | c+e | 0 | 0 | Three different UMIs. The first UMI has n reads with the mutation and 0 reads without it; the other UMIs have 0 reads with the mutation and c and e reads matching the reference | Evidence against PCR duplication + evidence against DNA mutation |
n/0:m/0 | C | 2 | n+m | 0 | 0 | 0 | 0 | 0 | 0 | A single UMI with more than one group of start positions (2 groups), with n and m reads carrying the mutation and 0 reads without it | Strong evidence against PCR duplication |
n/0 | D | 1 | n | 0 | 0 | 0 | 0 | 0 | 0 | A single UMI with n reads with the mutation and 0 reads without it, all starting at the same position | Evidence against PCR duplication (weak evidence, because the mutation can appear at an early stage of the PCR) |
Variants incompatible | |||||||||||
n/0:m/0:x/a:z/b;0/c;0/e | EA | 2 | n+m | 2 | mean(a/x,b/z) | 2 | c+e | 0 | 0 | Like type A, plus 2 additional groups with other start positions that are not 100% mutated. | PCR duplication |
n/0:m/0;0/c;0/e;a/z:d/w;b/x:f/y | EB | 2 | n+m | 0 | 0 | 2 | c+e | 2 | mean(mean(a/z,d/w),mean(b/x,f/y)) | Like type A, plus 2 additional UMIs that contain the mutation. | DNA-level mutation |
n/0:m/0:x/a:z/b;0/c;d/y;e/w | EC | 2 | n+m | 2 | mean(a/x,b/z) | 1 | c | 1 | mean(d/y,e/w) | Like type A, plus 1 additional UMI that contains the mutation and 2 additional groups with other start positions that are not 100% mutated | PCR duplication + DNA-level mutation |
n/0:x/f:w/d;0/c | FA | 1 | n | 2 | mean(f/x,d/w) | 1 | c | 0 | 0 | Like type B, plus 2 additional groups with other start positions that are not 100% mutated. | PCR duplication |
n/0;0/c;0/e;a/z;b/x | FB | 1 | n | 0 | 0 | 2 | c+e | 2 | mean(a/z,b/x) | Like type B, plus 2 additional UMIs that contain the mutation. | DNA-level mutation |
n/0:x/f:w/d;0/c;d/y;e/w | FC | 1 | n | 2 | mean(f/x,d/w) | 1 | c | 1 | mean(d/y,e/w) | Like type B, plus 1 additional UMI that contains the mutation and 2 additional groups with other start positions that are not 100% mutated | PCR duplication + DNA-level mutation |
n/0:m/0:x/a:z/b | GA | 2 | n+m | 2 | mean(a/x,b/z) | 0 | 0 | 0 | 0 | Like type C, plus 2 additional groups with other start positions that are not 100% mutated. | PCR duplication |
n/0:m/0;a/z;b/x | GB | 2 | n+m | 0 | 0 | 0 | 0 | 2 | mean(a/z,b/x) | Like type C, plus 2 additional UMIs that contain the mutation. | DNA-level mutation |
n/0:m/0:x/a:z/b;d/y;e/w | GC | 2 | n+m | 2 | mean(a/x,b/z) | 0 | 0 | 1 | mean(d/y,e/w) | Like type C, plus 1 additional UMI that contains the mutation and 2 additional groups with other start positions that are not 100% mutated | PCR duplication + DNA-level mutation |
n/0:x/a:z/b | HA | 1 | n | 2 | mean(a/x,b/z) | 0 | 0 | 0 | 0 | Like type D, plus 2 additional groups with other start positions that are not 100% mutated. | PCR duplication |
n/0;a/z;b/x | HB | 1 | n | 0 | 0 | 0 | 0 | 2 | mean(a/z,b/x) | Like type D, plus 2 additional UMIs that contain the mutation. | DNA-level mutation |
n/0:x/a:z/b;d/y;e/w | HC | 1 | n | 2 | mean(a/x,b/z) | 0 | 0 | 1 | mean(d/y,e/w) | Like type D, plus 1 additional UMI that contains the mutation and 2 additional groups with other start positions that are not 100% mutated | PCR duplication + DNA-level mutation |
n/a | I | 0 | 0 | 1 | a/n | 0 | 0 | 0 | 0 | No cutting group with mutations in all reads. | strong PCR duplication |
Legend:
a, b, c, d, e, n, m, v, t > 0
z, x, y, w >= 0
When a ratio would require dividing by zero, we add 0.5.
; separates different UMIs
: separates different cuttings (reads starting at various genomic locations within a gene)
| separates different variants
For a given UMI and position, each entry reads: reads with the mutation / reads without the mutation (or with another mutation)
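
The notation above can be read mechanically. As a minimal sketch, assuming the symbolic counts are replaced by integers (for example `12/0:7/0;0/30;0/25`), the following Python splits one variant pattern into UMIs and cuttings. The names `parse_variant`, `parse_record` and `safe_ratio` are hypothetical and not part of VCU. `safe_ratio` further assumes that the 0.5 is added to the denominator, and only when the denominator is zero; the legend does not spell this out.

```python
from typing import List, Tuple

# One cutting: (reads with the mutation, reads without the mutation)
Cutting = Tuple[int, int]


def parse_variant(pattern: str) -> List[List[Cutting]]:
    """Parse one variant pattern into a list of UMIs, each a list of cuttings."""
    umis = []
    for umi_text in pattern.split(";"):          # ';' separates UMIs
        cuttings = []
        for cutting_text in umi_text.split(":"):  # ':' separates cuttings
            mutated, unmutated = cutting_text.split("/")
            cuttings.append((int(mutated), int(unmutated)))
        umis.append(cuttings)
    return umis


def parse_record(record: str) -> List[List[List[Cutting]]]:
    """Parse several variants separated by '|'."""
    return [parse_variant(p) for p in record.split("|")]


def safe_ratio(numerator: float, denominator: float) -> float:
    """Ratio with the 0.5 correction (assumed to apply to a zero denominator)."""
    if denominator == 0:
        denominator += 0.5
    return numerator / denominator
```

For instance, `parse_variant("12/0:7/0;0/30;0/25")` yields three UMIs, the first with two cuttings, which corresponds to the type A layout in the table.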
Definitions:
Proof from cutting - all sequences of one cutting of a UMI contain the variant.
Support from other UMIs - there are other UMIs without the variant.
Types of validation:
Fully compatible variants:
A - proof from more than one cutting, with support from other UMIs.
B - proof from only one cutting, with support from other UMIs.
C - proof from more than one cutting.
D - proof from only one cutting.
Incompatible variants:
E - proof from more than one cutting, with support from other UMIs, but with incompatibilities.
F - proof from only one cutting, with support from other UMIs, but with incompatibilities.
G - proof from more than one cutting, but with incompatibilities.
H - proof from only one cutting, but with incompatibilities.
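
The type letters can be derived from the counts in the table. The sketch below is one possible reading of the rules, not the VCU implementation. It assumes that the first UMI in the pattern is the one carrying the variant, that a UMI counts as "with mutation" if any of its cuttings has a mutated read, and that the second letter (A, B, C) follows the table: incompatible cuttings only, other mutated UMIs only, or both. The function name `classify` and the dictionary keys are hypothetical.

```python
from typing import Dict, List, Tuple

# One cutting: (reads with the mutation, reads without the mutation)
Cutting = Tuple[int, int]


def classify(umis: List[List[Cutting]]) -> Dict[str, object]:
    """Assign a type to one variant given its UMIs and cuttings."""
    primary, others = umis[0], umis[1:]  # assumption: first UMI carries the variant

    # Cuttings of the primary UMI in which every read has the mutation
    proving = [c for c in primary if c[0] > 0 and c[1] == 0]
    # Cuttings of the primary UMI that contain reads without the mutation
    incompatible = [c for c in primary if c[1] > 0]
    umis_without = [u for u in others if all(m == 0 for m, _ in u)]
    umis_with = [u for u in others if any(m > 0 for m, _ in u)]

    if not proving:
        vtype = "I"  # no cutting group is fully mutated
    else:
        key = (len(proving) > 1, bool(umis_without))
        base = {(True, True): "A", (False, True): "B",
                (True, False): "C", (False, False): "D"}[key]
        if incompatible and umis_with:
            suffix = "C"
        elif incompatible:
            suffix = "A"
        elif umis_with:
            suffix = "B"
        else:
            suffix = ""
        # A-D become E-H once any incompatibility is present
        vtype = {"A": "E", "B": "F", "C": "G", "D": "H"}[base] + suffix if suffix else base

    return {
        "type": vtype,
        "umi_groups": len(proving),
        "reads_with_variant": sum(m for m, _ in proving),
        "incompatible_cuttings": len(incompatible),
        "umis_without_mutation": len(umis_without),
        "reads_without_variant": sum(r for u in umis_without for _, r in u),
        "umis_with_mutation": len(umis_with),
    }
```

With integer counts, `classify([[(12, 0), (7, 0)], [(0, 30)], [(0, 25)]])` gives type `A`, and adding two partly mutated cuttings to a type B layout, as in `classify([[(12, 0), (3, 4), (0, 5)], [(0, 30)]])`, gives `FA`.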