miR target enrichment for mouse

  • The analysis requires a miR target database file (target db) as input.
    Examples are in /ngs001/dena_public/DenaAnalysis/Diana/miR_targets_lists1/:

    Conserved_families_conserved_targets.txt
    Conserved_Family_All_targets.txt
    miR_table3_non_conserved_families_all_targets.txt

  • Script to convert the miR target file (target db) to a GSEA input file (*.gmt):


 perl /ngs001/dena_public/DENAScripts/perl_for_mir.pl miR_targets_lists1/miR_table3_non_conserved_families_all_targets.txt miR_table3_non_conserved_families_all_targets.gmt
 perl /ngs001/dena_public/DENAScripts/perl_for_mir.pl miR_targets_lists1/Conserved_families_conserved_targets.txt Conserved_families_conserved_targets.gmt
 perl /ngs001/dena_public/DENAScripts/perl_for_mir.pl miR_targets_lists1/Conserved_Family_All_targets.txt Conserved_Family_All_targets.gmt


Note: make sure the miR names contain no "/" characters.
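
A minimal sketch of what such a conversion could look like. This is not perl_for_mir.pl itself (its source is not shown here). Assumptions: the target table is tab-separated with no header line, the miR family is in column 1 and the gene symbol in column 3; the function name targets_to_gmt and the "na" description field are made up for illustration. Each GMT line holds a set name, a description, and then one gene per column. Any "/" in a miR name is replaced with "_".

```shell
# Hypothetical helper: convert a tab-separated target table to GMT.
# Assumed layout: no header line, miR family in column 1, gene symbol in column 3.
targets_to_gmt() {
    awk -F'\t' '
        {
            name = $1
            gsub("/", "_", name)                  # miR names must not contain "/"
            key = name SUBSEP $3
            if (key in seen) next                 # drop duplicate miR/gene pairs
            seen[key] = 1
            if (!(name in genes)) order[++n] = name
            genes[name] = genes[name] "\t" $3     # append gene as a new column
        }
        END {
            # GMT line: set name, description, then the genes
            for (i = 1; i <= n; i++)
                printf "%s\tna%s\n", order[i], genes[order[i]]
        }
    ' "$1"
}
```

Example use: targets_to_gmt Conserved_Family_All_targets.txt > Conserved_Family_All_targets.gmt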

  • Script to count the target genes per miR in two gene sets: the small gene list and all genes analysed (arguments: target db, all-genes list, small gene list, output file). Its output is required for the next step, the hypergeometric test.

perl /ngs001/dena_public/DENAScripts/prepare_for_chitest_mir.pl miR_table3_non_conserved_families_all_targets.txt RNAseq_DicerVsWT_allgenes.txt RNAseq_DicerVsWT_Up2FC.txt  miR_table3_non_conserved_families_all_targets_RNAseq_DicerVsWT_Up2FC.count.csv
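
A rough sketch of the counting step, to show which four numbers the hypergeometric test needs per miR. This is not prepare_for_chitest_mir.pl itself. Assumptions: the target table is tab-separated with no header line (miR in column 1, gene symbol in column 3), both gene lists are non-empty and hold one gene symbol per line; the function name and the output column names are made up for illustration.

```shell
# Hypothetical helper: count targets per miR in the small list and in all genes.
# Usage: count_targets_per_mir targets.txt all_genes.txt small_list.txt
count_targets_per_mir() {
    awk -F'\t' -v OFS=',' '
        FNR == 1 { fileno++ }                     # track which input file is being read
        fileno == 1 {                             # all genes analysed
            if (!($1 in all)) { all[$1] = 1; n_all++ }
            next
        }
        fileno == 2 {                             # small gene list (restricted to analysed genes)
            if (($1 in all) && !($1 in small)) { small[$1] = 1; n_small++ }
            next
        }
        {                                         # target table
            pair = $1 SUBSEP $3
            if ((pair in seen) || !($3 in all)) next
            seen[pair] = 1
            if (!($1 in in_all)) order[++n] = $1
            in_all[$1]++
            if ($3 in small) in_small[$1]++
        }
        END {
            print "miR", "targets_in_list", "list_size", "targets_in_all", "all_size"
            for (i = 1; i <= n; i++)
                print order[i], in_small[order[i]] + 0, n_small, in_all[order[i]], n_all
        }
    ' "$2" "$3" "$1"
}
```

Only targets that appear in the all-genes list are counted, so the small list and the miR targets are drawn from the same background.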

 

  • The hypergeometric test is run by the R script:

    /ngs001/dena_public/DenaAnalysis/Diana/miR_targets_lists1/R-hyper-script.R
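
For orientation, a sketch of the test itself. The contents of R-hyper-script.R are not shown here, so this is the standard enrichment p-value, not necessarily what the script computes. With N = all genes analysed, K = targets of the miR among them, n = size of the small gene list and k = targets found in the small list, the p-value is the upper tail P(X >= k). In R this tail is phyper(k - 1, K, N - K, n, lower.tail = FALSE). The awk version below (function name hyper_pvalue is made up) sums the same tail using log-factorials.

```shell
# Hypothetical helper: upper-tail hypergeometric p-value P(X >= k).
# Usage: hyper_pvalue N K n k
#   N = all genes analysed, K = targets of the miR among them,
#   n = size of the small gene list, k = targets found in the small list.
hyper_pvalue() {
    awk -v N="$1" -v K="$2" -v n="$3" -v k="$4" '
        function lfact(x,   i, s) { s = 0; for (i = 2; i <= x; i++) s += log(i); return s }
        function lchoose(a, b) { return lfact(a) - lfact(b) - lfact(a - b) }
        BEGIN {
            hi = (K < n) ? K : n                  # largest possible overlap
            p = 0
            for (x = k; x <= hi; x++) {
                if (n - x > N - K) continue       # impossible: not enough non-targets
                p += exp(lchoose(K, x) + lchoose(N - K, n - x) - lchoose(N, n))
            }
            printf "%.6g\n", p
        }'
}
```

The log-factorials are recomputed on every call, which is slow for large gene lists; the sketch is meant for checking single values by hand, not for replacing the R script.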

     

  • The R script was updated to run from the command line (R-hyper-script.commandline.R):

First list all count files that need processing and build a shell command file that runs the R script on each of them.

[bsdena@localhost miR_targets_lists1]$ ls -1 *140816.counts.tsv | sed 's/.*/Rscript R-hyper-script.commandline.R &/'
Rscript R-hyper-script.commandline.R Conserved_families_conserved_targets_RNA_new_down_Pv0.05_140816.counts.tsv
Rscript R-hyper-script.commandline.R Conserved_families_conserved_targets_RNA_new_up_Pv0.05_140816.counts.tsv
Rscript R-hyper-script.commandline.R Conserved_Family_All_targets_RNA_new_down_Pv0.05_140816.counts.tsv
Rscript R-hyper-script.commandline.R Conserved_Family_All_targets_RNA_new_up_Pv0.05_140816.counts.tsv
Rscript R-hyper-script.commandline.R miR_table3_non_conserved_families_all_targets_RNA_new_down_Pv0.05_140816.counts.tsv
Rscript R-hyper-script.commandline.R miR_table3_non_conserved_families_all_targets_RNA_new_up_Pv0.05_140816.counts.tsv
[bsdena@localhost miR_targets_lists1]$ ls -1 *140816.counts.tsv | sed 's/.*/Rscript R-hyper-script.commandline.R &/' > R_commands_140816.txt

[bsdena@localhost miR_targets_lists1]$ sh R_commands_140816.txt
[1] "reading file Conserved_families_conserved_targets_RNA_new_down_Pv0.05_140816.counts.tsv"
[1] "created file Conserved_families_conserved_targets_RNA_new_down_Pv0.05_140816.hyper.tsv"
Time difference of 0.02397466 secs
[1] "reading file Conserved_families_conserved_targets_RNA_new_up_Pv0.05_140816.counts.tsv"
[1] "created file Conserved_families_conserved_targets_RNA_new_up_Pv0.05_140816.hyper.tsv"
Time difference of 0.02270269 secs
[1] "reading file Conserved_Family_All_targets_RNA_new_down_Pv0.05_140816.counts.tsv"
[1] "created file Conserved_Family_All_targets_RNA_new_down_Pv0.05_140816.hyper.tsv"
Time difference of 0.03952098 secs
[1] "reading file Conserved_Family_All_targets_RNA_new_up_Pv0.05_140816.counts.tsv"
[1] "created file Conserved_Family_All_targets_RNA_new_up_Pv0.05_140816.hyper.tsv"
Time difference of 0.0333159 secs
[1] "reading file miR_table3_non_conserved_families_all_targets_RNA_new_down_Pv0.05_140816.counts.tsv"
[1] "created file miR_table3_non_conserved_families_all_targets_RNA_new_down_Pv0.05_140816.hyper.tsv"
Time difference of 0.1291921 secs
[1] "reading file miR_table3_non_conserved_families_all_targets_RNA_new_up_Pv0.05_140816.counts.tsv"
[1] "created file miR_table3_non_conserved_families_all_targets_RNA_new_up_Pv0.05_140816.hyper.tsv"
Time difference of 0.1333916 secs
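
When only some count files need (re)processing, the command list can be limited to files without a result. A small sketch, assuming the naming seen in the log above (X.counts.tsv produces X.hyper.tsv); the function name pending_r_commands is made up.

```shell
# Hypothetical helper: print an Rscript command for each count file that has
# no matching *.hyper.tsv result yet (naming taken from the log above).
pending_r_commands() {
    for f in "$@"; do
        result="${f%.counts.tsv}.hyper.tsv"
        [ -e "$result" ] || printf 'Rscript R-hyper-script.commandline.R %s\n' "$f"
    done
}
# Example: pending_r_commands *140816.counts.tsv > R_commands_140816.txt
```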

 

 

 

Log for the complete pipeline:

 

wget http://www.targetscan.org/mmu_71/mmu_71_data_download/Conserved_Family_Conserved_Targets_Info.txt.zip
wget http://www.targetscan.org/mmu_71/mmu_71_data_download/Conserved_Family_Info.txt.zip
wget http://www.targetscan.org/mmu_71/mmu_71_data_download/Nonconserved_Family_Info.txt.zip

unzip Conserved_Family_Conserved_Targets_Info.txt.zip
unzip Conserved_Family_Info.txt.zip
unzip Nonconserved_Family_Info.txt.zip

# keep only mouse rows (species ID 10090 in column 5)
awk -F'\t' '$5 == 10090' Conserved_Family_Info.txt > Conserved_Family_Info_mouse.txt
awk -F'\t' '$5 == 10090' Nonconserved_Family_Info.txt > Nonconserved_Family_Info_mouse.txt
awk -F'\t' '$5 == 10090' Predicted_Targets_Info.txt > Predicted_Targets_Info_mouse.txt

 

perl /ngs001/dena_public/DENAScripts/perl_for_mir.pl Conserved_Family_Info_mouse.txt Conserved_Family_Info_mouse.gmt
perl /ngs001/dena_public/DENAScripts/perl_for_mir.pl Nonconserved_Family_Info_mouse.txt Nonconserved_Family_Info_mouse.gmt
perl /ngs001/dena_public/DENAScripts/perl_for_mir.pl Predicted_Targets_Info_mouse.txt Predicted_Targets_Info_mouse.gmt
perl /ngs001/dena_public/DENAScripts/prepare_for_chitest_mir.pl Conserved_Family_Info_mouse.txt RNA_total_list_140816.txt RNA_new_up_Pv0.05_140816.txt Conserved_Family_Info_mouse_up.counts.csv
perl /ngs001/dena_public/DENAScripts/prepare_for_chitest_mir.pl Conserved_Family_Info_mouse.txt RNA_total_list_140816.txt RNA_new_down_Pv0.05_140816.txt Conserved_Family_Info_mouse_down.counts.csv
perl /ngs001/dena_public/DENAScripts/prepare_for_chitest_mir.pl Nonconserved_Family_Info_mouse.txt RNA_total_list_140816.txt RNA_new_down_Pv0.05_140816.txt Nonconserved_Family_Info_mouse_down.counts.csv
perl /ngs001/dena_public/DENAScripts/prepare_for_chitest_mir.pl Nonconserved_Family_Info_mouse.txt RNA_total_list_140816.txt RNA_new_up_Pv0.05_140816.txt Nonconserved_Family_Info_mouse_up.counts.csv
perl /ngs001/dena_public/DENAScripts/prepare_for_chitest_mir.pl Predicted_Targets_Info_mouse.txt RNA_total_list_140816.txt RNA_new_up_Pv0.05_140816.txt Predicted_Targets_Info_mouse_up.counts.csv
perl /ngs001/dena_public/DENAScripts/prepare_for_chitest_mir.pl Predicted_Targets_Info_mouse.txt RNA_total_list_140816.txt RNA_new_down_Pv0.05_140816.txt Predicted_Targets_Info_mouse_down.counts.csv

ls -1 *.counts.csv | sed 's/.*/Rscript R-hyper-script.commandline.R &/' > R_commands_mouse_only
sh R_commands_mouse_only
# collect rows with column 9 <= 0.05 from all result files, prefixed with the source file name
awk -F',' '$9 <= 0.05 {print FILENAME "," $0}' *hyper.csv > all_padj0.05.csv
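
A more defensive variant of the last step, as a sketch only. It assumes each result file is comma-separated with a header line and that the adjusted p-value column can be found by name; the column name passed in (for example "padj") and the function name collect_significant are assumptions, since the layout of the *hyper.csv files is not shown here. Non-numeric values such as NA are skipped instead of being compared.

```shell
# Hypothetical helper: collect significant rows from several result files.
# $1 = cutoff, $2 = name of the adjusted p-value column, rest = CSV files with a header line.
collect_significant() {
    cutoff=$1; col=$2; shift 2
    awk -F',' -v OFS=',' -v cutoff="$cutoff" -v col="$col" '
        FNR == 1 {                                # header: locate the column by name
            idx = 0
            for (i = 1; i <= NF; i++) {
                h = $i
                gsub(/"/, "", h)                  # tolerate quoted header names
                if (h == col) idx = i
            }
            next
        }
        # keep rows whose value is numeric and at or below the cutoff
        idx && ($idx ~ /^[0-9.eE+-]+$/) && (($idx + 0) <= cutoff) { print FILENAME, $0 }
    ' "$@"
}
# Example: collect_significant 0.05 padj *hyper.csv > all_padj0.05.csv
```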