- The analysis requires miR target database files (target db) as input.
Examples are in /ngs001/dena_public/DenaAnalysis/Diana/miR_targets_lists1/:
Conserved_families_conserved_targets.txt
Conserved_Family_All_targets.txt
miR_table3_non_conserved_families_all_targets.txt
- Script to convert a miR target file (target db) into a GSEA input file (*.gmt):
perl /ngs001/dena_public/DENAScripts/perl_for_mir.pl miR_targets_lists1/miR_table3_non_conserved_families_all_targets.txt miR_table3_non_conserved_families_all_targets.gmt
perl /ngs001/dena_public/DENAScripts/perl_for_mir.pl miR_targets_lists/Conserved_families_conserved_targets.txt Conserved_families_conserved_targets.gmt
perl /ngs001/dena_public/DENAScripts/perl_for_mir.pl miR_targets_lists/Conserved_Family_All_targets.txt Conserved_Family_All_targets.gmt
Note: make sure there are no "/" characters in the miR names.
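As a rough illustration of what this conversion involves (not the actual perl_for_mir.pl, whose internals are not shown here), the Python sketch below groups target genes by miR and builds one GMT line per miR: set name, a description field, then the genes, all tab-separated. The column positions (miR name in column 1, gene symbol in column 3), the header line, the function names and the replacement of "/" by "_" are all assumptions made for the example.

```python
import csv
from collections import OrderedDict


def sanitize_mir_name(name):
    # Assumption: "/" is replaced with "_"; the notes only say it must not appear.
    return name.replace("/", "_")


def targets_to_gmt_lines(rows, mir_col=0, gene_col=2):
    """Group genes by miR and return GMT lines (name, description, genes...)."""
    sets = OrderedDict()
    for row in rows:
        if len(row) <= max(mir_col, gene_col):
            continue  # skip short or empty lines
        mir = sanitize_mir_name(row[mir_col].strip())
        gene = row[gene_col].strip()
        genes = sets.setdefault(mir, [])
        if gene and gene not in genes:
            genes.append(gene)
    # "na" is a placeholder for the GMT description field.
    return ["\t".join([mir, "na"] + genes) for mir, genes in sets.items()]


def convert_file(target_db_path, gmt_path, skip_header=True):
    """Read a tab-separated target table and write it out as a GMT file."""
    with open(target_db_path, newline="") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    if skip_header and rows:
        rows = rows[1:]
    with open(gmt_path, "w") as out:
        for line in targets_to_gmt_lines(rows):
            out.write(line + "\n")
```

Sanitizing the name while the sets are built means the "/" check does not have to be done by hand afterwards.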
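- Script to count the genes per miR in two lists: the small gene list and the list of all genes analysed. Its output is required for the next step, the hypergeometric test. In the example command the arguments are, in order: the target db, the list of all genes analysed, the small gene list, and the output file.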
perl /ngs001/dena_public/DENAScripts/prepare_for_chitest_mir.pl miR_table3_non_conserved_families_all_targets.txt RNAseq_DicerVsWT_allgenes.txt RNAseq_DicerVsWT_Up2FC.txt miR_table3_non_conserved_families_all_targets_RNAseq_DicerVsWT_Up2FC.count.csv
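The counting step can be pictured as follows. This is a hedged Python sketch, not the prepare_for_chitest_mir.pl script itself: for each miR it counts how many of its targets appear in the list of all genes analysed and how many of those also appear in the small gene list. The data structures, the function name and the layout of the result are assumptions for illustration.

```python
def count_targets_per_mir(mir_targets, all_genes, small_genes):
    """Return {mir: (targets_in_small_list, targets_in_all_genes)}.

    mir_targets: dict mapping miR name -> iterable of target gene names
    all_genes:   every gene analysed (the background)
    small_genes: the gene list of interest, e.g. up-regulated genes
    """
    background = set(all_genes)
    # Only genes that were actually analysed can count as hits.
    small = set(small_genes) & background
    counts = {}
    for mir, targets in mir_targets.items():
        targets = set(targets) & background
        counts[mir] = (len(targets & small), len(targets))
    return counts


example = count_targets_per_mir(
    {"miR-a": ["G1", "G2", "G9"], "miR-b": ["G3"]},
    all_genes=["G1", "G2", "G3", "G4"],
    small_genes=["G1", "G3"],
)
print(example)
```

Together with the sizes of the two gene lists, these two counts per miR are the quantities a hypergeometric test needs.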
- The hypergeometric test is run with an R script:
/ngs001/dena_public/DenaAnalysis/Diana/miR_targets_lists1/R-hyper-script.R
This script was updated to run from the command line.
First find all the count files that need to be processed and build a shell script that runs the R script on each of them:
[bsdena@localhost miR_targets_lists1]$ ls -1 *140816.counts.tsv | sed 's/.*/Rscript R-hyper-script.commandline.R &/'
Rscript R-hyper-script.commandline.R Conserved_families_conserved_targets_RNA_new_down_Pv0.05_140816.counts.tsv
Rscript R-hyper-script.commandline.R Conserved_families_conserved_targets_RNA_new_up_Pv0.05_140816.counts.tsv
Rscript R-hyper-script.commandline.R Conserved_Family_All_targets_RNA_new_down_Pv0.05_140816.counts.tsv
Rscript R-hyper-script.commandline.R Conserved_Family_All_targets_RNA_new_up_Pv0.05_140816.counts.tsv
Rscript R-hyper-script.commandline.R miR_table3_non_conserved_families_all_targets_RNA_new_down_Pv0.05_140816.counts.tsv
Rscript R-hyper-script.commandline.R miR_table3_non_conserved_families_all_targets_RNA_new_up_Pv0.05_140816.counts.tsv
[bsdena@localhost miR_targets_lists1]$ ls -1 *140816.counts.tsv | sed 's/.*/Rscript R-hyper-script.commandline.R &/' > R_commands_140816.txt
[bsdena@localhost miR_targets_lists1]$ sh R_commands_140816.txt
[1] "reading file Conserved_families_conserved_targets_RNA_new_down_Pv0.05_140816.counts.tsv"
[1] "created file Conserved_families_conserved_targets_RNA_new_down_Pv0.05_140816.hyper.tsv"
Time difference of 0.02397466 secs
[1] "reading file Conserved_families_conserved_targets_RNA_new_up_Pv0.05_140816.counts.tsv"
[1] "created file Conserved_families_conserved_targets_RNA_new_up_Pv0.05_140816.hyper.tsv"
Time difference of 0.02270269 secs
[1] "reading file Conserved_Family_All_targets_RNA_new_down_Pv0.05_140816.counts.tsv"
[1] "created file Conserved_Family_All_targets_RNA_new_down_Pv0.05_140816.hyper.tsv"
Time difference of 0.03952098 secs
[1] "reading file Conserved_Family_All_targets_RNA_new_up_Pv0.05_140816.counts.tsv"
[1] "created file Conserved_Family_All_targets_RNA_new_up_Pv0.05_140816.hyper.tsv"
Time difference of 0.0333159 secs
[1] "reading file miR_table3_non_conserved_families_all_targets_RNA_new_down_Pv0.05_140816.counts.tsv"
[1] "created file miR_table3_non_conserved_families_all_targets_RNA_new_down_Pv0.05_140816.hyper.tsv"
Time difference of 0.1291921 secs
[1] "reading file miR_table3_non_conserved_families_all_targets_RNA_new_up_Pv0.05_140816.counts.tsv"
[1] "created file miR_table3_non_conserved_families_all_targets_RNA_new_up_Pv0.05_140816.hyper.tsv"
Time difference of 0.1333916 secs
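For reference, the hypergeometric enrichment p-value behind this step can be written down directly. The sketch below is plain Python, not the R script (whose exact options are not shown here); it assumes a one-sided over-representation test, and the function and parameter names are illustrative.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when drawing n genes from N, of which K are targets of the miR.

    k: targets found in the small gene list
    K: targets among all genes analysed
    n: size of the small gene list
    N: number of all genes analysed
    """
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("need 0 <= K <= N and 0 <= n <= N")
    lo = max(k, 0, n - (N - K))
    hi = min(K, n)
    # Exact integer sum of the tail, divided once at the end.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(lo, hi + 1))
    return tail / comb(N, n)


print(hypergeom_upper_tail(k=5, K=5, n=5, N=10))  # 1/252, about 0.004
```

In R the same upper tail is given by phyper(k - 1, K, N - K, n, lower.tail = FALSE).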
Log for the complete pipeline:
wget http://www.targetscan.org/mmu_71/mmu_71_data_download/Conserved_Family_Conserved_Targets_Info.txt.zip
wget http://www.targetscan.org/mmu_71/mmu_71_data_download/Conserved_Family_Info.txt.zip
wget http://www.targetscan.org/mmu_71/mmu_71_data_download/Nonconserved_Family_Info.txt.zip
unzip Conserved_Family_Conserved_Targets_Info.txt.zip
unzip Conserved_Family_Info.txt.zip
unzip Nonconserved_Family_Info.txt.zip
# keep only mouse rows (species ID 10090 in column 5)
awk -F'\t' '$5 == 10090 { print $0 }' Conserved_Family_Info.txt > Conserved_Family_Info_mouse.txt
awk -F'\t' '$5 == 10090 { print $0 }' Nonconserved_Family_Info.txt > Nonconserved_Family_Info_mouse.txt
awk -F'\t' '$5 == 10090 { print $0 }' Predicted_Targets_Info.txt > Predicted_Targets_Info_mouse.txt
perl /ngs001/dena_public/DENAScripts/perl_for_mir.pl Conserved_Family_Info_mouse.txt Conserved_Family_Info_mouse.gmt
perl /ngs001/dena_public/DENAScripts/perl_for_mir.pl Nonconserved_Family_Info_mouse.txt Nonconserved_Family_Info_mouse.gmt
perl /ngs001/dena_public/DENAScripts/perl_for_mir.pl Predicted_Targets_Info_mouse.txt Predicted_Targets_Info_mouse.gmt
perl /ngs001/dena_public/DENAScripts/prepare_for_chitest_mir.pl Conserved_Family_Info_mouse.txt RNA_total_list_140816.txt RNA_new_up_Pv0.05_140816.txt Conserved_Family_Info_mouse_up.counts.csv
perl /ngs001/dena_public/DENAScripts/prepare_for_chitest_mir.pl Conserved_Family_Info_mouse.txt RNA_total_list_140816.txt RNA_new_down_Pv0.05_140816.txt Conserved_Family_Info_mouse_down.counts.csv
perl /ngs001/dena_public/DENAScripts/prepare_for_chitest_mir.pl Nonconserved_Family_Info_mouse.txt RNA_total_list_140816.txt RNA_new_down_Pv0.05_140816.txt Nonconserved_Family_Info_mouse_down.counts.csv
perl /ngs001/dena_public/DENAScripts/prepare_for_chitest_mir.pl Nonconserved_Family_Info_mouse.txt RNA_total_list_140816.txt RNA_new_up_Pv0.05_140816.txt Nonconserved_Family_Info_mouse_up.counts.csv
perl /ngs001/dena_public/DENAScripts/prepare_for_chitest_mir.pl Predicted_Targets_Info_mouse.txt RNA_total_list_140816.txt RNA_new_up_Pv0.05_140816.txt Predicted_Targets_Info_mouse_up.counts.csv
perl /ngs001/dena_public/DENAScripts/prepare_for_chitest_mir.pl Predicted_Targets_Info_mouse.txt RNA_total_list_140816.txt RNA_new_down_Pv0.05_140816.txt Predicted_Targets_Info_mouse_down.counts.csv
ls -1 *.counts.csv | sed 's/.*/Rscript R-hyper-script.commandline.R &/' > R_commands_mouse_only
sh R_commands_mouse_only
awk -F',' '$9 <= 0.05 {print FILENAME "," $0}' *hyper.csv > all_padj0.05.csv
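The final filter can also be sketched in Python. Similar to the awk command, it keeps rows whose ninth comma-separated field is at most 0.05 and prefixes each with the name of the file it came from. Treating that field as the adjusted p-value is an assumption based on the output file name, and the function names are illustrative.

```python
import csv
import glob


def significant_rows(path, column=9, threshold=0.05):
    """Yield [filename, field1, field2, ...] for rows passing the threshold."""
    with open(path, newline="") as handle:
        for row in csv.reader(handle):
            if len(row) < column:
                continue
            try:
                value = float(row[column - 1])
            except ValueError:
                continue  # header line or non-numeric entry such as NA
            if value <= threshold:
                yield [path] + row


def collect(pattern, out_path):
    """Write the significant rows of every file matching pattern to out_path."""
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in sorted(glob.glob(pattern)):
            writer.writerows(significant_rows(path))
```

A call such as collect("*hyper.csv", "all_padj0.05.csv") would mirror the command above. Unlike awk, this version skips header lines and non-numeric values explicitly.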