Compare modification levels (Modification analysis)

On the first day, we learned that features associated with DNA replication, such as initiation and termination sites, can be identified on single molecules using BrdU labelling. In the experimental protocol used for the yeast dataset below, sites that are replicated early are more modified than sites that are replicated late. Initiation sites are where replication of a region begins and termination sites are where it ends, so we expect the former to carry more modification. In this exercise, please demonstrate that, on single molecules, sequences at initiation sites are more modified than sequences at termination sites.

Your calculation should use the following files as input:

  • ~/nanomod_course_capstone/compare_mod_levels/dnascent.detect.mod.sorted.bam
  • ~/nanomod_course_capstone/compare_mod_levels/origins_DNAscent_forkSense.bed
  • ~/nanomod_course_capstone/compare_mod_levels/terminations_DNAscent_forkSense.bed

Although the commands you need to answer this question have been covered in the course, we expect this exercise to be challenging. You can therefore answer the question in any manner you like: your answer can be as quantitative or as qualitative as you wish, and can use numbers, visualizations, or both. We expect a reasonably quantitative answer to take around 10-20 lines of Linux commands.

We suggest that you formulate a plan and try to execute it. If it turns out that your plan requires a lot of work, then please abandon it and try to think up a simpler solution.
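
Whatever plan you settle on, the last step will probably reduce to comparing one number per site between the two groups. The sketch below shows one possible way to do that final comparison with awk. It is only an illustration: the two-column layout (group label, modification fraction) is an assumption of ours, the values are made up, and it says nothing about how you obtain those values from the bam file.

```shell
# Toy table: column 1 = group label, column 2 = modification fraction of one
# site on one molecule. Both the layout and the values are invented.
toy_levels=$(mktemp)
printf '%s\t%s\n' \
  origin 0.62 \
  origin 0.58 \
  termination 0.21 \
  termination 0.25 > "$toy_levels"

# Per group: number of sites and mean modification fraction.
# sort makes the order of the output lines deterministic.
summary=$(awk -F'\t' '
  { sum[$1] += $2; n[$1]++ }
  END { for (g in sum) printf "%s\t%d\t%.3f\n", g, n[g], sum[g] / n[g] }
' "$toy_levels" | sort)

printf '%s\n' "$summary"
rm -f "$toy_levels"
```

A table of means like this is a reasonably quantitative answer; a histogram of the second column per group would be a more visual alternative.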

For some answers, you may want to use the bed files above as input to programs such as modkit. Please be advised that although programs such as DNAscent forkSense output files ending in .bed, these files are not in the correct bed format. So, if you intend to use them as inputs, please convert them to the correct bed format first. The following commands may be helpful for the conversion.

input_forksense_bed= # fill suitably
output_forksense_bed= # fill suitably
# Keep fields 1-4, set the score to 100, and map the strand in field 8 to +/-
awk -v OFS="\t" '{print $1, $2, $3, $4, 100, ($8=="fwd" ? "+" : "-")}' \
  "$input_forksense_bed" > "$output_forksense_bed"
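
To see what the conversion does before running it on the real files, you can try it on a couple of toy lines. The sketch below assumes a whitespace-separated line with eight fields and the strand ("fwd" or something else) in field 8, which is all that the command above relies on. The read names and coordinates are made up, and fields 5-7 are placeholders. If your real files contain header or comment lines (worth checking with head), those would need to be skipped first.

```shell
# Two invented lines in the assumed 8-field layout; "x" marks fields that
# the conversion ignores.
toy_in=$(mktemp)
printf 'chrI 1000 2000 read-0001 x x x fwd\nchrII 500 900 read-0002 x x x rev\n' > "$toy_in"

# Same conversion as above: six tab-separated fields, score fixed at 100.
converted=$(awk -v OFS="\t" \
  '{print $1, $2, $3, $4, 100, ($8=="fwd" ? "+" : "-")}' "$toy_in")
printf '%s\n' "$converted"

# Sanity check: count converted lines that do not have exactly six fields.
bad=$(printf '%s\n' "$converted" | awk -F'\t' 'NF != 6 { c++ } END { print c + 0 }')
echo "lines with wrong field count: $bad"
rm -f "$toy_in"
```

The same field-count check can be run on your converted files before passing them to other programs.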

Expected outputs for Compare modification levels: 1