Compare modification levels (Modification analysis)

On the first day, we used DNAscent forkSense to call features such as initiation and termination sites on single molecules. In today's case study, we learned that, under the experimental protocol used for the yeast dataset, sites that are replicated early are more modified than sites that are replicated late. In this exercise, please demonstrate that single-molecule sequences corresponding to initiation sites are more modified than single-molecule sequences corresponding to termination sites.

Your calculation should use the following files as input:

~/nanomod_course_outputs/yeast/dnascent.detect.mod.sorted.bam
~/nanomod_course_outputs/yeast/origins_DNAscent_forkSense.bed
~/nanomod_course_outputs/yeast/terminations_DNAscent_forkSense.bed

Although the commands you need to answer this question have all been covered in the course, we expect this exercise to be challenging, so you may answer the question in any manner you like. Your answer can be as quantitative or as qualitative as you wish, and it can use numbers, visualizations, or both. We expect a reasonably quantitative answer to take around 10-20 lines of Linux commands.

We suggest that you formulate a plan and try to execute it. If it turns out that your plan requires a lot of work, then please abandon it and try to think up a simpler solution.
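As one possible shape for the final step of a quantitative plan, the sketch below compares the mean of two sets of numbers with awk. It assumes that an earlier step of your own has already produced one modification value per line for each group; the file names, the `summarise` helper, and the values themselves are invented for illustration and do not come from the course data.

```shell
# Sketch only: the values below are made up and stand in for
# per-site (or per-region) modification fractions from your own analysis.
tmpdir=$(mktemp -d)
printf '0.40\n0.60\n0.50\n' > "$tmpdir/origin_values.txt"
printf '0.10\n0.30\n0.20\n' > "$tmpdir/termination_values.txt"

# Hypothetical helper: print a label, the number of values, and their mean.
summarise() {
  awk -v label="$1" '
    { sum += $1; n++ }
    END { if (n > 0) printf "%s\tn=%d\tmean=%.3f\n", label, n, sum / n }
  ' "$2"
}

origin_summary=$(summarise origins "$tmpdir/origin_values.txt")
termination_summary=$(summarise terminations "$tmpdir/termination_values.txt")

printf '%s\n%s\n' "$origin_summary" "$termination_summary"
```

A comparison of means is only one option; a histogram or a genome-browser view of the two groups would be an equally valid, more qualitative answer.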

For some answers, you may want to use the bed files above as input to programs such as modkit. Please be advised that although DNAscent forkSense writes files ending in .bed, these files do not follow the standard BED format. So, if you intend to use them as inputs, please convert them to the standard format first. The following commands may be helpful for the conversion.

input_forksense_bed= # fill suitably
output_forksense_bed= # fill suitably
# Keep columns 1-4, set the score to 100, and turn fwd/rev in column 8 into +/-.
awk -v OFS="\t" '{print $1, $2, $3, $4, 100, ($8=="fwd" ? "+" : "-")}' \
  "$input_forksense_bed" > "$output_forksense_bed"
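To see what the conversion does, it can be run on a small toy file first. The two records below are invented for illustration: they only assume, as the command above does, that the fields are whitespace-separated and that column 8 holds fwd or rev. The contents of columns 5 to 7 are placeholders, not real forkSense output.

```shell
# Toy demonstration of the conversion; the input records are made up.
tmpdir=$(mktemp -d)
input_forksense_bed="$tmpdir/toy_forksense.bed"
output_forksense_bed="$tmpdir/toy_converted.bed"

# Hypothetical records: contig, start, end, read id, three placeholder
# columns, then the strand (fwd/rev) in column 8.
printf 'chrI 1000 2000 read-0001 500 9000 x fwd\n'  >  "$input_forksense_bed"
printf 'chrII 3000 3500 read-0002 100 7000 x rev\n' >> "$input_forksense_bed"

# Same conversion as above: six tab-separated columns, strand as + or -.
awk -v OFS="\t" '{print $1, $2, $3, $4, 100, ($8=="fwd" ? "+" : "-")}' \
  "$input_forksense_bed" > "$output_forksense_bed"

cat "$output_forksense_bed"
```

Each output line should have six tab-separated columns, with the strand written as + or - in the last one. Checking a few lines of your real converted file in the same way (for example with head) is a quick sanity test before using it elsewhere.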
