28 Refactorization of Legacy Code
Introduction
Most DNA adopts the standard right-handed double helix (B-DNA) that we are used to working with, but a number of alternate conformations don’t follow this paradigm. These are collectively referred to as non-B DNA.




non-B_gfa
The non-B_gfa database and accompanying package were developed to find sequences associated with non-B DNA-forming motifs. The work was originally published by Cer et al. (2012), and the current version is written in C.
The code hasn’t received much attention in the past decade. We want to modernize it and make it easier to incorporate into R and Python workflows.
Instructions
- Fork a copy of the non-B_gfa repository using this GitHub Classroom link
- Clone your fork to your local machine
- Refactor the code into either an R or Python package, including an updated README, tests and documentation
- Push your changes to GitHub and submit a link to your fork on Blackboard
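
To give a sense of the shape a refactored, tested function could take in a Python package, here is a minimal sketch. It is not the non-B_gfa algorithm: the function name `find_g_quadruplex_motifs`, the `Motif` record, and the regular expression (four runs of at least three G's separated by loops of 1 to 7 bases, a commonly used G-quadruplex heuristic) are all assumptions made for illustration. The sketch scans only the strand it is given and ignores the complementary C-rich strand.

```python
import re
from typing import Iterator, NamedTuple


class Motif(NamedTuple):
    """A hypothetical record for one predicted motif."""
    start: int      # 0-based, inclusive
    end: int        # 0-based, exclusive
    sequence: str   # matched bases, upper-cased


# Illustrative heuristic (an assumption, not non-B_gfa's criteria):
# four G-runs of length >= 3 separated by loops of 1-7 bases.
_G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}?G{3,}){3}")


def find_g_quadruplex_motifs(sequence: str) -> Iterator[Motif]:
    """Yield non-overlapping candidate G-quadruplex motifs in `sequence`."""
    seq = sequence.upper()  # accept soft-masked (lower-case) input
    for match in _G4_PATTERN.finditer(seq):
        yield Motif(match.start(), match.end(), match.group())


# pytest-style tests that could live in tests/test_motifs.py
def test_finds_single_motif():
    hits = list(find_g_quadruplex_motifs("atGGGTGGGTGGGTGGGat"))
    assert hits == [Motif(2, 17, "GGGTGGGTGGGTGGG")]


def test_no_motif_in_at_rich_sequence():
    assert list(find_g_quadruplex_motifs("ATATATATAT")) == []
```

Keeping each motif search in a small pure function like this makes it straightforward to compare its output against the original C program on the same input sequences, which is a useful check when refactoring legacy code.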
References
Brázda, Václav, Rob C. Laister, Eva B. Jagelská, and Cheryl Arrowsmith. 2011. “Cruciform Structures Are a Common DNA Feature Important for Regulating Biological Processes.” BMC Molecular Biology 12 (August): 33. https://doi.org/10.1186/1471-2199-12-33.
Cer, Regina Z., Duncan E. Donohue, Uma S. Mudunuri, Nuri A. Temiz, Michael A. Loss, Nathan J. Starner, Goran N. Halusa, et al. 2012. “Non-B DB V2.0: A Database of Predicted Non-B DNA-Forming Motifs and Its Associated Tools.” Nucleic Acids Research 41 (D1): D94–100. https://doi.org/10.1093/nar/gks955.
De Rosa, Matteo, Daniele De Sanctis, Ana Lucia Rosario, Margarida Archer, Alexander Rich, Alekos Athanasiadis, and Maria Armenia Carrondo. 2010. “Crystal Structure of a Junction Between Two Z-DNA Helices.” Proceedings of the National Academy of Sciences 107 (20): 9088–92. https://doi.org/10.1073/pnas.1003182107.
Pearson, C. 1998. “Structural Analysis of Slipped-Strand DNA (S-DNA) Formed in (CTG)n. (CAG)n Repeats from the Myotonic Dystrophy Locus.” Nucleic Acids Research 26 (3): 816–23. https://doi.org/10.1093/nar/26.3.816.
Varshney, Dhaval, Jochen Spiegel, Katherine Zyner, David Tannahill, and Shankar Balasubramanian. 2020. “The Regulation and Functions of DNA and RNA G-Quadruplexes.” Nature Reviews Molecular Cell Biology 21 (8): 459–74. https://doi.org/10.1038/s41580-020-0236-x.