Computational Biology and Informatics

As part of retargeting this grant to address courses and research in bioinformatics, we developed two courses for non-majors: Compsci 4g and Compsci 6g. The former is the most recent version of the course; it is not intended as an introduction to a computer science major. Although the materials from those courses offer information about the content, we highlight the most useful and portable assignments here.

Shotgun Materials

As an introduction to the whole-genome shotgun algorithm, we assign this text reconstruction project. Students use it to understand the basic idea and algorithm before proceeding to the programming final project based on the algorithm. The text-reconstruction project has been tested in many Duke courses, including courses for non-majors, with reasonable success in all environments. The final project builds on this project, which everyone does, whereas the final project is one of several and is typically chosen by students with previous programming experience.
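The idea behind text reconstruction can be sketched in a few lines. This is a simplified greedy version under our own assumptions, not the assignment's specification: fragments are plain strings, and the names GreedyReconstruction, overlap, and reconstruct are hypothetical. The sketch repeatedly merges the pair of fragments with the largest suffix/prefix overlap until no pair overlaps by at least a threshold.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; class and method names are not from the course materials.
class GreedyReconstruction {
    // Length of the longest suffix of a that is also a prefix of b.
    static int overlap(String a, String b) {
        int max = Math.min(a.length(), b.length());
        for (int len = max; len > 0; len--) {
            if (a.regionMatches(a.length() - len, b, 0, len)) {
                return len;
            }
        }
        return 0;
    }

    // Repeatedly merge the pair with the largest overlap until no pair
    // overlaps by at least threshold characters.
    static List<String> reconstruct(List<String> fragments, int threshold) {
        List<String> pool = new ArrayList<>(fragments);
        while (true) {
            int bestI = -1, bestJ = -1, bestLen = 0;
            for (int i = 0; i < pool.size(); i++) {
                for (int j = 0; j < pool.size(); j++) {
                    if (i == j) continue;
                    int len = overlap(pool.get(i), pool.get(j));
                    if (len > bestLen) {
                        bestLen = len;
                        bestI = i;
                        bestJ = j;
                    }
                }
            }
            if (bestI < 0 || bestLen < threshold) {
                return pool;
            }
            String merged = pool.get(bestI) + pool.get(bestJ).substring(bestLen);
            // Remove the higher index first so the lower index stays valid.
            pool.remove(Math.max(bestI, bestJ));
            pool.remove(Math.min(bestI, bestJ));
            pool.add(merged);
        }
    }
}
```

Given the fragments "the quick br", "ick brown fo", and "own fox jumps" with a threshold of 3, the sketch returns the single string "the quick brown fox jumps". A greedy strategy like this can pick the wrong merge when fragments repeat, which is one reason the threshold matters.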

For the final project, we want to convey a sense of efficiency and why it matters. We also want students to understand regular expressions, since they form a grammar that is simple to understand and use, and which is relevant to understanding programming.


After reading the assignment (see above), students are given a simple implementation of merging strands as part of shotgun reconstruction. The code is below and relies on pattern matching using the java.util.regex package. Students have practice with regexes based on this regex Java tool and a series of exercises from the 4G website linked above.
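To show how java.util.regex can find where one strand ends and the next begins, here is a minimal sketch. The pattern is our assumption, since the course pattern is not reproduced on this page: a captured run, a literal "XXX" separator, and a back-reference to the same run. The names RegexOverlapDemo and findOverlap are hypothetical, and the sketch assumes the sequences themselves never contain the letter X.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; the pattern below is an assumption, not the course's pattern.
class RegexOverlapDemo {
    // A captured run, the "XXX" separator, then the same run again (back-reference \1).
    static final Pattern OVERLAP = Pattern.compile("(.+)XXX\\1");

    // Returns the text shared by the end of left and the start of right, or "" if none.
    static String findOverlap(String left, String right) {
        Matcher m = OVERLAP.matcher(left + "XXX" + right);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        System.out.println(findOverlap("ACGTAC", "TACGGA"));        // prints TAC
        System.out.println(findOverlap("AAAA", "CCCC").isEmpty());  // prints true
    }
}
```

Because find() scans start positions from left to right, the first match it reports is the longest overlap. The cost is that the engine retries the back-reference from every start position, which is where the inefficiency comes from.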

Shotgun Redux

Students must design a more efficient implementation based on this inefficient but simple code:

    public IStrand merge(IStrand other, int threshold) {
        String xx = mySeq + "XXX" + other.strandToString();
        Matcher m = ourPattern.matcher(xx);
        boolean found = m.find();
        if (found) {
            String s =;
            if (s.length() < threshold) return null;
            s = m.replaceFirst(s);
            return new SlowStrand(s, getName() + ":" + other.getName());
        }
        return null;
    }

The write-up from a student presentation showing what kind of work students can do with this problem is accessible here.
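One possible direction for a faster merge, offered as a sketch and not as the student solution mentioned above, is to compute the overlap directly with the failure table from Knuth-Morris-Pratt string matching, which runs in time linear in the two lengths. The sketch works on plain strings because IStrand and SlowStrand are not shown here; FastMerge, overlapLength, and merge are hypothetical names.

```java
// Hypothetical illustration; works on plain strings, not on the course's IStrand type.
class FastMerge {
    // Length of the longest prefix of right that is also a suffix of left.
    static int overlapLength(String left, String right) {
        int m = right.length();
        if (m == 0) return 0;
        // fail[i] = length of the longest proper prefix of right[0..i]
        // that is also a suffix of right[0..i].
        int[] fail = new int[m];
        for (int i = 1, k = 0; i < m; i++) {
            while (k > 0 && right.charAt(i) != right.charAt(k)) k = fail[k - 1];
            if (right.charAt(i) == right.charAt(k)) k++;
            fail[i] = k;
        }
        // Scan left, tracking how much of right's prefix currently matches.
        int k = 0;
        for (int i = 0; i < left.length(); i++) {
            if (k == m) k = fail[k - 1]; // right matched fully earlier; keep scanning
            while (k > 0 && left.charAt(i) != right.charAt(k)) k = fail[k - 1];
            if (left.charAt(i) == right.charAt(k)) k++;
        }
        return k;
    }

    // Returns the merged sequence, or null when there is no overlap
    // or the overlap is shorter than threshold.
    static String merge(String left, String right, int threshold) {
        int len = overlapLength(left, right);
        if (len == 0 || len < threshold) return null;
        return left + right.substring(len);
    }
}
```

For example, merge("ACGTAC", "TACGGA", 3) returns "ACGTACGGA", while the same call with a threshold of 4 returns null because the overlap is only three characters long.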

Student View

One of the students in the first offering of the course was the TA for the second course. He worked to put together a scope and sequence for assignments and materials.

Owen L. Astrachan
Last modified: Sun Mar 26 14:25:26 EST 2006