Are there functional examples of parallel DNA double helices?


The anti-parallel structure of the DNA double helix is well studied, but I am curious if there are any examples of parallel DNA double helices. There are reports of synthetic such structures; see this paper, for example. However, my question is: are there functional examples of parallel DNA double helices? Here are some implicit/additional/guiding criteria:

  • It should be double stranded. This excludes structures such as G-quadruplexes, which do have some parallel strands.
  • It does not have to be an extended helix. I don't have any specific length restrictions, but don't assume it needs to be hundreds or even tens of base pairs long.
  • The structure need not be solely composed of DNA. If a parallel helix is induced by, say, protein or small molecule binding, so be it.
  • By "functional examples", I mean that the parallel structure should have some effect on cellular processes. The structure can be studied in vitro, even using synthetic constructs, but it should have some functional significance in vivo (or, at least a proposed significance).

Note that although I only mention DNA above, papers discussing parallel RNA double helices in the same spirit would also be welcome.

Related question: why is DNA antiparallel? Can it be parallel?

Surprisingly, a parallel DNA duplex has been reported! Tchurikov et al. reported parallel complementary DNA in the non-coding region of the alcohol dehydrogenase gene, as well as between two Drosophila DNA sequences. The region, which is ~40 bp long, has 76% of its bases complementary in the same polarity. However, its presence in vivo and its significance are not known (they observed its existence in vitro).
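To make "complementary in the same polarity" concrete, here is a minimal sketch that scores two equal-length strands, both written 5'->3', for parallel and for conventional antiparallel complementarity. The function names and the 10-nt sequences are hypothetical and chosen only for illustration; they are not taken from the Tchurikov paper.

```python
# Watson-Crick partners for DNA bases.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def parallel_complementarity(strand_a: str, strand_b: str) -> float:
    """Fraction of positions i where strand_b[i] pairs with strand_a[i].

    Both strands are read 5'->3', so position i on one strand faces
    position i on the other (same polarity).
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must have equal length")
    matches = sum(PAIR[a] == b for a, b in zip(strand_a, strand_b))
    return matches / len(strand_a)


def antiparallel_complementarity(strand_a: str, strand_b: str) -> float:
    """Same score, but with strand_b flipped, as in an ordinary duplex."""
    return parallel_complementarity(strand_a, strand_b[::-1])


# Hypothetical sequences (assumption, for illustration only).
a = "AAGGTTCCAG"
b = "TTCCAAGGAC"

print(parallel_complementarity(a, b))      # 9 of 10 positions pair
print(antiparallel_complementarity(a, b))  # 3 of 10 positions pair
```

A pair of sequences like this would be a candidate for a parallel duplex rather than an antiparallel one, which is the kind of signal the 76% figure describes.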

Tchurikov et al, in another paper, have reported that parallel complementary RNA in E. coli plays some role in RNA interference and is indeed more effective than antisense RNA in silencing mRNA for gene expression regulation. They also propose the presence of such a system in vivo in E. coli cells. (Seemingly, this paper alone is enough to answer your question since it fulfills all your criteria).

In another paper, Szabat et al. showed that DNA, 2'-O-MeRNA and RNA oligonucleotides can adopt a parallel duplex configuration at pH 5 and lower. The presence of LNA also stabilizes the parallel duplex configuration. This might be useful in processes such as RNA interference, though this study too was in vitro (LNA is synthetic and is not known to occur in vivo).

Several other papers, such as Westhof et al. and Mohammadi et al., have also reported parallel duplex DNA.


1. Tchurikov NA, Chernov BK, Golova YB, Nechipurenko YD. Parallel DNA: generation of a duplex between two Drosophila sequences in vitro. FEBS letters. 1989;257(2):415-8. pmid:2479581

2. Tchurikov, N. A., L. G. Chistyakova, G. B. Zavilgelsky, I. V. Manukhov, B. K. Chernov, and Y. B. Golova. 2000. Gene-specific silencing by expression of parallel complementary RNA in Escherichia coli. J. Biol. Chem. 275:26523-26529

3. Szabat M, Pedzinski T, Czapik T, Kierzek E, Kierzek R (2015) Structural Aspects of the Antiparallel and Parallel Duplexes Formed by DNA, 2'-O-Methyl RNA and RNA Oligonucleotides. PLoS ONE 10(11): e0143354. doi:10.1371/journal.pone.0143354

4. Westhof, E., and M. Sundaralingam. 1980. X-ray structure of a cytidylyl-3',5'-adenosine-proflavine complex: a self-paired parallel-chain double helical dimer with an intercalated acridine dye. Proc. Natl. Acad. Sci. USA. 77:1852-1856.

5. Mohammadi, S., R. Klement, A. K. Shchyolkina, J. Liquier, T. M. Jovin, and E. Taillandier. 1998. FTIR and UV spectroscopy of parallel-stranded DNAs with mixed A-T/G-C sequences and their A-T/I-C analogues. Biochemistry. 37:16529-16537.

All the articles I found that discuss parallel helices are purely speculative with regard to biological significance, but they are still interesting. Here are some that I found, in addition to those in the other answer.

Safaee N, Noronha AM, Rodionov D, Kozlov G, Wilds CJ, Sheldrick GM, Gehring K. 2013. Structure of the Parallel Duplex of Poly(A) RNA: Evaluation of a 50 Year-Old Prediction. Angew Chem Int Ed 52:10370-10373.

This paper presents the solved crystal structure of parallel poly(A) RNA and shows that Poly(A) Binding Protein (PABP) promotes parallel duplex formation. A biological role is hypothesized:

As the great majority of eukaryotic messenger RNAs (mRNA) are tagged with 100 to 250 adenines at their 3' end, the polymorphism of poly(rA) is also relevant for present day cellular processes involving mRNA translation, storage, and decay. Under conditions of cell stress, cellular mRNAs are transported into RNA granules, increasing the local concentration of poly(rA). It is possible that nature evolved proteins such as PABP in part to regulate the occurrence of poly(rA) duplexes in cells.

There are several reviews that discuss the possible role of parallel RNA in the RNA world which revolve around the problem of replication using complementary, anti-parallel strands:

Taylor WR. 2005. Stirring the primordial soup. Nature 434:705.

Some mechanisms for replication in the RNA world have been put forward, and following the current systems of protein polynucleotide synthesis, all involve the creation of a complementary daughter strand using Watson-Crick base-pairing. But from a mechanistic viewpoint, such a model contains a fundamental problem: if a ribopolymerase were to make a complementary copy of itself, it would need to recopy this to obtain a new functional ribopolymerase. This implies that both the ribopolymerase sequence and its complement would have to coexist. But if these two copies came together, the result would be a double stranded Watson-Crick helix (as found in some RNA viruses) - not a new ribopolymerase. Even if both sequences had well determined secondary structures, the perfect complementarity of the Watson-Crick pairing would act as a sink, leading to a sterile population of double-stranded molecules.

The proposed solution is that early RNA polymerases may have created parallel complements to prevent such inhibition:

Taylor WR. 2006. Transcription and translation in an RNA world. Phil Trans R Soc B 361:1751-1760.

Replication strategies. (a) Replication via a reverse complementary strand leads to (b) a stable double-stranded duplex if the two copies meet. (c) Replication via a parallel complementary strand leads to (d) a relatively unstable double-stranded duplex if the two copies meet.

The propagation of information in a nucleic acid strand from one 'generation' to the next using Watson-Crick base pairing logically does not have to involve a reverse complementary strand. Providing that there is complementary base pairing, a parallel complement would also propagate the same information…

… all that need change from the viewpoint of replicase is the direction of its progression along the template. The resulting transcript could only be expected to base pair with the template over a short region before parting, but, faced with the problem of irreversible hybridization, this would be a desirable feature of the model.
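Taylor's argument that a parallel complement carries the same information as a reverse complement can be checked with a few lines of code. This is only a sketch: the helper names and the 8-nt template are hypothetical, and it models base-pairing logic only, not the stability of the resulting duplexes.

```python
# Watson-Crick partners for RNA bases.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def parallel_complement(seq: str) -> str:
    """Complement each base in place; the copy has the same 5'->3' polarity."""
    return "".join(RNA_PAIR[base] for base in seq)


def reverse_complement(seq: str) -> str:
    """Conventional antiparallel copy: complement, then reverse."""
    return parallel_complement(seq)[::-1]


template = "GGAUCCUA"  # hypothetical sequence, for illustration only

print(parallel_complement(template))  # CCUAGGAU
print(reverse_complement(template))   # UAGGAUCC

# Either copying rule, applied twice, regenerates the original strand,
# so both propagate the same information across two "generations".
assert parallel_complement(parallel_complement(template)) == template
assert reverse_complement(reverse_complement(template)) == template
```

The difference between the two schemes therefore lies in what happens when template and copy meet, not in whether the sequence information survives.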

This paper mentions some proposed functions of parallel stranded (ps) DNA, with accompanying references, but I can only access one of them:

Other functions have been proposed for ps-DNA in gene expression, recombination, RNA processing (14,18,20), the packing of single-stranded and dimeric viral genomes and the function of reverse gyrase (12).

This is the one reference that I could find from the above paper:

Ramsing NB, Jovin TM. 1988. Parallel stranded duplex DNA. Nucleic Acids Res 16:6659-6676.

The possibility that ps-RNA might exist is intriguing in view of the rich structural and functional repertoire of RNA species in general. Three canonical situations in which ps helices could arise by interactions of wholly or partially homologous strands or looping of single stranded nucleic acid of appropriate sequence are shown in Fig. 12. The topological implications of such structures are of great interest, particularly in relation to the potential roles of ps-DNA and ps-RNA in (nonhomologous) recombination, RNA splicing, stabilization of ribosomal RNA, and other cellular processes. In addition, it can be anticipated that specific ligands, particularly proteins, could intervene in order to stabilize and exploit the parallel stranded conformation.

Overview of Crosslinking and Protein Modification

A number of techniques for studying the structure and interaction of proteins, as well as for manipulating proteins for use in affinity purification or detection procedures, depend on methods for chemically crosslinking, modifying or labeling proteins.

Crosslinking is the process of chemically joining two or more molecules by a covalent bond. Modification involves attaching or cleaving chemical groups to alter the solubility or other properties of the original molecule. "Labeling" generally refers to any form of crosslinking or modification whose purpose is to attach a chemical group (e.g., a fluorescent molecule) to aid in detection of a molecule and is described in other articles.

The entire set of crosslinking and modification methods for use with proteins and other biomolecules in biological research is often called "bioconjugation" or "bioconjugate" technology. (Conjugation is a synonym for crosslinking.)

Covalent modification and crosslinking of proteins depends on the availability of particular chemicals that are capable of reacting with the specific kinds of functional groups that exist in proteins. In addition, protein function and structure are either the direct focus of study or they must be preserved if a modified protein is to be useful in a technique. Therefore, the composition and structure of proteins, and the potential effects of modification reagents on protein structure and function, must be considered.

Proteins have four levels of structure. The sequence of a protein's amino acids is its primary structure. This sequence is always written from the amino end (N-terminus) to the carboxyl end (C-terminus). Protein secondary structure refers to common repeating elements present in proteins. There are two basic components of secondary structure: the alpha helix and the beta-pleated sheet. Alpha helices are tight, corkscrew-shaped structures formed by single polypeptide chains. Beta-pleated sheets are either parallel or anti-parallel arrangements of polypeptide strands stabilized by hydrogen bonds between adjacent –NH and –CO groups. Parallel beta-sheets have adjacent strands that run in the same direction (i.e., N-termini next to each other), while anti-parallel beta sheets have adjacent strands that run in opposite directions (i.e., N-terminus of one strand arranged toward the C-terminus of adjacent strand). A beta-pleated sheet may contain two to five parallel or antiparallel strands.

Tertiary structure is the full three-dimensional, folded structure of the polypeptide chain and is dependent on the suite of spontaneous and thermodynamically stable interactions between the amino acid side chains. Disulfide bond patterns, as well as ionic and hydrophobic interactions greatly impact tertiary structure. Quaternary structure refers to the spatial arrangement of two or more polypeptide chains. This structure may be a monomer, dimer, trimer, etc. The polypeptide chains composing the quaternary structure of a protein may be identical (e.g., homodimer) or different (e.g., heterodimer).

The four levels of protein structure. The sequence of amino acids, represented by blue dots, joined by peptide bonds, comprise the primary structure. The properties of the constituent amino acids, in the context of the cellular environment, largely determine spontaneous formation of the higher-level structure that is essential for protein function.

DNA in a material world

The specific bonding of DNA base pairs provides the chemical foundation for genetics. This powerful molecular recognition system can be used in nanotechnology to direct the assembly of highly structured materials with specific nanoscale features, as well as in DNA computation to process complex information. The exploitation of DNA for material purposes presents a new chapter in the history of the molecule.

“The nucleic-acid 'system' that operates in terrestrial life is optimized (through evolution) chemistry incarnate. Why not use it ... to allow human beings to sculpt something new, perhaps beautiful, perhaps useful, certainly unnatural.” Roald Hoffmann, writing in American Scientist, 1994 (ref. 1).

The DNA molecule has appealing features for use in nanotechnology: its minuscule size, with a diameter of about 2 nanometres, its short structural repeat (helical pitch) of about 3.4–3.6 nm, and its 'stiffness', with a persistence length (a measure of stiffness) of around 50 nm. There are two basic types of nanotechnological construction: 'top-down' systems are where microscopic manipulations of small numbers of atoms or molecules fashion elegant patterns (for example, see ref. 2), while in 'bottom-up' constructions, many molecules self-assemble in parallel steps, as a function of their molecular recognition properties. As a chemically based assembly system, DNA will be a key player in bottom-up nanotechnology.

The origins of this approach date to the early 1970s, when in vitro genetic manipulation was first performed by tacking together molecules with 'sticky ends'. A sticky end is a short single-stranded overhang protruding from the end of a double-stranded helical DNA molecule. Like flaps of Velcro, two molecules with complementary sticky ends — that is, their sticky ends have complementary arrangements of the nucleotide bases adenine, cytosine, guanine and thymine — will cohere to form a molecular complex.

Sticky-ended cohesion is arguably the best example of programmable molecular recognition: there is significant diversity to possible sticky ends (4^N for N-base sticky ends), and the product formed at the site of this cohesion is the classic DNA double helix. Likewise, the convenience of solid support-based DNA synthesis (ref. 3) makes it easy to program diverse sequences of sticky ends. Thus, sticky ends offer both predictable control of intermolecular associations and predictable geometry at the point of cohesion. Perhaps one could get similar affinity properties from antibodies and antigens, but, in contrast to DNA sticky ends, the relative three-dimensional orientation of the antibody and the antigen would need to be determined for every new pair. The nucleic acids seem to be unique in this regard, providing a tractable, diverse and programmable system with remarkable control over intermolecular interactions, coupled with known structures for their complexes.
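The diversity count and the pairing rule for sticky ends can be sketched in a few lines. The function names below are hypothetical; the sketch assumes overhangs are written 5'->3' and cohere in the usual antiparallel Watson-Crick fashion, so the partner of an overhang is its reverse complement.

```python
from itertools import product

# Watson-Crick partners for DNA bases.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sticky_ends(n: int) -> list[str]:
    """All possible N-base overhangs; there are 4**n of them."""
    return ["".join(bases) for bases in product("ACGT", repeat=n)]


def cohesive_partner(overhang: str) -> str:
    """Overhang (5'->3') that pairs with the given one: its reverse complement."""
    return "".join(DNA_PAIR[base] for base in reversed(overhang))


def is_self_complementary(overhang: str) -> bool:
    """True if the overhang would also cohere with copies of itself."""
    return cohesive_partner(overhang) == overhang


ends = sticky_ends(4)
print(len(ends))                 # 256, i.e. 4**4
print(cohesive_partner("ACGG"))  # CCGT
print(sum(is_self_complementary(e) for e in ends))  # 16 palindromic overhangs
```

In a design setting one would typically avoid the self-complementary overhangs when a component must not stick to copies of itself, which slightly reduces the usable set below 4^N.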

There is, however, a catch: the axes of DNA double helices are unbranched lines. Joining DNA molecules by sticky ends can yield longer lines, perhaps with specific components in a particular linear or cyclic order in one dimension. Indeed, the chromosomes packed inside cells exist as just such one-dimensional arrays. But to produce interesting materials from DNA, synthesis is required in multiple dimensions and, for this purpose, branched DNA is required.

Branched DNA occurs naturally in living systems, as ephemeral intermediates formed when chromosomes exchange information during meiosis, the type of cell division that generates the sex cells (eggs and sperm). Prior to cell division, homologous chromosomes pair, and the aligned strands of DNA break and literally cross over one another, forming structures called Holliday junctions. This exchange of adjacent sequences by homologous chromosomes — a process called recombination — during the formation of sex cells passes genetic diversity onto the next generation.

The Holliday junction contains four DNA strands (each member of a pair of aligned homologous chromosomes is composed of two DNA strands) bound together to form four double-helical arms flanking a branch point (Fig. 1a). The branch point can relocate throughout the molecule, by virtue of the homologous sequences. In contrast, synthetic DNA complexes can be designed to have fixed branch points containing between three and at least eight arms (refs 4,5). Thus, the prescription for using DNA as the basis for complex materials with nanoscale features is simple: take synthetic branched DNA molecules with programmed sticky ends, and get them to self-assemble into the desired structure, which may be a closed object or a crystalline array (Fig. 1a).

a, Self-assembly of branched DNA molecules into a two-dimensional crystal. A DNA branched junction forms from four DNA strands; those strands coloured green and blue have complementary sticky-end overhangs labelled H and H′, respectively, whereas those coloured pink and red have complementary overhangs V and V′, respectively. A number of DNA branched junctions cohere based on the orientation of their complementary sticky ends, forming a square-like unit with unpaired sticky ends on the outside, so more units could be added to produce a two-dimensional crystal. b, Ligated DNA molecules form interconnected rings to create a cube-like structure. The structure consists of six cyclic interlocked single strands, each linked twice to its four neighbours, because each edge contains two turns of the DNA double helix. For example, the front red strand is linked to the green strand on the right, the light blue strand on the top, the magenta strand on the left, and the dark blue strand on the bottom. It is linked only indirectly to the yellow strand at the rear.

Other modes of nucleic acid interaction aside from sticky ends are available. Examples include Tecto-RNA molecules (ref. 6), held together by loop–loop interactions, and paranemic crossover (PX) DNA, where cohesion derives from pairing of alternate half turns in inter-wrapped double helices (ref. 7). These new binding modes represent programmable cohesive interactions between cyclic single-stranded molecules that do not require cleavage to expose bases to pair molecules together. Nevertheless, cohesion using sticky ends remains the most prominent intermolecular interaction in structural DNA nanotechnology.

It is over a decade since the construction of the first artificial DNA structure, a stick-cube, whose edges are double helices (ref. 8; Fig. 1b). More complex polyhedra and topological constructs (ref. 9), such as knots and Borromean rings (consisting of three intricately interlinked circles), followed. But the apparent floppiness of individual branched junctions led to a hiatus before the next logical step: self-assembly into two-dimensional arrays.

This step required a stiffer motif, as it was difficult to build a periodic well-structured array with marshmallow-like components, even with a well-defined blueprint (sticky-ended specificity) for their assembly. The stiffer motif was provided by the DNA double-crossover (DX) molecule (ref. 10), analogous, once again, to the double Holliday-junction intermediate formed during meiosis (MDX, Fig. 2a). This stiff molecule contains two double helices connected to each other twice through crossover points. It is possible to program DX molecules to produce a variety of patterned two-dimensional arrays just by controlling their sticky ends (refs 11–13; Fig. 2b).

a, Schematic drawings of DNA double crossover (DX) units. In the meiotic DX recombination intermediate, labelled MDX, a pair of homologous chromosomes, each consisting of two DNA strands, align and cross over in order to swap equivalent portions of genetic information; 'HJ' indicates the Holliday junctions. The structure of an analogue unit (ADX), used as a tiling unit in the construction of DNA two-dimensional arrays, comprises two red strands, two blue crossover strands and a central green crossover strand. b, The strand structure and base pairing of the analogue ADX molecule, labelled A, and a variant, labelled B*. B* contains an extra DNA domain extending from the central green strand that, in practice, protrudes roughly perpendicular to the plane of the rest of the DX molecule. c, Schematic representations of A and B* where the perpendicular domain of B* is represented as a blue circle. The complementary ends of the ADX molecules are represented as geometrical shapes to illustrate how they fit together when they self-assemble. The dimensions of the resulting tiles are about 4 × 16 nm and are joined together so that the B* protrusions lie about 32 nm apart. d, The B* protrusions are visible as 'stripes' in tiled DNA arrays under an atomic force microscope.

In addition to objects and arrays, a number of DNA-based nanomechanical devices have been made. The first device consisted of two DX molecules connected by a shaft with a special sequence that could be converted from normal right-handed DNA (known as B-DNA) to an unusual left-handed conformation, known as Z-DNA (ref. 14). The two DX molecules lie on one side of the shaft before conversion and on opposite sides after conversion, which leads to a rotation. The problem with this device is that it is activated by a small molecule, Co(NH₃)₆³⁺, and with all devices sharing the same stimulus, an ordered collection of DX molecules would not produce a diversity of responses.

This problem was solved by Bernard Yurke and colleagues, who developed a protocol for a sequence-control device that has a tweezers-like motion (ref. 15). The principle behind the device is that a so-called 'set' strand containing a non-pairing extension hybridizes to a DNA-paired structural framework and sets a conformation; another strand that is complementary to the 'set' strand is then added, which binds to both the pairing and non-pairing portions, and removes it from the structure, leaving only the framework.

A robust rotary device was developed based on this principle (ref. 16; Fig. 3), in which different set strands can enter and set the conformation to different structural end-states. In this way, the conformation of the DNA device can readily be flipped back and forth simply by adding different set strands followed by their complements. A variety of different devices can be controlled by a diverse group of set strands.

a, The device works by producing two different conformations, depending on which of two pairs of strands (called 'set' strands) binds to the device framework. The device framework consists of two DNA strands (red and blue) whose top and bottom double helices are each connected by single strands. Thus, they form two rigid arms with a flexible hinge in between and the loose ends of the two strands dangling freely. The two states of the device, PX (left) and JX2 (right), differ by a half turn in the relative orientations of their bottom helices (C and D on the left, D and C on the right). The difference between the two states is analogous to two adjacent fingers extended, parallel to each other (right), or crossed (left). The states are set by the presence of green or yellow set strands, which bind to the frame in different ways to produce different conformations. The set strands have extensions that enable their removal when complementary strands are added (steps I and III). When one type of set strand is removed, the device is free to bind the other set strands and switch to a different state (steps II and IV). b, The PX–JX2 device can be used to connect 20-nm DNA trapezoid constructs. In the PX state, they are in a parallel conformation, but in the JX2 state, they are in a zig-zag conformation, which can be visualized on the right by atomic force microscopy.

What is the purpose of constructing DNA arrays and nanodevices? One prominent goal is to use DNA as scaffolding to organize other molecules. For example, it may be possible to use self-assembled DNA lattices (crystals) as platforms to position biological macromolecules so as to study their structure by X-ray crystallography (ref. 4; Fig. 4a). Towards this goal, programming of DNA has been used to bring protein molecules in proximity with each other to fuse multiple enzymatic activities (ref. 17). However, the potential of this approach awaits the successful self-assembly of three-dimensional crystals.

a, Scaffolding of biological macromolecules. A DNA box (red) is shown with protruding sticky ends that are used to organize boxes into crystals. Macromolecules are organized parallel to each other within the box, rendering them amenable to structure determination by X-ray crystallography. b, DNA scaffolds to direct the assembly of nanoscale electrical circuits. Branched DNA junctions (blue) direct the assembly of attached nanoelectronic components (red), which are stabilized by the addition of a positively charged ion.

Another goal is to use DNA crystals to assemble nanoelectronic components in two- or three-dimensional arrays (ref. 18; Fig. 4b). DNA has been shown to organize metallic nanoparticles as a precursor to nanoelectronic assembly (refs 19–22), but so far it has not been possible to produce multidimensional arrays containing nanoelectronic components with the high structural order of the naked DNA arrays described earlier.

There has been some controversy over whether DNA can be used as an electrical conductor (for example, ref. 23), although the resolution of this debate is unlikely to have any impact on the use of DNA as a scaffold. Recently, the effects of DNA conformational changes on conduction in the presence of an analyte were shown to have potential as a biosensor (ref. 24).

Replicating DNA components

A natural question to ask of any assembly system based on DNA is whether the components can be replicated. To produce branched DNA molecules whose branch points do not move, they must have different sequences in opposite branches but, as a consequence, these structures are not readily reproduced by DNA polymerase: the polymerase would produce complements to all strands present, leading only to double helical molecules. One option is to use topological tricks to convert structures like the DNA cube into a long single strand by adding extra stretches of DNA bases. The single strand could then be replicated by DNA polymerase and the final replicated product induced to fold into the original shape, with any extraneous segments cleaved using restriction enzymes. Although this would produce a molecule with sticky ends ready to participate in self-assembly, it would be a cumbersome process (ref. 25).

Günter von Kiedrowski and colleagues have recently developed a way of replicating short, simple DNA branches in a mixed organic–DNA species. Their branched molecule consists of three DNA single strands bonded to an organic triangle-shaped linker. To replicate the branched molecule, the single-stranded complement of each of these strands is bound to the molecule, so that one end of each complement molecule is close to the same end of the other complement molecule. In the final step, the juxtaposed complements are connected together by bonding their neighbouring ends to another molecule of the organic linker (ref. 26). Extension of this system to the next level, such as objects like the cube, will need to solve topological problems involved in the separation of the two components, or it will be limited to unligated systems.

Many separate capabilities of DNA nanotechnology have been prototyped — it is now time to extend and integrate them into useful systems. Combining sequence-dependent devices with nanoscale arrays will provide a system with a vast number of distinct, programmable structural states, the sine qua non of nanorobotics. A key step in realizing these goals is to achieve highly ordered three-dimensional arrays, both periodic and, ultimately, algorithmic.

Interfacing with top-down nanotechnology will extend markedly the capabilities of the field. It also will be necessary to integrate biological macromolecules or other macromolecular complexes into DNA arrays in order to make practical systems with nanoscale components. Likewise, the inclusion of electronic components in highly ordered arrays will enable the organization of nanoelectronic circuits. Chemical function could be added to DNA arrays by adding nucleic acid species evolved in vitro to have specific binding properties ('aptamers') or enzymatic activities ('ribozymes' or 'DNAzymes'). A further area that has yet to have an impact on DNA nanotechnology is combinatorial synthesis, which may well lead to greater diversity of integrated components. DNA-based computation and algorithmic assembly is another active area of research, and one that is impossible to separate from DNA nanotechnology (see Box 1).

The field of DNA nanotechnology has attracted an influx of researchers over the past few years. All of those involved in this area have benefited from the biotechnology enterprise that produces DNA-modifying enzymes and unusual components for synthetic DNA molecules. It is likely that applications in structural DNA nanotechnology ultimately will use variants on the theme of DNA (for example, peptide nucleic acids, containing an unconventional synthetic peptide backbone and nucleic acid bases for side chains), whose properties may be better suited to particular types of applications.

For the past half-century, DNA has been almost exclusively the province of biologists and biologically oriented physical scientists, who have studied its biological impact and molecular properties. During the next 50 years, it is likely they will be joined by materials scientists, nanotechnologists and computer engineers, who will exploit DNA's chemical properties in a non-biological context.

Box 1: DNA computers

An assembly of DNA strands can process data in a similar way to an electronic computer, and has the potential to solve far more complex problems and store a greater amount of information, at substantially lower energy cost than electronic microprocessors. DNA-based computation dates from Leonard Adleman's landmark report in 1994 (ref. 27), where he used DNA to solve the 'Hamiltonian path' problem, a variant of the 'travelling salesman' problem. The idea is to establish whether there is a path between two cities, given an incomplete set of available roads. Adleman used strands of DNA to represent cities and roads, and encoded the sequences so that a strand representing a road would connect (according to the rules of base pairing) to any two strands representing a city. By mixing together the strands, joining the cities connected by roads, and weeding out any 'wrong answers', he showed that the strands could self-assemble to solve the problem.
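The generate-and-filter logic of that experiment can be mimicked in software. The sketch below is not Adleman's protocol: it replaces random ligation of strands with exhaustive enumeration of city orderings, and it assumes one-way roads and a path that visits every city exactly once. The graph, city names and function name are hypothetical.

```python
from itertools import permutations


def hamiltonian_paths(cities, roads, start, end):
    """Return every path from start to end that visits each city exactly once.

    `roads` is a collection of one-way (from_city, to_city) pairs. Candidate
    orderings play the role of the randomly assembled DNA strands; the filter
    plays the role of 'weeding out wrong answers'.
    """
    road_set = set(roads)
    middle = [city for city in cities if city not in (start, end)]
    found = []
    for order in permutations(middle):
        path = (start, *order, end)
        # Keep the candidate only if every consecutive pair is a real road.
        if all((a, b) in road_set for a, b in zip(path, path[1:])):
            found.append(path)
    return found


# Hypothetical four-city map (assumption, for illustration only).
cities = ["A", "B", "C", "D"]
roads = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C"), ("B", "D")]

print(hamiltonian_paths(cities, roads, "A", "D"))  # [('A', 'B', 'C', 'D')]
```

The number of candidate orderings grows factorially with the number of cities, which is why the massive parallelism of a test tube full of strands was the attraction of doing this in DNA.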

It is impossible to separate DNA nanotechnology from DNA-based computation: many researchers work in both fields and the two communities have a symbiotic relationship. The first link between DNA computation and DNA nanotechnology was established by Erik Winfree, who suggested that short branched DNA molecules could be 'programmed' to undergo algorithmic self-assembly and thus serve as the basis of computation (ref. 28).

Periodic building blocks of matter, such as the DNA molecules shown in Fig. 1a, represent the simplest algorithm for assembly. All components are parallel, so what is on one side of a component is also on the other side, and in every direction. Given this parallelism, if the right side complements the left, the top complements the bottom and the front complements the back, a crystal should result. Even more complex algorithms are possible if one uses components of the same shape, but with different sticky ends. For example, Winfree has shown that, in principle, DNA tiles can be used to 'count' (see figure below) by creating borders with programmable sizes for one-, two- and possibly three-dimensional assemblies (ref. 29). If this scheme can be realized, self-assembly of precisely sized nanoscale arrays will be possible. A computation using self-assembly has been prototyped in one dimension, thereby lending some credence to the viability of algorithmic assembly (ref. 30).

A process through which the disordered components of a system organize themselves into a defined ordered state. The process is guided by minimization of the free energy of the system. Protein folding is an example of molecular self-assembly.

The design and self-assembly of DNA into pre-defined patterns and attempts to control the shapes and functions of the assembled nanostructures.

A class of mechanically interlocked molecules consisting of a ring entrapped between the two bulky ends of a dumbbell-shaped molecule.

A class of mechanically interlocked molecules comprising two or more interchained macrocyclic rings.

A molecule or a molecular system that converts random Brownian motion to directional motion at the nanoscale by doing work on the environment.

Molecular switches made of DNA that transition between at least two distinct states using a trigger — for example, pH or metal ions.

A physical parameter indicating the stiffness of a polymer such as DNA, defined as the length over which the molecule behaves like a rigid rod.

A DNA motif self-assembled from multiple single-stranded DNA oligomers to form a unit for further assembly of a nanostructure. There are usually one or more crossovers in each tile, rendering it more rigid.

A DNA partial duplex with a single-stranded overhang that can hybridize to another, complementary single-stranded overhang, thus ‘sticking’ the two partial duplexes together.

DNA nanostructures formed by folding a long single-stranded DNA scaffold via hybridization of many short DNA complements, known as staple strands.

The long single-stranded DNA template molecule, running through a whole DNA origami structure.

The points at which a DNA single strand exits its hybridization axis and enters an adjacent helix to continue its hybridization in the second helical axis.

The short DNA oligomers (usually 20–60 nucleotides long) used to staple different segments of the scaffold together and form a pre-determined geometry.

Distances between two consecutive crossovers, which are a multiple of 7 bp in a honeycomb packing and a multiple of 8 bp in a square packing.

A DNA structure approximating a geometrical shape at its edges, through tiling of its surfaces by non-overlapping polygons that do not leave a gap.

Topological surface features of a DNA nanostructure, in the forms of protrusions and recessions that are capable of forming base-stacking interactions between two shape-complementary features, thus binding them.

Also known as deoxyribozyme, DNA enzyme or catalytic DNA. A DNA oligonucleotide with a specific sequence that performs a chemical reaction similar to enzymes.

Strand displacement reaction (SDR). A hybridization scheme in which a longer complement (fuel strand) displaces a shorter complement (output strand) via branch migration to form a more stable duplex.

The unpaired segment of a partial DNA duplex, which can act as a seeding region to start a branch migration and a strand displacement reaction.

Small DNA oligonucleotides that can move on a molecular track by a series of hybridization–dehybridization cycles.

Originally an architectural concept: a particular type of structure that maintains its integrity through pervasive tensional forces. In a tensegrity, each individual structural element is under stress, but the overall structure is stable.

Oligonucleotides or small peptides that bind specifically to a target molecule.

The mechanisms by which molecular motors use random thermal noise to produce directional motion.

Differences in DNA replication rates between bacteria and eukaryotes

DNA replication has been extremely well studied in bacteria, primarily because of the small size of the genome and large number of variants available. E. coli has 4.6 million base pairs in a single circular chromosome, and all of it gets replicated in approximately 42 minutes, starting from a single origin of replication and proceeding around the chromosome in both directions. This means that approximately 1000 nucleotides are added per second. The process is much more rapid than in eukaryotes. Table 1 summarizes the differences between bacterial and eukaryotic replications.
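The rate quoted above can be checked with a short calculation. This is a back-of-the-envelope sketch using only the figures in the paragraph; reading the "1000 nucleotides per second" figure as a per-fork rate (two forks moving in opposite directions from the single origin) is an assumption made here, and the variable names are illustrative.

```python
genome_bp = 4_600_000        # E. coli chromosome size, from the text
replication_minutes = 42     # approximate replication time, from the text
forks = 2                    # assumption: bidirectional replication means two forks

# Base pairs copied per second across the whole chromosome.
total_rate = genome_bp / (replication_minutes * 60)
# Share of that rate carried by each replication fork.
per_fork_rate = total_rate / forks

print(round(total_rate), round(per_fork_rate))  # prints: 1825 913
```

Under that assumption each fork copies roughly 900 base pairs per second, which is consistent with the rounded figure of approximately 1000 nucleotides per second.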

Table 1. Differences between prokaryotic and eukaryotic replication

Two tricks in one bundle: helix–turn–helix gains enzymatic activity

Many examples of enzymes that have lost their catalytic activity and perform other biological functions are known. The opposite situation is rare. A previously unnoticed structural similarity between the λ integrase family (Int) proteins and the AraC family of transcriptional activators implies that the Int family evolved by duplication of an ancient DNA-binding homeodomain-like module, which acquired enzymatic activity. The two helix–turn–helix (HTH) motifs in Int proteins incorporate catalytic residues and participate in DNA binding. The active site of Int proteins, which include the type IB topoisomerases, is formed at the domain interface and the catalytic tyrosine residue is located in the second helix of the C-terminal HTH motif. Structural analysis of other ‘tyrosine’ DNA-breaking/rejoining enzymes with similar enzyme mechanisms, namely prokaryotic topoisomerase I, topoisomerase II and archaeal topoisomerase VI, reveals that the catalytic tyrosine is placed in a HTH domain as well. Surprisingly, the location of this tyrosine residue in the structure is not conserved, suggesting independent, parallel evolution leading to the same catalytic function by homologous HTH domains. The ‘tyrosine’ recombinases give a rare example of enzymes that evolved from ancient DNA-binding modules and present a unique case for homologous enzymatic domains with similar catalytic mechanisms but different locations of catalytic residues, which are placed at non-homologous sites.

The wealth of biochemical, sequence and structural information accumulated over the years of molecular biology provides examples of proteins that change function in the course of evolution (1–4). Enzymes having a chemical requirement for invariant amino acids in the active site are particularly vulnerable to selection pressure. Using sequence similarity, one can detect proteins evolutionarily related to enzymes but lacking catalytic activity due to disruption of their active sites. These proteins may function, for example, as transcription regulators (4). Given that an overwhelming majority of homologs to such proteins are indeed enzymes and that the non-catalytic variants are uncommon (4), there is little doubt about the direction of evolution in these cases: the enzyme has lost its activity/acquired a new function. The reverse path of evolution is rather rare. There are few examples of normally non-enzymatic domains that gain catalytic activity (5,6), particularly for transcription regulators. One such example is discussed here.

The helix–turn–helix (HTH) DNA-binding motif is ubiquitous and detected in many transcription regulators (7–9). HTH transcription factors are diversified across a variety of orthologous families and the HTH motif is incorporated into several structural scaffolds (9). The most common of these scaffolds, hereafter referred to as homeodomain-like (HHTH), has a hydrophobic core of two α-helices (helices B and C) completed by another, usually N-terminal, α-helix (helix A). This structure can be described as a right-handed three-helical bundle (Fig. 1b). Some examples of HHTH proteins are homeodomains, AraC-type transcriptional activators and members of the winged HTH family (HHTHw), typified by the C-terminal domain of catabolite gene activator protein (CAP) (7). HTH bundles can usually be distinguished from other three-helical structures by a sequence signal in the HTH motif (8�). Very divergent representatives with known spatial structure can be recognized by the characteristic packing of α-helices B and C at nearly a right angle to each other (Fig. 1b, helices B1–C1 and B2–C2; Fig. 1c). The turn between α-helices B and C offsets α-helix C so that the N-terminal part of C is packed against the middle of B. α-Helix B is usually short (two or three turns) and C, which binds to the DNA major groove, is longer (12). A monophyletic origin for most HHTH proteins has been proposed (8).

Structural similarity between Cre recombinase and MarA. Ribbon diagrams of (a) Cre recombinase from bacteriophage P1 (pdb entry 1crx, residues A154�) and (b) MarA transcription regulator from E.coli (pdb entry 1bl0, residues A9�) in complex with DNA drawn by Bobscript (48), a modified version of Molscript (49). The structures were superimposed and then separated for clarity. N- and C-termini are labeled. The spatially equivalent structural elements are colored correspondingly in the two structures. N- and C-terminal HHTH domains are colored red and blue, respectively. α-Helices of the HTH motifs are in darker color. The turns in the HTH motifs are yellow and the loop connecting two HHTH domains is green. Long insertions (i1 and i2) in the first HHTH domain of Cre recombinase are shown in gray. DNA chains are orange. α-Helices are labeled A, B and C followed by a domain index (1 or 2). Side chains of active site residues in Cre recombinase are shown in ball-and-stick presentation. (c) The stereodiagram of Cre recombinase (red) and MarA (blue) superposition. The Cα traces of protein and DNA segments are shown. The regions used in r.m.s.d. minimization are outlined in darker colors. Superposition was performed using the InsightII package (MSI Inc) according to the DALI alignment (34). (d) Structure-based sequence alignment of Cre recombinase (1crx) and MarA (1bl0) generated by DALI (34). The starting and ending residues are numbered and the segments are labeled with the same letters as in (a) and (b). Color shading of the regions is the same as in (a) and (b). Invariant residues are shown in bold white letters boxed with black and conserved substitutions are shown in bold. The number of residues omitted from the alignment are shown in parentheses. The active site residues are marked with a red dot above the alignment and their side chains are displayed in (a).

Site-specific recombination allows living organisms to rearrange and redistribute their genetic content by cutting and rejoining DNA segments at specific sequences. Recombinases catalyze DNA breakage, strand exchange and ligation. One of the two major recombinase types, the λ integrase family (Int), uses a tyrosine nucleophile in a reaction that proceeds through a stable 3′-phosphotyrosine DNA–enzyme intermediate (13,14). The structures of several family members, namely bacteriophage λ integrase (15), bacteriophage HP1 integrase (16), XerD from Escherichia coli (17) and Cre recombinase from bacteriophage P1 (Fig. 1a) (18), have recently been solved. The most extensive structural information obtained concerns the DNA-binding mode and mechanism of Cre enzyme (19,20). X-ray crystallography revealed that type IB topoisomerases (21), which include eukaryotic (22,23) and viral (24) enzymes, also belong to the Int family due to extensive conservation of the structural core, active site arrangement and the catalytic mechanism (25,26).

The Int family has always been treated as a unique fold without much structural similarity to other proteins (27�). SCOP (30,31) groups Int family structures into the fold named ‘DNA-breaking/rejoining enzymes’ of α+β class. CATH (32) places them in the ‘mainly α’ class with non-bundle architecture. However, structure similarity searches with such programs as DALI and VAST initiated with Cre recombinase coordinates (18) (pdb entry 1crx, Fig. 1a) reveal a highly significant and striking match that spans the entire length of the MarA transcriptional activator molecule (33) (pdb entry 1bl0, Fig. 1b). DALI (34,35) superimposes 88 Cα atoms of 1crx (322 residues) and 1bl0 (116 residues) with a Z score of 4.0, r.m.s.d. of 3.3 Å and 17% identity in the resulting sequence alignment (Fig. 1d). VAST (36) aligns 78 Cα atoms of these proteins with a P value of 0.0002, r.m.s.d. of 2.5 Å and a sequence identity of 16.7%. Additionally, superposition of Cα traces of Cre recombinase and MarA results in an almost perfect superposition of DNA molecules present in the crystals (Fig. 1a–c) despite the fact that DNA coordinates were not used in r.m.s.d. minimization. Thus the modes of DNA binding are essentially identical for Cre recombinase and MarA. Such an extensive structural resemblance combined with similar substrate binding and non-random sequence identity (18%, Fig. 1c) argues for homology (3,37) between DNA-breaking/rejoining enzymes and MarA. Surprisingly, similarity between the two proteins remained unnoticed to date.

MarA is a member of the AraC family of transcription activators that control expression of a variety of genes (33). The MarA structure consists of two HHTH modules with a unique mutual arrangement, previously unrecognized for multi-HTH proteins, in which two HHTH domains are approximately related by a translation (33) (Fig. 1b). This arrangement results in tight packing of the two domains and places two almost parallel DNA-binding helices in the major groove at a separation of one DNA double helix turn (Fig. 1b). Both MarA domains have structural counterparts in the Cre recombinase–DNA complex and all six MarA α-helices are superimposable between the two proteins (Fig. 1). The homology of Cre and MarA suggested by structural, functional and sequence similarity implies that the catalytic segment of Int proteins consists of two consecutive HHTH domains. However, it is difficult to determine at present if the common ancestor of Int and MarA already contained two HHTH domains or if duplications in these proteins occurred in parallel. Interestingly, among the four articles describing different independently solved Int protein structures (15–18), only one discusses the structural similarity of the first HHTH domain in Int proteins with the HTH motif of the catabolite activator protein DNA-binding domain (17). X-ray crystallography revealed that the second HHTH domain, which contains a catalytic tyrosine residue, is conformationally variable between different representatives of the family, as well as between different DNA complexes of the same Cre protein, and thus might fold into the HHTH structure upon DNA binding only (15�,27�). For example, in λ integrase the catalytic tyrosine is modeled in a flexible β-strand-like region. Such flexibility might be necessary for proper functioning of the enzymatic HHTH domain. 
It is well known that the active sites of many enzymes include regions of higher flexibility to accommodate changes in the substrate during catalysis. Therefore, it is likely that the second HHTH domain, which contains most of the active site residues (Fig. 1a and c), acquired some structural flexibility while the first HHTH domain, which is used mostly for DNA binding in a standard HTH-like manner, remained rigid.

Thus the Int family fold has likely evolved by a duplication of an ancient HHTH protein (Fig. 1a, red and blue). The first HHTH domain was elaborated with long insertions (Fig. 1a, gray) placed in the ‘turn’ region (Fig. 1a, yellow) of the HTH motif. These insertions are structured in subdomains that contain small β-sheets (Fig. 1a, gray). It is not unusual for HTH proteins to incorporate insertions in ‘turn’ regions, found for example in the endonuclease FokI (38). The presence of these subdomains disrupting the HTH motif masks the sequence signal and prevents motif detection in Int proteins by sequence analysis. The first HHTH domain of Int proteins is used primarily for DNA binding while the second HHTH domain is adapted to a catalytic role.

The following question arises: are there other examples of HTH domains that are not only present in an enzyme as nucleotide-binding modules but possess enzymatic activity (i.e. carry at least some of the catalytic residues)? PDB (39,40) searches by DALI (34,35) and VAST (36,41) reveal domains of different topoisomerases that contain catalytic tyrosine residues as members of the HHTH fold. The presence of HHTH domains in type IA, II and VI topoisomerases (42–44) has been detected previously (44�). Topoisomerase IA, II and VI HHTH domains contain a small amount of β-sheet and should be classified as CAP-like ‘winged’ HTH domains (Fig. 2a, c and d). Notably, all of these enzymes possess a catalytic mechanism similar to the one established for Int proteins, i.e. tyrosine is utilized as a nucleophile and found in an HHTH domain. The Int family includes type IB topoisomerases. Thus an evolutionary connection exists between all ‘tyrosine’ DNA-breaking/rejoining enzymes with known structure, namely type IA, IB, II and VI topoisomerases, which all contain an enzymatic HHTH module. Structure superpositions of these domains in the four enzymes reveal that the position of the catalytic tyrosine residue is not structurally conserved (Fig. 2e). In the topoisomerase VI structure (44), Tyr103 is placed in α-helix B (Fig. 2a); in the Int family, including topoisomerase IB (21,22,24,47) and Cre recombinase (18), Tyr324 (Cre numbering) is incorporated in α-helix C (Fig. 2b); in topoisomerase IA (42), Tyr319 is at the C-terminal end of the first β-strand in the ‘wing’ segment of the HHTH domain (Fig. 2c); and in topoisomerase II (43), Tyr782 is located after a long loop at the beginning of the second β-strand in the ‘wing’ (Fig. 2d). 
The sites in homologous HTH domains where catalytic tyrosines are located are not homologous; therefore, the catalytic properties of HTH domains in DNA-breaking/rejoining enzymes are likely to have evolved independently in parallel. Thus catalytic HHTH domains provide a unique example of homologous enzymes with a similar mechanism but different locations of active site residues, which are placed at non-homologous sites.

Catalytic HHTH domains with a topoisomerase-like mechanism. Ribbon diagrams of (a) DNA topoisomerase VI A subunit from Methanococcus jannaschii (pdb entry 1d3y, residues A72�), (b) Cre recombinase from bacteriophage P1 (pdb entry 1crx, residues A286�), (c) topoisomerase I from E.coli (pdb entry 1ecl, residues 279�), and (d) topoisomerase II from yeast Saccharomyces cerevisiae (pdb entry 1bgw, residues 699�) were drawn by Bobscript (48), a modified version of Molscript (49). Corresponding α-helices are labeled A, B and C. α-Helices of the HTH motifs are in a darker color. The turns in the HTH motifs are yellow. β-Strands are shown as purple arrows. The side chains of catalytic tyrosine residues are shown in ball-and-stick presentation and are colored red. Dots in (c) replace a long partially disordered insertion. (e) Structure-based sequence alignment of domains shown in (a)–(d). The starting and ending residues are numbered and the three helices are labeled. The number of residues omitted from the alignment are shown in brackets. Color shading of these regions matches that in (a)–(d). Residues at conserved hydrophobic positions are shown in bold. The catalytic tyrosine residues are shown in white and are boxed with red. The HHTH domain of 1ecl (c) is circularly permuted, which is reflected in the residue numbering of the segment.

DNA and genetic organization

The physicochemical properties conferred by DNA sequence not only determine bending and melting preferences, but also correlate strongly with the genetic organization of both eukaryotic and bacterial chromosomes. In general, the coding sequences of genes have a G/C-rich bias [45, 112]. In part, this is because the codons for the most abundant amino acids also have a G/C-rich bias [113, 114]. The corollary is that noncoding DNA sequences, including introns as well as 5′ and 3′ flanking DNA sequences, are generally more A/T-rich. Indeed the most A/T-rich and most thermodynamically unstable DNA sequences in the Saccharomyces cerevisiae genome are located in 3′ flanking regions [110]. This distribution of base composition on a genomic scale implies that, on average, coding sequences are stiffer or less bendable, whereas noncoding sequences are both more flexible and more susceptible to strand separation. However, in apparent contradiction to these variations in flexibility, in eukaryotic chromosomes coding sequences have a higher nucleosome occupancy than noncoding sequences [45, 112]. But, again, this pattern of occupancy is possibly related to another sequence-dependent physical property of the polymer, the higher intrinsic entropy of certain A/T-rich sequences [32].
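The base-composition bias described above is usually quantified as the G/C fraction of a sequence. The sketch below shows one simple way to compute it; the function name `gc_fraction` and both example sequences are invented for illustration and are not taken from any real gene or from the cited studies.

```python
def gc_fraction(seq):
    """Fraction of unambiguous bases (A, C, G, T) in seq that are G or C."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    return sum(base in "GC" for base in counted) / len(counted)

# Hypothetical sequences: a G/C-rich stretch standing in for coding DNA
# and an A/T-rich stretch standing in for a 3' flanking region.
coding_like = "ATGGCCGGCAAGCTGGCGGAC"
flanking_like = "TTTATATAAATATTTAATTAA"

print(gc_fraction(coding_like), gc_fraction(flanking_like))
```

A lower G/C fraction means fewer triple-hydrogen-bonded G·C pairs, which is one reason A/T-rich flanking regions are described as less thermodynamically stable and more prone to strand separation.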

The occurrence of the more A/T-rich sequences in the flanking regions of genes has functional significance. At the 5′-end of a transcription unit there is an obvious correlation with the requirement for RNA polymerase to melt DNA prior to transcription initiation. But at the 3′-ends of transcription units, polymerase dissociates and releases the constrained unwound DNA so that it reforms a double helix. One possibility is that such regions serve as topological sinks, absorbing by writhing any positive superhelicity generated in advance of the transcribing enzyme. This would block the transmission of any such superhelicity to a neighbouring gene with the potential for disrupting its chromatin structure. Instead, the writhed DNA would serve as an appropriate substrate for relaxation by topoisomerases, in particular topoisomerase II, which is preferentially associated with actively transcribed genes [115]. Topoisomerase II, together with topoisomerase I, is also found in regions of low nucleosome occupancy at promoters [115]. However, measurement of the association of topoisomerase II with its optimal binding sites is precluded because of their highly repetitive and redundant nature.

The relationship of the physicochemical properties of DNA to chromosome organization and function is not only apparent at the level of individual genes and transcription units, but is also a feature of whole bacterial chromosomes. These chromosomes comprise, in general, a single circular DNA molecule which can vary in length from 0.5 Mb to 6–10 Mb. Remarkably in these chromosomes, at least in most γ-Proteobacteria, gene order is highly conserved such that those genes that are highly expressed during exponential growth are clustered near the origin of DNA replication, whereas those that are more active during episodes of environmental stress resulting in the cessation of growth are more frequent in the vicinity of the replication termini [116, 117]. However, not only is there a gradient of gene organization from origin to terminus, but also this gradient correlates, on average, with a gradient of base composition so that in each replichore the most stable, G/C-rich, DNA is close to the origin while the least stable is at the terminus [117]. This average pattern of course includes wide variations at the level of individual genes. Yet another feature that exhibits a graded response from origin to terminus is the distribution of binding sites for DNA gyrase [116, 118], a topoisomerase that inserts negative superhelical turns into DNA [55]. Again these are concentrated primarily in proximity to the origin of replication and thus create the potential for the DNA in this region to be more highly negatively supercoiled than that close to the terminus. This overall pattern of organization can couple chromosome structure to energy availability [69]. When bacteria are shifted to a fresh rich growth medium, ATP levels rise, activating DNA gyrase and thus increasing the negative superhelical density of the chromosome [60]. This would be localized to the origin-proximal region and would, in turn, activate the genes producing the necessary components for growth – the transcription and translation machinery – as well as providing an appropriate environment for DNA replication. 
Once DNA replication is initiated, the passage of the replisomes along the two replichores would by itself generate a gradient of superhelicity by the Liu/Wang principle [56], with the more negatively supercoiled DNA again being located closer to the origin and the more relaxed DNA close to the terminus. Again, by analogy to transcription, the DNA close to the terminus, in concert with topoisomerases, might act as a topological barrier between the two replichores. The bacterial chromosome thus functions as a topological machine in which the overall distribution of DNA sequences reflects the coupling between the processing of the replisomes and gene expression.

Although the Liu/Wang principle was initially conceived as applying to naked DNA, it is equally valid when considered in the context of higher order structures generated by DNA packaging. In eukaryotic nuclei, despite the existence of the 30 nm fibre in vivo being recently questioned [118], the left-handed coiling of the nucleosome stacks responds to torsional forces – such as those generated by transcription – by unwinding on application of positive torsion and correspondingly rewinding with applied negative torsion [119].

Contributors and Attributions

Connie Rye (East Mississippi Community College), Robert Wise (University of Wisconsin, Oshkosh), Vladimir Jurukovski (Suffolk County Community College), Jean DeSaix (University of North Carolina at Chapel Hill), Jung Choi (Georgia Institute of Technology), Yael Avissar (Rhode Island College) among other contributing authors. Original content by OpenStax (CC BY 4.0).

Dr. Todd Nickle and Isabelle Barrette-Ng (Mount Royal University) The content on this page is licensed under CC SA 3.0 licensing guidelines.