Chapter 15: Knotted DNA

Why is the genetic code written in strings? The text you are now reading is one-dimensional: the information that this line of text carries does not depend on its position relative to the line above or the line below. You could cut the page into strips, each containing one line, and paste them end to end – it would read the same. Images and figures are essentially two-dimensional: cut the image on the cover of this journal into strips and it would make no sense at all. In this sense the building blocks of life (DNA, RNA, and, at least initially, proteins) are really the building strings of life, and we can ask: why?

In Candide, Voltaire warned us that it is unlikely that this is the best of all possible worlds: things could be otherwise. The genetic code could be dimensionally otherwise, but there is a reason why it isn’t. Let’s begin with the premise that the code must be made out of physical stuff, and that it must be replicated locally, where local replication means that each letter in the code is copied next to (close in physical space to) the original. Our choices for the dimension run between zero and three: zero because it is hard to imagine what negative dimensions would be, and three because it is the apparent physical maximum if we assume the code does not depend on time.

A zero-dimensional code would have no continuity and therefore no order – you could think of a bag of Scrabble letters. There may be information in the letter distribution (how many of each letter you have), but you don’t know which letter comes after which. This is inefficient – a great deal of information can be kept in the order itself. A randomly ordered collection of the words on this page wouldn’t tell you as much as the page does now, I hope.

Three dimensions present problems in replication. This is an age-old challenge in manufacturing: in order to build an object we have to break it down into a set of pieces that are, in a sense, at most two-dimensional.
You can make, say, a solid cube with a mold, but if the cube were truly three-dimensional, in the sense that the inside of the cube might be different from the surface, a mold isn’t going to do it. You can’t determine interior detail with a mold. The problem is roughly this: a three-dimensional object takes up all of physical space, so there is nowhere, locally, to replicate to.

Many things seem to have a fractional dimension. For example, an irregular coastline is often modeled as having dimension between one and two. These constructions require a kind of infinite division, which is not possible with the sort of letter or discrete-element code we are considering. We are then left with two options: one dimension or two dimensions.

It is common knowledge that the eukaryotic chromosome is a marvel of string compaction. The DNA is supercoiled and wrapped around the nucleosome core, like thread on a spool; the spools are packed together in the 30 nm fiber, which is then looped; the loops are made into rosettes, the rosettes combined into coils, and the coils together make up the chromatids. It’s a long string in a small space, and as such it is subject to one of the fundamental laws of nature: long strands in small spaces tend to become entangled. But if we locally replicate an entangled strand, topology tells us that the result is a pair of strands that are entangled with themselves and with each other. How are they going to separate for cell division?

Thus we should not be surprised that the tale of compaction is also one of tangle management. One effect of supercoiling is to stiffen the strand, making it less likely to tangle. Since supercoiling is itself a kind of tangling, this is a tradeoff – a controlled, manageable tangling is induced to reduce random entanglement, sort of like braiding hair. The length wrapped around a nucleosome, 146 base pairs, might be about as much as you could let flop around without getting tangled.
Each level of compaction is also a method of restricting random motion of the strand, like wrapping the last foot or two tightly around a coiled extension cord.

DNA lives in a volatile environment: free radicals shoot through at regular intervals, causing breaks in the string that need to be repaired. Since there is so much length packed in, sometimes these repairs induce tangling. The topoisomerase type II enzymes, which cut the strands, pass other strands through, and rejoin the open ends, are required for the detangling. This would be a great method of untangling extension cords or Christmas lights, since you don’t have to pull the strand all the way through the tangle, except that the result would be a cord covered in electrical tape.

So dimension one presents some design challenges. Dimension two, a surface, has a great deal to recommend it. Information arrayed on a surface is accessible from the half space on either side, and so is easy to replicate. Consider one of our best replicators, the copy machine. Here the information is replicated along a one-dimensional front: the light bar you see processing underneath the glass if you copy with the cover open.

It is pretty easy to imagine a two-dimensional genetic code. It could be two-ply, like a paper towel. Replication would consist of peeling the two plies apart, while matching elements attach to make each daughter ply half of a new two-ply towel, which would then peel apart, and so on. Note that in the peeling apart, the replication would be proceeding along a one-dimensional front, just as it does with the copy machine.

Moreover, a surface is far less likely to have the entanglement problems that a filament has. You can roll up a section of a plane, as we do a rug, which compactifies it quite a bit. You could also fold it accordion style. Of course, some two-dimensional shapes would have topological obstructions.
You could not, for example, lift a surface off a sphere without cutting along a seam – though you could have enzymes that cut the genetic fabric. A finite cylinder would have no such obstruction: the replication front would be a circle around the tube that processed along the length of the tube.

Another advantage of a two-dimensional informational array is an extra dimension of local information for error checking; that is, we can require consistency up and down as well as left and right. Damage repair would be much easier – those rampaging radicals would poke holes, not cut strands, and the holes would be easy to fill, with no chance of joining the wrong open ends, as happens with strings. In fact these string misrepairs are believed to be a cancer mechanism.

So two dimensions seem to be preferable, on grounds of both ease of replication and replication fidelity. But nature has chosen one dimension. Why?

The answer can be seen in our own information processing activity. For all of our fascination with images (photographs, paintings, graphs, movies), the greater part of our information manipulation is in one-dimensional forms: text and speech. This is for a pretty simple reason. It is much easier to edit in one dimension than in two. And this is for a pretty simple mathematical reason. What is editing? Editing consists of making new boundaries by cleaving sets of code, then aligning the boundaries in a new pattern. The boundary of a set is of dimension one less than the set itself. So, for a one-dimensional set, a string, the boundary is zero-dimensional – the two endpoints. Think of how easy it is to cut and paste in a word processor – you simply place one boundary next to another. You can paste a novel into the middle of a sentence if you want to. Compare this with editing in two dimensions, say with image processing software such as Photoshop.
To change any section of the image you must deal with the boundary of that section, which is one-dimensional. Another boundary is not likely to match up. For example, on a map you cannot simply replace one country with another (say, paste China into Mexico) – their boundaries (borders) will not align. To paste an arbitrary region into another arbitrary region, you generically need to scale and distort to have the boundaries match (this is one reason image editing is so time consuming). With a boundary of any length, a match is very unlikely to happen by chance.

Which brings us to evolution. The general idea of evolution is that our code will undergo random edits, and that natural selection will choose the edits that are beneficial. The problem with a two-dimensional information array is that almost no random cuts can become completed edits, so evolution will be very slow if it happens at all. The editing capabilities in one dimension make it much easier to make new from old, so a system based on one-dimensional information would evolve much faster than a two-dimensional one. Two dimensions has too much fidelity, because of the difficult-to-satisfy requirement of boundary consistency.

So two dimensions would be the place to be if we were sure we had the final design, and were not interested in any further adaptation. But in the competitive, dynamic world we live in, one dimension is the way to go.
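
The editing argument can be made concrete with a small counting sketch in Python. It is an illustration under loudly simplified assumptions, not a model of any real genome: the "code" is either a string of length n or an n × n grid, a random edit cuts out one piece and pastes in a piece cut from elsewhere, and the only two-dimensional pieces considered are axis-aligned rectangles (a generous choice, since irregular regions would match even less often). The function names `segments`, `rectangles`, `splice`, `fraction_pastable_1d`, and `fraction_pastable_2d` are hypothetical helpers introduced only for this example.

```python
from itertools import product


def segments(n):
    """All non-empty contiguous segments (start, end) of a string of length n."""
    return [(i, j) for i in range(n) for j in range(i + 1, n + 1)]


def rectangles(n):
    """All axis-aligned rectangles (top, left, height, width) in an n x n grid.

    Assumption: rectangles stand in for arbitrary 2D regions.
    """
    return [(r, c, h, w)
            for h in range(1, n + 1) for w in range(1, n + 1)
            for r in range(n - h + 1) for c in range(n - w + 1)]


def splice(text, cut, donor):
    """Replace text[cut[0]:cut[1]] with donor; the two endpoints simply abut."""
    i, j = cut
    return text[:i] + donor + text[j:]


def fraction_pastable_1d(n):
    """Fraction of (hole, piece) pairs that yield a valid edited string.

    By construction every pair works: the string just grows or shrinks.
    """
    host, donor = "a" * n, "b" * n
    segs = segments(n)
    ok = 0
    for hole, piece in product(segs, segs):
        result = splice(host, hole, donor[piece[0]:piece[1]])
        expected_len = n - (hole[1] - hole[0]) + (piece[1] - piece[0])
        ok += len(result) == expected_len
    return ok / len(segs) ** 2


def fraction_pastable_2d(n):
    """Fraction of (hole, piece) rectangle pairs whose boundaries match.

    A piece fits a hole only if height and width agree, because the
    surrounding grid cannot shift to make room.
    """
    rects = rectangles(n)
    fits = sum(1 for hole, piece in product(rects, rects)
               if hole[2:] == piece[2:])
    return fits / len(rects) ** 2


if __name__ == "__main__":
    print(splice("the cat sat", (4, 7), "a whole novel"))
    print(fraction_pastable_1d(4), fraction_pastable_2d(4))
```

In this toy setting every one-dimensional cut-and-paste completes, while only a small fraction of rectangle pairs fit, and that fraction shrinks as the grid grows. This mirrors the claim that almost no random two-dimensional cuts can become completed edits.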

A type-2 topoisomerase cleaves a DNA strand, passes another through the break, and then rejoins the severed ends. Because it appears that this action is as likely to increase as to decrease entanglements, the question is: how are entanglements removed? We argue that type-2 topoisomerases have evolved to act at “hooked” juxtapositions of strands (where the strands are curved toward each other). This type of juxtaposition is a natural consequence of entangled long strands. Our model accounts for the observed preference of type-2 topoisomerases for unlinking and unknotting short DNA plasmids, and it explains the experimental observations well.