Sender: (Yaneer Bar-Yam)
To: complex-science
Date: Wed, 09 Jul 2008 23:54:14 -0400
X-Original-Date: Mon, 7 Jul 2008 21:29:48 -0400 (EDT)
Subject: What is a gene? A dynamic & triadic definition of a gene

(Yaneer, if it is not too late, please replace my previous post with this one. Thanks. Sung)

The most widely accepted definition of a gene during the past four decades has been a stretch of DNA that codes for a protein. Although this simple definition served 20th-century molecular biology and genetics well, the new data that have been emerging since the mid-1990s (when DNA microarrays were invented) have made the protein-centered definition of a gene obsolete [1,2,3]. A new definition proposed by Gerstein and his coworkers at Yale now includes as a gene those DNA regions that code for RNA as well [2]:

    "A gene is a union of genomic sequences encoding a coherent set of potentially overlapping functional products."
    (1)

The important phrase here is "functional products", by which the authors mean proteins and RNA molecules that are biologically active.

The new definition of a gene given in (1) was motivated by the recent unexpected finding [1,3] that a large portion of the human genome (about 30% of the DNA mass), although not coding for any proteins, nevertheless codes for RNA molecules whose functions have not yet all been characterized.

There are two aspects of the definition of a gene given in (1) that I believe require revision:

i) It is too static, being based solely on gene "products", i.e., proteins and RNA, which are "equilibrium structures". According to Prigogine (1917-2003) [4], there are two fundamental classes of structures in nature -- equilibrium structures (e.g., rocks, chairs, DNA double helix, nucleotide or amino acid sequences) and dissipative structures (e.g., the flame of a candle, all sorts of gradients, action potentials, gene expression profiles). One convenient way to distinguish dissipative structures from equilibrium structures is to remember that, when energy input is stopped, the former disappear but the latter remain. For example, when a computer is turned off, the contents of its primary memory (a dissipative structure) disappear, but the contents of its secondary memory on the hard disk (an equilibrium structure) remain.

ii) It excludes those DNA regions that regulate gene expression (called promoters, enhancers, silencers, etc.) without producing any proteins or RNA. In other words, Gerstein et al.'s definition of a gene excludes "dissipative structures", which would include all regulatory processes in the living cell. This is what Gerstein et al. state [2]:

    "Although regulatory regions are important for gene expression, we suggest that they should not be considered in deciding whether multiple products belong to the same gene. . . ."
    (2)

To remedy these perceived shortcomings, I suggest that the concept of "dissipative structures" [4] be incorporated into the definition of a gene itself.
One way to do this is as follows:

    "A gene is a DISSIPATIVE STRUCTURE that embodies (or stores) not only genetic information (in the form of a nucleotide sequence of DNA regions) but also mechanical energy (in the form of conformationally strained DNA regions) generated from chemical reactions catalyzed by enzymes."
    (3)

The fact that active regions of DNA carry mechanical energy, for example in the form of DNA supercoils, has been well established [5]. Such mechanical energy stored in DNA has been variously referred to as conformons [6] and "Stress-Induced Duplex Destabilizations" or SIDDS [5].

The definition of a gene given in (3) is tantamount to postulating that a gene is a molecular machine, composed of DNA segments and associated proteins, that stores mechanical energy generated from chemical reactions and uses this energy to transcribe its sequence information into RNA molecules whenever and wherever needed in the cell, for the right duration of time.

The definition of a gene given by (1) can be made compatible with the definition given by (3) if we make the following two postulates:

    "The whole DNA carries three kinds of genes -- p-genes coding for proteins, r-genes coding for RNA, and d-genes coding for DNA molecules."
    (4)

The existence of d-genes is self-evident, since DNA serves as the template for its own replication and this ability of DNA is heritable from one cell generation to the next.

    "DNA carries not only genetic/sequence information but also the mechanical energy (called conformons or SIDDS) to power gene expression."
    (5)

In other words, by combining the dissipative structure concept of Prigogine [4] and the conformon concept introduced into molecular biology more than three decades ago (reviewed in [6]), a new definition of a gene can be formulated in two parts as follows:

    i) "DNA carries three kinds of genes, coding respectively for proteins (p-genes), RNA molecules (r-genes), and DNA molecules (d-genes)."
    (6)

    ii) "DNA stores mechanical energy in the form of conformons or SIDDS that powers the spatiotemporally organized motions of chromatin in order to express p-, r- and d-genes in response to the signals received from the cytosol."
    (7)

Statement (6) can be regarded as a definition of terms that is compatible with the facts; what is original in the proposed 'triadic' definition of a gene is contained in Statement (7), in the concept of conformons [6] or SIDDS [5]. Conformons are defined as the sequence-specific conformational strains of biopolymers that carry 'ordered energy' to power goal-directed molecular motions [6]. The first direct experimental evidence for conformons in DNA was provided by DNA supercoils [5], and for conformons in proteins by the single-molecule measurements of myosin motions along actin filaments [7].

Also, Statement (6) deals with the informational aspect of a gene, while Statement (7) is concerned primarily with the energetic aspect of a gene, consistent with the information-energy complementarity principle believed to underlie all self-organizing processes in nature [8].

With all the best.

Sung
___________________________________________
Sungchul Ji, Ph.D.
Department of Pharmacology and Toxicology
Rutgers University
Piscataway, N.J., 08855

References:

[1] Pearson, H. (2006). Genetics: What is a gene? Nature 441:398-401.
[2] Gerstein, M. B. et al. (2007). What is a gene, post-ENCODE? History and updated definition. Genome Research 17:669-681.
[3] Greally, J. M. (2007). Genomics: Encyclopaedia of humble DNA.
    Nature 447:782-783.
[4] Prigogine, I. (1977). Dissipative Structures and Biological Order. Adv. Biol. Med. Phys. 16:99-113.
[5] Benham, C. J. (1996). Duplex Destabilization in Superhelical DNA is Predicted to Occur at Specific Transcriptional Regulatory Regions. J. Mol. Biol. 255:425-434.
[6] Ji, S. (2000). Free energy and information content of Conformons in proteins and DNA. BioSystems 54:107-130.
[7] Ishijima, A., Kojima, H., Higuchi, H., Harada, Y., Funatsu, T. and Yanagida, T. (1998). Simultaneous observation of individual ATPase and mechanical events by a single myosin molecule during interaction with actin. Cell 92:161-171.
[8] Ji, S. (2002). The Bhopalator: An Information/Energy Dual Model of the Living Cell (II). Fundamenta Informaticae 49(1-3):147-165.