Protocells: what we have learned about minimal life and evolvability

In this paper we review lessons learned about two major evolutionary transitions from a bottom-up construction of protocells. We use a particular systemic protocell design process as a starting point for exploring two fundamental questions: (1) how may minimal living systems emerge from nonliving materials, and (2) how may minimal living systems support open-ended evolutionary richness?


Non-life-to-life transition
Novel functionalities in physicochemical systems can be generated naturally in three ways: by the assembly of structures (equilibrium processes), by self-organization (nonequilibrium processes), and by a combination of the two, through the evolution of structures (Rasmussen et al., 2001). Our approach to creating minimal living systems, which we define as protocells (Szostak et al., 2001, Rasmussen et al., 2004, Sole et al., 2007, Kurihara et al., 2015), utilizes both self-assembling and self-organizing processes. We investigate how a controlled environment, together with coupled self-assembly and externally driven self-organization, may generate minimal, self-replicating, physicochemical systems. Thus our systemic protocell approach requires a simple metabolism controlled by information, both kept together by a container.
We have successfully implemented a particular protocell around a ruthenium tris(bipyridine) complex, Ru(bpy)3, that uses light to catalyze redox reactions on precursors of both the amphiphiles and the information in bulk. The informational system serves as part of an electron relay that modulates the metabolic reaction rate, which in turn depends on the redox potential (the nucleobase composition) of the information molecule (DeClue et al., 2009).
In particular, we have established that the amphiphile production can be controlled by chemical information. The reduction potential of a nucleobase, 8-oxo-guanine [oxoG], can be exploited by the photocatalyst to produce amphiphiles, but not that of guanine (the next most easily oxidized nucleobase) or, by extension, those of A, C, U and T. Furthermore, a detailed investigation of the information-photocatalyst configuration showed that fatty acid vesicles influence the production rates, especially when both oxoG and the Ru(bpy)3 are independently attached through hydrophobic anchors into the container (Maurer et al., 2011).
Further, we have established a photochemical fragmentation scheme to ligate DNA oligomers. First, the deprotection of an oligomer is performed using a Ru(bpy)3 photosensitizer. Only then, and in the presence of a template, can this oligomer be ligated with another oligomer (Cape et al., 2012).
We have demonstrated several advantages of our systemic approach, which integrates three mutually supporting components: (i) self-assembly of a decanoic acid container; (ii) anchoring of a metabolic ruthenium complex to the container; and (iii) anchoring of a conjugated nucleic acid information complex to the container. We have further demonstrated (iv) container feeding and growth; (v) metabolically driven container replication; (vi) metabolically driven nucleotide oligomer ligation (part of replication); and (vii) one-pot metabolic production of both amphiphilic molecules and ligated oligomers, i.e., new information molecules. These are all key milestones toward the construction of a minimal living system. However, one key milestone must still be reached before full protocell integration can occur: implementing an effective DNA self-replication process based on template-directed ligation of two smaller oligomers.

Missing link: Template directed ligation
We can derive the dependence of the overall replication rate constant on hybridization energies, temperature and strand length by employing a model for the minimal ligation-based replication process of a single-stranded template, in which the ligation of oligomers forms the complementary replica. Within the template-directed replicator system, two complementary oligomers hybridize to a single-stranded template. An irreversible ligation reaction (i.e., formation of covalent bonds in a condensation reaction) transforms the oligomers into the complementary copy of the template. The newly formed double strand can dehybridize, thus allowing for iteration of the process. Throughout the replication mechanism, we neglect both the production of waste and the hydrolysis of the ligation product. The resulting overall reaction rate is derived in Constantinescu et al., 2016, and summarized in Fig 1, where two cases are discussed, both assuming parabolic template growth. When product inhibition is rate limiting, both longer strands and higher temperatures increase the replication rate. When hybridization is the rate-limiting factor, an optimal combination of temperature and strand length exists.
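The parabolic template growth assumed in both cases can be illustrated with the square-root rate law that is commonly used for product-inhibited template replicators, dc/dt = k sqrt(c), whose closed-form solution is c(t) = (sqrt(c0) + k t / 2)^2. The sketch below is a minimal illustration under that assumption only; it is not the model derived in Constantinescu et al., 2016, it does not include the dependence of k on temperature, strand length or hybridization energy, and the function names and numerical values are hypothetical.

```python
import math


def parabolic_growth_exact(c0, k, t):
    """Closed-form solution of dc/dt = k * sqrt(c) with c(0) = c0.

    c0 : initial template concentration (arbitrary units, assumed)
    k  : effective overall replication rate constant (assumed value)
    t  : elapsed time
    """
    return (math.sqrt(c0) + 0.5 * k * t) ** 2


def parabolic_growth_euler(c0, k, t_end, steps=10_000):
    """Forward-Euler integration of the same rate law, as a numerical check."""
    dt = t_end / steps
    c = c0
    for _ in range(steps):
        c += k * math.sqrt(c) * dt
    return c


if __name__ == "__main__":
    # Hypothetical parameters chosen only to make the growth visible.
    c0, k = 1e-6, 1e-3
    for t in (0.0, 250.0, 500.0, 1000.0):
        exact = parabolic_growth_exact(c0, k, t)
        approx = parabolic_growth_euler(c0, k, t) if t > 0 else c0
        print(f"t={t:7.1f}  exact={exact:.6e}  euler={approx:.6e}")
```

Under this rate law the template concentration grows quadratically in time rather than exponentially, which is why the effective rate constant k, and hence its dependence on strand length and temperature, determines how fast the replicator population expands.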
In the presented protocellular system, evolution may be defined in the following way: compositional information, which is defined as the content and location of oxoG in the information strand, determines the metabolic reaction rates through an electron relay. These processes require a variety of environmental conditions, including sacrificial proton and electron donors (DeClue et al., 2009, Maurer et al., 2011, Cape et al., 2012).
Thus a modification of the compositional information generally results in modified metabolic reaction rates. For the simple protocellular model in a fixed environment, we therefore expect Darwinian evolution to be a metabolic reaction rate optimization process in which, presumably, the overall replication rate (the phenotype) is enhanced. At the molecular, or "genotypic", level, this means a change (through selection and amplification) of the compositional information of the nucleotide strands as they are being inherited.

Expanding evolvability
We know from experimental and theoretical investigations that if constituent components and environment are too simple, only trivial emergent structures will be generated. As the (appropriate) diversity or complexity of the constituent components is increased, hierarchies or multilevel structures may emerge. It therefore seems natural to assume that this conjecture could also be extended once self-replication and simple Darwinian evolution have been achieved for a protocell. A discussion of the constituent components involved in a protocellular simulation is found in Fig 2. In practice, more variation could include more and different resource oligomers (short nucleotide libraries), changes in the fatty acid composition, and the addition of different photosensitizer molecules. Impact or performance could then be measured by the resulting metabolic rate, container division properties and life-cycle (generation) time. However, given our experimental experience, this would be a challenging and time-consuming enterprise, as each new component in the mix could in principle cause undesired (destructive) side effects. Bedau et al., 1998, propose a statistical characterization of evolutionary processes that aims to quantify the innovative potential of an evolutionary process by measuring the rate at which innovative changes are produced during the process. In this classification scheme our protocellular system falls into Class 2. Class 1 is a neutral evolutionary process and includes diffusion processes. Class 3 is defined as evolutionary processes with an apparent open-ended ability to innovate and includes examples from biological evolution and technological evolution.

Conclusions
(a) The distinction between non-living and living matter is best characterized as a grey zone, where minimal living systems have the properties discussed above. (b) If we require life to exhibit open-ended evolution, the presented protocell (or, for that matter, any published protocellular model we are aware of) does not qualify as a minimal living physicochemical process. However, if Class 2 evolution suffices, several of the published protocellular models, if successfully integrated and experimentally implemented, would qualify as minimal life-forms. (c) To enhance the evolutionary potential of a protocellular system (or any system), more richness has to be added to the system. How this system expansion could occur depends on the details of the system.
This work was in part supported by the EC Grant #318671.

Figure 1. Effective overall replication rate constant k as a function of strand length and temperature. (a) and (b) correspond to a template-directed replication mechanism that suffers from product inhibition, with a slow (i.e., rate-limiting) and a fast (i.e., not rate-limiting) ligation reaction, respectively (after Constantinescu et al., 2016). We note that the replication rate depicted in (a) has been obtained by Fellermann and Rasmussen, 2011, employing thermodynamic arguments as well as a polymer model for oligonucleotides that allows simulation of their diffusion and hybridization behavior.

Figure 2. Connection between the details included in the simulations and the ability of the simulations to generate targeted observables. The left side of the table summarizes the included physical model (each row). The right side of the table indicates the higher-order observable phenomena/functionalities generated by the simulation. The top of the table depicts the qualitative information details needed in a molecular model representation (the data structure) of the simulation (columns). Simulations with more detailed, and thus more complex, molecular components are able to generate increasingly complex dynamics and functionalities. As an example, data structure D3 includes enough molecular interaction details to allow the simulation to generate molecular self-assembly, e.g., micelle and vesicle formation. The last row is left open, as we conjecture that to obtain a higher evolutionary potential we need to add more components and/or resources to the system.