Thermodynamics and kinetics of Lesion Induced DNA Amplification (LIDA)


Introduction
Non-enzymatic short-chain XNA replication is often prevented by product inhibition, which has been one of the major stumbling blocks for implementing prebiotic information replication systems, both in bulk 4 and as part of synthetic protocells. 7 Product inhibition means that a newly formed complementary template is very unlikely to dehybridize from the template. However, Gibbs-Davis and her team 1 have developed a method to circumvent product inhibition, based on mismatches and lesions between the complementary templates that destabilize them and thus enable a continued replication process. In this work we explore in simulation the thermodynamic and kinetic properties of lesion-induced DNA amplification processes.
The method we use characterizes the oligomers by their free energies rather than by their detailed base-pair structure.

Reaction kinetics
A model of the lesion-induced DNA amplification (LIDA) kinetics is presented below. The rates of formation of each of the X-containing species are given in eqs. (1)-(3), 3 where X and $O_1$, $O_2$ denote the template strand and oligomers 1 and 2, respectively. Further, $k_i^{+}$ and $k_i^{-}$ with $i \in \{O_1, O_2, T\}$ are the rate constants 2 (on- and off-rates) describing the corresponding forward and backward reactions. The differential equations derived from the rates of formation, which describe the processes in Fig. 1, are given in eqs. (4)-(8), 3 and the total molar template concentration in eq. (9).

Figure 1: Schematic of the idealized LIDA reaction kinetics. 3 The reaction kinetics is simplified, as the plus and minus templates are assumed identical. Further, blunt-end ligations are not included.

Thermodynamics
The equilibrium constants corresponding to the rates of formation in eqs. (1)-(3) are given by

$$K_i = \frac{k_i^{+}}{k_i^{-}} = \exp\left(-\frac{\Delta G_i}{RT}\right) \qquad (10)$$

where $\Delta G_i$ is the change in free energy per mol for the reaction, $R$ is the gas constant (1.987 cal/(mol·K)) and $T$ is the absolute temperature in kelvin. The off-rates are calculated by fixing the on-rate to a specific value and using the equilibrium condition from eq. (10). We can assume $k_i^{+} = 2.0 \times 10^{7}$ l/(mol·s), 2 because the on-rate is largely independent of the sequence details. Thus the $\Delta G_i$ values for the three hybridization processes determine the overall kinetics.
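The conversion from a free energy change to an off-rate can be illustrated with a short Python sketch. It assumes the standard relation $k^{+}/k^{-} = \exp(-\Delta G/RT)$ together with the on-rate and gas constant quoted in the text; the function and constant names are chosen here for illustration only.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K), value quoted in the text
K_ON = 2.0e7        # fixed on-rate in l/(mol*s), value quoted in the text


def off_rate(delta_g, temperature=310.15, k_on=K_ON):
    """Off-rate in 1/s, assuming k_on / k_off = exp(-delta_g / (R*T)).

    delta_g is the hybridization free energy change in kcal/mol.
    """
    equilibrium_constant = math.exp(-delta_g / (R_KCAL * temperature))
    return k_on / equilibrium_constant


# Free energies at both ends of the range explored in the simulations.
for dg in (-6.0, -9.0):
    print(f"dG = {dg:5.1f} kcal/mol -> k_off = {off_rate(dg):.3g} /s")
```

Under these assumptions the two free energies map to off-rates of roughly $1.18 \times 10^{3}$ /s and 9.09 /s, which matches the off-rate interval quoted in the simulation results.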
In this investigation $O_1$ consists of 8 base pairs and $O_2$ of 9 base pairs with a one-base dangling end. The strategy for achieving the most efficient DNA replication is to dial in destabilization. 1 Even with these short oligomers, the number of ways to order the base pairs and to dial in destabilization is rather high. Therefore we focus on the resulting free energy of the single oligomers and of the ligated template, not on the detailed base sequences.
The energy change for two neighboring base pairs at standard conditions ($T = 310.15$ K $\approx 37$ °C) ranges from $-2.24$ kcal/mol to $1.33$ kcal/mol, with an initiation cost per oligomer of 1.96 kcal/mol. 8 After ligation we assume that the dangling end causes a bulge loop of size one with a corresponding energy cost of 4 kcal/mol. 8 Since the total template concentration in eq. (9) depends on the on- and off-rates, which in turn are linked to the free energy changes (eq. (10)), the template replication efficiency depends solely on the values of $\Delta G_i$.
The difference between the free energy changes of the two oligomers and that of the ligated total template, $\Delta G_T$, mainly depends on the cap contribution 8 of $O_1$, the intervening nearest-neighbor base pair 8 of $O_1 + O_2$ and especially on the bulge loop. 8 We define this difference as $\gamma \equiv \Delta G_T - (\Delta G_{O_1} + \Delta G_{O_2})$. In particular, for $T = 310.15$ K, $\gamma > 0$ always holds.
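To make the role of $\gamma$ concrete, the following Python sketch computes the template free energy from the definition above and compares the resulting template off-rate with the hypothetical case $\gamma = 0$. It assumes the relation $k^{+}/k^{-} = \exp(-\Delta G/RT)$ and the on-rate quoted in the text; the helper names and the example values $\Delta G_{O_1} = \Delta G_{O_2} = -8$ kcal/mol are chosen for illustration.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
K_ON = 2.0e7        # fixed on-rate in l/(mol*s)
TEMP = 310.15       # temperature in K


def template_free_energy(dg_o1, dg_o2, gamma):
    """dG_T from the definition gamma = dG_T - (dG_O1 + dG_O2), in kcal/mol."""
    return gamma + dg_o1 + dg_o2


def off_rate(delta_g, temperature=TEMP, k_on=K_ON):
    """Off-rate in 1/s, assuming k_on / k_off = exp(-delta_g / (R*T))."""
    return k_on * math.exp(delta_g / (R_KCAL * temperature))


dg_t = template_free_energy(-8.0, -8.0, 3.89)
dg_t_no_lesion = template_free_energy(-8.0, -8.0, 0.0)  # hypothetical gamma = 0 case
speedup = off_rate(dg_t) / off_rate(dg_t_no_lesion)

print(f"dG_T = {dg_t:.2f} kcal/mol, template k_off = {off_rate(dg_t):.3g} /s")
print(f"gain in template off-rate relative to gamma = 0: {speedup:.3g}")
```

Under this relation the gain equals $\exp(\gamma/RT)$, so every additional kcal/mol in $\gamma$ multiplies the template off-rate by the same factor. This is one way to read the statement that a larger $\gamma$ reduces product inhibition.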
There are usually multiple base sequences that result in the same free energy changes, which means that it does not matter whether there are, for example, more AT/TA base pairs or more GC/CG pairs combined with internal mismatches. 8 What does matter is the intervening nearest-neighbor base pair of the $O_1 + O_2$ ligation, as it, besides the bulge contribution, decisively influences the energy difference between the single oligomers and the total template. Consequently, the task is to find the optimal values for $\Delta G_i$ depending solely on $\gamma$.
Eqs. (4)-(9) are implemented and simulated in MATLAB for different on-/off-rate combinations calculated from the respective $\Delta G_i$ values.
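A simulation of this kind can be sketched in Python as follows. This is not the authors' MATLAB code, and the reaction scheme below is an assumption made for illustration (each oligomer binds the template reversibly, the nicked complex is ligated irreversibly, and the ligated duplex dissociates into two identical templates); it is not a transcription of eqs. (4)-(9). The initial concentrations are hypothetical placeholders, and the species names and the `simulate` helper are introduced here only for the example.

```python
import math

R_KCAL, TEMP, K_ON = 1.987e-3, 310.15, 2.0e7  # kcal/(mol*K), K, l/(mol*s)


def off_rate(delta_g):
    """Off-rate in 1/s, assuming K_ON / k_off = exp(-delta_g / (R*T))."""
    return K_ON * math.exp(delta_g / (R_KCAL * TEMP))


def lida_rhs(t, y, k1, k2, kt, k_lig):
    """Mass-action rates for the ASSUMED scheme.

    y = [X, O1, O2, XO1, XO2, XO1O2, XX], all in mol/l.
    """
    x, o1, o2, xo1, xo2, nicked, xx = y
    r1 = K_ON * x * o1 - k1 * xo1        # X + O1 <-> XO1
    r2 = K_ON * x * o2 - k2 * xo2        # X + O2 <-> XO2
    r3 = K_ON * xo1 * o2 - k2 * nicked   # XO1 + O2 <-> XO1O2
    r4 = K_ON * xo2 * o1 - k1 * nicked   # XO2 + O1 <-> XO1O2
    r5 = k_lig * nicked                  # XO1O2 -> XX (ligation)
    r6 = kt * xx - K_ON * x * x          # XX <-> 2 X (plus/minus identical)
    return [-r1 - r2 + 2 * r6, -r1 - r4, -r2 - r3,
            r1 - r3, r2 - r4, r3 + r4 - r5, r5 - r6]


def total_template(y):
    """Template strands in all species; the ligated duplex XX holds two."""
    x, _, _, xo1, xo2, nicked, xx = y
    return x + xo1 + xo2 + nicked + 2 * xx


def simulate(dg_o1=-8.0, dg_o2=-8.0, gamma=3.89, k_lig=0.02, t_end=3600.0):
    """Integrate the sketch; returns times and total template concentration."""
    # Imported lazily so the rate function above has no third-party dependency.
    from scipy.integrate import solve_ivp
    rates = (off_rate(dg_o1), off_rate(dg_o2),
             off_rate(gamma + dg_o1 + dg_o2), k_lig)
    # Hypothetical initial concentrations in mol/l: template, O1, O2, complexes.
    y0 = [1.4e-8, 1.4e-6, 1.4e-6, 0.0, 0.0, 0.0, 0.0]
    sol = solve_ivp(lida_rhs, (0.0, t_end), y0, method="LSODA",
                    args=rates, rtol=1e-8, atol=1e-16)
    return sol.t, [total_template(state) for state in sol.y.T]
```

Calling `simulate()` requires SciPy. The stiff-capable `LSODA` method is chosen because the on- and off-rates span many orders of magnitude, and the small absolute tolerance reflects concentrations in the micromolar range and below.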

Simulation results
Before investigating our system for various on-, off- and ligation rates, we first present in Fig. 2 a comparison of our simulation approach with the experimental data from Gibbs-Davis et al. (2015). 1 To this end, we use the same initial template and oligomer concentrations as in the experiment and fix all on-rates to $k_i^{+} = 2.0 \times 10^{7}$ l/(mol·s) and the ligation rate to $k_L = 0.02$/s. The same rate values are used throughout our simulations unless stated otherwise. Since the reported experiment was conducted at $T = 299.15$ K and the free energies of the base pairs are given at $T = 310.15$ K, 8 we need to translate the free energies between these two temperatures. We can either approximate $\Delta G_{O_i}$ by consulting Table 1 in Wu et al. 9 or use the DNA/RNA complex free energy calculator at www.nupack.org. 6 In both cases we find that $\Delta G_{O_{1,2}}(26\,°\mathrm{C}) \approx \Delta G_{O_{1,2}}(37\,°\mathrm{C}) - 1$ kcal/mol. Further, we can approximate $\gamma_{26\,°\mathrm{C}} \approx \gamma_{37\,°\mathrm{C}}$, as $\gamma$ is defined as a difference between the involved free energies: we assume that the free energy of each oligomer decreases by ∼1 kcal/mol and that of the double-length template by ∼2 kcal/mol, so that $(\Delta G_T - 2) - [(\Delta G_{O_1} - 1) + (\Delta G_{O_2} - 1)] = \Delta G_T - (\Delta G_{O_1} + \Delta G_{O_2})$. Even though these approximations are rather rough, the comparison in Fig. 2 shows that our approach is compatible with the experimental data.
We now turn to unspecified base-pair sequences and therefore to a characterization of the oligomers solely by their free energies at $T = 310.15$ K. To have an unambiguous measure that distinguishes the total template replication efficiencies for different sets of rates in different simulations, we use the time $t$ at which $[X_{tot}]/\max\{[X_{tot}]\} = 0.9$ (red dotted line in Fig. 2).
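The replication-time measure can be extracted from a simulated time series with a small helper such as the one below. This is a minimal sketch: the function name is hypothetical, and the linear interpolation between samples is a choice made here for illustration, not something stated in the text.

```python
def time_to_fraction(times, values, fraction=0.9):
    """First time at which `values` reaches `fraction` of its maximum.

    Uses linear interpolation between the two samples that bracket the
    crossing. `times` and `values` are equally long sequences of floats.
    """
    target = fraction * max(values)
    if values[0] >= target:
        return times[0]
    for i in range(1, len(values)):
        if values[i] >= target:
            t0, t1 = times[i - 1], times[i]
            v0, v1 = values[i - 1], values[i]
            # v0 < target <= v1 here, so the denominator is positive.
            return t0 + (target - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("target level not reached")


# Toy series, not simulation output: rises linearly, then saturates.
print(time_to_fraction([0.0, 1.0, 2.0, 3.0], [0.0, 5.0, 10.0, 10.0]))
```

A smaller returned time corresponds to a higher replication efficiency in the sense used here.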
In Fig. 3 the results for $\gamma = 3.89$ kcal/mol are computed for a range of free energy values for $\Delta G_{O_1}$ and $\Delta G_{O_2}$ in the interval $[-6; -9]$ kcal/mol, which corresponds to off-rates in the interval $[1.18 \times 10^{3}; 9.09]$ /s. The value used for the energy difference $\gamma$ between the ligated template and the single oligomers is estimated to be the largest possible with a bulge (loop of size one) arising from the ligation process. Each of the graphs corresponds to a given value of $\Delta G_{O_1}$, and its minimum indicates the highest replication efficiency and therefore the optimal value not only for $\Delta G_{O_1}$ but also for $\Delta G_{O_2}$. The optimal replication efficiency is obtained for $\Delta G_{O_1} \approx -8$ kcal/mol with $\Delta G_{O_1}/\Delta G_{O_2} \approx 1$. We can define an envelope curve that connects the minima with respect to the ratio $\Delta G_{O_1}/\Delta G_{O_2}$ for a given $\gamma$. In Fig. 4 the minima of the graphs for variable $\Delta G_{O_1}$ (in Fig. 3) are connected, calculated in half-integer steps in the interval $[-6; -9]$ kcal/mol for $\gamma \in \{2.7, 3.89, 5\}$ kcal/mol, with ligation rate $k_L = 0.02$/s and all on-rates $k_i^{+} = 2.0 \times 10^{7}$ l/(mol·s).
In each of the three graphs, the data points from left to right give the ratio $\Delta G_{O_1}/\Delta G_{O_2}$ at which the replication time is minimal for a fixed value of $\Delta G_{O_1}$ (varying $\Delta G_{O_2}$), starting with the highest value $\Delta G_{O_1} = -6$ kcal/mol and decreasing in half-integer steps to $\Delta G_{O_1} = -9$ kcal/mol. As mentioned before, $\gamma = 3.89$ kcal/mol is the largest value obtainable for systems with small loop size (< 9 bases) and no hairpins, meaning that $\gamma = 5$ kcal/mol is not realistic but of theoretical interest. First, it should be noted that increasing $\gamma$ means decreasing product inhibition. Hence, for a given ratio $\Delta G_{O_1}/\Delta G_{O_2}$ the replication efficiency increases, as indicated by a shift to shorter replication times.
Secondly, it should be noted that the replication time is indeed minimal (and thus the replication efficiency maximal) for $\Delta G_{O_1}/\Delta G_{O_2} \approx 1$, regardless of $\gamma$.
Lastly, for larger $\gamma$ the optimum in replication efficiency corresponds to lower free energy values. This is also suggested by a shift of the data points to the left in Fig. 4 for increasing $\gamma$.
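The envelope construction described for Fig. 4 can be sketched as a grid scan in Python. The `replication_time` argument stands for any function that returns the replication time for a pair of oligomer free energies, for example from a kinetic simulation. The `toy_time` function below is a purely hypothetical stand-in used only to make the example runnable, and it says nothing about the actual results.

```python
def envelope(replication_time, dg_grid):
    """For each dG_O1 on the grid, find the dG_O2 with minimal replication time.

    Returns a list of tuples (dg_o1, best_dg_o2, dg_o1 / best_dg_o2, min_time).
    """
    points = []
    for dg_o1 in dg_grid:
        best = min(dg_grid, key=lambda dg_o2: replication_time(dg_o1, dg_o2))
        points.append((dg_o1, best, dg_o1 / best, replication_time(dg_o1, best)))
    return points


# Half-integer grid from -6 to -9 kcal/mol, as used in the text.
DG_GRID = [-6.0 - 0.5 * i for i in range(7)]


def toy_time(dg_o1, dg_o2):
    """Hypothetical placeholder, NOT a simulated replication time."""
    return (dg_o1 - dg_o2) ** 2 + (dg_o1 + 8.0) ** 2


for row in envelope(toy_time, DG_GRID):
    print(row)
```

Repeating the scan for several values of $\gamma$ would give one envelope curve per $\gamma$.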
These results, especially that the template replication is most efficient for $\Delta G_{O_1}/\Delta G_{O_2} \approx 1$, imply that the base length of the oligomers is in fact of minor importance, as only the free energies and their ratio have an impact. Longer oligomers can be adjusted by using AT/TA base pairs and internal mismatches instead of GC/CG base pairs to achieve the highest replication efficiency for a given $\gamma$ value. This means that the template replication could presumably be made significantly more efficient by expanding the oligomer and template length to include larger loops or hairpins, 8 corresponding to larger $\gamma$ values. Hairpins are used in exactly this manner by Lincoln and Joyce (2009) 5 in their self-replicating RNA system.
What is left to examine is the influence of the ligation rate on the template replication efficiency. For this purpose the replication time for $\Delta G_{O_2} = \Delta G_{O_1} = -8$ kcal/mol is plotted against the ligation rate $k_L$ in Fig. 5. Clearly, higher ligation rates lead to faster replication, although the efficiency seems to converge towards a threshold. It should be noted that values of $k_L$ larger than ∼$10^{-2}$ /s are mainly of theoretical interest.

Conclusion
Our investigations of the free energies of oligomer-template interactions and their impact on the reaction kinetics have shown us what mainly determines replication efficiency in non-enzymatic, short-chain nucleotide systems. The replication efficiency does not depend on the details of the base-pair sequences. Product inhibition decreases, and can eventually be eliminated, when the difference between the free energy of template hybridization and the free energies of the oligomer hybridizations increases, as defined by $\gamma \equiv \Delta G_T - (\Delta G_{O_1} + \Delta G_{O_2})$. Desired $\gamma$ values can be obtained by designing the replication systems with bulges and an appropriate balance between AT and CG base pairs as well as mismatches. The larger $\gamma$ is, the faster the replication. Further, the template replication efficiency has a maximum when the hybridization energies of the two oligomers are identical, $\Delta G_{O_1}/\Delta G_{O_2} \approx 1$. Finally, increasing the ligation rate can boost replication efficiency independently of the free energy relationships between the oligomers and the template.
Based on our analysis we predict that significantly faster experimental kinetics should be achievable for parameters around $\gamma = 3.89$ kcal/mol with $\Delta G_{O_1} = \Delta G_{O_2} = -8$ kcal/mol and $k_L = 0.02$/s, assuming all on-rates $k_i^{+} = 2.0 \times 10^{7}$ l/(mol·s).
As our results are independent of the number of base pairs per oligomer, the concept should still be valid for larger system sizes (longer templates and oligomers). Therefore it should be possible to make the replication process even faster by increasing the system size so that larger loops and/or hairpins become accessible.