Saturday, 17 May 2014

Mathematics with DNA: another way to do arithmetic operations using DNA



A numerical system is essential for writing specific symbols as numbers and for building complex mathematical equations or formulas. This article describes a numerical system in base 4 that uses the nucleotides of a DNA strand (A, G, C and T) as symbols. Conventionally, organic DNA is encoded macroscopically, i.e. by encoding a whole chain or a portion of it, or by encoding operations together with their operands [4]. These encodings are less flexible because they limit the number of combinations that can be performed. For this reason, the numerical system described below encodes genetic information microscopically, i.e. using nucleotide codes. This atomicity makes it possible to combine nucleotides into a numerical system that is powerful and scalable in its representation of numerical quantities.

Fig 1: Correspondence between this numerical system and other systems.

A nucleotide code is defined as a two-bit sequence, which takes a value from zero to three. Any chain of nucleotides represents a numerical quantity because this numerical system is positional: each symbol (nucleotide) in the chain carries two pieces of information, first its position within the chain, and second the value of the symbol (from 0 to 3). Together, these give the partial contribution of each symbol to the final value of the numerical quantity. Fig 1 shows the correspondence between the numerical system described and other systems.
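The two-bit codes can be sketched in a few lines of Python. The mapping used here (A = 00, G = 01, C = 10, T = 11) is inferred from the worked examples in this article, and the names `NUCLEOTIDE_BITS`, `chain_to_bits` and `bits_to_chain` are only illustrative.

```python
# Two-bit code of each nucleotide, inferred from the article's worked examples.
NUCLEOTIDE_BITS = {"A": "00", "G": "01", "C": "10", "T": "11"}
BITS_NUCLEOTIDE = {bits: n for n, bits in NUCLEOTIDE_BITS.items()}


def chain_to_bits(chain: str) -> str:
    """Concatenate the two-bit code of every nucleotide in the chain."""
    return "".join(NUCLEOTIDE_BITS[n] for n in chain)


def bits_to_chain(bits: str) -> str:
    """Group a bit string into pairs and map each pair to a nucleotide."""
    if len(bits) % 2:
        bits = "0" + bits  # pad on the left to a whole number of nucleotides
    return "".join(BITS_NUCLEOTIDE[bits[i:i + 2]] for i in range(0, len(bits), 2))


print(chain_to_bits("CTTG"))      # 10111101
print(bits_to_chain("10111101"))  # CTTG
```

Because each nucleotide covers exactly two bits, converting between binary and this encoding needs no arithmetic, only regrouping.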

An example is always clearer than a set of words. Consider the number 4567 in decimal: the corresponding binary representation is 0001000111010111, the octal representation is 010727, the hexadecimal representation is 0x11D7 and, finally, the encoding in this numerical system (which in the future could be called Cod-Inchrosil) is AGAGTGGT. It is also possible to convert from one numerical system to another using the following polynomial equation [8]:

$$\text{Number } (X_{n-1} \ldots X_2 X_1 X_0)_b = X_{n-1} b^{n-1} + \ldots + X_2 b^2 + X_1 b^1 + X_0 b^0$$

where $X_{n-1} \ldots X_2 X_1 X_0$ are the digits of the number and $b$ is the base used in the encoding. For example, the number $\text{AGAGTGGT}_4$, encoded in Cod-Inchrosil, converts to decimal as follows:

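$$\text{AGAGTGGT}_4 = A \cdot 4^7 + G \cdot 4^6 + A \cdot 4^5 + G \cdot 4^4 + T \cdot 4^3 + G \cdot 4^2 + G \cdot 4^1 + T \cdot 4^0 = 0 \cdot 4^7 + 1 \cdot 4^6 + 0 \cdot 4^5 + 1 \cdot 4^4 + 3 \cdot 4^3 + 1 \cdot 4^2 + 1 \cdot 4^1 + 3 \cdot 4^0 = 4567_{10}$$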
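The polynomial expansion above can be sketched in Python as follows. The digit values (A = 0, G = 1, C = 2, T = 3) are taken from the example, and the function name `chain_to_int` is an illustrative choice.

```python
# Value of each nucleotide as a base-4 digit, as used in the example above.
VALUE = {"A": 0, "G": 1, "C": 2, "T": 3}


def chain_to_int(chain: str, base: int = 4) -> int:
    """Evaluate X_{n-1}*b^(n-1) + ... + X_1*b^1 + X_0*b^0 for a nucleotide chain."""
    n = len(chain)
    total = 0
    for i, nucleotide in enumerate(chain):
        exponent = n - 1 - i  # the leftmost symbol has the highest power
        total += VALUE[nucleotide] * base ** exponent
    return total


print(chain_to_int("AGAGTGGT"))  # 4567
```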

On the other hand, the inverse operation, from a decimal number to a Cod-Inchrosil number, is as follows.

Suppose the decimal number is $189_{10}$. It is divided by 4 repeatedly, and each remainder gives one nucleotide:

189 div 4 = 47, remainder (189 mod 4) = 1 : G
47 div 4 = 11, remainder (47 mod 4) = 3 : T
11 div 4 = 2, remainder (11 mod 4) = 3 : T
2 div 4 = 0, remainder (2 mod 4) = 2 : C
Final result, reading the remainders from last to first: $189_{10} = \text{CTTG}_4$
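A minimal Python sketch of the repeated division above, assuming the digit order A, G, C, T for the values 0 to 3; the name `int_to_chain` is illustrative.

```python
SYMBOLS = "AGCT"  # index in this string = digit value (A=0, G=1, C=2, T=3)


def int_to_chain(number: int) -> str:
    """Convert a non-negative integer to a nucleotide chain by repeated division by 4."""
    if number < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    if number == 0:
        return "A"
    digits = []
    while number > 0:
        number, remainder = divmod(number, 4)  # quotient and remainder in one step
        digits.append(SYMBOLS[remainder])
    # Remainders come out least significant first, so reverse them.
    return "".join(reversed(digits))


print(int_to_chain(189))  # CTTG
```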

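Furthermore, a decimal number with a fractional part, such as $873.875_{10}$, can also be converted to the DNA encoding. The integer part is calculated with the method above, giving $\text{TGCCG}_4$. The fractional part is calculated by multiplying by 4 repeatedly and taking the integer part at each step:

0.875 × 4 = 3.5: the integer part is 3, so the digit is T.
0.5 × 4 = 2.0: the integer part is 2, so the digit is C, and no fraction is left.
The final number is $\text{TGCCG.TC}_4$.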
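The fractional conversion above can be sketched in Python. This sketch uses `fractions.Fraction` to avoid rounding errors and stops after `max_digits` symbols, since some fractions never terminate in base 4; both the function name and the digit limit are assumptions made for the example.

```python
from fractions import Fraction

SYMBOLS = "AGCT"  # A=0, G=1, C=2, T=3


def fraction_to_chain(fraction: Fraction, max_digits: int = 16) -> str:
    """Convert a fraction in [0, 1) to nucleotides by repeated multiplication by 4."""
    if not 0 <= fraction < 1:
        raise ValueError("expected a value in the range [0, 1)")
    digits = []
    while fraction and len(digits) < max_digits:
        fraction *= 4
        whole = int(fraction)        # integer part gives the next digit
        digits.append(SYMBOLS[whole])
        fraction -= whole            # keep only the fractional part
    return "".join(digits)


print(fraction_to_chain(Fraction("0.875")))  # TC
```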

In general, this numerical system can represent any integer or fractional quantity and is compatible with other numerical systems. This article also describes several arithmetic operations on numerical quantities (addition, subtraction, multiplication, division, etc.). For the addition operation in particular, Fig 2 gives the guidelines for the addition of two nucleotides.

Fig 2. Addition Operation
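Addition of two chains can be sketched as ordinary column addition in base 4, working from the rightmost nucleotide and carrying into the next column. This is a minimal sketch that assumes the digit values A = 0, G = 1, C = 2, T = 3; the function name `add_chains` is illustrative.

```python
SYMBOLS = "AGCT"
VALUE = {symbol: value for value, symbol in enumerate(SYMBOLS)}


def add_chains(x: str, y: str) -> str:
    """Add two nucleotide chains column by column, propagating the carry."""
    width = max(len(x), len(y))
    x, y = x.rjust(width, "A"), y.rjust(width, "A")  # pad with A (zero) on the left
    carry = 0
    out = []
    for a, b in zip(reversed(x), reversed(y)):
        carry, digit = divmod(VALUE[a] + VALUE[b] + carry, 4)
        out.append(SYMBOLS[digit])
    if carry:
        out.append(SYMBOLS[carry])  # a final carry adds one more nucleotide
    return "".join(reversed(out))


print(add_chains("T", "G"))              # GA  (3 + 1 = 4, i.e. digit A with carry G)
print(add_chains("ATCGATA", "TAGCCAA"))  # TTTTCTA (no column produces a carry)
print(add_chains("CTTG", "TGAT"))        # GCGAA (every column produces a carry)
```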

For example, given the two numbers ATCGATA and TAGCCAA, the addition of both numbers is as follows:


Another example, this time with a carry in the addition operation, is as follows:


The next operation to define is subtraction, the arithmetic operation opposite to addition. The guidelines for subtracting two nucleotides are given in Fig 3.

Fig 3. Subtraction operation.

A good way to describe subtraction is by means of an example. Consider two numbers in Cod-Inchrosil, $\text{TGAT}_4$ (211 in decimal) and $\text{GCCC}_4$ (106 in decimal): the result of the subtraction is 105 in decimal and $\text{GCCG}_4$ in Cod-Inchrosil. The calculation is shown in Fig 4.


Fig 4. Example of subtraction between two numbers.

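As the operation above (Fig 4) shows, when C is subtracted from T no carry occurs and the result is G. On the other hand, when C is subtracted from A the result is C, but there is a carry of G; this quantity has to be added to the following nucleotide of the second number, which in this case is C, and this addition gives T. Subtracting T from G then gives C with another carry of G, and finally subtracting C (G plus the carry G) from T gives G, which completes the result GCCG.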
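The column-by-column procedure described above can be sketched in Python. As in the text, the carry is added to the next nucleotide of the second number before subtracting. The sketch assumes the first number is not smaller than the second, and the name `subtract_chains` is illustrative.

```python
SYMBOLS = "AGCT"
VALUE = {symbol: value for value, symbol in enumerate(SYMBOLS)}


def subtract_chains(x: str, y: str) -> str:
    """Compute x - y on nucleotide chains, assuming x >= y."""
    width = max(len(x), len(y))
    x, y = x.rjust(width, "A"), y.rjust(width, "A")
    carry = 0
    out = []
    for a, b in zip(reversed(x), reversed(y)):
        # The carry from the previous column is added to the subtrahend digit.
        diff = VALUE[a] - (VALUE[b] + carry)
        carry = 1 if diff < 0 else 0
        out.append(SYMBOLS[diff % 4])  # Python's % wraps negatives into 0..3
    if carry:
        raise ValueError("result would be negative; not handled in this sketch")
    # Drop leading A (zero) digits, but keep a single A for a zero result.
    return "".join(reversed(out)).lstrip("A") or "A"


print(subtract_chains("TGAT", "GCCC"))  # GCCG
```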

In general, the mechanism is the same as for subtraction in decimal, but using nucleotides. Another way to subtract is to add the first operand to the two's complement of the second operand. That is very easy to do, because it is enough to obtain the complementary chain of the second operand and add G to it to obtain the two's complement.
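The complement method can be sketched as follows. The sketch assumes that the complementary chain is the usual base pairing (A with T, G with C), which, under the two-bit codes inferred from the examples (A = 00, G = 01, C = 10, T = 11), is the same as inverting every bit. The helper names are illustrative, the addition is done on integers for brevity, and the result is only meaningful as a plain number when the first operand is not smaller than the second.

```python
SYMBOLS = "AGCT"
VALUE = {symbol: value for value, symbol in enumerate(SYMBOLS)}
COMPLEMENT = str.maketrans("AGCT", "TCGA")  # A<->T, G<->C


def to_int(chain: str) -> int:
    value = 0
    for nucleotide in chain:
        value = value * 4 + VALUE[nucleotide]
    return value


def to_chain(number: int, width: int) -> str:
    digits = []
    for _ in range(width):
        number, remainder = divmod(number, 4)
        digits.append(SYMBOLS[remainder])
    return "".join(reversed(digits))


def subtract_by_complement(x: str, y: str) -> str:
    """Compute x - y as x + complement(y) + G, discarding the final carry."""
    width = max(len(x), len(y))
    x, y = x.rjust(width, "A"), y.rjust(width, "A")
    complement = y.translate(COMPLEMENT)             # complementary chain
    total = to_int(x) + to_int(complement) + 1       # adding G means adding 1
    return to_chain(total % 4 ** width, width)       # drop the carry out of the top


print("GCCC".translate(COMPLEMENT))            # CGGG
print(subtract_by_complement("TGAT", "GCCC"))  # GCCG
```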

The next operation defined is multiplication. Multiplication, or product, is defined as an arithmetic operation in which successive additions are performed. As with the previous operations, guidelines are given for multiplication at the nucleotide level; these products are shown in Fig 5.

Fig 5. Multiplication of two nucleotides.

Considering two numbers, the multiplication of both numbers is shown in Fig 6:

Fig 6. Example multiplication.
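Multiplication of two chains can be sketched as schoolbook long multiplication in base 4: every pair of nucleotides is multiplied, the partial products are accumulated in the right column, and carries move to the left. The operands below are chosen only for illustration and are not the ones in Fig 6; the function name `multiply_chains` is also illustrative.

```python
SYMBOLS = "AGCT"
VALUE = {symbol: value for value, symbol in enumerate(SYMBOLS)}


def multiply_chains(x: str, y: str) -> str:
    """Multiply two nucleotide chains digit by digit, with carries in base 4."""
    result = [0] * (len(x) + len(y))  # least significant digit first
    for i, a in enumerate(reversed(x)):
        carry = 0
        for j, b in enumerate(reversed(y)):
            total = result[i + j] + VALUE[a] * VALUE[b] + carry
            carry, result[i + j] = divmod(total, 4)
        result[i + len(y)] += carry  # leftover carry of this row
    chain = "".join(SYMBOLS[d] for d in reversed(result))
    return chain.lstrip("A") or "A"  # drop leading zeros, keep one A for zero


print(multiply_chains("T", "T"))        # CG  (3 * 3 = 9 = 2*4 + 1)
print(multiply_chains("CTTG", "TGAT"))  # CGCTTAGT  (189 * 211 = 39879)
```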

This numerical system can also represent floating point numbers. All floating point numbers follow an established pattern, which can be summarized in the equation $r = m \cdot b^e$, where $r$ is the real number, $m$ is the mantissa, $b$ is the base and $e$ is the exponent to which the base is raised. For example, the number $0.000123_{10}$ in fixed point (using the decimal system) could have the following floating point representations: $1.23 \times 10^{-4}$, $0.123 \times 10^{-3}$, $0.0123 \times 10^{-2}$, $0.00123 \times 10^{-1}$, etc. All of these refer to the same number with different floating point representations, and the DNA encoding is no exception to this ambiguity. For this reason, and to avoid ambiguities in the representation of numerical quantities, a standard (IEEE, 1985) was established that permits the homogeneous representation of floating point numbers. Consequently, this numerical system adapts that standard to the DNA encoding. The following tables show the different floating point representations in Cod-Inchrosil, where each cell in Fig 7 and Fig 8 represents nucleotides.


Fig 7. Single precision representation.

Fig 8. Double precision representation.
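As a rough illustration only, the sketch below regroups the bit pattern of an IEEE 754 number into two-bit nucleotide codes, so that single precision (32 bits) becomes 16 nucleotides and double precision (64 bits) becomes 32. This direct regrouping is an assumption: the exact field layout of Fig 7 and Fig 8 may differ, and the function name `float_to_chain` is hypothetical.

```python
import struct

# Two-bit codes inferred from the article's examples (A=00, G=01, C=10, T=11).
BITS_NUCLEOTIDE = {"00": "A", "01": "G", "10": "C", "11": "T"}


def float_to_chain(value: float, double: bool = False) -> str:
    """Regroup the IEEE 754 bit pattern of a float into nucleotides (assumed layout)."""
    packed = struct.pack(">d" if double else ">f", value)  # big-endian bytes
    bits = "".join(f"{byte:08b}" for byte in packed)
    return "".join(BITS_NUCLEOTIDE[bits[i:i + 2]] for i in range(0, len(bits), 2))


print(float_to_chain(1.0))                    # ATTTCAAAAAAAAAAA
print(len(float_to_chain(1.0, double=True)))  # 32
```

Note that in single precision the sign and exponent together take 9 bits, so under this regrouping the field boundaries do not fall on nucleotide boundaries.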


Fig 9. Codifications

References

[1]. Adleman, L. M. (1994). Molecular computation of solutions to combinatorial problems. Science, 1021–1024.
[2]. Benenson, Y., Gil, B., Ben-Dor, U., Adar, R., & Shapiro, E. (2004). An autonomous molecular computer for logical control of gene expression. Nature, 423–429.
[3]. Boneh, D., Dunworth, C., Lipton, R. J., & Sgall, J. I. (1996). On the computational power of DNA. Discrete Applied Mathematics, 79–94.
[4]. Guarnieri, F., Fliss, M., & Bancroft, C. (1996). Making DNA add. Science, 220–223.
[5]. Furber, S. B. (1989). VLSI Risc Architecture and Organization. CRC Press.
[6]. Rozenberg, G., Bäck, T., & Kok, J. (2012). Handbook of Natural Computing. Springer.
[7]. Kari, L., Gloor, G., & Yu, S. (2000). Using DNA to solve the Bounded Post Correspondence Problem. Theoretical Computer Science, 192–203.
[8]. Knuth, D. (1962). Positional Number Systems. In The Art of Computer Programming (pp. 194–213). Addison–Wesley.
[9]. Ogihara, M., & Ray, A. (1999). Simulating Boolean circuits on a DNA computer. Algorithmica, 239–250.
[10]. Turing, A. (1936). On Computable Numbers, with an Application to the Entscheidungsproblem (pp. 230–65).
[11]. Von Neumann, J. (1945). First Draft of a Report on the EDVAC.
[12]. Watson, J., & Crick, F. (1953). Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid. Nature, 737–738.
[13]. Wu, G., & Seeman, N. C. (2006). Multiplying with DNA. Natural Computing, 427–441.
[14]. Jeffreys, A. J., Wilson, V., & Thein, S. W. (1984). Hypervariable 'minisatellite' regions in human DNA.