Biological Programming

Started by
17 comments, last by Sagar_Indurkhya 18 years, 1 month ago
I'm writing a proposal for a research project and I don't know where to start with this issue: I am proposing to design a programming language / scripting language that would compile to DNA, which would then be inserted into E. Coli. Now, I know I can take the DNA and put it in the E. Coli because I am working with a team of at least 30 engineers (grad students, profs, postdocs) who have done this type of thing. However, I am confused as to how I would start thinking about the programming language aspect. The cell is a living organism, and say I would like to create an organelle inside the cell that would glow. Well, my plan would be to have the compiler express an instruction as the nucleotides that make up a fluorescent gene. Yet how can I expand this model to be abstract? Any thoughts are welcome. What I said above could be totally wrong. I just wanted to see what you guys had to say about this whole idea.
This sounds really ambitious...

Though here is how I view your issue: DNA, basically, has been identified as comprising a series of genes, each gene "coding for a given protein". DNA is transcribed into RNA, which is then translated into these proteins.
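To make the DNA -> RNA -> protein chain concrete, here is a minimal Python sketch. It assumes the standard genetic code and a coding-strand sequence read from its first base; the example gene is made up for illustration, and real cells involve far more machinery than this.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code listed in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def transcribe(coding_dna):
    """Coding-strand DNA -> mRNA (every T becomes U)."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna):
    """Read the mRNA three bases at a time and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed by DNA letters
        aa = CODON_TABLE[codon]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

gene = "ATGGCTAAAGGTTAA"   # made-up toy gene, not a real one
mrna = transcribe(gene)
print(mrna)                # AUGGCUAAAGGUUAA
print(translate(mrna))     # MAKG
```

Each letter of the output is one amino acid (M = methionine, A = alanine, K = lysine, G = glycine).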

The DNA can be seen as a language in itself, one that is expressed in the form of proteins; the presence of such proteins inside the cell/blood/organs then has an influence (or not...) on the host metabolism.

It is also known that there can be strong interdependence between the genes.

Now regarding programming languages, all languages are, at one point or another, translated into the CPU assembly language. The thing is that the CPU assembly was designed in a deterministic way by humans, and the purpose and effect of each instruction is, thus, well known and documented.

In a way, the DNA is already a programming language in that it is a sequence of coding instructions.

The usual aim, in designing a programming language, is to change the paradigm for the programmer: for instance, C focuses on functions, which make it possible to organize code more modularly than in ASM. Object-oriented languages revolve around design concepts that make it possible to model applications from a different perspective.

In a word, I see 2 distinct research domains in your question:

1. Compilation to DNA: here, you can use a very simple language that would, for example, be composed only of the 20 standard amino acids and their control sequences, and develop a hardware device that generates the DNA molecule corresponding to your sequence.
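As a rough sketch of what point 1 could look like in software, the snippet below "compiles" a string of amino-acid letters back into DNA. It assumes the standard genetic code and naively picks the first codon in table order for each amino acid; `compile_protein` and `PREFERRED` are hypothetical names, and a real tool would presumably choose codons that suit the host organism.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code listed in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One codon per amino acid: simply the first one met in table order.
# This is an arbitrary choice made for illustration only.
PREFERRED = {}
for codon, aa in CODON_TABLE.items():
    PREFERRED.setdefault(aa, codon)

def compile_protein(protein):
    """Amino-acid letters -> DNA coding sequence, with a stop codon appended."""
    for aa in protein:
        if aa == "*" or aa not in PREFERRED:
            raise ValueError(f"not an amino-acid letter: {aa!r}")
    return "".join(PREFERRED[aa] for aa in protein) + PREFERRED["*"]

print(compile_protein("MAKG"))  # ATGGCTAAAGGTTAA
```

The "instruction set" here is 21 symbols: the 20 amino acids plus stop. Control sequences (promoters and so on) are left out entirely.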

2. Given you already have the device described above, abstract the DNA "language" into a new programming paradigm that will be at the center of the language syntax and grammar. Now, I don't know enough about genetics to have an idea how to abstract it...

Hope this helps.
Part 1 of what you described we have discussed, and we know we can do this (although the biologists will hopefully have to try and shorten the timeframe).

"2. Given you already have the device described above, abstract the DNA "language" into a new programming paradigm that will be at the center of the language syntax and grammar. Now, I don't know enough about genetics to have an idea how to abstract it..."

This is precisely what I am having trouble with... I think that all these years of programming in C++ have corrupted my mind with one paradigm.
I don't think DNA works like that, does it? I mean, each "instruction" of DNA is supposed to code for a particular amino acid to be turned into a specific protein. DNA and living organisms just don't function at a basic level like computers do. I'm sort of getting the impression that you're trying to turn E. Coli into a computer, and I just don't think they work that way.

Now, I may be totally off-base, of course, and I'm not in university yet (although I'm taking a university level bio course right now); this is all just off the top of my head. It's a bit confusing as to what you're trying to do...
my site
Genius is 1% inspiration and 99% perspiration
Quote:Yet, how can I expand this model to be abstract?

That would be better answered by a geneticist than a computer scientist.

Then again, word has it that Nature codes like this guy.
Free Mac Mini (I know, I'm a tool)
Programming and Meta-Programming the Human biocomputer - Lilly

Have fun with it :P
Ooh, you're on the wrong forum chief! Try a DNA programming newsgroup or something. Your project sounds interesting though. :)
Quote:Original post by silverphyre673
I don't think DNA works like that, does it? I mean, each "instruction" of DNA is supposed to code for a particular amino acid to be turned into a specific protein. DNA and living organisms just don't function at a basic level like computers do. I'm sort of getting the impression that you're trying to turn E. Coli into a computer, and I just don't think they work that way.

Now, I may be totally off-base, of course, and I'm not in university yet (although I'm taking a university level bio course right now); this is all just off the top of my head. It's a bit confusing as to what you're trying to do...


Well, I recognize that we can't just turn E. Coli into a computer. So I'm trying to grasp how to build a language around DNA. To do this I probably need to draw upon a really deep knowledge of molecular biology and computer science. Too bad I'm a junior in high school, so my knowledge in both fields isn't all that great.

What my team is trying to do is develop the ability to program a cell. What I see is:

1) Write some source code in the language I'm trying to develop. Compile it with a compiler (we'll write this as well) and output a DNA sequence.

2) Fabricate DNA

3) Insert DNA in E. Coli

4) Make observations on respective E. Coli cells and make changes to the language.
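Step 1 of the list above could start out as small as the following Python sketch. Everything in it is hypothetical: the one-statement syntax `express <gene> under <promoter>`, the part names, and above all the sequences in `PARTS`, which are dummy strings and not real biological sequences. It only shows the shape of "source text in, concatenated DNA out".

```python
# HYPOTHETICAL parts library. These strings are dummy stand-ins chosen for
# readability; a real library would hold sequences supplied by the lab.
PARTS = {
    "always_on": "AAAAAA",    # stand-in for a promoter
    "rbs": "CCCC",            # stand-in for a ribosome binding site
    "glow": "ATGGGGTAA",      # stand-in for a fluorescent-protein gene
    "terminator": "TTTTTT",   # stand-in for a terminator
}

def parse(source):
    """Turn source text into a list of (gene, promoter) pairs."""
    program = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        line = line.split("#", 1)[0].strip()   # drop comments and blanks
        if not line:
            continue
        words = line.split()
        if len(words) != 4 or words[0] != "express" or words[2] != "under":
            raise SyntaxError(
                f"line {line_no}: expected 'express <gene> under <promoter>'"
            )
        program.append((words[1], words[3]))
    return program

def compile_to_dna(source, parts=PARTS):
    """Emit one promoter + RBS + gene + terminator cassette per statement."""
    cassettes = []
    for gene, promoter in parse(source):
        order = [promoter, "rbs", gene, "terminator"]
        cassettes.append("".join(parts[name] for name in order))
    return "".join(cassettes)

print(compile_to_dna("express glow under always_on"))
# AAAAAACCCCATGGGGTAATTTTTT
```

Step 4 would then feed back into the parts library and the grammar: when a cassette misbehaves in the cell, either the part or the rule that placed it gets changed.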

The idea is that we could theoretically reprogram cells to do what we wanted (to a degree). Will we fully complete this? Probably not. Is it possible for this idea to work? I suspect so.

I don't think the language that we will have to develop will look like any traditional languages like C/C++, Java, LISP, etc. Upon further googling I have determined that this field is related to Systems Biology, although I am working with a Synthetic Biology team.
I'm definitely not very knowledgeable about this kind of thing, but I have tried one experiment. Remember the Human Genome Project? Well, as part of the project, you can download a (mostly complete) set of text files listing the A/C/G/T pattern. So, being a programmer, I thought "I can compress this data by using two bits per letter", and hacked together a quick program to do the conversion.
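The original conversion program isn't shown, so this is only a guess at what a two-bits-per-letter packer might look like in Python. The bit assignment (A=0, C=1, G=2, T=3) and the function names are assumptions, and it handles only the four plain letters; real genome files also contain other symbols, such as N for unknown bases.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
LETTERS = "ACGT"

def pack(seq):
    """Pack four bases into each byte, first base in the highest two bits."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))   # left-align a short final chunk
        out.append(byte)
    return bytes(out)

def unpack(data, length):
    """Inverse of pack(); length is the original number of bases."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(LETTERS[(byte >> shift) & 3])
    return "".join(bases[:length])

packed = pack("ACGTA")
print(list(packed))        # [27, 0]
print(unpack(packed, 5))   # ACGTA
```

The length has to be stored separately, since the padding bits in the last byte are indistinguishable from trailing A's.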

Once it was done, I popped the file open in my hex editor (which uses an ASCII-character-per-byte display rather than hexadecimal), and noticed that the output looked VERY similar to the compressed data that you'd commonly see in a ZIP file or so.

So I tried compressing the output file using WinZip and WinRAR. Usually when you compress something that has very few repetitive patterns in it (or if it's already compressed), the resulting file is just about the same size as the input. And that's what happened with the human genome output.
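For comparison, here is a small Python sketch that runs the same kind of test on purely random letters, using zlib in place of WinZip/WinRAR (an assumption; the algorithms differ in detail). It uses generated random letters, not genome data. Uniformly random bases packed at two bits each are also essentially incompressible, so "does not compress" on its own only says that a general-purpose compressor finds little short-range repetition; it does not show that the data is compressed in any deliberate sense.

```python
import random
import zlib

CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

def pack_2bit(seq):
    """Pack four bases per byte; len(seq) is assumed to be a multiple of 4."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | CODE[base]
        out.append(byte)
    return bytes(out)

def ratio(data):
    """Compressed size divided by original size (about 1.0 means no gain)."""
    return len(zlib.compress(data, 9)) / len(data)

rng = random.Random(0)                          # fixed seed, repeatable runs
seq = "".join(rng.choices("ACGT", k=40_000))    # random letters, NOT a genome

text_ratio = ratio(seq.encode("ascii"))         # one byte per letter
packed_ratio = ratio(pack_2bit(seq))            # two bits per letter

print(f"one byte per letter: {text_ratio:.2f}")
print(f"two bits per letter: {packed_ratio:.2f}")
```

The text form shrinks a lot because each byte carries only two bits of information; once packed, there is nothing left for the compressor to squeeze out.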


And while I have no idea how to READ the data, it still freaks the heck out of me to think that it might be some kind of compressed data.


That got me thinking... the DNA is data. What operates on the DNA? (I'm not totally sure... enzymes or something.) So it's kind of like a Turing machine model: the DNA is the Turing tape, and all the entities that read/write DNA are Turing machines.

If you guys who're more into biology than I am spend time figuring out the rules to whatever is reading the DNA, you could probably go a LONG way toward figuring out how to write a language that "compiles" into DNA.
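One reading rule that is well established is the start/stop convention: reading begins at an ATG and proceeds three bases at a time until a stop codon (TAA, TAG or TGA). The Python sketch below scans one strand for such stretches. It is heavily simplified and `find_orfs` is a hypothetical name; actual initiation in bacteria depends on more than the ATG itself, for instance a ribosome binding site upstream.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=2):
    """Return (start, end) index pairs of ATG...stop stretches on this strand.

    end is exclusive and includes the stop codon. min_codons counts the
    codons before the stop, including the ATG.
    """
    dna = dna.upper()
    orfs = []
    for start in range(len(dna) - 2):
        if dna[start:start + 3] != "ATG":
            continue
        # Step through the sequence in the reading frame set by this ATG.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                break
    return orfs

dna = "CCATGGCTAAAGGTTAACC"   # made-up example sequence
for start, end in find_orfs(dna):
    print(start, end, dna[start:end])   # 2 17 ATGGCTAAAGGTTAA
```

A compiler targeting DNA would have to respect rules like this one on the output side, in the same way a conventional compiler has to respect the target CPU's instruction encoding.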


This leads to other interesting discussions, like:

- Since DNA isn't an active entity (it's just data), does that mean that comparing DNA between different species might be missing the possibility that whatever is reading the DNA might be performing totally different operations?!

- DNA is susceptible to mutations and other things changing the sequence. But what about the other parts of the cell that are acting on the DNA?

- How do you find out all of the rules that govern how DNA is read? These interactions occur at the molecular scale. How do you scientifically record and analyze a (human?) cell in normal operation?

[Edited by - Nypyren on March 17, 2006 10:40:51 PM]
Are you trying to perform computations using the cell, or trying to write a "DNA compiler" that allows you to code the organism's characteristics in a "higher level" language than nucleotide sequences?
- The trade-off between price and quality does not exist in Japan. Rather, the idea that high quality brings on cost reduction is widely accepted.-- Tajima & Matsubara

This topic is closed to new replies.
