
Synthetic virology

Synthetic virology is a scientific discipline focused on the study and development of synthetic, artificial viruses. It is a multidisciplinary science that draws on virology, synthetic biology, computational and theoretical biology, and DNA nanotechnology, borrowing and integrating concepts and methods from these fields.
There is a broad spectrum of applications for synthetic virology technologies, including medicine and research tools.
An artificial poliovirus type 1 Mahoney, PV1(M), was first synthesized in 2002 in the absence of a natural template. The synthetic cDNA of the poliovirus contained 27 deliberate nucleotide changes, inserted into the genome as genetic markers. When grown in HeLa cells, an effective tissue culture system for poliovirus proliferation, the synthetic virus (designated sPV1(M)) showed no phenotypic changes compared with wild-type PV1(M). However, upon intracerebral injection into CD155 tg mice, which are transgenic for the poliovirus receptor CD155, the synthetic virus proved attenuated: its median lethal dose (LD50) was several orders of magnitude higher than that of the wild-type virus.


Structure of poliovirus type 1 Mahoney, PV1(M).


The complete genomes of several RNA viruses have been chemically synthesized, including poliovirus, influenza virus, human endogenous retrovirus, SIVcpz, and SARS-like coronavirus.

The human genome, consisting of approximately 3 × 10^9 base pairs, contains about 8% sequences of retroviral origin (roughly 2.4 × 10^8 base pairs). These sequences are viral fossils, but their function in human evolution, physiology, and disease remains unclear. Most of the genes or gene fragments are inactive because of errors accumulated during replication with the host cell. An exception is the env gene, which likely persists because it may play a crucial role in the physiology of hominoids. Nevertheless, all ancient human retroviruses have degenerated, including the human mouse mammary tumor virus-like proviruses (HML-2) of the human endogenous retrovirus K family, HERV-K (HML-2). This family may have entered the human lineage relatively recently from Old World primates, but no functional provirus capable of producing infectious particles has been identified.

To reconstruct a retrovirus resembling the ancestor of HERV-K, a consensus genome was derived from the degenerate copies. The entire consensus genome was then synthesized, yielding a proviral clone, HERV-KCON. It likely resembles the precursor of HERV-K (HML-2), a virus that entered the human genome within the last few million years. A comparable provirus has also been obtained by site-directed mutagenesis; this reconstructed virus was named Phoenix.
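The consensus idea can be illustrated with a minimal sketch, assuming the degenerate proviral copies have already been aligned to the same length. The function name and the toy sequences are hypothetical and are not taken from the HERV-K work; a per-position majority vote is simply one straightforward way to build a consensus.

```python
from collections import Counter

def consensus(aligned_seqs):
    """Return the majority-vote consensus of equal-length aligned sequences."""
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned_seqs):
        # most_common(1) returns the most frequent base in this column;
        # ties are resolved in favor of the base seen first.
        base, _count = Counter(column).most_common(1)[0]
        result.append(base)
    return "".join(result)

# Toy, made-up copies: three of them carry a different single change.
copies = ["ATGGCA", "ATGACA", "TTGGCA", "ATGGCT"]
print(consensus(copies))
```

Because each copy carries its own private changes, the majority base at every position recovers a sequence that none of the damaged copies needs to contain in full.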


Studying the pathogenic potential of the virus that likely circulated in the population of ancient humans for millions of years can provide valuable information about its impact on human evolution.

Electron microscopy of viral particles generated by the Phoenix provirus.


For a long time, it was suspected that chimpanzees are a natural reservoir for the simian immunodeficiency viruses that gave rise to the zoonotic infections responsible for the HIV/AIDS pandemic. However, direct evidence was lacking, as the simian immunodeficiency virus most closely related to HIV-1 (SIVcpz) had been found only in captive chimpanzees. Infectious molecular clones were obtained by synthesizing a consensus SIVcpz sequence, and analysis of these clones demonstrated that natural SIVcpz strains possess the biological properties necessary for human infection.


To understand how SARS-CoV can be transmitted from animals to humans, the sequence of a bat SARS-like coronavirus was determined and a cDNA (29.7 kb) was synthesized. The bat virus was then successfully converted into an infectious clone by exchanging the region encoding its receptor-binding domain (RBD) with the corresponding region from human SARS-CoV. At the time, this was the largest synthetic replicating genome. Rational design, synthesis, and recovery of such hypothetical recombinant viruses can be used to study the mechanisms of interspecies transmission and could aid the development of antiviral drugs and vaccines in the event of a SARS epidemic.


So far, the complete genome of only one DNA virus, bacteriophage ɸX174, has been assembled by synthesis. DNA synthesis has also been used to study the structure and function of bacteriophage DNA, but that work involved creating fragments rather than reconstructing an entire viral genome.


Use of synthetic viruses

Therapy and other technological applications of viral components rely on synthetic viruses. These are multifunctional vectors that mimic key properties of viruses in order to overcome cellular barriers and deliver genes by mechanisms similar to those of viral vectors. Capsid proteins are called "coat proteins" because they surround the viral nucleic acid. Peptides that can self-assemble into such viral shells (capsids) are often used for this purpose.

These peptides can also be genetically modified to alter their properties. They typically carry positive charges, which enable them to interact with negatively charged polyelectrolytes such as RNA or DNA, as well as with various synthetic polymers.
There are different approaches to the artificial synthesis of viral capsids. One such approach involves creating a three-block structure consisting of a cationic spermine segment, a helical (coiled-coil) peptide, and polyethylene glycol.
This modular approach resembles the assembly often seen in natural systems, in which the necessary capsid components first bind a specific region of the DNA or RNA and final self-assembly then follows.
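The role of positive charge in binding nucleic acids can be made concrete with a rough sketch. It assumes a deliberately crude rule: lysine (K) and arginine (R) each count +1 and aspartate (D) and glutamate (E) each count -1 near neutral pH, ignoring histidine, the chain termini, and real pKa values. The function name and the example sequences are hypothetical.

```python
def net_charge(peptide):
    """Crude net charge near neutral pH: K and R count +1, D and E count -1."""
    seq = peptide.upper()
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative

# Hypothetical lysine-rich block of the kind used to bind DNA or RNA.
binding_block = "KKKKKKKKKKKK"
print(net_charge(binding_block))   # 12
print(net_charge("GDEKA"))         # -1
```

A strongly positive value suggests a block that can associate electrostatically with the negatively charged phosphate backbone; a proper estimate would use pH-dependent charges.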


Other methods use natural peptides for this purpose. These peptides can be produced in yeast. The resulting structures are flexible but lack stability. To obtain rigid structures, synthetic proteins mimicking natural "coat proteins" are used. In this way, virus-like particles (VLPs) can be created that are indistinguishable from natural VLPs. In 2014, a VLP structure capable of infecting HeLa cells was obtained.

The properties of viral particles can also be changed, using one of three distinct approaches.

The first approach involves genetic modification.
The second approach is chemical modification.
The third approach is a combination, where both genetic and chemical modification pathways are utilized.


In the genetic pathway, modifications are introduced into the primary structure of the coat protein by inserting specific amino acids or by fusing entire polypeptides to its N-terminal or C-terminal end. The addition of individual amino acids can be combined with chemical modification. The amino acids most commonly modified are lysine, at the (-NH2) group; aspartic and glutamic acids, at the (-COOH) group; and cysteine, at the (-SH) group.


The addition of peptide fragments to the ends of the polypeptide chain leads to changes in the properties of viral capsids, such as signaling, recognition, binding, and even the catalytic properties of the proteins. All of these modifications hold significant importance for biomedical technologies.

Viruses replicate by infecting a host, which means that genetic material must be delivered into the host cell to initiate infection, and this process is highly efficient. The same mechanism can therefore be used to deliver materials other than viral nucleic acids, genetic or otherwise, into cells. Designing such a delivery system involves the following considerations:

Inclusion/encapsulation of bioactive components.
Release of encapsulated materials.
Local accumulation where the delivery of necessary components is intended.
Elimination from the host organism.
Circulation time.
Immune response of the body to delivery agents.


Until recently, methods for generating viruses in the laboratory relied on PCR or RT-PCR amplification of templates from a pre-existing virus, followed by subcloning into appropriate plasmid vectors with or without mutagenesis. Such methods will continue to be invaluable for virology and vaccinology. Advances in gene synthesis now allow viruses to be generated in the absence of an available infectious virus. This is significant not only from a dual-use perspective but also for our understanding of the evolution and properties of important pathogens. Furthermore, the synthesis of genomes of both DNA and RNA viruses will create unprecedented opportunities for modifying natural genomes, enabling new investigations into viral genome structure, viral gene expression, and gene function.


Creation of artificial cells

The cell is the elementary unit of living matter. Modern cell biology not only studies the structure, functions, and operating principles of cells but also develops the methodology of cellular engineering. Its applications include medical biotechnology, containers for drug delivery, the development of new pharmaceuticals, biosensors, bioremediation technologies, and research into the origin and evolution of life.


The concept of creating artificial cells was proposed by Thomas Ming Swi Chang in 1957. Artificial cells can serve as biomimetic systems for studying and understanding the biological properties of cells, and they can be applied as substitutes for natural cells. Based on their internal characteristics, artificial cells are of two types: typical and atypical. Typical artificial cells have cell-like structures and possess at least some key characteristics of biological cells, such as the ability to evolve, self-replicate, and carry out metabolism. Atypical artificial cells are built from various materials that mimic one or more features of biological cells and, most importantly, are not constrained to a cell-like structure.


Typical artificial cells

The construction of typical artificial cells is considered one of the tasks of synthetic biology. Research on these synthetic cells has various goals, such as:

Understanding how cells function.
Bridging the gap between the non-living and living worlds.
Adding new functions that are absent in natural cells to develop new properties.
Providing a plausible theory for the origin of life.


Cells possess three main structures to perform fundamental life functions:

I. A stable, semi-permeable membrane surrounding the cell. Its components protect the cell from damage by the external environment and allow the transport of the substances needed for metabolism and energy exchange.

II. Biomacromolecules (DNA or RNA) carrying genetic information, controlling the cell, and endowing it with the ability to develop and adapt.

III. A series of metabolic pathways used to provide the cell with energy. This enables the cell to survive, self-renew, and process information.

Natural cells require three key components (a compartment, metabolism, and information), and these three components are interrelated. The early stage of protocell evolution suggests, however, that metabolism and compartments are sufficient for achieving cellular replication; in this case, information is not required.


Methods for creating artificial cells

Two methods are used. The first starts with a living organism: the genome is deconstructed and the minimal set of genes needed to support the basic properties of cellular life is selected, or the genome is replaced entirely with a synthetic one. This method is referred to as "top-down."

The second method, "bottom-up," is an approach that starts from scratch. It seeks to create a "living" artificial cell by assembling biological and/or non-biological molecules. These two approaches are very different but complement each other in producing a wide range of artificial cells, from simple protocells to engineered living cells.


"Top-down" construction

"Top-down" construction is aimed at building "minimal" cells by reducing or simplifying the genome of a living cell. A minimal cell is one that has the minimum number of genes necessary to perform the most essential life functions and helps understand how vital processes are regulated.


In 1995, the Venter laboratory discovered that the parasitic bacterium Mycoplasma genitalium has only 517 genes, making it the simplest known prokaryote. This bacterium was therefore chosen for creating a "minimal" cell. It was determined that approximately 256 to 350 of the 517 genes are sufficient to support the cell's viability.


In 2004, the minimal number of genes required for cell viability was reevaluated. Based on both computational methods and experimental strategies, it was determined that 206 genes are necessary for the normal functioning of a bacterial cell. However, among these 206 genes, many have partially dispensable functions, and therefore, this number of genes can be further reduced. For instance, some enzymes responsible for the synthesis of low-molecular-weight compounds (nucleotides and amino acids) may be non-essential for cell functioning. Thus, the corresponding genes can be knocked out, and the products of these genes can be supplied from the surrounding environment. This reduces the number of genes to 150.

The survival of these cells requires that the corresponding compounds be available in the surrounding environment and that the cell membranes be permeable to them. A specific environment is therefore necessary for the viability of such cells. Knowing which resources the environment provides makes it possible to identify the genes that are dispensable. Due to the specificity of the surrounding environment, these cells may be capable of exchanging substances and energy. The inability to synthesize certain substances, however, may lead to a self-renewal problem.
In constructing artificial cells, a method is also used where natural genes are replaced with synthetic ones. This method has been employed to construct viruses and certain bacteria, rather than eukaryotic cells.


In 2010, the Venter group developed the computer-designed genome sequence JCVI-syn1.0 for Mycoplasma mycoides. The design was based on the genome sequences of two laboratory strains of Mycoplasma mycoides subsp. capri GM12. The synthetic genome differed somewhat from the originals, with some genes modified, added, or removed. Chemically synthesized DNA fragments were assembled in yeast into the complete circular genome, which was then transplanted into recipient cells of Mycoplasma capricolum. The resulting cells exhibited the phenotypic properties of Mycoplasma mycoides and were capable of self-replication. These cells were named "synthetic cells."


Bottom-up strategy

The bottom-up strategy is much more complex than the top-down approach. Instead of building a synthetic organism from an existing cellular prototype, this strategy aims to create a cell from non-living components. It allows the relationship between living and non-living nature to be studied and helps in understanding how life originated.


The three main elements necessary for creating a "living" artificial cell using the bottom-up approach include cell membranes, metabolic systems, and informational molecules. Informational molecules, RNA or DNA, determine the nature and functions of the cell. Without the creation of artificial membranes, it is not possible to create artificial cells. Biological membranes are very complex, but it is possible to build artificial cell membranes that can possess some properties of natural membranes.

Main artificial membranes include:

Phospholipid membranes
Fatty acid membranes
Protein-polymer nano-conjugate membranes
Semi-permeable membranes
Inorganic membranes
Coacervate vesicles
Non-standard artificial cells (cell mimics)


A lipid-bounded protocell has been developed that synthesizes complex carbohydrates through the autocatalytic formose reaction (the synthesis of sugars from formaldehyde). The products of this reaction can trigger the quorum-sensing mechanism of Vibrio harveyi bacteria and elicit a bioluminescent response.

Scheme of the autocatalytic formose reaction of sugar synthesis leading to bioluminescence in Vibrio harveyi


Schematic representation of possible processes of vesicle formation and transformation


Despite the progress in creating artificial cells, there is still a significant gap between them and biological cells. Many questions remain to be addressed, including:
How to make artificial cells effectively communicate with the surrounding environment and with each other?
How to construct artificial cellular networks? In other words, how to create artificial cells with different functions and make them work together in a community?
How to enhance the ability of reproduction, division, and development of artificial cells?
How to make artificial cells absorb nutrients and move like living organisms?


The answers to these questions will enable the development of new technologies and provide insights into the nature of biological cells. Progress toward this ultimate goal is likely to bring numerous benefits. In general, the potential advantages of artificial cells include:

Providing a plausible theory of the origin of life.
Bridging the gap between the non-living and living worlds.
Using artificially created organisms for the production of pharmaceuticals and fuels.
Replacing or supplementing deficient cells, drug delivery, or medical imaging, adding new functions absent in biological cells.
The creation of artificial cells holds promising opportunities in various fields such as biotechnology, medicine, and industry.


Synthetic Bioengineering and Biological Computers

Synthetic biology, as mentioned earlier, is a symbiosis of molecular biology and engineering sciences. The main principles of engineering include automation, abstraction, standardization, and division of labor.

Automation in synthetic biology primarily involves robotic systems for conducting experiments and computer programs that simplify the design of biosystems, known as biological CAD (computer-aided design) systems.


Automation of experimental procedures allows complex experiments to be conducted faster and more precisely than a human could manage. It also helps address the pressing problem of reproducibility of experimental results across laboratories. When the experimental protocol is encoded as a program executed by a robot, the so-called human factor is eliminated and, at the same time, the protocol is specified in far greater detail than is possible in a scientific paper.
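A minimal sketch of what "a protocol encoded as a program" can mean is given below. The step names, parameter names, and the table of required parameters are all hypothetical and do not correspond to any real laboratory robot or cloud-laboratory API; the point is only that a machine-readable protocol can be checked for underspecified steps.

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    action: str                                   # e.g. "transfer", "incubate"
    params: dict = field(default_factory=dict)    # every setting is explicit

# Hypothetical rule set: parameters each action must state explicitly.
REQUIRED = {
    "transfer": {"source", "dest", "volume_ul"},
    "incubate": {"temp_c", "minutes"},
}

def missing_parameters(protocol):
    """Return (step index, sorted missing parameter names) for each bad step."""
    problems = []
    for i, step in enumerate(protocol):
        missing = REQUIRED.get(step.action, set()) - set(step.params)
        if missing:
            problems.append((i, sorted(missing)))
    return problems

protocol = [
    Step("transfer", {"source": "A1", "dest": "B1", "volume_ul": 50}),
    Step("incubate", {"temp_c": 37}),   # duration omitted, as prose often does
]
print(missing_parameters(protocol))     # [(1, ['minutes'])]
```

A written methods section can leave the incubation time implicit, whereas a program of this kind cannot run until every such parameter has been stated.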

A further development of the idea of automating biological experiments is the creation of 'cloud laboratories,' where all research is carried out by robots. Modern biology has accordingly acquired the terms 'wet' and 'dry' biologists. Wet biologists carry out experiments, working in laboratories with real biomolecules and biological objects. Dry biologists, who are becoming more common, work exclusively at a computer.


Once the experiment has been specified on a computer, a fully automated system located in a remote 'cloud' laboratory executes all commands, leaving scientists only to analyze the results. As with cloud computing services, this approach significantly reduces user costs: equipment stands idle less and is shared among many researchers.

The term "biological elements" most commonly refers to genetic elements, which are DNA fragments. A crucial milestone in the standardization of working with genetic elements was the creation of the BioBricks standard, enabling the assembly of large genetic constructs from individual elements called "biobricks." The DNA sequence of each "biobrick" consists of a functional part, to which special sequences, a prefix and a suffix, are attached on either side. All of this is then enclosed in a small circular DNA molecule that also contains instructions for its replication by bacteria (where the bacterial cell acts as a "brick factory").


The analogy between hierarchical abstraction in electrical engineering and synthetic biology


Creating Gene Networks

By analogy with the integrated circuit, a device for processing electrical signals, the notion of a genetic circuit has emerged. A genetic circuit is a genetic program that instructs a cell to process signals, such as the appearance of certain molecules, in a specific way.

In order for bioengineers to create complex systems, genetic programs that function as logical elements, akin to transistors in electrical engineering, are needed. Such logical elements can be constructed, for instance, based on known gene expression regulation systems.


In 1961, François Jacob and Jacques Monod described the regulatory system of E. coli known as the lactose operon. The lactose operon comprises a set of co-regulated genes encoding enzymes involved in the metabolism of the carbohydrate lactose. If glucose is present in the environment, the bacterium utilizes it instead of lactose. The lactose operon is activated only when lactose is present in the environment and glucose is absent. How does this happen? In the absence of lactose, a repressor protein is attached to a DNA segment and blocks the expression of the genes. When lactose is present, it binds to the repressor, which then detaches from the DNA; when the glucose concentration also decreases, an activation signal for the operon is triggered. In short, the lactose operon behaves like a logical element, taking on a value of 0 (enzymes are not synthesized) or 1 (enzymes are synthesized) depending on the input information.
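The logical behavior described above can be written as a small truth table. This is a deliberately idealized sketch: the function name is hypothetical, and real operon activity is graded rather than strictly 0 or 1.

```python
def lac_operon_on(lactose: bool, glucose: bool) -> bool:
    """Idealized lac operon: active only with lactose present and glucose absent."""
    return lactose and not glucose

# Print the full truth table: inputs, then output as 0 or 1.
for lactose in (False, True):
    for glucose in (False, True):
        print(f"lactose={int(lactose)} glucose={int(glucose)} "
              f"-> enzymes={int(lac_operon_on(lactose, glucose))}")
```

Only the combination lactose=1, glucose=0 yields 1, so in this idealization the operon computes "lactose AND NOT glucose".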


In 2000, James Collins, Charles Cantor, and Tim Gardner reported the first artificial genetic toggle switch. (In electrical engineering, a toggle switch is a device that can remain in one of two stable states for a long time and flips between them under the influence of external signals, similar to the behavior of the lactose operon.) Their toggle switch consisted of two genes, A and B, each inhibiting the other's activity. In the same year, Michael Elowitz and Stanislas Leibler constructed the first biological oscillator, called a repressilator: a system of three genes connected in a feedback loop. The product of the first gene inhibits the second, the second inhibits the third, and the third inhibits the first. As a result, the concentrations of the three proteins oscillate periodically.
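The two stable states of a toggle switch can be seen in a short simulation. The sketch below assumes a commonly used dimensionless model of two mutually repressing genes, du/dt = alpha / (1 + v^n) - u and dv/dt = alpha / (1 + u^n) - v, integrated with a simple Euler scheme; the parameter values and the function name are illustrative assumptions, not values from the original experiments.

```python
def simulate_toggle(u0, v0, alpha=10.0, n=2.0, dt=0.01, steps=5000):
    """Euler integration of two genes that repress each other.

    u and v are the (dimensionless) levels of the two repressor proteins.
    """
    u, v = u0, v0
    for _ in range(steps):
        du = alpha / (1.0 + v ** n) - u   # production repressed by v, minus decay
        dv = alpha / (1.0 + u ** n) - v   # production repressed by u, minus decay
        u, v = u + dt * du, v + dt * dv
    return u, v

# The same circuit started from two different initial conditions.
print(simulate_toggle(3.0, 1.0))   # settles with u high and v low
print(simulate_toggle(1.0, 3.0))   # settles with v high and u low
```

Whichever protein starts ahead suppresses the other and wins, so the circuit "remembers" its state, which is the defining property of a toggle switch.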


Simple electrical engineering devices, a switch (toggle) and an oscillator (rhythm generator), their analogs in synthetic systems, and diagrams of the corresponding genetic networks


Biocomputer

A biocomputer (also biological computer, molecular computer) is a computer that functions as a living organism or contains biological components. The creation of biocomputers is based on the field of molecular computing. Proteins and nucleic acids that react with each other are used as computational elements.


Unlike conventional computers, biocomputers consume very little energy, which makes them extremely cost-effective computing devices. Computers based on bacteria have been created that can perform basic logical operations such as logical addition (OR), multiplication (AND), and subtraction.

An alternative to cellular systems is DNA computers capable of performing their functions outside cells directly in a test tube. The operation of such systems is based on the properties of the DNA molecule: information is encoded in the DNA chain as a sequence of nucleotides, which can be modified using enzymes. Thus, DNA computers can store and process information. Biocomputers can be of three types: biochemical computers, biomechanical computers, and bioelectronic computers.
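How a nucleotide sequence can carry information is easy to show with a toy encoding. The mapping below, two bits per nucleotide, is an assumption chosen for simplicity and is not the scheme of any particular DNA computer; practical schemes also add error correction and avoid troublesome sequences.

```python
# Assumed mapping: each pair of bits corresponds to one nucleotide.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def encode(data: bytes) -> str:
    """Turn bytes into a DNA string, four nucleotides per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(dna: str) -> bytes:
    """Inverse of encode: read two bits per nucleotide and regroup into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

message = b"Hi"
strand = encode(message)
print(strand)            # CAGACGGC
print(decode(strand))    # b'Hi'
```

With four possible nucleotides, each position stores at most two bits, so one byte needs four nucleotides in this scheme.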


Biochemical computers utilize a vast array of feedback cycles characteristic of biochemical reactions to achieve computational functionality. The potential application of biocomputers includes embedding computational systems into the human body and using them in gene therapy for disease detection and treatment.


The standard visual symbols for the Synthetic Biology Open Language (SBOL), used in the BioBricks standard


Biological constructor

Each BioBrick part is a DNA sequence carried in a circular plasmid, which acts as a vector. The first version of the BioBrick standard introduced standard prefix and suffix sequences that flank the 5′ and 3′ ends of the DNA part (BioBrick). These standard sequences contain specific restriction enzyme sites: the prefix contains EcoRI and XbaI sites, while the suffix contains SpeI and PstI sites. The prefix and suffix are not part of the BioBrick itself. To facilitate assembly, a BioBrick part must not contain any of these restriction sites. During the assembly of two different parts, one plasmid is cleaved using EcoRI and SpeI, while the other plasmid is cleaved using EcoRI and XbaI. As a result, both plasmids carry 4 base pair overhangs at their cut ends. The EcoRI ends ligate with each other, and the compatible XbaI and SpeI overhangs also ligate, so that both DNA parts end up in one plasmid. The ligation between the two BioBrick parts leaves a so-called "scar" sequence of 8 base pairs. Since the scar is a hybrid of the XbaI and SpeI sites, it is not recognized by either restriction enzyme.
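Two of the rules above lend themselves to a short check: a part must be free of the four restriction sites, and the junction formed from an XbaI end and a SpeI end can no longer be cut. The sketch below uses the standard recognition sequences of the four enzymes; the function names and example parts are hypothetical, and only the 6 base pair core of the junction is modeled, not the flanking bases that make up the full scar.

```python
# Recognition sequences; all four are palindromic, so scanning one strand suffices.
SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
}

def illegal_sites(part: str) -> dict:
    """Return {enzyme: [0-based positions]} for forbidden sites inside a part."""
    seq = part.upper()
    found = {}
    for enzyme, site in SITES.items():
        positions = [i for i in range(len(seq) - len(site) + 1)
                     if seq.startswith(site, i)]
        if positions:
            found[enzyme] = positions
    return found

def mixed_junction() -> str:
    """Core sequence formed when a SpeI-cut end is ligated to an XbaI-cut end."""
    spei_left = "A"      # base left upstream of the SpeI cut (A^CTAGT)
    overhang = "CTAG"    # 4-base overhang shared by SpeI and XbaI
    xbai_right = "A"     # base left downstream of the XbaI cut (T^CTAGA)
    return spei_left + overhang + xbai_right

print(illegal_sites("ATGGAATTCAAA"))   # {'EcoRI': [3]}
print(illegal_sites("ATGAAACCCGGG"))   # {}
print(mixed_junction(), mixed_junction() in SITES.values())   # ACTAGA False
```

The junction ACTAGA matches neither TCTAGA (XbaI) nor ACTAGT (SpeI), which is why the assembled composite part can be reused in the next round of assembly with the same four enzymes.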

