Genetically Modified Pig As A Cancer Prone Model

  • Published: Nov 15, 2012
  • Earliest Priority: May 12 2011
  • Family: 1
  • Cited Works: 57
  • Cited by: 6
  • Cites: 5
  • Sequences: 240

GENETICALLY MODIFIED PIG AS A CANCER PRONE MODEL

The present invention relates to a genetically modified pig and porcine cells comprising a genetically modified porcine p53 sequence. Methods to produce such a genetically modified pig are herein provided. In particular, the pig and cells are generated using zinc finger nuclease-mediated or Transcription activator-like effector nuclease-mediated technologies. Nucleases targeting porcine p53 nucleic acid sequences as well as nucleic acid sequences encoding said nucleases are herein described. Also herein provided are methods of assessing effects of agents in animals and cells comprising a genetically modified porcine p53 sequence.

The pig according to the present invention can advantageously be used as a model prone to proliferative disease, preferably cancer. The pig and cells herein described can be used in toxicology and in oncology research in order to identify new biomarkers of cancer and/or new compounds for preventing, treating or alleviating proliferative diseases such as cancer.

BACKGROUND OF THE INVENTION

Cancer is fundamentally a disease of failure of regulation of tissue growth. In order for a normal cell to transform into a cancer cell, the genes which regulate cell growth and differentiation must be altered.

The affected genes are divided into two broad categories including oncogenes and tumor suppressor genes. Oncogenes are genes which promote cell growth and reproduction. Tumor suppressor genes are genes which inhibit cell division and survival. Malignant transformation can occur through the formation of novel oncogenes, the inappropriate over-expression of normal oncogenes, or by the under-expression or inactivation of tumor suppressor genes. Typically, changes in many genes are required to transform a normal cell into a cancer cell.

Genetic changes can occur at different levels and by different mechanisms. The gain or loss of an entire chromosome can occur through errors during mitosis. More common are mutations inducing changes in the nucleotide sequence of genomic DNA.

Replication of the enormous amount of data contained within the DNA of living cells will probabilistically result in some errors (mutations). Complex error correction and prevention is built into the process and safeguards the cell against cancer. If a significant error occurs, the damaged cell can "self destruct" through programmed cell death, termed apoptosis. If the error control processes fail, then the mutations will survive and be passed along to daughter cells. Some environments make errors more likely to arise and propagate. Such environments can include, for example, the presence of disruptive substances called carcinogens, repeated physical injury, heat, ionising radiation and/or hypoxia. The transformation of a normal cell into a cancer cell is akin to a chain reaction caused by initial errors, which compound into more severe errors, each progressively allowing the cell to escape the controls that limit normal tissue growth. Once cancer has begun to develop, this ongoing process, termed clonal evolution, drives progression towards more invasive cancer stages.

The p53 tumor suppressor serves as a "guardian of the genome" (Lane (1992), Nature 358: 15-16) and has been studied intensively for over 30 years. p53 is activated as a transcription factor in response to cellular stresses, such as DNA damage, hypoxia and cell-cycle aberrations. It can thus help to promote the repair and survival of damaged cells by inducing cell-cycle arrest, or it can promote the permanent removal of damaged cells by inducing programmed cell death or senescence (Levine & Oren (2009), Nat. Rev. Cancer 9: 749-758).

As the p53 gene (also identified as TP53) is either mutated or deleted in more than 50% of human tumours (Hollstein et al., (1991), Science 253: 49-53), functional p53 is clearly very important in protecting against cancer development. The vast majority of p53 mutations in human tumours are single missense mutations that cluster in the core DNA-binding domain of the protein (i.e. between amino acid residues 100-300 of the human p53 amino acid sequence of SEQ ID NO: 3). These mutations can lead to both the disruption of normal p53 function and the accumulation of high levels of mutant p53 with various gain-of-function activities (reviewed in Brosh & Rotter (2009), Nat. Rev. Cancer 9: 701-713; and Soussi & Lozano (2005), Biochem. Biophys. Res. Commun. 331: 834-842).

The use of mouse models has greatly contributed to the understanding of the role of p53 in tumour suppression. Knockout (KO) mice have been generated using homologous recombination in ES cells. Mice homozygous for a deletion in the p53 gene develop tumours at high frequency, providing essential evidence for the importance of p53 as a tumour suppressor. Additionally, crossing these knockout mice, or mice with transgenic expression of p53 dominant negative alleles, with other tumour-prone mouse strains has allowed the effect of p53 loss on tumour development to be examined further. In a variety of mouse models, absence of p53 facilitates tumorigenesis, thus providing a means to study how the lack of p53 enhances tumour development and to define genetic pathways of p53 action. Depending on the particular model system, loss of p53 results in either deregulated cell-cycle entry or aberrant apoptosis (programmed cell death), confirming results found in cell culture systems and providing insight into in vitro function of p53. Finally, as p53 null mice rapidly develop tumours, they are useful for evaluating agents for either chemopreventive or therapeutic activities.

Until recently, genetic engineering in non-rodent species, particularly for knocking out genes, was not routine for the skilled person. The emergence of new technologies allowing the induction of specific double-strand breaks in DNA, more particularly the zinc finger nuclease (ZFN) technology in 2005 and the transcription activator-like (TAL) effector technology in 2009, opens up genetic engineering in larger species such as pig or sheep.

Zinc Finger nucleases (ZFNs) are hybrid molecules composed of (i) a designed polymeric zinc finger domain specific for a DNA target sequence and (ii) a FokI nuclease cleavage domain (Kandavelou and Chandrasegaran 2009; Kim et al. 1997; Miller et al. 2007; Porteus 2008; Santiago et al. 2008; Shukla et al. 2009; Szczepek et al. 2007). FokI requires dimerization to cut DNA. The binding of two heterodimers of designed ZFN-FokI hybrid molecules to two contiguous target sequences in each DNA strand, separated by a cleavage site of about 5 or 6 base pairs, results in FokI dimerization and subsequent DNA cleavage (Figure 1).

The specificity of ZFNs is determined by their polymeric zinc finger domains, the DNA binding properties of which are generated through modular assembly of individual zinc fingers [reviewed in (Beerli and Barbas 2002; Pabo et al. 2001)]. Two major platforms exist for generating polymeric zinc fingers with defined specificities: a proprietary platform developed by Sangamo Biosciences (Isalan et al. 2001; Urnov et al. 2005), and the OPEN platform developed by the Zinc Finger Consortium (Maeder et al. 2008; Sander et al. 2007; Wright et al. 2006).

Similarly, transcription activator-like effector nucleases (TALENs) are hybrid molecules composed of (i) a designed polymeric DNA binding domain specific for a DNA target sequence and (ii) a FokI nuclease cleavage domain. As described for the ZFN, FokI requires dimerization to cut DNA. The binding of two heterodimers of designed TALE-FokI hybrid molecules to two contiguous target sequences in each DNA strand, separated by a cleavage site of about 5 to 20 base pairs, results in FokI dimerization and subsequent DNA cleavage (Figure 12A and 12C).
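The paired-site geometry described above for ZFNs and TALENs (two half-sites on opposite strands flanking a short spacer) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `find_paired_sites`, the toy half-site sequences and the spacer bounds used in the example are assumptions made for the illustration and are not sequences or parameters taken from this application.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq))


def find_paired_sites(genome, left_site, right_site, min_spacer, max_spacer):
    """Locate left/right half-sites that flank a spacer of permitted length.

    left_site is read 5'->3' on the top strand; right_site is read 5'->3'
    on the bottom strand, so its reverse complement is searched on the top
    strand. Returns a list of (left_start, spacer_length), 0-based.
    """
    right_on_top = reverse_complement(right_site)
    hits = []
    start = genome.find(left_site)
    while start != -1:
        spacer_start = start + len(left_site)
        for spacer in range(min_spacer, max_spacer + 1):
            if genome.startswith(right_on_top, spacer_start + spacer):
                hits.append((start, spacer))
        start = genome.find(left_site, start + 1)
    return hits


# Hypothetical toy target: two 9-bp half-sites separated by a 6-bp spacer.
left = "GCTGGGACA"
right = "TTCACCGGA"
toy = "AAAA" + left + "CCATGG" + reverse_complement(right) + "TTTT"
print(find_paired_sites(toy, left, right, 5, 7))
```

The search is exact-match only; a real design tool would also have to score degenerate or off-target sites, which this sketch does not attempt.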

The specificity of TALENs depends on a variable number of imperfect, typically 34-amino-acid, repeats (Schornack et al., 2007, Proc. Natl. Acad. Sci. USA 104, 10720). Polymorphism is primarily at repeat positions 12 and 13, which are called the repeat-variable diresidue (RVD). RVDs of TAL effectors correspond directly to the nucleotides in their target sites, one RVD corresponding typically to one nucleotide, with some degeneracy and no apparent context dependence (Moscou and Bogdanove (2009) Science 326: 1501; Boch et al. (2009) Science 326: 1509-1512).
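The one-RVD-to-one-nucleotide correspondence lends itself to a simple lookup table. The sketch below, offered as an illustration rather than as the design procedure used in this application, covers only four commonly cited pairings (NI with A, HD with C, NN with G or A, NG with T); the function names and the example RVD array are hypothetical.

```python
import re

# RVD -> preferred nucleotide(s). Only four pairings are covered here;
# other RVDs exist and would raise KeyError in this sketch.
RVD_TO_BASES = {
    "NI": "A",
    "HD": "C",
    "NN": "GA",  # degenerate: G or A
    "NG": "T",
}


def rvds_to_pattern(rvds):
    """Translate an ordered list of RVDs into a regex for the target site."""
    parts = []
    for rvd in rvds:
        bases = RVD_TO_BASES[rvd]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


def recognises(rvds, site):
    """True if the RVD array is predicted to match the whole DNA site."""
    return re.fullmatch(rvds_to_pattern(rvds), site) is not None


# Hypothetical four-repeat array, for illustration only.
example_rvds = ["HD", "NI", "NG", "NN"]
print(rvds_to_pattern(example_rvds))
print(recognises(example_rvds, "CATG"), recognises(example_rvds, "CATC"))
```

The model ignores context effects and binding strength, consistent with the statement above that no context dependence is apparent, but it is a simplification all the same.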

Induction of a DNA double strand break by a ZFN or a TALEN results in the activation of a cellular response known as the DNA damage response. A double strand break can be repaired in two different ways (Figure 2). Non-homologous end joining (NHEJ) generates short insertions or deletions at the cleavage site. Repair by homologous recombination (HR) using a DNA template results either in a perfect repair or, if a modified template is introduced, in sequence replacement (gene knock-in or gene knock-out).
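Whether a short NHEJ insertion or deletion disrupts a coding sequence depends largely on whether its net length is a multiple of three. The following minimal sketch classifies a junction on that basis; the function name and the category labels are assumptions introduced for the illustration.

```python
def classify_indel(inserted, deleted):
    """Classify an NHEJ junction by its net effect on the reading frame.

    inserted / deleted are the numbers of nucleotides added and removed
    at the cleavage site within a coding sequence.
    """
    if inserted == 0 and deleted == 0:
        return "no change"
    net = inserted - deleted
    # A net change that is a multiple of 3 keeps the downstream frame,
    # although it may still remove or alter key amino acids.
    return "in-frame" if net % 3 == 0 else "frameshift"


for ins, dele in [(0, 1), (0, 3), (4, 0), (2, 5)]:
    print(ins, dele, classify_indel(ins, dele))
```

An "in-frame" result does not guarantee a functional protein; it only indicates that the codons downstream of the junction are read as before.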

Precisely targeted site-specific cleavage of genomic loci offers an efficient supplement and/or an alternative to conventional homologous recombination for generating animal models, as herein demonstrated. Creation of a double-strand break (DSB) increases the frequency of homologous recombination at the targeted locus more than 1000-fold. The imprecise repair of a site-specific DSB by non-homologous end joining (NHEJ) can also result in gene disruption. Creation of two such DSBs results in deletion of arbitrarily large regions. The modular DNA recognition preferences of zinc-finger proteins allow for the rational design of site-specific multi-finger DNA binding proteins, just as the DNA recognition preferences of RVDs allow for the rational design of site-specific TAL effectors (TALEs). Fusion of the nuclease domain from the Type II restriction enzyme FokI to site-specific zinc-finger proteins or a TAL effector allows for the creation of site-specific nucleases.

Patent application US 2011/0023149 A1 describes an animal genetically modified using the zinc finger nuclease technology (WO 03/087341), in particular a rat, which comprises at least one edited chromosomal sequence encoding a protein involved in tumor suppression. P53 is cited among a long list of different genes encoding such proteins. Likewise, the pig is identified as a possible genetically modified animal among a long list of different species comprising mammals. US 2011/0023149 A1, however, provides no concrete example of a genetically modified pig.

Patent application US 2010/0287628 A1 describes a genetically modified non-human mammal, in particular a rat, comprising a genetic mutation in one or more tumor suppressor genes that causes the mammal to have a greater susceptibility to cancer than a mammal not comprising the genetic mutation. P53 is again cited among a long list of different genes encoding such proteins. US 2010/0287628 A1, however, does not suggest using the zinc finger nuclease technology or TALEN technology and provides no concrete example of a genetically modified pig.

Cancer models and cancer prone models are critically needed for pharmaceutical research in oncology (e.g. preclinical studies) as well as for toxicology studies (e.g. carcinogenesis of chemical, biological or physical compounds).

Current models mainly rely on the use of rodents in which the p53 gene has been knocked out. Rodent models are however limited. First, their size does not mimic that of humans and does not allow the development of tumors of an adequate size for testing, for example, radiation therapies in conditions similar to those in humans. Moreover, the physiology of rodents can differ from that of human beings, particularly in the molecular process of cancer development as well as in the immune reaction. For example, in humans, missense mutations in the human adenomatous polyposis coli (APC) gene lead to metastatic cancer of the colon and rectum, but in mice the same mutations cause non-invasive, non-metastatic neoplasia of the small intestine (D.E. Corpet, F. Pierre, Point: From animal models to prevention of colon cancer. Systematic review of chemoprevention in min mice and choice of the model system, Cancer Epidemiol Biomarkers Prev 12 (2003) 391-400). Another type of cancer model is the immunodeficient mouse in which human cancer cells are engrafted. Such models allow evaluating, for example, the efficacy of a treatment, but do not take into account the capacity of the immune system to combat the cancer cells.

Differences in drug metabolism can also result in dramatically different responses in humans and mice, reducing the predictive value of rodent studies. It is also very difficult to develop surgical or endoscopic techniques in mice because of their small size. There is therefore a need to extend genetically-defined disease models beyond mice to other species.

Although non-human primates are clearly the closest species to humans, their use for biomedical research poses technical and ethical concerns. Moreover, a considerable part of the primates used for biomedical research are small-sized and are thus not suited for translational research in the fields of gene and cell therapy, or for testing medical devices. Also, experimentation on non-human primates is mainly limited to healthy animals, with no possibility of performing preclinical studies in diseased animals. Finally, the cost of experimentation on non-human primates is exceptionally high compared to other species. For all these reasons, experimentation on alternative species would be of great value.

The data reviewed by Ismail Kola & John Landis (Nature Reviews Drug Discovery 3, 711-716 (August 2004)) indicate that the average success rate for all therapeutic areas is approximately 11%. In other words, only one in nine compounds makes it through development and gets approved by the European and/or the US regulatory authorities. More interestingly, the success rates vary considerably between the different therapeutic areas: cardiovascular, for instance, have a 20% rate of success, whereas oncology and central nervous system (CNS) disorders have 5% and 8% success, respectively. Any R&D portfolio, therefore, would need to aggregate the percent success based on the weight of the various therapeutic areas to calculate how many first-in-man studies are needed to approximate the requisite business case for growth.
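The arithmetic behind these figures can be made explicit. In the sketch below the three success rates are the ones quoted above; the portfolio weights are purely hypothetical and the function names are assumptions introduced for the illustration.

```python
# Success rates quoted in the text (first-in-man to approval).
SUCCESS_RATE = {"cardiovascular": 0.20, "oncology": 0.05, "cns": 0.08}


def portfolio_success(weights):
    """Weighted average success rate for a portfolio of therapeutic areas."""
    total = sum(weights.values())
    return sum(SUCCESS_RATE[area] * w for area, w in weights.items()) / total


def candidates_per_approval(success_rate):
    """Expected number of first-in-man candidates needed per approval."""
    return 1.0 / success_rate


# An overall rate of about 11% corresponds to roughly one approval in nine.
print(round(candidates_per_approval(0.11), 1))

# Hypothetical portfolio: twice as many oncology projects as the others.
weights = {"cardiovascular": 1, "oncology": 2, "cns": 1}
rate = portfolio_success(weights)
print(round(rate, 3), round(candidates_per_approval(rate), 1))
```

With these assumed weights the blended rate falls below the 11% average, which illustrates why a portfolio weighted towards oncology needs more first-in-man studies per approval.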

The major causes of attrition in the clinic in 2000 were lack of efficacy (accounting for approximately 30% of failures) and safety (toxicology and clinical safety accounting for a further approximately 30%). The lack of efficacy might be contributing more significantly to therapeutic areas in which animal models of efficacy are notoriously unpredictive (Booth, B., Glassman, R. & Ma, P. Oncology's trials. Nature Rev. Drug Discov. 2, 609-610 (2003)), such as CNS and oncology, both of which have relatively higher failure rates in Phase II and III trials. In the case of oncology, small Phase II trials looking at tumour regression in small cohorts of patients with different tumour types do not always translate to outcomes subsequently obtained in larger Phase III trials. Nevertheless, in general, failures due to lack of efficacy and safety demonstrate the need for the development of more predictive animal models where possible and, more importantly, the need to develop experimental medicine paradigms that are more predictive of outcomes and to carry out such proof-of-concept clinical trials much earlier in development. It is interesting that oncology and CNS, two therapeutic areas with very high attrition rates, are also the areas in which animal models are not very predictive of the true human pathophysiology. For example, most pharmaceutical companies still use xenograft models for oncology testing, in which a tumour cell line that might have little relevance to the tumour in vivo is injected into a nude mouse (which does not resemble the immunology of the host; nor does the artificial location of the tumour significantly resemble what happens in vivo during tumorigenesis). The use of appropriate genetic models (for example, transgenic and gene knockout animals) of tumorigenesis might be more pathophysiologically relevant.

The present invention now provides an advantageous pig carrying a genetically modified nucleic acid sequence in the p53 gene, which can in particular be used as a cancer prone model. The modification can result from the non-homologous end joining (NHEJ) repair mechanism or from homologous recombination (HR) following double strand DNA cleavage, preferably in a sequence of the porcine p53 genomic sequence of SEQ ID NO: 2.

SUMMARY OF THE INVENTION

The p53 gene is mutated in over 50% of all human cancers (Hollstein et al., (1991), Science 253: 49-53), but mutations associated with p53 or the p53 pathway (e.g. MDM2, AKT) are found in virtually 100% of all cancers. mdm2 amplifications occur in 10% of tumors, especially sarcomas; p14ARF is lost in around 20-30% of tumors (either by locus deletion, methylation or point mutations). These alterations contribute to deregulation of p53 function. Moreover, AKT activation is necessary to shuttle MDM2 across the nuclear membrane, a necessary movement to initiate p53 degradation. AKT is activated in many tumors either by PTEN mutations (approx. 40%) or PI3K amplifications (10%). Constitutive activation of AKT also contributes to p53 deregulation. Furthermore, Li-Fraumeni syndrome is an autosomal dominant hereditary disorder, where germline mutations in the p53 tumour suppressor gene lead to a variety of cancers (Varley JM (2003) Hum. Mutat. 21 (3): 313-20). The ZFP and ZFN described here can be used to prepare a genetically modified pig usable to study and identify new compounds and/or treatments to prevent, treat or alleviate various proliferative diseases, in particular a tumor, preferably a malignant tumor (cancer).

The pig of the present invention carries a genetically modified nucleic acid sequence in the p53 gene. The pig of the invention may advantageously be used as a pig model, typically a cancer prone model as explained previously.

The genetic modification advantageously results from non-homologous end joining (NHEJ) repair mechanism or homologous recombination following double strand DNA cleavage preferably in a four to seven bases sequence (cleavage site), even more preferably in a five, six or seven bases sequence, of the p53 genomic sequence of SEQ ID NO: 2, preferably of a region of approximately 300bp around SEQ ID NO: 4 and SEQ ID NO: 67 both located in SEQ ID NO:2, or even more preferably i) of SEQ ID NO: 4 or a polymorphic sequence thereof (such as SEQ ID NO: 66), or ii) of SEQ ID NO:67 or a polymorphic sequence thereof (such as SEQ ID NO: 118). The genetic modification possibly consists in i) deletion, insertion or substitution affecting at least one nucleotide and at most 7 nucleotides, typically 2, 3, 4, 5 or 6 nucleotides, in the cleavage site of SEQ ID NO: 2, or ii) larger deletion of about 100 to about 150 base pairs or more around the cleavage site in SEQ ID NO: 2.

The present invention encompasses any cell or population of cells derived from a pig according to the invention, for example a cell selected from a stem cell, an induced pluripotent stem cell (iPS cell), a germ cell, a gamete and a somatic cell. The cell can further be selected from a fertilized egg, a zygote, a morula, a blastocyst, an embryo, or a fetus derived from a pig model according to the invention. The cell can be a non-malignant cell or a tumor cell.

The present invention further encompasses the nucleus of a cell according to the invention as well as a cell line derived from such a cell.

Also herein described are in particular i) the use of a pig or of a cell according to the present invention, for the evaluation, in particular for the in vivo evaluation, of the ability of a compound, to induce, prevent or treat a cancer or for the screening of a compound for inducing, preventing or treating a cancer, ii) the use of a cell or of a population of cells according to the invention, for the in vitro or ex vivo evaluation of the ability of a test compound or a physical treatment applied to the animal body to induce, prevent or treat a cancer or for the screening of a compound for inducing, preventing or treating a cancer, and iii) the use of a cell or of a population of cells according to the invention, for the in vitro or ex vivo identification of a biomarker usable to diagnose a cancer or to identify a compound for preventing or treating a cancer.

DESCRIPTION OF DRAWINGS

Figure 1 : Zinc Finger nucleases (ZFNs) mechanism of action

(A) Schematic model of zinc finger interactions with DNA. A single finger (left) is capable of structurally-conserved amino acid-base contacts from positions on consecutive turns of the finger α-helix (shown as circled residues, numbered relative to the start of the helix, Position -1). Arrows indicate potential base contacts (from positions -1, 2, 3 and 6) to DNA bases ("X"). Note that contacts are primarily to one strand of the DNA. Each finger potentially binds 4-base-pair overlapping sites, and multi-finger constructs may be linked as shown (right) to bind longer DNA target sites.

(B) Schematic depiction of the architecture of a pair of ZFNs binding a cognate binding site. Zinc finger nucleases (ZFNs) are artificial proteins consisting of a zinc finger DNA binding domain fused through a short amino-acid linker to the nuclease domain of FokI (mis-labeled as "Fogl" in this figure). For efficient cutting, a pair of ZFNs needs to bind their cognate half sites. Depicted in this figure is a pair of 3-finger ZFNs that recognize the same nine base-pair target site, but efficient cutting of DNA can also be obtained when two different ZFNs are used to recognize a nonsymmetric recognition sequence. The dimerization requirement means that a full ZFN site for a pair of 3-finger proteins would be 18 bp long - long enough to create a unique binding site in the genome. ZFNs can generally cut target sites efficiently when the two zinc finger binding sites are separated by 5, 6 or 7 base pairs. The advantage of ZFNs is that the zinc finger DNA binding domain can be re-designed to recognise novel target sites.
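The claim that 18 specified base pairs are long enough to be unique can be checked with a back-of-the-envelope estimate. The sketch below assumes a genome of about 3 x 10^9 bp with uniformly random, independent bases and counts both strands; these are simplifying assumptions for illustration, not figures from this application, and the function name is hypothetical.

```python
def expected_random_hits(site_length, genome_size, strands=2):
    """Expected chance occurrences of one specific site in a random genome.

    Assumes each of the four bases is equally likely at every position.
    """
    return strands * genome_size / 4 ** site_length


GENOME_SIZE = 3e9  # assumed order of magnitude for a mammalian genome

print(4 ** 18)                                # distinct 18-mers
print(expected_random_hits(18, GENOME_SIZE))  # well below 1
print(expected_random_hits(9, GENOME_SIZE))   # a single 9-bp half-site
```

Under these assumptions a single 9-bp half-site would be expected to occur many thousands of times by chance, whereas the full 18-bp pair is expected less than once, which is the rationale for requiring two ZFNs to bind before FokI can dimerize and cut.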

Figure 2: NHEJ & Homologous recombination following double strand break by ZFNs

(A) General schematic mechanism of NHEJ mediated knockouts following double strand break by ZFNs. In the absence of a template for Homologous Recombination, ZFN-induced double strand breaks are fixed by NHEJ repair mechanism, which often generates imprecise junctions, including both deletions and additions at the joint. Typical NHEJ junctions have a few nucleotides deleted or added. If these imprecise joints alter the reading frame or remove key amino acids, gene function will be compromised, potentially generating null mutations.

(B) General Schematic of ZFN induced double strand breaks and subsequent repair by homologous recombination

Figure 3: Zinc finger nucleases designed for targeting the porcine, ovine and human p53 genes.

(A) Canonical model of the designed zinc finger nuclease (z1166) against degenerate target sites encompassing the p53 gene in different species. Arrows indicate likely base contacts. Zinc finger alpha helix sequences, involved in DNA recognition, are indicated (F1, Finger 1, etc.). The designed mutations that alter the cross-strand contacts, to allow species flexibility, are designated by "S" in grey boxes.

(B) The human, porcine and ovine target sequences. Sequence differences are highlighted with grey shading. Non-contacted bases (according to the canonical model of binding) are in lower case. The central gap between the nuclease pair, representing the DNA cutting site, is underlined. 3-bp finger recognition subsites are separated by commas.

(C) The mutations in the zinc finger recognition alpha helices required to target all species.

(D) Alignment of recognition alpha helices with the degenerate DNA recognition site (3'-5'). This binding compatibility is enabled by the rational mutations in panel C. Degenerate bases are written in the standard IUPAC code.
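For readers unfamiliar with the IUPAC degenerate code mentioned in panel (D), the following sketch shows how a degenerate site can be tested against concrete sequences. The degenerate site `GAYRN` and the test sequences are hypothetical examples, not the target sites of Figure 3, and the function names are assumptions.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def iupac_to_regex(site):
    """Convert a degenerate IUPAC site into a regular expression."""
    parts = []
    for code in site:
        bases = IUPAC[code]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


def matches(degenerate_site, sequence):
    """True if the concrete sequence is one instance of the degenerate site."""
    return re.fullmatch(iupac_to_regex(degenerate_site), sequence) is not None


# Hypothetical degenerate site and candidate sequences.
print(iupac_to_regex("GAYRN"))
print(matches("GAYRN", "GACAT"), matches("GAYRN", "GAGAT"))
```

A single degenerate site can thus stand for the slightly different target sequences of several species, which is the idea behind designing one nuclease pair against a degenerate recognition site.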

Figure 4: Lentiviral constructs (expression plasmids) of Pt160 and Pt161 containing expression cassettes of porcine z1166L and z1166R ZFNs

The lentiviral plasmid allows the production of a vector genome that can be encapsidated by transcomplementation function. The plasmids contain all the necessary cis elements for encapsidation of the vector and for ensuring its cycle after it enters a cell. The plasmid vector comprises a self-inactivating (SIN) lentiviral 3'LTR, a cis-acting nucleic acid sequence facilitating the RNA nuclear export, preferably the HIV-1 rev Responsive Element (RRE), one copy of the cPPT and CTS cis-acting regions ("triplex e") of HIV-1, an HIV retroviral packaging nucleic acid sequence comprising an HIV retroviral 5' splice donor sequence, and a psi sequence.

The expression cassette comprises the ZFN encoding gene or sequence under the control of a CMV promoter and the WPRE (post-transcriptional regulatory element).

The Pt160 encodes the porcine z1166L ZFN gene and corresponds to SEQ ID NO: 10.

The Pt161 encodes the z1166R ZFN gene and corresponds to SEQ ID NO: 9.

Figure 5: Transcomplementation plasmid construct (P123, P124 and P125)

The transcomplementation plasmid comprises nucleic acid sequences allowing the expression of the HIV retroviral GAG, POL, TAT and REV proteins, linked to a heterologous polyadenylation signal.

The transcomplementation plasmid encodes none of the vif, vpr, vpu and nef accessory genes. It is also devoid of the psi encapsidation signal as well as of the RRE sequence.

The POL gene is modified in order to provide within the vectors a mutated lentiviral integrase (the integrase gene sequence is boxed) preventing the integration of a recombinant genome into the genome of a host cell.

The transcomplementation vector comprising a D64V mutation of the integrase coding sequence within the POL gene corresponds to SEQ ID NO: 13.

The transcomplementation vector comprising a K186Q/Q214L/Q216L mutation (also herein named "LQ mutation") of the integrase coding sequence within the POL gene corresponds to SEQ ID NO: 14.

The transcomplementation vector comprising a R262A/R263A/K264H mutation (also herein named "N mutation") of the integrase coding sequence within the POL gene corresponds to SEQ ID NO: 15.

Figure 6 : Envelope plasmid construct (P31)

The envelope plasmid comprises an expression cassette of an envelope glycoprotein, for instance the Vesicular Stomatitis Virus glycoprotein (VSV-G), under the control of the CMV promoter. The expression of the VSV-G in encapsidation cells allows pseudotyping, i.e. the production of enveloped vectors possibly bearing a heterologous envelope (VSV-G in place of the HIV envelope).

The Pt31 plasmid (VSV-pMD2.G) corresponds to SEQ ID NO: 16.

Figure 7: Lentiviral constructs (expression plasmids) of Pt132 and Pt133 containing expression cassettes of GFP recognizing ZFNs

The lentiviral plasmid allows the production of a vector genome that can be encapsidated by transcomplementation function. The plasmids contain all the necessary cis elements for encapsidation of the vector and for ensuring its cycle after it enters a cell. The plasmid vector comprises a self-inactivating (SIN) lentiviral 3'LTR, a cis-acting nucleic acid sequence facilitating the RNA nuclear export, preferably the HIV-1 rev Responsive Element (RRE), one copy of the cPPT and CTS cis-acting regions ("triplex e") of HIV-1, an HIV retroviral packaging nucleic acid sequence comprising an HIV retroviral 5' splice donor sequence, and a psi sequence.

The expression cassette comprises the ZFN encoding gene or sequence under the control of a CMV promoter and the WPRE (post-transcriptional regulatory element).

The Pt132 encodes the GFP recognizing ZFN3 gene and corresponds to SEQ ID NO: 11.

The Pt133 encodes the GFP recognizing ZFN4 gene and corresponds to SEQ ID NO: 12.

Figure 8: Efficient GFP inactivation through NHEJ repair mechanism using non-integrating lentiviral vectors expressing GFP recognizing ZFNs.

HeLa H11 cells bearing a unique copy of the GFP reporter gene under the control of a CMV promoter have been transduced with non-integrating vectors (NILVs) bearing a mutation in the integrase gene (D64V or LQ or N mutation). The H11 HeLa cells were treated with various vectors: no vector, mock control (Control group, grey line with squares), cotransduction with D64V-NILV-ZFN3 and D64V-NILV-ZFN4 (Experimental group 1, black line with triangles) or cotransduction with LQ-NILV-ZFN3 and LQ-NILV-ZFN4 (Experimental group 2, black line with rhombs). The group cotransduced with N-NILV-ZFN3 and N-NILV-ZFN4 is not shown on the graph.

The experiment shows that H11 cells treated with the ZFN3 and ZFN4 vectors, whether bearing a D64V or LQ mutation, display extinction of GFP expression, suggesting NHEJ-mediated knockout of the GFP gene in these cells.

Figure 9: Persistence of GFP inactivation after transduction of H11 HeLa cells with GFP recognizing ZFNs

Untreated H11 HeLa cells and H11 HeLa cells treated with D64V-NILV-ZFN3 and D64V-NILV-ZFN4 have been kept in culture for 7, 38 and 85 days post-treatment. The cells treated with the vectors maintain the extinction of GFP over time, demonstrating that the extinction results from a permanent modification of the GFP cassette by the NHEJ repair mechanism.

Figure 10: Efficient GFP inactivation through NHEJ repair mechanism using D64V non-integrating lentiviral vectors expressing GFP recognizing ZFN3 and ZFN4

HeLa H11 cells bearing a unique copy of the GFP reporter gene under the control of a CMV promoter have been transduced with non-integrating vectors bearing a mutation in the integrase gene (D64V-NILV). The H11 HeLa cells were treated with: no vector (Untransduced group), D64V-NILV-ZFN3 alone, D64V-NILV-ZFN4 alone, or both D64V-NILV-ZFN3 and D64V-NILV-ZFN4. Only the group cotransduced with both vectors shows extinction of GFP by the NHEJ repair mechanism, showing that the double strand break inducing NHEJ occurs only when both ZFN3 and ZFN4 are coexpressed.

Figure 11: In vitro cleavage by p53 zinc finger nucleases.

(A) Schematic of palindromic (pts) or heterodimer (ts) DNA target sites of z1166 ZFN. Both strands of DNA are shown with the top strand written 5' to 3' and the bottom strand written 3' to 5'. The primary strand of the 12-bp target sites is highlighted ('1166L' in light grey, '1166R' in dark grey).

(B) Analysis of homo- and heterodimer cleavage reactions. In vitro expressed ZFNs were incubated with a linear target DNA substrate and cleavage products were analyzed by agarose gel electrophoresis. Cleavage of the target DNA results in two DNA molecules of the same size, simplifying the analysis.

(C) The rational nuclease design allows cleavage of individual DNA sequences described within this degenerate target sequence.

Figure 12: TAL effector nucleases (TALENs)

(A) Schematic depiction of the architecture of a pair of TALENs binding a cognate binding site.

Transcription activator-like effector nucleases (TALENs) are artificial proteins consisting of a transcription activator-like effector fused through a short amino-acid linker to the FokI nuclease. For efficient cutting, a pair of TALENs needs to bind their cognate half sites. Depicted in this figure is a pair of TALENs that recognize two 18 base pair sequences separated by a 12 base pair sequence. The FokI-FokI dimer will induce a double strand break in the 12 base pair sequence, or in the 18 base pair recognition sequences, or in a region around the 18 base pair recognition sequence.

(B) Schematic depiction of a single TALEN with a zoom on the amino acid sequence of a repeated motif (34 amino acids) responsible for the nucleic acid interaction, in which the repeat-variable diresidue (RVD) is boxed. In this example, the RVD corresponds to 'NI', which interacts preferentially with the nucleotide 'A'. A non-exhaustive list of other RVDs is presented in a table with their respective preferential nucleotide ('HD' interacts with 'C', 'NN' interacts with 'G' or 'A' and 'NG' interacts with 'T').

(C) Example of a TALEN DNA target site in the porcine p53 gene. Boxed nucleotides (SEQ ID NO: 105) in the upper strand are recognised by TALEN1; boxed nucleotides (SEQ ID NO: 106) in the lower strand are recognised by TALEN2.

Figure 13: Lentiviral constructs (expression plasmids) of Pt205 and Pt206 containing expression cassettes of porcine TALEN1 and TALEN2.

The lentiviral plasmid allows the production of a vector genome that can be encapsidated by transcomplementation function. The plasmids contain all the necessary cis elements for encapsidation of the vector and for ensuring its cycle after it enters a cell. The plasmid vector comprises a self-inactivating (SIN) lentiviral 3'LTR, a cis-acting nucleic acid sequence facilitating the RNA nuclear export, preferably the HIV-1 rev Responsive Element (RRE), one copy of the cPPT and CTS cis-acting regions ("triplex e") of HIV-1, an HIV retroviral packaging nucleic acid sequence comprising an HIV retroviral 5' splice donor sequence, and a psi sequence.

The expression cassette comprises the TALEN-encoding gene or sequence under the control of a CMV promoter and the WPRE (post-transcriptional regulatory element).

The Pt205 encodes the porcine TALEN1 gene and corresponds to SEQ ID NO: 70.

The Pt206 encodes the porcine TALEN2 gene and corresponds to SEQ ID NO: 71.

Figure 14: Typical results of analysis of PCR products from the porcine p53 gene performed on DNA extracted from cells transduced by lentiviral vectors expressing porcine p53 ZFNs or porcine p53 TALENs.

Cells have been transduced with 5 μL of lentiviral vector expressing ZFNp53L and 5 μL of lentiviral vector expressing ZFNp53R (lanes 2 and 3), 10 μL of lentiviral vector expressing ZFNp53L and 10 μL of lentiviral vector expressing ZFNp53R (lanes 4 and 5), 2 μL of lentiviral vector expressing TALEN1 and 2 μL of lentiviral vector expressing TALEN2 (lanes 6 and 7), 10 μL of lentiviral vector expressing TALEN1 and 10 μL of lentiviral vector expressing TALEN2 (lanes 8 and 9), or not transduced (lanes 10 and 11). After transduction and amplification of the cells, DNA is extracted and submitted to PCR to amplify the p53 gene. PCR products are subsequently boiled and annealed and then submitted to digestion by T7 nuclease (lanes 3, 5, 7, 9, 11) or not (lanes 2, 4, 6, 8, 10). (First lane: DNA ladder)

LIST OF SEQUENCES

SEQ ID NO: 1 : Porcine p53 cDNA sequence

SEQ ID NO: 2 : Porcine p53 genomic Sequence

SEQ ID NO: 3 : Human p53 amino acid sequence

SEQ ID NO: 4 : Porcine p53 target site nucleic acid sequence n°1

SEQ ID NO: 5 : z1166R ZFN amino acid sequence

SEQ ID NO: 6 : Porcine z1166L ZFN amino acid sequence

SEQ ID NO: 7 : z1166R ZFP amino acid sequence

SEQ ID NO: 8 : Porcine z1166L ZFP amino acid sequence

SEQ ID NO: 9 : Pt161 (Ptrip-CMV-z1166R-WPRE) nucleotide sequence (p53 ZFN)

SEQ ID NO: 10 : Pt160 (Ptrip-CMV-z1166L-WPRE) nucleotide sequence (p53 ZFN)

SEQ ID NO: 11 : Pt132 (Ptrip-CMV-ZFN3-WPRE) nucleotide sequence (GFP ZFN)

SEQ ID NO: 12 : Pt133 (Ptrip-CMV-ZFN4-WPRE) nucleotide sequence (GFP ZFN)

SEQ ID NO: 13 : Transcomplementation P125 (p8.91-D64V) plasmid sequence

SEQ ID NO: 14 : Transcomplementation P123 (p8.91-LQ) plasmid sequence

SEQ ID NO: 15 : Transcomplementation P124 (p8.91-N) plasmid sequence

SEQ ID NO: 16 : P31 (VSV-pMD2.G) envelope plasmid sequence

SEQ ID NO: 17 : Murine p53 forward PCR primer sequence

SEQ ID NO: 18 : Murine p53 reverse PCR primer sequence

SEQ ID NO: 19 : Porcine p53 forward PCR primer sequence

SEQ ID NO: 20 : Porcine p53 reverse PCR primer sequence

SEQ ID NO: 21 : F1 Zinc finger domain amino acid sequence (i)

SEQ ID NO: 22 : F2 Zinc finger domain amino acid sequence (i)

SEQ ID NO: 23 : F3 Zinc finger domain amino acid sequence (i)

SEQ ID NO: 24 : F4 Zinc finger domain amino acid sequence (i)

SEQ ID NO: 25 : F1 Zinc finger domain amino acid sequence (ii)

SEQ ID NO: 26 : F2 Zinc finger domain amino acid sequence (ii)

SEQ ID NO: 27 : F3 Zinc finger domain amino acid sequence (ii)

SEQ ID NO: 28 : F4 Zinc finger domain amino acid sequence (ii)

SEQ ID NO: 29 : F1 Zinc finger domain amino acid sequence (iii)

SEQ ID NO: 30 : F2 Zinc finger domain amino acid sequence (iii)

SEQ ID NO: 31 : F3 Zinc finger domain amino acid sequence (iii)

SEQ ID NO: 32 : F4 Zinc finger domain amino acid sequence (iii)

SEQ ID NO: 33 : F1 Zinc finger domain amino acid sequence (iv)

SEQ ID NO: 34 : F2 Zinc finger domain amino acid sequence (iv)

SEQ ID NO: 35 : F3 Zinc finger domain amino acid sequence (iv)

SEQ ID NO: 36 : F4 Zinc finger domain amino acid sequence (iv)

SEQ ID NO: 37 : F1 Zinc finger domain amino acid sequence (v)

SEQ ID NO: 38 : F2 Zinc finger domain amino acid sequence (v)

SEQ ID NO: 39 : F3 Zinc finger domain amino acid sequence (v)

SEQ ID NO: 40 : F4 Zinc finger domain amino acid sequence (v)

SEQ ID NO: 41 : Human p53 target site nucleic acid sequence (v)

SEQ ID NO: 42 : Ovine p53 target site nucleic acid sequence

SEQ ID NO: 43 : Murine (mouse) p53 target site nucleic acid sequence

SEQ ID NO: 44 : Target nucleic acid sequence (i)

SEQ ID NO: 45 : Target nucleic acid sequence (ii)

SEQ ID NO: 46 : Target nucleic acid sequence (iii)

SEQ ID NO: 47 : Target nucleic acid sequence (iv)

SEQ ID NO: 48 : Target nucleic acid sequence (v)

SEQ ID NO: 49 : Human z1166L binding site

SEQ ID NO: 50 : Murine z1166L binding site

SEQ ID NO: 51 : Ovine z1166L binding site

SEQ ID NO: 52 : Porcine z1166L binding site

SEQ ID NO: 53 : Primer Target seqLl nucleic acid sequence

SEQ ID NO: 54 : Primer Target seqRl nucleic acid sequence

SEQ ID NO: 55 : Primer T7kozak_FWD nucleic acid sequence

SEQ ID NO: 56 : Primer UnivQQR R NotI nucleic acid sequence

SEQ ID NO: 57 : Wild type cleavage site in GFP nucleic acid sequence

SEQ ID NO: 58 : Mutated cleavage site in GFP nucleic acid sequence through NHEJ

SEQ ID NO: 59 : Mutated cleavage site in GFP nucleic acid sequence through NHEJ

SEQ ID NO: 60 : Mutated cleavage site in GFP nucleic acid sequence through NHEJ

SEQ ID NO: 61 : Mutated cleavage site in GFP nucleic acid sequence through NHEJ

SEQ ID NO: 62 : Porcine cleavage site in the p53 gene

SEQ ID NO: 63 : z1166R binding site

SEQ ID NO: 64 : GFP forward PCR primer sequence

SEQ ID NO: 65 : GFP reverse PCR primer sequence

SEQ ID NO: 66 : Polymorphic porcine p53 target site nucleic acid sequence

SEQ ID NO: 67 : Porcine p53 target site nucleic acid sequence n°2

SEQ ID NO: 68 : TALEN1 nucleic acid sequence

SEQ ID NO: 69 : TALEN2 nucleic acid sequence

SEQ ID NO: 70 : Pt205 (Ptrip-CMV-TALEN1-WPRE) nucleic acid sequence

SEQ ID NO: 71 : Pt206 (Ptrip-CMV-TALEN2-WPRE) nucleic acid sequence

SEQ ID NO: 72 : p53 wild type target site n°1

SEQ ID NO: 73 : p53 target site (clone A9) mutated through NHEJ

SEQ ID NO: 74 : p53 target site (clone H4) mutated through NHEJ

SEQ ID NO: 75 : p53 target site (clone A3) mutated through NHEJ

SEQ ID NO: 76 : p53 target site (clone B4) mutated through NHEJ

SEQ ID NO: 77 : p53 target site (clone C3) mutated through NHEJ

SEQ ID NO: 78 : p53 target site (clone F6) mutated through NHEJ

SEQ ID NO: 79 : p53 target site (clone G2) mutated through NHEJ

SEQ ID NO: 80 : p53 target site (clone H12) mutated through NHEJ

SEQ ID NO: 81 : p53 wild type target site n°2

SEQ ID NO: 82 : p53 target site (clone A7) mutated through NHEJ

SEQ ID NO: 83 : p53 target site (clone B9) mutated through NHEJ

SEQ ID NO: 84 : p53 target site (clone E1) mutated through NHEJ

SEQ ID NO: 85 : p53 target site (clone H4) mutated through NHEJ

SEQ ID NO: 86 : p53 target site (clone A8) mutated through NHEJ

SEQ ID NO: 87 : p53 target site (clone D8) mutated through NHEJ

SEQ ID NO: 88 : p53 target site (clone F11) mutated through NHEJ

SEQ ID NO: 89 : p53 target site (clone E1) mutated through NHEJ

SEQ ID NO: 90 : p53 target site (clone F2) mutated through NHEJ

SEQ ID NO: 91 : p53 target site (clone F5) mutated through NHEJ

SEQ ID NO: 92 : p53 target site (clone F6) mutated through NHEJ

SEQ ID NO: 93 : p53 wild type target site n°3

SEQ ID NO: 94 : p53 target site 1 mutated through NHEJ

SEQ ID NO: 95 : p53 target site 2 mutated through NHEJ

SEQ ID NO: 96 : p53 target site 3 mutated through NHEJ

SEQ ID NO: 97 : p53 target site 4 mutated through NHEJ

SEQ ID NO: 98 : p53 target site 5 mutated through NHEJ

SEQ ID NO: 99 : p53 target site 6 mutated through NHEJ

SEQ ID NO: 100 : p53 target site 7 mutated through NHEJ

SEQ ID NO: 101 : p53 target site 8 mutated through NHEJ

SEQ ID NO: 102 : p53 target site 9 mutated through NHEJ

SEQ ID NO: 103 : wild type GFP sequence

SEQ ID NO: 104 : GFP sequence mutated through NHEJ

SEQ ID NO: 105 : TALEN1 DNA recognition site

SEQ ID NO: 106 : TALEN2 DNA recognition site

SEQ ID NO: 107 : Porcine cleavage site 2 in the p53 gene

SEQ ID NO: 108 : Porcine cleavage site 3 in the p53 gene

SEQ ID NO: 109 : TALE1 nucleic sequence

SEQ ID NO: 110 : TALE2 nucleic sequence

SEQ ID NO: 111 : TALEN1 amino acid sequence

SEQ ID NO: 112 : TALEN2 amino acid sequence

SEQ ID NO: 113 : TALEN DNA binding domain - module 1 amino acid sequence

SEQ ID NO: 114 : TALEN DNA binding domain - module 2 amino acid sequence

SEQ ID NO: 115 : TALEN DNA binding domain - module 3 amino acid sequence

SEQ ID NO: 116 : TALEN DNA binding domain - module 4 amino acid sequence

SEQ ID NO: 117 : TALEN DNA binding domain - module 5 amino acid sequence

SEQ ID NO: 118 : Polymorphic porcine p53 target site nucleic acid sequence

SEQ ID NO: 119 : TALE amino acid sequence

SEQ ID NO: 120 : TALE2 amino acid sequence

DETAILED DESCRIPTION

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art of cell culture, molecular genetics, nucleic acid chemistry and biochemistry.

Unless otherwise indicated, the practice of the present invention employs conventional techniques of chemistry, molecular biology, microbiology, recombinant DNA technology, chemical methods, pharmaceutical formulations and delivery and treatment of patients, which are within the capabilities of a person of ordinary skill in the art. Such techniques are also explained in the literature, for example, J. Sambrook, E. F. Fritsch, and T. Maniatis, 1989, Molecular Cloning: A Laboratory Manual, Second Edition, Books 1-3, Cold Spring Harbor Laboratory Press; Ausubel, F. M. et al. (1995 and periodic supplements), Current Protocols in Molecular Biology, ch. 9, 13, and 16, John Wiley & Sons, New York, N.Y.; B. Roe, J. Crabtree, and A. Kahn, 1996, DNA Isolation and Sequencing: Essential Techniques, John Wiley & Sons; J. M. Polak and James O'D. McGee, 1990, In Situ Hybridisation: Principles and Practice, Oxford University Press; M. J. Gait (Editor), 1984, Oligonucleotide Synthesis: A Practical Approach, IRL Press; and D. M. J. Lilley and J. E. Dahlberg, 1992, Methods in Enzymology: DNA Structure Part A: Synthesis and Physical Analysis of DNA, Academic Press. Each of these general texts is herein incorporated by reference.

The term 'amino acid' in the context of the present invention is used in its broadest sense and is meant to include naturally occurring L α-amino acids or residues. The commonly used one and three letter abbreviations for naturally occurring amino acids are used herein: A=Ala; C=Cys; D=Asp; E=Glu; F=Phe; G=Gly; H=His; I=Ile; K=Lys; L=Leu; M=Met; N=Asn; P=Pro; Q=Gln; R=Arg; S=Ser; T=Thr; V=Val; W=Trp; and Y=Tyr (Lehninger, A. L., (1975) Biochemistry, 2d ed., pp. 71-92, Worth Publishers, New York). The general term 'amino acid' may further include D-amino acids, retro-inverso amino acids as well as chemically modified amino acids such as amino acid analogues, naturally occurring amino acids that are not usually incorporated into proteins such as norleucine, and chemically synthesised compounds having properties known in the art to be characteristic of an amino acid, such as β-amino acids. For example, analogues or mimetics of phenylalanine or proline, which allow the same conformational restriction of the peptide compounds as do natural Phe or Pro, are included within the broad definition of amino acid. Such analogues and mimetics are referred to herein as 'functional equivalents' of the respective amino acid. Other examples of amino acids are listed by Roberts and Vellaccio, The Peptides: Analysis, Synthesis, Biology, Gross and Meienhofer, eds., Vol. 5 p. 341, Academic Press, Inc., N.Y. 1983, which is incorporated herein by reference.

The term 'peptide' as used herein (e.g. in the context of a zinc finger peptide (ZFP) or zinc finger nuclease (ZFN)) refers to a plurality of amino acids joined together in a linear or circular chain. The term oligopeptide is typically used to describe peptides having between 2 and about 50 or more amino acids. Peptides larger than about 50 amino acids are often referred to as polypeptides or proteins. For purposes of the present invention, however, the term 'peptide' is not limited to any particular number of amino acids, and is used interchangeably with the terms 'polypeptide' and 'protein'.

The terms 'nucleic acid', 'polynucleotide', and 'oligonucleotide' are used interchangeably and refer to a deoxyribonucleotide (DNA) or ribonucleotide (RNA) polymer, in linear or circular conformation, and in either single- or double-stranded form. For the purposes of the present invention such DNA or RNA polymers may include natural nucleotides, non-natural or synthetic nucleotides, and mixtures thereof. Non-natural nucleotides may include analogues of natural nucleotides, as well as nucleotides that are modified in the base, sugar and/or phosphate moieties (e.g. phosphorothioate backbones). Examples of modified nucleic acids are PNAs and morpholino nucleic acids. Generally an analogue of a particular nucleotide has the same base-pairing specificity, i.e. an analogue of G will base-pair with C. For the purposes of the invention, these terms are not to be considered limiting with respect to the length of a polymer.

The present disclosure provides new zinc finger peptides (ZFPs) and zinc finger nucleases (ZFNs) comprising a ZFP and a nuclease domain as well as new transcription activator-like (TAL) effector peptides (TALEs) and TAL effector nucleases (TALENs) comprising a TALE and a nuclease domain.

Zinc finger and Zinc finger nucleases

A 'zinc finger peptide' (or ZFP) refers to a polypeptide that comprises one or more zinc finger domains. Generally, a ZFP of the invention is a peptide or portion of a larger polypeptide that performs the function of recognising / targeting / binding a desired nucleic acid sequence. Meanwhile, a 'zinc finger nuclease' (or ZFN) comprises a nuclease or cleavage domain (or portion) in addition to a ZFP domain (or portion). For expediency, when ZFP is referred to herein in connection with a particular aspect, embodiment or feature, the term should be taken to include ZFN, unless it is apparent that a ZFN in the context of that aspect, embodiment or feature would not be intended or appropriate. A single ZFN may or may not have nucleic acid cleaving activity in isolation from other peptides and particularly other ZFNs. Thus, the nuclease domain of a ZFN may be an active nuclease domain or an inactive portion or fragment of a nuclease domain. Nuclease domains that are only functional as dimers (or multimers) may be considered to be partial or 'half' nuclease or cleavage domains. Nuclease domains that are only functional as dimers may be engineered so as to function as homo- and heterodimers, homodimers only, or heterodimers only, depending on requirements. Preferred for use herein are nuclease domains that only function as heterodimers, i.e. in conjunction with another (perhaps closely related but) different nuclease domain.

As used herein, the term 'zinc finger domain' refers to an individual 'finger', which comprises a ββα-fold stabilized by a zinc ion. Each zinc finger domain typically includes approximately 30 amino acids. The term 'domain' (or 'module'), according to its ordinary usage in the art, refers to a discrete continuous part of the amino acid sequence of a polypeptide that can be equated with a particular function. Zinc finger domains are largely structurally independent and may retain their structure and function in different environments. Typically, a zinc finger domain recognizes and binds a triplet or (overlapping) quadruplet nucleotide sequence in a double-stranded DNA target sequence. Adjacent zinc finger domains arranged in tandem are joined together by linker sequences. A ZFP or ZFN of the invention is composed of a plurality of zinc finger domains, which in combination do not exist in nature. Therefore, they may be considered to be artificial, synthetic or engineered ZFPs or ZFNs. Zinc finger proteins generally contain strings or chains of zinc finger domains (or modules). Thus, a natural zinc finger protein may include 2 or more zinc finger domains, which may be directly adjacent one another (i.e. separated by a short (canonical) linker sequence), or may be separated by longer, flexible or structured polypeptide sequences. Directly adjacent zinc finger domains are expected to bind to contiguous nucleic acid sequences, i.e. to adjacent trinucleotides / triplets. In some cases, cross-binding may also occur between adjacent zinc fingers and their respective target triplets, which helps to strengthen or enhance the recognition of the target sequence, and leads to the binding of overlapping quadruplet sequences (Isalan et al., (1997) Proc. Natl. Acad. Sci. USA, 94: 5617-5621). By comparison, distant zinc finger domains within the same protein may recognise (or bind to) non-contiguous nucleic acid sequences or even to different molecules (e.g. protein rather than nucleic acid).

When binding to a nucleic acid sequence, a zinc finger domain generally interacts mainly with one strand of a double stranded nucleic acid molecule (the primary strand or sequence). However, there can be subsidiary interactions between amino acids of a zinc finger domain and the complementary (or secondary) strand of the double-stranded nucleic acid molecule. On binding to dsDNA, the α-helix of the zinc finger domain almost invariably lies within the major groove and aligns anti-parallel to the target nucleic acid strand. Accordingly, the primary nucleic acid sequence is arranged 3' to 5' in order to correspond with the N-terminal to C-terminal sequence of the ZFP. Since nucleic acid sequences are conventionally written 5' to 3', and amino acid sequences N-terminus to C-terminus, when a target nucleic acid sequence and a ZFP sequence are aligned according to convention, the primary interaction of the ZFP is with the complementary (or minus) strand of the nucleic acid sequence, since it is this strand which is aligned 3' to 5' (see e.g. Figures 1, 2 or 3). These conventions are followed in the nomenclature used herein.
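The strand convention described above can be illustrated with a minimal Python sketch. The 12-bp sequence used here is an arbitrary placeholder, not one of the sequences of this disclosure, and the function names are hypothetical. The sketch assumes a four-finger ZFP in which each finger contacts one non-overlapping triplet; it ignores the overlapping quadruplet contacts mentioned earlier.

```python
# Illustrative only: placeholder sequence, hypothetical helper names.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement_3_to_5(plus_strand: str) -> str:
    """Return the minus strand read 3'->5' (base-by-base complement).

    The plus strand is given 5'->3', so the complement written in the
    same left-to-right order is the minus strand in 3'->5' orientation.
    """
    return "".join(COMPLEMENT[base] for base in plus_strand.upper())


def finger_triplets(plus_strand: str) -> dict:
    """Assign consecutive triplets of the primary (minus) strand to F1, F2, ...

    F1 is the N-terminal finger, so it is paired with the 3'-most triplet
    of the primary strand, following the convention in the text.
    """
    primary = complement_3_to_5(plus_strand)
    if len(primary) % 3 != 0:
        raise ValueError("target length must be a multiple of 3")
    return {f"F{i // 3 + 1}": primary[i:i + 3] for i in range(0, len(primary), 3)}


if __name__ == "__main__":
    site = "GATTACAGGCTA"  # hypothetical 12-bp site, written 5'->3'
    print(complement_3_to_5(site))
    print(finger_triplets(site))
```

For a 12-bp site the dictionary has four entries, one per finger, in N-terminal to C-terminal order.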

A nucleic acid 'target', 'target site' or 'target sequence', as used herein, is a nucleic acid sequence to which a ZFP or ZFN of the invention will bind, provided that conditions of the binding interaction are not prohibitive. A target site may be a nucleic acid molecule or a portion of a larger polynucleotide. Particularly suitable target sites comprise mutated (i.e. abnormal) genetic sequences. In accordance with the invention, a target sequence for a ZFP of the invention may comprise a single contiguous nucleic acid sequence, or more than one non-contiguous nucleic acid sequence (e.g. two separate contiguous sequences, each representing a partial target site), which are interspersed by one or more intervening nucleotide or sequence of nucleotides. Where the target sequences are partial, they may be bound by the same or different ZFPs. Where the target sequence is bound by two ZFPs (e.g. as for dimeric ZFNs used in the present invention), the partial target sites may be on opposite strands of a double-stranded nucleic acid molecule. The above terms may also be substituted or supplemented with the terms 'binding site', 'binding sequence', 'recognition site' or 'recognition sequence', which are used interchangeably. In the context of the present invention, examples of such target sites are SEQ ID NO: 46 and 48.

The specificity of ZFNs depends on artificially-engineered DNA-binding domains: multi-zinc finger arrays that recognize long DNA sequences. There exists a large body of work on zinc finger engineering (reviewed in Pabo et al., (2001) Annu. Rev. Biochem. 70: 313-340). Briefly, the established engineering methods include (amongst others) rational design (Sera & Uranga (2002) Biochemistry 41: 7074-7081), modular assembly with pre-made fingers (Beerli & Barbas (2002) Nat. Biotechnol. 20: 135-141), overlapping finger assembly (Greisman & Pabo (1997) Science 275: 657-661; Isalan et al., (2001) Nat. Biotechnol. 19: 656-660) and bacterial two-hybrid selection (Hurt et al., (2003) Proc. Natl. Acad. Sci. USA 100: 12271-12276).

A recent development is the emergence of two publicly available sources of zinc fingers, the academic Zinc Finger Consortium (ZFC; www.zincfingers.org) and the commercially available CompoZr, offered by Sigma Aldrich (Pearson (2008) Nature 455: 160-164). In particular, the ZFC has provided a variety of open source tools for the community to employ (Wright et al., (2006) Nat. Protoc. 1: 1637-1652).

Although both sources facilitate obtaining ZFNs, these have to be tested on a case-by-case basis for in vivo functionality, and suboptimal candidates often have to be abandoned because of a lack of straightforward optimization protocols. Whereas screening systems exist for 1- to 2-finger libraries (e.g. phage display) and 3-finger mini-libraries of pre-selected modules (e.g. bacterial two-hybrid: Hurt et al. (2003)), no straightforward system exists to optimize the 4- to 6-finger type scaffolds (as provided by Sigma Aldrich, for example). Another type of selection or maturation system that may be used in accordance with the invention to identify DNA-binding peptides is the one-hybrid system, such as yeast one-hybrid systems. In these assays, polypeptides that bind to a 'bait element' (e.g. a cis-acting regulatory element or any other short DNA sequence), are identified by detecting the activation or repression of a reporter gene (see for example, Wei et al. (1999) Mol. Cell. Biol. 19: 1271-1278). In a similar fashion to two-hybrid systems, in the one-hybrid system the open-reading frame of the polypeptide of interest is fused to a transcriptional repressor or activator domain (generally an activator) of a transcription factor. When the fusion protein binds to a promoter of interest (i.e. the bait element) through its cognate DNA binding domain, reporter gene expression is activated by the activation domain part of the fusion protein and these active proteins can be selected for in resultant colonies (e.g. growing yeast cells).

TAL effector and TAL effector nucleases

A 'TAL effector' (or TALE) refers to a polypeptide that comprises one or more DNA binding domains. Generally, a TALE of the invention is a peptide or portion of a larger polypeptide that performs the function of recognising / targeting / binding a desired nucleic acid sequence. Meanwhile, a 'TALE nuclease' (or TALEN) comprises a nuclease or cleavage domain (or portion) in addition to a TALE domain (or portion). For expediency, when TALE is referred to herein in connection with a particular aspect, embodiment or feature, the term should be taken to include TALEN, unless it is apparent that a TALEN in the context of that aspect, embodiment or feature would not be intended or appropriate. A single TALEN may or may not have nucleic acid cleaving activity in isolation from other peptides and particularly other TALENs. Thus, the nuclease domain of a TALEN may be an active nuclease domain or an inactive portion or fragment of a nuclease domain. Nuclease domains that are only functional as dimers (or multimers) may be considered to be partial or 'half' nuclease or cleavage domains. Nuclease domains that are only functional as dimers may be engineered so as to function as homo- and heterodimers, homodimers only, or heterodimers only, depending on requirements. Preferred for use herein are nuclease domains that only function as heterodimers, i.e. in conjunction with another (perhaps closely related but) different nuclease domain.

In the context of the TALE or TALEN technology, the term 'DNA binding domain' refers to an individual 'module' or a string of individual modules, said modules comprising 33 to 35, typically 34 amino acids among which the 12th and the 13th are particularly responsible for the specificity of the interaction with a particular nucleotide. Typically, an individual module of a TALE DNA binding domain recognizes and binds a single nucleotide in a double-stranded DNA target sequence. TALE proteins generally contain strings or chains of DNA binding domains (or modules). Adjacent modules of the DNA binding domain recognize adjacent nucleotides in the target DNA sequence. Adjacent repeated DNA binding domains are joined together and flanked by TALE backbone elements originating for instance from the natural protein, said elements being possibly modified. A TALE or TALEN of the invention is composed of a plurality of DNA binding domains, which in combination do not exist in nature. Therefore, they may be considered to be artificial, synthetic or engineered TALEs or TALENs. When binding to a nucleic acid sequence, a DNA binding domain of a TALE or of a TALEN generally interacts mainly with one strand of a double stranded nucleic acid molecule (the primary strand or sequence). The primary nucleic acid sequence is arranged 5' to 3' in order to correspond with the N-terminal to C-terminal sequence of the TALE. Since nucleic acid sequences are conventionally written 5' to 3', and amino acid sequences N-terminus to C-terminus, when a target nucleic acid sequence and a TALE sequence are aligned according to convention, the primary interaction of the TALE is with the plus strand of the nucleic acid sequence, since it is this strand which is aligned 5' to 3' (see e.g. Figure 12). These conventions are followed in the nomenclature used herein.

A nucleic acid 'target', 'target site' or 'target sequence', as used herein, is a nucleic acid sequence to which a TALE or TALEN of the invention will bind, provided that conditions of the binding interaction are not prohibitive. A target site may be a nucleic acid molecule or a portion of a larger polynucleotide. Particularly suitable target sites comprise mutated (i.e. abnormal) genetic sequences. In accordance with the invention, a target sequence for a TALE of the invention may comprise a single contiguous nucleic acid sequence, or more than one non-contiguous nucleic acid sequence (e.g. two separate contiguous sequences, each representing a partial target site), which are interspersed by one or more intervening nucleotide or sequence of nucleotides. Where the target sequences are partial, they may be bound by the same or by different TALEs. Where the target sequence is bound by two TALEs (e.g. as for dimeric TALENs used in the present invention), the partial target sites may be on opposite strands of a double-stranded nucleic acid molecule. The above terms may also be substituted or supplemented with the terms 'binding site', 'binding sequence', 'recognition site' or 'recognition sequence', which are used interchangeably. In the context of the present invention, examples of such target sites are SEQ ID NO: 105 and 106.
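The arrangement of two partial target sites on opposite strands, separated by intervening nucleotides, can be sketched in Python as follows. This is only an illustration under stated assumptions: the sequences are short arbitrary placeholders (not SEQ ID NO: 105 or 106), the function names are hypothetical, and exact string matching stands in for real binding behaviour.

```python
# Illustrative only: toy sequences, hypothetical helper names.
_COMP = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMP)[::-1]


def find_talen_pair(plus_strand: str, left_site: str, right_site: str):
    """Locate a pair of half sites lying on opposite strands.

    left_site is written 5'->3' on the plus strand; right_site is written
    5'->3' on the minus strand. Returns (left_start, spacer) where spacer
    is the plus-strand sequence between the two half sites, or None if
    the pair is not found in that order.
    """
    plus_strand = plus_strand.upper()
    left_start = plus_strand.find(left_site.upper())
    if left_start < 0:
        return None
    left_end = left_start + len(left_site)
    # The minus-strand site appears on the plus strand as its reverse complement.
    right_start = plus_strand.find(reverse_complement(right_site), left_end)
    if right_start < 0:
        return None
    return left_start, plus_strand[left_end:right_start]


if __name__ == "__main__":
    hit = find_talen_pair("GGACGTACTTTTTGACTGAA", "ACGTAC", "CAGTCA")
    print(hit)
```

The length of the returned spacer can then be compared with the spacing a given nuclease pair requires, for instance the 12 base pair spacer depicted in Figure 12.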

The specificity of TALENs depends on artificially-engineered DNA-binding domains. These DNA binding domains are a string of 33 to 35, typically 34, imperfect amino acid repeats (Schornack et al, 2007, Proc. Natl. Acad. Sci. USA 104, 10720). Polymorphism is primarily at repeat positions 12 and 13, which are called the repeat-variable diresidue (RVD). RVDs of TAL effectors correspond directly to the nucleotides in their target sites, typically one RVD to one nucleotide, with some degeneracy and no apparent context dependence (Moscou and Bogdanove (2009) Science, 326: 1501; Boch et al (2009) Science 326: 1509-1512, see Figure 12B for examples). These repeats are flanked at the C terminus and N terminus by elements responsible for the molecular architecture of the TALE, possibly originating from a natural TALE, for instance from a TAL effector originating from Xanthomonas bacteria.
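The one-RVD-to-one-nucleotide correspondence can be illustrated with a short Python sketch. It uses only the four RVD preferences listed for Figure 12B (NI for A, HD for C, NN for G or A, NG for T); the function names and the example RVD string are hypothetical, and a real TALE array may contain other RVDs not covered here.

```python
# Illustrative only: covers the four RVDs named in the text, nothing more.
RVD_TO_BASES = {"NI": "A", "HD": "C", "NN": "GA", "NG": "T"}

# IUPAC codes for the base sets above; R denotes a purine (A or G).
_IUPAC = {
    frozenset("A"): "A",
    frozenset("C"): "C",
    frozenset("T"): "T",
    frozenset("GA"): "R",
}


def predicted_site(rvds) -> str:
    """Return the preferred target, 5'->3', as an IUPAC string."""
    return "".join(_IUPAC[frozenset(RVD_TO_BASES[rvd])] for rvd in rvds)


def matches(rvds, dna: str) -> bool:
    """True if every base of dna is among the preferred bases of its RVD."""
    dna = dna.upper()
    return len(rvds) == len(dna) and all(
        base in RVD_TO_BASES[rvd] for rvd, base in zip(rvds, dna)
    )


if __name__ == "__main__":
    array = ["NI", "HD", "NN", "NG"]  # hypothetical four-repeat array
    print(predicted_site(array))
    print(matches(array, "ACGT"), matches(array, "ACCT"))
```

The degeneracy mentioned in the text appears here as the NN repeat accepting either G or A.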

Several methods for constructing TALENs have been described (Morbitzer et al, 2011, Nucleic Acids Research, 39:5790-5799; Zhang et al, 2011, Nature Biotechnology, 29:149-153; Geissler et al, 2011, PLoS One, 6:e19509; Weber et al, 2011, PLoS One, 6:e19722; Li et al, 2011, Nucleic Acids Research, 39:6315-6325; Cermak et al, 2011, Nucleic Acids Research, 36:e82; Sander et al, Nature Biotechnology, 29:697-698; Huang et al, 2011, Nature Biotechnology, 29:699-700). Many of these methods enable production of TAL effector repeat arrays composed of only a certain fixed number of repeats. One of these methods is based on modifying the molecular architecture of the TALEN PthXo1 from the rice pathogen X. oryzae pv. oryzae (Yang et al. 2006). A commercial high-throughput platform exists (Cellectis Bioresearch has announced the capability to produce 7,200 TALENs per year), but details of this proprietary method are not publicly available. In addition, a recent report has suggested limits to the targeting range of TALENs based on a computational analysis of naturally occurring TAL effector binding sites (Cermak et al, 2011, Nucleic Acids Research, 36:e82). Currently no cost-effective and high-throughput method for constructing these nucleases exists and TALENs have to be tested case by case.

In general, the genetically modified animal or cell of the invention is generated using a zinc finger nuclease-mediated or a TAL effector nuclease-mediated genetic modification process. The process for modifying a chromosomal sequence typically comprises: (a) introducing into an embryo or cell at least one nucleic acid encoding a zinc finger nuclease or a TAL effector nuclease that recognizes a target sequence in the chromosomal sequence and is able to cleave a site in the chromosomal sequence, and, optionally, (i) at least one donor polynucleotide comprising a sequence for integration flanked by an upstream sequence and a downstream sequence that share substantial sequence identity with either side of the cleavage site, or (ii) at least one exchange polynucleotide comprising a sequence that is substantially identical to a portion of the chromosomal sequence at the cleavage site and which further comprises at least one nucleotide change; and (b) culturing the embryo or cell to allow expression of the zinc finger nuclease or TAL effector nuclease such that the zinc finger nuclease or TAL effector nuclease introduces a double-stranded break into the chromosomal sequence, and wherein the double-stranded break is repaired by (i) a non-homologous end-joining (NHEJ) repair process such that an inactivating mutation is introduced into the chromosomal sequence, or (ii) a homology-directed repair process such that the sequence in the donor polynucleotide is integrated into the chromosomal sequence or the sequence in the exchange polynucleotide is exchanged with the portion of the chromosomal sequence (Figure 2).

The ZFNs and TALENs herein described may be used for genetic in vitro and/or in vivo engineering. More specifically, the ZFNs and TALENs of the invention are capable of binding to and cleaving a p53 gene sequence. The ZFNs and TALENs are of particular importance in a method for mutating or replacing nucleic acid sequences in the p53 gene, for example the porcine gene.

The ZFNs and TALENs can be used to mutate, delete or modify the expression of the target gene through non-homologous end joining (NHEJ) induced by genomic double-strand break.

Alternatively, the ZFNs and TALENs may be used in conjunction with a donor nucleic acid molecule capable of homologous recombination (HR) with the target nucleic acid sequence, in order to add a sequence of interest such as a marker gene or to replace the porcine p53 sequence by any desired nucleic acid sequence such as a p53 dominant negative mutant, preferably a human version thereof. A method for cleaving a porcine p53 gene sequence of SEQ ID NO: 1, SEQ ID NO: 2 or preferably of SEQ ID NO: 4 (or of a polymorphic sequence thereof such as SEQ ID NO: 66) or of SEQ ID NO: 67 (or of a polymorphic sequence thereof such as SEQ ID NO: 118) is in particular herein described.

Although the ability to design ZFNs and TALENs targeted to virtually any DNA sequence of interest has been suggested (Bogdanove et al., 2011, Science 333:1843-1846), only a very limited number of endogenous genes have been altered using TALENs in the published literature to date. These alterations concern only species such as zebrafish and rodent cells. The number of available reports describing genetically modified animals using these technologies is further limited and in particular no pig containing a modified sequence of the p53 gene has been described until now. The rationale guiding the general architecture design of the TALE and TALEN is to date poorly understood and no system is available for the systematic generation of TALE and TALEN dimers. In addition, the definition of the optimal DNA target site is also a critical parameter. Similarly, the design of a ZFN has to take into account the compatibility of each ZFP domain with other ZFP domains of the ZFN, in addition to the code driving ZFP interaction with nucleotide sequences.

A 'gene', as used herein, is the segment of nucleic acid (typically DNA) that is involved in producing a polypeptide or ribonucleic acid gene product. It includes regions preceding and following the coding region (leader and trailer) as well as intervening sequences (introns) between individual coding segments (exons). Conveniently, this term also includes the necessary control sequences for gene expression (e.g. enhancers, silencers, promoters, terminators etc.), which may be adjacent to or distant to the relevant coding sequence, as well as the coding and/or transcribed regions encoding the gene product. The phrase 'genomic sequence' therefore, includes genes and fragments or partial gene sequences. It may refer to non-coding and/or coding gene sequences. Such genomic sequences may be within a gene (e.g. in its natural or cellular context), or they may be isolated or extracted from the corresponding gene, such as an isolated nucleic acid molecule, vector or plasmid. Genomic sequences include those present in chromosomes whether natural or artificial, and whether in the nucleus or other organelles (e.g., mitochondria and chloroplasts). The genomic sequences may be animal (e.g. mammals, such as humans), or may be viral, parasitic, bacterial or fungal (e.g. yeast). Furthermore, the sequences that are to be targeted by the ZFPs or ZFNs or TALEs or TALENs of the invention can be normal (i.e., wild-type) or mutant (or abnormal). Mutant sequences may include point mutations, insertions, deletions, translocations and rearrangements. A particularly suitable target genomic sequence for the ZFPs and ZFNs and TALEs and TALENs herein described is the p53 gene, or a portion or fragment (isolated or endogenous) thereof. Still more suitably the sequence is porcine or human (or derived from a porcine or human gene sequence).

A suitable target genomic sequence can further be a sequence comprising polymorphism(s), i.e., at least 80%, 85%, 90%, 95%, 98%, or preferably 99% identical to the wild type porcine or human target genomic sequence, or in other words of less than 100% identity to the wild type porcine or human target genomic sequence, e.g., preferably at least 94% identical to SEQ ID NO: 4, 46, 48, 52, 67, 72, 81 or 93. Even more preferably a suitable target genomic sequence is a polymorphic sequence of SEQ ID NO:4 (see SEQ ID NO: 66) or a polymorphic sequence of SEQ ID NO: 67 (see SEQ ID NO: 118).
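The identity thresholds above can be illustrated with a simple position-by-position comparison. The following is a minimal sketch, assuming ungapped sequences of equal length; the function name `percent_identity` is illustrative and the three sequences are the 32-nucleotide p53 target sites quoted elsewhere in this description (SEQ ID NO: 4, 41 and 42). A real comparison of longer genomic sequences would normally rely on a proper alignment tool.

```python
# Target sites as quoted in this description (lower case marks species differences).
PIG_SEQ4 = "CCACCATCCACTACAACTtCATGTGcAACAGc"      # SEQ ID NO: 4
HUMAN_SEQ41 = "CCACCATCCACTACAACTACATGTGTAACAGT"   # SEQ ID NO: 41
OVINE_SEQ42 = "CCACCATCCACTACAACTtCATGTGTAACAGc"   # SEQ ID NO: 42


def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two ungapped sequences."""
    a, b = seq_a.upper(), seq_b.upper()
    if len(a) != len(b):
        raise ValueError("this sketch only compares sequences of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    print(percent_identity(PIG_SEQ4, OVINE_SEQ42))
    print(percent_identity(PIG_SEQ4, HUMAN_SEQ41))
```

Under these assumptions, the ovine site differs from the porcine site at one position out of 32 and the human site at three positions out of 32.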

In a particular embodiment, the method comprises contacting a nucleic acid molecule comprising the p53 gene sequence to be cleaved with one or more zinc finger nucleases (ZFNs), the ZFN comprising a zinc finger peptide (ZFP) portion comprising four or more zinc finger domains and one or more endonuclease portions. Advantageously, for the targeting of nucleic acid sequences within the pig genome, the ZFP portion recognizes at least 12bp (e.g. 12bp, 15bp, 18bp or more) (target site) of the p53 gene sequence. The amino acid sequences of the nucleic acid recognition regions of the ZFP domains may be designed rationally and/or by random mutagenesis and screening or selection. Therefore, the ZFP is generally not naturally occurring and is thus an engineered ZFP. Following cleavage of the target nucleic acid sequence, a gene sequence or segment may be mutated, deleted or replaced.

In another embodiment, a method for mutating a porcine p53 gene sequence is herein described. This method comprises contacting a nucleic acid molecule comprising the porcine p53 gene sequence to be mutated with one or more zinc finger nucleases (ZFNs), the ZFN comprising a zinc finger peptide (ZFP) portion comprising four or more engineered zinc finger domains and one or more endonuclease portions. The ZFP portion beneficially recognizes at least 12bp of the p53 gene sequence in a region of the p53 gene sequence to be mutated, and the ZFN cleaves the p53 gene sequence within the region of the p53 gene sequence to be mutated, such that cleavage of the p53 gene sequence by the ZFN stimulates mutation of the p53 gene sequence by non-homologous end joining (NHEJ) through an imperfect repair mechanism.

In other embodiments, the methods may comprise contacting the nucleic acid molecule comprising the porcine p53 gene sequence with a first and a second ZFN, the first ZFN comprising a first ZFP portion and a first endonuclease portion, and the second ZFN comprising a second ZFP portion and a second endonuclease portion. The first and second ZFP portions may each comprise four or more engineered zinc finger domains, the first ZFP portion recognizing a first porcine p53 gene sequence comprising at least 12bp and the second ZFP portion recognizing a second porcine p53 gene sequence comprising at least 12bp. Suitably, the first and second porcine p53 gene sequences do not overlap with each other. More suitably, the two sequences are located within approximately 100bp of each other within the porcine p53 gene sequence, still more preferably within approximately 50bp of each other. In some cases, the sequences may be within 30bp, within 20bp or within 10bp of each other. In these embodiments, the first and second endonuclease portions of each polypeptide are beneficially partial nucleic acid cleavage domains, which are unable to cleave the target nucleic acid in isolation, but are able to interact (e.g. associate or bind to one another) to constitute a functional endonuclease domain. The partial endonuclease domains may be capable of homodimerisation (associating with another identical domain) or may only be capable of heterodimerisation in order to form a functional endonuclease domain. Once both ZFNs are simultaneously bound to their respective target sites, the porcine genomic p53 sequence of SEQ ID NO: 2, preferably SEQ ID NO: 4 or a polymorphic sequence of SEQ ID NO: 4 (see for example SEQ ID NO: 66), may therefore be cleaved.

In yet another aspect of the invention there is provided a polypeptide comprising a non-naturally occurring zinc finger peptide (ZFP), the ZFP preferably comprising four or more engineered zinc finger domains, wherein the amino acid sequence of the nucleic acid recognition region (i.e. positions -1, +1, +2, +3, +4, +5, +6) of each of the four engineered zinc finger domains (numbered F1, F2, F3, F4 in N to C terminal order) is substantially identical to the zinc finger nucleic acid recognition region sequences in the groups: (i) F1: RSSHLSR (SEQ ID NO: 21); F2: RNDNRKT (SEQ ID NO: 22); F3: RSSNLSQ (SEQ ID NO: 23); F4: DNSSRIR (SEQ ID NO: 24); (ii) F1: RNSSLTN (SEQ ID NO: 25); F2: ATNSLIE (SEQ ID NO: 26); F3: RS(G/C)HLKT (SEQ ID NO: 27); F4: RSDNLKT (SEQ ID NO: 28); (iii) F1: RSDTLSR (SEQ ID NO: 29); F2: RKDARIN (SEQ ID NO: 30); F3: RSSHLST (SEQ ID NO: 31); F4: KSDNRTT (SEQ ID NO: 32); (iv) F1: RSDNLIV (SEQ ID NO: 33); F2: QNANRNT (SEQ ID NO: 34); F3: RSDALSR (SEQ ID NO: 35); F4: NSSNRTV (SEQ ID NO: 36). For example, the amino acid recognition sequences may be identical in at least 6 of the 7 positions of the corresponding recognition sequences. In some preferred embodiments, a ZFN of the invention comprises the amino acid recognition sequences of F1 to F4 given in any of groups (i) to (iv) above. Polypeptides comprising such ZFP domains are particularly suitable for targeting human p53 genomic sequences. In an alternative embodiment, the polypeptide of the invention may comprise a ZFP or ZFN comprising four zinc finger domains having nucleic acid recognition sequences substantially the same or identical to the group (v): F1: RSSNLIV (SEQ ID NO: 37); F2: QNANRNT (SEQ ID NO: 38); F3: RSSALSR (SEQ ID NO: 39); F4: NSSNRTV (SEQ ID NO: 40). Such a polypeptide may be particularly useful for targeting pig, sheep and/or mouse genomic p53 sequences (Figure 3).
In some embodiments, however, the ZFP or ZFN of the invention may comprise an array of more than four zinc finger domains. In such embodiments the ZFP or ZFN may comprise four adjacent zinc finger domains in N-terminal to C-terminal order having one or more groups of zinc finger domains having the recognition sequences of groups (i) to (v) above. The four-finger groups need not necessarily correspond to the N-terminal four zinc finger domains.
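The "at least 6 of the 7 positions" criterion for substantially identical recognition regions can be checked mechanically. The following is a minimal sketch, assuming plain seven-residue strings without degenerate positions such as the (G/C) of SEQ ID NO: 27; the function names are illustrative. It compares the recognition regions of group (iv) (SEQ ID NOs: 33 to 36) with those of group (v) (SEQ ID NOs: 37 to 40).

```python
# Recognition regions (positions -1 to +6) of fingers F1 to F4, as listed above.
GROUP_IV = ["RSDNLIV", "QNANRNT", "RSDALSR", "NSSNRTV"]  # SEQ ID NOs: 33 to 36
GROUP_V = ["RSSNLIV", "QNANRNT", "RSSALSR", "NSSNRTV"]   # SEQ ID NOs: 37 to 40


def matching_positions(helix_a, helix_b):
    """Count identical residues at corresponding positions of two regions."""
    if len(helix_a) != len(helix_b):
        raise ValueError("recognition regions must have the same length")
    return sum(a == b for a, b in zip(helix_a, helix_b))


def substantially_identical(helix_a, helix_b, min_matches=6):
    """True when at least `min_matches` positions are identical."""
    return matching_positions(helix_a, helix_b) >= min_matches


if __name__ == "__main__":
    for number, (iv, v) in enumerate(zip(GROUP_IV, GROUP_V), start=1):
        print(f"F{number}: {iv} vs {v} -> {matching_positions(iv, v)}/7")
```

In this comparison F2 and F4 are identical in both groups, while F1 and F3 each differ at a single position, so each finger of group (v) meets the 6-of-7 criterion with respect to group (iv).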

In another embodiment, the method comprises contacting a nucleic acid molecule comprising the p53 gene sequence to be cleaved with one or more TAL effector nucleases (TALENs), the TALEN comprising a TAL effector peptide (TALE) portion comprising twelve or more DNA binding domains, each one containing a repeat-variable diresidue (RVD) in positions 12 and 13, and one or more endonuclease portions. Advantageously, for the targeting of nucleic acid sequences within the pig genome, the TALE portion recognizes at least 12bp (e.g. 12bp, 15bp, 18bp or more) (target site) of the p53 gene sequence. The amino acid sequences of the nucleic acid recognition regions of the TALEN domains may be designed rationally and/or by random mutagenesis and screening or selection. Therefore, the TALE is generally not naturally occurring and is thus an engineered TALE. Following cleavage of the target nucleic acid sequence, a gene sequence or segment may be mutated, deleted or replaced.

In another embodiment, a method for mutating a porcine p53 gene sequence is herein described. This method comprises contacting a nucleic acid molecule comprising the porcine p53 gene sequence to be mutated with one or more TAL effector nucleases (TALENs), the TALEN comprising a TAL effector peptide (TALE) portion comprising twelve or more engineered DNA binding domains and one or more endonuclease portions. The TALE portion beneficially recognizes at least 12bp of the p53 gene sequence in a region of the p53 gene sequence to be mutated, and the TALEN cleaves the p53 gene sequence within the region of the p53 gene sequence to be mutated, such that cleavage of the p53 gene sequence by the TALEN stimulates mutation of the p53 gene sequence by non-homologous end joining (NHEJ) through an imperfect repair mechanism.

In other embodiments, the methods may comprise contacting the nucleic acid molecule comprising the porcine p53 gene sequence with a first and a second TALEN, the first TALEN comprising a first TALE portion and a first endonuclease portion, and the second TALEN comprising a second TALE portion and a second endonuclease portion. The first and second TALE portions may each comprise twelve or more engineered DNA binding domains, the first TALE portion recognizing a first porcine p53 gene sequence comprising at least 12bp and the second TALE portion recognizing a second porcine p53 gene sequence comprising at least 12bp. Suitably, the first and second porcine p53 gene sequences do not overlap with each other. More suitably, the two sequences are located within approximately 100bp or 150bp of each other within the porcine p53 gene sequence, still more preferably within approximately 50bp of each other. In some cases, the sequences may be within 30bp, within 20bp, within 16bp or within 10bp of each other. In these embodiments, the first and second endonuclease portions of each polypeptide are beneficially partial nucleic acid cleavage domains, which are, each individually, unable to cleave the target nucleic acid, but are able to interact (e.g. associate or bind to one another) to constitute a functional endonuclease domain. The partial endonuclease domains may be capable of homodimerisation (associating with another identical domain) or may only be capable of heterodimerisation in order to form a functional endonuclease domain. Once both TALENs are simultaneously bound to their respective target sites, the porcine genomic p53 sequence of SEQ ID NO: 2, preferably SEQ ID NO: 67 or a polymorphic sequence thereof (for example SEQ ID NO: 118), may therefore be cleaved.

In yet another aspect of the invention a polypeptide comprising a non-naturally occurring TAL effector peptide (TALE) is herein provided. The TALE preferably comprises twelve or more engineered DNA binding domains or modules, for example chosen from SEQ ID NO: 113 to 117. These modules consist of repeat sequences of typically 34 amino acids, of which the 12th and the 13th are named repeat variable diresidues (RVD). As one module interacts with one nucleotide, the number of repeats defines the size of the recognised DNA sequence. The TALE or TALEN of the invention may comprise 12 or more modules, for instance 17 or 19 modules.

The polypeptide of the invention comprising a non-naturally occurring zinc finger peptide (ZFP) or a non-naturally occurring TAL effector may have a variety of uses and, therefore, may further comprise one or more effector domains selected from transcriptional repressor domains, transcriptional activator domains, transcriptional insulator domains, chromatin remodelling, condensation or decondensation domains, nucleic acid or protein cleavage domains, dimerisation domains, enzymatic domains, signalling/targeting sequences or domains. Suitably, the one or more effector domain is an endonuclease domain or an active or inactive fragment or portion thereof. Particularly, the effector may comprise at least a fragment of a Type IIS restriction endonuclease, such as a FokI endonuclease. Advantageously, the nuclease portion is a partial (e.g. half) domain which may dimerise to form an active endonuclease. Thus, a preferred embodiment of the invention relates to a polypeptide which is a ZFN fusion polypeptide, comprising a ZFP having at least four zinc finger domains (F1 to F4) selected from one of the groups: (i) F1: RSSHLSR (SEQ ID NO: 21); F2: RNDNRKT (SEQ ID NO: 22); F3: RSSNLSQ (SEQ ID NO: 23); F4: DNSSRIR (SEQ ID NO: 24); (ii) F1: RNSSLTN (SEQ ID NO: 25); F2: ATNSLIE (SEQ ID NO: 26); F3: RS(G/C)HLKT (SEQ ID NO: 27); F4: RSDNLKT (SEQ ID NO: 28); (iii) F1: RSDTLSR (SEQ ID NO: 29); F2: RKDARIN (SEQ ID NO: 30); F3: RSSHLST (SEQ ID NO: 31); F4: KSDNRTT (SEQ ID NO: 32); (iv) F1: RSDNLIV (SEQ ID NO: 33); F2: QNANRNT (SEQ ID NO: 34); F3: RSDALSR (SEQ ID NO: 35); F4: NSSNRTV (SEQ ID NO: 36); and (v) F1: RSSNLIV (SEQ ID NO: 37); F2: QNANRNT (SEQ ID NO: 38); F3: RSSALSR (SEQ ID NO: 39); F4: NSSNRTV (SEQ ID NO: 40); and a covalently linked FokI partial endonuclease domain. Such polypeptides are beneficially used in pairs to specifically target and cleave a desired gene.
Preferred pairs of ZFNs include a polypeptide of group (i) in combination with a polypeptide of group (ii); and a polypeptide of group (iii) in combination with a polypeptide of group (iv) or (v) above. Accordingly, the invention may provide compositions comprising at least a first and a second polypeptide of the invention. More preferred pairs of ZFNs that are particularly suited for targeting porcine or ovine or murine p53 nucleic acid sequences include a polypeptide of group (iii) (i.e. from group of SEQ ID NOs: 29 to 32) in combination with a polypeptide of group (v) (i.e. from group of SEQ ID NOs: 37 to 40).

Another preferred embodiment of the invention relates to a polypeptide which is a TALEN fusion polypeptide comprising a TALE having at least twelve engineered DNA binding domains or modules, for example chosen from SEQ ID NO: 113 to 117, and a covalently linked FokI partial endonuclease domain. Such polypeptides are beneficially used in pairs to specifically target and cleave a desired gene. A TALEN of the invention may comprise 12 or more, for instance 17 or 19, modules, and a pair of TALENs of the invention may comprise 17 and 19 modules.

It will be appreciated that the ZFPs and ZFNs of the invention may target nucleic acid sequences in any desirable polynucleotide, gene or genome. However, preferred target sequences are within the p53 gene, in particular the human, porcine, ovine, or mouse ones, or fragments thereof. Suitable target sites include human p53 gene sequences, such as CCACCATCCACTACAACTACATGTGTAACAGT (SEQ ID NO: 41); porcine p53 gene sequences, such as CCACCATCCACTACAACTtCATGTGcAACAGc (SEQ ID NO: 4), a polymorphic sequence of SEQ ID NO:4 such as for example SEQ ID NO: 66 (CCACCATCCACTACAACTWCATGTGYAACAGY) wherein W represents A or T, Y represents C or T (IUPAC nomenclature); ovine p53 gene sequences, such as CCACCATCCACTACAACTtCATGTGTAACAGc (SEQ ID NO: 42); and mouse p53 gene sequences, such as CCACCATCCACTACAAGTACATGTGTAAtAGc (SEQ ID NO: 43); or sequences comprising 12 or more consecutive nucleotides thereof (Figure 3). In some preferred embodiments, the ZFP of the invention recognises a target nucleic acid sequence comprising one or more sequences selected from: (i) 5'- GCCAAGCAGGGG -3' (SEQ ID NO: 44); (ii) 5'- CAGTGGCTCATG -3' (SEQ ID NO: 45); (iii) 5'- TAGTGGATGGTG -3' (SEQ ID NO: 46); and (iv) 5'- CATGTGTAACAG -3' (SEQ ID NO: 47). In an alternative embodiment, the target nucleic acid may comprise the sequence (v) 5'- CATGTGCAACAG -3' (SEQ ID NO: 48). In specific embodiments, target nucleic acid sequences (i) to (v) (i.e. SEQ ID NOs: 44 to 48) are recognised by ZFPs or ZFNs of groups (i) to (v) (i.e. groups from SEQ ID NOs: 21 to 24, SEQ ID NOs: 25 to 28, SEQ ID NOs: 29 to 32, SEQ ID NOs: 33 to 36, and SEQ ID NOs: 37 to 40), respectively.
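The IUPAC notation used for the polymorphic sequence of SEQ ID NO: 66 can be made concrete with a small matching routine. The following is a minimal sketch, assuming the standard IUPAC nucleotide ambiguity codes; the function name `matches_consensus` is illustrative. It tests which of the species target sites quoted above are covered by the degenerate sequence.

```python
# Subset of the IUPAC nucleotide ambiguity codes (W = A or T, Y = C or T, ...).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "Y": "CT", "R": "AG", "S": "CG",
    "K": "GT", "M": "AC", "N": "ACGT",
}

POLYMORPHIC_SEQ66 = "CCACCATCCACTACAACTWCATGTGYAACAGY"  # SEQ ID NO: 66
PIG_SEQ4 = "CCACCATCCACTACAACTtCATGTGcAACAGc"           # SEQ ID NO: 4
HUMAN_SEQ41 = "CCACCATCCACTACAACTACATGTGTAACAGT"        # SEQ ID NO: 41
OVINE_SEQ42 = "CCACCATCCACTACAACTtCATGTGTAACAGc"        # SEQ ID NO: 42
MOUSE_SEQ43 = "CCACCATCCACTACAAGTACATGTGTAAtAGc"        # SEQ ID NO: 43


def matches_consensus(consensus, sequence):
    """True when every base of `sequence` is allowed by the IUPAC consensus."""
    consensus, sequence = consensus.upper(), sequence.upper()
    if len(consensus) != len(sequence):
        return False
    return all(base in IUPAC[code] for code, base in zip(consensus, sequence))


if __name__ == "__main__":
    for name, seq in [("pig", PIG_SEQ4), ("human", HUMAN_SEQ41),
                      ("ovine", OVINE_SEQ42), ("mouse", MOUSE_SEQ43)]:
        print(name, matches_consensus(POLYMORPHIC_SEQ66, seq))
```

With the sequences as quoted, the porcine, human and ovine sites are each covered by SEQ ID NO: 66, whereas the mouse site is not because it differs at positions that are not degenerate in SEQ ID NO: 66.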

In a preferred specific embodiment, target nucleic acid sequences (iii) and (v) (i.e. SEQ ID NOs: 46 and 48) are recognised by ZFPs or ZFNs of groups (iii) and (v) (i.e. groups from SEQ ID NOs: 29 to 32 and SEQ ID NOs: 37 to 40), respectively.
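The requirement that the two half-sites of a ZFN pair do not overlap and lie close to each other can be verified on the quoted sequences. The following is a minimal sketch, assuming that a half-site may be written on either strand of the target; the helper names `locate` and `spacer_length` are illustrative. It places SEQ ID NO: 46 and SEQ ID NO: 48 on the porcine target site of SEQ ID NO: 4 and measures the gap between them.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

PIG_SEQ4 = "CCACCATCCACTACAACTtCATGTGcAACAGc"  # SEQ ID NO: 4
SITE_SEQ46 = "TAGTGGATGGTG"                     # SEQ ID NO: 46
SITE_SEQ48 = "CATGTGCAACAG"                     # SEQ ID NO: 48


def reverse_complement(seq):
    """Reverse complement of a DNA sequence made of A, C, G and T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def locate(site, target):
    """Return (start, end, strand) of `site` in `target`, or None if absent."""
    site, target = site.upper(), target.upper()
    start = target.find(site)
    if start != -1:
        return start, start + len(site), "+"
    start = target.find(reverse_complement(site))
    if start != -1:
        return start, start + len(site), "-"
    return None


def spacer_length(hit_a, hit_b):
    """Number of bases between two located half-sites; rejects overlaps."""
    first, second = sorted([hit_a, hit_b])
    gap = second[0] - first[1]
    if gap < 0:
        raise ValueError("half-sites overlap")
    return gap


if __name__ == "__main__":
    left = locate(SITE_SEQ46, PIG_SEQ4)
    right = locate(SITE_SEQ48, PIG_SEQ4)
    print(left, right, spacer_length(left, right))
```

In this sketch SEQ ID NO: 46 is found on the reverse strand and SEQ ID NO: 48 on the forward strand, with the two 12bp half-sites separated by a short spacer and no overlap.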

It will also be appreciated that the TALEs and TALENs of the invention may target nucleic acid sequences in any desirable polynucleotide, gene or genome. However, preferred target sequences are within the p53 gene, in particular the human, porcine, ovine, or mouse ones, or fragments thereof. Suitable target sites include porcine p53 gene sequences, such as TCGTCCCCCAGGTCGGCTCTGACTGTACCACCATCCTGATGTTGAAGTACACAT (SEQ ID NO: 67), a polymorphic sequence of SEQ ID NO:67 such as for example SEQ ID NO: 118. In some preferred embodiments, the TALE of the invention recognizes a target nucleic acid sequence selected from: (i) 5'- CGTCCCCCAGGTCGGCTCT -3' (SEQ ID NO: 105); (ii) 5'- ACACATGAAGTTGTAGT -3' (SEQ ID NO: 106).
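Because one TALE module interacts with one nucleotide, the number of modules follows directly from the length of the target sequence. The following is a minimal sketch of that relationship. The RVD assignments used here (NI for A, HD for C, NN for G, NG for T) are the commonly reported TALE code and are an assumption of this example: the present text does not state which RVDs are carried by the modules of SEQ ID NO: 113 to 117. The function name `rvd_array` is illustrative.

```python
# Commonly reported RVD-to-nucleotide code (assumption, not taken from this text).
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}

TARGET_SEQ105 = "CGTCCCCCAGGTCGGCTCT"  # SEQ ID NO: 105
TARGET_SEQ106 = "ACACATGAAGTTGTAGT"    # SEQ ID NO: 106


def rvd_array(target):
    """One RVD per target nucleotide, listed in 5' to 3' order."""
    return [RVD_FOR_BASE[base] for base in target.upper()]


if __name__ == "__main__":
    for name, target in [("SEQ ID NO: 105", TARGET_SEQ105),
                         ("SEQ ID NO: 106", TARGET_SEQ106)]:
        modules = rvd_array(target)
        print(name, len(modules), "modules:", " ".join(modules))
```

Under this one-module-per-nucleotide assumption, the two target sequences call for 19 and 17 modules respectively, which is consistent with the pair of TALENs comprising 17 and 19 modules mentioned above.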

The disclosure further encompasses nucleic acids/polynucleotides that encode one or more polypeptides, ZFPs, ZFNs, TALEs, TALENs or fusion proteins. Suitable expression vectors and viral vectors are therefore encompassed.

It will be appreciated that polypeptides, ZFPs and ZFNs and TALEs and TALENs herein described may be further derivatised or conjugated to additional molecules.

The binding affinity of a selected ZFP or TALE for its selected target sequence can be measured using techniques known to the person of skill in the art, such as surface plasmon resonance or biolayer interferometry. Biosensor approaches are reviewed by Rich et al. (2009) Anal. Biochem., 386: 194-216. Alternatively, real-time binding assays between a ZFP and target site may be performed using biolayer interferometry with an Octet Red system (Fortebio, Menlo Park, CA).

Genetically modified pig

Herein provided is a genetically modified animal, preferably a pig. Such an animal is herein identified as a "knock-out", "knock-down", "inactivated", "mutated" or "altered" animal depending on the nature of the genetic modification, this modification preferably resulting from a non-homologous end joining (NHEJ) repair mechanism or from a homologous recombination (HR). Alteration or inactivation of the p53 pathway plays a role in cell cycle control, apoptosis, angiogenesis, carcinogenesis, and the phenotypes of various model organisms (and humans). Some p53 mutations result in partial loss of function or "knockdown" and others result in full or complete loss of function mutations or "knockout".

Loss of function in one or several p53 effectors gives rise to markedly different and variable phenotypes, some resulting in less severe tumor development or no tumor development. Complete loss of function or "knockout" of p53 resulting in loss of function in all of its effectors always results in early onset tumor development in known animal models.

The genetically modified animal disclosed herein may be heterozygous for the genetically modified chromosomal sequence. Alternatively, the genetically modified animal may be homozygous for the genetically modified chromosomal sequence. The genetically modified animal may comprise one (in the context of a heterozygous mutation, alteration or modification) or two (in the context of a homozygous mutation, alteration or modification) genetically modified genomic p53 sequences.

In one embodiment, the present disclosure provides a genetically modified pig in which the p53 gene chromosomal (or genomic) sequence associated with tumor suppression has been genetically modified. For example, the genetically modified chromosomal sequence may be inactivated such that the sequence is not transcribed and/or a functional protein is not produced (gene knock-out). Alternatively, the chromosomal sequence may be genetically modified such that the regulation of expression of the protein is altered. For instance, the chromosomal sequence may be modified such that the p53 protein is over-produced or under-produced (gene knock-down, typically resulting from a decreased mRNA expression, an altered mRNA splicing regulation, and/or an altered mRNA half-life).

The genetically modified chromosomal sequence may also be modified such that it codes for an altered p53 protein, provided such a protein is functional although altered, or is responsible for an altered p53 activity.

The genomic sequence may be modified such that at least one nucleotide is changed and the expressed protein comprises at least one changed amino acid residue (i.e., comprises a missense or stop mutation).

The genomic sequence may be modified such that several nucleotides are changed and the expressed protein comprises at least one changed amino acid residue, possibly several changed amino acid residues.

The altered p53 protein may have altered activity, for example altered substrate specificity, altered kinetic rates, and so forth. In some embodiments, the modified protein comprises at least one modification such that the altered version of the protein attenuates tumor suppression (allowing the generation of a cancer prone or susceptible pig model). In other embodiments, the modified protein comprises at least one modification such that the altered version of the protein provides tumor suppression activity (allowing the generation of a cancer resistant pig model).

The genetically modified genomic sequence may otherwise or in addition comprise a chromosomally integrated sequence, for example a sequence encoding a protein associated with tumor suppression or a reporter sequence. The integrated sequence may encode an endogenous protein associated with tumor suppression normally found in the animal, or the integrated sequence may encode an exogenous protein associated with tumor suppression, for example an orthologous (from a different species) protein, or combination thereof.

In one embodiment, the p53 genomic sequence may be inactivated. The inactivated p53 sequence may include a mutation, typically a deletion (i.e., deletion of one or more nucleotides), an insertion (i.e., insertion of one or more nucleotides), or one or more point mutations (i.e., substitution of a single nucleotide by a distinct nucleotide). The deletion, insertion, or point mutation may lead to frame shift and/or splice site mutations such that at least one premature stop codon is introduced. As a consequence of the mutation, the targeted chromosomal sequence is inactivated and a functional protein is not produced. The inactivated chromosomal sequence preferably and advantageously comprises no exogenously introduced sequence and preferably results from an NHEJ repair mechanism.
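The way a small deletion can shift the reading frame and introduce a premature stop codon can be illustrated on a toy open reading frame. The following is a minimal sketch: the 18-nucleotide sequence is hypothetical (it is not a p53 sequence) and the helper names are illustrative. An insertion or deletion shifts the frame whenever its length is not a multiple of three.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Hypothetical toy reading frame: ATG GAT GAC CTG AAA TAA
WILD_TYPE = "ATGGATGACCTGAAATAA"


def is_frameshift(indel_length):
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


def delete(sequence, position, length):
    """Remove `length` nucleotides starting at 0-based `position`."""
    return sequence[:position] + sequence[position + length:]


def first_stop_offset(orf):
    """Offset of the first in-frame stop codon, or None if there is none."""
    orf = orf.upper()
    for offset in range(0, len(orf) - 2, 3):
        if orf[offset:offset + 3] in STOP_CODONS:
            return offset
    return None


if __name__ == "__main__":
    mutant = delete(WILD_TYPE, 3, 1)  # single-nucleotide deletion
    print("wild type stop at", first_stop_offset(WILD_TYPE))
    print("mutant stop at", first_stop_offset(mutant))
```

In this toy example the single-nucleotide deletion moves the first in-frame stop codon upstream of its wild-type position, which is the situation described above for an inactivating frame shift.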

In a preferred embodiment, the pig of the invention carries a genetically modified nucleic acid sequence in the p53 gene, wherein the modification results from non-homologous end joining (NHEJ) repair mechanism or homologous recombination (HR) following double strand DNA cleavage in a four to seven bases sequence, preferably in a five bases sequence or in a six bases sequence, said cleavage occurring in a sequence selected from SEQ ID NO: 62, SEQ ID NO: 107 and SEQ ID NO: 108 of SEQ ID NO:2, of the porcine p53 genomic sequence of SEQ ID NO: 2, preferably of SEQ ID NO: 4 or a polymorphic sequence thereof (such as SEQ ID NO: 66) or in SEQ ID NO: 67 or a polymorphic sequence thereof (such as SEQ ID NO: 118).

When the modification results from non-homologous end joining (NHEJ) imperfect repair mechanism, the modification is preferably a NHEJ-induced mutation selected from a substitution, a deletion, an addition and any combination thereof.

In a particular embodiment, the double-strand DNA cleavage resulting from NHEJ or HR is in the six-base sequence 5'-TTCATG-3' (SEQ ID NO: 62) located in SEQ ID NO: 4 or in a polymorphic sequence thereof (such as SEQ ID NO: 66). In another particular embodiment, the double-strand DNA cleavage site resulting from NHEJ or HR is in the 16bp sequence 5'-GACTGTACCACCATCC-3' (SEQ ID NO: 107) located in SEQ ID NO: 67 or in a polymorphic sequence thereof (such as SEQ ID NO: 118).

In another embodiment, the cleavage resulting from NHEJ or HR is in the 500bp sequence of SEQ ID NO: 108 located in SEQ ID NO: 2 or in a polymorphic sequence thereof.

In a particular embodiment, a zinc-finger-nuclease (ZFN) heterodimer specifically recognizing the pig p53 genomic sequence of SEQ ID NO: 46 and SEQ ID NO: 48 located in SEQ ID NO: 4 is responsible for the double strand DNA cleavage.

The zinc-finger-nuclease (ZFN) heterodimer advantageously comprises a ZFN of SEQ ID NO: 5 comprising a ZFP of SEQ ID NO: 7 and a ZFN of SEQ ID NO: 6 comprising a ZFP of SEQ ID NO: 8.

In another embodiment, a TAL effector-nuclease (TALEN) heterodimer specifically recognizing the pig p53 genomic sequence of SEQ ID NO: 105 and SEQ ID NO: 106 located in SEQ ID NO: 67 is responsible for the double strand DNA cleavage.

The TAL effector-nuclease (TALEN) heterodimer advantageously comprises a TALEN of SEQ ID NO: 111 comprising a TALE of SEQ ID NO: 119 and a TALEN of SEQ ID NO: 112 comprising a TALE of SEQ ID NO: 120.

Genetically modified pigs can be discriminated from wild-type animals by genotyping. For this, various techniques can be used in order to detect the genetic modification in their genome at the p53 locus. For example, routine procedures for sequencing the p53 alleles as well as genotyping processes involving PCR or Southern blot techniques could be used. Also, one could discriminate between animals having a genetic modification in one allele or in both alleles of the p53 gene using the same techniques.
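Once the two p53 alleles of an animal have been sequenced, the distinction between animals modified on one allele and animals modified on both alleles reduces to a comparison with the wild-type reference. The following is a minimal sketch, assuming that each allele is available as a plain sequence string covering the same region; the function name and the mutant allele (a 2bp deletion) are hypothetical and serve only as an illustration.

```python
WILD_TYPE = "CCACCATCCACTACAACTTCATGTGCAACAGC"    # SEQ ID NO: 4, upper case
HYPOTHETICAL_MUTANT = "CCACCATCCACTACAACTTCGTGCAACAGC"  # illustrative 2bp deletion


def count_modified_alleles(allele_1, allele_2, reference=WILD_TYPE):
    """Number of alleles (0, 1 or 2) that differ from the reference sequence."""
    reference = reference.upper()
    return sum(allele.upper() != reference for allele in (allele_1, allele_2))


def describe_genotype(allele_1, allele_2, reference=WILD_TYPE):
    """Human-readable genotype call based on the number of modified alleles."""
    labels = {
        0: "wild-type",
        1: "modified on one allele",
        2: "modified on both alleles",
    }
    return labels[count_modified_alleles(allele_1, allele_2, reference)]


if __name__ == "__main__":
    print(describe_genotype(WILD_TYPE, WILD_TYPE))
    print(describe_genotype(WILD_TYPE, HYPOTHETICAL_MUTANT))
    print(describe_genotype(HYPOTHETICAL_MUTANT, HYPOTHETICAL_MUTANT))
```

This sketch does not distinguish two identical modified alleles from two different ones; that refinement would require comparing the modified alleles with each other.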

Derivatives

Other objects of the present invention are a genetically modified pig, including a genetically modified pig obtainable by a method as herein described, at the various stages of its formation and development (fertilized egg, zygote, morula, blastocyst, embryo, fetus, young pig, juvenile pig or adult pig), as well as its elements, typically an isolated element, and its progeny. The pig may be alive or not. The pig or any of its elements can be frozen using any method known by the skilled person. The genetically modified pig is advantageously a domestic pig. The domestic pig can be selected for example from a miniature pig, a minipig and a micropig. The genetically modified pig according to the present invention preferably comprises a modified p53 genomic sequence. The pig may be modified in its somatic cells and/or in its germ cells.

Elements of the genetically modified pig comprise in particular a genetically modified cell or tissue, i.e., a genetically modified cell or tissue derived from said pig, for example a cell expressing a p53 sequence that is inactivated such that the sequence is not transcribed and/or a functional protein is not produced; typically a cancer cell; any cellular or sub-cellular extract of such a genetically modified cell such as the nucleus, a protein, a nucleic acid in particular a DNA or RNA, an organelle; a population of genetically modified cells directly obtained (sampled) from said pig or derived (cultured) from an isolated genetically modified cell as described previously, typically a cell line.

Non-limiting examples of further biological elements or components include antibodies, receptor proteins, altered tumor suppressor proteins, and the like.

The genetically modified cell can be selected from a cancer cell, a stem cell, in particular an induced pluripotent stem cell (iPS cell), a germ cell, a gamete and a somatic cell.

The skilled person is able to determine suitable methods and procedure for obtaining genetically modified cells from the genetically modified pigs or cell lines from said genetically modified cells. The genetically modified cells and/or cell lines are suitable in vitro test systems.

Genetically modified cell lines can in particular be established in order to generate a standardized screening cellular model system (herein included as a particular embodiment of the invention) usable in oncology and/or toxicology research, for example in a screening method.

The genetically modified cells and/or cell lines, preferably the genetically modified somatic cells and/or cell lines, can also be used to obtain genetically modified pigs, such as by cloning strategies. Further herein enclosed is therefore a pig derived from a genetically modified cell as herein described.

The term "progeny" indeed herein includes not only the progeny obtained by the normal sexual reproduction, but also the pigs cloned from somatic cells having the same chromosomal DNA as the genetically modified pigs herein described (produced by the conventional somatic cell nuclear transfer cloning technique or any other suitable method that the skilled person can use to generate cloned pigs) which contain the modified nucleic acid.

Production of pigs genetically modified in the p53 gene

Herein described are methods for preparing a genetically modified pig usable for example as a cancer prone model for oncology and/or toxicology studies. A particular process for producing a genetically modified pig according to the present invention comprises the steps of:

a) providing at least one nucleic acid sequence encoding a zinc finger nuclease, for example a sequence encoding amino acid sequences of SEQ ID NO: 5 and SEQ ID NO: 6, or preferably two nucleic acid sequences each encoding a ZFN of respectively SEQ ID NO: 5 and SEQ ID NO: 6,

b) delivering said nucleic acid sequences to a porcine embryo under conditions in which said nucleic acid sequences are transiently expressed within the embryo; and

c) causing said embryo to go to term so as to generate a genetically modified pig.

Preferably, the ZFN encoding nucleic acid sequence is a plasmid or is contained in a plasmid. This plasmid may be delivered directly to an embryo or may be used to prepare a viral vector, preferably a non-integrating and non-replicating lentiviral vector, as described in WO2006/010834. The delivery will be performed in the context of the method as herein described for preparing the genetically modified pig of the invention.

According to an alternative embodiment, capped, polyadenylated mRNA encoding the ZFN can be used to deliver the ZFN to the pig embryos. Capped, polyadenylated mRNA encoding the ZFN may be produced using known molecular biology techniques, including but not limited to a technique substantially similar to the technique described in Geurts et al. (Science (2009), Jul 24; 325(5939):433). The mRNA may be delivered, for example injected, to porcine cells. Additionally, cells may be injected with mRNA encoding GFP as a control.

According to a further distinct embodiment, a ZFN polypeptide or protein, typically two ZFN polypeptides or proteins, for example a ZFN of SEQ ID NO: 5 and a ZFN of SEQ ID NO: 6, can be used to deliver the ZFN to the pig embryos. Polypeptides or proteins are produced using known molecular biology techniques, including but not limited to a technique substantially similar to the technique described in Luckow B et al. (Genesis. 2009 Aug;47(8):545-58). Polypeptides or proteins may be delivered to porcine embryos or cells.

Another process for producing a genetically modified pig according to the present invention comprises the steps of:

a) providing at least one nucleic acid sequence encoding a TAL effector nuclease, for example a sequence encoding amino acid sequences of SEQ ID NO: 111 and SEQ ID NO: 112, or preferably two nucleic acid sequences each encoding a TALEN of respectively SEQ ID NO: 111 and SEQ ID NO: 112,

b) delivering said nucleic acid sequences to a porcine embryo under conditions in which said nucleic acid sequences are transiently expressed within the embryo; and

c) causing said embryo to go to term so as to generate a genetically modified pig.

Preferably, the TALEN encoding nucleic acid sequence is a plasmid or is contained in a plasmid. This plasmid may be delivered directly to an embryo or may be used to prepare a viral vector, preferably a non-integrating and non-replicating lentiviral vector, as described in WO2006/010834. The delivery will be performed in the context of the method as herein described for preparing the genetically modified pig of the invention.

According to an alternative embodiment, capped, polyadenylated mRNA encoding the TALEN can be used to deliver the TALEN to the pig embryos. Capped, polyadenylated mRNA encoding the TALEN may be produced using known molecular biology techniques, including but not limited to a technique substantially similar to the technique described in Geurts et al. (Science (2009), Jul 24; 325(5939):433). The mRNA may be delivered, for example injected, to porcine cells. Additionally, cells may be injected with mRNA encoding GFP as a control.

According to a further distinct embodiment, a TALEN polypeptide or protein, typically two TALEN polypeptides or proteins, for example a TALEN of SEQ ID NO: 111 and a TALEN of SEQ ID NO: 112, can be used to deliver the TALEN to the pig embryos. Polypeptides or proteins are produced using known molecular biology techniques, including but not limited to a technique substantially similar to the technique described in (Luckow B et al., Genesis. 2009 Aug;47(8):545-58). Polypeptides or proteins may be delivered to porcine embryos or cells.

A preferred viral vector is a retroviral vector, preferably a non-integrating and non-replicating lentiviral vector, in particular an HIV-derived retroviral vector, typically an HIV-1-, SIV- or EIAV-derived retroviral vector, preferably a lentiviral vector prepared or produced using at least two recombinant nucleic acid sequences, preferably plasmids, consisting respectively of SEQ ID NO: 9 and 10 or SEQ ID NO: 70 and SEQ ID NO: 71, or comprising SEQ ID NO: 9 and 10 (Figure 4) or SEQ ID NO: 70 and SEQ ID NO: 71 (Figure 13). A non-integrating and non-replicating lentiviral vector advantageously usable in the context of the present invention comprises HIV retroviral GAG and POL proteins, said POL proteins being preferably modified in order to provide within the vectors a mutated lentiviral integrase preventing the integration of said recombinant genome into the genome of a host cell, the mutation consisting of one or more point mutations affecting the genomic integration of the vector, a heterologous ENV protein, preferably a Vesicular stomatitis virus (VSV) ENV protein, a retroviral genome comprising the recombinant nucleic acid sequence encoding the ZFN or the TALEN amino acid sequence of interest operably linked to a regulatory sequence, preferably to a promoter sequence as herein described, and said retroviral genome comprising cis-acting nucleic acid sequences.

HIV retroviral cis-acting nucleic acid sequences typically comprise:

- Long Terminal Repeat (LTR) sequences, preferably HIV-LTR sequences, preferably comprising a self-inactivating (SIN) lentiviral 3' LTR;

- a cis-acting nucleic acid sequence facilitating the RNA nuclear export, preferably the HIV-1 Rev Responsive Element (RRE),

- preferably one copy of the cPPT and CTS cis-acting regions ("flap sequence") of HIV-1,

- an HIV retroviral packaging nucleic acid sequence comprising an HIV retroviral 5' splice donor sequence, and

- a psi sequence.

The self-inactivating (SIN) lentiviral 3' LTR may be:

- a 3' LTR deleted from the U3 region,

- a 3' LTR deleted from the enhancer sequence of the U3 region, or

- a 3' LTR deleted from the enhancer and promoter sequences of the U3 region.

The viral vector is preferably produced in a packaging host cell containing:

at least one, possibly two or more, transcomplementation plasmid(s) providing nucleic acid sequences linked to a heterologous regulatory nucleic acid sequence that respectively encode the HIV retroviral GAG, POL, TAT and REV proteins, and linked to a heterologous polyadenylation signal; said POL proteins being preferably modified in order to provide within the vectors a mutated lentiviral integrase preventing the integration of said recombinant genome into the genome of a host cell (Figure 5);

- an envelope plasmid providing a nucleic acid encoding a heterologous ENV protein, preferably derived from VSV-G (Figure 6); and

an expression plasmid, preferably a recombinant nucleic acid from (i) SEQ ID NO: 9 or 10 (Figure 4) or (ii) SEQ ID NO: 70 or 71 (Figure 13), providing a nucleic acid sequence containing preferably an HIV retroviral packaging signal flanked by HIV retroviral cis-acting nucleic acid sequences (as described previously); a less than full length HIV gag structural gene; and the nucleic acid expression cassette comprising a promoter as herein described operably linked to a recombinant nucleic acid encoding the ZFN or TALEN sequence of interest, preferably a recombinant nucleic acid encoding a polypeptide of (i) SEQ ID NO: 5 or 6 or (ii) SEQ ID NO: 111 or 112 or a recombinant nucleic acid encoding respectively both polypeptides of SEQ ID NO: 5 and 6 or both polypeptides of SEQ ID NO: 111 and 112.

The transcomplementation plasmid is preferably devoid of one or more accessory genes (vif, vpr, vpu and nef genes) or, in other words, does not encode at least one of said accessory genes. Preferably, the transcomplementation plasmid encodes none of the vif, vpr, vpu and nef genes.

As explained previously, in a preferred aspect of the method herein described for obtaining a genetically modified pig according to the present invention, nucleic acid expression cassettes each comprising a promoter operably linked to a recombinant nucleic acid encoding a ZFN or a TALEN sequence are introduced into early embryos via lentiviral vectors, preferably non-integrating and non-replicating lentiviral vectors.

The promoter may advantageously be selected from an ubiquitous promoter or any promoter active in embryonic cells. Preferably, the promoter is selected from CMV, PGK, CAG, nestin and ubiquitin.

According to another possible embodiment for obtaining a genetically modified pig according to the present invention, the method may comprise the use of capped, polyadenylated mRNA encoding the ZFN(s) or the use of ZFN protein(s) or polypeptide(s).

Alternatively, another possible embodiment for obtaining a genetically modified pig according to the present invention is a method comprising the use of capped, polyadenylated mRNA encoding the TALEN(s) or the use of the TALEN protein(s) or polypeptide(s).

Preferred viral constructs are shown in Figures 4 and 13 and correspond respectively to SEQ ID NO: 9 and 10 and SEQ ID NO: 70 and 71.

Any other method for introducing the recombinant nucleic acid sequence(s) encoding ZFNs of SEQ ID NO: 5 and of SEQ ID NO: 6 or TALEN(s) of SEQ ID NO: 111 and of SEQ ID NO: 112 into early embryos is however possible. The genetically modified pig according to the present invention can be obtained by introducing the recombinant nucleic acid sequences, preferably the previously mentioned expression cassettes containing a promoter regulating their expression, into a fertilized egg or the like (clonal egg or embryo for example), by the conventional method of pronuclear injection or by the conventional sperm vector method; and developing an individual from the fertilized egg or the like by returning the embryo to the uterus of a foster mother at an appropriate stage.

Suitable methods of introducing the nucleic acids to the embryo or cell include microinjection, electroporation, sonoporation, biolistics, calcium phosphate-mediated transfection, cationic transfection, liposome transfection, dendrimer transfection, heat shock transfection, nucleofection transfection, magnetofection, lipofection, impalefection, optical transfection, proprietary agent-enhanced uptake of nucleic acids, and delivery via liposomes, immunoliposomes, vectors, virosomes, or artificial virions. In one embodiment, the nucleic acids may be introduced into an embryo by microinjection.

The "clonal egg" herein means an egg obtained by transplanting a nucleus of a somatic cell (in case of a somatic cell clone) or a fertilized egg (in case of fertilized egg clone) into an enucleated recipient egg.

The "embryo" herein means an embryo in an optimal stage between a unicellular egg and an embryo which can develop, preferentially to term, if returned to a uterus (preferably an embryo in the completely hatched blastocyst stage). However, introducing the ZFNs or TALENs (nucleic acid(s) or proteic forms) in the stage of unicellular egg is preferred because genetic modification will be transmitted to most, substantially all, or all of the cells of the transgenic pig. An individual can be developed, preferably by growing the egg or embryo into which the genetic modification was introduced from the unicellular stage up to the morula stage, and returning the resulting embryo to the uterus of an animal.

Alternatively, the expression vector can be introduced into somatic cells which will then be used for nuclear transfer to generate a cloned transgenic animal according to the conventional somatic cell nuclear transfer method (Gil MA et al., 2010, Reprod Domest Anim. 2010 Jun;45 Suppl 2:40-8).

Applications

The herein described genetically modified pig can advantageously be used for research studies, for diagnosis or monitoring, as well as for prevention and/or treatment, of a proliferative disease such as cancer.

Research studies cover preclinical assays in oncology as well as toxicology studies to evaluate, for example, the carcinogenic potential (ability to induce a tumor or favour its development) of a particular compound, treatment or environmental condition.

The term "compound" refers to a drug, a nucleic acid, a cell, a population of cells, a functional food, a therapeutic vector, a chemical compound, and any mixture thereof.

The term "drug" herein refers to a substance, a medication or pharmaceutical composition that, when administered to a living organism, is used in the treatment (preferably complete cure), prevention, or diagnosis of disease or used to otherwise enhance physical well-being.

The term "functional food" or medicinal food refers to any healthy food which has a health-promoting or disease-preventing property beyond the basic function of supplying nutrients. The term "functional food" encompasses food, food complements and mixture thereof.

The "therapeutic vector" is a vector that allows inserting, altering or removing a gene within an individual's cell to treat a disease. Therapeutic vectors comprise any synthetic vectors, for example nucleic acids (DNA or RNA, or a mixture thereof), whether naked or complexed, for example pegylated nucleic acids or nucleic acids encapsulated into nanovectors or nanoparticles; virus derived vectors, for example adenovirus-derived vectors, retrovirus derived vectors, in particular lentivirus- derived vectors, adeno-associated-virus derived vectors, alphavirus-derived vectors and baculovirus- derived vectors; as well as mixtures thereof.

The term "chemical compound" includes any substance, or a mixture of such substances, the environmental toxicity or carcinogenic potential of which need to be assessed or tested.

The "environmental conditions" refer to chemicals, gases and radiations, in particular any ionizing radiations, as well as any radioactive radiations an animal may be exposed to.

The term "cancer treatment" refers to the administration of any known anti-cancer therapies, amongst which feature chemotherapy [anthracyclines such as daunorubicine, doxorubicin (DX), idarubicin and mitoxantrone (MTX), as well as oxali-platinum (oxaliplatin or OXP), cis-platinum (cisplatin or CDDP), and taxanes (paclitaxel or docetaxel) are considered as the most efficient cytotoxic agents of the oncologist armamentarium], a radiotherapy [XR], an hormonotherapy, an immunotherapy, a specific kinase inhibitor-based therapy, an antiangiogenic agent based-therapy, an antibody-based therapy, in particular a monoclonal antibody-based therapy, a surgery, a gene therapy, including gene repair therapy.

The expression "proliferative disease" refers to a disease with a failure of regulation of tissue growth. The "proliferative disease" is typically a cancer. The cancer may be any kind of cancer or neoplasia. The malignant tumor or cancer may be solid or not and is typically selected from a carcinoma, a sarcoma, a myeloma, a lymphoma, a melanoma, a paediatric tumour and a leukaemia tumour.

Preferably the genetically modified pigs of the invention are used as model systems for studying the pathogenesis, i.e., the onset, development and progress, of a proliferative disease such as cancer.

Preferably, the genetically modified pigs of the invention are used as model systems for the prevention and/or treatment, i.e., for identifying means and methods suitable for the prevention and/or treatment (complete cure of the disease or alleviation of the disease's symptoms), of a disease as herein described.

The use of a pig as herein described as a model system for the prevention and/or treatment of such a disease typically comprises the development and evaluation of a therapeutic strategy of the disease. Different treatment regimens can be evaluated with regard to efficacy and safety, in a continuous manner (over the lifetime or over a certain period of time during the life of a genetically modified pig) or in time intervals. The time intervals are preferably six-monthly, three-monthly, monthly, two-week or even weekly intervals. The skilled artisan is able to choose further suitable time intervals.

In a particular embodiment, a method for studying the pathogenesis, the prevention and/or the treatment of a disease as herein described using a pig model according to the present invention, whatever the stage of its formation and development, or an element thereof as herein defined, is provided.

The genetically modified pigs of the invention are, as explained previously, highly suitable models or model systems for a proliferative disease such as cancer, or for susceptibility to develop such a disease. The genetically modified pig model furthermore overcomes the previously described limitations of existing models, in particular of murine models as explained previously. For the first time, a genetically modified large animal model with impaired p53 function is produced. This model exhibits genetic, physiologic and biochemical functions, in particular immunologic, endocrine as well as metabolic functions, more similar to those of a human being than the corresponding functions of other animal models. Also, in the respect of eating habits, pigs are omnivorous with a gastrointestinal tract resembling the human being's.

There are anatomical, physiological and molecular similarities between the human and the porcine tumors (SJ Adam et al., Oncogene 2007, 26, 1038-1045).

The parallels in cancer biology between human and pigs are conserved at the molecular level. For example, in both species telomerase is suppressed in a number of tissues but reactivated during cancer. Tumors can also reach sizes seen in humans, which is particularly relevant for many preclinical applications.

Additionally, the size, anatomy and physiology of the pig mirror those of humans in many respects, and as such the pig is often the primary biomedical model for a number of diseases, surgical research and organ transplantations. The genomic sequence homology between humans and pigs is also very high, and the porcine pregnane X receptor protein, which regulates cytochrome P450 CYP3A (which metabolizes almost half of prescription drugs in humans), is more similar to the human protein than, for example, that of mice.

It is an object of the present invention to use a genetically modified pig as herein described, whatever the stage of its formation and development, for the in vivo identification of, or evaluation of the ability of, a compound, also herein identified as test compound, to prevent or treat a proliferative disease, in particular a cancer, or to use a genetically modified cell, a population of cells or a tissue comprising such genetically modified cell as herein described, for the in vitro or ex vivo evaluation of the ability of a test compound to prevent or treat such a disease, in particular a cancer.

In still another embodiment, an animal comprising a genetically modified p53 sequence may be used to evaluate potentially advantageous effects, including enhanced potency as well as reduced untoward effects, of a compound or treatment. The method comprises inducing tumor formation in a pig comprising a genetically modified p53 sequence, and then comparing the responses of a first group of animals contacted with the candidate compound or treatment to a second group of animals not contacted with the candidate compound or treatment.

A method for evaluating the efficacy of a compound for preventing or treating a proliferative disease such as cancer is more particularly herein described. This method comprises the steps of (i) providing a pig model according to the present invention, whatever the stage of its formation and development, or an element thereof as herein defined, (ii) administering to said pig model or element a compound the efficacy of which is to be evaluated or exposing said pig model to a treatment the efficacy of which is to be evaluated, and (iii) evaluating the effect, if any, of the compound or treatment on the phenotype allowed by the genetically modified p53 gene (which phenotype may be enhanced by any means known by the skilled person such as an exposure to carcinogenic radiation, chemical compounds or oncogenic viruses).

A screening method of the invention preferably comprises the following steps of (i) providing a pig model according to the present invention, whatever the stage of its formation and development, or an element thereof as herein defined, (ii) providing a compound to be tested, (iii) administering the compound to said pig model or element, (iv) determining whether the tested compound is capable of preventing or treating a proliferative disease as herein described.

The compound tested in vitro or ex vivo in a genetically modified cell, in a population of cells or in a tissue comprising such genetically modified cell may be any one of those previously described and is preferably selected from a drug, a chemical compound, a therapeutic vector, as well as mixtures thereof.

In yet another embodiment, the genetically modified animals disclosed herein may be used for research in gene therapy. For example, a genetically modified pig having a mutation in the p53 tumor suppressor gene may be genetically modified by correcting the genomic sequence comprising the mutation. Accordingly, the animal may no longer be susceptible to tumor formation or cancer development.

In particular, the gene therapy compositions of the invention may provide for the expression of therapeutic ZFPs, ZFNs and other polypeptides such as TALEs or TALENs in target cells for repressing the expression of target genes, such as those having either wild-type or non-wild-type sequences, and especially the p53 gene. ZFPs and ZFP-effector and TAL effector fusion proteins of the invention are particularly useful for therapeutic applications in which it is desired to alter (e.g. up or down-regulate) the expression level of a target endogenous gene. ZFNs and TALENs of the invention (e.g. as fusion proteins with the FokI nuclease domain or partial domain) may also be useful in gene therapy treatments for gene cutting or for directing the site of integration of therapeutic genes to specific chromosomal sites, as previously reported by Durai et al. (2005) Nucleic Acids Res. 33, 18: 5978-5990.

In yet another embodiment, the genetically modified animals disclosed herein may be used in the context of the assessment of the efficiency and toxicity of physical therapeutic cancer treatments, in particular radiation therapies.

Another screening method of the invention preferably comprises the following steps of (i) providing a pig model according to the present invention, whatever the stage of its formation and development, or an element thereof as herein defined, (ii) exposing said model to a treatment to be tested, (iii) determining whether the tested treatment is capable of preventing or treating a proliferative disease as herein described.

It is a further object of the present invention to use a genetically modified cell, a population of cells or a tissue comprising such genetically modified cell as herein described, for the in vitro or ex vivo identification of a biomarker or therapeutic target usable to diagnose, monitor, prevent or treat a proliferative disease, such as cancer.

It is another object of the present invention to use a genetically modified pig as herein described, for the in vivo evaluation of a diagnosis means or device, in particular an imaging device, usable to diagnose a proliferative disease such as cancer.

A further aspect of the present disclosure encompasses methods for using the genetically modified animals herein described in toxicology research.

In one embodiment, a genetically modified pig as herein described comprises a p53 genetically modified sequence and may be used to determine susceptibility to developing tumors.

The method comprises exposing the genetically modified pig comprising a p53 genetically modified sequence and a wild-type pig to a carcinogenic compound or treatment, and then monitoring the development of a tumor. The cancer prone model consisting of a genetically modified pig as herein described comprising the p53 genetically modified sequence has an increased risk for tumor formation. Moreover, an animal homozygous for the inactivated p53 tumor suppressor sequence may have increased risk relative to an animal heterozygous for the same inactivated sequence, which in turn may have increased risk relative to a wild-type animal. A similar method may be used to screen for spontaneous tumors, wherein the animals are not exposed to a carcinogenic compound or treatment.

In another embodiment, a pig comprising a genetically modified p53 sequence may be used to evaluate the carcinogenic potential of a test compound or treatment. The method comprises contacting the genetically modified pig and a wild-type pig to the test compound or treatment, and then monitoring the development of tumors. If the genetically modified pig has an increased incidence of tumors relative to the wild-type animal, the test compound may be considered as carcinogenic.
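The comparison of tumor incidence described above can be illustrated with a short calculation. The sketch below is only an illustration: the description does not specify any statistical test, so the one-sided Fisher exact test used here, the function name and the animal counts are all assumptions chosen for the example.

```python
from math import comb


def one_sided_fisher(tumors_gm, n_gm, tumors_wt, n_wt):
    """One-sided Fisher exact p-value (hypothetical helper).

    Probability, under the hypothesis that tumor incidence is the same in
    both groups, of observing at least `tumors_gm` tumor-bearing animals
    among the `n_gm` genetically modified pigs.
    """
    total_animals = n_gm + n_wt
    total_tumors = tumors_gm + tumors_wt
    denominator = comb(total_animals, n_gm)
    upper = min(total_tumors, n_gm)
    # Sum the hypergeometric probabilities of all outcomes at least as extreme.
    tail = sum(
        comb(total_tumors, x) * comb(total_animals - total_tumors, n_gm - x)
        for x in range(tumors_gm, upper + 1)
    )
    return tail / denominator


# Hypothetical counts, not data from the description:
# 7 of 10 modified pigs and 1 of 10 wild-type pigs develop tumors.
p_value = one_sided_fisher(tumors_gm=7, n_gm=10, tumors_wt=1, n_wt=10)
print(f"one-sided p-value: {p_value:.4f}")
```

A small p-value would be consistent with an increased incidence in the genetically modified group; the threshold and group sizes remain choices for the person designing the study.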

In a further embodiment, a pig comprising a genetically modified p53 sequence may be used to determine the toxicity of a compound, treatment, or combination thereof. The method comprises inducing tumor formation in a pig comprising a genetically modified p53 sequence, and then comparing the responses of a first group of pigs contacted with the compound, treatment, or combination thereof, to a second group of pigs not contacted with the compound, treatment, or combination thereof.

In another embodiment, a genetically modified pig comprising a genetically modified p53 sequence may be used to test the ADME/ Tox profile of a compound, a treatment, or a combination of compounds and/or treatments. The method is similar to those detailed above, and assessment parameters include damage to DNA, metabolic consequence, and behavioral effects of the compound, a treatment, or a combination of compounds and/or treatments. Behavioral tests include test of learning/memory, anxiety/depression, and sensori-motor functions.

Non-limiting examples of behavioral tests suitable for assessing the motor function of animals include open field locomotor activity assessment, the rotarod test, the limb-placement or grid walk test, the vertical pole test, the inverted grid test, the adhesive removal test, the painted paw or catwalk (gait) tests, the beam traversal test, and the inclined plane test.

Non-limiting examples of behavioral tests suitable for assessing the long-term memory function of animals include the elevated plus maze test, the Morris water maze swim test, contextual fear conditioning, the Y-maze test, the T-maze test, the novel object recognition test, the active avoidance test, the passive (inhibitory) avoidance test, the radial arm maze test, the two-choice swim test, the hole board test, the olfactory discrimination (go-no-go) test, and the pre-pulse inhibition test.

Non-limiting examples of behavioral tests suitable for assessing the anxiety of pigs include the open field locomotion assessment, observations of marble-burying behavior, the elevated plus maze test, and the light/dark box test. Non-limiting examples of behavioral tests suitable for assessing the depression of pigs include the forced swim test, the hot plate test, anhedonia observations, and the novelty suppressed feeding test.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

Further aspects and advantages of the present invention will be described in the following examples, which should be regarded as illustrative and not limiting.

EXPERIMENTAL PART

Example 1: Design of zinc finger peptides against porcine p53 target site.

Zinc fingers were first rationally designed against the human p53 genomic DNA sequence and were then optimised by yeast one-hybrid in order to obtain functional zinc finger nucleases. However, the DNA sequence targeted varies between human, porcine and other species, especially in the recognition site of the z1166L zinc fingers (Table 1).

Table 1: z1166 binding site alignment, showing differences between human, porcine, ovine and murine sequences; as well as differences between the α-helix nucleic acid recognition sequences for the z1166L ZFPs.

Therefore, to allow binding of z1166L ZFP to the p53 sequence of other species' genes, appropriate rationally-designed mutations had to be introduced into the DNA-contacting recognition helices. In Figure 3, the protein recognition helices are shown aligned to the bases they contact. Two rational mutations at helical position +2 of fingers 1 and 3 have to be made in order to correct inappropriate cross-strand DNA contacts, as described in Isalan et al. (1997) Proc Natl Acad Sci U S A. 94(11):5617-21.

3'-TG A C A A T G T G T A CA-5' HUMAN (SEQ ID NO: 49)

RSDNLIV QNANRNT RSDALSR NSSNRTV Z1166L HUMAN

F1 F2 F3 F4

3'-cG A t A A T G T G T A CA-5' MURINE (SEQ ID NO: 50)

3'-cG A c A A T G T G T A Ct-5' OVINE (SEQ ID NO: 51)

3'-cG A c A A c G T G T A Ct-5' PORCINE (SEQ ID NO: 52)

RSSNLIV QNANRNT RSSALSR NSSNRTV Z1166L PORCINE-OVINE-MURINE-HUMAN

F1 F2 F3 F4

Table 2: Alignment of Z1166L zinc finger alpha helices with the different p53 variant recognition sites. To allow binding to all polymorphisms, the two aspartates in bold ("D") were mutated to serine ("S").

The aspartates ("D" in Table 2, human F1 and F3 above) clash with the complementary base to the lower case "c"s in the murine, ovine and porcine sequences. Hence the inventors found that to make a functional nuclease, for example for use in pigs, the "D"s have to be replaced by the two "S"s in bold (coding for serine). The two other main contacted positions that vary (F1 Val-to-t/C and F2 Thr-to-T/c; in bold) are promiscuous at these positions and accommodate any of the species DNA sequences above. Finally, the right hand A/t variable position (near the 5' end, underlined) is not contacted so does not require special designs.
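The differences summarised in Table 2 can be listed programmatically. The following sketch simply transcribes the target sites (written 3' to 5', with case and spacing removed) and the recognition helices from the alignment above, then reports where the species or the helices differ. The function and variable names are illustrative and not part of the original disclosure; helix residues are numbered -1 to +6 following the usual zinc finger convention.

```python
# Helix positions for a 7-residue recognition helix, numbered -1, +1 ... +6.
HELIX_POSITIONS = [-1, 1, 2, 3, 4, 5, 6]

# Recognition helices of fingers F1-F4, transcribed from Table 2.
HUMAN_HELICES = ["RSDNLIV", "QNANRNT", "RSDALSR", "NSSNRTV"]
PORCINE_HELICES = ["RSSNLIV", "QNANRNT", "RSSALSR", "NSSNRTV"]

# z1166L target sites, written 3'->5' as in Table 2 (upper-cased).
TARGET_SITES = {
    "human": "TGACAATGTGTACA",
    "murine": "CGATAATGTGTACA",
    "ovine": "CGACAATGTGTACT",
    "porcine": "CGACAACGTGTACT",
}


def helix_substitutions(reference, variant):
    """Return (finger, helix position, old residue, new residue) tuples."""
    substitutions = []
    for finger, (ref, var) in enumerate(zip(reference, variant), start=1):
        for position, old, new in zip(HELIX_POSITIONS, ref, var):
            if old != new:
                substitutions.append((f"F{finger}", position, old, new))
    return substitutions


def site_differences(reference, other):
    """Return (1-based index, reference base, other base) for mismatches."""
    return [
        (index, ref, alt)
        for index, (ref, alt) in enumerate(zip(reference, other), start=1)
        if ref != alt
    ]


for finger, position, old, new in helix_substitutions(HUMAN_HELICES, PORCINE_HELICES):
    print(f"{finger} position {position:+d}: {old} -> {new}")

for species in ("murine", "ovine", "porcine"):
    print(species, site_differences(TARGET_SITES["human"], TARGET_SITES[species]))
```

Run on the transcribed data, the helix comparison reports only the two aspartate-to-serine changes at position +2 of fingers 1 and 3, in agreement with the text.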

The amino acid sequences of zinc finger domains 1 to 4 of porcine z1166L (SEQ ID NO: 6) and z1166R (SEQ ID NO: 5) are shown in Table 3. Full amino acid sequences of the ZFN comprising porcine z1166L ZFP (SEQ ID NO: 8) or z1166R ZFP (SEQ ID NO: 7) with obligate dimer FokI nuclease domains are also illustrated in Table 3 [SEQ ID NO: 6 (porcine z1166L ZFN - amino acid sequence N-C) and SEQ ID NO: 5 (z1166R ZFN - amino acid sequence N-C) respectively]. ZFPs and ZFNs comprising the porcine z1166L ZFP sequence or the z1166R ZFP sequence can be constructed as previously described.

SEQ ID NO: 6 (porcine z1166L ZFN - amino acid sequence N-C)

MAIPKKKRKV[AEERP

FQCRICMRNFSRSSNLIVHIRTHTGEKP

FACDICGRKFAQNANRNTHTKIHTGSERP

FQCRICMRNFSRSSALSRHIRTHTGEKP

FACDICGRKFANSSNRTVHTKIHQNKKQ

lvkseleekkselrli kyvpheyielieiamstqdrilemkvm

gspidygvivdtkaysggynlpigqademqryvkenqtrnkhinpnewwkvypssvtefkflfvsghfkgny kaqltrlnhktncngavlsveelliggemikagtltleevrrkfnngeinf*

SEQ ID NO: 5 (z1166R ZFN - amino acid sequence N-C)

MA|PKKKRKV[AEERP

FQCRICMRNFSRSDTLSRHIRTHTGEKP

FACDICGRKFARKDARINHTKIHTGSERP

FQCRICMRNFSRSSHLSTHIRTHTGEKP

FACDICGRKFAKSDNRTTHTKIHQNKKQ

lvkseleekkselrliklkyvpheyielieiamstqdrilemkvm

gspidygvivdtkaysggynlpigqademeryveenqtrnkhlnpnewwkvypssvtefkflfVsghfkgny

kaqltrlnhitncngavlsveelliggemikagtltleevrrkfnngeinf*

Porcine z1166L ZFP (SEQ ID NO: 8)

FQCRICMRNFSRSSNLIVHIRTHTGEKP FACDICGRKFAQNANRNTHTKIHTGSERP FQCRICMRNFSRSSALSRHIRTHTGEKP FACDICGRKFANSSNRTVHTKIH

z1166R ZFP (SEQ ID NO: 7)

FQCRICMRNFSRSDTLSRHIRTHTGEKP FACDICGRKFARKDARINHTKIHTGSERP FQCRICMRNFSRSSHLSTHIRTHTGEKP FACDICGRKFAKSDNRTTHTKIH

Key:

a. ATG (nucleic acid sequences) = start codon

b. PKKKRKV (peptide sequences) = SV40 nuclear localisation signal

c. ZFP peptide sequence α-helices underlined from position -1 to position +6

d. [ACTAGT] = SpeI site

e. FokI sequences for obligate dimer shown (lower case in polynucleotide and peptide sequences); although normal FokI sequences could be substituted according to preference

f. TAA = stop codon

Table 3: SEQ ID NO: 6 is the ZFN peptidic sequence for porcine z1166L for targeting porcine and/or ovine and/or murine p53 gene sequences. These sequences comprise the sequence of zinc finger domains 1 to 4 of porcine z1166L (SEQ ID NO: 8); and the FokI obligate dimer nuclease domain sequences (Miller et al. (2007) Nat. Biotech., PMID 17603475). Alternatively, a standard FokI domain could be used in place of the obligate dimer sequence.

SEQ ID NO: 5 is the ZFN peptidic sequence for z1166R for targeting porcine and/or ovine and/or murine and/or human p53 gene sequences. These sequences comprise the sequence of zinc finger domains 1 to 4 of porcine z1166R (SEQ ID NO: 7); and the FokI obligate dimer nuclease domain sequences (Miller et al. (2007) Nat. Biotech., PMID 17603475). Alternatively, a standard FokI domain could be used in place of the obligate dimer sequence.
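As a cross-check between Table 2 and Table 3, the recognition helices can be pulled out of the ZFP sequence with a pattern for the classical Cys2-His2 finger. The sketch below is illustrative only: the sequence string is the porcine z1166L ZFP as listed above, with apparent character-recognition errors corrected against the helices given in Table 2, and the regular expression and function name are assumptions introduced for this example.

```python
import re

# Porcine z1166L ZFP (SEQ ID NO: 8), fingers 1-4 concatenated.
# Transcribed from Table 3; treat the exact string as an assumption.
ZFP_Z1166L_PORCINE = (
    "FQCRICMRNFSRSSNLIVHIRTHTGEKP"
    "FACDICGRKFAQNANRNTHTKIHTGSERP"
    "FQCRICMRNFSRSSALSRHIRTHTGEKP"
    "FACDICGRKFANSSNRTVHTKIH"
)

# Cys-X2-Cys, five residues, the 7-residue helix (positions -1 to +6),
# then His-X3-His. This spacing fits the four fingers shown here.
FINGER_PATTERN = re.compile(r"C.{2}C.{5}(.{7})H.{3}H")


def recognition_helices(zfp_sequence):
    """Return the 7-residue recognition helix of each finger, N to C."""
    return FINGER_PATTERN.findall(zfp_sequence)


for number, helix in enumerate(recognition_helices(ZFP_Z1166L_PORCINE), start=1):
    print(f"F{number}: {helix}")
```

The helices recovered this way can be compared directly with the row of helices listed for Z1166L in Table 2.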

Although the z1166L binding sites are not identical between porcine, ovine and murine p53 gene sequences, the modified porcine/ovine/murine z1166L ZFP and corresponding ZFNs (also including z1166R ZFN, constructed similarly to that described in the Examples above) were tested in vitro and found to be active against nucleic acids having either porcine, ovine or murine gene sequences in the z1166 region.

Example 2: In vitro DNA cleavage of porcine p53 target site.

To validate the yeast one-hybrid assay-based protein-DNA interactions, ZFP genes were recovered from yeast colonies by PCR, while introducing a T7 promoter for subsequent expression. After a PCR-based fusion to the FokI domain (described in Example 3), the full length ZFN candidates were expressed in vitro and tested for specific cutting activity in an in vitro cleavage assay.

In vitro DNA cleavage assay

A 154 bp linear target DNA was amplified by PCR from pTARGET plasmids with primers Target seqL1 (5'-accagttggtctggtgtcaa-3') and Target seqR1 (5'-ctgaacttgtggccgtttac-3'), corresponding respectively to SEQ ID NO: 53 and SEQ ID NO: 54. DNA templates for in vitro expression of ZFNs were amplified by PCR from ZFN expression vectors using primers T7kozak_FWD (5'-tcgagtaatacgactcactatagggagaaacaccatagattgccatggccgagcgccccttc-3'), corresponding to SEQ ID NO: 55, and UnivQQR R NotI (5'-aaggaaaaaagcggccgcaaaaggaaaaggatcctcattaaaagtttatc-3'), corresponding to SEQ ID NO: 56. All PCR reactions were purified with the QiaQuick PCR Purification Kit (Qiagen).
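As a quick sanity check on the two target-amplification primers quoted above, the following minimal Python sketch computes their length, GC fraction and a rough melting temperature. The function name `primer_stats` is hypothetical, and the Wallace rule (2°C per A/T, 4°C per G/C) is an assumption introduced here for illustration only; the source does not state how the primers were designed or evaluated.

```python
def primer_stats(seq: str) -> dict:
    """Return length, GC fraction and a Wallace-rule Tm estimate for a DNA primer.

    The Wallace rule is a rough approximation for short oligonucleotides;
    it is used here only as an illustrative assumption.
    """
    s = seq.upper()
    if set(s) - set("ACGT"):
        raise ValueError(f"unexpected base in {seq!r}")
    gc = s.count("G") + s.count("C")
    at = len(s) - gc
    return {
        "length": len(s),
        "gc_fraction": gc / len(s),
        "wallace_tm_c": 2 * at + 4 * gc,
    }


# Primer sequences as quoted in the text (SEQ ID NO: 53 and 54).
TARGET_PRIMERS = {
    "SEQ ID NO: 53": "accagttggtctggtgtcaa",
    "SEQ ID NO: 54": "ctgaacttgtggccgtttac",
}

for name, seq in TARGET_PRIMERS.items():
    print(name, primer_stats(seq))
```

Both primers are 20 nucleotides long with matched GC content, which is consistent with their use as a pair in a single PCR.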

The ZFNs were expressed from 100 ng T7-PCR templates, using the TnT-Quick coupled transcription-translation system (Promega), according to the manufacturer's instructions, except that ZnCl2 was added to a final concentration of 4 mM.

To analyse ZFN activities, TnT reactions were mixed with 200 ng of target DNA and diluted with FokI cleavage buffer to a final concentration of 20 mM Bis-Tris-propane pH 7.0, 100 mM NaCl, 5 mM MgCl2, 0.1 mM ZnCl2, 5 mM DTT, 1.8% (vol/vol) glycerol, 20 µg/ml poly-d(I-C) and 0.1 mg/ml BSA. After incubation for 5 to 6 hrs at 30°C, the reactions were cleaned with a PCR Purification Kit (Qiagen) or, alternatively, were treated with 1 ml RNase A (10 mg/ml, Qiagen) for 30 minutes at 37°C and 1 ml Proteinase K (20 mg/ml, Qiagen) for 1 hour at 37°C. Samples were then analysed on a 1.8% agarose gel, with ethidium bromide staining.

From the results, four-finger anti-p53 ZFNs that bind and cleave their palindromic target sites efficiently in vitro were identified (z1166R ZFN and z1166L ZFN). The ZFNs were similarly tested in homo- and heterodimer pairs, for cleaving either palindromic test sites or full DNA target sites (in the configuration shown in Figure 3A; L=left; R=right). The L/L and R/R homodimers cut only their respective palindromic target sites and not the unspecific targets (see Figure 11B; left panel). Conversely, the full heterodimer target was cleaved only by the correct combination of z1166L/R ZFNs, showing that a full pair of ZFNs was required for cutting the desired heterodimeric target site (see Figure 11B; right panel). The full range of target sites recognised by the rationally engineered porcine heterodimeric zinc finger nuclease is found within SEQ ID NO: 66 (Figure 11C).

Example 3: Zinc finger nuclease vectorization for GFP NHEJ-induced mutagenesis.

In order to introduce the ZFNs into a cell and generate specific DNA double-strand breaks, non-integrating lentiviral vectors were used. The first step involved the production of non-integrating lentiviral vectors bearing expression cassettes of the ZFNs.

First, expression cassettes for a GFP-recognizing pair of ZFNs were designed. These expression cassettes were introduced into a lentiviral vector plasmid, giving rise to plasmids Pt132 (Pt-CMV-ZFN3-WPRE) and Pt133 (Pt-CMV-ZFN4-WPRE) of SEQ ID NO: 11 and SEQ ID NO: 12 respectively (Figure 7).

To produce the non-integrating lentiviral vectors, HEK293T cells were co-transfected with:

- the lentiviral plasmid Pt132 or Pt133 (SEQ ID NO: 11 or 12) (Figure 7), with

- a transcomplementation plasmid (SEQ ID NO: 13, 14 or 15) bearing a mutation in the lentivirus integrase coding sequence (corresponding respectively to the D64V, LQ and N mutations of the integrase coding sequence) (Figure 5), and

- an envelope plasmid (SEQ ID NO: 16) allowing the expression of the VSV glycoprotein or envelope (Figure 6).

After co-transfection, the cell medium was replaced. Later, the medium containing the lentiviral vectors was harvested, filtered, treated with DNase I and ultracentrifuged in order to concentrate the vectors. Pellets were resuspended in cold PBS. The resulting concentrated vector batches (NILV-ZFN3 and NILV-ZFN4) were then aliquoted in small volumes and frozen at -80°C until use.

To test the efficiency of these vectors to introduce the ZFNs into a cell and induce NHEJ-mediated inactivation of the GFP gene, a HeLa cell clone containing a unique copy of the GFP cDNA under the control of a CMV promoter (HeLa clone H11) was used.

H11 HeLa cells were treated with various vectors:

- No vectors, mock control (Control group)

- D64V NILV-ZFN3 and D64V NILV-ZFN4 (Experimental group 1)

- LQ NILV-ZFN3 and LQ NILV-ZFN4 (Experimental group 2)

Cells were seeded at a density of 20,000 cells per well in 24-well plates. The cells were transduced 72 h later in 200 µl of medium supplemented with 1 µM DEAE-Dextran. Contact with the vectors was allowed for six hours, then the medium was removed and replaced with fresh medium.

Seven days after transduction, cells of each group were harvested for analysis of GFP expression by FACS. The control group showed a similar level of GFP expression. In contrast, the experimental groups that were transduced with both vectors showed a very strong decrease in GFP+ cells, of up to 80%. This suggested that, in these groups, the ZFNs were able to efficiently cut the GFP gene and induce its inactivation by NHEJ (Figure 8). Moreover, this extinction of GFP was stable over time (up to 85 days, the last time point checked) (Figure 9).

Secondly, the initial results were confirmed under optimized conditions (doses, types of vector). For this purpose, more controls were used and the experiment was performed in triplicate.

H11 HeLa cells were treated with:

- No vectors, mock control (Control group 1)

- Only D64V NILV-ZFN3 vector (Control group 2)

- Only D64V NILV-ZFN4 vector (Control group 3)

- D64V NILV-ZFN3 and D64V NILV-ZFN4 (Experimental group)

The cells were transduced under the same conditions as in the first experiment. Seven days after transduction, cells of each group were harvested for analysis of GFP expression by FACS. All control groups showed a similar level of GFP expression. In contrast, the experimental group that was transduced with both vectors showed a very strong decrease in GFP+ cells. These results confirm the first observations and suggest the specificity and efficacy of the system (Figure 10).

To verify this, genomic DNA from experimental group cells was extracted and sequenced. Genomic DNA was prepared by proteinase K digestion in lysis buffer followed by phenol/chloroform extraction. GFP PCR was performed on the genomic DNA with the primers GFP forward and GFP reverse of SEQ ID NO: 64 and 65. PCR products were cloned into the pGEM-T Easy plasmid and sequenced with T7 and Sp6 primers. Most of the clones showed genetic modifications resulting from the NHEJ repair mechanism, as expected (see Table 4 below).

Table 4: NHEJ-mediated mutation of the target GFP sequence induced by the GFP ZFN pair (ZFN3 and ZFN4). All untreated HeLa H11 cells display the wild-type TTCAAG sequence (SEQ ID NO: 57). Clones that do not express GFP, obtained by transduction of HeLa H11 cells with ZFN3 and ZFN4, have a mutated target sequence (SEQ ID NO: 58 to 61 and SEQ ID NO: 104).

Example 4: Zinc finger nuclease vectorization for p53 NHEJ-induced mutagenesis in mouse cells.

Because the ZFNs were designed to recognize a conserved sequence in pig and mouse, the same constructs can be used in both species. In order to validate the efficacy of the designed ZFNs to induce NHEJ-mediated mutagenesis of genomic p53, said ZFNs were first tested in mouse cells.

First, expression cassettes were designed for the porcine p53-recognizing pair of ZFNs. These expression cassettes were introduced into a lentiviral vector plasmid, giving rise to plasmids Pt160 (Ptrip-CMV-z1166L-WPRE) and Pt161 (Ptrip-CMV-z1166R-WPRE) of SEQ ID NO: 10 and SEQ ID NO: 9 respectively.

To produce the non-integrating lentiviral vectors, HEK293T cells were co-transfected with

- the lentiviral plasmid Pt160 or Pt161 (SEQ ID NO: 10 or 9) (Figure 4), with

- a transcomplementation plasmid (SEQ ID NO: 13) bearing a mutation in the lentivirus integrase coding sequence (D64V mutation of the integrase coding sequence) (Figure 5) and

- an envelope plasmid (SEQ ID NO: 16) allowing the expression of the VSV glycoprotein or envelope (Figure 6).

After co-transfection, the cell medium was replaced. Later, the medium containing the lentiviral vectors was harvested, filtered, treated with DNase I and ultracentrifuged in order to concentrate the vectors. Pellets were resuspended in cold PBS. The resulting concentrated vector batches (NILV-z1166R and NILV-z1166L) were then aliquoted in small volumes and frozen at -80°C until use.

Mouse N2A cells (neuronal cell line) were treated with various vectors:

- Control group 1: no vectors (mock control)

- Control group 2: only NILV-z1166R vector (D64V)

- Control group 3: only NILV-z1166L vector (D64V)

- Experimental group: both NILV-z1166R and NILV-z1166L vectors

The cells were transduced under the same conditions as in the GFP ZFN experiment. Seven days after transduction, genomic DNA from the cells of each group was extracted and sequenced in the same way as in example 3. PCR used murine-specific primers designed against the sequence of the p53 gene, Murine P53 forward (SEQ ID NO: 17) and Murine P53 reverse (SEQ ID NO: 18).

Genetic modification of the p53 genomic sequence could be observed only in the experimental group (transduced with both NILV-z1166R and NILV-z1166L vectors), confirming the functionality of the ZFNs to induce NHEJ-mediated mutagenesis at the genomic p53 locus.

Example 5: Microinjection of vectorized p53-ZFN in pig embryos and reimplantation to generate p53 genetically modified pigs.

Pig embryos were microinjected with the pair of ZFNs recognizing the porcine p53 gene, using a pair of non-integrating lentiviral vectors.

First, expression cassettes were designed for the porcine p53-recognizing pair of ZFNs. These expression cassettes were introduced into a lentiviral vector plasmid, giving rise to plasmids Pt160 (Ptrip-CMV-z1166L-WPRE) and Pt161 (Ptrip-CMV-z1166R-WPRE) of SEQ ID NO: 10 and SEQ ID NO: 9 respectively (Figure 4).

To produce the non-integrating lentiviral vectors, HEK293T cells were co-transfected with

- the lentiviral plasmid Pt160 or Pt161 (SEQ ID NO: 10 or 9) (Figure 4), with

- a transcomplementation plasmid (SEQ ID NO: 13) bearing a mutation in the lentivirus integrase coding sequence (D64V mutation of the integrase coding sequence) (Figure 5) and

- an envelope plasmid (SEQ ID NO: 16) allowing the expression of the VSV glycoprotein or envelope (Figure 6).

After co-transfection, the cell medium was replaced. Later, the medium containing the lentiviral vectors was harvested, filtered, treated with DNase I and ultracentrifuged in order to concentrate the vectors. Pellets were resuspended in cold PBS. The resulting concentrated vector batches (NILV-z1166R and NILV-z1166L) were then aliquoted in small volumes and frozen at -80°C until use.

Embryos were produced from Large White gilts that were approximately 9 months of age and weighed at least 120 kg at time of use. Super-ovulation was achieved by feeding, between day 11 and 15 following an observed oestrus, 20 mg altrenogest (Regumate, Hoechst Roussel Vet. Ltd., Milton Keynes, UK) once daily for 4 days and 20 mg altrenogest twice on the fifth day. On the sixth day, 1500 international units (IU) of eCG (PMSG, Intervet UK Ltd, Cambridge, UK) were injected at 8:00 P.M. Eighty-three hours later, 750 IU hCG (Chorulon, Intervet UK Ltd, Cambridge, UK) were injected. Donor gilts were inseminated twice, 6 h apart, after exhibiting heat following super-ovulation. Recipient females were treated identically to donor gilts but remained un-mated.

Embryos were surgically recovered from mated donors by mid-line laparotomy under general anesthesia on day 1 following oestrus (heat = oestrus, day 0).

Embryos were injected with the non-integrating lentiviral vectors (NILV-z1166R and NILV-z1166L) by sub-zonal injection into the perivitelline space using fine glass needles under an inverted microscope. Immediately following treatment, fertilized embryos were transferred to recipient gilts following a mid-line laparotomy under general anesthesia. During surgery, the reproductive tract was exposed and embryos were transferred into the oviduct of recipients using a 3.5 French gauge tomcat catheter.

Animals investigated in this study were hemizygous male and female transgenic pigs and non-transgenic (littermate) control animals. All animal experiments were carried out following ethical review and conducted under The Animal (Scientific Procedures) Act 1986 (UK).

Example 6: Genetically modified piglets selection by genotyping

Genomic DNA was prepared from ear clips by proteinase K digestion in lysis buffer followed by phenol/chloroform extraction. Offspring were genotyped by PCR using specific primers of SEQ ID NO: 19 and 20 designed against the sequence of the porcine p53 gene from SEQ ID NO: 2.

PCR products were sequenced to identify animals bearing mutations in the p53 gene at or around the cleavage site of sequence 5'-CAACTT-3' (SEQ ID NO: 62) due to the imperfect NHEJ repair mechanism induced by the ZFNs.

Example 7: Zinc finger nuclease vectorization for p53 NHEJ-induced mutagenesis in pig cells.

Because the ZFNs were designed to recognize a conserved sequence in pig and mouse, the same constructs can be used in both species. In order to validate the efficacy of the ZFNs to induce NHEJ-mediated mutagenesis of genomic p53, said ZFNs were also tested in pig cells. Lentiviral vectors expressing z1166L and z1166R were produced as described in example 4.

Different doses of these vectors were used to transduce several pig cell lines, such as SK-RST and PEF (Pig embryonic fibroblasts) (same conditions as described in example 3). Cells were grown for 5 to 10 days and genomic DNA was extracted, as described in example 3. Control untransduced cells were also used.

Genomic DNA from transduced and untransduced cells was subjected to PCR amplification using primers specific for the pig p53 sequence (SEQ ID NO: 19 and SEQ ID NO: 20). PCR products were then either subcloned in pGEM-T Easy for sequencing, as described in example 3, or subjected to melting (95°C) and re-annealing (95°C to 85°C at -2°C/sec; 85°C to 25°C at -0.1°C/sec) before T7 nuclease digestion (10 µl PCR product, 10 U T7 nuclease, 1.5 hours).
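The duration of the two-stage cooling ramp used before T7 nuclease digestion follows directly from the temperature span and the ramp rate of each stage. The short Python sketch below works through that arithmetic; it assumes the slow stage runs at 0.1°C per second, and the function name `ramp_seconds` is hypothetical.

```python
def ramp_seconds(start_c: float, end_c: float, rate_c_per_s: float) -> float:
    """Time in seconds to cool from start_c to end_c at a constant rate (°C/s)."""
    if rate_c_per_s <= 0:
        raise ValueError("rate must be positive")
    if end_c > start_c:
        raise ValueError("this helper only models cooling ramps")
    return (start_c - end_c) / rate_c_per_s


# (start °C, end °C, cooling rate in °C/s); the 0.1 °C/s value is an assumption.
STEPS = [(95.0, 85.0, 2.0), (85.0, 25.0, 0.1)]

durations = [ramp_seconds(*step) for step in STEPS]
total = sum(durations)

for (start, end, rate), seconds in zip(STEPS, durations):
    print(f"{start:.0f} -> {end:.0f} °C at {rate} °C/s: {seconds:.0f} s")
print(f"total ramp time: {total:.0f} s (about {total / 60:.0f} min)")
```

Under these assumptions the fast stage takes a few seconds and the slow stage dominates, so the re-annealing step adds roughly ten minutes to the protocol.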

Analysis of the p53 sequence shows genomic modification only in cells treated with both z1166L and z1166R (see SEQ ID NO: 73 to 80, compare with SEQ ID NO: 72).

Similarly, the T7 digestion of PCR products shows mismatches only in cells treated with both z1166L and z1166R (see Figure 14).
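The comparison of sequenced clones against the unmodified reference can be illustrated with a small Python sketch that reports insertions, deletions or substitutions lying near the cleavage site. This is only an illustration using the standard-library `difflib` module, not the analysis actually used in the examples; the function name `edits_near_site`, the window size and both toy sequences are hypothetical, and only the 5'-CAACTT-3' cleavage site is taken from the text.

```python
from difflib import SequenceMatcher


def edits_near_site(reference: str, read: str, site: str, window: int = 10):
    """List (tag, reference_segment, read_segment) edits close to the cleavage site.

    An empty list means the read matches the reference around the site.
    """
    start = reference.find(site)
    if start < 0:
        raise ValueError("cleavage site not found in reference")
    lo = start - window
    hi = start + len(site) + window
    matcher = SequenceMatcher(None, reference, read, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Keep only edits whose reference coordinates overlap the window.
        if i1 <= hi and i2 >= lo:
            edits.append((tag, reference[i1:i2], read[j1:j2]))
    return edits


# Hypothetical toy sequences, NOT taken from the sequence listing.
REFERENCE = "TTGACGGTACCAACTTGGATCCGATAG"
WILD_TYPE_READ = REFERENCE
DELETION_READ = "TTGACGGTACCTGGATCCGATAG"  # four bases removed inside CAACTT

print(edits_near_site(REFERENCE, WILD_TYPE_READ, "CAACTT"))
print(edits_near_site(REFERENCE, DELETION_READ, "CAACTT"))
```

A real analysis would align full-length Sanger reads and account for sequencing errors; the sketch only shows the principle of flagging clones whose sequence differs from the reference at or around the cleavage site.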

Example 8: Design of transcription activator-like effector nuclease against porcine p53 target site and vectorization system.

Transcription activator-like effectors were first rationally designed against the pig p53 genomic DNA sequence and were then fused to FokI domains.

The molecular architecture of the synthetic TALENs is derived from the TALE PthXo1 from the rice pathogen X. oryzae pv. oryzae (Yang et al. 2006).

The target site on the genomic DNA is shown in Figure 12C with the recognition site of the left TALEN (TALEN1): CGTCCCCCAGGTCGGCTCT (SEQ ID NO: 105) and the recognition site of the right TALEN (TALEN2): ACACATGAAGTTGTAGT (SEQ ID NO: 106).

The amino acid sequences of the repeats responsible for DNA recognition were designed to recognize specifically SEQ ID NO: 105 and SEQ ID NO: 106, as shown in Table 5 below.

Module      Sequence                              SEQ ID NO   Target nucleotide

TALEN1
Module 1    LTPAQVVAIASHDGGKQALETVQRLLPVLCQDHG    113         C
Module 2    LTPDQVVAIASNNGGKQALETVQRLLPVLCQDHG    114         G
Module 3    LTPDQVVAIASNGGGKQALETVQRLLPVLCQDHG    115         T
Module 4    LTPDQVVAIASHDGGKQALETVQRLLPVLCQDHG    113         C
Module 5    LTPDQVVAIASHDGGKQALETVQRLLPVLCQDHG    113         C
Module 6    LTPDQVVAIASHDGGKQALETVQRLLPVLCQDHG    113         C
Module 7    LTPDQVVAIASHDGGKQALETVQRLLPVLCQDHG    113         C
Module 8    LTPDQVVAIASHDGGKQALETVQRLLPVLCQDHG    113         C
Module 9    LTPDQVVAIASNIGGKQALETVQRLLPVLCQDHG    116         A
Module 10   LTPDQVVAIASNNGGKQALETVQRLLPVLCQDHG    114         G
Module 11   LTPDQVVAIASNNGGKQALETVQRLLPVLCQDHG    114         G
Module 12   LTPDQVVAIASNGGGKQALETVQRLLPVLCQDHG    115         T
Module 13   LTPDQVVAIASHDGGKQALETVQRLLPVLCQDHG    113         C
Module 14   LTPDQVVAIASNNGGKQALETVQRLLPVLCQDHG    114         G
Module 15   LTPDQVVAIASNNGGKQALETVQRLLPVLCQDHG    114         G
Module 16   LTPDQVVAIASHDGGKQALETVQRLLPVLCQDHG    113         C
Module 17   LTPDQVVAIASNGGGKQALETVQRLLPVLCQDHG    115         T
Module 18   LTPDQVVAIASHDGGKQALETVQRLLPVLCQDHG    113         C
Module 19   LTPDQVVAIASNGGGKQALE                  117         T

TALEN2
Module 1    LTPAQVVAIASNIGGKQALETVQRLLPVLCQAHG    116         A
Module 2    LTPAQVVAIASHDGGKQALETMQRLLPVLCQAHG    113         C
Module 3    LPPDQVVAIASNIGGKQALETVQRLLPVLCQAHG    116         A
Module 4    LTPDQVVAIASHDGGKQALETVQRLLPVLCQAHG    113         C
Module 5    LTPDQVVAIASNIGGKQALETVQRLLPVLCQAHG    116         A
Module 6    LTPDQVVAIASNGGGKQALETVQRLLPVLCQAHG    115         T
Module 7    LTPDQVVAIASNNGGKQALETVQRLLPVLCQAHG    114         G
Module 8    LTPDQVVAIASNIGGKQALETVQRLLPVLCQTHG    116         A
Module 9    LTPAQVVAIASNIGGKQALETVQQLLPVLCQAHG    116         A
Module 10   LTPDQVVAIASNNGGKQALATVQRLLPVLCQAHG    114         G
Module 11   LTPDQVVAIASNGGGKQALETVQRLLPVLCQAHG    115         T
Module 12   LTPDQVVAIASNGGGKQALETVQRLLPVLCQAHG    115         T
Module 13   LTQVQVVAIASNNGGKQALETVQRLLPVLCQAHG    114         G
Module 14   LTPAQVVAIASNGGGKQALETVQRLLPVLCQAHG    115         T
Module 15   LTPDQVVAIASNIGGKQALETVQRLLPVLCQAHG    116         A
Module 16   LTQEQVVAIASNNGGKQALETVQRLLPVLCQAHG    114         G
Module 17   LTPNQVVAIASNGGGKQALE                  117         T

Table 5: Modules of amino acid repeats constituting the DNA interaction domain of TALEN1 and TALEN2. The TALEN1 DNA binding domain consists of 19 repeats that interact with the corresponding DNA sequence SEQ ID NO: 105 in SEQ ID NO: 2. The TALEN2 DNA binding domain consists of 17 repeats that interact with the corresponding DNA sequence SEQ ID NO: 106 in SEQ ID NO: 2. Repeats are chosen from SEQ ID NO: 113 to 117; repeat variable diresidues are shown in bold and underlined letters.
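The module assignments in Table 5 follow a one-repeat-per-nucleotide scheme in which the repeat variable diresidue (RVD) determines the recognized base: HD for C, NN for G, NG for T and NI for A. A minimal Python sketch of that mapping is given below, applied to the two recognition sites quoted in the text; the names `RVD_FOR_BASE` and `rvds_for_target` are hypothetical and the code is illustrative only, not the design procedure actually used.

```python
from typing import List

# RVD assigned to each target nucleotide, as read from Table 5.
RVD_FOR_BASE = {"C": "HD", "G": "NN", "T": "NG", "A": "NI"}


def rvds_for_target(site: str) -> List[str]:
    """Return one RVD per nucleotide of the recognition site, in 5'->3' order."""
    try:
        return [RVD_FOR_BASE[base] for base in site.upper()]
    except KeyError as err:
        raise ValueError(f"unsupported nucleotide: {err.args[0]!r}") from None


TALEN1_SITE = "CGTCCCCCAGGTCGGCTCT"  # SEQ ID NO: 105
TALEN2_SITE = "ACACATGAAGTTGTAGT"    # SEQ ID NO: 106

for name, site in (("TALEN1", TALEN1_SITE), ("TALEN2", TALEN2_SITE)):
    rvds = rvds_for_target(site)
    print(name, len(rvds), "repeats:", " ".join(rvds))
```

The repeat counts produced by this mapping (19 for TALEN1 and 17 for TALEN2) agree with the module counts stated in the table caption.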

The resulting TALE1 (SEQ ID NO: 109) and TALE2 (SEQ ID NO: 110) were then fused to FokI nuclease domains to generate TALEN1 (SEQ ID NO: 111) and TALEN2 (SEQ ID NO: 112). Corresponding nucleotide sequences encoding said peptides were synthesized and introduced into a lentiviral vector plasmid under the transcriptional control of a CMV promoter to generate plasmids Pt205 (SEQ ID NO: 70, Figure 13A) and Pt206 (SEQ ID NO: 71, Figure 13B).

Example 9: TAL effector nuclease vectorization for p53 NHEJ-induced mutagenesis in pig cells.

In order to validate the efficacy of the TALENs to induce NHEJ-mediated mutagenesis of genomic p53, said TALENs were tested in pig cells. Lentiviral vectors expressing TALEN1 and TALEN2 were produced as described in example 4 using respectively vector plasmids Pt205 (SEQ ID NO: 70) and Pt206 (SEQ ID NO: 71).

Different doses of these vectors were used to transduce several pig cell lines, such as SK-RST and PEF (Pig embryonic fibroblasts) (same conditions as described in example 3). Cells were grown for 5 to 10 days and genomic DNA was extracted, as described in example 3. Control untransduced cells were also used. Genomic DNA from transduced and untransduced cells was submitted to PCR amplification using primers specific for the pig p53 sequence (SEQ ID NO: 19 and SEQ ID NO: 20).

PCR products were then either subcloned in pGEM-T Easy for sequencing, as described in example 3, or subjected to melting and re-annealing before T7 nuclease digestion, as described in example 7.

The T7 digestion of PCR products shows mismatches only in cells treated with both TALEN1 and TALEN2 (see Figure 14).

Similarly, analysis of p53 sequences shows genomic modifications only in cells treated with both TALEN1 and TALEN2. Examples of modified sequences obtained are shown in SEQ ID NO: 82 to 92 (compare to SEQ ID NO: 81) and SEQ ID NO: 94 to 102 (compare to SEQ ID NO: 93).

Example 10: Microinjection of vectorized p53-TALEN in pig embryos and reimplantation to generate p53 genetically modified pigs.

Pig embryos are microinjected with the pair of TALENs recognizing the porcine p53 gene, using a pair of non-integrating lentiviral vectors.

First, expression cassettes were designed for the porcine p53-recognizing pair of TALENs. These expression cassettes were introduced into a lentiviral vector plasmid, giving rise to plasmids Pt205 (Ptrip-CMV-TALEN1-WPRE) and Pt206 (Ptrip-CMV-TALEN2-WPRE) of SEQ ID NO: 70 and SEQ ID NO: 71 respectively (Figure 13).

To produce the non-integrating lentiviral vectors, HEK293T cells were co-transfected with

- the lentiviral plasmid Pt205 or Pt206 (SEQ ID NO: 70 or 71) (Figure 13), with

- a transcomplementation plasmid (SEQ ID NO: 13) bearing a mutation in the lentivirus integrase coding sequence (D64V mutation of the integrase coding sequence) (Figure 5) and

- an envelope plasmid (SEQ ID NO: 16) allowing the expression of the VSV glycoprotein or envelope (Figure 6).

After co-transfection, the cell medium was replaced. Later, the medium containing the lentiviral vectors was harvested, filtered, treated with DNase I and ultracentrifuged in order to concentrate the vectors. Pellets were resuspended in cold PBS. The resulting concentrated vector batches (NILV-TALEN1 and NILV-TALEN2) were then aliquoted in small volumes and frozen at -80°C until use.

Embryos are produced from Large White gilts that are approximately 9 months of age and weigh at least 120 kg at time of use. Super-ovulation is achieved by feeding, between day 11 and 15 following an observed oestrus, 20 mg altrenogest (Regumate, Hoechst Roussel Vet. Ltd., Milton Keynes, UK) once daily for 4 days and 20 mg altrenogest twice on the fifth day. On the sixth day, 1500 international units (IU) of eCG (PMSG, Intervet UK Ltd, Cambridge, UK) are injected at 8:00 P.M. Eighty-three hours later, 750 IU hCG (Chorulon, Intervet UK Ltd, Cambridge, UK) are injected.

Donor gilts are inseminated twice, 6 h apart, after exhibiting heat following super-ovulation. Recipient females are treated identically to donor gilts but remain un-mated. Embryos are surgically recovered from mated donors by mid-line laparotomy under general anesthesia on day 1 following oestrus (heat = oestrus, day 0).

Embryos are injected with the non-integrating lentiviral vectors (NILV-TALEN1 and NILV-TALEN2) by sub-zonal injection into the perivitelline space using fine glass needles under an inverted microscope.

Immediately following treatment fertilized embryos are transferred to recipient gilts following a midline laparotomy under general anesthesia. During surgery, the reproductive tract is exposed and embryos are transferred into the oviduct of recipients using a 3.5 French gauge tomcat catheter.

Animals investigated in this study are hemizygous male and female transgenic pigs and non-transgenic (littermate) control animals. All animal experiments are carried out following ethical review and conducted under The Animal (Scientific Procedures) Act 1986 (UK).

Example 11: Genetically modified piglets selection by genotyping

Genomic DNA is prepared from ear clips by proteinase K digestion in lysis buffer followed by phenol/chloroform extraction. Offspring are genotyped by PCR using specific primers of SEQ ID NO: 19 and 20 designed against the sequence of the porcine p53 gene from SEQ ID NO: 2.

PCR products are sequenced to identify animals bearing mutations in the p53 gene at or around the cleavage site 5'-GACTGTACCACCATCC-3' (SEQ ID NO: 107) due to the imperfect NHEJ repair mechanism induced by the TALENs.

CLAIMS

1. A pig carrying a genetically modified nucleic acid sequence in the p53 gene, wherein the modification results from non-homologous end joining (NHEJ) repair mechanism or homologous recombination (HR) following double strand DNA cleavage in a five, six or seven bases sequence of the p53 genomic sequence of SEQ ID NO: 2.

2. The pig of claim 1, wherein the modification is a NHEJ-induced mutation selected from a substitution, a deletion, an addition and any combination thereof.

3. The pig of any one of claims 1 or 2, wherein a zinc-finger-nuclease (ZFN) heterodimer specifically recognizing the pig p53 genomic sequence of SEQ ID NO: 46 and SEQ ID NO: 48 located in SEQ ID NO: 4 is responsible for the double strand DNA cleavage.

4. The pig of claim 3, wherein the zinc-finger-nuclease (ZFN) heterodimer comprises a ZFN of SEQ ID NO: 5 comprising a ZFP of SEQ ID NO: 7 and a ZFN of SEQ ID NO: 6 comprising a ZFP of SEQ ID NO: 8.

5. The pig of claim 3 or 4, wherein the double strand DNA cleavage is in 5'- CAACTT -3' (SEQ ID NO: 62) located in SEQ ID NO: 4 or a polymorphic sequence thereof such as SEQ ID NO: 66.

6. The pig of any one of claims 1 or 2, wherein a transcription activator-like effector nuclease (TALEN) heterodimer specifically recognizing the pig p53 genomic sequence of SEQ ID NO: 105 and SEQ ID NO: 106 located in SEQ ID NO: 67 is responsible for the double strand DNA cleavage.

7. The pig of claim 6, wherein the transcription activator-like effector nuclease (TALEN) heterodimer comprises a TALEN of SEQ ID NO: 111 comprising a transcription activator-like effector (TALE) of SEQ ID NO: 119 and a TALEN of SEQ ID NO: 112 comprising a TALE of SEQ ID NO: 120.

8. The pig of claim 6 or 7, wherein the double strand DNA cleavage is in 5'-GACTGTACCACCATCC-3' (SEQ ID NO: 107) located in SEQ ID NO: 67 or a polymorphic sequence thereof such as SEQ ID NO: 118.

9. The pig of any one of claims 1 to 8 which is a proliferative disease, preferably cancer, prone model.

10. A cell or the nucleus of a cell derived from a pig according to any one of claims 1 to 9.

11. The cell of claim 10, wherein said cell is selected from a stem cell, an induced pluripotent stem cell (iPS cell), a germ cell, a gamete, a somatic cell.

12. The cell of claim 10, wherein said cell is a tumor cell.

13. A pig, a population of cells or a cell line derived from a cell according to claim 10.

14. A pig, a population of cells or a cell line derived from a cell according to claim 12.

15. A fertilized egg, a zygote, a morula, a blastocyst, an embryo, or a fetus derived from the pig model according to claim 9.

16. Use of a pig according to any one of claims 1 to 9, 13 and 14, or of a cell according to claim 10, for the evaluation of the ability of a compound to induce, prevent or treat a cancer or for the screening of a compound for inducing, preventing or treating a cancer.

17. Use of a cell according to any one of claims 10 to 12 or of a population of cells according to claim 13 or 14, for the in vitro or ex vivo evaluation of the ability of a test compound or a physical treatment applied to the animal body to induce, prevent or treat a cancer or for the screening of a compound for inducing, preventing or treating a cancer.

18. Use of a cell according to any one of claims 10 to 12 or of a population of cells according to claim 13 or 14, for the in vitro or ex vivo identification of a biomarker usable to diagnose a cancer or to identify a compound for preventing or treating a cancer.

