| CPC C40B 40/10 (2013.01) [C12N 15/1037 (2013.01); C12N 15/81 (2013.01); C40B 40/08 (2013.01); C40B 50/06 (2013.01)] | 12 Claims |

|
1. A method for producing a VHH library, comprising:
(a) providing a first plurality of nucleic acids encoding amino acid sequences of one or more CDR1 fragments replicated from naturally occurring antibodies;
(b) providing a second plurality of nucleic acids encoding amino acid sequences of one or more CDR2 fragments replicated from naturally occurring antibodies;
(c) providing a third plurality of nucleic acids encoding amino acid sequences of one or more CDR3 fragments replicated from naturally occurring antibodies;
(d) providing a nucleic acid gene encoding amino acid sequences of a common VHH domain comprising a framework region 1, a CDR1 region, a framework region 2, a CDR2 region, a framework region 3, a CDR3 region, and a framework region 4; and
(e) assembling the first plurality of nucleic acids, the second plurality of nucleic acids, and the third plurality of nucleic acids into the CDR1 region, the CDR2 region, and the CDR3 region, respectively, of the common VHH domain, thereby producing a population of nucleic acids encoding a VHH domain library;
the VHH domain library comprising:
a plurality of nucleic acids encoding a population of VHH domains comprising one or more CDR1s, one or more CDR2s, and one or more CDR3s located at the CDR1 region, the CDR2 region, and the CDR3 region of a VHH gene, respectively;
wherein the nucleic acid sequences encoding the amino acid sequences of the one or more CDR1s and the one or more CDR2s are from naturally occurring antibodies of a mammalian species;
wherein at least 90% of the one or more CDR1s and at least 90% of the one or more CDR2s are free of amino acid sequence liabilities, wherein the amino acid sequence liabilities are: (i) a glycosylation site comprising the motif NXS, NXT, or NXC, in which X represents any naturally occurring amino acid residue except for proline; (ii) a deamidation site comprising the motif of NG, NS, NT, NN, NA, NH, ND, NQ, NF, NW or NY; (iii) an isomerization site comprising the motif of DT, DH, DS, DG, DN, DR, DY or DD; (iv) any cysteines; (v) net charge greater than 1; (vi) a tripeptide motif containing at least two residues with aromatic side chains comprising F, H, W or Y; (vii) a polyspecificity site comprising the motif GG, GGG, RR, VG, W, WV, WW, WWW, YY, or WXW, in which X represents any amino acid residue; (viii) a protease sensitive or hydrolysis prone site comprising the motif of DX, in which X is P, G, S, V, Y, F, Q, K, L, or D; (ix) an integrin binding site comprising RGD, RYD, LDV, or KGD; (x) a lysine glycation site comprising KE, EK, or ED; (xi) a metal catalyzed fragmentation site comprising the motif of HS, SH, KT, HXS, or SXH, in which X represents any amino acid residue; (xii) a polyspecificity aggregation site comprising a motif of X.sub.1X.sub.2X.sub.3, wherein each of X.sub.1, X.sub.2, and X.sub.3 is independently selected from the group consisting of F, I, L, V, W and Y; (xiii) a streptavidin binding motif comprises the motif HPQ, EPDW (SEQ ID NO: 49), PWXWL (SEQ ID NO: 50), in which X represents any amino acid residue, GDWVFI (SEQ ID NO: 51), or PWPWLG (SEQ ID NO: 52); (xiv) one or more arginine residues; (xv) a hydrophobic CDR sequence; and/or (xvi) a CDR mutation that reduces binding to protein A said CDR mutation comprising any mutation in the last amino acid of the CDR2, according to the IMGT definition, to A, G, C, D, E, F, G, H, I, L, M, N, P, Q, S, V, W or Y;
wherein at least 90% of the one or more CDR1s, at least 90% of the one or more CDR2s, and at least 90% of the one or more CDR3s are free of non-functional members; wherein functional members are well folded and can form well folded VHHs;
wherein at least two of the framework regions 1, 2, 3, and 4 are from a partially humanized VH germline sequence or from a llama consensus framework;
wherein each framework region can contain up to five amino acid substitutions, and wherein the nucleic acid sequences encoding the amino acid sequences of the one or more CDR3s are from heavy chain CDR3s of human donor lymphocytes.
|
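For illustration only, the sequence-liability screen recited in claim 1 can be sketched as a motif scan over candidate CDR sequences. The sketch below is not taken from the patent: the names `LIABILITY_PATTERNS`, `find_liabilities`, and `fraction_liability_free` are hypothetical, the example CDR strings are invented, and only five of the sixteen recited liabilities (glycosylation, deamidation, isomerization, cysteine, and integrin binding motifs) are encoded. It assumes CDRs are given as uppercase one-letter amino acid strings.

```python
import re

# Hypothetical, partial encoding of the liabilities recited in claim 1.
# Only items (i), (ii), (iii), (iv) and (ix) are covered here.
LIABILITY_PATTERNS = {
    # (i) NXS, NXT or NXC, where X is any residue except proline
    "glycosylation": re.compile(r"N[^P][STC]"),
    # (ii) NG, NS, NT, NN, NA, NH, ND, NQ, NF, NW or NY
    "deamidation": re.compile(r"N[GSTNAHDQFWY]"),
    # (iii) DT, DH, DS, DG, DN, DR, DY or DD
    "isomerization": re.compile(r"D[THSGNRYD]"),
    # (iv) any cysteine
    "cysteine": re.compile(r"C"),
    # (ix) RGD, RYD, LDV or KGD
    "integrin_binding": re.compile(r"RGD|RYD|LDV|KGD"),
}


def find_liabilities(cdr):
    """Return the names of the encoded liabilities found in one CDR sequence."""
    return [name for name, pattern in LIABILITY_PATTERNS.items()
            if pattern.search(cdr)]


def fraction_liability_free(cdrs):
    """Return the fraction of CDR sequences with none of the encoded liabilities."""
    if not cdrs:
        raise ValueError("at least one CDR sequence is required")
    clean = sum(1 for cdr in cdrs if not find_liabilities(cdr))
    return clean / len(cdrs)


if __name__ == "__main__":
    # Invented example sequences, not taken from the patent.
    examples = ["GFTFSSYA", "GNGSTY", "ARGDSY", "INPSGT"]
    for cdr in examples:
        print(cdr, find_liabilities(cdr))
    print(fraction_liability_free(examples))
```

Under these assumptions, a set of CDR1s or CDR2s would meet the "at least 90%" threshold of the claim when `fraction_liability_free` returns 0.9 or more, although a complete screen would also need the remaining recited liabilities (net charge, aromatic tripeptides, hydrophobicity, and so on), which this sketch omits.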