| CPC C07K 16/464 (2013.01) [C07K 16/465 (2013.01); C40B 50/00 (2013.01); G16B 15/00 (2019.02); G16B 35/00 (2019.02); G16B 35/10 (2019.02); G16B 35/20 (2019.02); G16C 20/60 (2019.02); C07K 2317/24 (2013.01); C07K 2317/565 (2013.01); C07K 2317/567 (2013.01); C40B 40/10 (2013.01)] | 14 Claims |
|
1. A system to produce an in silico population of nucleic acids encoding at least one protein comprising at least one immunoglobulin variable domain having a non-human-derived CDR3 amino acid sequence embedded in essentially human framework sequences, the system comprising:
a computer processor; and
a non-transitory computer-readable medium storing instructions that, when executed by the computer processor, configure the computer processor to:
(a) provide at least one nucleic acid encoding a non-human-derived complementarity determining region 3 (CDR3) amino acid sequence or an amino acid sequence further encompassing 1, 2, or 3 amino acids N-terminal and/or C-terminal of the non-human-derived CDR3 amino acid sequence,
(b) generate the in silico population of nucleic acids encoding at least one protein comprising at least one immunoglobulin variable domain having a non-human CDR3 amino acid sequence of step (a) embedded in essentially human framework sequences, wherein the human framework sequences comprise a first human framework region (FR1), a second human framework region (FR2), a third human framework region (FR3), and a fourth human framework region (FR4), such that the FR1 and FR2 regions are interspaced by a complementarity determining region 1 (CDR1), the FR2 and FR3 regions are interspaced by a complementarity determining region 2 (CDR2), and the FR3 and FR4 regions are interspaced by a non-human-derived CDR3 amino acid sequence,
wherein the nucleic acid sequences encoding the CDR1 and CDR2 amino acid sequences are diversified among the population of nucleic acids encoding at least one protein comprising at least one immunoglobulin variable domain,
wherein each nucleic acid sequence encoding a CDR1 or CDR2 amino acid sequence is independently based
(i) on a nucleic acid sequence encoding a human CDR1 or CDR2, respectively, or
(ii) on a nucleic acid sequence encoding a non-human CDR1 or CDR2, respectively,
wherein at least some of the nucleic acid sequences encoding a CDR1 or CDR2 amino acid sequence have been modified to encode at least one amino acid present in non-human CDR1 or CDR2 amino acid sequences, respectively, in case of human CDR1 or CDR2, respectively, or to encode at least one amino acid present in human CDR1 or CDR2 amino acid sequences, respectively, in case of non-human CDR1 or CDR2, respectively, and
wherein the human FR1, FR2, FR3 and FR4 regions are human framework regions selected to provide a scaffold conducive for non-human CDR3 amino acid sequences,
with the proviso that:
the two C-terminal amino acids of FR2 are optionally non-human, and
the two C-terminal amino acids of FR3 are optionally non-human,
(c) generate a first positional weight matrix (PWM) of amino acid positional variability from naturally-occurring, non-human CDR1s and CDR2s by calculating a first relative frequency of each amino acid at each position of the naturally-occurring, non-human CDR1s and CDR2s;
(d) generate a second PWM of amino acid positional variability from naturally-occurring, human CDR1s or CDR2s by calculating a second relative frequency of amino acids at each position of the naturally-occurring, human CDR1s or CDR2s;
(e) blend the first PWM and the second PWM to produce a blended PWM that provides for amino acid variation observed in both human and non-human CDR1s and CDR2s; and
(f) generate a plurality of scaffold-encoding nucleic acids comprising different CDR1 and different CDR2 and produce an immunoglobulin scaffold suitable for a graft of a non-human CDR3, wherein amino acids of the CDR1 and CDR2, respectively, occur at about their frequencies at their respective positions within the blended PWM.
|
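Steps (c) through (f) of claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation. The claim does not state how the two PWMs are blended, so the linear mixing weight `weight_a` is an assumption, as are the function names `build_pwm`, `blend_pwms`, and `sample_cdr`. The toy CDR1 sequences are invented for illustration, are assumed to be pre-aligned to equal length, and use only the 20 standard amino acid letters. The sketch works at the amino acid level; back-translation to nucleic acid sequences is omitted.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pwm(sequences):
    """Steps (c)/(d): relative frequency of each amino acid at each position."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    pwm = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in sequences)
        total = sum(counts.values())
        pwm.append({aa: counts.get(aa, 0) / total for aa in AMINO_ACIDS})
    return pwm

def blend_pwms(pwm_a, pwm_b, weight_a=0.5):
    """Step (e): linear mix of two PWMs (the mixing rule is an assumption)."""
    if len(pwm_a) != len(pwm_b):
        raise ValueError("PWMs must cover the same number of positions")
    return [
        {aa: weight_a * col_a[aa] + (1.0 - weight_a) * col_b[aa]
         for aa in AMINO_ACIDS}
        for col_a, col_b in zip(pwm_a, pwm_b)
    ]

def sample_cdr(pwm, rng):
    """Step (f): draw one residue per position at its blended frequency."""
    return "".join(
        rng.choices(AMINO_ACIDS, weights=[col[aa] for aa in AMINO_ACIDS])[0]
        for col in pwm
    )

# Invented toy CDR1 sequences, for illustration only.
non_human_cdr1 = ["GFTFSSYA", "GRTFSNYA", "GFTLDYYA"]
human_cdr1 = ["GFTFSSYA", "GFTFSDYA", "GYTFTSYA"]

first_pwm = build_pwm(non_human_cdr1)    # step (c)
second_pwm = build_pwm(human_cdr1)       # step (d)
blended_pwm = blend_pwms(first_pwm, second_pwm)  # step (e)

rng = random.Random(0)  # seeded so repeated runs give the same library
cdr1_library = [sample_cdr(blended_pwm, rng) for _ in range(5)]  # step (f)
```

With a large enough library, the residue frequencies at each position approach those in `blended_pwm`, which corresponds to the "about their frequencies" language of step (f). The same functions would be applied separately to CDR2 sequences.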