Isotopic Labelling of Proteins in a Eukaryotic System

“An Effective Solution for Labelling when Protein Folding is a Challenge”

In this case study, we present data on β₂‑microglobulin (β₂m) and an interleukin that demonstrate our ability to produce isotopically labelled proteins in mammalian HEK293 cells to support NMR structural studies. This allows us to produce proteins for NMR that are not amenable to expression in E. coli.

Current status of labelling proteins for NMR studies

In structural biology, X‑ray crystallography, single‑particle cryo‑EM, and Nuclear Magnetic Resonance (NMR) have been instrumental in revealing the atomic-level structures of macromolecules, including peptides, proteins, glycoproteins, and nucleic acids. Among these approaches, NMR stands out for its unique ability to probe not only structure but also molecular dynamics and interactions in solution under a wide range of physiologically relevant conditions. See our mini review on Protein NMR and its Role in Drug Discovery.

Despite its versatility, NMR-based studies have historically faced a key limitation: the production of isotopically labelled proteins. Many established labelling strategies rely on Escherichia coli expression systems, which are not always suitable—particularly for complex eukaryotic proteins that require specialized folding machinery or post-translational modifications. This challenge is especially relevant for the systems discussed in this case study, which focuses exclusively on protein labelling in eukaryotic hosts.

It is important to note that both unlabelled and isotopically labelled macromolecules can be studied by NMR across a broad size range, from small molecules (~1 kDa) to megadalton complexes. However, isotopic enrichment becomes increasingly critical as molecular size increases. Proteins in the 3–20 kDa range typically only require uniform ¹⁵N or sometimes ¹⁵N/¹³C labelling, while those between 20–50 kDa often benefit from additional deuteration (²H). For larger systems (>50 kDa), extensive deuteration combined with selective ¹³C labelling of methyl groups (e.g., isoleucine, leucine, valine, alanine, methionine, and threonine) is generally necessary to obtain high-quality spectra.
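The size-based rule of thumb above can be written as a small lookup. This is only a sketch of the ranges quoted in the text: the function name `suggest_labelling` is hypothetical, and the treatment of the shared boundaries at exactly 20 and 50 kDa is an assumption, since the text does not say which side they fall on.

```python
def suggest_labelling(mass_kda: float) -> str:
    """Map a molecular mass in kDa to the labelling scheme quoted in the text.

    Assumption: boundary values (20 and 50 kDa) are assigned to the lower range.
    """
    if mass_kda <= 0:
        raise ValueError("mass_kda must be positive")
    if mass_kda < 3:
        # The text gives no specific scheme below 3 kDa.
        return "below the ranges discussed in the text"
    if mass_kda <= 20:
        return "uniform 15N, or sometimes 15N/13C"
    if mass_kda <= 50:
        return "uniform 15N or 15N/13C, often with additional deuteration (2H)"
    return "extensive deuteration plus selective 13C labelling of methyl groups"


# Illustrative masses only; they are not taken from the case study.
for mass in (12, 35, 80):
    print(f"{mass} kDa: {suggest_labelling(mass)}")
```

In practice the choice also depends on the experiment type and the behaviour of the sample, so the output should be read as a starting point rather than a prescription.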

In practice, the incorporation of stable isotopes (¹⁵N, ¹³C, and ²H) is achieved through de novo recombinant protein synthesis in cell culture or cell-free systems. Prokaryotic hosts, particularly E. coli, have been the workhorse of isotopic labelling due to their low cost, rapid growth, and adaptability to isotope-enriched media, including growth in 100% deuterium oxide (²H₂O). These features make them especially well-suited for producing labelled proteins for NMR studies.

Why do we need to label proteins in a eukaryotic system?

However, prokaryotic systems have significant limitations when compared to eukaryotic expression platforms. They often lack the cellular machinery required for proper folding of complex proteins, correct disulfide bond formation, and essential post-translational modifications such as phosphorylation, methylation, and glycosylation. Consequently, many human proteins cannot be reliably expressed in E. coli. To address these challenges, alternative eukaryotic systems—including yeast, insect cells, and mammalian cell lines such as CHO and HEK cells—have been developed. While these systems enable more biologically relevant protein production, they introduce new challenges, particularly in achieving efficient and cost-effective isotopic labelling for NMR applications.

So why is this not done more routinely?

In this context, a limited number of ¹⁵N‑ or ¹⁵N/¹³C‑labelling methodologies have been developed for yeast, insect, and CHO cells, with even fewer available for HEK cells. A major drawback of these approaches is the exceptionally high cost of isotopically enriched media, which typically ranges from approximately 3,500 to 6,500 USD per litre. These costs are often unsustainable for poorly expressed proteins or for high‑throughput NMR screening campaigns in fragment‑based drug discovery (FBDD).
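To see why these media prices weigh most heavily on poorly expressed proteins, the cost per milligram of protein can be estimated by dividing the price per litre by the expression yield. The sketch below uses the price range quoted above; the yields are hypothetical placeholders chosen for illustration, not measured values, and the function name `media_cost_per_mg` is likewise an assumption.

```python
def media_cost_per_mg(cost_per_litre_usd: float, yield_mg_per_litre: float) -> float:
    """Media cost (USD) attributable to each milligram of purified protein."""
    if yield_mg_per_litre <= 0:
        raise ValueError("yield_mg_per_litre must be positive")
    return cost_per_litre_usd / yield_mg_per_litre


# Price range per litre of isotopically enriched medium, as quoted in the text.
LOW_USD, HIGH_USD = 3500.0, 6500.0

# Hypothetical expression yields in mg of protein per litre of culture.
for yield_mg in (0.5, 5.0, 50.0):
    low = media_cost_per_mg(LOW_USD, yield_mg)
    high = media_cost_per_mg(HIGH_USD, yield_mg)
    print(f"{yield_mg:>5} mg/L: {low:,.0f} to {high:,.0f} USD per mg")
```

Under these assumed yields, a tenfold drop in expression raises the media cost per milligram tenfold, which is why a cheaper medium matters most for difficult targets and for screening campaigns that consume a lot of protein.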

Case study: Production of labelled β₂m and an interleukin in HEK293 cells

To support NMR‑based FBDD for our current and future partners, we are pleased to announce the launch of an isotopic labelling service for proteins expressed in the HEK293-6E eukaryotic system. β₂‑microglobulin (β₂m) and one interleukin were used as model proteins to validate our capabilities.

Input from our cell science team was key to the success of this project. The HEK293-6E cell line was adapted to an affordable custom medium before a method for expressing isotopically labelled proteins in this system was developed and optimised. Our in-house expertise with HEK293-6E cells was crucial for developing these new protocols, as achieving expression initially proved challenging.

Both proteins were uniformly enriched with the ¹⁵N isotope and subsequently purified. Sample quality was assessed by 2D-NMR, focusing primarily on the amide resonances. The resulting spectra showed well‑dispersed and well‑defined resonances with good signal‑to‑noise ratios. For β₂m, approximately 86 HN resonances were observed out of 94 expected. For the interleukin, ~102 HN resonances were identified out of 113 expected (Figure 1). Missing resonances for both proteins may be attributed to alternative conformations or post‑translational protein modifications; these aspects will be addressed in a forthcoming post in our NMR-blog series.

**Figure 1.** 2D-NMR spectra of the expressed proteins. β₂m is displayed on the left and the interleukin on the right. The amide resonances are shown in blue.
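The resonance counts reported above can be expressed as coverage percentages with a short calculation. This is a minimal sketch using only the observed and expected HN counts quoted in the text; the helper name `coverage_percent` is hypothetical.

```python
def coverage_percent(observed: int, expected: int) -> float:
    """Percentage of expected amide (HN) resonances that were observed."""
    if expected <= 0 or not 0 <= observed <= expected:
        raise ValueError("need 0 <= observed <= expected and expected > 0")
    return 100.0 * observed / expected


# (observed, expected) HN resonance counts as reported in the text.
counts = {"beta2m": (86, 94), "interleukin": (102, 113)}

for protein, (observed, expected) in counts.items():
    pct = coverage_percent(observed, expected)
    print(f"{protein}: {observed}/{expected} = {pct:.1f}%")
```

Both proteins therefore show roughly 90% of the expected amide resonances in these spectra, bearing in mind that the observed counts are themselves given as approximate.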

To demonstrate the potential of the isotopically labelled material, we selected β₂m for further analysis. Using the reported NMR assignments of β₂m (BMRB: 51097; produced in E. coli), we were able to transfer approximately 96% (91 out of 94 expected) of the HN assignments to our sample (Figure 2).

**Figure 2.** Amide assignment of β₂m, transferred from BMRB: 51097. Unassigned resonances are labelled with *.

We then carried out an NMR-based structural analysis of β₂m. Our findings suggest that β₂m contains three distinct regions (1, 2 and 3) that appear to be less structured than the rest of the protein. These observations correlate well with the β₂m three‑dimensional structure (PDB: 3CIQ), in which regions 1, 2, and 3 are located within the loops connecting the B–C, D1–E, and E–F β‑strands, respectively. In addition, our data suggest that the D2 β‑strand of β₂m may also exhibit reduced structural order. See Figure 3 for additional details.

**Figure 3.** Structural analysis of β₂m in solution by NMR. A. Quantification of protein compactness at residue level. Three distinct unstructured regions are highlighted as 1, 2 and 3. The solid black line represents the moving-average trendline of the data set; missing data can be associated with prolines, overlapped resonances or a low signal-to-noise ratio in the NMR experiments. B. The unstructured regions identified in A are mapped in red on the 3D structure of β₂m (PDB: 3CIQ).
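A moving-average trendline over per-residue data with gaps, as described in the caption, can be sketched as follows. The window size, the use of a centred window, and the choice to skip missing residues rather than interpolate them are all assumptions, because the caption does not specify how the trendline was computed. The input values are made up for illustration.

```python
from typing import List, Optional, Sequence


def moving_average(values: Sequence[Optional[float]], window: int = 5) -> List[Optional[float]]:
    """Centred moving average that ignores missing (None) entries.

    Assumption: the window is truncated at both ends of the sequence, and a
    position is reported as None only if every value in its window is missing.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed: List[Optional[float]] = []
    for i in range(len(values)):
        start = max(0, i - half)
        stop = min(len(values), i + half + 1)
        present = [v for v in values[start:stop] if v is not None]
        smoothed.append(sum(present) / len(present) if present else None)
    return smoothed


# Hypothetical per-residue values; None marks a residue without data
# (for example a proline or an overlapped resonance).
per_residue = [1.0, None, 3.0, 5.0, None]
print(moving_average(per_residue, window=3))
```

Skipping missing values keeps the trendline continuous across isolated gaps, at the cost of averaging over fewer points near them.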

Summary and Impact

In summary, we successfully produced and isotopically labelled two human proteins using a eukaryotic expression system. In addition, we demonstrated the suitability of the produced material to support NMR studies. Please do not hesitate to contact us if you require any additional information or support for your current projects.
