The large envelope protein (L) of hepatitis B virus is a complex glycoprotein of approximately 39 kDa that contains three domains: the S region (also found in the small surface antigen), the pre-S2 region, and the pre-S1 region. In primate hepadnaviruses, including those infecting gorillas, the L protein plays critical roles in viral assembly and infectivity. It carries the receptor recognition domain, which allows efficient binding to cell surface receptors. Gorilla HBV, like other hepadnaviruses found in great apes, shares significant homology with human HBV; these viruses can infect humans, great apes (gorillas, chimpanzees, and orangutans), and lesser apes (gibbons). The L protein undergoes important post-translational modifications, including myristoylation at the amino-terminal glycine of the pre-S1 domain, which is required for infectivity.
The PreS part of the L protein plays a central role in various interactions with viral and cellular proteins. The N-terminal residues (first 48 amino acids) of PreS1 are involved in binding to the HBV receptor, sodium taurocholate co-transporting polypeptide (NTCP). The hydrophobic stretches between residues 50-70 may function in membrane interactions and potentially viral fusion. The PreS1/PreS2 border region (approximately amino acids 90-120) is believed to be involved in interactions with the viral capsid during particle formation. Additionally, specific sites in PreS1 interact with the Hsc70 chaperone, which influences the orientation of PreS observed in immature viral particles. These multiple interaction sites highlight the multifunctional nature of the L protein in the viral lifecycle.
Several systems have proven effective for expressing functional recombinant HBV L proteins and would be applicable to Gorilla HBV. Cell-free expression using wheat germ extract has been used successfully for human HBV L protein and allows phosphorylation by endogenous kinases. For larger-scale production, mammalian cell culture systems such as Chinese hamster ovary (CHO) cells have been used to generate recombinant HBsAg particles that are antigenically indistinguishable from plasma-derived particles. For bacterial expression, co-expression with human kinases such as MAPK14 can help phosphorylate the recombinant protein. The choice of expression system should reflect the specific research application: mammalian systems generally provide better post-translational modifications, while bacterial systems offer higher yield but may require additional engineering for proper modification.
For purification of recombinant L protein, affinity tags such as Strep-tag II can be fused to either the N- or C-terminal end of the protein. The positioning of the tag should be carefully considered based on the functional region being studied, as N-terminal tags may interfere with myristoylation and receptor binding. Purification under non-denaturing conditions is crucial to maintain the native conformation of the protein, particularly if studying interactions with receptors or antibodies. For structural studies, size exclusion chromatography following affinity purification helps obtain homogeneous preparations. When analyzing phosphorylation patterns, phosphatase inhibitors should be included throughout the purification process to preserve the modification state.
Recent research has identified multiple phosphorylation sites in the PreS domain of the human HBV L envelope protein. Two major phosphorylation sites were identified at S6 and S98, along with seven minor sites, distributed throughout the PreS1 and PreS2 domains. These phosphorylation sites occur in all major functional regions of PreS, including areas involved in receptor binding, membrane fusion, capsid interaction, and chaperone binding. While comprehensive phosphorylation mapping of Gorilla HBV L protein has not been specifically reported, the high conservation among primate hepadnaviruses suggests similar phosphorylation patterns may exist. Interestingly, in avian hepadnaviruses, phosphorylation of L has been shown to be dispensable for infectivity, indicating potential evolutionary differences in the functional significance of these modifications.
For comprehensive phosphorylation analysis of recombinant L proteins, a combination of mass spectrometry and NMR spectroscopy provides the most reliable results. Mass spectrometry identifies phosphorylation sites through peptide mapping, while NMR can confirm modifications through chemical shift analysis. Neither cell-free systems (such as wheat germ extract) nor bacterial co-expression systems (with human MAPK14 kinase) provide complete phosphorylation. Therefore, phosphorylation mimics using S/T to E mutations may be the best strategy for exploring the impact of phosphorylation on PreS interactions in structural studies. For comparative analysis across primate species, parallel expression and analysis under identical conditions is crucial to identify true species-specific differences rather than methodological artifacts.
The PreS1 domain of the L glycoprotein is a critical determinant of receptor binding and host range in hepadnaviruses. Research on primate hepadnaviruses indicates that the PreS1 domain may comprise two regions affecting infectivity: one within the amino-terminal 40 amino acids and another downstream region. These regions likely mediate specific interactions with host cell receptors that vary slightly between primate species. Hepadnaviruses display a characteristic hepatic tropism and restricted host range that typically extends only to closely related species. For example, HBV infects humans and other great apes including gorillas, but other Old World nonhuman primates like baboons are not susceptible, presumably due to differences in virus-receptor interactions. Understanding these PreS1 variations is crucial for predicting cross-species transmission potential and developing targeted interventions.
Effective experimental approaches for studying host range restriction include the use of HDV particles pseudotyped with the envelope proteins of different hepadnaviruses, including Gorilla HBV. These pseudotyped particles can be used to infect primary hepatocytes from various primate species to assess host range biases in vitro. Competitive inhibition studies using synthetic peptides derived from the PreS1 domain can help map specific regions involved in receptor binding. Creation of chimeric L proteins, where segments from different primate hepadnavirus envelope proteins are exchanged, allows mapping of domains responsible for host specificity. Additionally, directed mutagenesis of specific amino acids within the PreS1 domain can identify key residues involved in host-specific interactions. These approaches, combined with structural analyses, provide comprehensive insights into the molecular basis of host range determination.
The PreS domains of the L protein are believed to represent intrinsically disordered protein regions, which has been experimentally supported for avian hepadnaviruses and is likely true for primate hepadnaviruses as well. This intrinsic disorder may allow the PreS domains to adopt different conformations when interacting with various binding partners. Several hydrophobic stretches in the PreS1 domain, particularly between residues 50-70, may be involved in membrane interactions and potentially function as fusion peptides. The N-terminal myristoylation of the L protein is essential for receptor binding and infectivity. NMR analysis confirms the disordered nature of the PreS protein while identifying regions that may undergo structural transitions upon binding to partners. This conformational flexibility likely contributes to the multifunctional nature of the L protein throughout the viral lifecycle.
Due to the intrinsically disordered nature of the PreS domains, traditional structural biology techniques like X-ray crystallography may be challenging for full-length L protein analysis. NMR spectroscopy has proven valuable for analyzing the structural properties of the PreS domains and confirming phosphorylation sites through chemical shift analysis. For transmembrane regions within the S domain, a combination of cryo-electron microscopy and molecular dynamics simulations may be more appropriate. When expressing recombinant L protein for structural studies, maintaining native post-translational modifications is crucial, as these modifications (particularly myristoylation and phosphorylation) can influence the protein's conformation and interactions. Segmental labeling approaches, where specific domains are isotopically labeled for NMR analysis, may help overcome size limitations when studying the full-length protein. Additionally, hydrogen-deuterium exchange mass spectrometry can provide insights into dynamic regions and binding interfaces.
Myristoylation of the L protein at the amino-terminal glycine of the PreS1 domain is essential for infectivity in hepadnaviruses. This lipid modification likely facilitates membrane association and receptor binding. Phosphorylation occurs at multiple sites throughout the PreS domains, with major sites identified at S6 and S98 in human HBV. These phosphorylation sites are distributed across regions involved in receptor binding, membrane interaction, capsid binding, and chaperone interaction. While myristoylation appears consistently essential across hepadnaviruses, the functional significance of phosphorylation may vary among species, as phosphorylation has been shown to be dispensable for infectivity in avian hepadnaviruses. In primate hepadnaviruses, the strategic positioning of phosphorylation sites at functional interfaces suggests they may regulate protein-protein interactions or conformational changes during the viral lifecycle, possibly in a species-specific manner.
To distinguish between phosphorylation-dependent and -independent functions, several complementary approaches should be employed. Site-directed mutagenesis of phosphorylation sites to either non-phosphorylatable residues (S/T to A) or phosphomimetic residues (S/T to E/D) can help assess the impact on specific functions. For kinetic studies of phosphorylation events, in vitro kinase assays with recombinant L protein fragments and purified kinases can reveal the sequential order of modifications. Cell-based assays using L proteins with mutations at specific phosphorylation sites can assess impacts on viral assembly, secretion, and infectivity. Interaction studies (pull-downs, surface plasmon resonance, or ELISA) comparing wild-type and phosphorylation-deficient L proteins can identify binding partners whose association is regulated by phosphorylation. Finally, structural studies comparing non-phosphorylated, partially phosphorylated, and fully phosphorylated (or phosphomimetic) proteins can reveal conformational changes induced by these modifications.
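The S/T to A and S/T to E/D substitutions described above can be enumerated programmatically when designing a mutant panel. The following is a minimal sketch, assuming 1-based residue numbering; the sequence `TOY_PRES` is a made-up placeholder (not a real PreS sequence), and the function names are hypothetical.

```python
from typing import Dict, List

# Hypothetical toy sequence for illustration only; NOT a real PreS sequence.
TOY_PRES = "MGSNLSTPAGS"


def make_mutant(seq: str, position: int, new_residue: str) -> str:
    """Return seq with the residue at the 1-based position replaced."""
    if not 1 <= position <= len(seq):
        raise ValueError(f"position {position} is outside the sequence")
    return seq[: position - 1] + new_residue + seq[position:]


def phospho_panel(seq: str, sites: List[int]) -> Dict[str, str]:
    """Build wild type plus A (non-phosphorylatable) and E/D (phosphomimetic)
    variants for each listed serine or threonine site."""
    panel = {"WT": seq}
    for pos in sites:
        original = seq[pos - 1]
        if original not in "ST":
            raise ValueError(f"residue {original}{pos} is not serine or threonine")
        for substitute in ("A", "E", "D"):
            panel[f"{original}{pos}{substitute}"] = make_mutant(seq, pos, substitute)
    return panel


if __name__ == "__main__":
    for name, variant in phospho_panel(TOY_PRES, [6]).items():
        print(name, variant)
```

Each variant carries a single substitution, so an observed functional change can be attributed to one site; combined mutants would need an extra loop over site combinations.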
The immunogenicity of recombinant HBV surface antigens is significantly influenced by their post-translational modifications and conformational integrity. Recombinant HBsAg particles secreted from Chinese hamster ovary (CHO) cells have been shown to be antigenically indistinguishable from HBsAg particles derived from infected patients. These cell culture-derived recombinant antigens demonstrate superior immunogenicity compared to yeast-derived or plasma-derived antigens, as evidenced by faster antibody production kinetics and higher maximum titers in animal models. For L protein specifically, proper myristoylation and phosphorylation likely contribute to maintaining native epitope conformations. When developing vaccines based on recombinant Gorilla HBV proteins, the expression system should be carefully selected to ensure proper post-translational modifications, with mammalian cell systems generally providing modifications most similar to those in viral particles from infected hosts.
Cross-reactivity between human and non-human primate HBV envelope proteins stems from their high sequence homology, particularly in the S domain. Studies with recombinant HBsAg vaccines have demonstrated that immunization with one subtype (e.g., ad) can provide protection against infection with different subtypes (e.g., ay), suggesting broader cross-protection. For Gorilla HBV, the close evolutionary relationship with human HBV likely results in significant cross-reactivity of antibody responses. This cross-reactivity has important implications for both vaccine development and diagnostic testing. When designing experiments to study immune responses to Gorilla HBV L protein, researchers should consider including cross-reactivity assessments with human HBV antigens and antibodies. Evaluating both humoral and cellular immune responses is important, as recombinant cell culture-derived vaccines have been shown to elicit strong cellular immune responses as measured by cell proliferation assays.
For studying L protein-receptor interactions, competitive inhibition studies using synthetic peptides derived from the PreS1 domain have proven valuable. These peptides can block infection in a dose-dependent manner, allowing mapping of regions critical for receptor binding. Surface plasmon resonance (SPR) or bio-layer interferometry (BLI) can provide quantitative binding kinetics between purified recombinant L protein (or PreS1 fragments) and potential receptor molecules. For cellular studies, fluorescently labeled PreS1 peptides can be used to visualize binding to hepatocytes from different primate species. Pull-down assays using tagged recombinant L protein can identify interacting partners from hepatocyte lysates, followed by mass spectrometry identification. Cross-linking approaches combined with mass spectrometry can map specific interaction interfaces. For in vivo relevance, HDV particles pseudotyped with different hepadnavirus envelopes can assess infectivity and receptor usage across species.
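As an illustration of what "quantitative binding kinetics" from SPR or BLI usually means, the sketch below uses the standard 1:1 (Langmuir) binding model, in which the dissociation constant is the ratio of the off-rate to the on-rate. The rate constants and maximum response in the demo are invented placeholder values, not measurements for any L protein or PreS1 fragment, and a real interaction may not follow a simple 1:1 model.

```python
def dissociation_constant(k_on: float, k_off: float) -> float:
    """K_D in M from k_on (1/(M*s)) and k_off (1/s), 1:1 binding model."""
    if k_on <= 0 or k_off < 0:
        raise ValueError("k_on must be positive and k_off non-negative")
    return k_off / k_on


def equilibrium_response(conc: float, k_d: float, r_max: float) -> float:
    """Steady-state response R_eq = R_max * C / (K_D + C) at analyte conc C (M)."""
    if conc < 0 or k_d <= 0:
        raise ValueError("conc must be non-negative and k_d positive")
    return r_max * conc / (k_d + conc)


if __name__ == "__main__":
    # Placeholder values chosen only to exercise the functions.
    k_d = dissociation_constant(k_on=1e5, k_off=1e-3)
    for conc in (k_d / 10, k_d, k_d * 10):
        print(f"C = {conc:.2e} M -> R_eq = {equilibrium_response(conc, k_d, 100.0):.1f}")
```

Comparing fitted K_D values for PreS1 fragments from different primate hepadnaviruses, measured under identical conditions, is one way such numbers could feed into the host range comparisons discussed in this section.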
Mapping binding interfaces between L protein and its partners requires a multi-method approach. Hydrogen-deuterium exchange mass spectrometry (HDX-MS) can identify regions of the protein that become protected upon complex formation. Chemical cross-linking followed by mass spectrometry (XL-MS) can identify specific residues in close proximity at binding interfaces. Alanine scanning mutagenesis, where series of amino acids are systematically mutated to alanine, can identify critical residues for specific interactions. For the PreS1/PreS2 border region involved in capsid interactions (approximately amino acids 90-120), co-immunoprecipitation experiments using truncated or mutated L proteins can define minimal interaction domains. NMR chemical shift perturbation experiments using isotopically labeled PreS domains can map binding interfaces at atomic resolution. Finally, computational approaches including molecular docking and molecular dynamics simulations can predict and refine binding interface models based on experimental constraints.
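The alanine scanning step can be planned in silico before any cloning. The sketch below generates single-residue alanine substitutions across a chosen window, assuming 1-based inclusive coordinates and skipping positions that are already alanine; the toy sequence and function name are hypothetical, and block-wise scans (several residues at once) would need a small extension.

```python
from typing import List, Tuple


def alanine_scan(seq: str, start: int, end: int) -> List[Tuple[str, str]]:
    """Return (label, mutant) pairs for single alanine substitutions
    across the 1-based inclusive window [start, end]."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("window must lie within the sequence")
    mutants = []
    for pos in range(start, end + 1):
        residue = seq[pos - 1]
        if residue == "A":
            continue  # already alanine, nothing to substitute
        mutant = seq[: pos - 1] + "A" + seq[pos:]
        mutants.append((f"{residue}{pos}A", mutant))
    return mutants


if __name__ == "__main__":
    # Made-up sequence for illustration; NOT a real PreS sequence.
    for label, mutant in alanine_scan("MKTAYL", 2, 5):
        print(label, mutant)
```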
Hepadnaviruses display characteristic host range restrictions that typically extend only to closely related species. HBV infects humans and other great apes including gorillas, chimpanzees, and orangutans, as well as lesser apes like gibbons. The evolutionary patterns in L protein sequences reflect these host restrictions, with highest conservation in the S domain and greater variability in the PreS domains, particularly PreS1. The PreS1 domain is a key determinant of receptor binding and host range, with variations in this region likely mediating specific interactions with host receptors. Comparative sequence analysis across primate hepadnaviruses can identify positively selected sites that may represent host adaptation signatures. Additionally, conservation of post-translational modification sites, particularly myristoylation and key phosphorylation sites, provides insights into the functional constraints on these modifications throughout primate hepadnavirus evolution.
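A first-pass view of the domain-level conservation pattern described above can be obtained from a multiple sequence alignment. The sketch below scores each alignment column by the fraction of sequences sharing the most common character and then averages per domain. This is a simple identity measure, not a test for positive selection (which would require codon-based methods such as dN/dS analysis); the toy alignment, the domain coordinates, and the function names are all hypothetical, and gap characters are counted like any other symbol.

```python
from collections import Counter
from typing import Dict, List, Tuple


def column_conservation(aligned: List[str]) -> List[float]:
    """Fraction of sequences sharing the most common character per column."""
    if not aligned:
        raise ValueError("alignment is empty")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for column in zip(*aligned):
        top_count = Counter(column).most_common(1)[0][1]
        scores.append(top_count / len(aligned))
    return scores


def domain_means(scores: List[float],
                 domains: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Mean conservation per domain; coordinates are 1-based and inclusive."""
    return {
        name: sum(scores[start - 1:end]) / (end - start + 1)
        for name, (start, end) in domains.items()
    }


if __name__ == "__main__":
    # Made-up alignment and domain boundaries, for illustration only.
    toy_alignment = ["MGQNLS", "MGTNLS", "MGQNVS"]
    scores = column_conservation(toy_alignment)
    print(domain_means(scores, {"regionA": (1, 3), "regionB": (4, 6)}))
```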
For comparative functional analysis, parallel expression of L proteins from different primate hepadnaviruses using identical systems is essential to avoid methodological artifacts. Creating chimeric proteins where domains are swapped between viruses can map regions responsible for functional differences. HDV pseudotyping with different hepadnavirus envelopes provides a uniform background to test entry and infectivity differences. Competitive inhibition studies using species-specific PreS1 peptides can reveal differences in receptor binding mechanisms. Structural studies including hydrogen-deuterium exchange mass spectrometry and NMR can identify conformational differences between L proteins from different species. For phosphorylation analysis, parallel mass spectrometry and NMR studies can identify species-specific modification patterns. Host range studies using primary hepatocytes from various primate species can reveal tropism differences mediated by L protein variations. These approaches collectively provide a comprehensive comparative framework for understanding functional evolution of L proteins across primate hepadnaviruses.
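The domain-swap design can be sketched at the sequence level as follows. Each protein is described by an ordered list of named domains with 1-based inclusive coordinates, and a chimera is assembled by taking selected domains from a donor and the rest from a backbone. The sequences, coordinates, and function names here are placeholders for illustration (the letters are not real residues); real boundaries would have to come from annotated L protein sequences.

```python
from typing import Dict, List, Set, Tuple

Domain = Tuple[str, int, int]  # (name, start, end), 1-based inclusive


def split_domains(seq: str, domains: List[Domain]) -> Dict[str, str]:
    """Cut seq into named segments; domains must tile the sequence in order."""
    expected_start = 1
    parts = {}
    for name, start, end in domains:
        if start != expected_start or end < start:
            raise ValueError(f"domain {name} does not continue from the previous one")
        parts[name] = seq[start - 1:end]
        expected_start = end + 1
    if expected_start != len(seq) + 1:
        raise ValueError("domains do not cover the whole sequence")
    return parts


def make_chimera(backbone: Dict[str, str], donor: Dict[str, str],
                 order: List[str], from_donor: Set[str]) -> str:
    """Join domains in order, taking those in from_donor from the donor."""
    return "".join(donor[n] if n in from_donor else backbone[n] for n in order)


if __name__ == "__main__":
    order = ["preS1", "preS2", "S"]
    # Placeholder strings and boundaries, NOT real L protein sequences.
    virus_a = split_domains("AAAABBBCCCCC", [("preS1", 1, 4), ("preS2", 5, 7), ("S", 8, 12)])
    virus_b = split_domains("XXXYYZZZZ", [("preS1", 1, 3), ("preS2", 4, 5), ("S", 6, 9)])
    print(make_chimera(virus_a, virus_b, order, {"preS1"}))
```

Keeping the domain boundaries per virus rather than shared allows for length differences between species, which matters if insertions or deletions fall within the PreS regions.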
Several technical challenges complicate structural and functional studies of full-length L protein. The protein contains both hydrophobic transmembrane regions and intrinsically disordered domains, making it difficult to express, purify, and study using traditional structural biology approaches. The L protein undergoes multiple post-translational modifications (myristoylation, glycosylation, phosphorylation) that are difficult to recapitulate completely in recombinant expression systems. Neither cell-free systems nor bacterial co-expression systems provide complete phosphorylation patterns. The multiple conformational states of the L protein during different stages of the viral lifecycle further complicate structural characterization. Additionally, the protein's tendency to form oligomeric assemblies and its natural embedding in lipid membranes create challenges for obtaining homogeneous preparations suitable for high-resolution structural studies. These technical limitations necessitate creative approaches combining multiple complementary techniques for comprehensive characterization.