FAM18A is typically expressed in Escherichia coli under optimized conditions to ensure solubility and functionality:
Expression System: E. coli BL21(DE3) or similar strains.
Purification: Affinity chromatography on nickel-NTA resin for His-tag binding, yielding >80% purity (SDS-PAGE).
Solubility: Enhanced using molecular chaperones and urea-containing buffers.
Thermostability: FAM18A retains secondary structure integrity at ≤37°C, as confirmed by circular dichroism spectroscopy.
Species Cross-Reactivity: Shares 73% sequence identity with mouse and rat orthologs, enabling cross-species studies.
Limitations: Requires urea for solubility, which may interfere with downstream assays. Alternatives such as maltose-binding protein (MBP) fusions are under investigation.
Ongoing research aims to:
FAM111A is a PCNA-interacting protein with trypsin-like serine protease activity that plays important roles in DNA replication and antiviral defense. It mitigates the effects of protein obstacles on replication forks, promoting genome stability. FAM111A contains a protease domain that is essential for facilitating replication at sites of DNA-protein crosslinks and other protein obstacles, thereby preventing replication fork stalling and potential double-strand breaks. The protein is predominantly intracellular, which distinguishes it from most other S1 family proteases, which are typically extracellular or membrane-associated.
FAM111A functions as a dimerization-dependent protease with a characteristic serine protease domain (SPD) containing a narrow, recessed active site. X-ray crystallography studies have revealed that FAM111A dimerizes via the N-terminal helix within the SPD. This dimerization induces an activation cascade from the dimerization sensor loop to the oxyanion hole through disorder-to-order transitions. The protein exhibits chymotrypsin-like specificity (sharing 16.8% sequence identity with chymotrypsin) but with a stronger preference for phenylalanine at the P1 position than chymotrypsin. The full-length protein includes an N-terminal region that appears to provide some autoinhibitory regulation, as demonstrated by the higher specific activity of the isolated SPD compared with the full-length protein.
FAM111A contributes to genome stability by protecting replication forks from stalling when they encounter protein obstacles such as DNA-protein crosslinks (DPCs) and tight nucleoprotein complexes. During DNA replication, the replisome may encounter various obstacles that cause replication fork stalling, which, if prolonged, can lead to double-strand breaks and genomic instability. FAM111A's proteolytic activity is essential for overcoming these obstacles, particularly protein adducts such as topoisomerase 1 cleavage complexes (TOP1ccs) stabilized by camptothecin (CPT) or poly(ADP-ribose) polymerase 1 (PARP1) trapped by PARP inhibitors. By removing these protein obstacles, FAM111A enables replication forks to progress and prevents the deleterious effects of fork stalling.
Based on experimental design methodologies for recombinant protein expression, optimal conditions for expressing FAM111A in E. coli would likely include induction at mid-log phase (absorbance of approximately 0.8 at 600 nm) with moderate IPTG concentrations (around 0.1 mM) at lower temperatures (25°C rather than 37°C). The expression medium composition significantly affects protein yield and solubility, with balanced concentrations of yeast extract (5 g/L) and tryptone (5 g/L) providing good results for many recombinant proteins.
For FAM111A specifically, considering its propensity to form dimers and its proteolytic activity, expression conditions should be carefully optimized to prevent protein aggregation or premature activation. Including 1 g/L glucose in the medium can help regulate protein expression by preventing leaky expression from the lac promoter commonly used in E. coli expression systems. Expression time should be limited to around 4 hours post-induction to balance protein yield against potential toxicity or aggregation.
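The concentrations above translate into bench quantities through simple dilution arithmetic (C1 x V1 = C2 x V2). The following is a minimal sketch, not a validated protocol: the 500 mL culture volume and the 100 mM IPTG stock are assumptions chosen only for illustration, and the function names are hypothetical.

```python
def stock_volume_ml(final_conc_mM, culture_volume_ml, stock_conc_mM):
    """Volume of stock (mL) needed to reach final_conc_mM, from C1*V1 = C2*V2.

    Ignores the small volume increase caused by adding the stock itself.
    """
    if stock_conc_mM <= 0 or culture_volume_ml <= 0 or final_conc_mM < 0:
        raise ValueError("concentrations and volumes must be positive")
    return final_conc_mM * culture_volume_ml / stock_conc_mM


def component_mass_g(conc_g_per_l, volume_ml):
    """Mass (g) of a medium component for a given volume in mL."""
    if conc_g_per_l < 0 or volume_ml <= 0:
        raise ValueError("concentration must be >= 0 and volume > 0")
    return conc_g_per_l * volume_ml / 1000.0


if __name__ == "__main__":
    culture_ml = 500.0      # assumed culture volume
    iptg_stock_mM = 100.0   # assumed IPTG stock concentration

    # 0.1 mM IPTG at induction (value taken from the text above)
    print("IPTG stock (mL):", stock_volume_ml(0.1, culture_ml, iptg_stock_mM))

    # Medium components at the concentrations given in the text (g/L)
    for name, g_per_l in [("yeast extract", 5.0), ("tryptone", 5.0), ("glucose", 1.0)]:
        print(f"{name} (g):", component_mass_g(g_per_l, culture_ml))
```

Scaling to a different culture volume or stock concentration only requires changing the two assumed values at the top of the example.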
Verification of proper folding and activity of recombinant FAM111A can be achieved through multiple complementary approaches:
Proteolytic activity assay: Using fluorescent peptide substrates containing phenylalanine, tyrosine, or tryptophan at the P1 position, with preference for phenylalanine-containing substrates. The release of AMC (7-amino-4-methylcoumarin) from these substrates can be measured as a linear increase in fluorescence over time, indicating an active protease.
Dimerization assessment: Since FAM111A activity depends on dimerization, techniques such as size exclusion chromatography, analytical ultracentrifugation, or native PAGE can be used to verify the dimeric state of the protein.
Functional assays: Testing FAM111A's ability to protect replication forks in cellular assays, such as measuring replication fork progression in the presence of TOP1ccs or trapped PARP1 using DNA fiber assays, or evaluating cell survival after treatment with PARP inhibitors.
Control experiments: Comparing the activity of wild-type FAM111A with catalytically inactive mutants (e.g., S541A) or monomeric variants to confirm that observed effects are specifically due to FAM111A's proteolytic activity and dimerization.
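Because AMC release is read out as a linear increase in fluorescence, the initial rate can be estimated as the least-squares slope of fluorescence against time, and the slope of the S541A control can be subtracted as background. The sketch below illustrates that calculation only; the function names and the example readings are hypothetical, and converting fluorescence units to moles of AMC would additionally require a standard curve that is not shown here.

```python
def initial_rate(times_s, fluorescence):
    """Least-squares slope of fluorescence versus time (units per second)."""
    n = len(times_s)
    if n != len(fluorescence) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_s) / n
    mean_f = sum(fluorescence) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_s, fluorescence))
    return sxy / sxx


def background_corrected_rate(times_s, enzyme_trace, control_trace):
    """Rate of the active enzyme minus the rate of an inactive control."""
    return initial_rate(times_s, enzyme_trace) - initial_rate(times_s, control_trace)


if __name__ == "__main__":
    # Hypothetical readings, for illustration only (arbitrary fluorescence units)
    times = [0, 60, 120, 180]
    wild_type = [100, 160, 220, 280]
    s541a_control = [100, 106, 112, 118]
    print(background_corrected_rate(times, wild_type, s541a_control))
```

A trace that bends away from a straight line (for example through substrate depletion) should be truncated to its linear portion before fitting, since the slope is only meaningful as an initial rate in that regime.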
Research has revealed that dimerization is essential for FAM111A's proteolytic activity against substrates but, intriguingly, is dispensable for autocleavage. Dimerization occurs via the N-terminal helix within the serine protease domain, creating a coiled-coil interface. Structural studies comparing dimeric wild-type and engineered monomeric mutants have demonstrated that dimerization induces a disorder-to-order transition that stabilizes the oxyanion hole at the active site, which is critical for catalysis.
This dimerization-dependent activation represents a unique regulatory mechanism for FAM111A. The protein contains a "dimerization sensor loop" that transduces the dimerization signal to the active site through conformational changes. When dimerization is disrupted by mutations, FAM111A loses its ability to cleave substrates but retains autocleavage activity, indicating different structural requirements for these two processes. This differential regulation may be physiologically important, potentially allowing autocleavage to serve as a mechanism for self-regulation independent of the protein's activity toward other substrates.
Heterozygous missense mutations in the catalytic domain of FAM111A are associated with genetic disorders characterized by developmental defects. These mutations cause hyper-autocleavage of the protein, suggesting that dysregulated proteolytic activity contributes to pathogenesis. When these disease-associated mutants are ectopically expressed, they cause impaired DNA replication, single-strand DNA exposure, DNA damage, nuclear structure disruption, and ultimately cell death.
These findings highlight the critical importance of properly regulated FAM111A activity for cellular homeostasis and development. The balance between FAM111A's protective functions in DNA replication and the potential detrimental effects of uncontrolled proteolytic activity must be carefully maintained. Understanding the molecular mechanisms underlying these disease-associated mutations could provide insights into both the normal function of FAM111A and potential therapeutic approaches for related disorders.
FAM111A represents one of several proteolytic mechanisms that cells employ to remove protein obstacles from DNA. While other proteases such as SPRTN and the proteasome have been implicated in the removal of DPCs in a replication-coupled manner, FAM111A appears to have distinct substrate preferences and functional contexts.
A key distinction is that FAM111A, but not SPRTN, protects replication forks from stalling at poly(ADP-ribose) polymerase 1 (PARP1)-DNA complexes trapped by PARP inhibitors. This suggests that different proteases may be specialized for specific types of protein obstacles. FAM111A's interaction with PCNA via its PIP box also indicates that it is specifically recruited to active replication forks, positioning it to address protein obstacles as soon as they are encountered during DNA synthesis.
Furthermore, unlike many proteases in the S1 family that are secreted or membrane-associated, FAM111A is one of the few intracellular proteases in this family. This localization is consistent with its role in nuclear processes such as DNA replication and potentially explains some of its unique regulatory features, such as dimerization-dependent activation.
When investigating FAM111A function, multifactorial experimental designs are particularly valuable because of the complex interplay between FAM111A's proteolytic activity, dimerization state, interactions with PCNA, and effects on DNA replication. Statistical experimental design methodologies allow researchers to evaluate multiple variables simultaneously, accounting for their interactions and characterizing experimental error more thoroughly than traditional univariate approaches.
For cellular studies of FAM111A function in replication fork protection, well-controlled experiments with proper randomization are essential to eliminate lurking variables. This includes:
Comparing FAM111A knockout cells with reconstituted variants: Using isogenic cell lines where FAM111A has been knocked out and then reconstituted with either wild-type FAM111A, catalytically inactive mutants (S541A), or dimerization-defective mutants to isolate the specific contributions of these features to observed phenotypes.
Employing specific replication stress inducers: Using agents such as camptothecin (to stabilize TOP1ccs) or PARP inhibitors (to trap PARP1 on DNA) as defined challenges to replication forks that can reveal FAM111A's specific protective functions.
Measuring multiple endpoints: Assessing replication fork progression, DNA damage markers, cell cycle progression, and cell survival to comprehensively characterize FAM111A's functions and the consequences of its loss or mutation.
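The combination of genotypes and replication stress inducers described above can be laid out as a full factorial design with a randomized run order. The sketch below is one possible way to enumerate such a design; the factor levels are taken from the text, while the number of replicates, the random seed, and the function names are assumptions made for illustration.

```python
import itertools
import random


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [
        dict(zip(names, combo))
        for combo in itertools.product(*(factors[name] for name in names))
    ]


def randomized_run_order(runs, replicates, seed):
    """Replicate each run and shuffle the order reproducibly."""
    rng = random.Random(seed)
    order = [
        dict(run, replicate=rep)
        for run in runs
        for rep in range(1, replicates + 1)
    ]
    rng.shuffle(order)
    return order


if __name__ == "__main__":
    factors = {
        "genotype": ["knockout", "wild-type", "S541A", "dimerization-defective"],
        "treatment": ["untreated", "camptothecin", "PARP inhibitor"],
    }
    runs = full_factorial(factors)                          # 4 x 3 = 12 conditions
    plan = randomized_run_order(runs, replicates=3, seed=1)  # assumed 3 replicates
    for step, run in enumerate(plan, start=1):
        print(step, run)
```

Fixing the seed keeps the randomized order reproducible, so the same plan can be regenerated later when the results are analyzed.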
Biochemical characterization of FAM111A requires careful attention to several factors:
Protein preparation: When expressing recombinant FAM111A, conditions must be optimized to ensure proper folding and prevent aggregation. When comparing the full-length protein with the isolated SPD, note that the N-terminal region appears to provide some autoinhibition, resulting in lower specific activity for the full-length protein.
Activity assays: Fluorogenic peptide substrates with phenylalanine at the P1 position provide a sensitive means to measure FAM111A's proteolytic activity. Control reactions with the catalytically inactive S541A mutant are essential to confirm specificity. Activity measurements should be performed across a range of enzyme concentrations to ensure linearity and allow accurate determination of specific activity.
Dimerization analysis: Since dimerization is critical for FAM111A's activity against substrates, methods to assess the oligomeric state (such as size exclusion chromatography, analytical ultracentrifugation, or native PAGE) should be incorporated into biochemical studies.
Substrate identification: While FAM111A exhibits chymotrypsin-like specificity with preference for phenylalanine at P1, its natural substrates remain largely unknown. Techniques such as proteomics analysis of FAM111A-associated proteins or degradation products could help identify physiologically relevant substrates.
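The linearity check mentioned for activity assays can be sketched as follows: if the measured rate is proportional to the amount of enzyme, the specific activity (rate divided by enzyme amount) should be roughly constant across the concentration series. The 10% tolerance, the units, and the function names below are assumptions for illustration, not an established acceptance criterion.

```python
def specific_activities(enzyme_amounts, rates):
    """Rate divided by enzyme amount for each point in a concentration series."""
    if len(enzyme_amounts) != len(rates) or not rates:
        raise ValueError("need paired, non-empty measurements")
    if any(amount <= 0 for amount in enzyme_amounts):
        raise ValueError("enzyme amounts must be positive")
    return [rate / amount for rate, amount in zip(rates, enzyme_amounts)]


def is_linear(enzyme_amounts, rates, tolerance=0.10):
    """True if every specific activity lies within `tolerance` of the mean."""
    activities = specific_activities(enzyme_amounts, rates)
    mean_activity = sum(activities) / len(activities)
    if mean_activity == 0:
        return False  # no measurable activity, so linearity cannot be judged
    return all(
        abs(activity - mean_activity) / abs(mean_activity) <= tolerance
        for activity in activities
    )


if __name__ == "__main__":
    # Hypothetical series: enzyme in nM, rate in fluorescence units per second
    enzyme_nM = [10, 20, 40]
    rates = [1.0, 2.0, 4.0]
    print(specific_activities(enzyme_nM, rates))
    print(is_linear(enzyme_nM, rates))
```

A series that fails this check at the highest enzyme amounts often indicates substrate depletion or detector saturation, in which case specific activity should be reported from the linear range only.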
Integrating proteomic and genetic approaches provides a powerful strategy for comprehensive investigation of FAM111A functions:
Protein-protein interaction studies: Identifying FAM111A interactors through techniques such as immunoprecipitation followed by mass spectrometry can reveal both potential substrates and regulatory partners. Known interactions with PCNA suggest FAM111A functions within multiprotein complexes at replication forks.
Genetic screening: CRISPR screens for synthetic lethality or genetic interactions with FAM111A can identify functional pathways connected to FAM111A activity. Particularly informative would be screens in the context of replication stress induced by agents such as PARP inhibitors, where FAM111A plays a protective role.
Integration with disease-associated variants: Correlating functional studies with known disease-associated variants can provide insights into both pathogenic mechanisms and normal protein function. The hyper-autocleavage observed with disease-associated mutations suggests complex regulatory mechanisms worthy of investigation.
Proteomic analysis of substrate cleavage: Comparative proteomics between wild-type cells and those expressing catalytically inactive FAM111A under various conditions (normal growth, replication stress) could identify proteins whose abundance or post-translational modification state depends on FAM111A activity.
While FAM111A has been characterized as having chymotrypsin-like specificity with preference for phenylalanine at the P1 position in peptide substrates, the identity of its physiological protein substrates during replication fork protection remains largely unknown. Key unresolved questions include:
How does FAM111A recognize specific protein obstacles on DNA versus general proteins?
What structural features beyond the P1 residue contribute to substrate recognition?
Are there specific sequence motifs in protein obstacles that direct FAM111A cleavage?
How does the interaction with PCNA influence substrate selection at replication forks?
Addressing these questions will require a combination of structural studies, peptide library screening, and identification of cleavage sites in putative physiological substrates such as trapped TOP1 or PARP1 complexes.
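As a starting point for mapping cleavage sites in putative substrates, candidate P1 positions can be enumerated by scanning a protein sequence for the residues reported to be accepted at P1 (phenylalanine, tyrosine, tryptophan). This is a deliberately naive sketch: as the questions above make clear, P1 alone is not known to determine cleavage, so the output is a list of positions to test rather than predicted sites. The function name and the toy sequence are hypothetical.

```python
def candidate_p1_sites(sequence, p1_residues="FYW"):
    """Return (1-based position, residue) for each possible P1 residue.

    Cleavage would occur after the P1 residue, so the C-terminal residue is
    skipped because no P1' residue follows it.
    """
    sequence = sequence.upper()
    return [
        (index + 1, residue)
        for index, residue in enumerate(sequence[:-1])
        if residue in p1_residues
    ]


def phenylalanine_first(sites):
    """Order candidates with phenylalanine first, reflecting the reported
    P1 preference; ties keep their sequence order."""
    return sorted(sites, key=lambda site: (site[1] != "F", site[0]))


if __name__ == "__main__":
    toy_sequence = "MAYKLFGSW"  # hypothetical peptide, not a real substrate
    sites = candidate_p1_sites(toy_sequence)
    print(phenylalanine_first(sites))
```

Candidate positions from such a scan would still need to be confirmed experimentally, for example by identifying cleavage products by mass spectrometry.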
Developing tools to monitor FAM111A activity in living cells would significantly advance our understanding of its functions and regulation. Potential approaches include:
FRET-based sensors: Designing fluorescent protein constructs containing FAM111A cleavage sites that undergo changes in FRET efficiency upon cleavage.
Split fluorescent protein systems: Creating systems where FAM111A-mediated cleavage of a linker allows reconstitution of a fluorescent protein.
Live-cell imaging of DNA replication: Combining visualization of replication factors with FAM111A localization and activity measurements to understand the spatiotemporal dynamics of its function at replication forks.
Engineered substrates: Developing cellular reporter systems with engineered FAM111A substrates that release detectable signals (fluorescence, luminescence) upon cleavage.
Such tools would enable real-time monitoring of FAM111A activity in response to different types of replication stress and in various genetic backgrounds, providing insights into its regulation and function.