Devel #38

Open
huiwenke wants to merge 31 commits into pylelab:master from huiwenke:devel

This PR refactors the flexible alignment (flexalign) functionality: it centralizes the core logic in a unified dispatch engine, adds a FATCAT-hybrid mode, and exposes fine-grained optimization controls for secondary structures.

Key Changes & Features:

  • Unified Alignment Engine (flexalign_unified) & FlexAlignMode:
    Consolidated disparate flexible alignment workflows into a single entry routine named flexalign_unified. The execution flow is directly governed by the FlexAlignMode enum, which maps to the command-line -mm options as follows:

    • FLEX_STANDARD: The standard mode. Used by -mm 7 (passed with ss_opt = 0) and -mm 8 (passed with ss_opt = 1).
    • FLEX_BEST: Used by -mm 9. This mode overrides the single ss_opt value, evaluates both secondary structure configurations, and keeps the better result.
    • FLEX_FATCAT: Used by -mm 10. Dispatches execution to the new FATCAT-style flexible alignment pipeline (flexalign_fatcat_main).
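
The mapping above can be sketched as a small lookup. This is an illustration only, not the PR's code: the enumerator order, the `FlexDispatch` struct, and the `dispatch_for_mm` helper are assumptions, and the `ss_opt` value shown for `-mm 9` and `-mm 10` is a placeholder, since the description does not say what those modes receive.

```cpp
#include <cassert>

// Hypothetical mirror of the PR's FlexAlignMode enum; enumerator order is assumed.
enum FlexAlignMode { FLEX_STANDARD, FLEX_BEST, FLEX_FATCAT };

// Hypothetical bundle of what the dispatcher would receive.
struct FlexDispatch {
    FlexAlignMode mode;
    int ss_opt;  // only meaningful for FLEX_STANDARD in this sketch
};

// Translate the -mm value into a mode and ss_opt, following the PR description.
// Returns false for -mm values that are not flexible-alignment modes.
bool dispatch_for_mm(int mm_opt, FlexDispatch &out) {
    switch (mm_opt) {
        case 7:  out = FlexDispatch{FLEX_STANDARD, 0}; return true;
        case 8:  out = FlexDispatch{FLEX_STANDARD, 1}; return true;
        case 9:  out = FlexDispatch{FLEX_BEST, 0};     return true;  // ss_opt placeholder
        case 10: out = FlexDispatch{FLEX_FATCAT, 0};   return true;  // ss_opt placeholder
        default: return false;
    }
}
```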
  • Secondary Structure Optimization (ss_opt) and -mm 7/8/9:
    Implemented an evaluation loop (for (int cur_ss_opt = start_ss; cur_ss_opt <= end_ss; cur_ss_opt++)) in flexalign_unified to control secondary structure-aware scoring.

    • -mm 7: Enables secondary structure optimization (ss_opt = 0).
    • -mm 8: Disables secondary structure optimization (ss_opt = 1).
    • -mm 9: Loops through both ss_opt = 0 (enabled) and ss_opt = 1 (disabled), compares the resulting alignments, and keeps the configuration that yields the highest overall TM-score.
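
A minimal sketch of how such a loop might select the better configuration follows. Only the loop header and the names `start_ss`, `end_ss`, and `cur_ss_opt` come from the PR description. The `AlignResult` struct, the `run_once` callback (standing in for one full alignment pass that returns a TM-score), and the way the range is derived from the mode are assumptions.

```cpp
#include <cassert>
#include <functional>

// Hypothetical mirror of the PR's enum; enumerator order is assumed.
enum FlexAlignMode { FLEX_STANDARD, FLEX_BEST, FLEX_FATCAT };

// Hypothetical result record: best TM-score and the ss_opt that produced it.
struct AlignResult {
    double tm_score;
    int ss_opt;
};

// run_once(ss_opt) is an assumed stand-in for a single alignment pass.
AlignResult best_over_ss_opt(FlexAlignMode mode, int ss_opt,
                             const std::function<double(int)> &run_once) {
    assert(ss_opt == 0 || ss_opt == 1);
    int start_ss = ss_opt, end_ss = ss_opt;         // -mm 7 / -mm 8: one pass
    if (mode == FLEX_BEST) { start_ss = 0; end_ss = 1; }  // -mm 9: both passes

    AlignResult best = {-1.0, start_ss};
    for (int cur_ss_opt = start_ss; cur_ss_opt <= end_ss; cur_ss_opt++) {
        double tm = run_once(cur_ss_opt);
        if (tm > best.tm_score) {  // keep the higher-scoring configuration
            best.tm_score = tm;
            best.ss_opt = cur_ss_opt;
        }
    }
    return best;
}
```

With a strict `>` comparison, a tie keeps the first configuration tried (ss_opt = 0). Whether the PR breaks ties the same way is not stated.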
  • FATCAT-style Flexible Alignment (flexalign_fatcat_main via -mm 10):
    Introduced a hybrid alignment module that combines FATCAT’s topological flexibility with US-align’s TM-score optimization.

    • Algorithmic Workflow:
      1. Pre-evaluation: Executes a full-sequence baseline alignment (equivalent to -mm 9). If the two structures already align well (global TM-score $\ge 0.85$), the routine exits early and skips the remaining steps.
      2. AFP Extraction & Merging: Precomputes intra-protein distance matrices, detects initial Aligned Fragment Pairs (AFPs) using Kabsch rotation, and merges them along diagonals.
      3. Dynamic Programming & Domain Splitting: Utilizes a dual dynamic programming approach with gap, twist, and RMSD penalty models (generate_bounds) to optimally chain AFPs and split domains at geometrically justified hinge points.
      4. US-align Block Refinement: Feeds the dynamically determined structural blocks back into the core US-align engine (execute_flexalign_with_fallback), optimizing the local sub-alignments while assembling them into the final global flexible alignment.
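
To make the chaining idea in step 3 concrete, here is a deliberately simplified sketch of chaining AFPs by dynamic programming with a gap penalty and a bounded number of twists. It is not the PR's `generate_bounds` logic. The `AFP` struct, `chain_afps`, and the penalty parameters are hypothetical, a "twist" is approximated here as a change of alignment diagonal, and the RMSD penalty term is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical AFP record: a fragment of length len starting at residue i of
// chain 1 and residue j of chain 2, with a precomputed similarity score.
struct AFP { int i, j, len; double score; };

// Chain AFPs (assumed sorted by i). dp[k*T + t] is the best score of a chain
// ending at AFP k that used t twists. Returns the chained AFP indices in order.
std::vector<int> chain_afps(const std::vector<AFP> &afps, double gap_pen,
                            double twist_pen, int max_twists) {
    assert(max_twists >= 0);
    const int n = static_cast<int>(afps.size());
    const int T = max_twists + 1;
    const double NEG = -1e18;
    std::vector<double> dp(n * T, NEG);
    std::vector<int> prev(n * T, -1);  // flattened predecessor state, -1 = chain start
    int best = -1;
    for (int k = 0; k < n; ++k) {
        dp[k * T] = afps[k].score;  // start a new chain at k with zero twists
        for (int p = 0; p < k; ++p) {
            int di = afps[k].i - (afps[p].i + afps[p].len);
            int dj = afps[k].j - (afps[p].j + afps[p].len);
            if (di < 0 || dj < 0) continue;  // fragments overlap or are out of order
            bool twist = (di != dj);         // diagonal changed between p and k
            for (int t = 0; t < T; ++t) {
                if (dp[p * T + t] == NEG) continue;
                int nt = t + (twist ? 1 : 0);
                if (nt >= T) continue;       // would exceed the twist limit
                double cand = dp[p * T + t] + afps[k].score
                              - gap_pen * std::max(di, dj)
                              - (twist ? twist_pen : 0.0);
                if (cand > dp[k * T + nt]) {
                    dp[k * T + nt] = cand;
                    prev[k * T + nt] = p * T + t;
                }
            }
        }
        for (int t = 0; t < T; ++t)
            if (dp[k * T + t] > NEG && (best < 0 || dp[k * T + t] > dp[best]))
                best = k * T + t;
    }
    std::vector<int> chain;
    for (int s = best; s >= 0; s = prev[s]) chain.push_back(s / T);
    std::reverse(chain.begin(), chain.end());
    return chain;
}
```

In this sketch, each twist in the returned chain marks a candidate hinge, and the runs of AFPs between twists would correspond to the blocks handed to the refinement step.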
  • CLI Enhancements:
    Exposed the -hinge option in the main USalign command-line help menu (previously commented out). Added internal parsing (hinge_set) so users can either set the maximum number of hinges allowed during flexible alignment, or omit -hinge to remove the hinge limit when processing large and complex structures.
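
The "set or unlimited" behaviour described for -hinge can be sketched as follows. Only the `-hinge` flag and the `hinge_set` name come from the PR description. The `HingeConfig` struct, the `hinge_opt` field, and both helper functions are assumptions made for illustration, not the actual USalign argument parser.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical container: hinge_set records whether -hinge was given at all.
struct HingeConfig {
    bool hinge_set;
    int hinge_opt;  // maximum number of hinges; ignored when hinge_set is false
};

// Scan the argument list for "-hinge N". Other options are skipped here.
HingeConfig parse_hinge(int argc, const char *const argv[]) {
    HingeConfig cfg = {false, 0};
    for (int i = 1; i < argc; ++i) {
        if (std::strcmp(argv[i], "-hinge") == 0 && i + 1 < argc) {
            cfg.hinge_opt = std::atoi(argv[++i]);
            cfg.hinge_set = true;
        }
    }
    return cfg;
}

// True when another hinge may be introduced: always true without -hinge.
bool may_add_hinge(const HingeConfig &cfg, int hinges_so_far) {
    return !cfg.hinge_set || hinges_so_far < cfg.hinge_opt;
}
```

Under these assumptions, an invocation such as `USalign a.pdb b.pdb -mm 10 -hinge 3` (file names are placeholders) would cap the alignment at three hinges, while leaving out `-hinge` would impose no cap.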
