Structure Modules
Structure modules convert trunk representations into 3D atomic coordinates. They operate on the single representation and pair representation to produce per-residue frames and atom positions.
Quick Start
from molfun.modules.structure_module import STRUCTURE_MODULE_REGISTRY
# List available structure modules
print(STRUCTURE_MODULE_REGISTRY.list())
# ["diffusion", "ipa"]
# Build a structure module
sm = STRUCTURE_MODULE_REGISTRY.build("ipa", c_s=384, c_z=128, num_heads=12)
# Swap in a model
from molfun import MolfunStructureModel
model = MolfunStructureModel.from_pretrained("openfold_v2")
model.swap("structure_module", "diffusion")
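STRUCTURE_MODULE_REGISTRY (module-attribute)
Registry of the available structure modules. As in the Quick Start, list() returns the registered names and build(name, ...) constructs the named module from keyword arguments.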
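A name-to-class registry of the kind used in the Quick Start can be sketched as follows. This is a hypothetical illustration, not molfun's implementation: `Registry`, `register`, `ToyIPA` and `ToyDiffusion` are made-up names, and only `list()` and `build()` mirror calls that appear on this page.

```python
from typing import Callable, Dict, List


class Registry:
    """Hypothetical name-to-class registry; not the molfun implementation."""

    def __init__(self) -> None:
        self._classes: Dict[str, type] = {}

    def register(self, name: str) -> Callable[[type], type]:
        # Used as a class decorator: @registry.register("ipa")
        def decorator(cls: type) -> type:
            if name in self._classes:
                raise KeyError(f"{name!r} is already registered")
            self._classes[name] = cls
            return cls
        return decorator

    def list(self) -> List[str]:
        # Sorted so the output does not depend on registration order.
        return sorted(self._classes)

    def build(self, name: str, **kwargs):
        # Look up the class and forward keyword arguments to its constructor.
        if name not in self._classes:
            raise KeyError(f"unknown module {name!r}; available: {self.list()}")
        return self._classes[name](**kwargs)


registry = Registry()


@registry.register("ipa")
class ToyIPA:
    def __init__(self, c_s: int, c_z: int, num_heads: int = 12) -> None:
        self.c_s, self.c_z, self.num_heads = c_s, c_z, num_heads


@registry.register("diffusion")
class ToyDiffusion:
    def __init__(self, n_steps: int = 100) -> None:
        self.n_steps = n_steps


sm = registry.build("ipa", c_s=384, c_z=128)
print(registry.list())
```

Forwarding `**kwargs` unchanged means a misspelled argument fails in the module's own constructor, which keeps the registry itself free of per-module knowledge.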
BaseStructureModule
Bases: ABC, Module
Maps (single_repr, pair_repr) → 3D structure.
Different paradigms:

- IPA (AF2): iterative refinement with invariant point attention
- Diffusion (RF-Diffusion/AF3): denoising diffusion on frames
- Equivariant (SE3-Transformers): equivariant message passing

All must produce a StructureModuleOutput with at minimum the positions field populated.
forward (abstractmethod)
forward(single: Tensor, pair: Tensor, aatype: Tensor | None = None, mask: Tensor | None = None, **kwargs) -> StructureModuleOutput
Predict 3D coordinates from representations.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| single | Tensor | Per-residue features [B, L, D_single]. | required |
| pair | Tensor | Pairwise features [B, L, L, D_pair]. | required |
| aatype | Tensor \| None | Residue types [B, L] (int64, 0-20). | None |
| mask | Tensor \| None | Residue mask [B, L] (1 = valid). | None |
| **kwargs | | Subclass-specific args (e.g. …) | {} |

Returns:

| Type | Description |
|---|---|
| StructureModuleOutput | StructureModuleOutput with predicted coordinates. |

Abstract base class for structure modules.
Forward Signature

| Parameter | Type | Description |
|---|---|---|
| single | Tensor | Single representation (B, L, c_s) |
| pair | Tensor | Pair representation (B, L, L, c_z) |
| mask | Tensor \| None | Sequence mask (B, L) |

Returns: StructureModuleOutput
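The contract above (implement forward, return an output whose positions field is populated) can be illustrated without any dependencies. The sketch below uses stand-ins throughout: `ToyOutput`, `ToyBase` and `StraightChain` are hypothetical names, nested lists replace tensors, and positions hold a single point per residue rather than a full atom set. The real base class also derives from Module, which is omitted here.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Optional

Coords = List[List[List[float]]]  # [B][L][3], plain lists standing in for tensors


@dataclass
class ToyOutput:
    """Stand-in for StructureModuleOutput; only positions is mandatory."""
    positions: Coords
    plddt: Optional[List[List[float]]] = None


class ToyBase(ABC):
    """Stand-in for BaseStructureModule."""

    @abstractmethod
    def forward(self, single, pair, aatype=None, mask=None, **kwargs) -> ToyOutput:
        ...


class StraightChain(ToyBase):
    """Toy module: ignores the features and lays residues on a line."""

    def __init__(self, spacing: float = 3.8) -> None:
        self.spacing = spacing  # typical CA-CA distance in Angstroms

    def forward(self, single, pair, aatype=None, mask=None, **kwargs) -> ToyOutput:
        positions = []
        for b, seq in enumerate(single):
            chain = []
            for i in range(len(seq)):
                valid = mask is None or mask[b][i] == 1
                # Masked (padding) residues are parked at the origin.
                chain.append([i * self.spacing, 0.0, 0.0] if valid else [0.0, 0.0, 0.0])
            positions.append(chain)
        return ToyOutput(positions=positions)


single = [[[0.0] * 4 for _ in range(3)]]                      # B=1, L=3, D_single=4
pair = [[[[0.0] * 2 for _ in range(3)] for _ in range(3)]]    # B=1, L=3, L=3, D_pair=2
out = StraightChain().forward(single, pair, mask=[[1, 1, 0]])
```

Optional fields such as plddt stay at None here, which matches the requirement that only positions must be filled in.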
StructureModuleOutput (dataclass)
Standardized output from any structure prediction module.

| Field | Type | Description |
|---|---|---|
| positions | Tensor | Atom positions (B, L, 37, 3) in Angstroms |
| frames | Tensor | Backbone frames (B, L, 4, 4) as rigid transforms |
| plddt | Tensor | Per-residue confidence (B, L) in [0, 1] |
| pae | Tensor \| None | Predicted aligned error (B, L, L) |

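To make the frames field concrete, here is a minimal sketch of how one 4x4 rigid transform maps a point from a residue's local frame into global coordinates. It assumes the common homogeneous layout (rotation in the upper-left 3x3 block, translation in the last column); this page does not state which layout the library uses. `make_frame` and `apply_frame` are hypothetical helpers.

```python
import math
from typing import List

Matrix = List[List[float]]


def make_frame(angle_z: float, translation: List[float]) -> Matrix:
    """4x4 homogeneous transform: rotation about z, then translation."""
    c, s = math.cos(angle_z), math.sin(angle_z)
    tx, ty, tz = translation
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply_frame(frame: Matrix, point: List[float]) -> List[float]:
    """Map a point from the residue's local frame to global coordinates."""
    x, y, z = point
    homogeneous = [x, y, z, 1.0]
    # Only the first three rows are needed; the last row is always [0, 0, 0, 1].
    return [sum(frame[r][k] * homogeneous[k] for k in range(4)) for r in range(3)]


frame = make_frame(math.pi / 2, [10.0, 0.0, 0.0])
local_atom = [1.0, 0.0, 0.0]
global_atom = apply_frame(frame, local_atom)  # approximately [10.0, 1.0, 0.0]
```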
IPA (Invariant Point Attention)
IPAStructureModule
Bases: BaseStructureModule
Simplified Invariant Point Attention (IPA) structure module, following AlphaFold2, for custom model building.
It iteratively refines backbone frames using geometric attention that is invariant to global rotations and translations. This is a research-friendly implementation; for production use with pre-trained AlphaFold2 weights, use OpenFoldAdapter instead.
sm = STRUCTURE_MODULE_REGISTRY.build(
"ipa",
c_s=384,
c_z=128,
num_heads=12,
num_layers=8,
num_query_points=4,
num_value_points=8,
)
# s, z and mask come from the trunk: (B, L, c_s), (B, L, L, c_z), (B, L)
output = sm(s, z, mask=mask)
coords = output.positions # (B, L, 37, 3)
plddt = output.plddt # (B, L)

| Parameter | Type | Default | Description |
|---|---|---|---|
| c_s | int | required | Single representation dimension |
| c_z | int | required | Pair representation dimension |
| num_heads | int | 12 | Number of IPA heads |
| num_layers | int | 8 | Number of IPA refinement layers |
| num_query_points | int | 4 | Number of query points per head |
| num_value_points | int | 8 | Number of value points per head |

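The invariance claimed for IPA rests on a simple geometric fact: in AlphaFold2, the point term of the attention logits depends on squared distances between query and key points expressed in global coordinates, and such distances do not change when the whole structure is rotated and translated. The sketch below checks only that geometric core on two toy points; it is not the molfun module, and `rigid` and `sq_dist` are hypothetical helpers.

```python
import math
from typing import List

Vec = List[float]


def rotate_z(p: Vec, angle: float) -> Vec:
    c, s = math.cos(angle), math.sin(angle)
    return [c * p[0] - s * p[1], s * p[0] + c * p[1], p[2]]


def rigid(p: Vec, angle: float, shift: Vec) -> Vec:
    """Global rigid motion: rotate about z, then translate."""
    r = rotate_z(p, angle)
    return [r[i] + shift[i] for i in range(3)]


def sq_dist(a: Vec, b: Vec) -> float:
    return sum((a[i] - b[i]) ** 2 for i in range(3))


# A query point and a key point, already expressed in global coordinates.
q_point = [1.0, 2.0, 3.0]
k_point = [-2.0, 0.5, 4.0]

before = sq_dist(q_point, k_point)
# Apply the same global motion to both points, as if the protein were moved.
after = sq_dist(rigid(q_point, 0.7, [5.0, -3.0, 2.0]),
                rigid(k_point, 0.7, [5.0, -3.0, 2.0]))
```

Here `before` and `after` agree up to floating-point rounding, so any score built from this distance is unaffected by where the structure sits in space.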
Diffusion
DiffusionStructureModule
Bases: BaseStructureModule
Denoising diffusion structure prediction.
During training, coordinates are noised and the module learns to predict the clean coordinates (x0-prediction). During inference, coordinates are generated by iteratively denoising from pure noise.
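The training objective described above can be sketched on toy data. The example assumes the usual DDPM-style Gaussian noising, x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps, and a mean-squared-error loss on the predicted clean coordinates; the source only states that coordinates are noised and that the module predicts x0, so both choices are assumptions. `add_noise` and `x0_loss` are hypothetical helpers.

```python
import math
import random
from typing import List, Tuple


def add_noise(x0: List[float], alpha_bar: float,
              rng: random.Random) -> Tuple[List[float], List[float]]:
    """Sample x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    x_t = [a * x + b * e for x, e in zip(x0, eps)]
    return x_t, eps


def x0_loss(x0_pred: List[float], x0: List[float]) -> float:
    """Mean squared error between predicted and clean coordinates."""
    return sum((p - t) ** 2 for p, t in zip(x0_pred, x0)) / len(x0)


rng = random.Random(0)
clean = [0.0, 3.8, 7.6]            # toy 1-D "coordinates"
noisy, eps = add_noise(clean, alpha_bar=0.5, rng=rng)

perfect = x0_loss(clean, clean)    # a perfect denoiser scores zero
naive = x0_loss(noisy, clean)      # echoing the noisy input does not
```

With x0-prediction the network output is compared against the clean coordinates directly, so the loss is expressed in the same units as the structure.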
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| d_single | int | Single representation dimension. | 384 |
| d_pair | int | Pair representation dimension. | 128 |
| d_model | int | Internal hidden dimension. | 256 |
| n_layers | int | Number of denoising network layers. | 4 |
| n_steps | int | Diffusion timesteps for inference. | 100 |
| noise_schedule | str | "linear" or "cosine". | 'cosine' |

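The two noise_schedule options can be compared with a small sketch. It uses the formulations commonly found in the diffusion literature: linearly spaced betas (the 1e-4 to 0.02 range is an assumed default, originally proposed for 1000 steps) and the cosine schedule with offset s = 0.008. Whether the module uses these exact constants is not stated on this page. Both functions return alpha_bar, the fraction of signal remaining at each step.

```python
import math
from typing import List


def linear_alpha_bar(n_steps: int, beta_start: float = 1e-4,
                     beta_end: float = 0.02) -> List[float]:
    """Cumulative product of (1 - beta_t) for linearly spaced betas."""
    out, running = [], 1.0
    for t in range(n_steps):
        frac = t / (n_steps - 1) if n_steps > 1 else 0.0
        beta = beta_start + frac * (beta_end - beta_start)
        running *= 1.0 - beta
        out.append(running)
    return out


def cosine_alpha_bar(n_steps: int, s: float = 0.008) -> List[float]:
    """alpha_bar(t) = f(t) / f(0), f(t) = cos^2(((t / T + s) / (1 + s)) * pi / 2)."""
    def f(t: int) -> float:
        return math.cos(((t / n_steps + s) / (1.0 + s)) * math.pi / 2.0) ** 2
    return [f(t) / f(0) for t in range(1, n_steps + 1)]


linear = linear_alpha_bar(100)
cosine = cosine_alpha_bar(100)
```

Both sequences decrease monotonically from near 1. The cosine schedule reaches essentially zero signal at the last step, whereas the linear one with these betas and only 100 steps still retains some signal.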
forward
forward(single: Tensor, pair: Tensor, aatype: Tensor | None = None, mask: Tensor | None = None, gt_coords: Tensor | None = None) -> StructureModuleOutput
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| single | Tensor | Per-residue features [B, L, D_single]. | required |
| pair | Tensor | Pairwise features [B, L, L, D_pair]. | required |
| aatype | Tensor \| None | Residue types [B, L] (int64, optional). | None |
| mask | Tensor \| None | Residue mask [B, L] (optional). | None |
| gt_coords | Tensor \| None | Ground-truth Cα coordinates [B, L, 3]. Required for training. During inference this is ignored and coordinates are generated from noise. | None |

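The inference path, where gt_coords is ignored and coordinates are produced from noise, can be sketched as a loop over decreasing noise levels. The update below is a deterministic DDIM-style step for an x0-predicting denoiser; this is a swapped-in, well-known technique chosen for illustration, since the page does not say which sampler the module uses. `sample` is a hypothetical function, and the lambda stands in for a trained network conditioned on the single and pair representations.

```python
import math
import random
from typing import Callable, List

Vec = List[float]


def sample(denoiser: Callable[[Vec, int], Vec], alpha_bar: List[float],
           dim: int, rng: random.Random) -> Vec:
    """Deterministic (DDIM-style) sampling with an x0-predicting denoiser.

    alpha_bar[t] is the remaining signal fraction at step t; every entry
    must be strictly below 1.
    """
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]       # start from pure noise
    for t in reversed(range(len(alpha_bar))):
        a_t = alpha_bar[t]
        a_prev = alpha_bar[t - 1] if t > 0 else 1.0       # 1.0 means no noise left
        x0_hat = denoiser(x, t)
        # Noise implied by the current sample and the x0 estimate.
        denom = math.sqrt(1.0 - a_t)
        eps_hat = [(xi - math.sqrt(a_t) * x0i) / denom for xi, x0i in zip(x, x0_hat)]
        # Re-noise the estimate to the previous, lower noise level.
        x = [math.sqrt(a_prev) * x0i + math.sqrt(1.0 - a_prev) * ei
             for x0i, ei in zip(x0_hat, eps_hat)]
    return x


target = [0.0, 3.8, 7.6]
toy_alpha_bar = [1.0 - (t + 1) / 11.0 for t in range(10)]  # 10/11 down to 1/11
# Oracle denoiser: always returns the target, standing in for a trained model.
result = sample(lambda x, t: target, toy_alpha_bar, dim=3, rng=random.Random(0))
```

With the oracle denoiser the loop lands on the target regardless of the starting noise, which isolates the sampler logic from model quality.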
Diffusion-based structure module that generates 3D coordinates through iterative denoising, inspired by diffusion models for molecular generation.
sm = STRUCTURE_MODULE_REGISTRY.build(
"diffusion",
d_single=384,
d_pair=128,
n_steps=100,
noise_schedule="cosine",
)
# Inference call: coordinates are generated from noise (pass gt_coords when training)
output = sm(s, z, mask=mask)
coords = output.positions

| Parameter | Type | Default | Description |
|---|---|---|---|
| d_single | int | 384 | Single representation dimension |
| d_pair | int | 128 | Pair representation dimension |
| n_steps | int | 100 | Number of diffusion denoising steps |
| noise_schedule | str | "cosine" | Noise schedule ("cosine", "linear") |
| num_heads | int | 8 | Number of attention heads in the denoiser |