Chapter 1.4: Protein Domains and Architecture

1. Introduction

Welcome to the structural heart of drug targeting. In this section, we move from abstract protein sequences to the tangible, functional units that drugs actually bind. Understanding how proteins are organized into distinct domains and motifs is fundamental to identifying where and how drugs can interact with them. Just as architects design buildings from functional modules (kitchens, bedrooms, bathrooms), evolution has assembled proteins from modular building blocks called domains and motifs.


2. Key Concepts and Definitions

  • Protein Motif: A specific 3D arrangement of amino acids that appears across different proteins. Motifs are smaller than domains and may not fold independently.

  • Protein Domain: A distinct, independently folding structural unit within a protein that typically performs a specific function. Domains typically contain about 50 to 200 amino acids and can exist on their own or be combined with other domains.

  • Protein Fold: The overall 3D structure of a protein domain, representing how the polypeptide chain is arranged in space.

  • Domain Architecture: The arrangement and order of domains within a multi-domain protein. This organization often reflects the protein’s evolutionary history and functional capabilities.


3. Main Content

Understanding recurring structural patterns helps you recognize functional elements across protein families.

3.1 Secondary Structures

Secondary structures (α-helices and β-sheets) are the fundamental building blocks of protein architecture.

| Structure | Description | Key Features |
|---|---|---|
| α-Helix | Right-handed spiral structure | 3.6 residues per turn; stabilized by hydrogen bonds between backbone atoms |
| β-Sheet | Extended strands lying side by side | Strands connected by hydrogen bonds; can be parallel or antiparallel |
| Turns & Loops | Connecting structures | Connect helices and sheets; often contain functional sites |
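
The 3.6 residues-per-turn figure can be turned into a quick size estimate. The sketch below assumes the textbook rise of about 1.5 Å per residue for an ideal α-helix, a value that is not given in the table above; the function name `helix_geometry` is illustrative.

```python
RESIDUES_PER_TURN = 3.6    # from the table above
RISE_PER_RESIDUE_A = 1.5   # assumed textbook value for an ideal alpha-helix (angstroms)

def helix_geometry(n_residues):
    """Return (number of turns, approximate length in angstroms) for an ideal helix."""
    turns = n_residues / RESIDUES_PER_TURN
    length = n_residues * RISE_PER_RESIDUE_A
    return turns, length

turns, length = helix_geometry(18)
print(f"18 residues: {turns:.1f} turns, about {length:.0f} angstroms long")
```

Real helices bend and fray at their ends, so treat the result as a rough estimate rather than a measurement.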

Let’s visualize the secondary structure of lipoteichoic acid synthase, a protein from Gram-positive bacteria and a potential antibiotic drug target.

In the example below, you will see:

  • α-Helix coloured as red

  • β-Sheet coloured as yellow

  • Turns & Loops coloured as green


import py3Dmol
import requests
from IPython.display import display

# --- 1. Get Example PDB Data ---
pdb_id = "2W5T"
url = f"https://files.rcsb.org/download/{pdb_id}.pdb"
pdb_data = ""
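try:
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # treat HTTP errors (e.g. 404) as failures, not as PDB text
    pdb_data = response.text
except requests.exceptions.RequestException as e:
    print(f"Error fetching PDB file: {e}")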

if pdb_data:
    # --- 2. Initialize the py3Dmol Viewer ---
    viewer = py3Dmol.view(width=760, height=760)

    # --- 3. Load the PDB Data ---
    viewer.addModel(pdb_data, 'pdb')
    
    # --- 4. Style Manually (No addSSlabels) ---
    # We apply styles in order, from general to specific.
    
    # First, set the default for everything: green cartoon
    # This will color all loops, turns, and unstructured parts.
    # The empty selector {} means "all atoms".
    viewer.setStyle({}, {'cartoon': {'color': 'green'}})
    
    # Next, add a style for helices: red cartoon
    # This selects only 'ss':'h' (helix) and overrides the green.
    viewer.addStyle({'ss':'h'}, {'cartoon': {'color': 'red'}})
    
    # Finally, add a style for sheets: yellow cartoon
    # This selects only 'ss':'s' (sheet) and overrides the green.
    viewer.addStyle({'ss':'s'}, {'cartoon': {'color': 'yellow'}})

    # --- 5. Center the View and Display ---
    viewer.zoomTo()
    
    display(viewer)
else:
    print("No PDB data loaded, skipping display.")
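
A PDB file usually carries its own secondary-structure annotation in HELIX and SHEET header records, so the number of helices and strands can be estimated without a viewer. The sketch below is a minimal, hypothetical helper (`summarize_secondary_structure` is not part of py3Dmol) that reads the fixed-column residue numbers from those records. The two sample lines are made up for illustration; in practice you would pass it the `pdb_data` string fetched in the cell above.

```python
def summarize_secondary_structure(pdb_text):
    """Count HELIX/SHEET records and the residues they span."""
    summary = {"helices": 0, "helix_residues": 0,
               "strands": 0, "strand_residues": 0}
    for line in pdb_text.splitlines():
        if line.startswith("HELIX "):
            # HELIX: start residue in columns 22-25, end residue in columns 34-37
            start, end = int(line[21:25]), int(line[33:37])
            summary["helices"] += 1
            summary["helix_residues"] += end - start + 1
        elif line.startswith("SHEET "):
            # SHEET: start residue in columns 23-26, end residue in columns 34-37
            start, end = int(line[22:26]), int(line[33:37])
            summary["strands"] += 1
            summary["strand_residues"] += end - start + 1
    return summary

# Made-up records, for illustration only
sample = (
    "HELIX    1   1 ASP A   12  LYS A   20  1\n"
    "SHEET    1   A 2 VAL A  30  ILE A  35  0\n"
)
print(summarize_secondary_structure(sample))
```

The counts are only an estimate: a strand that belongs to two sheets can be listed twice, and insertion codes are ignored here.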


3.2 Common Structural Motifs

Structural motifs are specific arrangements of secondary structures that perform distinct functions.

  • Many of these motifs are important drug targets, particularly those involved in:

    • Transcription regulation (HTH, Zinc Fingers, Leucine Zippers)

    • Signal transduction (EF-Hand calcium binding)

    • Protein-protein interactions

| Motif | Structure | Function | Drug Discovery Relevance |
|---|---|---|---|
| Helix-Turn-Helix (HTH) | Two α-helices separated by short turn; recognition helix fits into DNA major groove | DNA binding (common in transcription factors) | Target for cancer therapeutics (e.g., blocking oncogenic transcription factors) |
| Zinc Finger | Coordinated by zinc ion; typically involves β-sheet and α-helix; example: CCHH zinc fingers | DNA/RNA binding; protein-protein interactions | Increasingly druggable with small molecules that disrupt zinc coordination |
| Leucine Zipper | Amphipathic α-helix; leucines at every 7th position | Dimerization; DNA binding | Disrupting dimerization can inhibit transcription factor activity (e.g., AP-1 inhibitors) |
| EF-Hand | Helix-loop-helix motif that binds Ca²⁺; found in calmodulin, troponin C | Calcium binding | Modulating calcium signaling pathways |
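
The "leucine at every 7th position" rule for leucine zippers is simple enough to check directly on a sequence. The sketch below is one possible way to scan for it; the function name, the `min_repeats` threshold of four, and the toy sequence are all assumptions made for illustration, not an established motif-detection method.

```python
def find_heptad_leucines(sequence, min_repeats=4):
    """Return 0-based start positions where Leu recurs at every 7th residue."""
    seq = sequence.upper()
    hits = []
    for start in range(len(seq)):
        positions = [start + 7 * k for k in range(min_repeats)]
        # Require the full run of positions to fit in the sequence and all be Leu
        if positions[-1] < len(seq) and all(seq[p] == "L" for p in positions):
            hits.append(start)
    return hits

# Made-up toy sequence with Leu at positions 4, 11, 18 and 25 (0-based)
toy = "AAAA" + "LAAAAAA" * 3 + "L" + "AAAA"
print(find_heptad_leucines(toy))
```

A hit only flags a candidate: a real leucine zipper also needs the surrounding residues to form an amphipathic helix, which this scan does not test.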

Different structure motifs

⚠️ WARNING
You can run the code below to view the different proteins. Click the rocket icon at the top of the Jupyter Book page and select Live Code. Wait for the kernel to load before running the code.
!pip install py3Dmol
import py3Dmol

# --- Define the dictionary of proteins ---
# Each 'highlight' value is a list of residue ranges for the 'resi' selector,
# so that every range is highlighted, not only the first one.
proteins = {
    "1LMB": {
        "desc": "Motif: Helix-turn-helix (Lambda repressor)",
        "highlight": ["33-52"]
    },
    "1ZAA": {
        "desc": "Motif: Zinc finger (Zif268)",
        "highlight": ["12-24", "39-51", "66-78"]
    },
    "1YSA": {
        "desc": "Motif: Leucine zipper (GCN4)",
        "highlight": ["250-281"]
    },
    "1CLL": {
        "desc": "Motif: EF-hand (Calmodulin)",
        "highlight": ["12-29", "48-65", "85-102", "121-138"]
    }
}

# --- 1. Display the menu for the students ---
print("--- Protein Selection Menu ---")
for pdb_id, data in proteins.items():
    print(f"  [{pdb_id}]: {data['desc']}")

# --- 2. Get the choice using input() ---
selected_id = input("Enter the 4-character PDB ID to render: ").upper().strip()

# --- 3. Validate the choice and render the protein ---
if selected_id in proteins:
    protein_data = proteins[selected_id]
    highlight_residues = protein_data.get('highlight')
    
    # Set up the py3Dmol viewer
    viewer = py3Dmol.view(query=f'pdb:{selected_id}', width=760, height=760)

    if highlight_residues:
        # Grey cartoon for the whole structure, then recolour the motif residues in red
        viewer.setStyle({}, {'cartoon': {'color': 'lightgrey'}})
        viewer.setStyle({'resi': highlight_residues}, {'cartoon': {'color': 'red'}})
    else:
        viewer.setStyle({}, {'cartoon': {'color': 'spectrum'}})

    # Ligands and other hetero atoms as sticks
    viewer.setStyle({'hetflag': True}, {'stick': {'colorscheme': 'element', 'radius': 0.3}})
    # Metal ions as spheres; both capitalisations of the element symbol are listed
    # because the viewer may store 'Zn'/'Ca' rather than 'ZN'/'CA'
    viewer.setStyle({'elem': ['Zn', 'Ca', 'ZN', 'CA']}, {'sphere': {'radius': 0.8}})
    viewer.zoomTo()
    viewer.show()
else:
    print(f"\nError: '{selected_id}' is not a valid choice.")
    print("Please re-run the cell and select a PDB ID from the list.")

3.3 Major Protein Domains and Their Functions

Domains are the functional units that are commonly “druggable”. Here are the most pharmacologically relevant families:

| Domain | Function | Structure | Drug Discovery Relevance |
|---|---|---|---|
| Kinase Domain | Phosphorylate substrates using ATP | Bilobal structure with ATP-binding pocket between lobes | Gold mine with >50 FDA-approved inhibitors (e.g., imatinib for CML); druggable sites: ATP-binding pocket, allosteric sites, substrate-binding groove |
| SH2 Domain (Src Homology 2) | Recognize and bind phosphorylated tyrosine residues | Central β-sheet flanked by α-helices | Disrupting protein-protein interactions in signaling cascades |
| SH3 Domain (Src Homology 3) | Bind proline-rich sequences (PxxP motifs) | β-barrel structure | Protein-protein interaction inhibitors |
| PDZ Domain | Recognize C-terminal sequences (typically S/T-X-V/L) | Six β-strands and two α-helices | Scaffolding proteins in signaling complexes |
| Immunoglobulin (Ig) Domain | Cell adhesion, immune recognition | β-sandwich (two β-sheets) | Target for monoclonal antibodies and small molecule inhibitors |
| WD40 Repeat | Scaffolding for protein complexes | β-propeller (typically 7 blades, each with 4 β-strands) | Difficult to drug but increasingly targeted (e.g., WDR5 in cancer) |
| Death Domain | Protein-protein interactions in apoptosis | Six α-helical bundle | Modulating cell death pathways in cancer |
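
The sequence patterns these recognition domains bind, PxxP for SH3 and a C-terminal S/T-X-V/L for PDZ, map naturally onto regular expressions. The sketch below shows one way to search for them with Python's standard `re` module; the function names and the toy sequence are hypothetical, and a match only marks a candidate binding site, not a confirmed one.

```python
import re

PXXP = re.compile(r"(?=(P..P))")        # lookahead so overlapping matches are reported
PDZ_CTERM = re.compile(r"[ST].[VL]$")   # S/T-X-V/L anchored at the C-terminus

def find_pxxp(sequence):
    """Return (0-based position, matched text) for every PxxP occurrence."""
    return [(m.start(), m.group(1)) for m in PXXP.finditer(sequence.upper())]

def has_pdz_ligand_tail(sequence):
    """True if the last three residues match S/T-X-V/L."""
    return PDZ_CTERM.search(sequence.upper()) is not None

toy = "MAPPLPPRNRGSETSV"   # made-up sequence, for illustration only
print(find_pxxp(toy))
print(has_pdz_ligand_tail(toy))
```

The lookahead form is a deliberate choice: proline-rich stretches often contain overlapping PxxP windows, which a plain `P..P` pattern would skip.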

Protein domains

⚠️ WARNING
You can run the code below to view the different proteins. Click the rocket icon at the top of the Jupyter Book page and select Live Code. Wait for the kernel to load before running the code.
import py3Dmol

# --- Define the dictionary of proteins---
proteins = {
    "7Q7E": {
        "desc": "Domain: Kinase domain of human Serine/Threonine Kinase 17B in complex with ATP"
    },
    "1SHA": {
        "desc": "Domain: SH2 (Src) with phosphotyrosine peptide"
    },
    "1CKA": {
        "desc": "Domain: SH3 (c-Crk) with proline-rich peptide"
    },
    "1BE9": {
        "desc": "Domain: PDZ (PSD-95) with C-terminal peptide"
    },
    "1TIT": {
        "desc": "Domain: Immunoglobulin (Ig) from Titin"
    },
    "2H9M": {
        "desc": "Domain: WD40 Repeat (WDR5 beta-propeller)"
    },
    "1E3Y": {
        "desc": "Domain: Death Domain (FADD 6-helix bundle)"
    }
}


# --- 1. Display the menu ---
print("--- Protein Domain Selection Menu ---")
for pdb_id, data in proteins.items():
    print(f"  [{pdb_id}]: {data['desc']}")

# --- 2. Get choice using input() ---
selected_id = input("Enter the 4-character PDB ID to render: ").upper().strip()


# --- 3. Validate the choice and render the protein ---
if selected_id in proteins:
    protein_data = proteins[selected_id]
    highlight_residues = protein_data.get('highlight')
        
    # Set up the py3Dmol viewer
    viewer = py3Dmol.view(query=f'pdb:{selected_id}', width=760, height=760)

    if highlight_residues:
        print(f"Applying special style: Highlighting motif (residues {highlight_residues}) in red.")

        viewer.setStyle({}, {'cartoon': {'color': 'lightgrey'}})
        viewer.setStyle({'resi': highlight_residues}, {'cartoon': {'color': 'red'}})
    else:
        # No 'highlight' entry is defined for these domains, so colour the chain as a spectrum
        viewer.setStyle({}, {'cartoon': {'color': 'spectrum'}})

    # Ligands and other hetero atoms as sticks
    viewer.setStyle({'hetflag': True}, {'stick': {'colorscheme': 'element', 'radius': 0.3}})
    # Metal ions as spheres; both capitalisations of the element symbol are listed
    # because the viewer may store 'Zn'/'Ca' rather than 'ZN'/'CA'
    viewer.setStyle({'elem': ['Zn', 'Ca', 'ZN', 'CA']}, {'sphere': {'radius': 0.8}})
    viewer.zoomTo()
    viewer.show()
else:
    print(f"\nError: '{selected_id}' is not a valid choice.")
    print("Please re-run the cell and select a PDB ID from the list.")

3.4 Domain Architecture: Multi-Domain Proteins

Most human proteins contain multiple domains.

Linear Domain Arrangement

Domains are arranged like beads on a string along the polypeptide chain.

Example: Receptor Tyrosine Kinases (RTKs)

An RTK has three main domains:

  1. Extracellular ligand-binding domain

  2. Transmembrane domain

  3. Intracellular domain (kinase domain)

Different domains call for different drug strategies:

  • Extracellular: Blocking antibodies (e.g., trastuzumab for HER2)

  • Kinase domain: Small molecule inhibitors (e.g., gefitinib for EGFR)
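
The link between a domain's position and the drug strategy used against it can be written down as a small data structure. The sketch below is one way to model it; the `Domain` class and `druggable_domains` helper are hypothetical names, and the entries simply restate the RTK example above (no strategy is listed for the transmembrane domain, so it is left empty).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Domain:
    name: str
    location: str
    drug_strategy: Optional[str] = None  # None where the text lists no strategy

# Domains in the order listed above
RTK_ARCHITECTURE = (
    Domain("Ligand-binding domain", "extracellular",
           "Blocking antibodies (e.g., trastuzumab for HER2)"),
    Domain("Transmembrane domain", "membrane"),
    Domain("Kinase domain", "intracellular",
           "Small molecule inhibitors (e.g., gefitinib for EGFR)"),
)

def druggable_domains(architecture):
    """Map each domain that has a listed strategy to that strategy."""
    return {d.name: d.drug_strategy for d in architecture if d.drug_strategy}

for name, strategy in druggable_domains(RTK_ARCHITECTURE).items():
    print(f"{name}: {strategy}")
```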

Modular Signaling Proteins

Example: Src Family Kinases

Src family kinases have three main domains:

  1. SH3 domain

  2. SH2 domain

  3. Kinase domain

SH2/SH3 domains regulate kinase activity through intramolecular interactions.
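
Writing each architecture as an ordered tuple of domain names makes it easy to see which modules two proteins share. The sketch below compares the two examples from this section; the dictionary and helper names are illustrative assumptions, not a standard annotation format.

```python
# Architectures as ordered tuples of domain names, taken from the two examples above
ARCHITECTURES = {
    "Receptor tyrosine kinase": ("Ligand-binding", "Transmembrane", "Kinase"),
    "Src family kinase": ("SH3", "SH2", "Kinase"),
}

def architecture_string(arch):
    """Render an architecture as a compact 'A-B-C' label."""
    return "-".join(arch)

def shared_domains(arch_a, arch_b):
    """Domains present in both architectures, in the order they appear in arch_a."""
    return [d for d in arch_a if d in arch_b]

rtk = ARCHITECTURES["Receptor tyrosine kinase"]
src = ARCHITECTURES["Src family kinase"]
print(architecture_string(src))
print(shared_domains(rtk, src))
```

Both proteins carry a kinase domain, but the neighbouring domains differ, which is why the same catalytic module ends up under different regulation.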


4. Summary and Key Takeaways

In this section, we’ve explored how a protein’s function is organized into a modular architecture of domains and motifs. We learned that these domains are the primary targets for modern drugs.

  • Domains are functional modules: Think of proteins as modular machines where each domain performs a specific job. Understanding these modules helps identify where drugs can act.

  • Motifs are structural patterns: Recurring 3D arrangements (helix-turn-helix, zinc fingers) signal specific functions. Recognizing motifs accelerates functional annotation.

  • Architecture reflects function: The order and combination of domains determine protein behavior. Multi-domain proteins integrate multiple functions.

  • Conservation indicates importance: Highly conserved domains/motifs across species are usually functionally critical and make good drug targets.

  • Structural similarity doesn’t guarantee functional similarity: The same fold can perform different functions in different contexts.

This foundation is essential for analyzing potential binding sites and understanding drug-target interactions.