The Mine Works
Federal Register API: How to Track US Rules, Proposed Rules, and Executive Orders
tutorial June 22, 2026 · 8 min read Updated June 22, 2026

The Federal Register publishes every US executive action, proposed rule, and final rule via a REST API. Here is how to query it and what the data contains.

Every rule the US federal government makes goes through the Federal Register. Every proposed rule, every final rule, every agency notice, every executive order. The Federal Register is the official daily journal of the US federal government and it has been published since 1936.

It also has a clean REST API that requires no authentication and covers the full text of every document published since 1994. For compliance teams, policy researchers, legal professionals, and anyone building regulatory monitoring tools, this is one of the most useful and underused government APIs available.

TL;DR: The Federal Register API (federalregister.gov/api/v1) is public, free, and requires no API key. The primary endpoint is /documents.json, with query parameters for full-text search, agency filtering, document type filtering, and date ranges. It returns structured metadata plus a URL to the complete document text (the full_text_xml_url field), making it suitable for RAG pipelines over regulatory content.

What the Federal Register Is

The Federal Register is published every federal business day (about 250 times per year) by the Office of the Federal Register, a unit of the National Archives. Each issue contains:

Rules. Final rules that have completed the notice-and-comment rulemaking process and are now binding law. These are the ones that actually change regulatory requirements.

Proposed Rules. Draft rules that agencies are proposing and accepting public comment on. The comment period is typically 30 to 60 days. Tracking proposed rules is how compliance professionals get advance warning of upcoming changes.

Notices. Agency announcements that do not create binding rules. Grant award notices, meeting announcements, information collection notices, and environmental impact statements all appear as notices.

Presidential Documents. Executive orders, proclamations, and presidential memos. These are published first in the Federal Register before being compiled into the Code of Federal Regulations or other official records.

The Federal Register is distinct from the Code of Federal Regulations (CFR). The CFR is the codified version of currently effective rules, organized by title and section. The Federal Register is the chronological record of the rulemaking process. A rule typically appears in the Federal Register when it is proposed and again when it is issued as a final rule; notices that extend or reopen a comment period are published there as well. Once final, the rule is incorporated into the CFR.

The API

The base URL is https://www.federalregister.gov/api/v1/. No API key, no authentication, no rate limit published (in practice, be reasonable; the server is not high-capacity).
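
Since no rate limit is published, one cautious option is to space requests out on the client side. The sketch below is a minimal throttle; the Throttle class name and the one-second default are assumptions for illustration, not values documented by the API.

```python
import time


class Throttle:
    """Enforce a minimum interval between calls (hypothetical helper)."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None      # time of the previous call, if any

    def wait(self):
        """Sleep just long enough to keep calls min_interval seconds apart."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling wait() on a shared Throttle instance immediately before each HTTP request keeps a paginated crawl from sending requests back to back.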

The primary endpoint is:

GET /documents.json

Key query parameters:

| Parameter | Description |
| --- | --- |
| conditions[term] | Full-text search term |
| conditions[agencies][] | Agency slug (e.g. environmental-protection-agency) |
| conditions[type][] | Document type: RULE, PRORULE, NOTICE, PRESDOCU |
| conditions[publication_date][gte] | Published on or after this date (YYYY-MM-DD) |
| conditions[publication_date][lte] | Published on or before this date (YYYY-MM-DD) |
| conditions[effective_date][gte] | Effective on or after this date (YYYY-MM-DD) |
| fields[] | Specific fields to return (reduces response size) |
| per_page | Results per page (max 1000) |
| page | Page number |
| order | Sort order: newest, oldest, relevance, executive_order_number |
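
To see how these parameters combine into one request, the sketch below builds a query URL with the standard library only; nothing is sent over the network. The build_documents_url helper name and the example search term are assumptions for illustration; the parameter names are the ones listed above.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.federalregister.gov/api/v1"


def build_documents_url(params):
    """Return the /documents.json URL for a dict of query parameters.

    List values are expanded into repeated keys (doseq=True), which is
    how array-style parameters such as conditions[type][] are sent.
    Square brackets in the keys are percent-encoded by urlencode.
    """
    return f"{BASE_URL}/documents.json?{urlencode(params, doseq=True)}"


url = build_documents_url({
    "conditions[term]": "methane",
    "conditions[type][]": ["RULE", "PRORULE"],
    "conditions[publication_date][gte]": "2025-01-01",
    "per_page": 50,
    "order": "newest",
})
print(url)
```

Printing the URL before running a long crawl is a cheap way to confirm that list-valued filters were expanded into repeated keys rather than collapsed into one value.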

Python: Search Proposed Rules by Agency

Here is how to pull all proposed rules from EPA in 2025:

import requests

BASE_URL = "https://www.federalregister.gov/api/v1"

def get_proposed_rules(agency_slug, year=2025, per_page=100):
    """Fetch all proposed rules from a specific agency in a given year."""

    params = {
        "conditions[agencies][]": agency_slug,
        "conditions[type][]": "PRORULE",
        "conditions[publication_date][gte]": f"{year}-01-01",
        "conditions[publication_date][lte]": f"{year}-12-31",
        # Field names follow the API's field list: the comment deadline is
        # comments_close_on and the XML full text link is full_text_xml_url.
        "fields[]": [
            "document_number",
            "title",
            "abstract",
            "publication_date",
            "effective_on",
            "agencies",
            "action",
            "comment_url",
            "comments_close_on",
            "html_url",
            "full_text_xml_url",
            "regulation_id_number_info"
        ],
        "per_page": per_page,
        "order": "newest",
    }

    all_docs = []
    page = 1

    while True:
        params["page"] = page
        resp = requests.get(f"{BASE_URL}/documents.json", params=params, timeout=30)
        resp.raise_for_status()
        data = resp.json()

        results = data.get("results", [])
        if not results:
            break

        all_docs.extend(results)

        # Derive the page count from the total number of matching documents
        count = data.get("count", 0)
        total_pages = (count + per_page - 1) // per_page

        print(f"Page {page}/{total_pages}: {len(all_docs)}/{count} documents")

        if page >= total_pages:
            break

        page += 1

    return all_docs

# EPA's agency slug in the Federal Register API
epa_rules = get_proposed_rules("environmental-protection-agency", year=2025)

print(f"\nTotal EPA proposed rules in 2025: {len(epa_rules)}")
for rule in epa_rules[:5]:
    print(f"\n{rule['publication_date']}: {rule['title']}")
    print(f"  Comment deadline: {rule.get('comments_close_on') or 'N/A'}")
    print(f"  URL: {rule['html_url']}")

Agency slugs follow a hyphenated lowercase format. A few common ones:

| Agency | Slug |
| --- | --- |
| EPA | environmental-protection-agency |
| FDA | food-and-drug-administration |
| FCC | federal-communications-commission |
| SEC | securities-and-exchange-commission |
| OSHA | occupational-safety-and-health-administration |
| CFPB | consumer-financial-protection-bureau |
| FTC | federal-trade-commission |
The full agency list is at https://www.federalregister.gov/api/v1/agencies.json.
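
If you would rather resolve slugs programmatically than hard-code them, a lookup can be built from that agency list. The sketch below assumes each entry in agencies.json is an object with name, short_name, and slug keys; that shape, and the build_slug_index and find_slug helpers, are assumptions to verify against a live response. The sample list is made up for illustration and is not API output.

```python
def build_slug_index(agencies):
    """Map lowercase agency names and abbreviations to slugs."""
    index = {}
    for agency in agencies:
        slug = agency.get("slug")
        if not slug:
            continue
        for key in (agency.get("name"), agency.get("short_name")):
            if key:
                index[key.strip().lower()] = slug
    return index


def find_slug(index, query):
    """Return the slug for a name or abbreviation, or None if unknown."""
    return index.get(query.strip().lower())


# Illustrative sample only; in practice pass the parsed JSON from
# https://www.federalregister.gov/api/v1/agencies.json
sample = [
    {"name": "Environmental Protection Agency", "short_name": "EPA",
     "slug": "environmental-protection-agency"},
    {"name": "Federal Trade Commission", "short_name": "FTC",
     "slug": "federal-trade-commission"},
]
index = build_slug_index(sample)
print(find_slug(index, "epa"))
```

Building the index once and reusing it avoids fetching the agency list on every query.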

Key Data Fields

A full document object contains more than 40 fields. The most useful for compliance monitoring:

{
    "document_number": "2025-04821",        # Unique FR document ID
    "title": "National Ambient Air Quality Standards for Particulate Matter",
    "abstract": "EPA is proposing to revise...",  # Short summary
    "action": "Proposed rule",              # What the agency is doing
    "publication_date": "2025-03-15",       # Date published in FR
    "effective_on": null,                   # Effective date (set for final rules)
    "comments_close_on": "2025-04-14",      # Comment period deadline
    "comment_url": "https://www.regulations.gov/...",
    "html_url": "https://www.federalregister.gov/documents/...",
    "full_text_xml_url": "https://www.federalregister.gov/documents/.../full_text.xml",
    "pdf_url": "https://www.govinfo.gov/content/pkg/FR-2025-03-15/pdf/...",
    "agencies": [
        {
            "name": "Environmental Protection Agency",
            "id": 199,
            "slug": "environmental-protection-agency"
        }
    ],
    "regulation_id_number_info": {
        "2060-AV12": {
            "xml_url": "...",
            "issue": "Fall 2024"
        }
    },
    "significant": true  # Deemed significant under Executive Order 12866
}

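The regulation_id_number_info field is keyed by the Regulation Identifier Number (RIN), the identifier assigned to a rulemaking in the Unified Agenda. Because the same RIN appears on the proposed rule and the final rule, it lets you follow one rulemaking across its entire lifecycle.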
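
As a sketch of that lifecycle tracking, documents can be grouped by RIN so that a proposed rule and its later final rule end up together. This assumes regulation_id_number_info is an object keyed by RIN, as in the sample above, and that each document carries type and publication_date; the group_by_rin helper and the sample records are hypothetical.

```python
from collections import defaultdict


def group_by_rin(docs):
    """Group documents by RIN, each group sorted by publication date."""
    groups = defaultdict(list)
    for doc in docs:
        rin_info = doc.get("regulation_id_number_info") or {}
        for rin in rin_info:
            groups[rin].append(doc)
    for rin_docs in groups.values():
        # ISO dates (YYYY-MM-DD) sort correctly as plain strings
        rin_docs.sort(key=lambda d: d.get("publication_date", ""))
    return dict(groups)


# Hypothetical records for illustration, not real API output
docs = [
    {"document_number": "B", "type": "Rule", "publication_date": "2025-09-02",
     "regulation_id_number_info": {"2060-AV12": {}}},
    {"document_number": "A", "type": "Proposed Rule", "publication_date": "2025-03-14",
     "regulation_id_number_info": {"2060-AV12": {}}},
    {"document_number": "C", "type": "Notice", "publication_date": "2025-04-01",
     "regulation_id_number_info": {}},
]
timeline = group_by_rin(docs)
for rin, items in timeline.items():
    print(rin, [(d["publication_date"], d["type"]) for d in items])
```

Documents without a RIN, such as most notices, are simply left out of the grouping.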

Monitoring a Regulatory Topic Across Agencies

One of the more powerful use cases is tracking a topic keyword across all federal agencies to catch rules you might not know to look for:

from datetime import datetime, timedelta

def monitor_regulatory_topic(keyword, days_back=30):
    """Search all agencies for recent documents on a topic."""
    cutoff = (datetime.now() - timedelta(days=days_back)).strftime("%Y-%m-%d")

    params = {
        "conditions[term]": keyword,
        "conditions[type][]": ["RULE", "PRORULE"],
        "conditions[publication_date][gte]": cutoff,
        "fields[]": [
            "document_number",
            "title",
            "abstract",
            "agencies",
            "type",  # required for the grouping below
            "publication_date",
            "action",
            "comments_close_on",
            "html_url",
            "full_text_xml_url",
        ],
        "per_page": 100,
        "order": "newest",
    }

    resp = requests.get(f"{BASE_URL}/documents.json", params=params, timeout=30)
    resp.raise_for_status()
    data = resp.json()

    docs = data.get("results", [])
    total = data.get("count", 0)

    print(f"Found {total} documents mentioning '{keyword}' since {cutoff}")

    # Group by document type (counts cover only the first page of results)
    rules = [d for d in docs if d.get("type") == "Rule"]
    proposed = [d for d in docs if d.get("type") == "Proposed Rule"]

    print(f"  Final rules: {len(rules)}")
    print(f"  Proposed rules: {len(proposed)}")

    return docs

# Track cryptocurrency regulation across all agencies
crypto_docs = monitor_regulatory_topic("cryptocurrency", days_back=90)

# Track AI regulation
ai_docs = monitor_regulatory_topic("artificial intelligence", days_back=90)
for doc in ai_docs[:5]:
    agencies = ", ".join(a["name"] for a in doc.get("agencies", []))
    print(f"\n{doc['publication_date']} [{agencies}]")
    print(f"  {doc['title']}")
    print(f"  Action: {doc.get('action', 'N/A')}")

Pulling Full Document Text for RAG Pipelines

The full_text_xml_url field points to the complete XML document on the Federal Register server. This is the actual regulatory text. For building a RAG pipeline over regulatory content, this is the field you want:

import xml.etree.ElementTree as ET
from datetime import datetime, timedelta

def fetch_document_text(doc):
    """Fetch and extract plain text from a Federal Register document."""

    full_text_url = doc.get("full_text_xml_url")
    if not full_text_url:
        return None

    resp = requests.get(full_text_url, timeout=30)
    resp.raise_for_status()

    # The URL returns XML
    root = ET.fromstring(resp.content)

    # itertext() yields text and tail strings in document order, so text that
    # follows a nested element is not emitted out of sequence
    text_parts = [t.strip() for t in root.itertext() if t.strip()]

    return " ".join(text_parts)

def build_regulatory_dataset(keyword, agency_slug=None, years_back=2):
    """Build a dataset of regulatory documents with full text for RAG indexing.

    Only the first page of results (up to 100 documents) is fetched.
    """
    cutoff = (datetime.now() - timedelta(days=365 * years_back)).strftime("%Y-%m-%d")

    params = {
        "conditions[term]": keyword,
        "conditions[publication_date][gte]": cutoff,
        "conditions[type][]": ["RULE", "PRORULE"],
        "fields[]": ["document_number", "title", "abstract", "publication_date",
                     "agencies", "html_url", "full_text_xml_url", "action"],
        "per_page": 100,
        "order": "newest",
    }

    if agency_slug:
        params["conditions[agencies][]"] = agency_slug

    resp = requests.get(f"{BASE_URL}/documents.json", params=params, timeout=30)
    resp.raise_for_status()
    data = resp.json()
    docs = data.get("results", [])

    dataset = []
    for doc in docs:
        text = fetch_document_text(doc)
        if text:
            # Guard against a missing or empty agencies list
            agencies = doc.get("agencies") or [{}]
            dataset.append({
                "id": doc["document_number"],
                "title": doc["title"],
                "agency": agencies[0].get("name", ""),
                "date": doc["publication_date"],
                "action": doc.get("action", ""),
                "url": doc["html_url"],
                "text": text,
                "char_count": len(text),
            })
            print(f"Fetched: {doc['document_number']} ({len(text):,} chars)")

    return dataset

# Build a dataset for RAG over financial regulation
finreg_dataset = build_regulatory_dataset(
    keyword="bank capital requirements",
    years_back=3
)
print(f"\nDataset: {len(finreg_dataset)} documents, "
      f"{sum(d['char_count'] for d in finreg_dataset):,} total chars")

This pattern works well for building regulatory intelligence tools where you want to let users ask natural language questions about what regulations apply to them. Index the text field into a vector database, store the structured metadata, and you have a searchable corpus of the actual regulatory text.
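
Before indexing, long documents usually need to be split into chunks. The sketch below shows one simple option: fixed-size character windows with overlap, with the metadata from each dataset record attached to every chunk. The chunk_document helper, the 2,000-character window, and the 200-character overlap are assumptions for illustration; real pipelines often split on section or paragraph boundaries instead.

```python
def chunk_document(record, chunk_size=2000, overlap=200):
    """Split record["text"] into overlapping chunks that carry metadata."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    text = record["text"]
    step = chunk_size - overlap
    chunks = []
    for i, start in enumerate(range(0, len(text), step)):
        chunks.append({
            "chunk_id": f"{record['id']}-{i}",
            "doc_id": record["id"],
            "title": record["title"],
            "url": record["url"],
            "text": text[start:start + chunk_size],
        })
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical record shaped like the entries built by build_regulatory_dataset
record = {"id": "EXAMPLE-1", "title": "Example rule",
          "url": "https://example.com/doc", "text": "x" * 4500}
chunks = chunk_document(record)
print(len(chunks), [len(c["text"]) for c in chunks])
```

Keeping doc_id and url on every chunk means a retrieved passage can always be traced back to its source document.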

Building a Regulatory Feed

For ongoing monitoring, you want a scheduled job that pulls the prior day’s documents and sends alerts:

from datetime import datetime, timedelta

def daily_regulatory_feed(watch_list):
    """
    watch_list: list of dicts with optional keys 'label', 'agency_slug',
    'keywords', and 'types'.
    Returns documents published yesterday or today. These are calendar
    days, so a Monday run does not reach back to Friday.
    """
    yesterday = (datetime.now() - timedelta(days=1)).strftime("%Y-%m-%d")
    today = datetime.now().strftime("%Y-%m-%d")

    new_docs = []

    for watch in watch_list:
        params = {
            "conditions[publication_date][gte]": yesterday,
            "conditions[publication_date][lte]": today,
            "fields[]": ["document_number", "title", "abstract", "agencies",
                        "publication_date", "action", "html_url", "comments_close_on"],
            "per_page": 100,
        }

        if watch.get("agency_slug"):
            params["conditions[agencies][]"] = watch["agency_slug"]
        if watch.get("keywords"):
            # Quote each keyword so a multi-word phrase stays together
            # (assumes the search syntax treats quoted text as a phrase)
            params["conditions[term]"] = " OR ".join(f'"{kw}"' for kw in watch["keywords"])
        if watch.get("types"):
            params["conditions[type][]"] = watch["types"]

        resp = requests.get(f"{BASE_URL}/documents.json", params=params, timeout=30)
        if resp.status_code != 200:
            # Report the failure instead of silently dropping this watch
            print(f"Skipping {watch.get('label', 'unknown')}: HTTP {resp.status_code}")
            continue

        docs = resp.json().get("results", [])
        for doc in docs:
            doc["_watch_source"] = watch.get("label", "unknown")
        new_docs.extend(docs)

    return new_docs

# Example watch list for a financial services firm
watch_list = [
    {
        "label": "SEC rules",
        "agency_slug": "securities-and-exchange-commission",
        "types": ["RULE", "PRORULE"],
    },
    {
        "label": "CFPB anything",
        "agency_slug": "consumer-financial-protection-bureau",
        "types": ["RULE", "PRORULE", "NOTICE"],
    },
    {
        "label": "AI regulation cross-agency",
        "keywords": ["artificial intelligence", "machine learning", "algorithm"],
        "types": ["RULE", "PRORULE"],
    },
]

todays_updates = daily_regulatory_feed(watch_list)
print(f"New regulatory documents today: {len(todays_updates)}")
for doc in todays_updates:
    print(f"[{doc['_watch_source']}] {doc['title']}")

The Federal Register publishes by noon ET on each business day. A feed job scheduled for 1 PM ET will capture the day’s documents reliably.
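
Because issues appear only on business days, a job that runs on a Monday and wants the previous issue needs Friday's date rather than Sunday's. The sketch below steps back over weekends; it deliberately ignores federal holidays, so treat it as a starting point under that assumption. The prior_business_day name is hypothetical.

```python
from datetime import date, timedelta


def prior_business_day(today):
    """Return the most recent weekday before `today` (holidays ignored)."""
    day = today - timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day


print(prior_business_day(date(2025, 3, 17)))  # 2025-03-17 is a Monday -> 2025-03-14
```

The result's isoformat() string is in the YYYY-MM-DD form that the publication_date conditions expect.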

The managed Federal Register scraper handles the agency slug lookup, pagination, full-text extraction, and scheduled delivery. For teams that need the data in a structured format without maintaining the polling infrastructure, the scheduled scraper run delivers a structured dataset of each day’s regulatory activity.

Related Actor

Try the scraper referenced in this article — live on Apify, pay only for results.

Open federal-register-scraper on Apify →