Federal Register API: How to Track US Rules, Proposed Rules, and Executive Orders
The Federal Register publishes every US executive action, proposed rule, and final rule via a REST API. Here is how to query it and what the data contains.
The actor referenced in this article is live on Apify. Pay only for results delivered.
Every rule the US federal government makes goes through the Federal Register. Every proposed rule, every final rule, every agency notice, every executive order. The Federal Register is the official daily journal of the US federal government and it has been published since 1936.
It also has a clean REST API that requires no authentication and covers the full text of every document published since 1994. For compliance teams, policy researchers, legal professionals, and anyone building regulatory monitoring tools, this is one of the most useful and underused government APIs available.
TL;DR: The Federal Register API (federalregister.gov/api/v1) is public, free, and requires no API key. The primary endpoint is /documents.json, with query parameters for full-text search, agency filtering, document type filtering, and date ranges. It returns structured metadata plus a full_text_url pointing to the complete document text, making it suitable for RAG pipelines over regulatory content.
What the Federal Register Is
The Federal Register is published every federal business day (about 250 times per year) by the Office of the Federal Register, a unit of the National Archives. Each issue contains:
Rules. Final rules that have completed the notice-and-comment rulemaking process and are now binding law. These are the ones that actually change regulatory requirements.
Proposed Rules. Draft rules that agencies are proposing and accepting public comment on. The comment period is typically 30 to 60 days. Tracking proposed rules is how compliance professionals get advance warning of upcoming changes.
Notices. Agency announcements that do not create binding rules. Grant award notices, meeting announcements, information collection notices, and environmental impact statements all appear as notices.
Presidential Documents. Executive orders, proclamations, and presidential memos. These are published first in the Federal Register before being compiled into the Code of Federal Regulations or other official records.
The Federal Register is distinct from the Code of Federal Regulations (CFR). The CFR is the codified version of currently effective rules, organized by title and section. The Federal Register is the chronological record of the rulemaking process. A rule appears in the Federal Register when it is proposed and again when it becomes final; the close of a comment period does not generate a publication of its own. The final rule is then incorporated into the CFR.
The API
The base URL is https://www.federalregister.gov/api/v1/. It requires no API key and no authentication, and no rate limit is published. In practice, be reasonable and space out your requests; the server is not high-capacity.
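Because no rate limit is published, "be reasonable" is left to the client. A minimal sketch of one way to enforce a gap between requests follows. The `Throttle` class and the one-second default are assumptions made for illustration, not anything the API documents.

```python
import time


class Throttle:
    """Enforce a minimum gap between calls (hypothetical helper, not part of the API)."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds; 1.0 is an assumed polite default
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until at least min_interval seconds have passed since the last call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Calling `throttle.wait()` immediately before each HTTP request keeps a paginated loop from hitting the server back to back. The clock and sleep functions are injectable only so the behavior can be checked without real delays.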
The primary endpoint is:
GET /documents.json
Key query parameters:
| Parameter | Description |
|---|---|
| conditions[term] | Full-text search across title and abstract |
| conditions[agencies][] | Agency slug (e.g. environmental-protection-agency) |
| conditions[type][] | Document type: RULE, PRORULE, NOTICE, PRESDOCU |
| conditions[publication_date][gte] | Published on or after this date (YYYY-MM-DD) |
| conditions[publication_date][lte] | Published on or before this date (YYYY-MM-DD) |
| conditions[effective_date][gte] | Effective on or after this date (YYYY-MM-DD) |
| fields[] | Specific fields to return (reduces response size) |
| per_page | Results per page (max 1000) |
| page | Page number |
| order | Sort order: newest, oldest, relevance, executive_order_number |
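The bracketed parameter names are easier to understand once you see them on the wire. The sketch below builds a /documents.json URL with the standard library only; `build_documents_url` is a hypothetical helper name, and the search values are placeholders.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.federalregister.gov/api/v1"


def build_documents_url(params):
    """Encode a params dict into a /documents.json URL.

    List values become repeated keys (doseq=True), which is how the
    bracketed array parameters such as conditions[type][] are sent.
    """
    query = urlencode(params, doseq=True)
    return f"{BASE_URL}/documents.json?{query}"


url = build_documents_url({
    "conditions[term]": "air quality",
    "conditions[type][]": ["RULE", "PRORULE"],
    "conditions[publication_date][gte]": "2025-01-01",
    "per_page": 20,
    "order": "newest",
})
print(url)
```

The square brackets are percent-encoded in the printed URL, and the two document types appear as two separate conditions[type][] entries. Passing the same dict to `requests.get(..., params=...)` produces an equivalent query string.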
Python: Search Proposed Rules by Agency
Here is how to pull all proposed rules from EPA in 2025:
```python
import requests

BASE_URL = "https://www.federalregister.gov/api/v1"


def get_proposed_rules(agency_slug, year=2025, per_page=100):
    """Fetch all proposed rules from a specific agency in a given year."""
    params = {
        "conditions[agencies][]": agency_slug,
        "conditions[type][]": "PRORULE",
        "conditions[publication_date][gte]": f"{year}-01-01",
        "conditions[publication_date][lte]": f"{year}-12-31",
        "fields[]": [
            "document_number",
            "title",
            "abstract",
            "publication_date",
            "effective_on",
            "agencies",
            "action",
            "comment_url",
            "comment_date",
            "html_url",
            "full_text_url",
            "regulation_id_number_info",
        ],
        "per_page": per_page,
        "order": "newest",
    }

    all_docs = []
    page = 1
    while True:
        params["page"] = page
        resp = requests.get(f"{BASE_URL}/documents.json", params=params)
        resp.raise_for_status()
        data = resp.json()

        results = data.get("results", [])
        if not results:
            break
        all_docs.extend(results)

        # Derive the page count from the total hit count reported by the API
        count = data.get("count", 0)
        total_pages = (count + per_page - 1) // per_page
        print(f"Page {page}/{total_pages}: {len(all_docs)}/{count} documents")
        if page >= total_pages:
            break
        page += 1

    return all_docs


# EPA's agency slug in the Federal Register API
epa_rules = get_proposed_rules("environmental-protection-agency", year=2025)
print(f"\nTotal EPA proposed rules in 2025: {len(epa_rules)}")

for rule in epa_rules[:5]:
    print(f"\n{rule['publication_date']}: {rule['title']}")
    print(f" Comment deadline: {rule.get('comment_date', 'N/A')}")
    print(f" URL: {rule['html_url']}")
```
Agency slugs follow a hyphenated lowercase format. A few common ones:
| Agency | Slug |
|---|---|
| EPA | environmental-protection-agency |
| FDA | food-and-drug-administration |
| FCC | federal-communications-commission |
| SEC | securities-and-exchange-commission |
| OSHA | occupational-safety-and-health-administration |
| CFPB | consumer-financial-protection-bureau |
| FTC | federal-trade-commission |
The full agency list is at https://www.federalregister.gov/api/v1/agencies.json.
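If you do not know a slug, you can search the agency list instead of guessing. This is a minimal sketch that filters an already-downloaded list; `find_agency_slugs` is a hypothetical helper, and the "name", "short_name", and "slug" keys are assumptions about the agencies.json payload that you should verify against a live response.

```python
def find_agency_slugs(agencies, query):
    """Return (name, slug) pairs whose name contains the query or whose
    short name equals it, case-insensitively.

    `agencies` is the parsed JSON list from /agencies.json. The "name",
    "short_name", and "slug" keys are assumptions about that payload.
    """
    needle = query.lower()
    matches = []
    for agency in agencies:
        name = agency.get("name") or ""
        short_name = agency.get("short_name") or ""
        if needle in name.lower() or needle == short_name.lower():
            matches.append((name, agency.get("slug")))
    return matches


# Hand-written sample shaped like the assumed payload, so the sketch runs offline.
sample = [
    {"name": "Environmental Protection Agency", "short_name": "EPA",
     "slug": "environmental-protection-agency"},
    {"name": "Federal Trade Commission", "short_name": "FTC",
     "slug": "federal-trade-commission"},
]
print(find_agency_slugs(sample, "trade"))
```

With live data, replace `sample` with the parsed JSON from the agencies endpoint.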
Key Data Fields
A full document object contains more than 40 fields. The most useful for compliance monitoring:
```
{
  "document_number": "2025-04821",        # Unique FR document ID
  "title": "National Ambient Air Quality Standards for Particulate Matter",
  "abstract": "EPA is proposing to revise...",  # Short summary
  "action": "Proposed rule",               # What the agency is doing
  "publication_date": "2025-03-15",        # Date published in FR
  "effective_on": "2025-05-14",            # When it takes effect (rules only)
  "comment_date": "2025-04-14",            # Comment period deadline
  "comment_url": "https://www.regulations.gov/...",
  "html_url": "https://www.federalregister.gov/documents/...",
  "full_text_url": "https://www.federalregister.gov/documents/.../full_text.xml",
  "pdf_url": "https://www.gpo.gov/fdsys/pkg/FR-2025-03-15/pdf/...",
  "agencies": [
    {
      "name": "Environmental Protection Agency",
      "id": 199,
      "slug": "environmental-protection-agency"
    }
  ],
  "regulation_id_number_info": {
    "2060-AV12": {
      "xml_url": "...",
      "issue": "Fall 2024"
    }
  },
  "significant": true                      # Designated as economically significant (>$100M impact)
}
```
The regulation_id_number_info field is keyed by the Regulation Identifier Number (RIN). The RIN is assigned when a rulemaking enters the Unified Agenda and stays the same from proposal to final rule, so it lets you track a rule across its entire lifecycle.
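To make the lifecycle idea concrete, here is a minimal sketch that groups documents by RIN and orders each group by publication date. `group_by_rin` is a hypothetical helper; it assumes each document carries regulation_id_number_info keyed by RIN, as in the sample object above. The sample documents are made up for illustration.

```python
from collections import defaultdict


def group_by_rin(docs):
    """Map each RIN to its documents, oldest first.

    Assumes each document dict carries "regulation_id_number_info" keyed by
    RIN plus a "publication_date" in YYYY-MM-DD form.
    """
    timeline = defaultdict(list)
    for doc in docs:
        for rin in (doc.get("regulation_id_number_info") or {}):
            timeline[rin].append(doc)
    for rin_docs in timeline.values():
        # ISO dates sort correctly as plain strings.
        rin_docs.sort(key=lambda d: d.get("publication_date", ""))
    return dict(timeline)


# Illustrative documents only; the numbers and dates are invented.
docs = [
    {"document_number": "B", "publication_date": "2025-09-01",
     "action": "Final rule", "regulation_id_number_info": {"2060-AV12": {}}},
    {"document_number": "A", "publication_date": "2025-03-15",
     "action": "Proposed rule", "regulation_id_number_info": {"2060-AV12": {}}},
    {"document_number": "C", "publication_date": "2025-04-01",
     "action": "Notice", "regulation_id_number_info": {}},
]
grouped = group_by_rin(docs)
for rin, history in grouped.items():
    print(rin, [(d["publication_date"], d["action"]) for d in history])
```

Documents with no RIN, such as many notices, simply drop out of the grouping.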
Monitoring a Regulatory Topic Across Agencies
One of the more powerful use cases is tracking a topic keyword across all federal agencies to catch rules you might not know to look for:
```python
from datetime import datetime, timedelta


def monitor_regulatory_topic(keyword, days_back=30):
    """Search all agencies for recent documents on a topic."""
    cutoff = (datetime.now() - timedelta(days=days_back)).strftime("%Y-%m-%d")

    params = {
        "conditions[term]": keyword,
        "conditions[type][]": ["RULE", "PRORULE"],
        "conditions[publication_date][gte]": cutoff,
        "fields[]": [
            "document_number",
            "title",
            "abstract",
            "agencies",
            "publication_date",
            "type",  # required for the grouping below
            "action",
            "comment_date",
            "html_url",
            "full_text_url",
        ],
        "per_page": 100,
        "order": "newest",
    }

    resp = requests.get(f"{BASE_URL}/documents.json", params=params)
    resp.raise_for_status()
    data = resp.json()
    docs = data.get("results", [])
    total = data.get("count", 0)
    print(f"Found {total} documents mentioning '{keyword}' since {cutoff}")

    # Group by document type (counts cover only the first page of results)
    rules = [d for d in docs if d.get("type") == "Rule"]
    proposed = [d for d in docs if d.get("type") == "Proposed Rule"]
    print(f" Final rules: {len(rules)}")
    print(f" Proposed rules: {len(proposed)}")
    return docs


# Track cryptocurrency regulation across all agencies
crypto_docs = monitor_regulatory_topic("cryptocurrency", days_back=90)

# Track AI regulation
ai_docs = monitor_regulatory_topic("artificial intelligence", days_back=90)
for doc in ai_docs[:5]:
    agencies = ", ".join(a["name"] for a in doc.get("agencies", []))
    print(f"\n{doc['publication_date']} [{agencies}]")
    print(f" {doc['title']}")
    print(f" Action: {doc.get('action', 'N/A')}")
```
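A cross-agency search is most useful when you can see which agencies are active on the topic. The following sketch tallies results by agency name. `count_by_agency` is a hypothetical helper, and the sample list is hand-written to match the agencies structure shown earlier.

```python
from collections import Counter


def count_by_agency(docs):
    """Count documents per agency name; joint rules count once per agency."""
    counts = Counter()
    for doc in docs:
        for agency in doc.get("agencies") or []:
            counts[agency.get("name", "Unknown")] += 1
    return counts.most_common()


# Hand-written sample so the sketch runs without a network call.
sample_docs = [
    {"agencies": [{"name": "Environmental Protection Agency"}]},
    {"agencies": [{"name": "Environmental Protection Agency"},
                  {"name": "Transportation Department"}]},
    {"agencies": [{"name": "Securities and Exchange Commission"}]},
]
for name, n in count_by_agency(sample_docs):
    print(f"{n:3d}  {name}")
```

In practice you would pass the list returned by the topic search in place of `sample_docs`.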
Pulling Full Document Text for RAG Pipelines
The full_text_url field points to the complete XML document on the Federal Register server. This is the actual regulatory text. For building a RAG pipeline over regulatory content, this is the field you want:
```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta


def fetch_document_text(doc):
    """Fetch and extract plain text from a Federal Register document."""
    full_text_url = doc.get("full_text_url")
    if not full_text_url:
        return None

    resp = requests.get(full_text_url)
    resp.raise_for_status()

    # The full_text_url returns XML
    root = ET.fromstring(resp.content)

    # itertext() yields text and tail strings in document order
    text_parts = [t.strip() for t in root.itertext() if t.strip()]
    return " ".join(text_parts)


def build_regulatory_dataset(keyword, agency_slug=None, years_back=2):
    """Build a dataset of regulatory documents with full text for RAG indexing."""
    cutoff = (datetime.now() - timedelta(days=365 * years_back)).strftime("%Y-%m-%d")

    params = {
        "conditions[term]": keyword,
        "conditions[publication_date][gte]": cutoff,
        "conditions[type][]": ["RULE", "PRORULE"],
        "fields[]": ["document_number", "title", "abstract", "publication_date",
                     "agencies", "html_url", "full_text_url", "action"],
        "per_page": 100,
        "order": "newest",
    }
    if agency_slug:
        params["conditions[agencies][]"] = agency_slug

    # Only the first page (up to 100 documents) is fetched here
    resp = requests.get(f"{BASE_URL}/documents.json", params=params)
    resp.raise_for_status()
    data = resp.json()
    docs = data.get("results", [])

    dataset = []
    for doc in docs:
        text = fetch_document_text(doc)
        if text:
            # Guard against a missing or empty agencies list
            agencies = doc.get("agencies") or [{}]
            dataset.append({
                "id": doc["document_number"],
                "title": doc["title"],
                "agency": agencies[0].get("name", ""),
                "date": doc["publication_date"],
                "action": doc.get("action", ""),
                "url": doc["html_url"],
                "text": text,
                "char_count": len(text),
            })
            print(f"Fetched: {doc['document_number']} ({len(text):,} chars)")
    return dataset


# Build a dataset for RAG over financial regulation
finreg_dataset = build_regulatory_dataset(
    keyword="bank capital requirements",
    years_back=3
)
print(f"\nDataset: {len(finreg_dataset)} documents, "
      f"{sum(d['char_count'] for d in finreg_dataset):,} total chars")
```
This pattern works well for building regulatory intelligence tools where you want to let users ask natural language questions about what regulations apply to them. Index the text field into a vector database, store the structured metadata, and you have a searchable corpus of the actual regulatory text.
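Regulatory documents can run to hundreds of thousands of characters, so the text usually needs to be split before it is embedded. Here is a minimal sketch of one common approach, fixed-size character windows with overlap. The helper names, the 1000-character window, and the 200-character overlap are all assumptions to tune for your embedding model, and splitting on section boundaries may serve better for some documents.

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into overlapping character windows for embedding."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def to_index_records(record, chunk_size=1000, overlap=200):
    """Expand one dataset record into per-chunk records that keep the metadata."""
    metadata = {k: v for k, v in record.items() if k not in ("text", "char_count")}
    return [
        {**metadata, "chunk_id": f"{record['id']}-{i}", "text": chunk}
        for i, chunk in enumerate(chunk_text(record["text"], chunk_size, overlap))
    ]
```

Each chunk record keeps the document's title, agency, date, and URL, so a retrieved chunk can be cited back to its source document.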
Building a Regulatory Feed
For ongoing monitoring, you want a scheduled job that pulls the prior day’s documents and sends alerts:
```python
from datetime import datetime, timedelta


def daily_regulatory_feed(watch_list):
    """
    watch_list: list of dicts with optional keys 'label', 'agency_slug',
    'keywords', 'types'.
    Returns documents published yesterday or today.
    """
    yesterday = (datetime.now() - timedelta(days=1)).strftime("%Y-%m-%d")
    today = datetime.now().strftime("%Y-%m-%d")

    new_docs = []
    for watch in watch_list:
        params = {
            "conditions[publication_date][gte]": yesterday,
            "conditions[publication_date][lte]": today,
            "fields[]": ["document_number", "title", "abstract", "agencies",
                         "publication_date", "action", "html_url", "comment_date"],
            "per_page": 100,
        }
        if watch.get("agency_slug"):
            params["conditions[agencies][]"] = watch["agency_slug"]
        if watch.get("keywords"):
            params["conditions[term]"] = " OR ".join(watch["keywords"])
        if watch.get("types"):
            params["conditions[type][]"] = watch["types"]

        resp = requests.get(f"{BASE_URL}/documents.json", params=params)
        if resp.status_code == 200:
            docs = resp.json().get("results", [])
            for doc in docs:
                doc["_watch_source"] = watch.get("label", "unknown")
            new_docs.extend(docs)
        else:
            # Report the failure instead of silently dropping this watch
            print(f"Watch '{watch.get('label', 'unknown')}' failed: HTTP {resp.status_code}")
    return new_docs


# Example watch list for a financial services firm
watch_list = [
    {
        "label": "SEC rules",
        "agency_slug": "securities-and-exchange-commission",
        "types": ["RULE", "PRORULE"],
    },
    {
        "label": "CFPB anything",
        "agency_slug": "consumer-financial-protection-bureau",
        "types": ["RULE", "PRORULE", "NOTICE"],
    },
    {
        "label": "AI regulation cross-agency",
        "keywords": ["artificial intelligence", "machine learning", "algorithm"],
        "types": ["RULE", "PRORULE"],
    },
]

todays_updates = daily_regulatory_feed(watch_list)
print(f"New regulatory documents today: {len(todays_updates)}")
for doc in todays_updates:
    print(f"[{doc['_watch_source']}] {doc['title']}")
```
The Federal Register publishes by noon ET on each business day. A feed job scheduled for 1 PM ET will capture the day’s documents reliably.
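Two details matter once a feed like this runs on a schedule. A one-calendar-day lookback on a Monday reaches back only to Sunday, and a document that matches two watch entries is returned twice. The sketch below covers both; the helper names are hypothetical, and federal holidays are deliberately not handled.

```python
from datetime import date, timedelta


def previous_business_day(today):
    """Return the most recent weekday before `today` (holidays not handled)."""
    day = today - timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day


def dedupe_documents(docs):
    """Keep the first occurrence of each document_number, preserving order."""
    seen = set()
    unique = []
    for doc in docs:
        number = doc.get("document_number")
        if number in seen:
            continue
        seen.add(number)
        unique.append(doc)
    return unique
```

Using `previous_business_day(date.today()).isoformat()` as the lower date bound lets a Monday run pick up Friday's issue, and passing the feed output through `dedupe_documents` before alerting avoids duplicate notifications.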
The managed Federal Register scraper handles the agency slug lookup, pagination, full-text extraction, and scheduled delivery. For teams that need the data in a structured format without maintaining the polling infrastructure, the scheduled scraper run delivers a structured dataset of each day’s regulatory activity.