Literature Reviews and R&D Intelligence at Scale with the OpenAlex Scraper
Search 250M+ research papers from OpenAlex as structured JSON — authors, citations, venues and abstracts
The actor referenced in this article is live on Apify. Pay only for results delivered.
OpenAlex is the open index of the world’s research — over 250 million papers, with authors, institutions, venues and citation networks. It’s the free successor to Microsoft Academic Graph, and it turns literature review and R&D intelligence from a manual slog into a query. This guide shows how.
TL;DR: Use the OpenAlex Scholarly Works Scraper to pull papers filtered by topic, year, citation count, open-access status and type — flattened to one clean record per work with authors, institutions, venue, citations and reconstructed abstracts. Run a literature review, map a research field, or scout talent. No API key, zero charge on empty runs, first 25 works free.
Why a structured feed beats manual search
A literature review or competitive R&D scan means assembling a filtered, citation-ranked corpus on a topic — then reading the abstracts. Doing that by hand across hundreds of papers is the slow part. The scraper returns them already structured: title, DOI, year, type, authors, institutions, venue, citation count, open-access status, top concepts and (on request) the reconstructed abstract.
Run a literature review
Pull the most-cited recent work on a topic, with abstracts:
{
  "searchTerm": "perovskite solar cells",
  "fromYear": 2022,
  "minCitations": 25,
  "includeAbstract": true,
  "maxResults": 300
}
Results come back most-cited first, so the field’s anchor papers are at the top. Feed the abstracts to an LLM to cluster themes, summarize the state of the art, or draft the related-work section — grounded in real papers with DOIs.
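As a rough illustration of the "feed the abstracts to an LLM" step, the sketch below packs records into prompt-sized text batches while keeping the most-cited-first order. The field names title, year, doi and abstract are assumptions about the output record, not confirmed names, and the character budget is an arbitrary placeholder to adjust for your model.

```python
from typing import Iterable


def build_prompt_batches(works: Iterable[dict], max_chars: int = 12000) -> list[str]:
    """Pack 'title (year, DOI) + abstract' entries into batches under a character budget.

    Field names (title, year, doi, abstract) are hypothetical; rename them to
    match the records you actually receive.
    """
    batches: list[str] = []
    current: list[str] = []
    size = 0
    for work in works:
        abstract = (work.get("abstract") or "").strip()
        if not abstract:
            continue  # skip works without abstract text
        header = f"{work.get('title', 'Untitled')} ({work.get('year', 'n.d.')}, DOI: {work.get('doi', 'n/a')})"
        entry = f"{header}\n{abstract}"
        # Start a new batch when adding this entry would exceed the budget.
        if current and size + len(entry) > max_chars:
            batches.append("\n\n".join(current))
            current, size = [], 0
        current.append(entry)
        size += len(entry)
    if current:
        batches.append("\n\n".join(current))
    return batches
```

Each batch carries the DOI next to the abstract, so whatever summary the model produces can be traced back to the underlying paper.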
Map a research field or institution
Drop the citation floor, then group the results by author_institutions or concepts. The result is a map of who is publishing on a technology, where, and how the work connects. That’s the raw material for R&D landscaping, partner scouting, and talent identification.
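The grouping step can be sketched with a simple counter. This assumes author_institutions and concepts arrive as lists of strings on each record; if your records use a different shape (for example a delimited string), adapt the normalization accordingly.

```python
from collections import Counter


def count_by(works, field):
    """Count how many works mention each value of a list-valued field.

    Assumes the field holds a list of strings (an assumption about the
    record shape); a bare string is treated as a one-item list.
    """
    counts = Counter()
    for work in works:
        values = work.get(field) or []
        if isinstance(values, str):
            values = [values]
        # Use a set so a work with several authors from one institution
        # is counted once for that institution.
        counts.update(set(values))
    return counts
```

Calling count_by(works, "author_institutions").most_common(20) would then list the twenty institutions appearing on the most works, and the same call with "concepts" gives a rough topic breakdown.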
Track open access and trends
Set openAccessOnly to build a corpus you can actually read in full. Alternatively, set a year range and count works per year to quantify whether a field is heating up or cooling down: a sourced trend line from a single run.
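Counting works per year takes only a few lines. The sketch below assumes each record carries a numeric year field (a hypothetical name). Keep in mind that the counts describe the records you pulled: if maxResults truncates the result set, or a citation floor is applied, recent years will tend to be under-represented.

```python
from collections import Counter


def works_per_year(works):
    """Return {year: number of works}, sorted by year.

    'year' is an assumed field name; records without it are ignored.
    """
    counts = Counter(w["year"] for w in works if w.get("year") is not None)
    return dict(sorted(counts.items()))


def year_over_year_change(per_year):
    """Return {year: change in count versus the previous year present in the data}."""
    years = sorted(per_year)
    return {curr: per_year[curr] - per_year[prev] for prev, curr in zip(years, years[1:])}
```

A run of positive values from year_over_year_change suggests a field that is growing; the most recent year is usually incomplete and is best read with caution.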
Pricing and reliability
The OpenAlex Scholarly Works Scraper is pay-per-result: first 25 works free on every account, then $0.002 per work. Searches that return nothing are never charged. No API key is needed: OpenAlex is fully open, and the actor uses its faster “polite pool”.
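The pricing above reduces to simple arithmetic. This estimator uses only the two figures stated here (25 free works, $0.002 per work after that); the free_remaining parameter is an illustrative assumption for accounts that have already used part of the free allowance.

```python
FREE_WORKS = 25          # free works per account, as stated above
PRICE_PER_WORK = 0.002   # USD per work after the free allowance


def estimate_cost(works_returned, free_remaining=FREE_WORKS):
    """Estimate the charge in USD for a run returning `works_returned` works."""
    billable = max(0, works_returned - free_remaining)
    return round(billable * PRICE_PER_WORK, 4)
```

For the 300-work literature review on a fresh account, estimate_cost(300) gives 0.55, i.e. 275 billable works at $0.002 each. A run that returns nothing costs 0.0.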
FAQ
Do I need an API key? No. OpenAlex is fully open; the actor uses its fast polite pool.
How many works are indexed? Over 250 million, across every field of research.
Can I get abstracts? Yes — turn on includeAbstract and the actor reconstructs readable abstract text from OpenAlex’s inverted index.
How do I get the most-cited papers? Just set searchTerm (and optionally minCitations); results are returned most-cited first.
Can I filter to open-access only? Yes — set openAccessOnly to build a corpus you can read in full.
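On the abstracts question above: OpenAlex stores abstracts as an inverted index, a mapping from each word to the positions where it occurs. The actor does the reconstruction for you; the sketch below only illustrates the idea and is not the actor’s own code.

```python
def reconstruct_abstract(inverted_index):
    """Rebuild plain text from a {word: [positions]} inverted index."""
    if not inverted_index:
        return ""
    # Expand to (position, word) pairs, then sort by position.
    positioned = [
        (pos, word)
        for word, positions in inverted_index.items()
        for pos in positions
    ]
    return " ".join(word for _, word in sorted(positioned))
```

For example, {"Perovskite": [0], "cells": [1, 4], "are": [2], "solar": [3]} reconstructs to "Perovskite cells are solar cells".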
Try the scraper referenced in this article — live on Apify, pay only for results.
Open openalex-scholarly-works on Apify