World Bank API in Python 2025: GDP, Inflation, and 1,400 Indicators Without the SOAP Hell
tutorial June 22, 2026 · 8 min read Updated June 22, 2026

The World Bank has a REST API, but it returns XML by default, wraps its JSON in an unusual two-element array, and has undocumented quirks. Here is how to actually use it in Python.

Getting macroeconomic data programmatically is harder than it should be. The IMF requires SDMX queries. The OECD has a REST API that works, mostly, but its documentation is scattered across four different sites. The UN Statistical Division returns XML that predates anyone’s enthusiasm for XML.

The World Bank API is better than most. It is a genuine REST API with JSON support, covers 1,400+ indicators across 200+ countries going back 60 years, and does not require authentication for standard use. The problem is that it has enough quirks that most first attempts either fail or return incomplete data.

This post covers what the API actually returns, the undocumented behavior that catches developers, and working Python code for the pulls you most likely need.

TL;DR: The World Bank API returns XML by default. Append format=json to every request or you will parse XML forever. Pagination uses page and per_page parameters, not cursors, and the first item in the JSON response is a metadata object, not data. Most recent data lags by 1 to 2 years. For bulk pulls across many countries or indicators, a managed scraper handles pagination and normalization automatically.

What the API Covers

The World Bank Data API (also called the Indicators API) provides access to the World Development Indicators database. As of 2025:

  • 1,400+ indicators covering GDP, inflation, unemployment, trade, health, education, poverty, gender, environment, and governance
  • 200+ countries and territories including aggregate regions like Sub-Saharan Africa and the World total
  • Annual data going back to 1960 for most indicators, some going back further
  • Quarterly data for a subset of financial indicators

The API covers World Bank collected data only. For some indicators, the World Bank aggregates from national statistical agencies; for others it uses its own estimates. The metadata for each indicator explains the source.

The base URL for the Indicators API is https://api.worldbank.org/v2/.

The Format Gotchas

The API was designed to return XML. JSON is available but not the default. Every request needs format=json appended as a query parameter, or you will receive XML.

https://api.worldbank.org/v2/country/US/indicator/NY.GDP.MKTP.CD?format=json

The JSON response structure has another surprise: it is a two-element array. The first element is a metadata object with pagination details. The second element is the actual data array. Most parsing bugs happen here because developers expect the root to be the data.

[
  {
    "page": 1,
    "pages": 3,
    "per_page": 50,
    "total": 123
  },
  [
    {"indicator": {...}, "country": {...}, "value": 25462700000000, "date": "2023"},
    ...
  ]
]
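A minimal sketch of unpacking that wrapper safely. The helper name split_response is hypothetical, and the handling of one-element payloads assumes the API sends a lone object (for instance an error message) when there is no data array to return.

```python
def split_response(payload):
    """Split a World Bank JSON payload into (metadata, records).

    Assumes the two-element layout shown above. A payload with no data
    array (or a null one) yields an empty record list instead of failing.
    """
    if not isinstance(payload, list) or not payload:
        raise ValueError("expected a non-empty JSON array")
    meta = payload[0]
    records = payload[1] if len(payload) > 1 and payload[1] is not None else []
    return meta, records


sample = [
    {"page": 1, "pages": 3, "per_page": 50, "total": 123},
    [{"country": {"id": "US", "value": "United States"},
      "date": "2023", "value": 25462700000000}],
]
meta, records = split_response(sample)
print(meta["pages"], len(records))
```

Returning an empty list rather than raising keeps downstream loops simple; callers that need to tell "no data" from "error" can inspect the metadata object.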

Pagination and date-window parameters:

  • page — which page (1-indexed, default 1)
  • per_page — results per page (default 50, max 32767)
  • mrv — most recent values (e.g., mrv=5 returns the 5 most recent years without specifying a date range)

Setting per_page=32767 effectively turns off pagination for most indicators, which simplifies code considerably. Only do this if you know the result set is bounded.
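When the result set is not known to be bounded, walking the pages is the safer route. The sketch below relies on the pages field of the metadata object. The function names, the per_page value of 1000, and the use of urllib instead of requests are choices made for this illustration, not part of the API.

```python
import json
import urllib.parse
import urllib.request


def fetch_json(url, params):
    """Fetch one page with the standard library (requests works just as well)."""
    full_url = f"{url}?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(full_url, timeout=30) as resp:
        return json.load(resp)


def fetch_all_pages(url, params=None, per_page=1000, fetch=fetch_json):
    """Collect records from every page, stopping at 'pages' from the metadata."""
    records, page = [], 1
    while True:
        query = {**(params or {}), "format": "json",
                 "per_page": per_page, "page": page}
        payload = fetch(url, query)
        if len(payload) < 2 or payload[1] is None:
            break  # error payload or no data
        records.extend(payload[1])
        if page >= payload[0].get("pages", 1):
            break  # last page reached
        page += 1
    return records
```

A call would look like fetch_all_pages("https://api.worldbank.org/v2/country/all/indicator/SP.POP.TOTL", {"date": "2020:2022"}). The fetch argument is injectable so the loop can be tested without touching the network.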

Indicator Codes You Will Actually Use

The indicator codes are not intuitive. Here are the most commonly needed ones:

Code                    Name
NY.GDP.MKTP.CD          GDP (current US$)
NY.GDP.PCAP.CD          GDP per capita (current US$)
NY.GDP.MKTP.KD.ZG       GDP growth rate (annual %)
FP.CPI.TOTL.ZG          Inflation, CPI (annual %)
SL.UEM.TOTL.ZS          Unemployment rate (% of labor force)
NE.EXP.GNFS.ZS          Exports of goods and services (% of GDP)
NE.IMP.GNFS.ZS          Imports of goods and services (% of GDP)
BX.KLT.DINV.WD.GD.ZS    Foreign direct investment, net inflows (% of GDP)
GC.DOD.TOTL.GD.ZS       Central government debt (% of GDP)
SP.POP.TOTL             Population, total
SI.POV.DDAY             Poverty headcount ratio at $2.15/day (% of population)
EG.ELC.ACCS.ZS          Access to electricity (% of population)

You can search for indicator codes at https://api.worldbank.org/v2/indicator?format=json&per_page=100&search=inflation.
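Once the indicator list has been downloaded, it can also be filtered locally. The sketch below assumes each record from the /indicator endpoint carries id and name fields. The helper name find_indicators is hypothetical.

```python
def find_indicators(indicator_records, term):
    """Return (code, name) pairs whose name contains term, case-insensitively."""
    needle = term.lower()
    return [
        (rec["id"], rec["name"])
        for rec in indicator_records
        if needle in rec.get("name", "").lower()
    ]


# Two hand-written records in the assumed shape, for illustration only
sample_indicators = [
    {"id": "FP.CPI.TOTL.ZG", "name": "Inflation, CPI (annual %)"},
    {"id": "SP.POP.TOTL", "name": "Population, total"},
]
matches = find_indicators(sample_indicators, "inflation")
print(matches)
```

Filtering locally costs one larger download up front but makes repeated lookups free.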

Python: GDP Time Series for Multiple Countries

Here is a function that pulls GDP for a list of countries over a date range and returns a pandas DataFrame:

import requests
import pandas as pd

BASE_URL = "https://api.worldbank.org/v2"

def get_indicator(countries, indicator, start_year, end_year):
    """
    Pull a World Bank indicator for multiple countries.
    
    countries: list of ISO 3166-1 alpha-2 or alpha-3 codes, or 'all'
    indicator: World Bank indicator code (e.g., 'NY.GDP.MKTP.CD')
    Returns: pandas DataFrame with columns [country, iso3, year, value]
    """
    country_str = ";".join(countries) if isinstance(countries, list) else countries
    url = f"{BASE_URL}/country/{country_str}/indicator/{indicator}"
    
    params = {
        "format": "json",
        "per_page": 32767,
        "date": f"{start_year}:{end_year}",
    }
    
    response = requests.get(url, params=params, timeout=30)
    response.raise_for_status()
    data = response.json()
    
    # data[0] = metadata, data[1] = records
    if len(data) < 2 or data[1] is None:
        return pd.DataFrame(columns=["country", "iso3", "year", "value"])
    
    records = []
    for item in data[1]:
        records.append({
            "country": item["country"]["value"],
            "iso3": item["countryiso3code"],
            "year": int(item["date"]),
            "value": item["value"],   # None if data not available
        })
    
    return pd.DataFrame(records)

# Pull GDP for US, China, Germany, India from 2010 to 2023
gdp_df = get_indicator(
    countries=["US", "CN", "DE", "IN"],
    indicator="NY.GDP.MKTP.CD",
    start_year=2010,
    end_year=2023
)

print(gdp_df.pivot(index="year", columns="country", values="value"))

The value field is None when the World Bank does not have data for that country-year combination. Filter these before analysis: df = df.dropna(subset=["value"]).

Python: Inflation Comparison Across Countries

The CPI inflation indicator (FP.CPI.TOTL.ZG) is one of the most queried. Here is a comparison pulling the most recent 10 years using mrv instead of a fixed date range:

def get_recent_inflation(countries, years=10):
    """Get the most recent N years of CPI inflation for a list of countries."""
    country_str = ";".join(countries)
    url = f"{BASE_URL}/country/{country_str}/indicator/FP.CPI.TOTL.ZG"
    
    params = {
        "format": "json",
        "mrv": years,
        "per_page": 32767,
    }
    
    response = requests.get(url, params=params, timeout=30)
    response.raise_for_status()
    data = response.json()
    
    if len(data) < 2 or data[1] is None:
        return pd.DataFrame()
    
    records = [
        {
            "country": item["country"]["value"],
            "year": int(item["date"]),
            "inflation_pct": item["value"],
        }
        for item in data[1]
        if item["value"] is not None
    ]
    
    df = pd.DataFrame(records)
    return df.pivot(index="year", columns="country", values="inflation_pct").sort_index()

# Compare inflation across G7 countries
g7 = ["US", "GB", "DE", "FR", "JP", "IT", "CA"]
inflation_table = get_recent_inflation(g7, years=10)
print(inflation_table)

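mrv=10 returns the 10 most recent data points for each country. This avoids hardcoding a year and always returns the latest data the World Bank has published, which may still be a year or two behind the calendar.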

Python: Pulling All Countries for One Indicator

When you want a full global snapshot of a single indicator, pulling all countries at once by putting all in the country slot of the URL path is more efficient than looping over individual country codes. With per_page set high, a single-year snapshot fits in one page:

def get_global_snapshot(indicator, year):
    """
    Get a single year's value for an indicator across all countries.
    Returns DataFrame sorted by value descending.
    """
    url = f"{BASE_URL}/country/all/indicator/{indicator}"
    params = {
        "format": "json",
        "date": str(year),
        "per_page": 32767,
    }
    
    response = requests.get(url, params=params, timeout=30)
    response.raise_for_status()
    data = response.json()
    
    if len(data) < 2 or data[1] is None:
        return pd.DataFrame()
    
    records = [
        {
            "country": item["country"]["value"],
            "iso3": item["countryiso3code"],
            "value": item["value"],
        }
        for item in data[1]
        if item["value"] is not None
    ]
    
    df = pd.DataFrame(records)
    
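    # Keep only rows with a three-letter code. This is meant to drop aggregate
    # regions (their two-character ids look like '1W', '4E'), but any aggregate
    # that carries a three-letter code in countryiso3code will survive it.
    df = df[df["iso3"].str.len() == 3]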
    
    return df.sort_values("value", ascending=False).reset_index(drop=True)

# Unemployment rate for 2022 across all countries
unemployment = get_global_snapshot("SL.UEM.TOTL.ZS", 2022)
print(unemployment.head(20))

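The aggregate region filter (iso3.str.len() == 3) drops rows whose countryiso3code is empty or not three characters long. It removes aggregates like “World”, “Sub-Saharan Africa”, and “High income” only when they lack a three-letter code; some aggregates may carry one (WLD for World, for example), so inspect the output before relying on it. Keep these rows if you want regional aggregates.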
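An alternative that does not depend on code length is to look aggregates up by name in the country list. The sketch below assumes that records from https://api.worldbank.org/v2/country?format=json&per_page=400 mark aggregates with a region value of "Aggregates", and that id holds the three-letter code and iso2Code the two-character one. Verify both against a live response. The helper names are hypothetical.

```python
def aggregate_ids(country_records):
    """Collect every code belonging to an entry whose region is 'Aggregates'."""
    ids = set()
    for rec in country_records:
        if rec.get("region", {}).get("value") == "Aggregates":
            ids.add(rec.get("id"))        # assumed three-letter code, e.g. 'WLD'
            ids.add(rec.get("iso2Code"))  # assumed two-character code, e.g. '1W'
    ids.discard(None)
    return ids


def drop_aggregates(data_records, agg_ids):
    """Remove indicator rows whose country matches a known aggregate code."""
    return [
        item for item in data_records
        if item["countryiso3code"] not in agg_ids
        and item["country"]["id"] not in agg_ids
    ]


# Hand-written records in the assumed shapes; values are placeholders
countries = [
    {"id": "WLD", "iso2Code": "1W", "region": {"value": "Aggregates"}},
    {"id": "USA", "iso2Code": "US", "region": {"value": "North America"}},
]
rows = [
    {"country": {"id": "1W", "value": "World"}, "countryiso3code": "WLD", "value": 1.0},
    {"country": {"id": "US", "value": "United States"}, "countryiso3code": "USA", "value": 2.0},
]
kept = drop_aggregates(rows, aggregate_ids(countries))
print([r["country"]["value"] for r in kept])
```

The country list changes rarely, so it can be fetched once and cached alongside the pipeline.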

Rate Limits and What Happens When You Hit Them

The World Bank API does not require authentication and does not publish a hard rate limit. In practice, requests start failing around 1,000 requests per minute from a single IP. The failure mode is a 429 or a connection timeout, not an informative error message.

For most use cases, rate limits are not a concern. A full global snapshot of one indicator is one API call. Pulling 10 indicators for 50 countries over 20 years is 10 API calls if you use the semicolon-separated country format correctly.

The rate limit becomes relevant when you are building a pipeline that pulls hundreds of indicator-country combinations in rapid succession. Add a time.sleep(0.5) between requests to stay well under any threshold.
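A fixed pause can be paired with a retry for the cases where a request still fails. This is a sketch under stated assumptions: get_with_backoff is a hypothetical helper, the retry count and delays are arbitrary, and the exception types are those raised by urllib (with requests, the equivalent exceptions would need to be caught instead).

```python
import time
import urllib.error


def get_with_backoff(fetch, retries=4, base_delay=0.5, sleep=time.sleep):
    """Call fetch(); on HTTP 429 or a timeout, wait and retry with doubling delays."""
    for attempt in range(retries + 1):
        try:
            return fetch()
        except (urllib.error.URLError, TimeoutError) as exc:
            # HTTPError is a URLError subclass; among HTTP errors only 429 is retried
            is_http = isinstance(exc, urllib.error.HTTPError)
            if (is_http and exc.code != 429) or attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
```

Here fetch is any zero-argument callable that performs one request, for example a lambda wrapping a single API call. Passing sleep as a parameter keeps the helper testable without real waiting.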

The Data Freshness Problem

This is the most important limitation to communicate to stakeholders: World Bank data is not current-year data.

Most indicators have a 1 to 2 year lag. GDP data for 2024 typically appears in the API in late 2025 or early 2026. Poverty estimates often lag by 3 to 5 years because they require household surveys that take time to process.

The mrv=1 parameter returns the most recent available value, but that value may be from 2022 or 2023 for many countries and indicators. Check the date field in the response, not the year you requested.

For research and analysis this lag is usually acceptable. For dashboards claiming to show “current” macroeconomic conditions, you need to add a data-as-of date so users understand what they are looking at.
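One way to carry the as-of date through a pipeline is to keep the year next to every value. The sketch below works on the record list from the second element of the response. The helper name latest_observations and the sample values are illustrative only.

```python
def latest_observations(records):
    """Map each country to its most recent non-null value and the year it refers to."""
    latest = {}
    for item in records:
        if item["value"] is None:
            continue  # skip years with no published figure
        name, year = item["country"]["value"], int(item["date"])
        if name not in latest or year > latest[name]["as_of"]:
            latest[name] = {"value": item["value"], "as_of": year}
    return latest


# Made-up values, shaped like API records
sample_records = [
    {"country": {"value": "Germany"}, "date": "2023", "value": None},
    {"country": {"value": "Germany"}, "date": "2022", "value": 3.1},
    {"country": {"value": "Germany"}, "date": "2021", "value": 3.6},
]
snapshot = latest_observations(sample_records)
print(snapshot)
```

A dashboard can then print the as_of year beside each figure instead of implying the data is current.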

When to Use the Scraper for Bulk Pulls

The raw API works well for targeted pulls: one indicator, a handful of countries, a defined date range. When you move to bulk operations, several pain points accumulate:

Cross-indicator normalization. Pulling 50 indicators for a dataset means 50 separate API calls, each returning slightly different metadata structures. Normalizing these into a consistent schema adds code.

Missing data handling. Some country-indicator combinations return null values because neither a national statistics agency figure nor a World Bank estimate is available for that year. Deciding whether to forward-fill, drop, or flag these requires logic that lives outside the raw API.

Scheduled refreshes. The World Bank updates its data on a rolling schedule. Running a monthly refresh that catches newly published figures requires a job scheduler and state management.
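For a sense of what the first two pain points involve when handled by hand, here is a minimal sketch. It assumes each record exposes indicator.id, countryiso3code, date, and value as in the earlier examples. The function names, the output schema, and the choice to forward-fill are illustrative rather than a recommendation.

```python
def to_long_rows(payloads):
    """Flatten several indicator payloads into one list of uniform rows."""
    rows = []
    for payload in payloads:
        if len(payload) < 2 or payload[1] is None:
            continue  # error payload or no data for this indicator
        for item in payload[1]:
            rows.append({
                "indicator": item["indicator"]["id"],
                "iso3": item["countryiso3code"],
                "year": int(item["date"]),
                "value": item["value"],
                "missing": item["value"] is None,  # flag instead of dropping
            })
    return rows


def forward_fill(rows):
    """Carry the last observed value forward within each (indicator, iso3) series."""
    last_seen, filled = {}, []
    for row in sorted(rows, key=lambda r: (r["indicator"], r["iso3"], r["year"])):
        key = (row["indicator"], row["iso3"])
        new_row = dict(row)
        if row["value"] is None and key in last_seen:
            new_row["value"] = last_seen[key]  # "missing" stays True as the flag
        elif row["value"] is not None:
            last_seen[key] = row["value"]
        filled.append(new_row)
    return filled
```

Keeping the missing flag after filling lets analysts separate observed values from carried-forward ones later. With pandas, a grouped ffill on the same long table would achieve the same effect.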

The World Bank Indicators scraper handles indicator selection, pagination, null filtering, and structured output into a flat dataset. You specify the indicators and countries you want and receive a clean CSV or JSON file ready for analysis. For bulk research pipelines where data engineering time is the constraint, the scraper is faster than building the same logic from scratch.

The raw API is powerful and free. Its main friction is not access but ergonomics: the XML-first design, two-element response wrapper, and data freshness lag are solvable in an afternoon of Python but add up when you are maintaining production pipelines.
