Instagram Profile Data Without the Meta API: Followers, Bio, and Posts at Scale
Meta restricts the Instagram Graph API to your own accounts. For researching public third-party profiles at scale, here is what data is available and how to collect it.
The actor referenced in this article is live on Apify. Pay only for results delivered.
Instagram has 2 billion monthly active users and is the primary channel for influencer marketing, brand discovery, and competitive intelligence in almost every consumer category. Getting structured data from it, however, is surprisingly constrained if you try to go through official channels.
TL;DR: The Instagram Graph API only works for accounts you own or manage. For public third-party profiles, follower counts, bios, and post metrics are visible on public pages without login, and account growth can be tracked by comparing follower counts across repeated scrapes. A scraper extracts this data into structured JSON and handles pagination across a list of handles. This is useful for influencer vetting, competitor tracking, and market research.
The Meta API Problem
The Instagram Graph API is the official way to access Instagram data programmatically. The problem is who it is designed for.
To use the Graph API, you need a Facebook Developer account, a Facebook App approved for Instagram permissions, and a connected Facebook Page. The API then lets you access data for the Instagram Business or Creator account connected to that Page. That is the boundary. The API returns data for accounts you own or manage through your Facebook Business configuration. It does not let you query arbitrary public profiles.
This is not an oversight. Meta restricts Graph API access intentionally, partly for privacy reasons and partly because broad programmatic access to competitor profile data is not a use case Meta has an incentive to support.
The consequence is that the Graph API is useful for your own analytics: posting, reading your own comments, measuring your own reach. For any research involving accounts you do not control, such as an influencer you are considering for a paid partnership, a competitor brand you want to track, or an emerging creator in your niche, the API gives you nothing.
There was also the Basic Display API, a lighter permission scope for reading user media. Meta announced its deprecation in September 2024 and shut it down in December 2024. It is gone.
What Public Instagram Profile Data Looks Like
Without any login, Instagram’s public profile pages expose a consistent set of fields for any public account. This is the data that renders when you visit instagram.com/username in an incognito browser window.
Profile metadata available publicly:
- Username and full name
- Bio text (up to 150 characters, including line breaks)
- Follower count and following count
- Total post count
- Verified badge status (blue checkmark)
- External website link (the URL in the profile bio link)
- Business category label (for accounts that have set one, such as “Clothing Brand” or “Personal Blog”)
- Profile picture URL
- Whether the account is a business account or creator account
Post-level data for the visible recent posts on a public profile:
- Thumbnail image URL for each post
- Like count per post
- Comment count per post
- Post type: photo, video, carousel, or Reel
- Timestamp
- Caption text (when the page surfaces it)
- Permalink
The number of posts visible on a public profile page varies. Instagram typically surfaces the most recent 12 posts in the grid view, with pagination available for older content.
What Requires Login
Some data is not accessible without an authenticated session:
- Private accounts. The profile page for a private account shows the follower and following counts and the bio, but no posts. You cannot see any content unless the account approves your follow request.
- Stories. Story content is ephemeral and not rendered on public profile pages.
- Direct messages. Not accessible without authentication, obviously.
- Full follower and following lists. The aggregate counts are public. The individual list of who follows an account or who an account follows is behind a login wall.
- Insights and reach data. Internal analytics like impressions, reach, and profile visits are only available to the account owner through the Graph API or the native app.
- Saved content and collections. Private to the account owner.
For influencer vetting and competitor research, what is publicly accessible is sufficient for most workflows: profile metadata, engagement counts per post, and posting frequency.
Fetching a Single Profile Programmatically
The managed Instagram Profile Scraper handles the extraction, pagination, and rate management. You trigger it via the Apify API and retrieve structured JSON results.
import time
import requests
APIFY_TOKEN = "your_apify_token"
def fetch_instagram_profile(username):
    """Fetch profile metadata and recent posts for a single Instagram handle."""
    run_url = "https://api.apify.com/v2/acts/themineworks~instagram-profile-scraper/runs"
    payload = {
        "usernames": [username],
        "maxPostsPerProfile": 50,
        "scrapeProfile": True,
    }
    # Start the actor run
    resp = requests.post(
        run_url,
        json=payload,
        params={"token": APIFY_TOKEN},
    )
    resp.raise_for_status()
    run_id = resp.json()["data"]["id"]
    # Poll until the run reaches a terminal status
    status_url = f"https://api.apify.com/v2/actor-runs/{run_id}"
    while True:
        status_resp = requests.get(status_url, params={"token": APIFY_TOKEN})
        status_resp.raise_for_status()
        status = status_resp.json()["data"]["status"]
        if status in ("SUCCEEDED", "FAILED", "TIMED-OUT", "ABORTED"):
            break
        time.sleep(4)
    if status != "SUCCEEDED":
        raise RuntimeError(f"Scraper run ended with status: {status}")
    # Download the items from the run's default dataset
    dataset_id = status_resp.json()["data"]["defaultDatasetId"]
    results_url = f"https://api.apify.com/v2/datasets/{dataset_id}/items"
    results_resp = requests.get(results_url, params={"token": APIFY_TOKEN})
    results_resp.raise_for_status()
    return results_resp.json()
profile_data = fetch_instagram_profile("levis")
print(f"Fetched {len(profile_data)} items")
# The dataset mixes profile and post records, so print only the first few posts
recent_posts = [item for item in profile_data if item.get("type") == "post"]
for item in recent_posts[:3]:
    print(f"[{item.get('timestamp')}] {item.get('likesCount')} likes")
Bulk Fetching a List of Influencer Handles
For influencer research or competitor monitoring, you typically need to pull data for a list of accounts in one run. Pass multiple usernames in the same payload:
def bulk_fetch_profiles(usernames, posts_per_profile=30):
    """
    Fetch profiles and recent posts for a list of Instagram handles.
    Returns the flat list of records from the run's dataset.
    """
    run_url = "https://api.apify.com/v2/acts/themineworks~instagram-profile-scraper/runs"
    payload = {
        "usernames": usernames,
        "maxPostsPerProfile": posts_per_profile,
        "scrapeProfile": True,
    }
    # Start one run that covers every handle in the list
    resp = requests.post(
        run_url,
        json=payload,
        params={"token": APIFY_TOKEN},
    )
    resp.raise_for_status()
    run_id = resp.json()["data"]["id"]
    status_url = f"https://api.apify.com/v2/actor-runs/{run_id}"
    while True:
        r = requests.get(status_url, params={"token": APIFY_TOKEN}).json()
        current_status = r["data"]["status"]
        # Stop on every terminal status so an aborted or timed-out run
        # cannot leave this loop polling forever
        if current_status in ("SUCCEEDED", "FAILED", "TIMED-OUT", "ABORTED"):
            break
        time.sleep(5)
    if current_status != "SUCCEEDED":
        raise RuntimeError(f"Run ended with status: {current_status}")
    dataset_id = r["data"]["defaultDatasetId"]
    items_url = f"https://api.apify.com/v2/datasets/{dataset_id}/items"
    # Page through the dataset 100 items at a time
    all_items = []
    offset = 0
    limit = 100
    while True:
        page_resp = requests.get(
            items_url,
            params={"token": APIFY_TOKEN, "offset": offset, "limit": limit},
        )
        page_resp.raise_for_status()
        page = page_resp.json()
        if not page:
            break
        all_items.extend(page)
        if len(page) < limit:
            break  # a short page means there is nothing left to fetch
        offset += limit
    return all_items
# Vetting a shortlist of influencers before outreach
influencer_handles = [
    "sustainablefashionblog",
    "slowlivingco",
    "consciousstyle",
    "ethicallykate",
]
results = bulk_fetch_profiles(influencer_handles, posts_per_profile=24)
# Compute a basic engagement rate per handle
from collections import defaultdict
import statistics
followers_by_handle = {}
posts_by_handle = defaultdict(list)
for item in results:
    if item.get("type") == "profile":
        followers_by_handle[item["username"]] = item.get("followersCount")
    elif item.get("type") == "post":
        posts_by_handle[item["ownerUsername"]].append(item)
for handle, posts in posts_by_handle.items():
    # followersCount is documented on profile records; fall back to the
    # post record in case the field is attached there as well
    follower_count = followers_by_handle.get(handle) or posts[0].get("followersCount")
    if not follower_count:
        print(f"@{handle}: follower count unavailable, skipping")
        continue
    avg_likes = statistics.mean(p.get("likesCount") or 0 for p in posts)
    avg_comments = statistics.mean(p.get("commentsCount") or 0 for p in posts)
    engagement_rate = ((avg_likes + avg_comments) / follower_count) * 100
    print(f"@{handle}: {follower_count:,} followers, {engagement_rate:.2f}% engagement rate")
Fields in the Structured JSON Output
The scraper returns two types of records: profile records and post records. Both are in the same flat dataset, distinguished by a type field.
Profile record fields:
| Field | Type | Description |
|---|---|---|
| `type` | string | Always "profile" for profile records |
| `username` | string | Instagram handle without the @ |
| `fullName` | string | Display name as set on the profile |
| `biography` | string | Bio text, may contain newlines |
| `followersCount` | integer | Follower count at time of scrape |
| `followsCount` | integer | Number of accounts this profile follows |
| `postsCount` | integer | Total posts on the account |
| `isVerified` | boolean | Whether the blue checkmark is present |
| `isBusinessAccount` | boolean | True for business and creator accounts |
| `businessCategoryName` | string | Category label set by the account, or null |
| `externalUrl` | string | Website link in the bio, or null |
| `profilePicUrl` | string | CDN URL for the profile picture |
| `scrapedAt` | string | ISO 8601 timestamp of when data was collected |
Post record fields:
| Field | Type | Description |
|---|---|---|
| `type` | string | Always "post" for post records |
| `ownerUsername` | string | Handle of the account this post belongs to |
| `postId` | string | Instagram’s internal post ID |
| `shortCode` | string | The code used in the permalink (e.g. CzXa...) |
| `permalink` | string | Full URL to the post |
| `postType` | string | One of: "image", "video", "sidecar" (carousel), "clips" (Reel) |
| `timestamp` | string | ISO 8601 publish timestamp |
| `likesCount` | integer | Like count at time of scrape |
| `commentsCount` | integer | Comment count at time of scrape |
| `caption` | string | Post caption text, or null if not accessible |
| `displayUrl` | string | CDN URL for the post thumbnail image |
| `videoViewCount` | integer | View count for video and Reel posts, or null |
| `isSponsored` | boolean | Whether the post is labeled as a paid partnership |
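Because both record types arrive in one flat dataset, a common first step is to separate them. The following is a minimal sketch, assuming the `type`, `username`, and `ownerUsername` fields behave as documented in the tables. The `split_records` helper and the sample items are hypothetical and only illustrate the shape of the data.

```python
from collections import defaultdict


def split_records(items):
    """Split a flat dataset into profiles (by username) and posts (by owner)."""
    profiles = {}
    posts = defaultdict(list)
    for item in items:
        record_type = item.get("type")
        if record_type == "profile":
            profiles[item["username"]] = item
        elif record_type == "post":
            posts[item["ownerUsername"]].append(item)
        # Records with any other type are ignored in this sketch
    return profiles, dict(posts)


# Hypothetical sample data shaped like the documented records
sample_items = [
    {"type": "profile", "username": "example_brand", "followersCount": 12000},
    {"type": "post", "ownerUsername": "example_brand", "likesCount": 340, "commentsCount": 12},
    {"type": "post", "ownerUsername": "example_brand", "likesCount": 410, "commentsCount": 9},
]

profiles, posts = split_records(sample_items)
print(profiles["example_brand"]["followersCount"], len(posts["example_brand"]))
```

Keying profiles by username makes it cheap to join follower counts onto post-level metrics later.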
Use Cases
Influencer vetting before outreach. Follower counts alone are not a useful signal. An account with 200,000 followers averaging 80 likes per post has a 0.04% engagement rate, which is well below what genuine organic reach looks like at that size. Pulling the last 24 posts for a shortlist of influencer candidates and computing engagement rate per account takes a few minutes with the scraper. Doing it manually across 20 accounts takes most of a day.
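The arithmetic behind that figure can be checked with a few lines. This is a sketch under one common definition of engagement rate (average likes plus comments per post, divided by followers); the `engagement_rate` helper is hypothetical, and the sample posts simply restate the numbers from the paragraph with comments set to zero.

```python
import statistics


def engagement_rate(posts, followers_count):
    """Average (likes + comments) per post as a percentage of followers."""
    if not posts or not followers_count:
        return None  # undefined without posts or a follower count
    avg_interactions = statistics.mean(
        (p.get("likesCount") or 0) + (p.get("commentsCount") or 0) for p in posts
    )
    return avg_interactions / followers_count * 100


# 200,000 followers, 80 likes per post across the last 24 posts
example_posts = [{"likesCount": 80, "commentsCount": 0}] * 24
rate = engagement_rate(example_posts, 200_000)
print(f"{rate:.2f}%")  # 80 / 200,000 * 100 = 0.04%
```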
The isSponsored field on post records is useful for a second check: how many of a creator’s recent posts are paid partnerships? An account where 60% of recent content is sponsored is a less attractive placement than one where sponsored content is occasional and clearly differentiated.
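A sponsored-content share is a simple ratio over the post records. The sketch below assumes `isSponsored` is a boolean on each post, as documented in the field table; `sponsored_share` and the sample short codes are hypothetical.

```python
def sponsored_share(posts):
    """Fraction of posts labeled as paid partnerships, from 0.0 to 1.0."""
    if not posts:
        return None
    sponsored = sum(1 for p in posts if p.get("isSponsored"))
    return sponsored / len(posts)


# Hypothetical recent posts for one creator
sample_posts = [
    {"shortCode": "a1", "isSponsored": True},
    {"shortCode": "a2", "isSponsored": False},
    {"shortCode": "a3", "isSponsored": True},
    {"shortCode": "a4", "isSponsored": True},
    {"shortCode": "a5", "isSponsored": False},
]

share = sponsored_share(sample_posts)
print(f"{share:.0%} of recent posts are sponsored")
```

Where to draw the line is a judgment call for each campaign; the ratio only makes the comparison across candidates consistent.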
Competitor brand monitoring. Pull 30 recent posts from 5 to 10 competitor brand accounts on a weekly schedule. Track follower count over time, posting frequency, which post types (Reels vs. carousels vs. static images) they are emphasizing, and which posts are getting significantly higher engagement than their baseline. This gives you a data-backed view of what content approaches are working in your category without any guesswork.
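Three of those measurements can be sketched directly from the documented fields. The helper names, the "twice the median" baseline, and the sample numbers are all assumptions chosen for illustration, not part of the scraper.

```python
from collections import Counter
import statistics


def interactions(post):
    """Likes plus comments, treating missing or null counts as zero."""
    return (post.get("likesCount") or 0) + (post.get("commentsCount") or 0)


def post_type_mix(posts):
    """Share of each postType among the given posts."""
    counts = Counter(p.get("postType") for p in posts)
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()} if total else {}


def standout_posts(posts, multiple=2.0):
    """Posts whose interactions exceed `multiple` times the account median."""
    if not posts:
        return []
    baseline = statistics.median(interactions(p) for p in posts)
    return [p for p in posts if interactions(p) > multiple * baseline]


def follower_growth(previous_profile, current_profile):
    """Absolute and percentage follower change between two profile snapshots."""
    before = previous_profile["followersCount"]
    after = current_profile["followersCount"]
    pct = (after - before) / before * 100 if before else None
    return after - before, pct


# Hypothetical post records for one competitor
competitor_posts = [
    {"shortCode": "a", "postType": "clips", "likesCount": 900, "commentsCount": 60},
    {"shortCode": "b", "postType": "image", "likesCount": 300, "commentsCount": 20},
    {"shortCode": "c", "postType": "sidecar", "likesCount": 280, "commentsCount": 15},
    {"shortCode": "d", "postType": "clips", "likesCount": 350, "commentsCount": 25},
]

mix = post_type_mix(competitor_posts)
standouts = standout_posts(competitor_posts)
delta, pct = follower_growth({"followersCount": 50_000}, {"followersCount": 51_500})
print(mix, [p["shortCode"] for p in standouts], delta, pct)
```

The median is used as the baseline because a single viral post would inflate a mean and hide the other strong performers.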
Market research on emerging accounts. Identifying accounts that are gaining traction in a niche before they reach mass visibility is useful for early influencer partnerships and for understanding where audience attention is moving. Scraping a broader list of accounts in a category and sorting by recent engagement rate relative to follower count surfaces accounts that are punching above their size.
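The sorting step might look like the sketch below. It assumes you have already reduced each account to a summary row; `engagementRate` here is a value you compute yourself, not a scraper field, and the `rank_by_engagement` helper, follower ceiling, and sample accounts are hypothetical.

```python
def rank_by_engagement(summaries, max_followers=None):
    """Sort account summaries by engagement rate, highest first."""
    eligible = [
        s for s in summaries
        if s.get("engagementRate") is not None
        and (max_followers is None or s["followersCount"] <= max_followers)
    ]
    return sorted(eligible, key=lambda s: s["engagementRate"], reverse=True)


# Hypothetical per-account summaries (engagementRate is a percentage)
summaries = [
    {"username": "big_brand", "followersCount": 900_000, "engagementRate": 0.8},
    {"username": "mid_creator", "followersCount": 45_000, "engagementRate": 3.1},
    {"username": "small_maker", "followersCount": 8_000, "engagementRate": 6.5},
    {"username": "no_data", "followersCount": 12_000, "engagementRate": None},
]

# Only consider accounts that have not yet reached mass visibility
emerging = rank_by_engagement(summaries, max_followers=100_000)
print([s["username"] for s in emerging])
```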
Brand safety verification. Before a paid partnership, verify that an influencer account is genuine: follower count is in the range claimed, recent posting activity is consistent, and engagement is proportionate. Accounts with purchased followers typically show high follower counts and very low engagement counts, and the scraper gives you the numbers to check this quickly across a shortlist.
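Those three checks can be expressed as a small flagging function. This is a sketch, not a definitive test for purchased followers: the `vet_account` helper and its thresholds (10% tolerance on the claimed count, a 1% minimum engagement rate, a 30-day maximum gap between posts) are assumptions to tune for your category, and the sample data is hypothetical.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 string; a trailing 'Z' is rewritten for older Pythons."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def vet_account(profile, posts, claimed_followers,
                tolerance=0.10, min_rate_pct=1.0, max_gap_days=30):
    """Return a list of flags; an empty list means no check was tripped."""
    flags = []
    followers = profile.get("followersCount") or 0

    # 1. Is the follower count within tolerance of the claimed figure?
    if abs(followers - claimed_followers) > tolerance * claimed_followers:
        flags.append("follower count differs from claimed figure")

    # 2. Is engagement proportionate to audience size?
    if posts and followers:
        total = sum((p.get("likesCount") or 0) + (p.get("commentsCount") or 0) for p in posts)
        rate_pct = total / len(posts) / followers * 100
        if rate_pct < min_rate_pct:
            flags.append(f"low engagement rate ({rate_pct:.2f}%)")

    # 3. Is posting activity consistent, with no long gaps between posts?
    times = sorted(parse_timestamp(p["timestamp"]) for p in posts if p.get("timestamp"))
    gaps = [(later - earlier).days for earlier, later in zip(times, times[1:])]
    if gaps and max(gaps) > max_gap_days:
        flags.append(f"posting gap of {max(gaps)} days")
    return flags


# Hypothetical account: large audience, very low engagement, irregular posting
suspect_profile = {"username": "example_creator", "followersCount": 200_000}
suspect_posts = [
    {"likesCount": 80, "commentsCount": 0, "timestamp": "2024-03-01T12:00:00Z"},
    {"likesCount": 80, "commentsCount": 0, "timestamp": "2024-03-08T12:00:00Z"},
    {"likesCount": 80, "commentsCount": 0, "timestamp": "2024-05-20T12:00:00Z"},
]
flags = vet_account(suspect_profile, suspect_posts, claimed_followers=200_000)
print(flags)
```

A flag is a prompt for a manual look rather than a verdict, since a legitimate account can also post irregularly or have a quiet month.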
Rate Limits and Responsible Scraping
Instagram does not publish rate limits for public page access because there is no official public API. The practical constraints come from Instagram’s anti-bot infrastructure, which is extensive and shared with Facebook.
The public profile pages are accessible to any browser, including crawlers, in the same way a news site’s public articles are. Instagram’s terms of service restrict automated access for certain commercial purposes, so it is worth reviewing your specific use case against the current ToS before building production workflows.
For sustained use, the practical limits are:
- Keep request rates under what a fast manual user would generate. For profile fetches, 2 to 5 seconds between profiles is stable. Batch requests at machine speed without any delay will trigger throttling within minutes.
- Avoid hitting the same profile repeatedly in a short window. Instagram’s infrastructure is sensitive to patterns that do not look like organic browsing.
- For bulk pulls across many accounts, running the scraper during off-peak hours in US Pacific time, when Instagram’s infrastructure is under less load, improves reliability.
- Build in error handling for 429 responses and temporary blocks. The scraper handles session rotation and retry logic internally, but if you are making raw API calls, you need to account for transient failures.
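If you do make raw calls yourself, the pacing and retry advice above can be sketched as two small helpers. Both `call_with_backoff` and `paced` are hypothetical names; the set of retryable status codes and the doubling delay are assumptions, while the 2 to 5 second spacing comes from the list above. The retry helper takes any zero-argument callable that returns a response-like object with `status_code` and `headers`, so it works with `requests` without depending on it.

```python
import random
import time

# Assumption: treat throttling and transient server errors as retryable
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def call_with_backoff(send, max_attempts=5, base_delay=2.0, sleep=time.sleep):
    """
    Call `send()` until it returns a non-retryable response.

    `send` is any zero-argument callable returning an object with
    `status_code` and `headers`, for example:
        lambda: requests.get(url, params=params, timeout=30)
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response.status_code not in RETRYABLE_STATUS:
            return response
        if attempt == max_attempts:
            break
        # Honor Retry-After when it is a plain number of seconds,
        # otherwise double the delay on each attempt
        retry_after = str(response.headers.get("Retry-After", ""))
        if retry_after.isdigit():
            delay = float(retry_after)
        else:
            delay = base_delay * 2 ** (attempt - 1)
        sleep(delay + random.uniform(0, 1))  # jitter avoids synchronized retries
    raise RuntimeError(
        f"Gave up after {max_attempts} attempts (last status {response.status_code})"
    )


def paced(handles, min_delay=2.0, max_delay=5.0, sleep=time.sleep):
    """Yield handles one at a time, pausing between consecutive handles."""
    for index, handle in enumerate(handles):
        if index:
            sleep(random.uniform(min_delay, max_delay))
        yield handle
```

Passing `sleep` as a parameter keeps both helpers easy to test, because a test can substitute a stub and inspect the delays instead of waiting for them.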
The scraper is built to stay within these limits. For a monitoring workflow covering 20 to 50 accounts on a daily or weekly cadence, it runs cleanly without manual intervention.