Claude Computer Use for Web Scraping: The Right Way

I spent 11 hours last quarter manually copying property listing data from competing agencies’ websites into a spreadsheet. Prices, square footage, location tags, amenity lists — all entered by hand, one row at a time. Then I tested Claude’s computer use feature for the same task. It finished a comparable data pull in under 25 minutes while I made coffee. That gap is why I wrote this tutorial.

Claude computer use is Anthropic’s feature that lets Claude actually control a browser and desktop environment — clicking, scrolling, reading screen content, filling forms. It’s not an API scraping library. It’s closer to watching someone operate a computer on your behalf. For a solo real estate operator like me, that distinction matters enormously, because most of the data I need lives behind JavaScript-heavy interfaces that traditional scrapers choke on.

This tutorial walks through exactly how I set it up for real estate data scraping, what prompts I use, where it breaks, and whether it’s actually worth your time in 2026.

What You’ll Build by the End of This Tutorial

By the time you finish reading and following along, you’ll have a working Claude computer use workflow that can:

  • Open a target website in a controlled browser environment
  • Navigate to a specific property listings page or search results
  • Extract structured data — prices, addresses, specs, agent names — into a readable format
  • Output that data as a CSV or structured JSON you can paste into a spreadsheet
  • Handle pagination across multiple pages without you clicking anything

This is not a theoretical walkthrough. Every step here reflects what I actually run for my Madeira market research.

Prerequisites Before You Start

You’ll need a few things in place before the first step:

  • Claude API access with computer use enabled. This requires an Anthropic API account. Computer use is available on claude-3-5-sonnet-20241022 and newer models. Check Anthropic’s official computer use docs for current model availability.
  • Docker installed. The official demo environment Anthropic provides runs in a Docker container. If you don’t have Docker, install it from docker.com — it’s free.
  • Basic comfort with a terminal. You don’t need to code, but you’ll paste two commands and edit one config file.
  • A clear target. Know exactly which website and which data fields you want before starting. Vague prompts produce vague results.
  • Budget awareness. Computer use consumes more tokens than standard Claude prompts because it processes screenshots. Budget roughly $0.15–$0.40 per scraping session depending on how many pages you’re hitting and how complex the site is.
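To turn that per-session range into a monthly figure, a rough sketch like the one below can help. Only the $0.15 to $0.40 range comes from the estimate above; the number of sites and runs per month are hypothetical inputs you would replace with your own.

```python
# Rough monthly budget estimate for scraping sessions.
# The per-session range is the one quoted in the prerequisites;
# sites and runs_per_month are hypothetical inputs.
LOW_PER_SESSION = 0.15
HIGH_PER_SESSION = 0.40


def monthly_budget(sites: int, runs_per_month: int) -> tuple[float, float]:
    """Return (low, high) estimated monthly spend in USD."""
    sessions = sites * runs_per_month
    low = round(sessions * LOW_PER_SESSION, 2)
    high = round(sessions * HIGH_PER_SESSION, 2)
    return low, high


low, high = monthly_budget(sites=3, runs_per_month=4)
print(f"Estimated monthly spend: ${low:.2f} to ${high:.2f}")
```

Image-heavy sites can land above the upper bound, so treat the result as a floor for planning rather than a cap.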

Step 1: Spin Up the Claude Computer Use Demo Environment

Anthropic provides a ready-made Docker image that gives Claude a virtual desktop with a browser. This is the fastest way to get started without building custom infrastructure.

Open your terminal and run:

docker pull ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest

Once that’s pulled, launch the container with your API key:

docker run \
    -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \
    -v $HOME/.anthropic:/home/computeruse/.anthropic \
    -p 5900:5900 \
    -p 8501:8501 \
    -p 6080:6080 \
    -p 8080:8080 \
    -it ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest

Replace $ANTHROPIC_API_KEY with your actual key if you haven’t set it as an environment variable. After about 30 seconds, open your browser and go to http://localhost:8080. You’ll see the demo interface: a chat panel next to Claude’s virtual desktop.

You can also view the virtual desktop directly at http://localhost:6080 to watch Claude work in real time, which I highly recommend for your first few sessions.
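If neither page loads, it can help to check whether the container is actually listening before digging further. The snippet below is a minimal sketch using only Python's standard library; the port numbers are the ones published by the docker run command, and the labels are just descriptive.

```python
import socket

# Ports published by the docker run command above.
DEMO_PORTS = {8080: "demo interface", 6080: "virtual desktop view"}


def port_open(port: int, host: str = "localhost", timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    for port, label in DEMO_PORTS.items():
        status = "reachable" if port_open(port) else "not reachable yet"
        print(f"{label} (port {port}): {status}")
```

A port that stays unreachable after a minute usually means the container exited; the terminal where you ran docker will show why.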

Step 2: Write a Precise Scraping Prompt

The quality of your prompt is the difference between getting clean, structured data and getting Claude wandering around a homepage clicking random things. Be surgical.

Here’s the template I use for real estate listing data:

You are a data extraction assistant. Your task is to scrape real estate listing data from a specific website and return it in structured CSV format.

TARGET URL: [paste exact URL here — the search results page, not the homepage]

DATA FIELDS TO EXTRACT (extract these and only these):
- Listing title
- Price (in euros, numbers only)
- Location / neighborhood
- Property type (apartment, villa, house, land)
- Size in square meters
- Number of bedrooms
- Number of bathrooms
- Listing URL (the link to the individual listing page)

INSTRUCTIONS:
1. Open the target URL in the browser
2. Wait for the page to fully load
3. Scroll down to verify all listings are visible
4. Extract the above fields for every listing visible on the page
5. If there is a "Next page" or pagination button, click it and repeat until you have scraped [X] pages or there are no more pages
6. Do NOT click on individual listings unless a field is unavailable from the search results view
7. Output all extracted data as a clean CSV with headers in the first row
8. If a field is missing for a specific listing, write NULL in that cell

Begin now. Do not ask for confirmation.

Paste this into the chat interface at localhost:8080. Watch the virtual desktop at localhost:6080 to see Claude open the browser and start working.
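If you reuse the template across several portals, filling in the bracketed placeholders by hand gets error-prone. The sketch below builds the same prompt from a URL and a page limit. The function name build_scraping_prompt and the example URL are hypothetical; the wording of the prompt is taken from the template above.

```python
# Hypothetical helper that fills in the placeholders of the scraping template.
FIELDS = [
    "Listing title",
    "Price (in euros, numbers only)",
    "Location / neighborhood",
    "Property type (apartment, villa, house, land)",
    "Size in square meters",
    "Number of bedrooms",
    "Number of bathrooms",
    "Listing URL (the link to the individual listing page)",
]

INSTRUCTIONS = [
    "Open the target URL in the browser",
    "Wait for the page to fully load",
    "Scroll down to verify all listings are visible",
    "Extract the above fields for every listing visible on the page",
    'If there is a "Next page" or pagination button, click it and repeat '
    "until you have scraped {max_pages} pages or there are no more pages",
    "Do NOT click on individual listings unless a field is unavailable "
    "from the search results view",
    "Output all extracted data as a clean CSV with headers in the first row",
    "If a field is missing for a specific listing, write NULL in that cell",
]


def build_scraping_prompt(target_url: str, max_pages: int, fields=FIELDS) -> str:
    """Return the full scraping prompt for one search results page."""
    field_lines = "\n".join(f"- {name}" for name in fields)
    steps = "\n".join(
        f"{number}. {step.format(max_pages=max_pages)}"
        for number, step in enumerate(INSTRUCTIONS, start=1)
    )
    return (
        "You are a data extraction assistant. Your task is to scrape real estate "
        "listing data from a specific website and return it in structured CSV format.\n\n"
        f"TARGET URL: {target_url}\n\n"
        "DATA FIELDS TO EXTRACT (extract these and only these):\n"
        f"{field_lines}\n\n"
        "INSTRUCTIONS:\n"
        f"{steps}\n\n"
        "Begin now. Do not ask for confirmation."
    )


# Placeholder URL for illustration only.
print(build_scraping_prompt("https://example.com/listings", max_pages=3))
```

Keeping the field list in one place also means the cleaning step always sees the same columns in the same order.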

Step 3: Monitor the Session and Handle Interruptions

Claude computer use is not fire-and-forget. Plan to stay nearby for the first 5–10 minutes. A few things commonly interrupt a scraping session:

  • Cookie consent popups. Most European real estate sites have GDPR banners. Claude usually handles these fine, but occasionally gets confused about which button to click. If it stalls, type “click the accept cookies button and continue” in the chat.
  • CAPTCHAs. If the site detects automated behavior and throws a CAPTCHA, Claude cannot solve it. This is a hard stop. I’ll address this more in the limitations section.
  • Lazy-loading content. Some sites only render listing cards as you scroll. Add the instruction “scroll slowly down the full page before extracting data” to your prompt if you notice incomplete results.
  • Login walls. If the target site requires a login, add your credentials to the prompt: “If prompted to log in, use email [x] and password [y].” Only do this for sites where you have a legitimate account.

Step 4: Collect and Clean the Output

When Claude finishes, it will paste the CSV directly into the chat window. Select all of it, copy it, and paste it into a plain text file with a .csv extension. Open it in Excel or Google Sheets.

Expect some cleaning. The most common issues I run into:

  • Price formatting inconsistencies (€350,000 vs 350000 vs 350.000)
  • Location names with trailing spaces or accented characters encoded oddly
  • Bedroom/bathroom counts written as text (“3 beds”) instead of numbers

I run a second prompt after collecting the raw CSV to fix these. Paste the raw data back to Claude (standard Claude, not computer use) and use this:

Clean the following CSV data for real estate listings. 
Apply these rules:
- Convert all prices to plain integers with no symbols or separators (e.g., 350000)
- Trim all leading/trailing whitespace from every cell
- Convert bedroom and bathroom counts to plain integers
- Standardize property types: use only these values: Apartment, Villa, House, Land, Other
- Remove any duplicate rows based on the listing URL column
- Return the cleaned CSV with the same headers

DATA:
[paste your raw CSV here]

This two-step approach — computer use for the raw pull, standard Claude for cleaning — is faster and cheaper than trying to do everything in one computer use session.
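For recurring runs, the same cleaning rules could also be applied locally with a short script, which costs nothing per run and behaves the same way every time. This is a sketch under assumptions, not a replacement for checking the data: the column names (Price, Bedrooms, Bathrooms, Property type, Listing URL) are hypothetical and must match whatever headers your CSV actually has, and the number handling assumes whole-number values with no decimal part.

```python
import csv
import io
import re

# Allowed property types; anything else becomes "Other".
PROPERTY_TYPES = {"apartment": "Apartment", "villa": "Villa", "house": "House", "land": "Land"}


def to_int(value: str) -> str:
    """Keep only the digits: '€350,000', '350.000' and '3 beds' give '350000', '350000', '3'.

    Assumes whole numbers; a value with cents or half-baths would be mangled.
    """
    digits = re.sub(r"\D", "", value)
    return digits if digits else "NULL"


def clean_csv(raw, int_cols=("Price", "Bedrooms", "Bathrooms"),
              type_col="Property type", url_col="Listing URL"):
    """Apply the cleaning rules to raw CSV text and return cleaned CSV text."""
    reader = csv.DictReader(io.StringIO(raw.strip()))
    headers = [name.strip() for name in reader.fieldnames]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=headers, lineterminator="\n")
    writer.writeheader()
    seen_urls = set()
    for raw_row in reader:
        # Trim whitespace in every cell; ignore stray extra columns.
        row = {k.strip(): (v or "").strip() for k, v in raw_row.items() if k is not None}
        for col in int_cols:
            if col in row and row[col] != "NULL":
                row[col] = to_int(row[col])
        if type_col in row and row[type_col] != "NULL":
            row[type_col] = PROPERTY_TYPES.get(row[type_col].lower(), "Other")
        url = row.get(url_col, "")
        if url and url in seen_urls:
            continue  # duplicate listing, keep the first occurrence
        seen_urls.add(url)
        writer.writerow(row)
    return out.getvalue()
```

Calling clean_csv(raw_text) on the pasted output returns the cleaned CSV as a string with the same headers. The prompt-based cleanup is still the more forgiving option when a portal uses notation the script does not anticipate.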

Step 5: Automate Recurring Scrapes with a Simple Script

If you need the same data weekly — say, tracking competitor pricing on a specific property portal — you can trigger Claude computer use sessions programmatically via the Anthropic API. Here’s a minimal Python script structure:

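import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Your scraping prompt
scraping_prompt = """
[paste your full scraping prompt here]
"""

# Send the first request of a computer use session.
# The tool version and beta flag must match the model; check Anthropic's
# computer use docs for the pairing that applies to the model you choose.
response = client.beta.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=4096,
    tools=[
        {"type": "computer_20250124", "name": "computer", "display_width_px": 1024, "display_height_px": 768}
    ],
    messages=[{"role": "user", "content": scraping_prompt}],
    betas=["computer-use-2025-01-24"]
)

# This call returns Claude's first requested action (a tool_use block), not finished data.
# A complete run needs an agent loop that executes each action, sends back the result
# (for example a screenshot), and repeats until Claude stops requesting tools.
# The loop in Anthropic's computer-use-demo quickstart is a reference implementation.
print(response.content)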

Pair this with a cron job on a weekly schedule and you have a lightweight, automated market intelligence feed. I run mine every Tuesday at 7 AM Madeira time. It costs me roughly €1.20 per run across the three sites I monitor.

My Real-World Experience Using Claude Computer Use in Madeira

Let me be specific about what actually happened when I started using this.

Every quarter, I build a competitive pricing report for my clients — mostly international buyers looking at villas and apartments in Madeira’s southern coast. The report covers asking prices per square meter across three main price segments, broken down by neighborhood. Before Claude computer use, building the raw data layer for that report meant 2–3 hours of manual browsing and copy-pasting. I’d open a property portal, run a filtered search, copy the results row by row, move to the next page, repeat. Tedious, error-prone, and impossible to fully automate with standard scrapers because the portals I use rely heavily on JavaScript rendering.

In March 2026, I ran my first full computer use scraping session for this quarterly report. I targeted two Portuguese property portals with active Madeira listings. I used the prompt template I shared above, modified for the specific fields I needed, and let it run.

First session: 94 listings across 6 pages, completed in 22 minutes. The raw CSV had some formatting issues — prices were inconsistent and one portal used a different bedroom notation — but after the cleaning step (another 8 minutes), I had a usable dataset. Total time: 30 minutes. My old manual process for the same volume: approximately 2 hours and 15 minutes.

Second session, a week later on a third portal: Claude hit a cookie wall it couldn’t dismiss cleanly, stalled for about 3 minutes taking repeated screenshots, then recovered and continued. I lost maybe 12 listings from one page where the stall caused a scroll position issue, but 82 out of 94 listings came through clean. I added the explicit cookie handling line to my prompt after that.

Over the last two quarters, I’ve run 14 scraping sessions total. Average time per session: 28 minutes including cleanup. Average cost: €0.31 per session at current API rates. The manual equivalent of what I’ve scraped would have taken me approximately 19 hours. I got it done in under 7 hours total, including setup time, troubleshooting, and the learning curve of the first few sessions.

The practical upside beyond time: consistency. My Q1 2026 pricing report had cleaner data than any I’ve produced in the past four years, simply because human fatigue wasn’t involved. When you’re manually copying row 87 of 94 at 6 PM, you make mistakes. Claude doesn’t get tired of the 87th row.

Genuine Limitations I Hit During Testing

I said I’d be honest, so here are the actual pain points:

CAPTCHAs Are a Hard Wall

Two of the six sites I tested use Cloudflare bot protection. Claude hit the CAPTCHA wall on both and could not proceed, which knocked those two targets out entirely. No workaround exists within the tool itself. If your target site aggressively blocks automated access, computer use won’t help you.

Speed Is Not Its Strength

Claude computer use is slower than a traditional scraper. It operates visually — taking screenshots, analyzing what it sees, deciding what to click — rather than parsing raw HTML. For 50–150 listings, 20–30 minutes is fine for my use case. For 1,000+ listings, you’d want a different approach. It’s not built for high-volume bulk operations.

Token Costs Add Up on Complex Pages

Sites with heavy visual design — lots of images, complex layouts — cost more per session because each screenshot is larger and more expensive to process. One luxury villa portal I tested ran me €0.87 for 40 listings. That’s more than double my usual cost per session. Keep an eye on this if you’re targeting image-heavy sites.

It Needs Supervision the First Few Runs

You cannot fully trust a new scraping target to run unattended until you’ve watched it succeed 2–3 times. The first run on any new site should be monitored. I learned this after a session ran for 45 minutes on an infinite scroll page, endlessly scrolling and re-extracting the same 20 listings in a loop. Fully unattended automation is a goal, not a starting point.
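If you drive sessions from your own script rather than the demo chat, a small guard can catch this kind of loop early. The sketch below is hypothetical and not part of Anthropic's tooling: it assumes your script can collect the listing URLs extracted after each scroll or page, and it signals a stop once several batches in a row contain nothing new.

```python
class StallGuard:
    """Signal a stop after `patience` consecutive batches with no new listing URLs."""

    def __init__(self, patience: int = 2):
        self.patience = patience
        self.seen = set()
        self.stale_batches = 0

    def should_stop(self, batch_urls) -> bool:
        """Record one batch of URLs and report whether the session looks stuck."""
        new_urls = set(batch_urls) - self.seen
        self.seen |= new_urls
        # Reset the counter whenever a batch contributes something new.
        self.stale_batches = 0 if new_urls else self.stale_batches + 1
        return self.stale_batches >= self.patience


guard = StallGuard(patience=2)
# Placeholder batches: the second and third repeat listings already seen.
for batch in (["/listing/1", "/listing/2"], ["/listing/1", "/listing/2"], ["/listing/2"]):
    if guard.should_stop(batch):
        print(f"Stopping: no new listings, {len(guard.seen)} unique so far")
        break
```

A patience of 2 is an arbitrary starting point; a slow lazy-loading page may need a higher value before the guard is trustworthy.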

Quick Comparison: Claude Computer Use vs Other Scraping Approaches

Method | Best For | Handles JS Sites? | Handles CAPTCHAs? | Approx. Cost | Technical Skill Needed
--- | --- | --- | --- | --- | ---
Claude Computer Use | Small-medium scrapes, JS-heavy sites, non-coders | Yes | No | $0.15–$0.90 per session | Low (Docker + prompt)
BeautifulSoup / Scrapy | High-volume, static HTML sites | Partially | No | Essentially free | High (Python required)
Playwright / Puppeteer | Large-scale JS site scraping | Yes | No | Hosting costs only | High (JS/Python required)
Apify / Octoparse | No-code scraping, pre-built templates | Yes | Partially (paid plans) | $49–$249/month | Low-Medium
Manual Copy-Paste | Any site, no setup needed | Yes | Yes | Your time | None

For a solo operator without a developer background, Claude computer use sits in a sweet spot that nothing else quite covers: it handles JavaScript rendering without requiring you to write Playwright code, and for small operations it costs a fraction of what dedicated scraping platforms like Apify charge.

Troubleshooting: Most Common Problems and Fixes

Claude stops mid-session and says it’s done, but data is incomplete. Add “Do not stop until you have scraped all [X] pages. Confirm the total count of listings extracted before finishing.” to your prompt. Claude sometimes concludes too early without explicit completion criteria.
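The count Claude reports can also be checked against the rows it actually returned. This is a minimal sketch using Python's csv module; the sample data is made up, and reported_total stands for whatever number Claude states at the end of the session.

```python
import csv
import io


def listing_count(csv_text: str) -> int:
    """Count data rows in CSV text, ignoring the header row and blank lines."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    return max(len(rows) - 1, 0)


def is_complete(csv_text: str, reported_total: int) -> bool:
    """Return True when the CSV contains exactly the number of listings reported."""
    return listing_count(csv_text) == reported_total


# Made-up sample output with a header row and two listings.
sample = "Listing title,Price\nVilla A,350000\nFlat B,210000\n"
print(is_complete(sample, reported_total=2))
```

A mismatch does not say which listings are missing, only that the session should be rerun or the affected pages checked by hand.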

The virtual desktop shows a blank or broken browser window. Restart the Docker container. This happens occasionally if the container has been running for a while. Run docker restart [container-id].

CSV output has weird characters in accented names. This is a UTF-8 encoding issue. When saving your CSV file, explicitly choose UTF-8 encoding. In Google Sheets, import via File > Import and select UTF-8 in the encoding dropdown.
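When the CSV is saved from a script instead of a text editor, the encoding can be set explicitly. The sketch below writes the file as UTF-8 with a byte-order mark (Python's "utf-8-sig" codec), which helps Excel recognize the encoding. The function name and file name are hypothetical.

```python
def save_csv_utf8(csv_text: str, path: str) -> str:
    """Write CSV text to `path` as UTF-8 with a byte-order mark and return the path."""
    # newline="" keeps the line endings exactly as they appear in csv_text.
    with open(path, "w", encoding="utf-8-sig", newline="") as handle:
        handle.write(csv_text)
    return path
```

For example, save_csv_utf8(raw_csv, "listings.csv") writes the pasted output to disk with accented place names such as Câmara de Lobos intact.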

Claude is clicking the wrong elements on the page. Add a screenshot annotation step: “Before clicking anything, describe what you see on the page and identify the exact

Robson Penassi

Real estate consultant in Madeira, Portugal. Solopreneur since 2012. Testing AI tools since 2023 to automate his one-person business. Writes about what actually works — and what does not.
