Archive skill

Web archiving

Retrieve web pages that are no longer accessible, and preserve web content for journalism, research, and legal purposes.

When to use

Archive service comparison

Service          Best for
Wayback Machine  Historical research
Archive.today    Paywall bypass, quick saves
Perma.cc         Legal citations
ArchiveBox       Self-hosted, privacy
Conifer          Interactive content

What's included

Wayback Machine API

Python code for checking availability, saving pages, and retrieving historical snapshots via CDX API.
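
As a rough illustration of what such code might look like, the sketch below builds queries for the Wayback Machine Availability API and the CDX server and parses an availability response. It is not the skill's own code: the function names (availability_url, closest_snapshot, cdx_url, fetch_json) are hypothetical, and only the two public endpoints and the documented archived_snapshots/closest response shape are relied on.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public Internet Archive endpoints; everything else here is illustrative.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def availability_url(page_url, timestamp=None):
    """Build an Availability API query; timestamp is YYYYMMDD[hhmmss]."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(payload):
    """Return the closest snapshot URL from a parsed response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def cdx_url(page_url, limit=10):
    """Build a CDX query that lists up to `limit` captures as JSON rows."""
    params = {"url": page_url, "output": "json", "limit": limit}
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def fetch_json(url, timeout=30):
    """Perform the HTTP GET and decode the JSON body (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With network access, closest_snapshot(fetch_json(availability_url("example.com"))) would return the nearest capture URL, or None when nothing is archived. Keeping URL construction separate from the network call makes the pieces easy to test offline.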

Multi-archive redundancy

Archive to Wayback, Archive.today, and Perma.cc simultaneously for maximum preservation.
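
One way to picture the "simultaneously" part is a small fan-out helper, sketched below under the assumption that each service is wrapped in a caller-supplied function that takes a URL and returns a snapshot URL. The name archive_everywhere and the shape of the result dictionary are hypothetical; no service-specific API calls are shown.

```python
from concurrent.futures import ThreadPoolExecutor


def archive_everywhere(page_url, archivers):
    """Submit page_url to every archiver concurrently.

    `archivers` maps a service name to a callable that takes the URL and
    returns a snapshot URL (hypothetical wrappers supplied by the caller).
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(archivers))) as pool:
        futures = {name: pool.submit(fn, page_url) for name, fn in archivers.items()}
        for name, future in futures.items():
            try:
                results[name] = {"ok": True, "snapshot": future.result()}
            except Exception as exc:
                # A failure in one service must not discard the others' results.
                results[name] = {"ok": False, "error": str(exc)}
    return results
```

Recording failures per service, rather than raising on the first error, is what gives the redundancy its value: a capture succeeds as long as at least one archive accepts the page.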

Legal evidence preservation

Chain of custody documentation, content hashing, and timestamped capture records.
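
A minimal sketch of content hashing and a timestamped capture record follows, using only the Python standard library. The field names, the capture_record and append_to_log helpers, and the JSON Lines log format are assumptions for illustration, not the skill's actual record format or a legal standard.

```python
import hashlib
import json
from datetime import datetime, timezone


def capture_record(url, content, captured_by, captured_at=None):
    """Describe one capture: what was saved, its SHA-256, who saved it, and when."""
    captured_at = captured_at or datetime.now(timezone.utc)
    return {
        "url": url,
        "sha256": hashlib.sha256(content).hexdigest(),  # content must be bytes
        "size_bytes": len(content),
        "captured_at": captured_at.isoformat(),  # UTC, ISO 8601
        "captured_by": captured_by,
    }


def append_to_log(record, path):
    """Append the record as one JSON line so earlier entries are never rewritten."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record, sort_keys=True) + "\n")
```

Hashing the raw bytes at capture time lets anyone later confirm that a stored copy is unchanged by recomputing the digest and comparing it with the logged value.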

ArchiveBox integration

Self-hosted archiving setup, Python integration, and scheduled archiving workflows.
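
The Python side of such an integration could be as small as shelling out to the ArchiveBox command-line tool, as in the sketch below. It assumes the archivebox CLI is installed and that data_dir is an already initialized ArchiveBox data directory; the helper names are hypothetical, and only the basic `archivebox add <url>` command is used.

```python
import shutil
import subprocess


def archivebox_add_command(url):
    """Argument list for adding a single URL to an ArchiveBox collection."""
    return ["archivebox", "add", url]


def archive_with_archivebox(url, data_dir):
    """Run `archivebox add` inside data_dir (assumed to be initialized already)."""
    if shutil.which("archivebox") is None:
        raise RuntimeError("archivebox CLI not found on PATH")
    return subprocess.run(
        archivebox_add_command(url),
        cwd=data_dir,          # ArchiveBox commands operate on the current data directory
        capture_output=True,
        text=True,
        check=False,           # let the caller inspect returncode and stderr
    )
```

A scheduled workflow could call archive_with_archivebox from cron or any job scheduler, once per URL on a watch list.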

Archive retrieval cascade

Try services in this order for maximum coverage:

1. Wayback Machine (archive.org): 916B+ pages, historical depth, API access
2. Archive.today (archive.is/archive.ph): on-demand snapshots, paywall bypass
3. Google Cache: recent pages; search for cache:url
4. Bing Cache: click the dropdown arrow in search results
5. Memento Time Travel (aggregator): searches multiple archives simultaneously
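
The ordering above amounts to a first-hit-wins fallback, which can be sketched as follows. The function retrieve_from_archives and the per-service lookup callables are hypothetical; each lookup is assumed to take a URL and return a snapshot URL or None, and how each one talks to its service is left to the caller.

```python
def retrieve_from_archives(page_url, lookups):
    """Try each (name, lookup) pair in order and return the first snapshot found.

    `lookups` is an ordered list of (service name, callable) pairs; each
    callable is a hypothetical wrapper returning a snapshot URL or None.
    """
    errors = {}
    for name, lookup in lookups:
        try:
            snapshot = lookup(page_url)
        except Exception as exc:
            # Remember the failure and fall through to the next service.
            errors[name] = str(exc)
            continue
        if snapshot:
            return {"service": name, "snapshot": snapshot, "errors": errors}
    return {"service": None, "snapshot": None, "errors": errors}
```

Passing the services as an ordered list keeps the priority explicit and makes it easy to drop a service that is no longer available.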

Installation

# Recommended: install the research-toolkit plugin

/plugin marketplace add jamditis/claude-skills-journalism

/plugin install research-toolkit@claude-skills-journalism

# Or copy just this skill from the plugin tree

git clone https://github.com/jamditis/claude-skills-journalism.git

cp -r claude-skills-journalism/research-toolkit/skills/web-archiving ~/.claude/skills/

Or browse this skill in the GitHub repository.


Archive before it disappears

Wayback Machine API, multi-archive redundancy, and legal evidence preservation in one skill.
