From 03f389b6a12f6858ace0caca079aee65e4f7f346 Mon Sep 17 00:00:00 2001
From: Nick Sweeting
Date: Wed, 20 Jan 2021 21:34:23 -0500
Subject: [PATCH] Update README.md

---
 README.md | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index 15e08615..222a3d6a 100644
--- a/README.md
+++ b/README.md
@@ -32,19 +32,27 @@
 ArchiveBox is a powerful self-hosted internet archiving solution written in Python. You feed it URLs of pages you want to archive, and it saves them to disk in a variety of formats depending on setup and content within.

-#### 🔢  Intro
+#### 🔢  Overview

 First Get ArchiveBox via Docker, Apt, Brew, Pip, etc. ([see below](#Quickstart)).
+
 ```bash
 apt/brew/pip3 install archivebox
 ```

-1. `archivebox init`: Run this in an empty folder
-3. `archivebox add 'https://example.com'`: Start adding URLs to archive.
-4. `archivebox server`: Run the webserver and open the admin UI
+Then use the `archivebox` CLI to set up your archive and start the web UI.

-For each URL added, ArchiveBox saves several types of HTML snapshot (wget, Chrome headless, singlefile), a PDF, a screenshot, a WARC archive, any git repositories, images, audio, video, subtitles, article text, [and more...](#output-formats).
-Open the web UI at http://127.0.0.1:8000 to manage your collection, or browse `./archive//` and view archived content directly from the filesystem.
+```bash
+archivebox init # run this in an empty folder
+archivebox add 'https://example.com' # start adding URLs to archive
+```
+
+For each URL added, ArchiveBox saves several types of HTML snapshot (wget, Chrome headless, singlefile), a PDF, a screenshot, a WARC archive, any git repositories, images, audio, video, subtitles, article text, [and more...](#output-formats).
+
+```bash
+archivebox server 0.0.0.0:8000 # run the admin UI webserver
+ls ./archive/*/index.json # or browse via the filesystem
+```
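
The filesystem route added by this patch (`ls ./archive/*/index.json`) could be wrapped in a small helper, sketched below. It assumes only what the patch shows: each snapshot lives in its own folder under `./archive/` and holds an `index.json`. The function name `list_snapshots` is hypothetical and not part of the `archivebox` CLI.

```shell
#!/usr/bin/env bash

# list_snapshots [DATA_DIR]
# Hypothetical helper (not an archivebox command): prints every snapshot
# folder under DATA_DIR/archive/ that contains an index.json, one per line.
# DATA_DIR defaults to the current directory.
list_snapshots() {
    local data_dir="${1:-.}"
    local index
    for index in "$data_dir"/archive/*/index.json; do
        # If the glob matched nothing, bash leaves the pattern as-is; skip it.
        [ -e "$index" ] || continue
        dirname "$index"
    done
}
```

Unlike a bare `ls`, this prints nothing and raises no error when the archive folder is empty or missing, which is convenient in scripts run right after `archivebox init`.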