bump package version
parent c2bd71667c
commit afca9cb3bd
2 changed files with 58 additions and 47 deletions
|
@ -1,6 +1,6 @@
|
||||||
Metadata-Version: 2.1
|
Metadata-Version: 2.1
|
||||||
Name: archivebox
|
Name: archivebox
|
||||||
Version: 0.4.24
|
Version: 0.5.0
|
||||||
Summary: The self-hosted internet archive.
|
Summary: The self-hosted internet archive.
|
||||||
Home-page: https://github.com/ArchiveBox/ArchiveBox
|
Home-page: https://github.com/ArchiveBox/ArchiveBox
|
||||||
Author: Nick Sweeting
|
Author: Nick Sweeting
|
||||||
|
@ -41,31 +41,62 @@ Description: <div align="center">
|
||||||
<hr/>
|
<hr/>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
ArchiveBox is a powerful self-hosted internet archiving solution written in Python 3. You feed it URLs of pages you want to archive, and it saves them to disk in a variety of formats depending on the configuration and the content it detects. ArchiveBox can be installed via [Docker](https://docs.docker.com/get-docker/) (recommended) or [`pip`](https://www.python.org/downloads/). It works on macOS, Windows, and Linux/BSD (both armv7 and amd64).
|
ArchiveBox is a powerful self-hosted internet archiving solution written in Python 3. You feed it URLs of pages you want to archive, and it saves them to disk in a variety of formats depending on the configuration and the content it detects. ArchiveBox can be installed via [Docker](https://docs.docker.com/get-docker/) (recommended), [`apt`](https://launchpad.net/~archivebox/+archive/ubuntu/archivebox/+packages), [`brew`](https://github.com/ArchiveBox/homebrew-archivebox), or [`pip`](https://www.python.org/downloads/). It works on macOS, Windows, and Linux/BSD (both armv7 and amd64).
|
||||||
|
|
||||||
Once installed, URLs can be added via the command line `archivebox add` or the built-in Web UI `archivebox server`. It can ingest bookmarks from a service like Pocket/Pinboard, your entire browsing history, RSS feeds, or URLs one at a time.
|
Once installed, URLs can be added via the command line `archivebox add` or the built-in Web UI `archivebox server`. It can ingest bookmarks from a service like Pocket/Pinboard, your entire browsing history, RSS feeds, or URLs one at a time.
|
||||||
|
|
||||||
The main index is a self-contained `data/index.sqlite3` file, and each snapshot is stored as a folder `data/archive/<timestamp>/`, with an easy-to-read `index.html` and `index.json` within. For each page, ArchiveBox auto-extracts many types of assets/media and saves them in standard formats, with out-of-the-box support for: 3 types of HTML snapshots (wget, Chrome headless, singlefile), a PDF snapshot, a screenshot, a WARC archive, git repositories, images, audio, video, subtitles, article text, and more. The snapshots are browseable and manageable offline through the filesystem, the built-in webserver, or the Python API.
|
The main index is a self-contained `data/index.sqlite3` file, and each snapshot is stored as a folder `data/archive/<timestamp>/`, with an easy-to-read `index.html` and `index.json` within. For each page, ArchiveBox auto-extracts many types of assets/media and saves them in standard formats, with out-of-the-box support for: 3 types of HTML snapshots (wget, Chrome headless, singlefile), a PDF snapshot, a screenshot, a WARC archive, git repositories, images, audio, video, subtitles, article text, and more. The snapshots are browseable and manageable offline through the filesystem, the built-in webserver, or the Python API.
|
||||||
|
|
||||||
|
|
||||||
#### Quickstart
|
#### Quickstart
|
||||||
|
|
||||||
|
**First, get ArchiveBox using your system package manager, Docker, or pip:**
|
||||||
```bash
|
```bash
|
||||||
# 1. Create a folder somewhere to hold your ArchiveBox data
|
# You can run it with Docker or Docker Compose (recommended)
|
||||||
mkdir ~/archivebox && cd ~/archivebox
|
docker pull archivebox/archivebox
|
||||||
docker run -v $PWD:/data -it archivebox/archivebox init
|
# https://raw.githubusercontent.com/ArchiveBox/ArchiveBox/master/docker-compose.yml
|
||||||
|
|
||||||
# 2. Archive some URLs to get started
|
# or Ubuntu/Debian
|
||||||
docker run -v $PWD:/data -t archivebox/archivebox add https://github.com/ArchiveBox/ArchiveBox
|
sudo add-apt-repository -u ppa:archivebox/archivebox
|
||||||
docker run -v $PWD:/data -t archivebox/archivebox add --depth=1 https://example.com
|
apt install archivebox
|
||||||
|
|
||||||
# 3. Then view the snapshots of the URLs you added via the self-hosted web UI
|
# or macOS
|
||||||
docker run -v $PWD:/data -it archivebox/archivebox manage createsuperuser # create an admin acct
|
brew install archivebox/archivebox/archivebox
|
||||||
docker run -v $PWD:/data -p 8000:8000 archivebox/archivebox # start the web server
|
|
||||||
open http://127.0.0.1:8000/ # open the interactive admin panel
|
# or for the Python version only, without wget/git/chrome/etc. included
|
||||||
ls archive/*/index.html # or just browse snapshots on disk
|
pip3 install archivebox
|
||||||
|
|
||||||
|
# If you're using an apt/brew/pip install you can run archivebox commands normally
|
||||||
|
# archivebox [subcommand] [...args]
|
||||||
|
# If you're using Docker you'll have to run the commands like this
|
||||||
|
# docker run -v $PWD:/data -it archivebox/archivebox [subcommand] [...args]
|
||||||
|
# And the equivalent in Docker Compose:
|
||||||
|
# docker-compose run archivebox [subcommand] [...args]
|
||||||
```
|
```
|
||||||
|
|
||||||
|
<small>Check that everything installed correctly with `archivebox --version`</small>
|
||||||
|
|
||||||
|
**To start using archivebox, you have to create a data folder and `cd` into it:**
|
||||||
|
|
||||||
|
```bash
|
||||||
|
mkdir ~/archivebox && cd ~/archivebox # you can put the collection dir anywhere
|
||||||
|
archivebox init
|
||||||
|
```
|
||||||
|
|
||||||
|
**Then add some URLs to your archive collection:**
|
||||||
|
```bash
|
||||||
|
archivebox add https://github.com/ArchiveBox/ArchiveBox
|
||||||
|
archivebox add --depth=1 https://example.com
|
||||||
|
```
|
||||||
|
|
||||||
|
**View the snapshots of the URLs you added via the self-hosted web UI:**
|
||||||
|
```bash
|
||||||
|
archivebox manage createsuperuser # create an admin acct
|
||||||
|
archivebox server 0.0.0.0:8000 # start the web server
|
||||||
|
open http://127.0.0.1:8000/ # open the interactive admin panel
|
||||||
|
ls ~/archivebox/archive/*/index.html # or browse the snapshots on disk
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
<div align="center">
|
<div align="center">
|
||||||
<img src="https://i.imgur.com/lUuicew.png" width="400px">
|
<img src="https://i.imgur.com/lUuicew.png" width="400px">
|
||||||
<br/>
|
<br/>
|
||||||
|
@ -79,16 +110,9 @@ Description: <div align="center">
|
||||||
|
|
||||||
# Overview
|
# Overview
|
||||||
|
|
||||||
ArchiveBox is a command line tool, self-hostable web-archiving server, and Python library all-in-one. It's available as a Python3 package or a Docker image, both methods provide the same CLI, Web UI, and on-disk data format.
|
ArchiveBox is a command line tool, self-hostable web-archiving server, and Python library all-in-one. It can be installed on Docker, macOS, Linux/BSD, and Windows. You can download and install it as a Debian/Ubuntu package, Homebrew package, Python3 package, or a Docker image. No matter which install method you choose, they all provide the same CLI, Web UI, and on-disk data format.
|
||||||
|
|
||||||
It works on Docker, macOS, and Linux/BSD. Windows is not officially supported, but users have reported getting it working using the WSL2 + Docker.
|
To use ArchiveBox you start by creating a folder for your data to live in (it can be anywhere on your system), and running `archivebox init` inside of it. That will create a sqlite3 index and an `ArchiveBox.conf` file. After that, you can continue to add/export/manage/etc using the CLI `archivebox help`, or you can run the Web UI (recommended).
|
||||||
|
|
||||||
To use ArchiveBox you start by creating a folder for your data to live in (it can be anywhere on your system), and running `archivebox init` inside of it. That will create a sqlite3 index and an `ArchiveBox.conf` file. After that, you can continue to add/remove/search/import/export/manage/config/etc using the CLI `archivebox help`, or you can run the Web UI (recommended):
|
|
||||||
```bash
|
|
||||||
archivebox manage createsuperuser
|
|
||||||
archivebox server 0.0.0.0:8000
|
|
||||||
open http://127.0.0.1:8000
|
|
||||||
```
|
|
||||||
|
|
||||||
The CLI is considered "stable", the ArchiveBox Python API and REST APIs are in "beta", and the [desktop app](https://github.com/ArchiveBox/desktop) is in "alpha" stage.
|
The CLI is considered "stable", the ArchiveBox Python API and REST APIs are in "beta", and the [desktop app](https://github.com/ArchiveBox/desktop) is in "alpha" stage.
|
||||||
|
|
||||||
|
@ -252,32 +276,19 @@ Description: <div align="center">
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# archivebox <command> [args]
|
# archivebox <command> [args]
|
||||||
|
|
||||||
|
# on Debian/Ubuntu
|
||||||
|
sudo add-apt-repository -u ppa:archivebox/archivebox
|
||||||
|
apt install archivebox
|
||||||
|
|
||||||
|
# on macOS
|
||||||
|
brew install archivebox/archivebox/archivebox
|
||||||
```
|
```
|
||||||
|
|
||||||
First install the system, pip, and npm dependencies:
|
Initialize your archive in a directory somewhere and add some links:
|
||||||
```bash
|
```bash
|
||||||
# Install main dependencies using apt on Ubuntu/Debian, brew on mac, or pkg on BSD
|
mkdir ~/archivebox && cd ~/archivebox
|
||||||
apt install python3 python3-pip python3-dev git curl wget chromium-browser youtube-dl
|
|
||||||
|
|
||||||
# Install Node runtime (used for headless browser scripts like Readability, Singlefile, Mercury, etc.)
|
|
||||||
curl -s https://deb.nodesource.com/gpgkey/nodesource.gpg.key | apt-key add - \
|
|
||||||
&& echo 'deb https://deb.nodesource.com/node_14.x $(lsb_release -cs) main' >> /etc/apt/sources.list \
|
|
||||||
&& apt-get update \
|
|
||||||
&& apt-get install --no-install-recommends nodejs
|
|
||||||
|
|
||||||
# Make a directory to hold your collection
|
|
||||||
mkdir archivebox && cd archivebox # (can be anywhere, doesn't have to be called archivebox)
|
|
||||||
|
|
||||||
# Install the archivebox python package in ./.venv
|
|
||||||
python3 -m venv .venv && source .venv/bin/activate
|
|
||||||
pip install --upgrade archivebox
|
|
||||||
|
|
||||||
# Install node packages in ./node_modules (used for SingleFile, Readability, and Puppeteer)
|
|
||||||
npm install --prefix . 'git+https://github.com/ArchiveBox/ArchiveBox.git'
|
npm install --prefix . 'git+https://github.com/ArchiveBox/ArchiveBox.git'
|
||||||
```
|
|
||||||
|
|
||||||
Initialize your archive and add some links:
|
|
||||||
```bash
|
|
||||||
archivebox init
|
archivebox init
|
||||||
archivebox add 'https://example.com'  # add URLs as args or pipe them in via stdin
|
archivebox add 'https://example.com'  # add URLs as args or pipe them in via stdin
|
||||||
archivebox add --depth=1 https://example.com/table-of-contents.html
|
archivebox add --depth=1 https://example.com/table-of-contents.html
|
||||||
|
@ -396,7 +407,7 @@ Description: <div align="center">
|
||||||
- [Supported Outputs](https://github.com/ArchiveBox/ArchiveBox/wiki#can-save-these-things-for-each-site)
|
- [Supported Outputs](https://github.com/ArchiveBox/ArchiveBox/wiki#can-save-these-things-for-each-site)
|
||||||
- [Scheduled Archiving](https://github.com/ArchiveBox/ArchiveBox/wiki/Scheduled-Archiving)
|
- [Scheduled Archiving](https://github.com/ArchiveBox/ArchiveBox/wiki/Scheduled-Archiving)
|
||||||
- [Publishing Your Archive](https://github.com/ArchiveBox/ArchiveBox/wiki/Publishing-Your-Archive)
|
- [Publishing Your Archive](https://github.com/ArchiveBox/ArchiveBox/wiki/Publishing-Your-Archive)
|
||||||
- [Chromium Install](https://github.com/ArchiveBox/ArchiveBox/wiki/Install-Chromium)
|
- [Chromium Install](https://github.com/ArchiveBox/ArchiveBox/wiki/Chromium-Install)
|
||||||
- [Security Overview](https://github.com/ArchiveBox/ArchiveBox/wiki/Security-Overview)
|
- [Security Overview](https://github.com/ArchiveBox/ArchiveBox/wiki/Security-Overview)
|
||||||
- [Troubleshooting](https://github.com/ArchiveBox/ArchiveBox/wiki/Troubleshooting)
|
- [Troubleshooting](https://github.com/ArchiveBox/ArchiveBox/wiki/Troubleshooting)
|
||||||
- [Python API](https://docs.archivebox.io/en/latest/modules.html)
|
- [Python API](https://docs.archivebox.io/en/latest/modules.html)
|
||||||
|
|
|
@ -1,6 +1,6 @@
|
||||||
{
|
{
|
||||||
"name": "archivebox",
|
"name": "archivebox",
|
||||||
"version": "0.4.24",
|
"version": "0.5.0",
|
||||||
"description": "ArchiveBox: The self-hosted internet archive",
|
"description": "ArchiveBox: The self-hosted internet archive",
|
||||||
"author": "Nick Sweeting <archivebox-npm@sweeting.me>",
|
"author": "Nick Sweeting <archivebox-npm@sweeting.me>",
|
||||||
"license": "MIT",
|
"license": "MIT",
|
||||||
|
|
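Review note: this commit bumps the version in two places, the Python package metadata (`Version: 0.5.0`) and `package.json` (`"version": "0.5.0"`). A minimal sketch of how one might check that the two stay in sync is shown below. The helper names and the inline sample strings are hypothetical and are not part of the ArchiveBox repository; the sketch only assumes that the metadata file uses RFC 822-style headers and that `package.json` is plain JSON.

```python
import json
from email.parser import Parser

# Hypothetical sample inputs mirroring the headers and keys shown in this diff.
SAMPLE_PKG_INFO = (
    "Metadata-Version: 2.1\n"
    "Name: archivebox\n"
    "Version: 0.5.0\n"
    "Summary: The self-hosted internet archive.\n"
)
SAMPLE_PACKAGE_JSON = '{"name": "archivebox", "version": "0.5.0", "license": "MIT"}'


def pkg_info_version(pkg_info_text: str) -> str:
    """Return the Version header from PKG-INFO style metadata text."""
    headers = Parser().parsestr(pkg_info_text, headersonly=True)
    version = headers["Version"]
    if version is None:
        raise ValueError("no Version header found in metadata")
    return version.strip()


def package_json_version(package_json_text: str) -> str:
    """Return the "version" field from package.json text."""
    return json.loads(package_json_text)["version"]


def versions_match(pkg_info_text: str, package_json_text: str) -> bool:
    """True when both files declare the same version string."""
    return pkg_info_version(pkg_info_text) == package_json_version(package_json_text)


if __name__ == "__main__":
    print(versions_match(SAMPLE_PKG_INFO, SAMPLE_PACKAGE_JSON))
```

In practice the two strings would be read from the files touched by this commit; a check like this could run before tagging a release so a bump in one file is not forgotten in the other.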