
add bit about live updating

This commit is contained in:
Nick Sweeting 2017-05-05 06:56:12 -04:00 committed by GitHub
parent 031a9ec176
commit b12d7e53bc


@@ -8,20 +8,22 @@ Save an archived copy of all websites you star using Pocket, indexed in an html
## Quickstart
**Runtime:** I've found it takes about an hour to download 1000 articles, and they'll take up roughly 1GB.
Those numbers are from running it on my i5 4-core machine with 50mbps down. YMMV.
`archive.py` is a script that takes a [Pocket](https://getpocket.com/export) export, and turns it into a browsable html archive that you can store locally or host online.
**Dependencies:** Google Chrome headless, wget
**Runtime:** I've found it takes about an hour to download 1000 articles, and they'll take up roughly 1GB.
Those numbers are from running it single-threaded on my i5 machine with 50mbps down. YMMV.
**Dependencies:** Google Chrome headless, wget, python3
```bash
brew install Caskroom/versions/google-chrome-canary
brew install wget
brew install wget python3
# OR on linux
wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
sudo sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list'
apt update; apt install google-chrome-beta
apt update; apt install google-chrome-beta python3 wget
```
**Archiving:**
@@ -42,6 +44,14 @@ You can tweak parameters like screenshot size, file paths, timeouts, etc. in `ar
You can also tweak the outputted html index in `index_template.html`. It just uses python
format strings (not a proper templating engine like jinja2), which is why the CSS is double-bracketed `{{...}}`.
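To illustrate why the braces are doubled, here's a minimal sketch of how Python format strings treat them. The `TEMPLATE` fragment and the `render` helper are made up for this example and are not the real contents of `index_template.html` or `archive.py`.

```python
# Hypothetical template fragment (NOT the real index_template.html).
# CSS braces are doubled so str.format leaves them alone;
# single-braced names like {title} are the fields that get filled in.
TEMPLATE = """<style>
  body {{ font-family: sans-serif; }}
</style>
<h1>{title}</h1>
<p>{count} links archived</p>"""


def render(title, count):
    """Fill the template; '{{' and '}}' come out as literal '{' and '}'."""
    return TEMPLATE.format(title=title, count=count)


html = render("My Pocket Archive", 3)
print(html)
```

If you add your own CSS or inline JavaScript to the template, double every literal brace, otherwise `str.format` will try to treat it as a field and raise an error.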
**Live Updating:** (coming soon)
It's possible to pull links via the Pocket API instead of downloading an html export.
Once I write a script to do that, we can stick this in `cron` and have it auto-update on its own.
For now you just have to download `ril_export.html` and run `archive.py` each time it updates. The script
runs much faster after the first time because it only downloads new links that haven't been archived already.
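The "only download new links" idea can be sketched roughly like this. It assumes each archived link gets its own folder under the archive directory; the `link_key` naming scheme and the `new_links` helper are hypothetical and may not match how `archive.py` actually lays out its output.

```python
import hashlib
import os


def link_key(url):
    """Hypothetical stable folder name for a URL (not archive.py's real scheme)."""
    return hashlib.sha1(url.encode("utf-8")).hexdigest()[:16]


def new_links(urls, archive_dir):
    """Return only the URLs that have no existing folder under archive_dir."""
    return [
        url for url in urls
        if not os.path.isdir(os.path.join(archive_dir, link_key(url)))
    ]
```

On a re-run, anything that already has a folder is skipped, so only links added since the last export cost download time.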
## Publishing Your Archive
The pocket archive is suitable for serving on your personal server, you can upload the pocket