As with enclosures, you can download the X-URL URLs, which point to the
article itself. If the article is an HTML document, it may depend on
other resources, such as images and stylesheets. GNU Wget can download
it together with all of its page requisites. Moreover, it can write the
whole document in WARC format.
$ ./feeds-warcs [...] www.darkside.ru_news_rss/warcs/20220218-145755-www.darkside.ru_news_140480.warc [...]
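For illustration, a single article could be fetched by hand roughly as follows. This is only a sketch: the function name `fetch_warc` and the `WGET` override variable are hypothetical, and the options feeder actually passes to GNU Wget may differ. The Wget flags themselves (`--page-requisites`, `--warc-file`, `--no-warc-compression`, `--delete-after`) are standard.

```shell
# Hypothetical helper, not part of feeder.
# Usage: fetch_warc URL NAME  (Wget writes NAME.warc)
fetch_warc() {
    url=$1
    name=$2
    # WGET may be overridden, e.g. WGET=echo to only print the arguments.
    "${WGET:-wget}" \
        --page-requisites \
        --warc-file="$name" \
        --no-warc-compression \
        --delete-after \
        "$url"
}
```

Here `--delete-after` discards the plain downloaded files once they have been recorded in the WARC, and `--no-warc-compression` keeps the WARC uncompressed.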
The WARC is not compressed by default. Optionally, you can both view and
compress WARC files with tofuproxy. Once you have a pile of *.warc
files, you can simply add them to a running tofuproxy:
$ for w (feeds/*/warcs/*.warc) print $w:a > path/to/tofuproxy/fifos/add-warcs
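The command above uses zsh-specific syntax (the short `for` form and the `:a` modifier, which makes a path absolute). For other shells, a rough POSIX sh equivalent might look like the sketch below. The function name `list_warcs` is hypothetical and not part of feeder.

```shell
# Hypothetical helper: print absolute paths of all ROOT/*/warcs/*.warc files.
list_warcs() {
    root=$1
    for w in "$root"/*/warcs/*.warc; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$w" ] || continue
        # Make the path absolute, roughly what zsh's :a modifier does.
        printf '%s\n' "$(cd "$(dirname "$w")" && pwd)/$(basename "$w")"
    done
}
```

It could then be used as `list_warcs feeds > path/to/tofuproxy/fifos/add-warcs`.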
Then visit http://warc/ (with tofuproxy already acting as your proxy) to
browse and open the archived URLs.
Of course, you can also download WARCs for a single feed only:
$ cmd/warcs path/to/FEED [optional overridden destination directory]