Simple CLI content crawler for Joyreactor. It finds all media content on the page you provide and saves it. If the page is paginated, it goes through all of the pages as well, unless you tell it not to.
Here's the quickest way to download something and test the crawler:
- Download the build for your OS.
- Pick a URL from Joyreactor.
- Run the crawler:
$ reactor-crw -p "http://joyreactor.cc/tag/digital+art"
A set of optional flags gives you a little more control over the crawler:
$ reactor-crw --help
Allows to quickly download all content by its direct url or entire tag or fandom from joyreactor.cc.
Example: reactor-crw -d "." -p "http://joyreactor.cc/tag/someTag/all" -w 2 -c "cookie-string"

Usage:
  reactor-crw [flags]

Flags:
  -c, --cookie string        User's cookie. Some content may be unavailable without it
  -d, --destination string   Save path for content. Default value is a user's home folder (example C:\Users\username for Windows) (default "/home/avpretty")
  -h, --help                 help for reactor-crw
  -p, --path string          Provide a full page URL
  -s, --search string        A comma separated list of content types that should be downloaded. Possible values: image,gif,webm,mp4. Example: -s "image,webm" (default "image,gif")
  -o, --single-page          Crawl only one page
  -w, --workers int          Amount of workers (default 1)
Of all the flags, only -p (--path) is required. All other flags can be omitted, in which case their default values are used.
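If you want to crawl several pages with the same settings, the flags above can be wrapped in a small shell function. This is only a sketch: the function name crawl_all and the DRY_RUN variable are made up for this example and are not part of reactor-crw, and the -s and -w values are just one possible choice.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of reactor-crw): crawl several pages
# with the same destination, content types and worker count.
# Set DRY_RUN=1 to print the commands instead of running them.
crawl_all() {
    dest=$1   # first argument: directory to save content into
    shift     # remaining arguments: page URLs
    for url in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            printf 'reactor-crw -p "%s" -d "%s" -s "image,gif" -w 2\n' "$url" "$dest"
        else
            # Stop at the first failed crawl.
            reactor-crw -p "$url" -d "$dest" -s "image,gif" -w 2 || return 1
        fi
    done
}
```

For example, crawl_all "." "http://joyreactor.cc/tag/digital+art" "http://joyreactor.cc/tag/someTag/all" would run the crawler once per URL and save everything to the current directory. Running it with DRY_RUN=1 first lets you check the generated commands.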
Here's another example:
$ reactor-crw -p "http://joyreactor.cc/post/000000" -d "." -s "mp4" -o -c "cookies from joyreactor"
This one downloads only mp4 content from the post and saves it to the current directory. The -o flag means that only the current page is parsed, and -c passes the user's cookie to the crawler.
Note: some content can be parsed only with a user's cookie.
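Since the cookie string can be long, it may be convenient to keep it in a file and pass its contents to -c. The helper below is a sketch under that assumption: the name read_cookie and the file location are hypothetical, and the crawler itself only ever sees the resulting string.

```shell
#!/bin/sh
# Hypothetical helper (not part of reactor-crw): print the cookie stored
# in the given file as a single line, dropping any carriage returns or
# newlines an editor may have added.
read_cookie() {
    tr -d '\r\n' < "$1"
}
```

It could then be used like this, assuming the cookie was saved to ~/.joyreactor-cookie:

$ reactor-crw -p "http://joyreactor.cc/post/000000" -o -c "$(read_cookie "$HOME/.joyreactor-cookie")"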