This repository has been archived on 2021-12-02. You can view files and clone it, but cannot push or open issues or pull requests.

# Scrappers
Three scrapers:
* 4chanscrape downloads all images from a thread in the best available resolution
* imgscrape simply looks for "img" tags in any given page and downloads the images
* 4chanthreadfinder looks for a keyword in thread names and downloads all images from the matching threads
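
As a rough illustration of the "look for img and download" idea, here is a minimal sketch that only collects image URLs from a page. It is not the code of the scripts in this repo: it uses the standard library `html.parser` as a stand-in for BeautifulSoup, and `ImgCollector` and `find_images` are hypothetical names made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgCollector(HTMLParser):
    """Collect the absolute URL of every <img src=...> found in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative and protocol-relative paths against the page URL.
            self.urls.append(urljoin(self.base_url, src))


def find_images(html, base_url):
    """Return the image URLs referenced by the given HTML string."""
    collector = ImgCollector(base_url)
    collector.feed(html)
    return collector.urls


if __name__ == "__main__":
    page = '<p><img src="/a.png"><img src="b.jpg"><img alt="no src"></p>'
    for url in find_images(page, "https://example.org/gallery/"):
        print(url)
```

Each collected URL could then be handed to whatever downloader you prefer; the download step is left out here on purpose.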
## 4chanscrape, imgscrape
Install dependencies:
```
python3 -m pip install beautifulsoup4 mechanicalsoup wget --user
```
Use:
```
./4chanscrape.py -u https://boards.4channel.org/c/thread/3846676/gunsmith-cats-thread -f ./downloads
```
* -u : URL of the page
* -f : folder where you want to download all pictures
## 4chanthreadfinder
Install dependencies:
```
python3 -m pip install beautifulsoup4 mechanicalsoup wget --user
```
Use (oneshot):
```
./4chanthreadfinder.py -u https://boards.4chan.org/b/ -f ./downloads/thread -k 'ylyl thread'
```
* -u : URL of the page
* -f : folder where you want to download all pictures
* -k : keyword or keyphrase to search for (a single word works best)
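
Why a single word tends to work better can be sketched as follows. This assumes the keyword is matched as a plain, case-insensitive substring of the thread name, which this README does not actually state; `matches` is a hypothetical helper, not a function from the script.

```python
def matches(title, keyword):
    """Assumed matching rule: case-insensitive substring search."""
    return keyword.casefold() in title.casefold()


titles = ["YLYL thread", "ylyl - you laugh you lose", "Wallpaper thread"]

# A full keyphrase only matches titles containing that exact phrase.
phrase_hits = [t for t in titles if matches(t, "ylyl thread")]

# A single word matches every title that contains it.
word_hits = [t for t in titles if matches(t, "ylyl")]

print(phrase_hits)
print(word_hits)
```

Under that assumption, the keyphrase misses the second title while the single word catches both.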
Use (constant, multi-threaded):
```
./4chanthreadfinder.py -u https://boards.4chan.org/b/ -f ./downloads/threads -k 'thread' -c -t 3
```
* -u : URL of the page
* -f : folder where you want to download all pictures
* -k : keyword or keyphrase to search for (a single word works best)
* -c : constant : enables constant (continuous) downloading
* -t : number of threads to use (3 in the example above)
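
For reference, the flags above could be declared with `argparse` roughly like this. This is a sketch, not the parser used by 4chanthreadfinder.py: the `dest` names, the defaults, and the reading of `-t` as a thread count are all assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags documented above."""
    parser = argparse.ArgumentParser(description="Thread finder (sketch)")
    parser.add_argument("-u", dest="url", required=True, help="URL of the page")
    parser.add_argument("-f", dest="folder", required=True, help="download folder")
    parser.add_argument("-k", dest="keyword", required=True, help="keyword or keyphrase")
    parser.add_argument("-c", dest="constant", action="store_true",
                        help="enable constant downloading")
    # Assumption: -t is the number of threads; the default of 1 is made up.
    parser.add_argument("-t", dest="threads", type=int, default=1,
                        help="number of threads")
    return parser


# Parse the same arguments as the multi-threaded example above.
args = build_parser().parse_args([
    "-u", "https://boards.4chan.org/b/",
    "-f", "./downloads/threads",
    "-k", "thread",
    "-c",
    "-t", "3",
])
print(args.url, args.folder, args.keyword, args.constant, args.threads)
```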
## Todo
* Filter by filetype
* Make a pretty website with some keywords running in the background, producing some nice public folders (wallpapers...)