Kathange/crawler_for_unsplash

crawler_for_unsplash

Use BeautifulSoup & Selenium to download images

Image source: Unsplash

pip install beautifulsoup4 selenium requests

You may not need to install chromedriver yourself. In case you do, download it as follows.

  1. Check your Chrome version

    Settings -> About Chrome -> your Chrome version
  2. Chrome Driver Download

    If you don't see a version there that works for you, go to Chrome for Testing (CfT)

  3. After extracting the zip file, put the exe file in the same folder as the program.
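The setup above can be sketched in Python as follows. This is a minimal sketch, not the code from this repository: `find_local_chromedriver` and `make_driver` are hypothetical helper names, and it assumes Selenium 4, where a driver path is passed through a `Service` object. If no local driver file is found, it falls back to a plain `webdriver.Chrome()`, which in Selenium 4.6 and later tries to locate a matching driver on its own.

```python
from pathlib import Path


def find_local_chromedriver(folder="."):
    """Return the chromedriver file inside `folder`, or None if there is none."""
    # "chromedriver.exe" on Windows, "chromedriver" on Linux/macOS.
    for name in ("chromedriver.exe", "chromedriver"):
        candidate = Path(folder) / name
        if candidate.is_file():
            return candidate
    return None


def make_driver(folder="."):
    """Start Chrome, preferring a chromedriver placed next to the program."""
    # Imported here so the helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    driver_path = find_local_chromedriver(folder)
    if driver_path is not None:
        service = Service(executable_path=str(driver_path))
        return webdriver.Chrome(service=service)
    # No local file: let Selenium try to find a driver itself.
    return webdriver.Chrome()
```

Call `driver = make_driver()` at the start of the program and `driver.quit()` at the end.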

crawler_small_img.py

Reference for crawler_small_img.py: Crawler Download Image

This program is simpler and easier to understand.

Just follow the reference to run the program.

Adjust the following before each run:

  1. input_image variable: the search keyword
  2. img class: the class name on the page changes, so look up the current one yourself
  3. limit variable: the number of images to download
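To show how the three settings above fit together, here is a minimal sketch. It is not the actual crawler_small_img.py. The search URL pattern, the placeholder value of `img_class`, the `.jpg` file naming, the `images` output folder and all function names are assumptions made for illustration; only `input_image`, `limit` and the idea of an img class come from this README.

```python
import os
from urllib.parse import quote

input_image = "cat"               # search keyword
img_class = "REPLACE_WITH_CLASS"  # placeholder: copy the real class from the page
limit = 20                        # number of images to download


def build_search_url(keyword):
    """Build the search page URL (assumed pattern: /s/photos/<keyword>)."""
    slug = keyword.strip().replace(" ", "-")
    return "https://unsplash.com/s/photos/" + quote(slug)


def extract_image_urls(html, img_class, limit):
    """Collect up to `limit` src URLs from <img> tags with the given class."""
    from bs4 import BeautifulSoup  # third-party, imported lazily

    soup = BeautifulSoup(html, "html.parser")
    urls = []
    # A class string with spaces is matched against the whole class attribute.
    for img in soup.find_all("img", class_=img_class):
        src = img.get("src")
        if src and src.startswith("http"):
            urls.append(src)
        if len(urls) >= limit:
            break
    return urls


def image_path(folder, keyword, index):
    """File name for the index-th image, e.g. cat_003.jpg (assumed naming)."""
    return os.path.join(folder, f"{keyword}_{index:03d}.jpg")


def download_images(urls, folder, keyword):
    """Save every URL in `urls` as a file inside `folder`."""
    import requests  # third-party, imported lazily

    os.makedirs(folder, exist_ok=True)
    for index, url in enumerate(urls, start=1):
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        with open(image_path(folder, keyword, index), "wb") as file:
            file.write(response.content)


def main():
    import requests

    html = requests.get(build_search_url(input_image), timeout=30).text
    urls = extract_image_urls(html, img_class, limit)
    download_images(urls, "images", input_image)
```

Run `main()` after setting the three variables at the top. If `extract_image_urls` returns an empty list, the img class has most likely changed.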

crawler_large_img.py

crawler_large_img.py draws on many different sources, so it has no single reference.

crawler_small_img.py can download at most 20 pictures, because the rest are hidden behind a button called "Load more". More pictures appear after clicking this button.

Even after clicking the button, only 40 pictures are loaded. To get more pictures, add a sliding window (page scrolling) function.

Adjust the following before each run:

  1. input_image variable: the search keyword
  2. button class: the class name on the page changes, so look up the current one yourself
  3. sliding window setup: the number of scrolls
  4. img class: the class name on the page changes, so look up the current one yourself
  5. limit variable: the number of images to download
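The "Load more" click and the sliding window can be sketched like this. Again, this is only a sketch under assumptions, not the code of crawler_large_img.py: the function names, the `pause` length and the early stop when the page height no longer grows are all illustrative choices. `driver` is a Selenium WebDriver that has already opened the search page.

```python
import time


def class_to_selector(class_attr):
    """Turn a class attribute such as "a b" into the CSS selector ".a.b"."""
    return "".join("." + name for name in class_attr.split())


def click_load_more(driver, button_class):
    """Click the "Load more" button, located by its class attribute."""
    from selenium.webdriver.common.by import By  # third-party, imported lazily

    # A CSS selector is used because the attribute may hold several class names.
    selector = class_to_selector(button_class)
    driver.find_element(By.CSS_SELECTOR, selector).click()


def scroll_page(driver, scrolls, pause=2.0):
    """Scroll to the bottom up to `scrolls` times; return the scrolls performed."""
    done = 0
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(scrolls):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the page time to load new pictures
        new_height = driver.execute_script("return document.body.scrollHeight")
        done += 1
        if new_height == last_height:
            break  # the page did not grow, so nothing new was loaded
        last_height = new_height
    return done
```

After `click_load_more(driver, button_class)` and `scroll_page(driver, scrolls)`, pass `driver.page_source` to BeautifulSoup to collect the img tags as before.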
