MomoProductCrawler

This is a crawler script for MOMO website, to get vendor information, image there.

Developed with python. Use selenium open browser to connect website and get the vendor info which extracting all the image URL and information through BeautifulSoup library from html, then download vendor’s image and stored vendor’s name, price, category, vendor name, etc… to MongoDB.

Get Started

Environment

$ brew install python3
$ curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
$ python3 get-pip.py
$ pip install virtualenv
$ git clone [email protected]:surpasstw/paritytw/MomoProductCrawler.git
$ cd MomoProductCrawler

Create an independent environment

$ virtualenv venv

Entering the environment

$ source venv/bin/activate

Install requirement packages

$ pip3 install -r requirements.txt

Install mongodb

$ brew install mongodb

Run

$ mkdir -p result/db
$ python app.py -r result -d mongo -dbpath result/db

Name		Name	Last commit message	Last commit date
Latest commit History 66 Commits
screenshots		screenshots
.gitignore		.gitignore
README.md		README.md
app.py		app.py
catchimg3.json		catchimg3.json
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MomoProductCrawler

Get Started

Environment

Run

Screenshots

About

Releases

Packages

Contributors 2

Languages

tonyydl/MomoProductCrawler

Folders and files

Latest commit

History

Repository files navigation

MomoProductCrawler

Get Started

Environment

Run

Screenshots

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages