Lenovo-Laptops-Flipkart-Data-Extraction

Beautiful Soup is a Python library for extracting data from HTML and XML documents. It helps pull specific content from a web page, strip the HTML markup, and save the information. As a web scraping tool, it cleans up and parses documents downloaded from the web. In this work, data is scraped from the Flipkart web pages that list Lenovo laptops. Initially, a function builds the URL for each successive page of Lenovo laptop listings (the URL pattern was found by inspecting the pages). Beautiful Soup then parses the HTML content of each page into a tree, which makes the elements easier to navigate. Finally, the extracted links are made active and saved to the attached CSV file.
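The steps described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the search URL pattern, the `product-link` class name, and the helper names `page_url`, `extract_products`, and `to_csv` are all assumptions made for the example, and "making links active" is interpreted here as resolving relative links against the site root. The sketch parses an inline HTML sample so it runs offline; in practice the HTML would come from something like `requests.get(url).text`.

```python
import csv
import io
from urllib.parse import urlencode, urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://www.flipkart.com"  # site root used to resolve relative links


def page_url(query, page):
    """Build the URL for one results page (assumed pattern, may differ in practice)."""
    return f"{BASE_URL}/search?" + urlencode({"q": query, "page": page})


def extract_products(html):
    """Parse listing HTML and return name/link rows with absolute links."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # "product-link" is a hypothetical class name; inspect the real page to find the right selector.
    for anchor in soup.select("a.product-link"):
        rows.append({
            "name": anchor.get_text(strip=True),
            "link": urljoin(BASE_URL, anchor["href"]),  # relative href -> absolute URL
        })
    return rows


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "link"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Inline sample standing in for a downloaded page (made-up markup and hrefs).
SAMPLE_HTML = """
<div>
  <a class="product-link" href="/lenovo-ideapad/p/itm1">Lenovo IdeaPad</a>
  <a class="product-link" href="/lenovo-thinkpad/p/itm2">Lenovo ThinkPad</a>
</div>
"""

rows = extract_products(SAMPLE_HTML)
csv_text = to_csv(rows)
print(page_url("lenovo laptops", 2))
print(csv_text)
```

To write the result to disk instead of printing it, the same `csv.DictWriter` calls can target a file opened with `open("laptops.csv", "w", newline="")`; the file name here is likewise only a placeholder.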

Libraries Used:

bs4 (BeautifulSoup), re, requests

Programming Language

Python

IDE Used

Jupyter Notebook