LinkedIn Job Scraper with Automated Email Functionality

By Anand Kumar

Features

  • Fast scraping: Quickly extracts job data from LinkedIn.
  • Advanced job description filtering: Scans each job description and filters out jobs that don't match your years of experience (see the sketch after this list).
  • Adjusts to slow networks: Automatically optimizes for a slow internet connection with a single user input.
  • Reduced bot-detection risk: Mitigates the risk of being detected as a bot during data scraping.
  • Automated emailing of scraped data: Sends the scraped data to designated email addresses automatically.
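
As an illustration of how the experience filter in the second bullet might work, here is a minimal sketch; the function name, regular expression, and job structure below are illustrative assumptions, not code taken from this project:

```python
import re

def matches_experience(description: str, my_experience_years: int) -> bool:
    """Illustrative filter: keep a job only if the 'X+ years' requirements
    found in its description fit the candidate's experience.
    (Hypothetical helper, not taken from main.py.)"""
    required = [int(m) for m in re.findall(r'(\d+)\s*\+?\s*years?', description, re.IGNORECASE)]
    if not required:
        return True  # no explicit requirement found, keep the listing
    return min(required) <= my_experience_years

# Example: drop listings that ask for more experience than you have
jobs = [{"title": "Data Engineer", "description": "5+ years of Python required"}]
filtered = [job for job in jobs if matches_experience(job["description"], 2)]
```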

⚠️ NOTE:

I want to make it clear that I strongly discourage any attempt to scrape data from LinkedIn. If anyone is considering such actions, I urge them to first review LinkedIn's official policy on the use of scraping software, as outlined in their statement. This project was built purely as a learning exercise, not to use LinkedIn data in any unauthorized way, and it should not be used for any such attempt.

Instructions

After you have downloaded the project files, follow the instructions below to set up your machine and make the code functional.

Downloading/Installing dependencies

You need Python and an IDE such as VS Code or PyCharm installed on your machine. Along with that, you need to install/download the packages mentioned below.

Install Selenium Python

Install the Selenium Python bindings on your machine from the command line/terminal (typically pip install selenium). For installation help, please read the Selenium Python documentation.

Download Selenium Webdriver

The code is written to automate the Chrome browser; however, it can automate Edge, Firefox, or Safari as well by modifying line 52 of main.py, driver = webdriver.Chrome(options=browser_options), according to the browser of your choice, as shown in the sketch below.
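
For example, switching to Firefox would look roughly like this (a sketch, not code from the project; the options object must come from the matching options class):

```python
from selenium import webdriver

# Chrome is the default target in main.py (line 52):
#   driver = webdriver.Chrome(options=browser_options)
# where browser_options is built from webdriver.ChromeOptions().

# To automate a different browser instead, swap in the matching classes:
browser_options = webdriver.FirefoxOptions()
driver = webdriver.Firefox(options=browser_options)

# Similarly: webdriver.EdgeOptions() with webdriver.Edge(...),
# or webdriver.Safari() on macOS.
```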

If you do not want to change any code, first download Chrome on your system and then follow the instructions below: in Chrome, go to Settings > About Chrome and note the build version of your Chrome.

For example, say the build version is 111.0.5563.65. Then visit the Selenium webdriver page for Chrome and download the latest zip file of the webdriver for your operating system whose version starts with the same number as your Chrome build version (in this case 111). Unzip the folder, open it, and copy the path of the folder on your local machine, for example "C:\Users\Anand\Downloads\webdriver". Paste this location within the quotes on line 51 of main.py: os.environ['PATH'] += r'C:\Users\Anand\Downloads\webdriver'
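
Taken together, the relevant two lines of main.py would look roughly like the following sketch (using the example folder path from above; PATH entries are separated by a delimiter, so appending via os.pathsep is the safer form):

```python
import os
from selenium import webdriver

# Line 51: make the downloaded webdriver discoverable; replace the example
# path with the folder location on your own machine.
os.environ['PATH'] += os.pathsep + r'C:\Users\Anand\Downloads\webdriver'

# Line 52: start the Chrome browser that Selenium will automate.
driver = webdriver.Chrome(options=webdriver.ChromeOptions())
```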

Install Pandas

Install the Pandas library on your machine from the command line/terminal (typically pip install pandas). For installation help, please read the Pandas documentation.

Email Account Setup

Note that you need two mail IDs for this functionality, so if you don't have two mail accounts, create them. Open the mail_system.py file in your IDE and, on line 22, enter the email ID you want to use as the sender within quotes.

For example, user_mail = 'sender@example.com'. On line 24, provide the mail ID on which you want to receive the scraped data, within quotes. For example, recipient = 'receiver@example.com'.

Setting passcode of sender's mail

The mail functionality is supported by the SMTP (Simple Mail Transfer Protocol) library smtplib, which is built into Python's standard library. You cannot use your regular mail account password for this, because popular mail services like Gmail and Yahoo Mail block that for security reasons, so you need to create an app password instead. Please refer to your mail provider's documentation (for example, Google's "App passwords" help page) for how to set one up.

After you have created the app password, enter it within quotes on line 23 of mail_system.py.

For example, passcode = 'yourapppasscode'
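
Under the hood, the emailing step boils down to a standard smtplib login-and-send. The following is a minimal sketch assuming a Gmail sender over SSL; the addresses are placeholders, and the actual server, port, and message formatting used in mail_system.py may differ:

```python
import smtplib
from email.message import EmailMessage

user_mail = 'sender@example.com'       # line 22: sender address (placeholder)
passcode = 'yourapppasscode'           # line 23: the app password, not your normal password
recipient = 'receiver@example.com'     # line 24: address that receives the scraped data

msg = EmailMessage()
msg['Subject'] = 'LinkedIn job listings'
msg['From'] = user_mail
msg['To'] = recipient
msg.set_content('Scraped job data goes here.')

# Gmail's SMTP-over-SSL endpoint; other providers use their own host and port.
with smtplib.SMTP_SSL('smtp.gmail.com', 465) as server:
    server.login(user_mail, passcode)
    server.send_message(msg)
```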

Your machine is ready now!!!

Simply open the whole Job Scraper folder in your IDE and run the main.py file to start the LinkedIn Job Scraper. After you enter the required input in the console, the browser will automatically begin scraping job listings based on the specified criteria. The browser closes automatically once the code finishes running successfully.

How to comment or uncomment any line: simply click on the line you want to comment or uncomment and, with Ctrl/⌘ held down, press the forward slash key '/'.

Dos and Don'ts

Dos

  • You can use your machine during the process.
  • You can keep the browser and IDE in background.

Don'ts

  • Do not click on any element of the webpage, as it can lead to termination of the code.
  • Do not use the console during the process.
  • Do not turn off the internet or close the automated browser session.
  • To prevent any unexpected action against your LinkedIn account, do not stay logged in to LinkedIn in the browser you are going to automate. Be sure to log out before running the scraper.
