Showcase visualizations about Osaka Average Hotel Price. The data was collected from Booking.com

sakan811/Find-Osaka-Average-Hotel-Price

Find the Hotel's Average Room Price in Osaka

Showcase visualizations about the hotel's Average Room Price in Osaka.

Average Nightly Room Price for one adult, one room.

Price in USD.

Find the Hotel's Average Room Price in Japan

Showcase visualizations about the hotel's Average Room Price for all prefectures in Japan.

Average Nightly Room Price for one adult, one room.

Price in USD.

Built on top of the Find the Hotel's Average Room Price in Osaka project.

Status

CodeQL

Scraper Test

Scrape

Visualizations

Average Room Price in Osaka:

Click here for visualizations of this project.

Average Room Price for all Prefectures in Japan:

Project Details

Find the Hotel's Average Room Price in Osaka Project

Collect Osaka hotel property data from Booking.com

Data collecting period for Year 2025:

Consists of room price from

Data was collected daily using GitHub Actions.

Consists of two scrapers: the Basic GraphQL scraper and the Whole-Month GraphQL scraper.

These scrapers can also be used to scrape data from other cities in Japan.

Find the Hotel's Average Room Price in Japan Project

Collect Japan hotel property data for all Prefectures from Booking.com

Data collection date: 23 Aug 2024.

Uses the Japan GraphQL scraper to collect data.

Collected Data Archive

Click here to access a document about collected data.

To scrape hotel data

Setup Project

To scrape using Whole-Month GraphQL Scraper:

  • Example usage, with only required arguments for Whole-Month GraphQL Scraper:
    python main.py --whole_mth --year=2024 --month=12 --city=Osaka
  • Scrapes data from the given start day to the end of the same month.
    • Default start day is 1.
    • Start day can be set with --start_day argument.
  • Data is saved to SQLite.
    • The SQLite database is created automatically if it doesn't exist.
    • Default SQLite path is avg_japan_hotel_price_test.db.
    • SQLite path can be set with --sqlite_name argument.
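
The date range described above (from the start day to the end of the same month) can be illustrated with a minimal sketch. This is not the project's own code: the function name `month_date_range` and its parameters are hypothetical, chosen only to mirror the `--year`, `--month`, and `--start_day` arguments.

```python
import calendar
from datetime import date


def month_date_range(year: int, month: int, start_day: int = 1) -> list[date]:
    """Return every date from start_day through the last day of the month.

    Hypothetical helper for illustration; the real scraper may compute
    its dates differently.
    """
    # monthrange returns (weekday of the 1st, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    if not 1 <= start_day <= last_day:
        raise ValueError(f"start_day must be between 1 and {last_day}")
    return [date(year, month, day) for day in range(start_day, last_day + 1)]


# Starting on day 30 of December 2024 leaves two dates to cover.
dates = month_date_range(2024, 12, start_day=30)
print([d.isoformat() for d in dates])  # ['2024-12-30', '2024-12-31']
```

Using `calendar.monthrange` avoids hard-coding month lengths and handles leap years.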

To scrape using Basic GraphQL Scraper:

  • Example usage, with only required arguments for Basic GraphQL Scraper:
    python main.py --city=Osaka --check_in=2024-12-25 --check_out=2024-12-26 --scraper
  • Data is saved to SQLite.
    • The SQLite database is created automatically if it doesn't exist.
    • Default SQLite path is avg_japan_hotel_price_test.db.
    • SQLite path can be set with --sqlite_name argument.
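
After a run, it can be useful to confirm what the scraper wrote. The table layout is not documented in this README, so the sketch below makes no assumption about it and only lists the table names found in the SQLite file. The helper name `list_tables` is hypothetical.

```python
import sqlite3


def list_tables(db_path: str) -> list[str]:
    """Return the names of all tables in the SQLite file at db_path.

    Hypothetical helper for illustration; it reads SQLite's built-in
    sqlite_master catalog, so it works without knowing the schema.
    """
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

For example, `list_tables("avg_japan_hotel_price_test.db")` would inspect the default database, or the file given with `--sqlite_name`. Note that `sqlite3.connect` creates an empty file if the path does not exist, in which case the result is an empty list.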

To scrape using Japan GraphQL Scraper:

  • Example usage, with only required arguments for Japan GraphQL Scraper:
    python main.py --japan_hotel
  • Data is saved to DuckDB.
    • The DuckDB database is created automatically if it doesn't exist.
    • Default DuckDB path is japan_hotel_data_test.duckdb.
    • DuckDB path can be set with --duckdb_name argument.

Scraper's Arguments

Click here for Scraper's arguments details.

Find the missing dates in the database using Missing Date Checker

To ensure that every date of the month was scraped, a function in check_missing_dates.py checks the given SQLite database for missing dates.

Made only for the Find the Hotel's Average Room Price in Osaka project, which saves scraped data in SQLite.

  • To check the database, run a command like the following, which includes only the required argument:
    python check_missing_dates.py --city=Osaka
  • If there are missing dates, the Basic Scraper automatically starts scraping those dates.
    • Missing Date Checker shares arguments with Basic Scraper.
    • Arguments passed to Missing Date Checker should be the same as those used with Basic Scraper.
  • Only checks for missing dates in data that was scraped today (UTC).
  • Only checks the months that were scraped and loaded into the database.
  • The SQLite database can be specified with --sqlite_name.
    • Default is avg_japan_hotel_price_test.db
  • Year of dates can be specified with --year
    • Default is the current year.
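
The idea behind a missing-date check can be sketched as a comparison between the dates expected for a month and the dates actually present. This is a minimal illustration, not the logic in check_missing_dates.py: the function name `find_missing_dates` and the way scraped dates are supplied (as a set of `date` objects) are assumptions.

```python
import calendar
from datetime import date


def find_missing_dates(scraped: set[date], year: int, month: int) -> list[date]:
    """Return the dates of the given month that are absent from scraped.

    Hypothetical helper for illustration; the real checker reads its
    dates from the SQLite database.
    """
    last_day = calendar.monthrange(year, month)[1]
    expected = (date(year, month, day) for day in range(1, last_day + 1))
    return [d for d in expected if d not in scraped]


# Pretend every day of February 2025 was scraped except the 10th and 20th.
scraped = {date(2025, 2, day) for day in range(1, 29)} - {
    date(2025, 2, 10),
    date(2025, 2, 20),
}
print(find_missing_dates(scraped, 2025, 2))
# [datetime.date(2025, 2, 10), datetime.date(2025, 2, 20)]
```

The returned dates are the ones a follow-up scrape would need to cover.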

Code Base Details

Click here to read brief documentation of the scripts.

Click here to see the flowchart of this codebase.