Custom checks and extractions

Lu edited this page May 24, 2023 · 1 revision

This page describes how to bypass the internal check mechanism and use a custom check/extraction method for services that are difficult to check.


How to bypass the internal logic?

For some services, e.g. Quora, the general approach to checking the service does not work because additional headers are needed. To bypass the general logic, do the following:

  • Create a file <name of your service>.py in /service_bypass
  • Set the method: <name of your service> in the YAML file for your service
  • Make sure your file follows this structure:

Structure of a custom script

from typing import Any, Dict, Tuple

# NOT PART OF THE TEMPLATE
# Use: from methods import extract_data
def extract_data(html: str, blocks: Dict[str, Dict[str, str]], url: str):
  ...  # body omitted; only the signature is shown here

# START OF THE TEMPLATE
def check(username: str) -> Tuple[bool, str]:
  # do your request magic here
  exists = False  # placeholder: True if the user exists, otherwise False
  url = ""        # placeholder: the URL with the given username
  return (exists, url)

def extract(username: str) -> Dict[Any, Any]:
  # do your request magic here
  # extract the data and return it as dict
  # or use the extract_data function
  # html = your request response
  # blocks: See YAML blueprint for explanation on blocks
  # url = URL of the service; used to convert relative URLs
  results: Dict[Any, Any] = {}  # placeholder: a dictionary of results
  return results
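
Example of a filled-in custom script

The template above could be filled in roughly as follows. This is only a minimal sketch under several assumptions: the profile URL pattern, the header values, the `fetch` helper, the optional `fetcher` parameter (added so the functions can be tried without network access) and the title-based extraction are all hypothetical and not taken from the project. A real service will need its own URL, headers and existence test.

```python
import re
import urllib.error
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict, Tuple

# Hypothetical profile URL pattern; replace with the real one for your service.
BASE_URL = "https://example.com/profile/{}"

# Hypothetical extra headers; some services reject requests without them.
HEADERS = {
    "User-Agent": "Mozilla/5.0",
    "Accept-Language": "en-US,en;q=0.9",
}

def fetch(url: str) -> Tuple[int, str]:
    """Request the URL with the custom headers; return (status code, body)."""
    request = urllib.request.Request(url, headers=HEADERS)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.status, body
    except urllib.error.HTTPError as error:
        # The server answered with an error status such as 404.
        return error.code, ""
    except urllib.error.URLError:
        # The server could not be reached at all.
        return 0, ""

def check(username: str, fetcher: Callable[[str], Tuple[int, str]] = fetch) -> Tuple[bool, str]:
    url = BASE_URL.format(urllib.parse.quote(username))
    status, _ = fetcher(url)
    # Assumption: an existing profile answers with HTTP 200.
    return (status == 200, url)

def extract(username: str, fetcher: Callable[[str], Tuple[int, str]] = fetch) -> Dict[Any, Any]:
    url = BASE_URL.format(urllib.parse.quote(username))
    status, html = fetcher(url)
    if status != 200:
        return {}
    # Assumption: the page title is the only data of interest here.
    match = re.search(r"<title>(.*?)</title>", html, re.DOTALL)
    return {"title": match.group(1).strip()} if match else {}
```

Calling `check("alice")` would request the hypothetical URL with the extra headers and return a `(bool, str)` tuple as the template requires. Because `fetcher` has a default value, both functions can still be called with only `username`.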