Custom checks and extractions
Lu edited this page May 24, 2023 · 1 revision
This page describes how to bypass the internal check mechanism and use a custom check/extraction method for services that are difficult to check.
For some services, e.g. Quora, the general approach to checking the service does not work because additional headers are needed. To bypass the general logic, do the following:
- Create a file <name of your service>.py in /service_bypass
- Set method: <name of your service> in the YAML file for your service
- Make sure your file follows this structure:
# NOT PART OF THE TEMPLATE
# Signature of the shared helper, shown for reference only.
# Use: from methods import extract_data
# def extract_data(html: str, blocks: Dict[str, Dict[str, str]], url: str): ...

# START OF THE TEMPLATE
from typing import Any, Dict, Tuple


def check(username: str) -> Tuple[bool, str]:
    # do your request magic here
    exists = False  # replace with: whether the user exists or not
    url = "<The URL with the given username>"
    return (exists, url)


def extract(username: str) -> Dict[Any, Any]:
    # do your request magic here
    # extract the data and return it as a dict,
    # or use the extract_data function:
    #   html   = your request response
    #   blocks = see the YAML blueprint for an explanation of blocks
    #   url    = URL of the service; used to convert relative URLs
    results = {}  # a dictionary of results
    return results
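
As an illustration, a filled-in template could look roughly like the sketch below. Everything specific in it is an assumption rather than part of the project: the PROFILE_URL pattern, the HEADERS values, the fetch_page helper, and the optional fetch parameter (added only so the functions can be tried without network access) are hypothetical. The title extraction is a placeholder for whatever data your service exposes. A status code of 200 is treated as "user exists", which may not hold for every service.

```python
import re
import urllib.error
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict, Tuple

# Hypothetical values: the real URL pattern and the headers a service
# requires have to be found out per service.
PROFILE_URL = "https://example.com/profile/{username}"
HEADERS = {
    "User-Agent": "Mozilla/5.0",
    "Accept-Language": "en-US,en;q=0.9",
}

Fetcher = Callable[[str], Tuple[int, str]]


def fetch_page(url: str) -> Tuple[int, str]:
    """Request url with the extra headers and return (status code, body)."""
    request = urllib.request.Request(url, headers=HEADERS)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.status, body
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses raise; report the status code instead.
        return error.code, ""
    except urllib.error.URLError:
        # Network-level failure: use 0 as a "no response" marker.
        return 0, ""


def build_url(username: str) -> str:
    """Insert the URL-quoted username into the profile URL pattern."""
    return PROFILE_URL.format(username=urllib.parse.quote(username))


def check(username: str, fetch: Fetcher = fetch_page) -> Tuple[bool, str]:
    url = build_url(username)
    status, _ = fetch(url)
    # Assumption: a 200 response means the profile exists.
    return (status == 200, url)


def extract(username: str, fetch: Fetcher = fetch_page) -> Dict[Any, Any]:
    url = build_url(username)
    status, html = fetch(url)
    if status != 200:
        return {}
    results: Dict[Any, Any] = {"url": url}
    # Placeholder extraction: only the page title is pulled out here.
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    if match:
        results["title"] = match.group(1).strip()
    return results
```

Both functions can still be called as check(username) and extract(username), matching the template. Passing a fake fetch function is just a convenient way to exercise them offline.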
Thanks for reading all of this. I'm happy to review and merge your contributions, whether they are a new feature, a new service, or simply improvements to the code.