Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The Save/Commit Function - 3 Methods/Designs: NO writing to Remoteworksheet.update([dataframe.columns.values.tolist()] + dataframe.values.tolist()) #44

Open
iPoetDev opened this issue Jun 2, 2023 · 0 comments
Assignees
Labels
bug Something isn't working

Comments

@iPoetDev
Copy link
Owner

iPoetDev commented Jun 2, 2023

For the final stage of ETL is to commit changes to remote (Google Sheets), no method was found to work.

There are no exceptions, just a function call to remote
and no data state update.

For the save/commit to the remote (i..e the stateful data source): 3 attempts at three approaches to get the local dataset to either|

  1. INTEGRATE Overwrite using the specialist gspread_dataframe library as per https://gspread-dataframe.readthedocs.io/en/latest/#gspread_dataframe.set_with_dataframe & PerplexityAI
    https://www.perplexity.ai/search/2a256f73-47dc-4117-bbab-87e4c0a7cbe1?s=c

2 UPDATE: Overwrite using the basic gspread sheet.update as per
https://docs.gspread.org/en/latest/user-guide.html#using-gspread-with-pandas


  1. INJECTION: Inject a series data, which is the main data object for ETL, AND with support from Perplexity
    https://www.perplexity.ai/search/33fb1a34-54aa-49d4-84aa-45b5d846eba8?s=c

[pycriteria] controller.py (Lines 2031-2032)


 def save(self, saved: pd.DataFrame, series: pd.Series, index: int, action: str, debug: bool = False) -> None:
        """ Saves the dataframe and commits it to the remote source"""

Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2033-2041)


# 1. Prompt the user to save the updated DataFrame
        if click.confirm("Are you ready to commit changes?"):
            sheet: gspread.Worksheet = Controller.load_wsheet()
            # 2. Check for Validation Client and Worksheet ID presence
            if sheet.client is not None and \
                    isinstance(sheet.client, gspread.Client):
                if sheet.id is not None and \
                        sheet.get_all_records():
                    

Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2042-2064)


# 3. Convert sheet to a target DataFrame
                    target: pd.DataFrame = \
                        DataController.load_dataframe_wsheet(sheet)
                    # 4. SAVE ATTEMPT 1: INTEGRATE a single record into the target
                    integratedframe: pd.DataFrame = self.integrate(
                            single=saved,
                            source=target,
                            index=index)
                    # 4. SAVE ATTEMPT 1b: switch based on TEST Saving mode
                    # s1 = Use integratedframe &
                    # gspread_dataframe & set_with_dataframe => set_remote
                    # This is not commment out code, it is annotation
                    if action == 's1' and debug is False:
                        saving = integratedframe.astype(str)
                        set_remote(worksheet=sheet,
                                   dataframe=saving,
                                   row=1,
                                   col=1,
                                   include_index=False,
                                   include_column_header=True,
                                   resize=False,
                                   allow_formulas=False,
                                   string_escaping='default')

Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2065-2075)


# 5. SAVE ATTEMPT 2: Commit the updated DataFrame to remote
                    # Use the integratedframe as a overwrit: sheet.update
                    # Using https://docs.gspread.org/en/latest/user-guide.html#using-gspread-with-pandas
                    elif action == 's2' and debug is False:
                        
                        sheet.update([saving.columns.values.tolist()] \ 
                                     + saving.values.tolist())
                        # Apparently, this is not working as documented.
                        # Expected type 'list' (matched generic type 'list[_T]')
                        # , got 'object' instead
                        pass

Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2077-2084)


 # 6. SAVE ATTEMPT 3: Inject the updated series into the remote
                    # source, via the series row and index parameters.
                    # BY matching on the Position column, primary key
                    elif action == 's3' and debug is True:
                        self.injection(series=series,
                                       sheet=sheet,
                                       row=index,
                                       debug=debug)

Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2110-2128)


 @staticmethod
    def integrate(single: pd.DataFrame,
                  source: pd.DataFrame,
                  index: int,
                  reset=True,
                  debug: bool = False) -> pd.DataFrame:
        """ Merges the source and target DataFrames"""
        # merge the single-row DataFrame into
        # the source DataFrame at the same index
        # https://www.perplexity.ai/search/2a256f73-47dc-4117-bbab-87e4c0a7cbe1?s=c
        # drop any NaN values from the single-row DataFrame
        single = single.dropna()
        
        if debug is True:
            rich.inspect(single)
        
        return pd.concat([source.iloc[:index],
                          single,
                          source.iloc[index:]]).reset_index(drop=reset)

Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2130-2136)


@staticmethod
    def convertto(lt):
        """ Converts the list of lists to a list of strings without nan vals
        
        """
        
        return [[str(element) for element in sublist if str(element) != 'nan'] for sublist in lt]

Open in IDE · Open on GitHub
[pycriteria] controller.py (Lines 2089-2108)


@staticmethod
    def injection(series: pd.Series, sheet, row: int, debug: bool = False) -> None:
        """ Injects the updated series into the remote source, via the row"""
        records = sheet.get_all_records()
        
        row_id = None
        for record in records:
            if record['Position'] == series['Position']:
                row_id = int(record[row])
                break
        rich.inspect(row_id)
        # Update the row with the values from the series
        if row_id is not None:
            values = [series[key] for key in series.keys()]
            
            if debug is True:
                rich.inspect(values)
            
            if debug is False:
                sheet.update(str(row_id), values)

Open in IDE · Open on GitHub

Created from JetBrains using CodeStream

@iPoetDev iPoetDev self-assigned this Jun 2, 2023
@issuelabeler issuelabeler bot added the bug Something isn't working label Jun 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant